SNP-based heritability estimation: measurement noise, population stratification, and stability

2016
Author(s): Eric R. Gamazon, Danny S. Park

Siddharth Krishna Kumar and co-authors claim to have shown that “GCTA applied to current SNP data cannot produce reliable or stable estimates of heritability.” Given the numerous recent studies on the genetic architecture of complex traits that are based on this methodology, these claims have important implications for the field. Through an investigation of the stability of the likelihood function under phenotype perturbation and an analysis of its dependence on the spectral properties of the genetic relatedness matrix, our study characterizes the properties of an important approach to the analysis of GWAS data and identifies crucial errors in the authors’ analyses, invalidating their main conclusions.

2018
Author(s): Yizhen Zhong, Minoli Perera, Eric R. Gamazon

Abstract. Background: Understanding the nature of the genetic regulation of gene expression promises to advance our understanding of the genetic basis of disease. However, the methodological impact of use of local ancestry on high-dimensional omics analyses, including most prominently expression quantitative trait loci (eQTL) mapping and trait heritability estimation, in admixed populations remains critically underexplored. Results: Here we develop a statistical framework that characterizes the relationships among the determinants of the genetic architecture of an important class of molecular traits. We estimate the trait variance explained by ancestry using local admixture relatedness between individuals. Using National Institute of General Medical Sciences (NIGMS) and Genotype-Tissue Expression (GTEx) datasets, we show that use of local ancestry can substantially improve eQTL mapping and heritability estimation and characterize the sparse versus polygenic component of gene expression in admixed and multiethnic populations respectively. Using simulations of diverse genetic architectures to estimate trait heritability and the level of confounding, we show improved accuracy given individual-level data and evaluate a summary statistics based approach. Furthermore, we provide a computationally efficient approach to local ancestry analysis in eQTL mapping while increasing control of type I and type II error over traditional approaches. Conclusion: Our study has important methodological implications on genetic analysis of omics traits across a range of genomic contexts, from a single variant to a prioritized region to the entire genome. Our findings highlight the importance of using local ancestry to better characterize the heritability of complex traits and to more accurately map genetic associations.


2020
Author(s): Zhaotong Lin, Souvik Seal, Saonli Basu

Abstract. SNP heritability of a trait is measured by the proportion of total variance explained by the additive effects of genome-wide single nucleotide polymorphisms (SNPs). Linear mixed models are routinely used to estimate SNP heritability for many complex traits. The basic concept behind this approach is to model the genetic contribution as a random effect, where the variance of this genetic contribution determines the heritability of the trait. This linear mixed model approach requires estimation of ‘relatedness’ among individuals in the sample, which is usually captured by estimating a genetic relationship matrix (GRM). Heritability is estimated by the restricted maximum likelihood (REML) or method of moments (MOM) approaches, and this estimation relies heavily on the GRM computed from the genetic data on individuals. The presence of population substructure in the data could significantly impact the GRM estimation and may introduce bias in heritability estimation. The common way to account for such population substructure is to adjust for the top few principal components of the GRM as covariates in the linear mixed model. Here we propose an alternative way of estimating heritability in multi-ethnic studies. Our proposed approach is a MOM estimator derived from the Haseman-Elston regression and gives an asymptotically unbiased estimate of heritability in the presence of population stratification. It introduces adjustments for the population stratification in a second-order estimating equation and allows the total phenotypic variance to vary by ethnicity. We study the performance of different MOM and REML approaches in the presence of population stratification through extensive simulation studies. We estimate the heritability of height, weight and other anthropometric traits in the UK Biobank cohort to investigate the impact of subtle population substructure on SNP heritability estimation.
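
As a rough illustration of the method-of-moments idea described in this abstract, the sketch below builds a GRM from standardized genotypes and runs a plain Haseman-Elston regression (products of standardized phenotypes regressed on off-diagonal GRM entries). This is the textbook, unadjusted estimator and not the stratification-adjusted estimator the authors propose. The function names, the no-intercept regression, and the simulation settings (unrelated individuals, allele frequency 0.5, normally distributed effects) are all assumptions made for the example.

```python
import math
import random


def standardized_columns(genotypes):
    """Center and scale each SNP column; monomorphic SNPs are dropped."""
    n = len(genotypes)
    cols = []
    for k in range(len(genotypes[0])):
        col = [row[k] for row in genotypes]
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        if sd > 0:
            cols.append([(x - mean) / sd for x in col])
    return cols


def grm(genotypes):
    """GRM with entries A_ij = (1/m) * sum_k z_ik * z_jk."""
    cols = standardized_columns(genotypes)
    n, m = len(genotypes), len(cols)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            A[i][j] = A[j][i] = sum(c[i] * c[j] for c in cols) / m
    return A


def he_regression(A, y):
    """Slope of y_i * y_j on A_ij over pairs i < j (no intercept).

    With a standardized phenotype, the slope is a moment estimate of h2.
    """
    n = len(y)
    mean = sum(y) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    z = [(v - mean) / sd for v in y]
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            num += A[i][j] * z[i] * z[j]
            den += A[i][j] ** 2
    return num / den


def simulate(n=120, m=80, h2=0.5, seed=1):
    """Hypothetical toy simulation: unstratified sample, all SNPs causal."""
    rng = random.Random(seed)
    genotypes = [[(rng.random() < 0.5) + (rng.random() < 0.5) for _ in range(m)]
                 for _ in range(n)]
    cols = standardized_columns(genotypes)
    betas = [rng.gauss(0.0, math.sqrt(h2 / len(cols))) for _ in cols]
    y = [sum(b * c[i] for b, c in zip(betas, cols))
         + rng.gauss(0.0, math.sqrt(1.0 - h2)) for i in range(n)]
    return he_regression(grm(genotypes), y)


if __name__ == "__main__":
    print(f"HE estimate of h2 (simulated truth 0.5): {simulate():.2f}")
```

With a sample this small the estimate is noisy, so a single run should not be expected to land close to the simulated value. The sketch also contains no correction for population structure, which is the situation in which the abstract says bias can arise.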


2017, Vol 114 (32), pp. 8602-8607
Author(s): Loic Yengo, Zhihong Zhu, Naomi R. Wray, Bruce S. Weir, Jian Yang, ...

Quantifying the effects of inbreeding is critical to characterizing the genetic architecture of complex traits. This study highlights through theory and simulations the strengths and shortcomings of three SNP-based inbreeding measures commonly used to estimate inbreeding depression (ID). We demonstrate that heterogeneity in linkage disequilibrium (LD) between causal variants and SNPs biases ID estimates, and we develop an approach to correct this bias using LD and minor allele frequency stratified inference (LDMS). We quantified ID in 25 traits measured in ∼140,000 participants of the UK Biobank, using LDMS, and confirmed previously published ID for 4 traits. We find unique evidence of ID for handgrip strength, waist/hip ratio, and visual and auditory acuity (ID between −2.3 and −5.2 phenotypic SDs for complete inbreeding; P<0.001). Our results illustrate that a careful choice of the measure of inbreeding combined with LDMS stratification improves both detection and quantification of ID using SNP data.


2021, Vol 12 (1)
Author(s): Marion Patxot, Daniel Trejo Banos, Athanasios Kousathanas, Etienne J. Orliac, Sven E. Ojavee, ...

Abstract. We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, 32-44% to introns, and 22-28% to distal 10-500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.


Author(s): Xintian Zhu, Hans Peter Maurer, Mario Jenz, Volker Hahn, Arno Ruckelshausen, ...

Key message: The phenomic predictive ability depends on the genetic architecture of the target trait, being high for complex traits and low for traits with major QTL. Abstract. Genomic selection is a powerful tool to assist breeding of complex traits, but a limitation is the cost of genotyping. Recently, phenomic selection has been suggested, which uses spectral data instead of molecular markers as predictors. It was shown to be competitive with genomic prediction, as it achieved predictive abilities as high as or even higher than its genomic counterpart. The objective of this study was to evaluate the performance of phenomic prediction for triticale and the dependency of the predictive ability on the genetic architecture of the target trait. We found that for traits with a complex genetic architecture, like grain yield, phenomic prediction with NIRS data as predictors achieved high predictive abilities and performed better than genomic prediction. By contrast, for mono- or oligogenic traits, for example, yellow rust, marker-based approaches achieved high predictive abilities, while those of phenomic prediction were very low. Compared with molecular markers, the predictive ability obtained using NIRS data was more robust to varying degrees of genetic relatedness between the training and prediction set. Moreover, for grain yield, smaller training sets were required to achieve a similar predictive ability for phenomic prediction than for genomic prediction. In addition, our results illustrate the potential of using field-based spectral data for phenomic prediction. Overall, our results confirm phenomic prediction as an efficient approach to improve the selection gain for complex traits in plant breeding.


Author(s): Armin P. Schoech, Omer Weissbrod, Luke J. O’Connor, Nick Patterson, Huwenbo Shi, ...

Abstract. Most models of complex trait genetic architecture assume that signed causal effect sizes of each SNP (defined with respect to the minor allele) are uncorrelated with those of nearby SNPs, but it is currently unknown whether this is the case. We develop a new method, autocorrelation LD regression (ACLR), for estimating the genome-wide autocorrelation of causal minor allele effect sizes as a function of genomic distance. Our method estimates these autocorrelations by regressing the products of summary statistics on distance-dependent LD scores. We determined that ACLR robustly assesses the presence or absence of nonzero autocorrelation, producing unbiased estimates with well-calibrated standard errors in null simulations regardless of genetic architecture; if true autocorrelation is nonzero, ACLR correctly detects its sign, although estimates of the autocorrelation magnitude are susceptible to bias in cases of certain genetic architectures. We applied ACLR to 31 diseases and complex traits from the UK Biobank (average N=331K), meta-analyzing results across traits. We determined that autocorrelations were significantly negative at distances of 1-50bp (P = 8 × 10⁻⁶, point estimate −0.35 ± 0.08) and 50-100bp (P = 2 × 10⁻³, point estimate −0.33 ± 0.11). We show that the autocorrelation is primarily driven by pairs of SNPs in positive LD, which is consistent with the expectation that linked SNPs with opposite effects are less impacted by natural selection. Our findings suggest that this mechanism broadly affects complex trait genetic architectures, and we discuss implications for association mapping, heritability estimation, and genetic risk prediction.


Author(s): Farshad BahooToroody, Saeed Khalaj, Leonardo Leoni, Filippo De Carlo, Gianpaolo Di Bona, ...

Geosynthetics are extensively utilized to improve the stability of geotechnical structures and slopes in urban areas. Among all existing geosynthetics, geotextiles are widely used to reinforce unstable slopes due to their capabilities in facilitating reinforcement and drainage. To reduce settlement and increase the bearing capacity and slope stability, the classical use of geotextiles in embankments has been suggested. However, several catastrophic events have been reported, including failures in slopes in the absence of geotextiles. Many researchers have studied the stability of geotextile-reinforced slopes (GRSs) by employing different methods (analytical models, numerical simulation, etc.). The presence of source-to-source uncertainty in the gathered data increases the complexity of evaluating the failure risk in GRSs since the uncertainty varies among them. Consequently, developing a sound methodology is necessary to alleviate the risk complexity. Our study sought to develop an advanced risk-based maintenance (RBM) methodology for prioritizing maintenance operations by addressing fluctuations that accompany event data. For this purpose, a hierarchical Bayesian approach (HBA) was applied to estimate the failure probabilities of GRSs. Using Markov chain Monte Carlo simulations of likelihood function and prior distribution, the HBA can incorporate the aforementioned uncertainties. The proposed method can be exploited by urban designers, asset managers, and policymakers to predict the mean time to failures, thus directly avoiding unnecessary maintenance and safety consequences. To demonstrate the application of the proposed methodology, the performance of nine reinforced slopes was considered. The results indicate that the average failure probability of the system in an hour is 2.8 × 10⁻⁵ during its lifespan, which shows that the proposed evaluation method is more realistic than the traditional methods.


2016, Vol 283 (1835), pp. 20160569
Author(s): M. E. Goddard, K. E. Kemper, I. M. MacLeod, A. J. Chamberlain, B. J. Hayes

Complex or quantitative traits are important in medicine, agriculture and evolution, yet, until recently, few of the polymorphisms that cause variation in these traits were known. Genome-wide association studies (GWAS), based on the ability to assay thousands of single nucleotide polymorphisms (SNPs), have revolutionized our understanding of the genetics of complex traits. We advocate the analysis of GWAS data by a statistical method that fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution. We illustrate how this method can be used to predict future phenotypes, to map and identify the causal mutations, and to study the genetic architecture of complex traits. The genetic architecture of complex traits is even more complex than previously thought: in almost every trait studied there are thousands of polymorphisms that explain genetic variation. Methods of predicting future phenotypes, collectively known as genomic selection or genomic prediction, have been widely adopted in livestock and crop breeding, leading to increased rates of genetic improvement.
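
To make "fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution" concrete, here is a minimal sketch under one specific assumption: a common normal prior on every SNP effect. In that case the posterior mean coincides with ridge regression, with penalty `lam` equal to the ratio of residual variance to per-SNP effect variance. The abstract does not say which prior the authors use, so the normal prior, the coordinate-descent solver, and the function names below are illustrative choices only.

```python
def ridge_snp_effects(X, y, lam, sweeps=200):
    """Jointly estimate all SNP effects by coordinate descent.

    Minimizes ||y - X b||^2 + lam * ||b||^2, which is the posterior mean of b
    under b_k ~ N(0, s_b^2), e_i ~ N(0, s_e^2) with lam = s_e^2 / s_b^2.
    X is a list of n rows of m (centered) genotype values; lam must be > 0.
    """
    n, m = len(X), len(X[0])
    cols = [[X[i][k] for i in range(n)] for k in range(m)]
    b = [0.0] * m
    resid = list(y)  # residuals y - X b for the current b (b starts at zero)
    for _ in range(sweeps):
        for k, col in enumerate(cols):
            # Add SNP k's current contribution back into the residuals.
            partial = [r + x * b[k] for r, x in zip(resid, col)]
            xx = sum(x * x for x in col)
            # Shrunken least-squares update for SNP k given all other SNPs.
            b[k] = sum(x * p for x, p in zip(col, partial)) / (xx + lam)
            resid = [p - x * b[k] for p, x in zip(partial, col)]
    return b


def predict(X, b):
    """Predicted genetic values for new genotype rows."""
    return [sum(x * bk for x, bk in zip(row, b)) for row in X]


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0]]
    y = [2.0, 4.0]
    effects = ridge_snp_effects(X, y, lam=1.0)
    print(effects, predict([[1.0, 1.0]], effects))
```

Larger `lam` shrinks every effect more strongly toward zero. Heavier-tailed or mixture priors would shrink small effects more and large effects less, but they need a different fitting procedure from the one sketched here.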


Author(s): Toshiyuki Sakai, Akira Abe, Motoki Shimizu, Ryohei Terauchi

Abstract Characterizing epistatic gene interactions is fundamental for understanding the genetic architecture of complex traits. However, due to the large number of potential gene combinations, detecting epistatic gene interactions is computationally demanding. A simple, easy-to-perform method for sensitive detection of epistasis is required. Due to their homozygous nature, use of recombinant inbred lines (RILs) excludes the dominance effect of alleles and interactions involving heterozygous genotypes, thereby allowing detection of epistasis in a simple and interpretable model. Here, we present an approach called RIL-StEp (recombinant inbred lines stepwise epistasis detection) to detect epistasis using single nucleotide polymorphisms in the genome. We applied the method to reveal epistasis affecting rice (Oryza sativa) seed hull color and leaf chlorophyll content and successfully identified pairs of genomic regions that presumably control these phenotypes. This method has the potential to improve our understanding of the genetic architecture of various traits of crops and other organisms.

