Genetic Architecture of Complex Traits and Disease Risk Predictors

Author(s):  
Soke Yuen Yong ◽  
Timothy G. Raben ◽  
Louis Lello ◽  
Stephen D.H. Hsu

Abstract: Genomic prediction of complex human traits (e.g., height, cognitive ability, bone density) and disease risks (e.g., breast cancer, diabetes, heart disease, atrial fibrillation) has advanced considerably in recent years. Predictors have been constructed using penalized algorithms that favor sparsity, i.e., that use as few genetic variants as possible. We analyze the specific genetic variants (SNPs) utilized in these predictors, whose number can vary from dozens to as many as thirty thousand. We find that the fraction of SNPs in or near genic regions varies widely by phenotype. For the majority of disease conditions studied, a large amount of the variance is accounted for by SNPs outside of coding regions. The state of these SNPs cannot be determined from exome-sequencing data. This suggests that exome data alone will miss much of the heritability for these traits – i.e., existing PRS cannot be computed from exome data alone. We also study the fraction of SNPs and of variance that is in common between pairs of predictors. The DNA regions used in disease risk predictors so far constructed seem to be largely disjoint (with a few interesting exceptions), suggesting that individual genetic disease risks are largely uncorrelated. It seems possible in theory for an individual to be a low-risk outlier in all conditions simultaneously.
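The pairwise comparison described in the abstract (fraction of SNPs and fraction of variance shared between two predictors) can be illustrated with a minimal sketch. This is not the authors' code: the SNP identifiers, effect sizes and allele frequencies are made up, and the per-SNP variance 2p(1-p)β² is an assumption that holds for an additive model with linkage disequilibrium ignored.

```python
from typing import Dict, Tuple

# Hypothetical representation: SNP id -> (effect size beta, allele frequency p).
Predictor = Dict[str, Tuple[float, float]]


def snp_variance(beta: float, p: float) -> float:
    """Variance contributed by one SNP under an additive model, ignoring LD."""
    return 2.0 * p * (1.0 - p) * beta ** 2


def shared_fractions(a: Predictor, b: Predictor) -> Tuple[float, float]:
    """Return (fraction of a's SNPs also used by b,
    fraction of a's variance carried by those shared SNPs)."""
    if not a:
        raise ValueError("predictor 'a' contains no SNPs")
    shared = a.keys() & b.keys()
    total_var = sum(snp_variance(*a[s]) for s in a)
    if total_var == 0.0:
        raise ValueError("predictor 'a' has zero total variance")
    shared_var = sum(snp_variance(*a[s]) for s in shared)
    return len(shared) / len(a), shared_var / total_var


# Made-up example data: the two predictors share only the SNP "rs2".
predictor_a: Predictor = {
    "rs1": (0.1, 0.5),
    "rs2": (0.2, 0.5),
    "rs3": (0.1, 0.5),
    "rs4": (0.1, 0.5),
}
predictor_b: Predictor = {"rs2": (0.05, 0.3), "rs9": (0.3, 0.2)}

snp_frac, var_frac = shared_fractions(predictor_a, predictor_b)
print(f"shared SNP fraction: {snp_frac:.2f}, shared variance fraction: {var_frac:.2f}")
```

In this toy case one of four SNPs is shared, but it carries more than half of the first predictor's variance, which is why the abstract reports the two fractions separately.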

2020 ◽  
Vol 10 (1) ◽  

2021 ◽  
Vol 118 (15) ◽  
pp. e1922305118
Author(s):  
Brooke Sheppard ◽  
Nadav Rappoport ◽  
Po-Ru Loh ◽  
Stephan J. Sanders ◽  
Noah Zaitlen ◽  
...  

Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue–trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.


2018 ◽  
Author(s):  
Kyoko Watanabe ◽  
Sven Stringer ◽  
Oleksandr Frei ◽  
Maša Umićević Mirkov ◽  
Tinca J.C. Polderman ◽  
...  

Abstract: After a decade of genome-wide association studies (GWASs), fundamental questions in human genetics are still unanswered, such as the extent of pleiotropy across the genome, the nature of trait-associated genetic variants and the disparate genetic architecture across human traits. The current availability of hundreds of GWAS results provides a unique opportunity to gain insight into these questions. In this study, we harmonized and systematically analysed 4,155 publicly available GWASs. For a subset of well-powered GWASs on 558 unique traits, we provide an extensive overview of pleiotropy and genetic architecture. We show that trait-associated loci cover more than half of the genome, and 90% of those loci are associated with multiple trait domains. We further show that potential causal genetic variants are enriched in coding and flanking regions, as well as in regulatory elements, and how trait polygenicity is related to an estimate of the required sample size to detect 90% of causal genetic variants. Our results provide novel insights into how genetic variation contributes to trait variation. All GWAS results can be queried and visualized at the GWAS ATLAS resource (http://atlas.ctglab.nl).


Author(s):  
George Wehby

This chapter reviews the main pathways through which genes can relate to health and health determinants, focusing on channels of primary interest to health economics research. Research implications and methodological considerations for using genetic data in health economics applications are also discussed, as is the potential for informing policy-making. As knowledge of the genetic architecture of outcomes and traits of interest to health economics expands, incorporating genetic data in health economics research is likely to become more fruitful, including in providing policy-relevant findings. However, identifying genetic variants and mechanisms that explain a substantial fraction of the heritability of complex human traits will take time. Meanwhile, research can continue to achieve piecewise advances in knowledge on interplays between genes and the environment in shaping health, preferences, and human capital.


2021 ◽  
Author(s):  
Karthik A. Jagadeesh ◽  
Kushal K Dey ◽  
Daniel T. Montoro ◽  
Steven Gazal ◽  
Jesse M Engreitz ◽  
...  

Cellular dysfunction is a hallmark of disease. Genome-wide association studies (GWAS) have provided a powerful means to identify loci and genes contributing to disease risk, but in many cases the related cell types/states through which genes confer disease risk remain unknown. Deciphering such relationships is important both for our understanding of disease, and for developing therapeutic interventions. Here, we introduce a framework for integrating single-cell RNA-seq (scRNA-seq), epigenomic maps and GWAS summary statistics to infer the underlying cell types and processes by which genetic variants influence disease. We analyzed 1.6 million scRNA-seq profiles from 209 individuals spanning 11 tissue types and 6 disease conditions, and constructed gene programs capturing cell types, disease progression in cell types, and cellular processes both within and across cell types. We evaluated these gene programs for disease enrichment by transforming them to SNP annotations with tissue-specific epigenomic maps and computing enrichment scores across 60 diseases and complex traits (average N=297K). The inferred disease enrichments recapitulated known biology and highlighted novel relationships for different conditions, including GABAergic neurons in major depressive disorder (MDD), disease progression programs in M cells in ulcerative colitis, and a disease-specific complement cascade process in multiple sclerosis. Our framework provides a powerful approach for identifying the cell types and cellular processes by which genetic variants influence disease.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Marion Patxot ◽  
Daniel Trejo Banos ◽  
Athanasios Kousathanas ◽  
Etienne J. Orliac ◽  
Sven E. Ojavee ◽  
...  

Abstract: We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12–25% is attributed to coding regions, 32–44% to introns, and 22–28% to distal 10–500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.


2020 ◽  
Author(s):  
C Prince ◽  
R. E Mitchell ◽  
T. G. Richardson

Abstract

Background: Developing functional understanding into the causal molecular drivers of immunological disease is a critical challenge in genomic medicine. Here we systematically apply Mendelian randomization (MR), genetic colocalization, immune cell-type enrichment and phenome-wide association methods to investigate the effect of genetically predicted gene expression on 12 autoimmune and 4 cancer outcomes.

Results: Using whole blood derived estimates for regulatory variants from the eQTLGen consortium (n = 31,684) we constructed genetic risk scores (r² < 0.1) for 10,104 genes. Applying the inverse-variance weighted Mendelian randomization method transcriptome-wide whilst accounting for linkage disequilibrium structure identified 773 unique genes with evidence of a genetically predicted effect on at least one disease outcome (P < 4.81 × 10⁻⁵). We next undertook genetic colocalization to investigate whether these effects may be confined to specific cell-types using gene expression data derived from 18 types of immune cells. This highlighted many cell-type dependent effects, such as PRKCQ expression and asthma risk (posterior probability of association (PPA) = 0.998), which was T-cell specific, as well as TPM3 expression and prostate cancer risk (PPA = 0.821), which was restricted to monocytes. Phenome-wide analyses on 320 complex traits allowed us to explore the shared genetic architecture and prioritize key drivers of disease risk, such as CASP10 which provided evidence of an effect on 7 cancer-related outcomes. Similarly, these evaluations of pervasive pleiotropy may be valuable for evaluations of therapeutic targets to help identify potential adverse effects.

Conclusions: Our atlas of results can be used to characterize known and novel loci in autoimmune disease and cancer susceptibility, both in terms of developing insight into cell-type dependent effects as well as dissecting shared genetic architecture and disease pathways. As exemplar, we have highlighted several key findings in this study, although similar evaluations can be conducted interactively at http://mrcieu.mrsoftware.org/immuno_MR/.
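For readers unfamiliar with the inverse-variance weighted (IVW) Mendelian randomization estimate mentioned above, the following is a minimal sketch of the standard fixed-effect form. It assumes independent genetic instruments, so it does not include the linkage disequilibrium adjustment the abstract describes. The function name and the summary statistics are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence, Tuple


def ivw_estimate(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Fixed-effect IVW causal estimate and its standard error.

    Assumes uncorrelated variants (no LD adjustment).
    estimate = sum(bx * by / se^2) / sum(bx^2 / se^2)
    std_err  = sqrt(1 / sum(bx^2 / se^2))
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if len(beta_exposure) == 0:
        raise ValueError("at least one variant is required")
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2  # inverse variance of the outcome association
        numerator += weight * bx * by
        denominator += weight * bx ** 2
    if denominator == 0.0:
        raise ValueError("all exposure effects are zero")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Made-up summary statistics for two variants acting on one gene's expression.
estimate, std_err = ivw_estimate(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[0.05, 0.1],
    se_outcome=[0.01, 0.01],
)
print(f"IVW estimate: {estimate:.3f} (SE {std_err:.4f})")
```

Each variant's ratio estimate (outcome effect divided by exposure effect) is 0.5 in this toy input, so the weighted combination is 0.5 as well; with real data the variants disagree and the weights decide how much each one counts.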


Nature ◽  
2017 ◽  
Vol 550 (7675) ◽  
pp. 239-243 ◽  
Author(s):  
Xin Li ◽  
Yungil Kim ◽  
Emily K. Tsang ◽  
Joe R. Davis ◽  
...  

Abstract: Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk [1,2,3,4]. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants [1,5]. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles [1,6,7], but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues [8,9,10,11], but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release [12]. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Xiang Zhu ◽  
Zhana Duren ◽  
Wing Hung Wong

Abstract: Genome-wide association studies (GWAS) have cataloged many significant associations between genetic variants and complex traits. However, most of these findings have unclear biological significance, because they often have small effects and occur in non-coding regions. Integration of GWAS with gene regulatory networks addresses both issues by aggregating weak genetic signals within regulatory programs. Here we develop a Bayesian framework that integrates GWAS summary statistics with regulatory networks to infer genetic enrichments and associations simultaneously. Our method improves upon existing approaches by explicitly modeling network topology to assess enrichments, and by automatically leveraging enrichments to identify associations. Applying this method to 18 human traits and 38 regulatory networks shows that genetic signals of complex traits are often enriched in interconnections specific to trait-relevant cell types or tissues. Prioritizing variants within enriched networks identifies known and previously undescribed trait-associated genes, revealing biological and therapeutic insights.

