Integration of rare large-effect expression variants improves polygenic risk prediction

Author(s):  
Craig Smail ◽  
Nicole M. Ferraro ◽  
Matthew G. Durrant ◽  
Abhiram S. Rao ◽  
Matthew Aguirre ◽  
...  

Summary: Polygenic risk scores (PRS) aim to quantify the contribution of multiple genetic loci to an individual’s likelihood of a complex trait or disease. However, existing PRS estimate genetic liability using common genetic variants, excluding the impact of rare variants. We identified rare, large-effect variants in individuals with outlier gene expression from the GTEx project and then assessed their impact on PRS predictions in the UK Biobank (UKB). We observed large deviations from the PRS-predicted phenotypes for carriers of multiple outlier rare variants; for example, individuals classified as “low-risk” but in the top 1% of outlier rare variant burden had a 6-fold higher rate of severe obesity. We replicated these findings using data from the NHLBI Trans-Omics for Precision Medicine (TOPMed) biobank and the Million Veteran Program, and demonstrated that PRS across multiple traits will significantly benefit from the inclusion of rare genetic variants.

2020 ◽  
pp. 1-16 ◽  
Author(s):  
Jessye M. Maxwell ◽  
Richard A. Russell ◽  
Hei Man Wu ◽  
Natasha Sharapova ◽  
Peter Banthorpe ◽  
...  

Abstract During the past decade, genetics research has allowed scientists and clinicians to explore the human genome in detail and reveal many thousands of common genetic variants associated with disease. Genetic risk scores, known as polygenic risk scores (PRSs), aggregate risk information from the most important genetic variants into a single score that describes an individual’s genetic predisposition to a given disease. This article reviews recent developments in the predictive utility of PRSs in relation to a person’s susceptibility to breast cancer and coronary artery disease. Prognostic models for these disorders are built using data from the UK Biobank, controlling for typical clinical and underwriting risk factors. Furthermore, we explore the possibility of adverse selection where genetic information about multifactorial disorders is available for insurance purchasers but not for underwriters. We demonstrate that prediction of multifactorial diseases, using PRSs, provides population risk information additional to that captured by normal underwriting risk factors. This research using the UK Biobank is in the public interest as it contributes to our understanding of predicting risk of disease in the population. Further research is imperative to understand how PRSs could cause adverse selection if consumers use this information to alter their insurance purchasing behaviour.
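To illustrate how a PRS aggregates variants into a single score, the following minimal Python sketch computes a weighted sum of risk-allele counts. The variant IDs, effect weights, and genotypes are made-up placeholders rather than values from the article, and the simple additive form is an assumption about the usual construction, not the authors' specific model.

```python
def polygenic_risk_score(genotypes, weights):
    """Return the weighted sum of risk-allele counts.

    genotypes: dict mapping variant ID -> risk-allele count (0, 1, or 2)
    weights:   dict mapping variant ID -> per-allele effect size (e.g. a log odds ratio)
    Variants absent from `genotypes` are treated as zero copies (an assumption).
    """
    return sum(w * genotypes.get(variant, 0) for variant, w in weights.items())


# Hypothetical effect sizes and one hypothetical individual.
weights = {"rs_A": 0.25, "rs_B": -0.25, "rs_C": 0.125}
person = {"rs_A": 2, "rs_B": 1, "rs_C": 0}

score = polygenic_risk_score(person, weights)
print(score)
```

In practice the raw score is usually standardized against a reference population before individuals are compared; that step is omitted here.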


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. 1528-1528
Author(s):  
Heena Desai ◽  
Anh Le ◽  
Ryan Hausler ◽  
Shefali Verma ◽  
Anurag Verma ◽  
...  

1528 Background: Rare genetic variants associated with cancer, when identified, have a tremendous impact on reducing cancer morbidity and mortality; however, rare variants are found in less than 5% of cancer patients. Genome-wide association studies (GWAS) have identified hundreds of common genetic variants significantly associated with a number of cancers, but the clinical utility of individual variants or a polygenic risk score (PRS) derived from multiple variants is still unclear. Methods: We tested the ability of PRS models developed from genome-wide significant variants to differentiate cases versus controls in the Penn Medicine Biobank. Cases for 15 different cancers and cancer-free controls were identified using electronic health record billing codes for 11,524 European American and 5,994 African American individuals from the Penn Medicine Biobank. Results: The discriminatory ability of the 15 PRS models to distinguish their respective cancer cases versus controls ranged from 0.68-0.79 in European Americans and 0.74-0.93 in African Americans. Seven of the 15 cancer PRS trended towards an association with their cancer at p<0.05 (Table), and PRS for prostate, thyroid and melanoma were significantly associated with their cancers at a Bonferroni-corrected p<0.003 with OR 1.3-1.6 in European Americans. Conclusions: Our data demonstrate that common variants with significant associations from GWAS can distinguish cancer cases versus controls for some cancers in an unselected biobank population. Given the small effects, future studies are needed to determine how best to incorporate PRS with other risk factors in the precision prediction of cancer risk. [Table: see text]
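The abstract does not name the statistic behind "discriminatory ability". Assuming it refers to the area under the ROC curve (AUC), the sketch below shows one way such a value can be computed: as the probability that a randomly chosen case has a higher PRS than a randomly chosen control. The scores are invented for illustration.

```python
def auc(case_scores, control_scores):
    """Fraction of case/control pairs in which the case scores higher (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical PRS values for three cases and four controls.
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.4]
print(auc(cases, controls))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls.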


2021 ◽  
Author(s):  
Yosuke Tanigawa ◽  
Junyang Qian ◽  
Guhan Ram Venkataraman ◽  
Johanne M. Justesen ◽  
Ruilin Li ◽  
...  

We present a systematic assessment of polygenic risk score (PRS) prediction across more than 1,600 traits using genetic and phenotype data in the UK Biobank. We report 428 sparse PRS models with significant (p < 2.5e-5) incremental predictive performance when compared against the covariate-only model that considers age, sex, and the genotype principal components. We report a significant correlation between the number of genetic variants selected in the sparse PRS model and the incremental predictive performance in quantitative traits (Spearman's ρ = 0.54, p = 1.4e-15), but not in binary traits (ρ = 0.059, p = 0.35). The sparse PRS models trained on European individuals showed limited transferability when evaluated on non-European individuals in the UK Biobank. We provide the PRS model weights on the Global Biobank Engine (https://biobankengine.stanford.edu/prs).


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Rafik Tadros ◽  
Catherine Francis ◽  
Xiao Xu ◽  
Alexa M Vermeer ◽  
Andrew R Harper ◽  
...  

Introduction: Hypertrophic (HCM) and dilated (DCM) cardiomyopathies are leading causes of sudden death and heart failure requiring transplantation in young individuals. While some cases have a monogenic underlying cause, the majority remain unexplained. Objective: To better understand the contribution of common genetic variants in susceptibility and severity of cardiomyopathy. Methods: We conducted three genome-wide association studies (GWAS) and multi-trait analyses in European-ancestry individuals, including HCM (1,733 cases) and DCM (5,521 cases) meta-analyses, and a GWAS of 9 left ventricular (LV) traits in 19,260 healthy participants from the UK Biobank who underwent cardiac magnetic resonance imaging. We investigated genetic correlations between LV traits, HCM and DCM using LD score regression. We used two-sample Mendelian randomization (MR) to assess the causal relationship of increased LV contractility with HCM risk. Lastly, we derived a polygenic risk score and assessed whether it modulates maximal LV wall thickness (maxLVWT) and clinical events in 368 sarcomeric mutation carriers, using linear and Cox mixed effects models, respectively. Results: We identified 16 genetic loci (15 novel) associated with HCM, 13 loci (7 novel) associated with DCM, and 23 loci associated with LV traits. We showed strong genetic correlations between LV volumes and contractility traits in the general population and cardiomyopathies, with opposing effects in HCM and DCM. Using MR, we demonstrated a causal association linking increased LV contractility with HCM risk and estimated that each unit (1%) increase in LV ejection fraction increases the risk of HCM by 37% (95% CI 10%-69%, P=0.004). Lastly, a polygenic risk score (PRS_HCM) derived from the HCM GWAS was associated with maxLVWT (P=0.0001) and clinical events (P=0.009) in carriers of HCM-causing rare variants.
Conclusion: Our findings highlight the contribution of common genetic variants in susceptibility for HCM and DCM, and in severity in sarcomeric mutation carriers. Our data also point to increased LV contractility as an important mechanism of HCM independently of sarcomere activating rare variants, and highlight the potential clinical relevance of PRS for risk stratification in HCM.


2019 ◽  
Author(s):  
Hilda Bjork Danielsdottir ◽  
Juulia Jylhävä ◽  
Sara Hägg ◽  
Yi Lu ◽  
Lucía Colodro-Conde ◽  
...  

Abstract. Objective: Neuroticism is associated with poor health outcomes, but its contribution to the accumulation of health deficits in old age, i.e. frailty, is largely unknown. We aimed to explore associations between neuroticism and frailty cross-sectionally and over up to 29 years, and to investigate the contribution of shared genetic influences. Method: Data were derived from the UK Biobank (UKB, n=502,631), the Australian Over 50’s Study (AO50, n=3,011) and the Swedish Twin Registry (SALT n=23,744, SATSA n=1,637). Associations between neuroticism and the Frailty Index were investigated using regression analysis cross-sectionally in UKB, AO50 and SATSA, and longitudinally in SALT (25-29y follow-up) and SATSA (6 and 23y follow-up). The co-twin control method was applied to explore the contribution of underlying shared familial factors (SALT, SATSA, AO50). Genome-wide polygenic risk scores for neuroticism in all samples were used to further assess whether common genetic variants associated with neuroticism predict frailty. Results: High neuroticism was consistently associated with greater frailty cross-sectionally (adjusted β, 95% confidence intervals in UKB= 0.32, 0.32-0.33; AO50= 0.35, 0.31-0.39; SATSA= 0.33, 0.27-0.39) and longitudinally up to 29 years (SALT= 0.24, 0.22-0.25; SATSA 6y= 0.31, 0.24-0.38; SATSA 23y= 0.16, 0.07-0.25). When controlling for underlying shared genetic and environmental factors, the neuroticism-frailty association remained significant, although decreased. Polygenic risk scores for neuroticism significantly predicted frailty in the two larger samples (meta-analyzed total β= 0.06, 0.05-0.06). Conclusion: High neuroticism is associated with the development and course of frailty. Both environmental and genetic influences, including neuroticism-associated genetic variants, contribute to this relationship.


Author(s):  
Kristina Rehbach ◽  
Hanwen Zhang ◽  
Debamitra Das ◽  
Sara Abdollahi ◽  
Tim Prorok ◽  
...  

Abstract: Schizophrenia (SZ) is a common and debilitating psychiatric disorder with limited effective treatment options. Although highly heritable, risk for this polygenic disorder depends on the complex interplay of hundreds of common and rare variants. Translating the growing list of genetic loci significantly associated with disease into medically actionable information remains an important challenge. Thus, establishing platforms with which to validate the impact of risk variants in cell-type-specific and donor-dependent contexts is critical. Towards this, we selected and characterized a collection of twelve human induced pluripotent stem cell (hiPSC) lines derived from control donors with extremely low and high SZ polygenic risk scores (PRS). These hiPSC lines are publicly available at the California Institute for Regenerative Medicine (CIRM). The suitability of these extreme PRS hiPSCs for CRISPR-based isogenic comparisons of neurons and glia was evaluated across three independent laboratories, identifying 9 out of 12 meeting our criteria. We report a standardized resource of publicly available hiPSCs, with which we collectively commit to conducting future CRISPR-engineering, in order to facilitate comparison and integration of functional validation studies across the field of psychiatric genetics.


2019 ◽  
Vol 71 (6) ◽  
pp. 925-934 ◽  
Author(s):  
George Hindy ◽  
Kristina E. Åkesson ◽  
Olle Melander ◽  
Krishna G. Aragam ◽  
Mary E. Haas ◽  
...  

2020 ◽  
Author(s):  
Kristina Rehbach Dobrinth ◽  
Hanwen Zhang ◽  
Debamitra Das ◽  
Sara Abdollahi ◽  
Tim Prorok ◽  
...  

Schizophrenia (SZ) is a common and debilitating psychiatric disorder with limited effective treatment options. Although highly heritable, risk for this polygenic disorder depends on the complex interplay of hundreds of common and rare variants. Translating the growing list of genetic loci significantly associated with disease into medically actionable information remains an important challenge. Thus, establishing platforms with which to validate the impact of risk variants in cell-type-specific and donor-dependent contexts is critical. Towards this, we selected and characterized a collection of twelve human induced pluripotent stem cell (hiPSC) lines derived from control donors with extremely low and high SZ polygenic risk scores (PRS). These hiPSC lines are publicly available at the California Institute for Regenerative Medicine (CIRM). The suitability of these extreme PRS hiPSCs for CRISPR-based isogenic comparisons of neurons and glia was evaluated across three independent laboratories, identifying 9 out of 12 meeting our criteria. We report a standardized resource of publicly available hiPSCs, with which we collectively commit to conducting future CRISPR-engineering, in order to facilitate comparison and integration of functional validation studies across the field of psychiatric genetics.


Nature ◽  
2017 ◽  
Vol 550 (7675) ◽  
pp. 239-243 ◽  
Author(s):  
Xin Li ◽  
Yungil Kim ◽  
Emily K. Tsang ◽  
Joe R. Davis ◽  
...  

Abstract Rare genetic variants are abundant in humans and are expected to contribute to individual disease risk [1-4]. While genetic association studies have successfully identified common genetic variants associated with susceptibility, these studies are not practical for identifying rare variants [1,5]. Efforts to distinguish pathogenic variants from benign rare variants have leveraged the genetic code to identify deleterious protein-coding alleles [1,6,7], but no analogous code exists for non-coding variants. Therefore, ascertaining which rare variants have phenotypic effects remains a major challenge. Rare non-coding variants have been associated with extreme gene expression in studies using single tissues [8-11], but their effects across tissues are unknown. Here we identify gene expression outliers, or individuals showing extreme expression levels for a particular gene, across 44 human tissues by using combined analyses of whole genomes and multi-tissue RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project v6p release [12]. We find that 58% of underexpression and 28% of overexpression outliers have nearby conserved rare variants compared to 8% of non-outliers. Additionally, we developed RIVER (RNA-informed variant effect on regulation), a Bayesian statistical model that incorporates expression data to predict a regulatory effect for rare variants with higher accuracy than models using genomic annotations alone. Overall, we demonstrate that rare variants contribute to large gene expression changes across tissues and provide an integrative method for interpretation of rare variants in individual genomes.
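The abstract defines an expression outlier as an individual with extreme expression of a particular gene but does not state the statistic used. As a rough sketch of one plausible approach, the Python below standardizes a gene's expression within each tissue and flags individuals whose median z-score across tissues is extreme. The median z-score summary, the cutoff of 2, and all data values are assumptions made for illustration; this is not RIVER and not necessarily the authors' procedure.

```python
from statistics import mean, median, stdev


def median_z_outliers(expression, threshold=2.0):
    """Flag individuals whose median z-score across tissues is extreme.

    expression: dict mapping tissue -> dict mapping individual ID -> expression of one gene
    threshold:  illustrative cutoff on |median z| (an assumption, not from the abstract)
    Returns a dict mapping each flagged individual to their median z-score.
    """
    z_by_individual = {}
    for values in expression.values():
        xs = list(values.values())
        mu, sd = mean(xs), stdev(xs)
        for individual, x in values.items():
            # z-score of this individual within the current tissue
            z_by_individual.setdefault(individual, []).append((x - mu) / sd)
    medians = {ind: median(zs) for ind, zs in z_by_individual.items()}
    return {ind: m for ind, m in medians.items() if abs(m) >= threshold}


# Hypothetical expression of one gene in eight individuals across three tissues.
ids = ["ind%d" % i for i in range(1, 9)]
expression = {
    "tissue_1": dict(zip(ids, [10, 11, 9, 10, 12, 8, 10, 30])),
    "tissue_2": dict(zip(ids, [5, 6, 5, 4, 6, 5, 4, 15])),
    "tissue_3": dict(zip(ids, [20, 22, 19, 21, 20, 18, 20, 21])),
}

outliers = median_z_outliers(expression)
print(outliers)
```

Here the last individual is overexpressed in two of the three tissues, so the median still exceeds the cutoff even though the third tissue looks unremarkable.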


2018 ◽  
Author(s):  
Simon Haworth ◽  
Ruth Mitchell ◽  
Laura Corbin ◽  
Kaitlin H Wade ◽  
Tom Dudding ◽  
...  

Introductory paragraph: The inclusion of genetic data in large studies has enabled the discovery of genetic contributions to complex traits and their application in applied analyses including those using genetic risk scores (GRS) for the prediction of phenotypic variance. If genotypes show structure by location and coincident structure exists for the trait of interest, analyses can be biased. Having illustrated structure in an apparently homogeneous collection, we aimed to a) test for geographical stratification of genotypes in UK Biobank and b) assess whether stratification might induce bias in genetic association analysis. We found that single genetic variants are associated with birth location within UK Biobank and that geographic structure in genetic data could not be accounted for using routine adjustment for study centre and principal components (PCs) derived from genotype data. We found that GRS for complex traits do appear geographically structured and analysis using GRS can yield biased associations. We discuss the likely origins of these observations and potential implications for analysis within large-scale population-based genetic studies.

