The power of pathway-based polygenic risk scores

Author(s):  
Paul O’Reilly ◽  
Shing Choi ◽  
Judit Garcia-Gonzalez ◽  
Yunfeng Ruan ◽  
Hei Man Wu ◽  
...  

Abstract Polygenic risk scores (PRSs) have been among the leading advances in biomedicine in recent years. As a proxy of genetic liability, PRSs are utilised across multiple fields and applications. While numerous statistical and machine learning methods have been developed to optimise their predictive accuracy, all of these distil genetic liability to a single number based on aggregation of an individual’s genome-wide alleles. This results in a key loss of information about an individual’s genetic profile, which could be critical given the functional sub-structure of the genome and the heterogeneity of complex disease. Here we evaluate the performance of pathway-based PRSs, in which polygenic scores are calculated across genomic pathways for each individual, and we introduce a software, PRSet, for computing and analysing pathway PRSs. We find that pathway PRSs have similar power for evaluating pathway enrichment of GWAS signal as the leading methods, with the distinct advantage of providing estimates of pathway genetic liability at the individual-level. Exemplifying their utility, we demonstrate that pathway PRSs can stratify diseases into subtypes in the UK Biobank with substantially greater power than genome-wide PRSs. Compared to genome-wide PRSs, we expect pathway-based PRSs to offer greater insights into the heterogeneity of complex disease and treatment response, generate more biologically tractable therapeutic targets, and provide a more powerful path to precision medicine.
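
The contrast between a genome-wide score and per-pathway scores can be illustrated with a minimal sketch. It is not PRSet's implementation: it assumes variants have already been mapped to pathways and that per-allele effect sizes are available, and it omits clumping, thresholding and standardisation. The SNP identifiers, effect sizes and pathway names are hypothetical.

```python
from typing import Dict, Set


def genome_wide_prs(dosages: Dict[str, int], effects: Dict[str, float]) -> float:
    """Collapse all variants into a single number (the usual PRS)."""
    return sum(dosages[snp] * effects[snp] for snp in dosages if snp in effects)


def pathway_prs(
    dosages: Dict[str, int],
    effects: Dict[str, float],
    pathways: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Compute one score per pathway, using only variants mapped to it."""
    scores = {}
    for name, snps in pathways.items():
        # Skip variants that lack a genotype or an effect-size estimate.
        usable = [s for s in snps if s in dosages and s in effects]
        scores[name] = sum(dosages[s] * effects[s] for s in usable)
    return scores


# Hypothetical data for one individual (allele counts 0, 1 or 2).
dosages = {"rs1": 2, "rs2": 0, "rs3": 1, "rs4": 1}
effects = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20, "rs4": -0.30}
pathways = {"lipid": {"rs1", "rs3"}, "immune": {"rs2", "rs4"}}

print(genome_wide_prs(dosages, effects))
print(pathway_prs(dosages, effects, pathways))
```

In this toy case the genome-wide score is close to zero because a positive contribution from one pathway is offset by a negative contribution from another, which is the kind of information a single aggregate number cannot show.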

2020 ◽  
Author(s):  
Hannah Wand ◽  
Samuel A. Lambert ◽  
Cecelia Tamburro ◽  
Michael A. Iacocca ◽  
Jack W. O’Sullivan ◽  
...  

Summary In recent years, polygenic risk scores (PRS) have become an increasingly studied tool to capture the genome-wide liability underlying many human traits and diseases, hoping to better inform an individual’s genetic risk. However, a lack of adherence to previous reporting standards has hindered the translation of this important tool into clinical and public health practice with the heterogeneous underreporting of details necessary for benchmarking and reproducibility. To address this gap, the ClinGen Complex Disease Working Group and Polygenic Score (PGS) Catalog have collaborated to develop the 33-item Polygenic Risk Score Reporting Statement (PRS-RS). This framework provides the minimal information expected of authors to promote the internal validity, transparency, and reproducibility of PRS by requiring authors to detail the study population, statistical methods, and clinical utility of a published score. The widespread adoption of this framework will encourage rigorous methodological consideration and facilitate benchmarking to ensure high quality scores are translated into the clinic.


2018 ◽  
Author(s):  
Tom G. Richardson ◽  
Sean Harrison ◽  
Gibran Hemani ◽  
George Davey Smith

Abstract The age of large-scale genome-wide association studies (GWAS) has provided us with an unprecedented opportunity to evaluate the genetic liability of complex disease using polygenic risk scores (PRS). In this study, we have analysed 162 PRS (P<5×10⁻⁰⁵) derived from GWAS and 551 heritable traits from the UK Biobank study (N=334,398). Findings can be investigated using a web application (http://mrcieu.mrsoftware.org/PRS_atlas/), which we envisage will help uncover both known and novel mechanisms which contribute towards disease susceptibility. To demonstrate this, we have investigated the results from a phenome-wide evaluation of schizophrenia genetic liability. Amongst findings were inverse associations with measures of cognitive function, for which extensive follow-up analyses using Mendelian randomization (MR) provided evidence of a causal relationship. We have also investigated the effect of multiple risk factors on disease using mediation and multivariable MR frameworks. Our atlas provides a resource for future endeavours seeking to unravel the causal determinants of complex disease.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jiangming Sun ◽  
Yunpeng Wang ◽  
Lasse Folkersen ◽  
Yan Borné ◽  
Inge Amlien ◽  
...  

Abstract A promise of genomics in precision medicine is to provide individualized genetic risk predictions. Polygenic risk scores (PRS), computed by aggregating effects from many genomic variants, have been developed as a useful tool in complex disease research. However, the application of PRS as a tool for predicting an individual’s disease susceptibility in a clinical setting is challenging because PRS typically provide a relative measure of risk evaluated at the level of a group of people but not at individual level. Here, we introduce a machine-learning technique, Mondrian Cross-Conformal Prediction (MCCP), to estimate the confidence bounds of PRS-to-disease-risk prediction. MCCP can report disease status conditional probability value for each individual and give a prediction at a desired error level. Moreover, with a user-defined prediction error rate, MCCP can estimate the proportion of sample (coverage) with a correct prediction.


2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Amanda Ly ◽  
Beate Leppert ◽  
Dheeraj Rai ◽  
Hannah Jones ◽  
Christina Dardani ◽  
...  

Abstract Higher prevalence of autism in offspring born to mothers with rheumatoid arthritis has been reported in observational studies. We investigated (a) the associations between maternal and offspring’s own genetic liability for rheumatoid arthritis and autism-related outcomes in the offspring using polygenic risk scores (PRS) and (b) whether the effects were causal using Mendelian randomization (MR). Using the latest genome-wide association (GWAS) summary data on rheumatoid arthritis and individual-level data from the Avon Longitudinal Study of Parents and Children, United Kingdom, we constructed PRSs for maternal and offspring genetic liability for rheumatoid arthritis (single-nucleotide polymorphism [SNP] p-value threshold 0.05). We investigated associations with autism, and autistic traits: social and communication difficulties, coherence, repetitive behaviours and sociability. We used modified Poisson regression with robust standard errors. In two-sample MR analyses, we used 40 genome-wide significant SNPs for rheumatoid arthritis and investigated the causal effects on risk for autism, in 18,381 cases and 27,969 controls of the Psychiatric Genetics Consortium and iPSYCH. Sample size ranged from 4992 to 7849 in PRS analyses. We found little evidence of associations between rheumatoid arthritis PRSs and autism-related phenotypes in the offspring (maternal PRS on autism: RR 0.89, 95%CI 0.73–1.07, p = 0.21; offspring’s own PRS on autism: RR 1.11, 95%CI 0.88–1.39, p = 0.39). MR results provided little evidence for a causal effect (IVW OR 1.01, 95%CI 0.98–1.04, p = 0.56). There was little evidence for associations between genetic liability for rheumatoid arthritis on autism-related outcomes in offspring. Lifetime risk for rheumatoid arthritis has no causal effects on autism.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Tom G Richardson ◽  
Sean Harrison ◽  
Gibran Hemani ◽  
George Davey Smith

The age of large-scale genome-wide association studies (GWAS) has provided us with an unprecedented opportunity to evaluate the genetic liability of complex disease using polygenic risk scores (PRS). In this study, we have analysed 162 PRS (p<5×10⁻⁰⁵) derived from GWAS and 551 heritable traits from the UK Biobank study (N = 334,398). Findings can be investigated using a web application (http://mrcieu.mrsoftware.org/PRS_atlas/), which we envisage will help uncover both known and novel mechanisms which contribute towards disease susceptibility. To demonstrate this, we have investigated the results from a phenome-wide evaluation of schizophrenia genetic liability. Amongst findings were inverse associations with measures of cognitive function, for which extensive follow-up analyses using Mendelian randomization (MR) provided evidence of a causal relationship. We have also investigated the effect of multiple risk factors on disease using mediation and multivariable MR frameworks. Our atlas provides a resource for future endeavours seeking to unravel the causal determinants of complex disease.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Jessica K. Dennis ◽  
Julia M. Sealock ◽  
Peter Straub ◽  
Younga H. Lee ◽  
Donald Hucks ◽  
...  

Abstract Background Clinical laboratory (lab) tests are used in clinical practice to diagnose, treat, and monitor disease conditions. Test results are stored in electronic health records (EHRs), and a growing number of EHRs are linked to patient DNA, offering unprecedented opportunities to query relationships between genetic risk for complex disease and quantitative physiological measurements collected on large populations. Methods A total of 3075 quantitative lab tests were extracted from Vanderbilt University Medical Center’s (VUMC) EHR system and cleaned for population-level analysis according to our QualityLab protocol. Lab values extracted from BioVU were compared with previous population studies using heritability and genetic correlation analyses. We then tested the hypothesis that polygenic risk scores for biomarkers and complex disease are associated with biomarkers of disease extracted from the EHR. In a proof-of-concept analysis, we focused on lipids and coronary artery disease (CAD). We cleaned lab traits extracted from the EHR, performed lab-wide association scans (LabWAS) of the lipids and CAD polygenic risk scores across 315 heritable lab tests, then replicated the pipeline and analyses in the Massachusetts General Brigham Biobank (MGBB). Results Heritability estimates of lipid values (after cleaning with QualityLab) were comparable to previous reports and polygenic scores for lipids were strongly associated with their referent lipid in a LabWAS. LabWAS of the polygenic score for CAD recapitulated canonical heart disease biomarker profiles including decreased HDL, increased pre-medication LDL, triglycerides, blood glucose, and glycated hemoglobin (HgbA1C) in European and African descent populations. Notably, many of these associations remained even after adjusting for the presence of cardiovascular disease and were replicated in the MGBB.
Conclusions Polygenic risk scores can be used to identify biomarkers of complex disease in large-scale EHR-based genomic analyses, providing new avenues for discovery of novel biomarkers and deeper understanding of disease trajectories in pre-symptomatic individuals. We present two methods and associated software, QualityLab and LabWAS, to clean and analyze EHR labs at scale and perform a Lab-Wide Association Scan.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kit K. Elam ◽  
Thao Ha ◽  
Zoe Neale ◽  
Fazil Aliev ◽  
Danielle Dick ◽  
...  

Abstract Genetic effects on alcohol use can vary over time but are often examined using longitudinal models that predict a distal outcome at a single time point. The vast majority of these studies predominately examine effects using White, European American (EA) samples or examine the etiology of genetic variants identified from EA samples in other racial/ethnic populations, leading to inconclusive findings about genetic effects on alcohol use. The current study examined how genetic influences on alcohol use varied by age across a 15 year period within a diverse ethnic/racial sample of adolescents. Using a multi-ethnic approach, polygenic risk scores were created for African American (AA, n = 192) and EA samples (n = 271) based on racially/ethnically aligned genome wide association studies. Age-varying associations between polygenic scores and alcohol use were examined from age 16 to 30 using time-varying effect models separately for AA and EA samples. Polygenic risk for alcohol use was found to be associated with alcohol use from age 22–27 in the AA sample and from age 24.50 to 29 in the EA sample. Results are discussed relative to the intersection of alcohol use and developmental genetic effects in diverse populations.


2015 ◽  
Author(s):  
Bjarni Vilhjalmsson ◽  
Jian Yang ◽  
Hilary Kiyo Finucane ◽  
Alexander Gusev ◽  
Sara Lindstrom ◽  
...  

Polygenic risk scores have shown great promise in predicting complex disease risk, and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves LD-pruning markers and applying a P-value threshold to association statistics, but this discards information and may reduce predictive accuracy. We introduce a new method, LDpred, which infers the posterior mean causal effect size of each marker using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the pruning/thresholding approach, particularly at large sample sizes. Accordingly, prediction R² increased from 20.1% to 25.3% in a large schizophrenia data set and from 9.8% to 12.0% in a large multiple sclerosis data set. A similar relative improvement in accuracy was observed for three additional large disease data sets and when predicting in non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
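
The thresholding half of the standard approach described above can be sketched in a few lines. This is only an illustration of P-value thresholding, not of LDpred: it assumes the markers passed in have already been LD-pruned, and the marker names, effect sizes and P-values are hypothetical.

```python
from typing import Dict, Tuple


def threshold_prs(
    dosages: Dict[str, int],
    sumstats: Dict[str, Tuple[float, float]],
    p_threshold: float = 5e-8,
) -> float:
    """Sum effect * allele count over markers whose P-value passes the threshold.

    sumstats maps marker -> (effect size, association P-value).
    Markers are assumed to be LD-pruned already (an assumption of this sketch).
    """
    score = 0.0
    for marker, (beta, p_value) in sumstats.items():
        if p_value < p_threshold and marker in dosages:
            score += beta * dosages[marker]
    return score


# Hypothetical summary statistics and one individual's allele counts.
sumstats = {"rsA": (0.5, 1e-9), "rsB": (0.25, 0.01), "rsC": (-0.5, 1e-3)}
dosages = {"rsA": 2, "rsB": 1, "rsC": 1}

strict = threshold_prs(dosages, sumstats, p_threshold=5e-8)
lenient = threshold_prs(dosages, sumstats, p_threshold=0.05)
print(strict, lenient)
```

Markers that fail the threshold contribute nothing to the score, which is the information loss the abstract refers to; a stricter threshold simply drops more of them.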


2018 ◽  
Author(s):  
Roman Teo Oliynyk

Abstract Background Genome-wide association studies and other computational biology techniques are gradually discovering the causal gene variants that contribute to late-onset human diseases. After more than a decade of genome-wide association study efforts, these can account for only a fraction of the heritability implied by familial studies, the so-called “missing heritability” problem. Methods Computer simulations of polygenic late-onset diseases in an aging population have quantified the risk allele frequency decrease at older ages caused by individuals with higher polygenic risk scores becoming ill proportionately earlier. This effect is most prominent for diseases characterized by high cumulative incidence and high heritability, examples of which include Alzheimer’s disease, coronary artery disease, cerebral stroke, and type 2 diabetes. Results The incidence rate for late-onset diseases grows exponentially for decades after early onset ages, guaranteeing that the cohorts used for genome-wide association studies overrepresent older individuals with lower polygenic risk scores, whose disease cases are disproportionately due to environmental causes such as old age itself. This mechanism explains the decline in clinical predictive power with age and the lower discovery power of familial studies of heritability and genome-wide association studies. It also explains the relatively constant-with-age heritability found for late-onset diseases of lower prevalence, exemplified by cancers. Conclusions For late-onset polygenic diseases showing high cumulative incidence together with high initial heritability, rather than using relatively old age-matched cohorts, study cohorts combining the youngest possible cases with the oldest possible controls may significantly improve the discovery power of genome-wide association studies.

