Polygenic score accuracy in ancient samples: quantifying the effects of allelic turnover

2021 ◽  
Author(s):  
Maryn O. Carlson ◽  
Daniel P. Rice ◽  
Jeremy J. Berg ◽  
Matthias Steinrücken

Abstract. Polygenic scores link the genotypes of ancient individuals to their phenotypes, which are often unobservable, offering a tantalizing opportunity to reconstruct complex trait evolution. In practice, however, interpretation of ancient polygenic scores is subject to numerous assumptions. For one, the genome-wide association (GWA) studies from which polygenic scores are derived can only estimate effect sizes for loci segregating in contemporary populations. Therefore, a GWA study may not correctly identify all loci relevant to trait variation in the ancient population. In addition, the frequencies of trait-associated loci may have changed in the intervening years. Here, we devise a theoretical framework to quantify the effect of this allelic turnover on the statistical properties of polygenic scores as functions of population genetic dynamics, trait architecture, power to detect significant loci, and the age of the ancient sample. We model the allele frequencies of loci underlying trait variation using the Wright-Fisher diffusion, and employ the spectral representation of its transition density to find analytical expressions for several error metrics, including the correlation between an ancient individual’s polygenic score and true phenotype, referred to as polygenic score accuracy. Our theory also applies to a two-population scenario and demonstrates that allelic turnover alone may explain a substantial percentage of the reduced accuracy observed in cross-population predictions, akin to those performed in human genetics. Finally, we use simulations to explore the effects of recent directional selection, a bias-inducing process, on the statistics of interest. We find that even in the presence of bias, weak selection induces minimal deviations from our neutral expectations for the decay of polygenic score accuracy. By quantifying the limitations of polygenic scores in an explicit evolutionary context, our work lays the foundation for the development of more sophisticated statistical procedures to analyze both temporally and geographically resolved polygenic scores.

2017 ◽  
Author(s):  
Amit V. Khera ◽  
Mark Chaffin ◽  
Krishna G. Aragam ◽  
Connor A. Emdin ◽  
Derek Klarin ◽  
...  

Abstract. Identification of individuals at increased genetic risk for a complex disorder such as coronary disease can facilitate treatments or enhanced screening strategies. A rare monogenic mutation associated with increased cholesterol is present in ~1:250 carriers and confers an up to 4-fold increase in coronary risk when compared with non-carriers. Although individual common polymorphisms have modest predictive capacity, their cumulative impact can be aggregated into a polygenic score. Here, we develop a new, genome-wide polygenic score that aggregates information from 6.6 million common polymorphisms and show that this score can similarly identify individuals with a 4-fold increased risk for coronary disease. In >400,000 participants from UK Biobank, the score conforms to a normal distribution and those in the top 2.5% of the distribution are at 4-fold increased risk compared to the remaining 97.5%. Similar patterns are observed with genome-wide polygenic scores for two additional diseases: breast cancer and severe obesity. One Sentence Summary: A genome-wide polygenic score identifies 2.5% of the population born with a 4-fold increased risk for coronary artery disease.
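The aggregation this abstract describes, where many small per-variant effects are combined into one score and individuals in the upper tail are flagged, can be illustrated with a toy sketch. This is not the authors' pipeline: the function names `polygenic_score` and `top_fraction_mask` are hypothetical, the effect sizes are assumed to be already estimated by some GWAS-based method, and genotypes are assumed to be coded as allele dosages (0, 1 or 2 copies of the effect allele).

```python
import numpy as np


def polygenic_score(dosages, effect_sizes):
    """Return one score per individual as a weighted sum of allele dosages.

    dosages: array of shape (n_individuals, n_variants), entries 0, 1 or 2.
    effect_sizes: array of shape (n_variants,), per-allele weights
                  (assumed to come from an external GWAS; not estimated here).
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.shape[-1] != effect_sizes.shape[0]:
        raise ValueError("dosages and effect_sizes disagree on variant count")
    return dosages @ effect_sizes


def top_fraction_mask(scores, fraction=0.025):
    """Boolean mask marking individuals at or above the (1 - fraction) quantile."""
    scores = np.asarray(scores, dtype=float)
    cutoff = np.quantile(scores, 1.0 - fraction)
    return scores >= cutoff


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Simulated data only: 1,000 individuals, 500 variants, small random weights.
    genotypes = rng.integers(0, 3, size=(1000, 500))
    weights = rng.normal(0.0, 0.05, size=500)
    scores = polygenic_score(genotypes, weights)
    flagged = top_fraction_mask(scores, fraction=0.025)
    print(f"{flagged.sum()} of {scores.size} individuals fall in the top 2.5%")
```

The 2.5% default mirrors the tail fraction quoted in the abstract. Whether that tail carries elevated risk depends entirely on the quality of the weights, which this sketch does not address.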


2019 ◽  
Author(s):  
Hakhamanesh Mostafavi ◽  
Arbel Harpak ◽  
Dalton Conley ◽  
Jonathan K Pritchard ◽  
Molly Przeworski

Abstract. Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group, the prediction accuracy of polygenic scores depends on characteristics such as the age or sex composition of the individuals in which the GWAS and the prediction were conducted, and on the GWAS study design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.


2017 ◽  
Author(s):  
Amy E. Taylor ◽  
Hannah J. Jones ◽  
Hannah Sallis ◽  
Jack Euesden ◽  
Evie Stergiakouli ◽  
...  

Abstract. Background: It is often assumed that selection (including participation and dropout) does not represent an important source of bias in genetic studies. However, there is little evidence to date on the effect of genetic factors on participation. Methods: Using data on mothers (N=7,486) and children (N=7,508) from the Avon Longitudinal Study of Parents and Children, we 1) examined the association of polygenic risk scores for a range of socio-demographic, lifestyle characteristics and health conditions related to continued participation, 2) investigated whether associations of polygenic scores with body mass index (BMI; derived from self-reported weight and height) and self-reported smoking differed in the largest sample with genetic data and a sub-sample who participated in a recent follow-up and 3) determined the proportion of variation in participation explained by common genetic variants using genome-wide data. Results: We found evidence that polygenic scores for higher education, agreeableness and openness were associated with higher participation and polygenic scores for smoking initiation, higher BMI, neuroticism, schizophrenia, ADHD and depression were associated with lower participation. Associations between the polygenic score for education and self-reported smoking differed between the largest sample with genetic data (OR for ever smoking per SD increase in polygenic score: 0.85, 95% CI: 0.81, 0.89) and sub-sample (OR: 0.95, 95% CI: 0.88, 1.02). In genome-wide analysis, single nucleotide polymorphism based heritability explained 17-31% of variability in participation. Conclusions: Genetic association studies, including Mendelian randomization, can be biased by selection, including loss to follow-up. Genetic risk for dropout should be considered in all analyses of studies with selective participation.


Author(s):  
Florian Privé ◽  
Julyan Arbel ◽  
Bjarni J. Vilhjálmsson

Abstract. Polygenic scores have become a central tool in human genetics research. LDpred is a popular method for deriving polygenic scores based on summary statistics and a matrix of correlation between genetic variants. However, LDpred has limitations that may reduce its predictive performance. Here we present LDpred2, a new version of LDpred that addresses these issues. We also provide two new options in LDpred2: a “sparse” option that can learn effects that are exactly 0, and an “auto” option that directly learns the two LDpred parameters from data. We benchmark predictive performance of LDpred2 against the previous version on simulated and real data, demonstrating substantial improvements in robustness and predictive accuracy compared to LDpred1. We then show that LDpred2 also outperforms other recently developed polygenic score methods, with a mean AUC over the 8 real traits analyzed here of 65.1%, compared to 63.8% for lassosum, 62.9% for PRS-CS and 61.5% for SBayesR. Note that, in contrast to what was recommended in the first version of this paper, we now recommend running LDpred2 genome-wide instead of per chromosome. LDpred2 is implemented in R package bigsnpr.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Hakhamanesh Mostafavi ◽  
Arbel Harpak ◽  
Ipsita Agarwal ◽  
Dalton Conley ◽  
Jonathan K Pritchard ◽  
...  

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.


2017 ◽  
Author(s):  
Anna R. Docherty ◽  
Andrey A. Shabalin ◽  
Emily DiBlasi ◽  
Eric Monson ◽  
Niamh Mullins ◽  
...  

Abstract. Objective: Suicide death is a highly preventable, yet growing, worldwide health crisis. To date, there has been a lack of adequately powered genomic studies of suicide, with no sizeable suicide death cohorts available for study. To address this limitation, we conducted the first comprehensive genomic analysis of suicide death, using a previously unpublished suicide cohort. Methods: The analysis sample consisted of 3,413 population-ascertained cases of European ancestry and 14,810 ancestrally matched controls. Analytical methods included principal components analysis for ancestral matching and adjusting for population stratification, linear mixed model genome-wide association testing (conditional on genetic relatedness matrix), gene and gene set enrichment testing, polygenic score analyses, as well as SNP heritability and genetic correlation estimation using LD score regression. Results: GWAS identified two genome-wide significant loci (6 SNPs, p < 5×10⁻⁸). Gene-based analyses implicated 19 genes on chromosomes 13, 15, 16, 17, and 19 (q < 0.05). Suicide heritability was estimated at h² = 0.2463 (SE = 0.0356) using summary statistics from a multivariate logistic GWAS adjusting for ancestry. Notably, suicide polygenic scores were robustly predictive of out-of-sample suicide death, as were polygenic scores for several other psychiatric disorders and psychological traits, particularly behavioral disinhibition and major depressive disorder. Conclusions: In this report, we identify multiple genome-wide significant loci/genes, and demonstrate robust polygenic score prediction of suicide death case-control status, adjusting for ancestry, in independent training and test sets. Additionally, we report that suicide death cases have increased genetic risk for behavioral disinhibition, major depression, autism spectrum disorder, psychosis, and alcohol use disorder relative to controls. Results demonstrate the ability of polygenic scores to robustly, and multidimensionally, predict suicide death case-control status.


2018 ◽  
Author(s):  
Urmo Võsa ◽  
Annique Claringbould ◽  
Harm-Jan Westra ◽  
Marc Jan Bonder ◽  
Patrick Deelen ◽  
...  

Summary. While many disease-associated variants have been identified through genome-wide association studies, their downstream molecular consequences remain unclear. To identify these effects, we performed cis- and trans-expression quantitative trait locus (eQTL) analysis in blood from 31,684 individuals through the eQTLGen Consortium. We observed that cis-eQTLs can be detected for 88% of the studied genes, but that they have a different genetic architecture compared to disease-associated variants, limiting our ability to use cis-eQTLs to pinpoint causal genes within susceptibility loci. In contrast, trans-eQTLs (detected for 37% of 10,317 studied trait-associated variants) were more informative. Multiple unlinked variants, associated to the same complex trait, often converged on trans-genes that are known to play central roles in disease etiology. We observed the same when ascertaining the effect of polygenic scores calculated for 1,263 genome-wide association study (GWAS) traits. Expression levels of 13% of the studied genes correlated with polygenic scores, and many resulting genes are known to drive these traits.


Author(s):  
Florian Privé ◽  
Julyan Arbel ◽  
Bjarni J Vilhjálmsson

Abstract. Motivation: Polygenic scores have become a central tool in human genetics research. LDpred is a popular method for deriving polygenic scores based on summary statistics and a matrix of correlation between genetic variants. However, LDpred has limitations that may reduce its predictive performance. Results: Here, we present LDpred2, a new version of LDpred that addresses these issues. We also provide two new options in LDpred2: a ‘sparse’ option that can learn effects that are exactly 0, and an ‘auto’ option that directly learns the two LDpred parameters from data. We benchmark predictive performance of LDpred2 against the previous version on simulated and real data, demonstrating substantial improvements in robustness and predictive accuracy compared to LDpred1. We then show that LDpred2 also outperforms other polygenic score methods recently developed, with a mean AUC over the 8 real traits analyzed here of 65.1%, compared to 63.8% for lassosum, 62.9% for PRS-CS and 61.5% for SBayesR. Note that LDpred2 provides more accurate polygenic scores when run genome-wide, instead of per chromosome. Availability and implementation: LDpred2 is implemented in R package bigsnpr. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Reut Avinun ◽  
Adam Nevo ◽  
Annchen R. Knodt ◽  
Maxwell L. Elliott ◽  
Ahmad R. Hariri

Abstract. Accumulating research suggests that the pro-inflammatory cytokine interleukin-1β (IL-1β) has a modulatory effect on the hippocampus, a brain structure important for learning and memory as well as linked with both psychiatric and neurodegenerative disorders. Here, we use an imaging genetics strategy to test an association between an IL-1β polygenic score, derived from summary statistics of a recent genome-wide association study (GWAS) of circulating cytokines, and hippocampal volume, in two independent samples. In the first sample of 512 non-Hispanic Caucasian university students (274 women, mean age 19.78 ± 1.24 years) from the Duke Neurogenetics Study, we identified a significant positive correlation between higher polygenic scores, which presumably reflect higher circulating IL-1β levels, and average hippocampal volume. This positive association was successfully replicated in a second sample of 7,960 white British volunteers (4,158 women, mean age 62.63 ± 7.45 years) from the UK Biobank. Collectively, our results suggest that a functional GWAS-derived score of IL-1β blood circulating levels affects hippocampal volume, and lend further support, in humans, to the link between IL-1β and the structure of the hippocampus.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Avina K. Hunjan ◽  
Christopher Hübel ◽  
Yuhao Lin ◽  
Thalia C. Eley ◽  
Gerome Breen

Abstract. Despite the observed associations between psychiatric disorders and nutrient intake, genetic studies are limited. We examined whether polygenic scores for psychiatric disorders are associated with nutrient intake in UK Biobank (N = 163,619) using linear mixed models. We found polygenic scores for attention-deficit/hyperactivity disorder, bipolar disorder, and schizophrenia showed the highest number of associations, while a polygenic score for autism spectrum disorder showed no association. The relatively weaker obsessive-compulsive disorder polygenic score showed the greatest effect sizes, suggesting its association with diet traits may become more apparent with larger genome-wide analyses. A higher alcohol dependence polygenic score was associated with higher alcohol intake and individuals with higher persistent thinness polygenic scores reported their food to weigh less, both independent of socioeconomic status. Our findings suggest that polygenic propensity for a psychiatric disorder is associated with dietary behaviour. Note, nutrient intake was self-reported and findings must therefore be interpreted mindfully.

