Decision letter: Variable prediction accuracy of polygenic scores within an ancestry group

2019 ◽  
Author(s):  
Paul O'Reilly
2019 ◽  
Author(s):  
Hakhamanesh Mostafavi ◽  
Arbel Harpak ◽  
Dalton Conley ◽  
Jonathan K Pritchard ◽  
Molly Przeworski

Abstract: Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group, the prediction accuracy of polygenic scores depends on characteristics such as the age or sex composition of the individuals in which the GWAS and the prediction were conducted, and on the GWAS study design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Hakhamanesh Mostafavi ◽  
Arbel Harpak ◽  
Ipsita Agarwal ◽  
Dalton Conley ◽  
Jonathan K Pritchard ◽  
...  

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
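To make the notion of group-dependent prediction accuracy concrete, the following is a minimal simulated sketch, not the authors' analysis. It assumes accuracy is measured as the squared correlation (R²) between score and phenotype, and it uses two hypothetical strata (`low_noise`, `high_noise`) in which the score has the same effect on the phenotype but the non-genetic variance differs. All names and parameter values are illustrative assumptions.

```python
import random


def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def r2_by_stratum(scores, phenotypes, strata):
    """Split individuals by stratum label and return {label: R^2}."""
    groups = {}
    for s, p, g in zip(scores, phenotypes, strata):
        xs, ys = groups.setdefault(g, ([], []))
        xs.append(s)
        ys.append(p)
    return {g: pearson_r2(xs, ys) for g, (xs, ys) in groups.items()}


def simulate(n_per_group=2000, seed=0):
    """Simulate phenotype = score + noise, with stratum-specific noise SD.

    The strata and noise levels are hypothetical; they stand in for any
    characteristic (e.g. age or sex) that changes the non-genetic variance.
    """
    rng = random.Random(seed)
    noise_sd = {"low_noise": 1.0, "high_noise": 3.0}
    scores, phenos, strata = [], [], []
    for label, sd in noise_sd.items():
        for _ in range(n_per_group):
            s = rng.gauss(0, 1)
            scores.append(s)
            phenos.append(s + rng.gauss(0, sd))  # same score effect in both strata
            strata.append(label)
    return scores, phenos, strata


scores, phenos, strata = simulate()
result = r2_by_stratum(scores, phenos, strata)
for label, r2 in result.items():
    print(f"{label}: R^2 = {r2:.3f}")
```

Under these assumptions the expected R² is about 0.5 in the low-noise stratum and about 0.1 in the high-noise stratum, even though the score's effect is identical in both. This is only one of several mechanisms that could produce within-ancestry differences in accuracy.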



2016 ◽  
Author(s):  
Timothy Shin Heng Mak ◽  
Robert Milan Porsch ◽  
Shing Wan Choi ◽  
Xueya Zhou ◽  
Pak Chung Sham

Abstract: Polygenic scores (PGS) summarize the genetic contribution of a person’s genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of possible ways of calculating polygenic scores have been proposed, and recently there has been much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD) in summary statistics, a pertinent question is how we can make use of LD information available elsewhere to supplement such analyses. To answer this question, we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method (pseudovalidation) for choosing the value of the tuning parameter in the absence of validation data. In our simulations, we showed that pseudovalidation often resulted in prediction accuracy comparable to that obtained using a dataset with validation phenotypes, and was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. We also showed that lassosum achieved better prediction accuracy than simple clumping and p-value thresholding in almost all scenarios. It was also substantially faster and more accurate than the recently proposed LDpred.
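The idea of penalized regression on summary statistics plus a reference LD matrix can be sketched as follows. This is a simplified illustration, not the lassosum software or its exact objective: it assumes standardized genotypes, takes a vector `r` of marginal SNP-phenotype correlations (from summary statistics) and an LD correlation matrix `R` (from a reference panel), and minimizes β'Rβ − 2r'β + 2λ‖β‖₁ by coordinate descent. The function names and the toy numbers are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_from_summary(r, R, lam, n_iter=200, tol=1e-10):
    """Coordinate descent for: minimize b'Rb - 2 r'b + 2*lam*|b|_1.

    r   : marginal SNP-phenotype correlations (summary statistics)
    R   : LD correlation matrix from a reference panel (list of lists)
    lam : lasso tuning parameter
    """
    p = len(r)
    beta = [0.0] * p
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            # Correlation left for SNP j after removing what its LD partners explain
            partial = r[j] - sum(R[j][k] * beta[k] for k in range(p) if k != j)
            new_value = soft_threshold(partial, lam) / R[j][j]
            max_change = max(max_change, abs(new_value - beta[j]))
            beta[j] = new_value
        if max_change < tol:
            break
    return beta


def polygenic_score(genotypes, weights):
    """Weighted sum of (standardized) genotypes for one individual."""
    return sum(g * w for g, w in zip(genotypes, weights))


# Toy example: SNPs 0 and 1 are in strong LD (0.8); SNP 2 is independent.
R = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
r = [0.30, 0.24, 0.02]
beta = lasso_from_summary(r, R, lam=0.05)
print(beta)
```

In this toy input, the marginal signal at SNP 1 is fully accounted for by its LD with SNP 0, so the penalized fit assigns it zero weight; the weak signal at SNP 2 falls below the penalty and is also dropped. With an identity LD matrix the procedure reduces to soft-thresholding the marginal correlations.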


2021 ◽  
Author(s):  
Etienne J Orliac ◽  
Daniel Trejo Banos ◽  
Sven Erik Ojavee ◽  
Kristi Läll ◽  
Reedik Mägi ◽  
...  

Across 21 heritable traits in the UK and Estonian Biobank data, a Bayesian grouped mixture of regressions model (GMRM) obtains the highest genomic prediction accuracy reported to date, 15% (SD 10%) greater than a baseline model without MAF-LD-annotation groups, and 106% (SD 50%) greater than polygenic scores estimated by mixed-linear model association (MLMA). Prediction accuracy was up to 13% (mean 4%, SD 3%) higher than theoretical expectations, at 76% of the SNP heritability (h²SNP) for height (R² of 47%) and over 50% of the h²SNP for 12 traits. Using these predictors in MLMA increased the number of independent GWAS loci detected from 16,899 with standard approaches to 18,837 with GMRM, an 11.5% increase. Modelling genetic associations while accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for large-scale, individual-level biobank analyses and is facilitated by our scalable, highly parallel, open-source GMRM software.
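The headline figures in this abstract can be checked with simple arithmetic. The short sketch below recomputes the relative increase in detected loci and back-calculates the SNP heritability for height implied by the two rounded percentages. The implied heritability is an inference from rounded numbers, not a value reported in the abstract, and the variable names are illustrative.

```python
# Relative increase in independent GWAS loci (figures from the abstract)
loci_standard = 16_899
loci_gmrm = 18_837
relative_increase = (loci_gmrm - loci_standard) / loci_standard

# If R^2 = 0.47 corresponds to 76% of the SNP heritability for height,
# the implied SNP heritability is R^2 / 0.76 (approximate; inputs are rounded).
r2_height = 0.47
fraction_of_h2 = 0.76
implied_h2_snp = r2_height / fraction_of_h2

print(f"relative increase in loci: {relative_increase:.1%}")
print(f"implied SNP heritability for height: {implied_h2_snp:.2f}")
```

The first quantity matches the stated 11.5% increase; the second suggests a SNP heritability for height of roughly 0.62 under these rounded inputs.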


2019 ◽  
Author(s):  
Luigi A. Maglanoc ◽  
Tobias Kaufmann ◽  
Dennis van der Meer ◽  
Andre F. Marquand ◽  
Thomas Wolfers ◽  
...  

Abstract: Cognitive abilities and mental disorders are complex traits sharing a largely unknown neuronal basis and aetiology. Their genetic architectures are highly polygenic and overlapping, which is supported by heterogeneous phenotypic expression and substantial clinical overlap. Brain network analysis provides a non-invasive means of dissecting biological heterogeneity, yet its sensitivity, specificity and validity in clinical applications remain a major challenge. We used machine learning on static and dynamic temporal synchronization between all brain network nodes in 10,343 healthy individuals from the UK Biobank to predict (i) cognitive and mental health traits and (ii) their genetic underpinnings. We predicted age and sex to serve as our reference point. The traits of interest included individual-level educational attainment and fluid intelligence (cognitive) and dimensional measures of depression, anxiety, and neuroticism (mental health). We predicted polygenic scores for educational attainment, fluid intelligence, depression, anxiety, and different neuroticism traits, in addition to schizophrenia. Beyond high accuracy for age and sex, permutation tests revealed above chance-level prediction accuracy for educational attainment and fluid intelligence. Educational attainment and fluid intelligence were mainly negatively associated with static brain connectivity in frontal and default mode networks, whereas age showed positive correlations with a more widespread pattern. In comparison, prediction accuracy for polygenic scores was at chance level across traits, which may serve as a benchmark for future studies aiming to link genetic factors and fMRI-based brain connectomics.

Significance: Although cognitive abilities and susceptibility to mental disorders reflect individual differences in brain function, neuroimaging is yet to provide a coherent account of the neuronal underpinnings. Here, we aimed to map the brain functional connectome of (i) cognitive and mental health traits and (ii) their polygenic architecture in a large population-based sample. We discovered high prediction accuracy for age and sex, and above-chance accuracy for educational attainment and intelligence (cognitive). In contrast, accuracies for dimensional measures of depression, anxiety and neuroticism (mental health), and polygenic scores across traits, were at chance level. These findings support the link between cognitive abilities and brain connectomics and provide a reference for studies mapping the brain connectomics of mental disorders and their genetic architectures.
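A permutation test of whether prediction accuracy exceeds chance can be sketched as below. This is a generic illustration rather than the authors' pipeline: it assumes accuracy is summarized by the Pearson correlation between predicted and observed values, builds the null distribution by shuffling the observed values, and uses simulated data. Function names, sample size and the number of permutations are assumptions.

```python
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(predicted, observed, n_perm=1000, seed=0):
    """One-sided permutation test for above-chance prediction accuracy.

    Returns (observed correlation, p-value). The p-value includes the
    usual +1 correction so it can never be exactly zero.
    """
    rng = random.Random(seed)
    observed_stat = pearson_r(predicted, observed)
    shuffled = list(observed)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between prediction and outcome
        if pearson_r(predicted, shuffled) >= observed_stat:
            exceed += 1
    return observed_stat, (exceed + 1) / (n_perm + 1)


# Simulated example: one informative predictor and one uninformative predictor
rng = random.Random(1)
truth = [rng.gauss(0, 1) for _ in range(200)]
informative = [t + rng.gauss(0, 1) for t in truth]
uninformative = [rng.gauss(0, 1) for _ in truth]

r_inf, p_inf = permutation_p_value(informative, truth)
r_null, p_null = permutation_p_value(uninformative, truth)
print(f"informative:   r = {r_inf:.2f}, p = {p_inf:.4f}")
print(f"uninformative: r = {r_null:.2f}, p = {p_null:.4f}")
```

With an informative predictor, essentially no shuffled dataset reaches the observed correlation, so the p-value sits near its floor of 1/(n_perm + 1). For a predictor unrelated to the outcome, the p-value is approximately uniformly distributed between 0 and 1.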


2009 ◽  
Author(s):  
Benjamin Scheibehenne ◽  
Andreas Wilke ◽  
Peter M. Todd