Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort

2022 ◽  
Vol 109 (1) ◽  
pp. 12-23
Author(s):  
Florian Privé ◽  
Hugues Aschard ◽  
Shai Carmi ◽  
Lasse Folkersen ◽  
Clive Hoggart ◽  
...  
2020 ◽  
Author(s):  
John E. McGeary ◽  
Chelsie Benca-Bachman ◽  
Victoria Risner ◽  
Christopher G Beevers ◽  
Brandon Gibb ◽  
...  

Twin studies indicate that 30-40% of the disease liability for depression can be attributed to genetic differences. Here, we assess the explanatory ability of polygenic scores (PGS) based on broad- (PGSBD) and clinical- (PGSMDD) depression summary statistics from the UK Biobank using independent cohorts of adults (N=210; 100% European Ancestry) and children (N=728; 70% European Ancestry) who have been extensively phenotyped for depression and related neurocognitive phenotypes. PGS associations with depression severity and diagnosis were generally modest, and larger in adults than children. Polygenic prediction of depression-related phenotypes was mixed and varied by PGS. Higher PGSBD, in adults, was associated with a higher likelihood of having suicidal ideation, increased brooding and anhedonia, and lower levels of cognitive reappraisal; PGSMDD was positively associated with brooding and negatively related to cognitive reappraisal. Overall, PGS based on both broad and clinical depression phenotypes have modest utility in adult and child samples of depression.


2018 ◽  
Author(s):  
Timothy Shin Heng Mak ◽  
Robert Milan Porsch ◽  
Shing Wan Choi ◽  
Pak Chung Sham

Polygenic scores (PGS) are estimated scores representing the genetic tendency of an individual for a disease or trait and have become an indispensable tool in a variety of analyses. Typically they are a linear combination of the genotypes of a large number of SNPs, with the weights calculated from an external source, such as summary statistics from large meta-analyses. Recently cohorts with genetic data have become very large, such that it would be a waste if the raw data were not made use of in constructing PGS. Making use of raw data in calculating PGS, however, presents us with problems of overfitting. Here we discuss the essence of overfitting as applied in PGS calculations and highlight the difference between overfitting due to the overlap between the target and the discovery data (OTD), and overfitting due to the overlap between the target and the validation data (OTV). We propose two methods, cross prediction and split validation, to overcome OTD and OTV respectively. Using these two methods, PGS can be calculated using raw data without overfitting. We show that PGSs thus calculated have better predictive power than those using summary statistics alone for six phenotypes in the UK Biobank data.
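
The idea of a PGS as a weighted sum of genotypes, and of scoring each individual only with weights estimated from other individuals, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the weights are plain per-SNP regression slopes chosen for simplicity, and the fold-based loop is only an assumed reading of what "cross prediction" involves.

```python
import random


def marginal_weights(genotypes, phenotypes):
    """Per-SNP least-squares slope of phenotype on genotype.

    Assumption: a simple stand-in for GWAS-style effect estimates.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    mean_y = sum(phenotypes) / n
    weights = []
    for j in range(n_snps):
        col = [row[j] for row in genotypes]
        mean_g = sum(col) / n
        var_g = sum((g - mean_g) ** 2 for g in col)
        cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(col, phenotypes))
        # A SNP with no variation in the training set gets zero weight.
        weights.append(cov / var_g if var_g > 0 else 0.0)
    return weights


def polygenic_score(genotype_row, weights):
    """PGS for one individual: linear combination of allele counts."""
    return sum(w * g for w, g in zip(weights, genotype_row))


def cross_prediction(genotypes, phenotypes, n_folds=2, seed=0):
    """Score each individual with weights estimated from the other folds only."""
    n = len(genotypes)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[k::n_folds] for k in range(n_folds)]
    scores = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        w = marginal_weights(
            [genotypes[i] for i in train], [phenotypes[i] for i in train]
        )
        for i in held_out:
            scores[i] = polygenic_score(genotypes[i], w)
    return scores
```

Because no individual contributes to the weights used for its own score, the overlap between target and discovery data is avoided by construction in this sketch; any tuning of the scores would still need separate data.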


2019 ◽  
Author(s):  
Hakhamanesh Mostafavi ◽  
Arbel Harpak ◽  
Dalton Conley ◽  
Jonathan K Pritchard ◽  
Molly Przeworski

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group, the prediction accuracy of polygenic scores depends on characteristics such as the age or sex composition of the individuals in which the GWAS and the prediction were conducted, and on the GWAS study design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.


2019 ◽  
Author(s):  
Rosa Cheesman ◽  
Avina Hunjan ◽  
Jonathan R. I. Coleman ◽  
Yasmin Ahmadzadeh ◽  
Robert Plomin ◽  
...  

Individual-level polygenic scores can now explain ∼10% of the variation in number of years of completed education. However, associations between polygenic scores and education capture not only genetic propensity but information about the environment that individuals are exposed to. This is because individuals passively inherit effects of parental genotypes, since their parents typically also provide the rearing environment. In other words, the strong correlation between offspring and parent genotypes results in an association between the offspring genotypes and the rearing environment. This is termed passive gene-environment correlation. We present an approach to test for the extent of passive gene-environment correlation for education without requiring intergenerational data. Specifically, we use information from 6311 individuals in the UK Biobank who were adopted in childhood to compare genetic influence on education between adoptees and non-adopted individuals. Adoptees’ rearing environments are less correlated with their genotypes, because they do not share genes with their adoptive parents. We find that polygenic scores are twice as predictive of years of education in non-adopted individuals compared to adoptees (R² = 0.074 vs 0.037, difference test p = 8.23 × 10⁻²⁴). We provide another kind of evidence for the influence of parental behaviour on offspring education: individuals in the lowest decile of education polygenic score attain significantly more education if they are adopted, possibly due to educationally supportive adoptive environments. Overall, these results suggest that genetic influences on education are mediated via the home environment. As such, polygenic prediction of educational attainment represents gene-environment correlations just as much as it represents direct genetic effects.
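
The "twice as predictive" comparison rests on the variance in education explained by the polygenic score within each group. A minimal sketch of that quantity follows, assuming a plain squared Pearson correlation with no covariate adjustment (the study's actual model may differ) and using a hypothetical function name.

```python
def r_squared(scores, outcomes):
    """Squared Pearson correlation between polygenic scores and an outcome."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_o = sum(outcomes) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(scores, outcomes))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_o = sum((o - mean_o) ** 2 for o in outcomes)
    return cov * cov / (var_s * var_o)


# Ratio of the two R² values reported in the abstract
# (non-adopted vs adopted individuals).
reported_ratio = 0.074 / 0.037
```

Applying `r_squared` separately to non-adopted and adopted individuals and taking the ratio gives the kind of comparison summarised above; with the reported values the ratio is about 2.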


2017 ◽  
Author(s):  
Jeremy J. Berg ◽  
Xinjun Zhang ◽  
Graham Coop

Our understanding of the genetic basis of human adaptation is biased toward loci of large phenotypic effect. Genome-wide association studies (GWAS) now enable the study of genetic adaptation in polygenic phenotypes. We test for polygenic adaptation among 187 world-wide human populations using polygenic scores constructed from GWAS of 34 complex traits. We identify signals of polygenic adaptation for anthropometric traits including height, infant head circumference (IHC), hip circumference and waist-to-hip ratio (WHR). Analysis of ancient DNA samples indicates that a north-south cline of height within Europe and a west-east cline across Eurasia can be traced to selection for increased height in two late Pleistocene hunter-gatherer populations living in western and west-central Eurasia. Our observation that IHC and WHR follow a latitudinal cline in Western Eurasia supports the role of natural selection driving Bergmann’s Rule in humans, consistent with thermoregulatory adaptation in response to latitudinal temperature variation.

Author’s Note on Failure to Replicate

After this preprint was posted, the UK Biobank dataset was released, providing a new and open GWAS resource. When attempting to replicate the height selection results from this preprint using GWAS data from the UK Biobank, we discovered that we could not. In subsequent analyses, we determined that both the GIANT consortium height GWAS data, as well as another dataset that was used for replication, were impacted by stratification issues that created or at a minimum substantially inflated the height selection signals reported here. The results of this second investigation, written together with additional coauthors, have now been published (https://elifesciences.org/articles/39725), along with another paper by a separate group of authors showing similar issues (https://elifesciences.org/articles/39702).
A preliminary investigation shows that the other non-height based results may suffer from similar issues. We stand by the theory and statistical methods reported in this paper, and the paper can be cited for these results. However, we have shown that the data on which the major empirical results were based are not sound, and so should be treated with caution until replicated.


2020 ◽  
Vol 31 (5) ◽  
pp. 582-591 ◽  
Author(s):  
Rosa Cheesman ◽  
Avina Hunjan ◽  
Jonathan R. I. Coleman ◽  
Yasmin Ahmadzadeh ◽  
Robert Plomin ◽  
...  

Polygenic scores now explain approximately 10% of the variation in educational attainment. However, they capture not only genetic propensity but also information about the family environment. This is because of passive gene–environment correlation, whereby the correlation between offspring and parent genotypes results in an association between offspring genotypes and the rearing environment. We measured passive gene–environment correlation using information on 6,311 adoptees in the UK Biobank. Adoptees’ genotypes were less correlated with their rearing environments because they did not share genes with their adoptive parents. We found that polygenic scores were twice as predictive of years of education in nonadopted individuals compared with adoptees (R²s = .074 vs. .037, p = 8.23 × 10⁻²⁴). Individuals in the lowest decile of polygenic scores for education attained significantly more education if they were adopted, possibly because of educationally supportive adoptive environments. Overall, these results suggest that genetic influences on education are mediated via the home environment.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Hakhamanesh Mostafavi ◽  
Arbel Harpak ◽  
Ipsita Agarwal ◽  
Dalton Conley ◽  
Jonathan K Pritchard ◽  
...  

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.


2019 ◽  
Author(s):  
Adrián I. Campos ◽  
Luis M. García-Marín ◽  
Enda M. Byrne ◽  
Nicholas G. Martin ◽  
Gabriel Cuéllar-Partida ◽  
...  

We conducted the largest study of snoring using data from the UK Biobank (n ∼ 408,000; snorers ∼ 152,000). A genome-wide association study (GWAS) identified 42 genome-wide significant loci, with a SNP-based heritability estimate of ∼10% on the liability scale. Genetic correlations with body mass index, alcohol intake, smoking, schizophrenia, anorexia nervosa and neuroticism were observed. Gene-based associations identified 173 genes, including DLEU7, MSRB3 and POC5, highlighting genes expressed in brain, cerebellum, lungs, blood, and oesophagus tissues. We used polygenic scores (PGS) to predict recent snoring and probable obstructive sleep apnoea (OSA) in an independent Australian sample (n ∼ 8,000). Mendelian randomisation analyses provided evidence that larger whole body fat mass causes snoring. Altogether, our results uncover new insights into the aetiology of snoring as a complex sleep-related trait and its role in health and disease beyond being a cardinal symptom of OSA.

