Polygenic scores for height in admixed populations

Author(s):  
Bárbara D. Bitarello ◽  
Iain Mathieson

Polygenic risk scores (PRS) use the results of genome-wide association studies (GWAS) to predict quantitative phenotypes or disease risk at an individual level. This provides a potential route to the use of genetic data in personalized medical care. However, a major barrier to the use of PRS is that the majority of GWAS come from cohorts of European ancestry. The predictive power of PRS constructed from these studies is substantially lower in non-European ancestry cohorts, although the reasons for this are unclear. To address this question, we investigate the performance of PRS for height in cohorts with admixed African and European ancestry, allowing us to evaluate ancestry-related differences in PRS predictive accuracy while controlling for environment and cohort differences. We first show that the predictive accuracy of height PRS increases linearly with European ancestry and is largely explained by European ancestry segments of the admixed genomes. We show that differences in allele frequencies, recombination rate, and marginal effect sizes across ancestries all contribute to the decrease in predictive power, but none of these effects explains the decrease on its own. Finally, we demonstrate that prediction for admixed individuals can be improved by using a linear combination of PRS that includes ancestry-specific effect sizes, although this approach is at present limited by the small size of non-European ancestry discovery cohorts.

2020 ◽  
Vol 10 (11) ◽  
pp. 4027-4036 ◽  
Author(s):  
Bárbara D. Bitarello ◽  
Iain Mathieson

Polygenic risk scores (PRS) use the results of genome-wide association studies (GWAS) to predict quantitative phenotypes or disease risk at an individual level, and provide a potential route to the use of genetic data in personalized medical care. However, a major barrier to the use of PRS is that the majority of GWAS come from cohorts of European ancestry. The predictive power of PRS constructed from these studies is substantially lower in non-European ancestry cohorts, although the reasons for this are unclear. To address this question, we investigate the performance of PRS for height in cohorts with admixed African and European ancestry, allowing us to evaluate ancestry-related differences in PRS predictive accuracy while controlling for environment and cohort differences. We first show that the predictive accuracy of height PRS increases linearly with European ancestry and is partially explained by European ancestry segments of the admixed genomes. We show that recombination rate, differences in allele frequencies, and differences in marginal effect sizes across ancestries all contribute to the decrease in predictive power, but none of these effects explains the decrease on its own. Finally, we demonstrate that prediction for admixed individuals can be improved by using a linear combination of PRS that includes ancestry-specific effect sizes, although this approach is at present limited by the small size of non-European ancestry discovery cohorts.
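
The idea of a linear combination of PRS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the standardization step, the single mixing weight `alpha`, and the grid search are all assumptions made for illustration. It treats a PRS as a weighted sum of allele dosages and mixes two scores built from different (for example, ancestry-specific) effect-size estimates.

```python
import numpy as np


def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (individuals x variants) by per-variant effects."""
    return np.asarray(dosages, dtype=float) @ np.asarray(effect_sizes, dtype=float)


def combined_score(prs_a, prs_b, alpha):
    """Hypothetical mix: alpha * z(prs_a) + (1 - alpha) * z(prs_b), z = standardized."""
    za = (prs_a - prs_a.mean()) / prs_a.std()
    zb = (prs_b - prs_b.mean()) / prs_b.std()
    return alpha * za + (1.0 - alpha) * zb


def best_alpha(prs_a, prs_b, phenotype, grid=None):
    """Grid-search the mixing weight that maximizes squared correlation with the phenotype."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 21)  # assumed grid; any resolution works
    r2 = [
        np.corrcoef(combined_score(prs_a, prs_b, a), phenotype)[0, 1] ** 2
        for a in grid
    ]
    best = int(np.argmax(r2))
    return float(grid[best]), float(r2[best])
```

In practice the mixing weight would be chosen in a sample held out from the one used to report accuracy; selecting and evaluating it on the same individuals overstates the gain.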


2021 ◽  
Author(s):  
Roshni A. Patel ◽  
Shaila A. Musharoff ◽  
Jeffrey P. Spence ◽  
Harold Pimentel ◽  
Catherine Tcheandjieu ◽  
...  

Despite the growing number of genome-wide association studies (GWAS) for complex traits, it remains unclear whether effect sizes of causal genetic variants differ between populations. In principle, effect sizes of causal variants could differ between populations due to gene-by-gene or gene-by-environment interactions. However, comparing causal variant effect sizes is challenging: it is difficult to know which variants are causal, and comparisons of variant effect sizes are confounded by differences in linkage disequilibrium (LD) structure between ancestries. Here, we develop a method to assess causal variant effect size differences that overcomes these limitations. Specifically, we leverage the fact that segments of European ancestry shared between European-American and admixed African-American individuals have similar LD structure, allowing for unbiased comparisons of variant effect sizes in European ancestry segments. We apply our method to two types of traits: gene expression and low-density lipoprotein cholesterol (LDL-C). We find that causal variant effect sizes for gene expression are significantly different between European-Americans and African-Americans; for LDL-C, we observe a similar point estimate although this is not significant, likely due to lower statistical power. Cross-population differences in variant effect sizes highlight the role of genetic interactions in trait architecture and will contribute to the poor portability of polygenic scores across populations, reinforcing the importance of conducting GWAS on individuals of diverse ancestries and environments.


2021 ◽  
Author(s):  
VT Nguyen ◽  
A Braun ◽  
J Kraft ◽  
TMT Ta ◽  
GM Panagiotaropoulou ◽  
...  

Objectives: Genome-wide association studies (GWAS) of schizophrenia (SCZ) have provided new biological insights; however, most cohorts are of European ancestry. As a result, derived polygenic risk scores (PRS) show decreased predictive power when applied to populations of different ancestries. We aimed to assess the feasibility of a large-scale data collection in Hanoi, Vietnam, contribute to international efforts to diversify ancestry in SCZ genetic research and examine the transferability of SCZ-PRS to individuals of Vietnamese Kinh ancestry. Methods: In a pilot study, 368 individuals (including 190 SCZ cases) were recruited at the Hanoi Medical University’s associated psychiatric hospitals and outpatient facilities. Data collection included sociodemographic data, baseline clinical data, clinical interviews assessing symptom severity and genome-wide SNP genotyping. SCZ-PRS were generated using different training data sets: i) European, ii) East-Asian and iii) trans-ancestry GWAS summary statistics from the latest SCZ GWAS meta-analysis. Results: SCZ-PRS significantly predicted case status in Vietnamese individuals using mixed-ancestry (liability-scale R² = 4.9%, p = 6.83 × 10⁻⁸), East-Asian (liability-scale R² = 4.5%, p = 2.73 × 10⁻⁷) and European (liability-scale R² = 3.8%, p = 1.79 × 10⁻⁶) discovery samples. Discussion: Our results corroborate previous findings of reduced PRS predictive power across populations, highlighting the importance of ancestral diversity in GWA studies.


2018 ◽  
Author(s):  
A.G. Allegrini ◽  
S. Selzam ◽  
K. Rimfeld ◽  
S. von Stumm ◽  
J.B. Pingault ◽  
...  

Recent advances in genomics are producing powerful DNA predictors of complex traits, especially cognitive abilities. Here, we leveraged summary statistics from the most recent genome-wide association studies of intelligence and educational attainment to build prediction models of general cognitive ability and educational achievement. To this end, we compared the performances of multi-trait genomic and polygenic scoring methods. In a representative UK sample of 7,026 children at ages 12 and 16, we show that we can now predict up to 11 percent of the variance in intelligence and 16 percent in educational achievement. We also show that predictive power increases from age 12 to age 16 and that genomic predictions do not differ for girls and boys. Multivariate genomic methods were effective in boosting predictive power and, even though prediction accuracy varied across polygenic score approaches, results were similar using different multivariate and polygenic score methods. Polygenic scores for educational attainment and intelligence are the most powerful predictors in the behavioural sciences and exceed predictions that can be made from parental phenotypes such as educational attainment and occupational status.


2019 ◽  
Author(s):  
Yan Zhang ◽  
Amber N. Wilcox ◽  
Haoyu Zhang ◽  
Parichoy Pal Choudhury ◽  
Douglas F. Easton ◽  
...  

We analyzed summary-level data from genome-wide association studies (GWAS) of European ancestry across fourteen cancer sites to estimate the number of common susceptibility variants (polygenicity) contributing to risk, as well as the distribution of their associated effect sizes. All cancers evaluated showed polygenicity, involving at a minimum thousands of independent susceptibility variants. For some malignancies, particularly chronic lymphoid leukemia (CLL) and testicular cancer, there is a larger proportion of variants with larger effect sizes than for other cancers. In contrast, most variants for lung and breast cancers have very small associated effect sizes. For different cancer sites, we estimate a wide range of GWAS sample sizes required to explain 80% of GWAS heritability, varying from 60,000 cases for CLL to over 1,000,000 cases for lung cancer. The maximum relative risk achievable for subjects at the 99th risk percentile of underlying polygenic risk scores, compared to average risk, ranges from 12 for testicular to 2.5 for ovarian cancer. We show that polygenic risk scores have substantial potential for risk stratification for relatively common cancers such as breast, prostate and colon, but limited potential for other cancer sites because of modest heritability and lower disease incidence.


2018 ◽  
Author(s):  
LE Duncan ◽  
H Shen ◽  
B Gelaye ◽  
KJ Ressler ◽  
MW Feldman ◽  
...  

Studies examining relationships between genotypic and phenotypic variation have historically been carried out on people of European ancestry. Efforts are underway to address this limitation, but until they succeed, the legacy of a Euro-centric bias will continue to hinder research, including the use of polygenic scores, which are individual-level metrics of genetic risk. Ongoing debate surrounds the generalizability of polygenic scores based on genome-wide association studies (GWAS) conducted in European ancestry samples to non-European ancestry samples. We analyzed the first decade of polygenic scoring studies (2008-2017, inclusive), and found that 67% of studies included exclusively European ancestry participants and another 19% included only East Asian ancestry participants. Only 3.8% of studies were carried out on samples of African, Hispanic, or Indigenous peoples. We find that effect sizes for European ancestry-derived polygenic scores are only 36% as large in African ancestry samples as in European ancestry samples (t = −10.056, df = 22, p = 5.5 × 10⁻¹⁰). Analyzing global populations, we show that relationships between height polygenic scores and height are highly dependent on methodological choices in polygenic score construction, highlighting the need for caution in interpreting population-level differences in distributions of polygenic scores, as currently calculated. These findings bolster the rationale for large-scale GWAS in diverse human populations and highlight the need for better handling of linkage disequilibrium and variant frequencies when applying scores to non-European samples.


2021 ◽  
Author(s):  
Agnieszka Gidziela ◽  
Kaili Rimfeld ◽  
Margherita Malanchini ◽  
Andrea G. Allegrini ◽  
Andrew McMillan ◽  
...  

Background: One goal of the DNA revolution is to predict problems in order to prevent them. We tested here if the prediction of behaviour problems from genome-wide polygenic scores (GPS) can be improved by creating composites across ages and across raters and by using a multi-GPS approach that includes GPS for adult psychiatric disorders as well as for childhood behaviour problems. Method: Our sample included 3,065 genotyped unrelated individuals from the Twins Early Development Study who were assessed longitudinally for hyperactivity, conduct, emotional problems and peer problems as rated by parents, teachers and children themselves. GPS created from 15 genome-wide association studies were used separately and jointly to test the prediction of behaviour problems composites (general behaviour problems, externalizing and internalizing) across ages (from age 2 to age 21) and across raters in penalized regression models. Based on the regression weights, we created multi-trait GPS reflecting the best prediction of behaviour problems. We compared GPS prediction to twin heritability using the same sample and measures. Results: Multi-GPS prediction of behaviour problems increased from less than 2% of the variance for observed traits to up to 6% for cross-age and cross-rater composites. Twin study estimates of heritability mirrored patterns of multi-GPS prediction as they increased from less than 40% to up to 83%. Conclusions: The ability of GPS to predict behaviour problems can be improved by using multiple GPS, cross-age composites and cross-rater composites, although the effect sizes remain modest, up to 6%. Our results can be used in any genotyped sample to create multi-trait GPS predictors of behaviour problems that will be more predictive than polygenic scores based on a single age, rater or GPS.
Key points: Genome-wide polygenic scores (GPS) can be used to predict behaviour problems in childhood, but the effect sizes are generally less than 3.5%. DNA-based prediction models achieve greater accuracy if holistic approaches are employed, that is, cross-trait, longitudinal and trans-situational approaches. The prediction of childhood behaviour problems can be improved by using multiple GPS to predict composites that aggregate behaviour problems across ages and across raters. Our results yield weights that can be applied to GPS in any study to create multi-trait GPS predictors of behaviour problems based on cross-age and cross-rater composites. As compared to individuals in the lowest multi-trait GPS decile, nearly three times as many individuals in the highest internalizing multi-trait GPS decile were diagnosed with anxiety disorder, and 25% more individuals in the highest general behaviour problems and externalizing multi-trait GPS deciles have taken medication for mental health.


Author(s):  
Davide Piffer

Background: The genetic variants identified by three large genome-wide association studies (GWAS) of educational attainment were used to test a polygenic selection model. Methods: Average frequencies of alleles with positive effect (polygenic scores or PS) were compared across populations (N=26) using data from 1000 Genomes. A null model was created using frequencies of random SNPs. Results: Polygenic selection signal of educational attainment GWAS hits is high among a handful of SNPs within genomic regions replicated across GWAS publications. A polygenic score comprising 9 SNPs predicts population IQ (r=0.88), outperforming 99% of the polygenic scores obtained from sets of random SNPs (Monte Carlo p= 0.011). Its predictive power remains unaffected after controlling for spatial autocorrelation (Beta= 0.83). The largest polygenic score (161 SNPs) exhibits similar predictive power (Beta=0.8). Random polygenic scores are moderate predictors of population IQ (thanks to spatial autocorrelation), and their predictive power increases logarithmically with the number of SNPs, indicating an exponential reduction in noise. Conclusion: This study provides guidance for using GWAS hits together with random SNPs for testing polygenic selection using Monte Carlo simulations.
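
The Monte Carlo comparison against random SNP sets can be sketched as follows. This is an illustrative reconstruction only: the abstract does not specify the procedure, so the function names, the use of Pearson correlation, sampling without replacement, and the add-one p-value correction are all assumptions. The score for each population is taken as the mean frequency of the chosen alleles, as the Methods sentence describes.

```python
import numpy as np


def mean_frequency_score(freqs, snp_idx):
    """Population-level score: mean allele frequency over the chosen SNPs.

    freqs is a (populations x SNPs) matrix of allele frequencies.
    """
    return freqs[:, snp_idx].mean(axis=1)


def monte_carlo_p(freqs, hit_idx, trait, n_draws=1000, seed=0):
    """Empirical p-value: how often do random SNP sets match the observed correlation?"""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(mean_frequency_score(freqs, hit_idx), trait)[0, 1]
    n_snps = freqs.shape[1]
    null = np.empty(n_draws)
    for i in range(n_draws):
        # Random set of the same size as the hit set (assumed null model).
        idx = rng.choice(n_snps, size=len(hit_idx), replace=False)
        null[i] = np.corrcoef(mean_frequency_score(freqs, idx), trait)[0, 1]
    # Add-one correction keeps the estimate strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (n_draws + 1)
    return float(observed), float(p_value)
```

Because the null sets are drawn from the same frequency matrix, they inherit the same population structure as the hit set, which is why random scores can themselves correlate with the trait.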

