Comparing within- and between-family polygenic score prediction

2019
Author(s):
Saskia Selzam
Stuart J. Ritchie
Jean-Baptiste Pingault
Chandra A. Reynolds
Paul F. O’Reilly
...

Abstract: Polygenic scores are a popular tool for prediction of complex traits. However, prediction estimates in samples of unrelated participants can include effects of population stratification, assortative mating and environmentally mediated parental genetic effects, a form of genotype-environment correlation (rGE). Comparing genome-wide polygenic score (GPS) predictions in unrelated individuals with predictions between siblings in a within-family design is a powerful approach to identify these different sources of prediction. Here, we compared within- to between-family GPS predictions of eight life outcomes (anthropometric, cognitive, personality and health) for eight corresponding GPSs. The outcomes were assessed in up to 2,366 dizygotic (DZ) twin pairs from the Twins Early Development Study from age 12 to age 21. To account for family clustering, we used mixed-effects modelling, simultaneously estimating within- and between-family effects for target- and cross-trait GPS prediction of the outcomes. There were three main findings: (1) DZ twin GPS differences predicted DZ differences in height, BMI, intelligence, educational achievement and ADHD symptoms; (2) target and cross-trait analyses indicated that GPS prediction estimates for cognitive traits (intelligence and educational achievement) were on average 60% greater between families than within families, but this was not the case for non-cognitive traits; and (3) this within- and between-family difference for cognitive traits disappeared after controlling for family socio-economic status (SES), suggesting that SES is a source of between-family prediction through rGE mechanisms. These results provide novel insights into the patterns by which rGE contributes to GPS prediction, while ruling out confounding due to population stratification and assortative mating.
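The within- versus between-family comparison in this abstract can be illustrated with a minimal sketch. It assumes each twin pair's GPS is split into a family mean (the between-family component) and each twin's deviation from that mean (the within-family component). It uses ordinary least squares on toy numbers rather than the mixed-effects models the abstract reports, and every name and value in it is hypothetical.

```python
from statistics import mean

def within_between_slopes(pairs):
    """Estimate within- and between-family GPS slopes by OLS.

    pairs: list of ((gps_1, y_1), (gps_2, y_2)), one entry per twin pair.
    Within-family deviations sum to zero in every family, so they are
    orthogonal to the family means and each slope can be computed separately.
    """
    fam_mean, dev, y = [], [], []
    for (g1, y1), (g2, y2) in pairs:
        m = (g1 + g2) / 2              # between-family component
        for g, outcome in ((g1, y1), (g2, y2)):
            fam_mean.append(m)
            dev.append(g - m)          # within-family component
            y.append(outcome)
    y_bar, m_bar = mean(y), mean(fam_mean)
    beta_w = sum(d * (v - y_bar) for d, v in zip(dev, y)) / sum(d * d for d in dev)
    beta_b = (sum((m - m_bar) * (v - y_bar) for m, v in zip(fam_mean, y))
              / sum((m - m_bar) ** 2 for m in fam_mean))
    return beta_w, beta_b

# Toy data (hypothetical): outcomes are constructed with a within-family
# slope of 0.3 and a between-family slope of 0.8, with no noise.
gps_pairs = [(-1.0, 0.2), (0.5, 1.5), (-0.3, -0.9), (1.2, 0.4)]
pairs = []
for g1, g2 in gps_pairs:
    m = (g1 + g2) / 2
    pairs.append(tuple((g, 0.5 + 0.3 * (g - m) + 0.8 * m) for g in (g1, g2)))

beta_w, beta_b = within_between_slopes(pairs)
print(round(beta_w, 3), round(beta_b, 3))
```

A between-family slope that exceeds the within-family slope, as in this toy construction, is the pattern the abstract reports for cognitive traits.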

2020
Author(s):
Yongkang Kim
Jared V. Balbona
Matthew C. Keller

Abstract: In a companion paper Balbona et al. (Behav Genet, in press), we introduced a series of causal models that use polygenic scores from transmitted and nontransmitted alleles, the offspring trait, and parental traits to estimate the variation due to the environmental influences the parental trait has on the offspring trait (vertical transmission) as well as additive genetic effects. These models also estimate and account for the gene-gene and gene-environment covariation that arises from assortative mating and vertical transmission respectively. In the current study, we simulated polygenic scores and phenotypes of parents and offspring under genetic and vertical transmission scenarios, assuming two types of assortative mating. We instantiated the models from our companion paper in the OpenMx software, and compared the true values of parameters to maximum likelihood estimates from models fitted on the simulated data to quantify the bias and precision of estimates. We show that parameter estimates from these models are unbiased when assumptions are met, but as expected, they are biased to the degree that assumptions are unmet. Standard errors of the estimated variances due to vertical transmission and to genetic effects decrease with increasing sample sizes and with increasing $r^2$ values of the polygenic score. Even when the polygenic score explains a modest amount of trait variation ($r^2 = .05$), standard errors of these standardized estimates are reasonable ($< .05$) for $n = 16$K trios, and can even be reasonable for smaller sample sizes (e.g., down to 4K) when the polygenic score is more predictive. These causal models offer a novel approach for understanding how parents influence their offspring, but their use requires polygenic scores on relevant traits that are modestly predictive (e.g., $r^2 > .025$) as well as datasets with genomic and phenotypic information on parents and offspring.
The utility of polygenic scores for elucidating parental influences should thus serve as additional motivation for large genomic biobanks to perform GWASs on traits that may be relevant to parenting and to oversample close relatives, particularly parents and offspring.
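As a rough illustration of what "polygenic scores from transmitted and nontransmitted alleles" means, the sketch below splits each parent's two haplotypes into the one passed to the offspring and the one that was not, then sums weighted allele counts for each set. The data layout, function names, and SNP weights are all hypothetical. This is not the OpenMx model from the paper, only the score construction that such models take as input.

```python
def haplotype_score(haplotype, weights):
    """Weighted sum of effect-allele indicators (0/1) on one haplotype."""
    return sum(allele * w for allele, w in zip(haplotype, weights))

def transmitted_nontransmitted_pgs(mother, father, weights):
    """Return (PGS of transmitted alleles, PGS of nontransmitted alleles).

    mother / father: dicts with keys "transmitted" and "nontransmitted",
    each a list of 0/1 effect-allele indicators per SNP (hypothetical layout).
    """
    pgs_t = (haplotype_score(mother["transmitted"], weights)
             + haplotype_score(father["transmitted"], weights))
    pgs_nt = (haplotype_score(mother["nontransmitted"], weights)
              + haplotype_score(father["nontransmitted"], weights))
    return pgs_t, pgs_nt

# Toy trio with three SNPs (illustrative weights, not real GWAS estimates).
weights = [0.10, -0.05, 0.20]
mother = {"transmitted": [1, 0, 1], "nontransmitted": [0, 1, 0]}
father = {"transmitted": [0, 0, 1], "nontransmitted": [1, 1, 1]}

pgs_t, pgs_nt = transmitted_nontransmitted_pgs(mother, father, weights)
print(round(pgs_t, 2), round(pgs_nt, 2))
```

The transmitted score equals the offspring's own polygenic score. The nontransmitted score is not carried by the offspring, so any association it shows with the offspring trait must run through the parents, which is why it is informative about vertical transmission.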


Author(s):  
Arslan A. Zaidi
Iain Mathieson

Abstract: Large genome-wide association studies (GWAS) have identified many loci exhibiting small but statistically significant associations with complex traits and disease risk. However, control of population stratification continues to be a limiting factor, particularly when calculating polygenic scores where subtle biases can cumulatively lead to large errors. We simulated GWAS under realistic models of demographic history to study the effect of residual stratification in large GWAS. We show that when population structure is recent, it cannot be fully corrected using principal components based on common variants (the standard approach) because common variants are uninformative about recent demographic history. Consequently, polygenic scores calculated from such GWAS results are biased in that they recapitulate non-genetic environmental structure. Principal components calculated from rare variants or identity-by-descent segments largely correct for this structure if environmental effects are smooth. However, even these corrections are not effective for local or batch effects. While sibling-based association tests are immune to stratification, the hybrid approach of ascertaining variants in a standard GWAS and then re-estimating effect sizes in siblings reduces but does not eliminate bias. Finally, we show that rare variant burden tests are relatively robust to stratification. Our results demonstrate that the effect of population stratification on GWAS and polygenic scores depends not only on the frequencies of tested variants and the distribution of environmental effects but also on the demographic history of the population.
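The claim that sibling-based tests are immune to stratification can be made concrete with a small toy example. The sketch assumes two subpopulations that differ in both allele frequency and an environmental offset, with no genetic effect at all. The data and names are hypothetical, and the sibling test shown is a simple regression of sibling phenotype differences on sibling genotype differences, not the specific test used in the paper.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

# Each tuple: (genotype of sibling 1, genotype of sibling 2, environmental offset).
families = [
    (0, 1, 0.0), (1, 0, 0.0), (0, 0, 0.0),   # subpopulation A: rare allele, low offset
    (1, 2, 1.0), (2, 1, 1.0), (2, 2, 1.0),   # subpopulation B: common allele, high offset
]

genotypes, phenotypes, d_geno, d_pheno = [], [], [], []
for g1, g2, env in families:
    y1 = y2 = env                  # phenotype is purely environmental here
    genotypes += [g1, g2]
    phenotypes += [y1, y2]
    d_geno.append(g1 - g2)         # sibling genotype difference
    d_pheno.append(y1 - y2)        # sibling phenotype difference

# Population-level association: confounded by subpopulation membership.
naive_slope = ols_slope(genotypes, phenotypes)

# Sibling-difference association (regression through the origin):
# the shared environmental offset cancels within each family.
sibling_slope = (sum(a * b for a, b in zip(d_geno, d_pheno))
                 / sum(a * a for a in d_geno))

print(naive_slope, sibling_slope)
```

The population-level slope is nonzero even though the allele has no effect, while the sibling-difference slope is exactly zero because both siblings share the same offset.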


2018
Author(s):
A.G. Allegrini
S. Selzam
K. Rimfeld
S. von Stumm
J.B. Pingault
...

Abstract: Recent advances in genomics are producing powerful DNA predictors of complex traits, especially cognitive abilities. Here, we leveraged summary statistics from the most recent genome-wide association studies of intelligence and educational attainment to build prediction models of general cognitive ability and educational achievement. To this end, we compared the performances of multi-trait genomic and polygenic scoring methods. In a representative UK sample of 7,026 children at ages 12 and 16, we show that we can now predict up to 11 percent of the variance in intelligence and 16 percent in educational achievement. We also show that predictive power increases from age 12 to age 16 and that genomic predictions do not differ for girls and boys. Multivariate genomic methods were effective in boosting predictive power and, even though prediction accuracy varied across polygenic score approaches, results were similar using different multivariate and polygenic score methods. Polygenic scores for educational attainment and intelligence are the most powerful predictors in the behavioural sciences and exceed predictions that can be made from parental phenotypes such as educational attainment and occupational status.


2015
Vol 18 (6)
pp. 738-745
Author(s):
Michelle Luciano
Riccardo E. Marioni
Maria Valdés Hernández
Susana Muñoz Maniega
Iona F. Hamilton
...

Structural brain magnetic resonance imaging (MRI) traits share part of their genetic variance with cognitive traits. Here, we use genetic association results from large meta-analytic studies of genome-wide association (GWA) for brain infarcts (BI), white matter hyperintensities, intracranial, hippocampal, and total brain volumes to estimate polygenic scores for these traits in three Scottish samples: Generation Scotland: Scottish Family Health Study (GS:SFHS), and the Lothian Birth Cohorts of 1936 (LBC1936) and 1921 (LBC1921). These five brain MRI trait polygenic scores were then used to: (1) predict corresponding MRI traits in the LBC1936 (numbers ranged from 573 to 630 across traits), and (2) predict cognitive traits in all three cohorts (in 8,115–8,250 persons). In the LBC1936, all MRI phenotypic traits were correlated with at least one cognitive measure, and polygenic prediction of MRI traits was observed for intracranial volume. Meta-analysis of the correlations between MRI polygenic scores and cognitive traits revealed a significant negative correlation (maximal r = 0.08) between the hippocampal volume (HV) polygenic score and measures of global cognitive ability collected in childhood and in old age in the Lothian Birth Cohorts. The lack of association with a related general cognitive measure when including the GS:SFHS points to either type 1 error or the importance of using prediction samples that closely match the demographics of the GWA samples on which prediction is based. Ideally, these analyses should be repeated in larger samples with data on both MRI and cognition, and using MRI GWA results from even larger meta-analysis studies.


Author(s):  
Robert Plomin
Sophie von Stumm

Abstract: During the past decade, polygenic scores have become a fast-growing area of research in the behavioural sciences. The ability to directly assess people’s genetic propensities has transformed research by making it possible to add genetic predictors of traits to any study. The value of polygenic scores in the behavioural sciences rests on using inherited DNA differences to predict, from birth, common disorders and complex traits in unrelated individuals in the population. This predictive power of polygenic scores does not require knowing anything about the processes that lie between genes and behaviour. It also does not mandate disentangling the extent to which the prediction is due to assortative mating, genotype–environment correlation, or even population stratification. Although bottom-up explanation from genes to brain to behaviour will remain the long-term goal of the behavioural sciences, prediction is also a worthy achievement because it has immediate practical utility for identifying individuals at risk and is the necessary first step towards explanation. A high priority for research must be to increase the predictive power of polygenic scores to be able to use them as an early warning system to prevent problems.


2021
Author(s):
Robert Plomin
Sophie von Stumm

During the past decade, polygenic scores have become the fastest-growing area of research in the behavioural sciences. The ability to predict genetic propensities has transformed research by making it possible to add genetic predictors of traits to any study. The value of polygenic scores in the behavioural sciences rests in using inherited DNA differences to predict, from birth, common disorders and complex traits in unrelated individuals in the population. This predictive power of polygenic scores does not require knowing anything about the processes that lie between genes and behaviour. It also does not mandate disentangling the extent to which the prediction is due to assortative mating, genotype-environment correlation, or even population stratification. Although bottom-up explanation from genes to brain to behaviour will remain the long-term goal of the behavioural sciences, prediction is also a worthy achievement because it has immediate practical utility for identifying individuals at risk and is the necessary first step towards explanation. A high priority for research must be to increase the predictive power of polygenic scores to be able to use them as an early warning system to prevent problems.


2020
Vol 6 (16)
pp. eaay0328
Author(s):
Tim T. Morris
Neil M. Davies
Gibran Hemani
George Davey Smith

Heritability, genetic correlation, and genetic associations estimated from samples of unrelated individuals are often perceived as confirmation that genotype causes the phenotype(s). However, these estimates can arise from indirect mechanisms due to population phenomena including population stratification, dynastic effects, and assortative mating. We introduce these, describe how they can bias or inflate genotype-phenotype associations, and demonstrate methods that can be used to assess their presence. Using data on educational achievement and parental socioeconomic position as an exemplar, we demonstrate that both heritability and genetic correlation may be biased estimates of the causal contribution of genotype. These results highlight the limitations of genotype-phenotype estimates obtained from samples of unrelated individuals. Use of these methods in combination with family-based designs may offer researchers greater opportunities to explore the mechanisms driving genotype-phenotype associations and identify factors underlying bias in estimates.


2020
Author(s):
Yongkang Kim
Jared V. Balbona
Matthew C. Keller

Abstract: In a companion paper (Balbona et al., 2020), we introduced a series of causal models that use polygenic scores from transmitted and nontransmitted alleles, the offspring trait, and parental traits to estimate the variation due to the environmental influences the parental trait has on the offspring trait (vertical transmission) as well as additive genetic effects. These models also estimate and account for the gene-gene and gene-environment covariation that arises from assortative mating and vertical transmission respectively. In the current study, we simulated polygenic scores and phenotypes of parents and offspring under genetic and vertical transmission scenarios, assuming two types of assortative mating. We instantiated the models from our companion paper in the OpenMx software, and compared the true values of parameters to maximum likelihood estimates from models fitted on the simulated data to quantify the bias and precision of estimates. We show that parameter estimates from these models are unbiased when assumptions are met, but as expected, they are biased to the degree that assumptions are unmet. Standard errors of the estimated variances due to vertical transmission and to genetic effects decrease with increasing sample sizes and with increasing r² values of the polygenic score. Even when the polygenic score explains a modest amount of trait variation (r² = .05), standard errors of these standardized estimates were reasonable (< .05) for n = 16K trios, and for smaller sample sizes (e.g., down to 4K) when the polygenic score is more predictive. These causal models offer a novel approach for understanding how parents influence their offspring, but their use requires polygenic scores on relevant traits that are modestly predictive (e.g., r² > .025) as well as datasets with genomic and phenotypic information on parents and offspring.
The utility of polygenic scores for elucidating parental influences should thus serve as additional motivation for large genomic biobanks to perform GWASs on traits that may be relevant to parenting and to oversample close relatives, particularly parents and offspring.


2021
Author(s):
Lianne P. de Vries
Toos C. E. M. van Beijsterveldt
Hermine Maes
Lucía Colodro-Conde
Meike Bartels

Abstract: The distinction between genetic influences on the covariance (or bivariate heritability) and genetic correlations in bivariate twin models is often not well understood, or only one of the two is reported even though they convey distinct information about the relation between traits. We applied bivariate twin models in a large sample of adolescent twins to disentangle the association between well-being (WB) and four complex traits: optimism, anxious-depressed symptoms (AD), aggressive behaviour (AGG), and educational achievement (EA). Optimism and AD showed, respectively, a strong positive and a strong negative phenotypic correlation with WB; the negative correlation between WB and AGG was weaker, and the correlation with EA was nearly zero. All four traits showed a large genetic contribution to the covariance with well-being. The genetic correlations of well-being with optimism and AD were strong, and those with AGG and EA were smaller. We used the results of the models to explain what information is retrieved based on the bivariate heritability versus the genetic correlations and the (clinical) implications.
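The difference between bivariate heritability and the genetic correlation can be seen in a short worked calculation. The sketch assumes a simple model with only additive genetic and non-shared environmental components (no shared environment), in which the phenotypic correlation is the sum of a genetic path and an environmental path. The input values are hypothetical and are not estimates from the study.

```python
from math import sqrt

def decompose_correlation(h2_x, h2_y, r_g, r_e):
    """Split a phenotypic correlation into genetic and environmental parts.

    Assumes an AE model: r_P = sqrt(h2_x)*r_g*sqrt(h2_y)
                               + sqrt(1-h2_x)*r_e*sqrt(1-h2_y).
    Returns (r_P, bivariate heritability), where bivariate heritability is
    the share of r_P that runs through the genetic path.
    Undefined when r_P is zero.
    """
    genetic = sqrt(h2_x) * r_g * sqrt(h2_y)
    environmental = sqrt(1 - h2_x) * r_e * sqrt(1 - h2_y)
    r_p = genetic + environmental
    return r_p, genetic / r_p

# Case 1: weak genetic correlation, but all of the (small) phenotypic
# correlation is genetic, so the bivariate heritability is 1.
r_p1, biv1 = decompose_correlation(h2_x=0.4, h2_y=0.5, r_g=0.2, r_e=0.0)

# Case 2: strong genetic correlation, but low heritabilities mean most of
# the phenotypic correlation is environmental.
r_p2, biv2 = decompose_correlation(h2_x=0.2, h2_y=0.2, r_g=0.9, r_e=0.9)

print(round(r_p1, 3), round(biv1, 3))
print(round(r_p2, 3), round(biv2, 3))
```

The two cases show why reporting only one quantity can mislead: a trait pair can have a high bivariate heritability with a weak genetic correlation, or the reverse.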


2021
Vol 11 (1)
Author(s):
Lauren L. Schmitz
Julia Goodwin
Jiacheng Miao
Qiongshi Lu
Dalton Conley

Abstract: Unemployment shocks from the COVID-19 pandemic have reignited concerns over the long-term effects of job loss on population health. Past research has highlighted the corrosive effects of unemployment on health and health behaviors. This study examines whether the effects of job loss on changes in body mass index (BMI) are moderated by genetic predisposition using data from the U.S. Health and Retirement Study (HRS). To improve detection of gene-by-environment (G × E) interplay, we interacted layoffs from business closures (a plausibly exogenous environmental exposure) with whole-genome polygenic scores (PGSs) that capture genetic contributions to both the population mean (mPGS) and variance (vPGS) of BMI. Results show evidence of genetic moderation using a vPGS (as opposed to an mPGS) and indicate genome-wide summary measures of phenotypic plasticity may further our understanding of how environmental stimuli modify the distribution of complex traits in a population.
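The idea of interacting a binary exposure with a polygenic score can be sketched as follows. With a binary exposure, the interaction coefficient in a fully interacted linear model equals the difference between the score's slope in the exposed group and its slope in the unexposed group, so the sketch simply fits one slope per group. The data are toy values and the variable names are hypothetical; the study's actual specification, covariates, and score construction are not reproduced here.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

# Toy records: (polygenic score, laid_off flag, change in BMI).
# Constructed so the score has no effect without a layoff and a slope
# of 0.5 among those laid off.
records = [
    (-1.0, 0, 0.1), (0.0, 0, 0.1), (1.0, 0, 0.1), (2.0, 0, 0.1),
    (-1.0, 1, -0.1), (0.0, 1, 0.4), (1.0, 1, 0.9), (2.0, 1, 1.4),
]

def slope_in_group(flag):
    """Slope of BMI change on the score among records with the given flag."""
    xs = [pgs for pgs, laid_off, _ in records if laid_off == flag]
    ys = [d_bmi for _, laid_off, d_bmi in records if laid_off == flag]
    return ols_slope(xs, ys)

slope_not_laid_off = slope_in_group(0)
slope_laid_off = slope_in_group(1)

# Difference in slopes = G x E interaction coefficient for a binary exposure.
interaction = slope_laid_off - slope_not_laid_off
print(round(slope_not_laid_off, 3), round(slope_laid_off, 3), round(interaction, 3))
```

A nonzero difference in slopes is what "genetic moderation" of the exposure effect means in this framing.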

