A tool for translating polygenic scores onto the absolute scale using summary statistics

Author(s):  
Oliver Pain ◽  
Alexandra C. Gillett ◽  
Jehannine C. Austin ◽  
Lasse Folkersen ◽  
Cathryn M. Lewis

Abstract: There is growing interest in the clinical application of polygenic scores as their predictive utility increases for a range of health-related phenotypes. However, providing polygenic score predictions on the absolute scale is an important step for their safe interpretation. We have developed a method to convert polygenic scores to the absolute scale for binary and normally distributed phenotypes. This method uses summary statistics, requiring only the area-under-the-ROC curve (AUC) or variance explained (R2) by the polygenic score, and the prevalence of binary phenotypes, or mean and standard deviation of normally distributed phenotypes. Polygenic scores are converted using normal distribution theory. We also evaluate methods for estimating polygenic score AUC/R2 from genome-wide association study (GWAS) summary statistics alone. We validate the absolute risk conversion and AUC/R2 estimation using data for eight binary and three continuous phenotypes in the UK Biobank sample. When the AUC/R2 of the polygenic score is known, the observed and estimated absolute values were highly concordant. Estimates of AUC/R2 from the lassosum pseudovalidation method were most similar to the observed AUC/R2 values, though estimated values deviated substantially from the observed for autoimmune disorders. This study enables accurate interpretation of polygenic scores using only summary statistics, providing a useful tool for educational and clinical purposes. Furthermore, we have created interactive webtools implementing the conversion to the absolute scale (https://opain.github.io/GenoPred/PRS_to_Abs_tool.html). Several further barriers must be addressed before clinical implementation of polygenic scores, such as ensuring target individuals are well represented by the GWAS sample.
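
The abstract states only that binary phenotypes are converted from the AUC and the prevalence "using normal distribution theory". The sketch below is one plausible reading of that idea, not the authors' implementation: it assumes the polygenic score is normally distributed with equal variance in cases and controls, and that the score is standardised to mean 0 and variance 1 in the whole population. The function name `absolute_risk` and its parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def absolute_risk(prs_z: float, auc: float, prevalence: float) -> float:
    """Return P(case | standardised score = prs_z) under an assumed
    equal-variance two-normal model (illustrative, not the published method)."""
    std_normal = NormalDist()
    # Under the equal-variance binormal model, AUC = Phi(d / sqrt(2)),
    # so the standardised case-control mean difference (Cohen's d) is:
    d = sqrt(2.0) * std_normal.inv_cdf(auc)
    k = prevalence
    # Within-group SD chosen so the population-wide score has variance 1.
    within_sd = 1.0 / sqrt(1.0 + k * (1.0 - k) * d * d)
    # Group means chosen so the population-wide score has mean 0.
    cases = NormalDist(mu=(1.0 - k) * d * within_sd, sigma=within_sd)
    controls = NormalDist(mu=-k * d * within_sd, sigma=within_sd)
    # Bayes' rule: weight each group's density by its population share.
    weighted_case_density = k * cases.pdf(prs_z)
    weighted_control_density = (1.0 - k) * controls.pdf(prs_z)
    return weighted_case_density / (weighted_case_density + weighted_control_density)
```

With an AUC of 0.5 the score carries no information and the function returns the prevalence for every score; as the AUC rises, the estimated risk increases monotonically with the score.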

2021 ◽  
Author(s):  
Oliver Pain ◽  
Alexandra C Gillett ◽  
Jehannine C Austin ◽  
Lasse Folkersen ◽  
Cathryn M Lewis

Background: There is growing interest in the clinical application of polygenic scores as their predictive utility increases for a range of health-related phenotypes. However, providing polygenic score predictions on the absolute scale is an important step for their safe interpretation. Currently, polygenic scores can only be converted to the absolute scale when a validation sample is available, presenting a major limitation in the interpretability and clinical utility of polygenic scores. Methods: We have developed a method to convert polygenic scores to the absolute scale for binary and normally distributed phenotypes. This method uses summary statistics, requiring only the area-under-the-ROC curve (AUC) or variance explained (R2) by the polygenic score, and the prevalence of binary phenotypes, or mean and standard deviation of normally distributed phenotypes. Polygenic scores are converted using normal distribution theory. Given the AUC/R2 of polygenic scores may be unknown, we also evaluate two methods (AVENGEME, lassosum) for estimating these values from genome-wide association study (GWAS) summary statistics alone. We validate the absolute risk conversion and AUC/R2 estimation using data for eight binary and three continuous phenotypes in the UK Biobank sample. Results: When the AUC/R2 of the polygenic score is known, the observed and estimated absolute values were highly concordant. Across binary phenotypes, the mean absolute difference between the observed and estimated proportion of cases was 5%. For continuous phenotypes, the mean absolute difference between observed and estimated means was <0.3%. Estimates of AUC/R2 from the lassosum pseudovalidation method were most similar to the observed AUC/R2 values, though estimated values deviated substantially from the observed for autoimmune disorders. 
Conclusion: This study enables accurate interpretation of polygenic scores using only summary statistics, providing a useful tool for educational and clinical purposes. Furthermore, we have created interactive webtools implementing the conversion to the absolute scale for binary and normally distributed phenotypes (https://opain.github.io/GenoPred/PRS_to_Abs_tool.html). Several further barriers must be addressed before clinical implementation of polygenic scores, such as ensuring target individuals are well represented by the GWAS sample.
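
For normally distributed phenotypes, the inputs named in the Methods (R2, phenotype mean and standard deviation) are enough to write down a conditional distribution if the score and phenotype are assumed to be bivariate normal. The sketch below illustrates that textbook result; it is an assumption about how "normal distribution theory" might be applied, not the authors' code, and the function name `conditional_phenotype` is hypothetical. It also assumes the score is standardised and positively correlated with the phenotype.

```python
from math import sqrt
from statistics import NormalDist


def conditional_phenotype(prs_z: float, r2: float,
                          pheno_mean: float, pheno_sd: float) -> NormalDist:
    """Distribution of the phenotype given a standardised score prs_z,
    assuming bivariate normality (illustrative sketch only)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    r = sqrt(r2)  # correlation implied by variance explained (sign assumed positive)
    # Conditional mean shifts by r phenotype SDs per SD of the score.
    cond_mean = pheno_mean + r * pheno_sd * prs_z
    # Conditional SD shrinks by the unexplained fraction of variance.
    cond_sd = pheno_sd * sqrt(1.0 - r2)
    return NormalDist(mu=cond_mean, sigma=cond_sd)
```

The returned `NormalDist` exposes `mean`, `stdev`, `cdf` and `inv_cdf`, so a predicted mean or an interval on the phenotype's own scale can be read off directly.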


2019 ◽  
Author(s):  
Reut Avinun ◽  
Adam Nevo ◽  
Annchen R. Knodt ◽  
Maxwell L. Elliott ◽  
Ahmad R. Hariri

Abstract: Accumulating research suggests that the pro-inflammatory cytokine interleukin-1β (IL-1β) has a modulatory effect on the hippocampus, a brain structure important for learning and memory as well as linked with both psychiatric and neurodegenerative disorders. Here, we use an imaging genetics strategy to test an association between an IL-1β polygenic score, derived from summary statistics of a recent genome-wide association study (GWAS) of circulating cytokines, and hippocampal volume, in two independent samples. In the first sample of 512 non-Hispanic Caucasian university students (274 women, mean age 19.78 ± 1.24 years) from the Duke Neurogenetics Study, we identified a significant positive correlation between higher polygenic scores, which presumably reflect higher circulating IL-1β levels, and average hippocampal volume. This positive association was successfully replicated in a second sample of 7,960 white British volunteers (4,158 women, mean age 62.63 ± 7.45 years) from the UK Biobank. Collectively, our results suggest that a functional GWAS-derived score of IL-1β blood circulating levels affects hippocampal volume, and lend further support, in humans, to the link between IL-1β and the structure of the hippocampus.


2020 ◽  
Author(s):  
John E. McGeary ◽  
Chelsie Benca-Bachman ◽  
Victoria Risner ◽  
Christopher G Beevers ◽  
Brandon Gibb ◽  
...  

Twin studies indicate that 30-40% of the disease liability for depression can be attributed to genetic differences. Here, we assess the explanatory ability of polygenic scores (PGS) based on broad- (PGSBD) and clinical- (PGSMDD) depression summary statistics from the UK Biobank using independent cohorts of adults (N=210; 100% European Ancestry) and children (N=728; 70% European Ancestry) who have been extensively phenotyped for depression and related neurocognitive phenotypes. PGS associations with depression severity and diagnosis were generally modest, and larger in adults than children. Polygenic prediction of depression-related phenotypes was mixed and varied by PGS. Higher PGSBD, in adults, was associated with a higher likelihood of having suicidal ideation, increased brooding and anhedonia, and lower levels of cognitive reappraisal; PGSMDD was positively associated with brooding and negatively related to cognitive reappraisal. Overall, PGS based on both broad and clinical depression phenotypes have modest utility in adult and child samples of depression.


2018 ◽  
Author(s):  
Timothy Shin Heng Mak ◽  
Robert Milan Porsch ◽  
Shing Wan Choi ◽  
Pak Chung Sham

Abstract: Polygenic scores (PGS) are estimated scores representing the genetic tendency of an individual for a disease or trait and have become an indispensable tool in a variety of analyses. Typically they are a linear combination of the genotypes of a large number of SNPs, with the weights calculated from an external source, such as summary statistics from large meta-analyses. Recently cohorts with genetic data have become very large, such that it would be a waste if the raw data were not made use of in constructing PGS. Making use of raw data in calculating PGS, however, presents us with problems of overfitting. Here we discuss the essence of overfitting as applied in PGS calculations and highlight the difference between overfitting due to the overlap between the target and the discovery data (OTD), and overfitting due to the overlap between the target and the validation data (OTV). We propose two methods, cross prediction and split validation, to overcome OTD and OTV respectively. Using these two methods, PGS can be calculated using raw data without overfitting. We show that PGSs thus calculated have better predictive power than those using summary statistics alone for six phenotypes in the UK Biobank data.


2016 ◽  
Author(s):  
Benjamin W. Domingue ◽  
Hexuan Liu ◽  
Aysu Okbay ◽  
Daniel W. Belsky

Abstract: Experience of stressful life events is associated with risk of depression. Yet many exposed individuals do not become depressed. A controversial hypothesis is that genetic factors influence vulnerability to depression following stress. This hypothesis is most commonly tested with a “diathesis-stress” model, in which genes confer excess vulnerability. We tested an alternative model, in which genes may buffer against the depressogenic effects of life stress. We measured the hypothesized genetic buffer using a polygenic score derived from a published genome-wide association study (GWAS) of subjective wellbeing. We tested if married older adults who had higher polygenic scores were less vulnerable to depressive symptoms following the death of their spouse as compared to age-peers who had also lost their spouse and who had lower polygenic scores. We analyzed data from N=9,453 non-Hispanic white adults in the Health and Retirement Study (HRS), a population-representative longitudinal study of older adults in the United States. HRS adults with higher wellbeing polygenic scores experienced fewer depressive symptoms during follow-up. Those who survived death of their spouses during follow-up (n=1,829) experienced a sharp increase in depressive symptoms following the death and returned toward baseline over the following two years. Having a higher polygenic score buffered against increased depressive symptoms following a spouse's death. Effects were small and clinical relevance is uncertain, although polygenic score analyses may provide clues to behavioral pathways that can serve as therapeutic targets. Future studies of gene-environment interplay in depression may benefit from focus on genetics discovered for putative protective factors.


2018 ◽  
Vol 115 (31) ◽  
pp. E7275-E7284 ◽  
Author(s):  
Daniel W. Belsky ◽  
Benjamin W. Domingue ◽  
Robbee Wedow ◽  
Louise Arseneault ◽  
Jason D. Boardman ◽  
...  

A summary genetic measure, called a “polygenic score,” derived from a genome-wide association study (GWAS) of education can modestly predict a person’s educational and economic success. This prediction could signal a biological mechanism: Education-linked genetics could encode characteristics that help people get ahead in life. Alternatively, prediction could reflect social history: People from well-off families might stay well-off for social reasons, and these families might also look alike genetically. A key test to distinguish biological mechanism from social history is if people with higher education polygenic scores tend to climb the social ladder beyond their parents’ position. Upward mobility would indicate education-linked genetics encodes characteristics that foster success. We tested if education-linked polygenic scores predicted social mobility in >20,000 individuals in five longitudinal studies in the United States, Britain, and New Zealand. Participants with higher polygenic scores achieved more education and career success and accumulated more wealth. However, they also tended to come from better-off families. In the key test, participants with higher polygenic scores tended to be upwardly mobile compared with their parents. Moreover, in sibling-difference analysis, the sibling with the higher polygenic score was more upwardly mobile. Thus, education GWAS discoveries are not mere correlates of privilege; they influence social mobility within a life. Additional analyses revealed that a mother’s polygenic score predicted her child’s attainment over and above the child’s own polygenic score, suggesting parents’ genetics can also affect their children’s attainment through environmental pathways. Education GWAS discoveries affect socioeconomic attainment through influence on individuals’ family-of-origin environments and their social mobility.


2018 ◽  
Author(s):  
Alicia R. Martin ◽  
Masahiro Kanai ◽  
Yoichiro Kamatani ◽  
Yukinori Okada ◽  
Benjamin M. Neale ◽  
...  

Abstract: Polygenic risk scores (PRS) are poised to improve biomedical outcomes via precision medicine. However, the major ethical and scientific challenge surrounding clinical implementation is that they are many-fold more accurate in European ancestry individuals than others. This disparity is an inescapable consequence of Eurocentric genome-wide association study biases. This highlights that, unlike clinical biomarkers and prescription drugs (which may individually work better in some populations but do not ubiquitously perform far better in European populations), clinical uses of PRS today would systematically afford greater improvement to European descent populations. Early diversifying efforts show promise in levelling this vast imbalance, even when non-European sample sizes are considerably smaller than the largest studies to date. To realize the full and equitable potential of PRS, we must prioritize greater diversity in genetic studies and public dissemination of summary statistics to ensure that health disparities are not increased for those already most underserved.


BJGP Open ◽  
2020 ◽  
Vol 4 (1) ◽  
pp. bjgpopen20X101016 ◽  
Author(s):  
Julian Stephen Treadwell ◽  
Geoff Wong ◽  
Coral Milburn-Curtis ◽  
Benjamin Feakins ◽  
Trisha Greenhalgh

Background: GPs prescribe multiple long-term treatments to their patients. For shared clinical decision-making, understanding of the absolute benefits and harms of individual treatments is needed. International evidence shows that doctors’ knowledge of treatment effects is poor but, to the authors’ knowledge, this has not been researched among GPs in the UK. Aim: To measure the level and range of the quantitative understanding of the benefits and harms of treatments for common long-term conditions (LTCs) among GPs. Design & setting: An online cross-sectional survey was distributed to GPs in the UK. Method: Participants were asked to estimate the percentage absolute risk reduction or increase conferred by 13 interventions across 10 LTCs on 17 important outcomes. Responses were collated and presented in a novel graphic format to allow detailed visualisation of the findings. Descriptive statistical analysis was performed. Results: A total of 443 responders were included in the analysis. Most demonstrated poor (and in some cases very poor) knowledge of the absolute benefits and harms of treatments. Overall, an average of 10.9% of responses were correct allowing for ±1% margin in absolute risk estimates and 23.3% allowing a ±3% margin. Eighty-seven point seven per cent of responses overestimated and 8.9% of responses underestimated treatment effects. There was no tendency to differentially overestimate benefits and underestimate harms. Sixty-four point eight per cent of GPs self-reported ‘low’ to ‘very low’ confidence in their knowledge. Conclusion: GPs’ knowledge of the absolute benefits and harms of treatments is poor, with inaccuracies of a magnitude likely to meaningfully affect clinical decision-making and impede conversations with patients regarding treatment choices.


Author(s):  
John P.A. Ioannidis ◽  
Cathrine Axfors ◽  
Despina G. Contopoulos-Ioannidis

OBJECTIVE: To provide estimates of the relative risk of COVID-19 death in people <65 years old versus older individuals in the general population, the absolute risk of COVID-19 death at the population level during the first epidemic wave, and the proportion of COVID-19 deaths in non-elderly people without underlying diseases in epicenters of the pandemic. ELIGIBLE DATA: Countries and US states with at least 800 COVID-19 deaths as of April 24, 2020 and with information on the number of deaths in people with age <65. Data were available for 11 European countries (Belgium, France, Germany, Ireland, Italy, Netherlands, Portugal, Spain, Sweden, Switzerland, UK), Canada, and 12 US states (California, Connecticut, Florida, Georgia, Illinois, Indiana, Louisiana, Maryland, Massachusetts, Michigan, New Jersey and New York). We also examined available data on COVID-19 deaths in people with age <65 and no underlying diseases. MAIN OUTCOME MEASURES: Proportion of COVID-19 deaths in people <65 years old; relative risk of COVID-19 death in people <65 versus ≥65 years old; absolute risk of COVID-19 death in people <65 and in those ≥80 years old in the general population as of May 1, 2020; absolute COVID-19 death risk expressed as equivalent of death risk from driving a motor vehicle. RESULTS: Individuals with age <65 account for 4.8-9.3% of all COVID-19 deaths in 10 European countries and Canada, 13.0% in the UK, and 7.8-23.9% in the US locations. People <65 years old had 36- to 84-fold lower risk of COVID-19 death than those ≥65 years old in 10 European countries and Canada and 14- to 56-fold lower risk in UK and US locations. The absolute risk of COVID-19 death as of May 1, 2020 for people <65 years old ranged from 6 (Canada) to 249 per million (New York City). The absolute risk of COVID-19 death for people ≥80 years old ranged from 0.3 (Florida) to 10.6 per thousand (New York). The COVID-19 death risk in people <65 years old during the period of fatalities from the epidemic was equivalent to the death risk from driving between 13 and 101 miles per day for 11 countries and 6 states, and was higher (equivalent to the death risk from driving 143-668 miles per day) for 6 other states and the UK. People <65 years old without underlying predisposing conditions accounted for only 0.7-2.6% of all COVID-19 deaths (data available from France, Italy, Netherlands, Sweden, Georgia, and New York City). CONCLUSIONS: People <65 years old have very small risks of COVID-19 death even in pandemic epicenters and deaths for people <65 years without underlying predisposing conditions are remarkably uncommon. Strategies focusing specifically on protecting high-risk elderly individuals should be considered in managing the pandemic.


2019 ◽  
Author(s):  
Sam Trejo ◽  
Benjamin W. Domingue

Abstract: Summary statistics from a genome-wide association study (GWAS) can be used to generate a polygenic score (PGS). For complex, behavioral traits, the correlation between an individual’s PGS and their phenotype may contain bias alongside the causal effect of the individual’s genes (due to geographic, ancestral, and/or socioeconomic confounding). We formalize the recent introduction of a different source of bias in regression models using PGSs: the effects of parental genes on offspring outcomes, also known as genetic nurture. GWAS do not discriminate between the various pathways through which genes influence outcomes, meaning existing PGSs capture both direct genetic effects and genetic nurture effects. We construct a theoretical model for genetic effects and show that, unlike other sources of bias in PGSs, the presence of genetic nurture biases PGS coefficients from both naïve OLS (between-family) and family fixed effects (within-family) regressions. This bias is in opposite directions; while naïve OLS estimates are biased upwards, family fixed effects estimates are biased downwards. We quantify this bias for a given trait using two novel parameters that we identify and discuss: (1) the genetic correlation between the direct and nurture effects and (2) the ratio of the SNP heritabilities for the direct and nurture effects.

