Multi-ethnic polygenic risk scores improve risk prediction in diverse populations

2016 ◽  
Author(s):  
Carla Márquez-Luna ◽  
Po-Ru Loh ◽  
Alkes L. Price ◽  

Abstract: Methods for genetic risk prediction have been widely investigated in recent years. However, most available training data involves European samples, and it is currently unclear how to accurately predict disease risk in other populations. Previous studies have used either training data from European samples in large sample size or training data from the target population in small sample size, but not both. Here, we introduce a multi-ethnic polygenic risk score that combines training data from European samples and training data from the target population. We applied this approach to predict type 2 diabetes (T2D) in a Latino cohort using both publicly available European summary statistics in large sample size and Latino training data in small sample size. We attained a >70% relative improvement in prediction accuracy (from R² = 0.027 to R² = 0.047) compared to methods that use only one source of training data, consistent with large relative improvements in simulations. We observed a systematically lower load of T2D risk alleles in Latino individuals with more European ancestry, which could be explained by polygenic selection in ancestral European and/or Native American populations. Application of our approach to predict T2D in a South Asian UK Biobank cohort attained a >70% relative improvement in prediction accuracy, and application to predict height in an African UK Biobank cohort attained a 30% relative improvement. Our work reduces the gap in polygenic risk prediction accuracy between European and non-European target populations.

Author Summary: The use of genetic information to predict disease risk is of great interest because of its potential clinical application. Prediction is performed via the construction of polygenic risk scores, which separate individuals into different risk categories. Polygenic risk scores can also be applied to improve our understanding of the genetic architecture of complex diseases.
The ideal training data set would be a large cohort from the same population as the target sample, but this is generally unavailable for non-European populations. Thus, we propose a summary statistics based polygenic risk score that leverages both a large European training sample and a training sample from the same population as the target population. This approach produces a substantial relative improvement in prediction accuracy compared to methods that use a single training population when applied to predict type 2 diabetes in a Latino cohort, consistent with simulation results. We observed similar relative improvements in applications to predict type 2 diabetes in a South Asian cohort and height in an African cohort.
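As a rough illustration of the combination idea summarised above, the sketch below mixes two precomputed scores linearly and chooses the mixing weights by least squares on target-population validation data. The abstract does not spell out the combination rule, so the linear form, the function names (`fit_mixing_weights`, `combined_score`, `relative_improvement`) and the toy numbers are assumptions made for illustration, not the authors' implementation. The last line recomputes the relative improvement from the two R² values quoted in the abstract.

```python
def fit_mixing_weights(prs_eur, prs_target, phenotype):
    """Least-squares weights (a1, a2) for phenotype ~ a1*prs_eur + a2*prs_target.

    All inputs are assumed to be mean-centred lists of equal length, so no
    intercept is fitted. The 2x2 normal equations are solved directly.
    """
    s11 = sum(x * x for x in prs_eur)
    s22 = sum(z * z for z in prs_target)
    s12 = sum(x * z for x, z in zip(prs_eur, prs_target))
    s1y = sum(x * y for x, y in zip(prs_eur, phenotype))
    s2y = sum(z * y for z, y in zip(prs_target, phenotype))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("scores are collinear; weights are not identifiable")
    a1 = (s22 * s1y - s12 * s2y) / det
    a2 = (s11 * s2y - s12 * s1y) / det
    return a1, a2


def combined_score(prs_eur, prs_target, a1, a2):
    """Apply the fitted weights to obtain one combined score per individual."""
    return [a1 * x + a2 * z for x, z in zip(prs_eur, prs_target)]


def relative_improvement(r2_old, r2_new):
    """Relative gain in prediction R^2, as a fraction of the old value."""
    return (r2_new - r2_old) / r2_old


# Toy, made-up validation data (both score vectors have mean zero).
eur = [-1.0, 0.5, 1.5, -0.5, -0.5]
tgt = [0.2, -1.0, 0.3, 1.0, -0.5]
pheno = [0.7 * x + 0.3 * z for x, z in zip(eur, tgt)]

a1, a2 = fit_mixing_weights(eur, tgt, pheno)
scores = combined_score(eur, tgt, a1, a2)

# R^2 values as reported in the abstract: 0.027 (single source) vs 0.047.
gain = relative_improvement(0.027, 0.047)
```

Because the toy phenotype is an exact linear mix of the two scores, the fitted weights recover the mixing proportions; with real data the weights would be estimated with noise and should be evaluated out of sample.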

2020 ◽  
Vol 5 ◽  
pp. 206
Author(s):  
Mathilde Boecker ◽  
Alvina G. Lai

Over the past three decades, the number of people globally with diabetes mellitus has more than doubled. It is estimated that by 2030, 439 million people will be suffering from the disease, 90-95% of whom will have type 2 diabetes (T2D). In 2017, 5 million deaths globally were attributable to T2D, placing it in the top 10 global causes of death. Because T2D is a result of both genetic and environmental factors, identification of individuals with high genetic risk can help direct early interventions to prevent progression to more serious complications. Genome-wide association studies have identified ~400 variants associated with T2D that can be used to calculate polygenic risk scores (PRS). Although PRSs are not currently more accurate than clinical predictors and do not yet predict risk with equal accuracy across all ethnic populations, they have several potential clinical uses. Here, we discuss potential usages of PRS for predicting T2D and for informing and optimising interventions. We also touch on possible health inequality risks of PRS and the feasibility of large-scale implementation of PRS in clinical practice. Before PRSs can be used as a therapeutic tool, it is important that further polygenic risk models are derived using non-European genome-wide association studies to ensure that risk prediction is accurate for all ethnic groups. Furthermore, it is essential that the ethical, social and legal implications of PRS are considered before their implementation in any context.


Diabetes Care ◽  
2021 ◽  
pp. dc202049
Author(s):  
Yixuan He ◽  
Chirag M. Lakhani ◽  
Danielle Rasooly ◽  
Arjun K. Manrai ◽  
Ioanna Tzoulaki ◽  
...  

2016 ◽  
Vol 19 (3) ◽  
pp. 322-329 ◽  
Author(s):  
Kristi Läll ◽  
Reedik Mägi ◽  
Andrew Morris ◽  
Andres Metspalu ◽  
Krista Fischer

2020 ◽  
Vol 21 (5) ◽  
pp. 1703 ◽  
Author(s):  
Felipe Padilla-Martínez ◽  
Francois Collin ◽  
Miroslaw Kwasniewski ◽  
Adam Kretowski

Recent studies have led to considerable advances in the identification of genetic variants associated with type 1 and type 2 diabetes. One approach for converting genetic data into a predictive measure of disease susceptibility is to add the risk effects of loci into a polygenic risk score. To summarize the recent findings, we conducted a systematic review of studies comparing the accuracy of polygenic risk scores developed during the last two decades. We selected 15 risk scores from three databases (Scopus, Web of Science and PubMed) for inclusion in this systematic review. We identified three polygenic risk scores that discriminate between type 1 diabetes patients and healthy people, one that discriminates between type 1 and type 2 diabetes, two that discriminate between type 1 and monogenic diabetes, and nine polygenic risk scores that discriminate between type 2 diabetes patients and healthy people. Prediction accuracy of polygenic risk scores was assessed by comparing the area under the curve. The actual benefits, potential obstacles and possible solutions for the implementation of polygenic risk scores in clinical practice were also discussed. Developing strategies to establish the clinical validity of polygenic risk scores, by creating a framework for the interpretation of findings and their translation into actual evidence, is the way to demonstrate their utility in medical practice.
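The two technical ideas in this abstract, summing per-locus risk effects into a score and comparing scores by area under the curve, can be sketched in a few lines. This is a minimal illustration only: the effect sizes and genotypes are invented, and the function names `polygenic_risk_score` and `auc` are hypothetical rather than taken from any of the reviewed studies.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Sum of risk-allele counts (0, 1 or 2) weighted by per-locus effect size."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("one effect size per locus is required")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def auc(case_scores, control_scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen case scores higher than a randomly chosen control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical log odds ratios for three loci (not from any real GWAS).
betas = [0.10, 0.25, 0.05]

# Made-up risk-allele counts for three cases and three controls.
cases = [[2, 1, 0], [1, 2, 2], [0, 1, 1]]
controls = [[0, 0, 1], [1, 0, 0], [1, 1, 0]]

case_prs = [polygenic_risk_score(g, betas) for g in cases]
control_prs = [polygenic_risk_score(g, betas) for g in controls]
score_auc = auc(case_prs, control_prs)
```

In this toy data eight of the nine case-control pairs are ranked correctly, so the AUC is 8/9. The pairwise definition is quadratic in sample size; a rank-based formula would be preferable for large cohorts.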


2021 ◽  
Author(s):  
Sam Hodgson ◽  
Qin Qin Huang ◽  
Neneh Sallah ◽  
Chris J Griffiths ◽  
William Newman ◽  
...  

Background: Type 2 diabetes is a heterogeneous condition highly prevalent in British Pakistanis and Bangladeshis (BPB). The Genes & Health (G&H) cohort offers means to explore genetic determinants of disease in BPBs, combining genetic and lifelong health record data. Methods: We assessed whether common genetic loci associated with type 2 diabetes in European-ancestry individuals (EUR) replicate in G&H. We constructed a type 2 diabetes polygenic risk score (PRS) and combined it with a clinical risk instrument (QDiabetes) to build a novel, integrated risk tool (IRT). We compared IRT performance using net reclassification index (NRI) versus QDiabetes alone. We assessed the ability of the PRS to predict type 2 diabetes following gestational diabetes (GDM). We compared PRS distribution between type 2 diabetes subgroups identified by clinical features at diagnosis. Findings: Accounting for power, we replicated fewer loci associated with type 2 diabetes in G&H (n = 76/338, 22%) than would be expected if all EUR-ascertained loci were transferable (n = 95, 28%) (binomial p value = 0.01). In 13,648 patients free from type 2 diabetes followed up for 10 years, NRI was 3.2% for IRT versus QDiabetes (95% confidence interval 2.0 - 4.4%). IRT performance was best in reclassification of young adults deemed low risk by QDiabetes as high risk. PRS was independently associated with progression to type 2 diabetes after GDM (p = 0.028). Mean type 2 diabetes PRS differed between phenotypically-defined type 2 diabetes subgroups (p = 0.002). Interpretation: The type 2 diabetes PRS has broad potential clinical application in BPB, improving identification of type 2 diabetes risk (especially in the young), and characterisation of type 2 diabetes subgroups at diagnosis. Funding: Wellcome Trust, MRC, NIHR, and others. Full funding disclosed within.
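For readers unfamiliar with the net reclassification index used above to compare the integrated tool with QDiabetes, the sketch below computes a simple two-category NRI. The abstract does not say which NRI variant or risk threshold was used, so the two-category form, the function name `net_reclassification_index` and the toy classifications are assumptions for illustration.

```python
def net_reclassification_index(old_high, new_high, event):
    """Two-category NRI for a new risk tool versus an old one.

    old_high / new_high: booleans, whether each person is classed as high
    risk by the old and the new tool. event: booleans, whether each person
    went on to develop the outcome.
    NRI = [P(up | event) - P(down | event)]
        + [P(down | no event) - P(up | no event)].
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, had_event in zip(old_high, new_high, event):
        moved_up = (not old) and new
        moved_down = old and (not new)
        if had_event:
            n_e += 1
            up_e += int(moved_up)
            down_e += int(moved_down)
        else:
            n_n += 1
            up_n += int(moved_up)
            down_n += int(moved_down)
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Made-up example: first four people develop the outcome, last four do not.
event = [True, True, True, True, False, False, False, False]
old_high = [False, False, True, True, True, True, False, False]
new_high = [True, False, True, True, False, True, False, False]

nri = net_reclassification_index(old_high, new_high, event)
```

Here one of four events is correctly moved up and one of four non-events is correctly moved down, giving an NRI of 0.25 + 0.25 = 0.5. The reported 3.2% corresponds to a value of 0.032 on this scale.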


2020 ◽  
Author(s):  
K Dziopa ◽  
F W Asselbergs ◽  
J Gratton ◽  
N Chaturvedi ◽  
A F Schmidt

Objective: To compare performance of general and diabetes specific cardiovascular risk prediction scores in type 2 diabetes patients (T2DM). Design: Cohort study. Setting: Scores were identified through a systematic review and included irrespective of predicted outcome, or inclusion of T2DM patients. Performance was assessed using data from routine practice. Participants: A contemporary representative sample of 203,172 UK T2DM patients (age ≥ 18 years). Main outcome measures: Cardiovascular disease (CVD, i.e., coronary heart disease and stroke) and CVD+ (including atrial fibrillation and heart failure). Results: We identified 22 scores: 11 derived in the general population, 9 in only T2DM patients, and 2 that excluded T2DM patients. Over 10 years follow-up, 63,000 events occurred. The RECODE score, derived in people with T2DM, performed best for both CVD (c-statistic 0.731 (0.728, 0.734)) and CVD+ (0.732 (0.729, 0.735)). Overall, neither derivation population nor original predicted outcome influenced performance. Calibration slopes (1 indicates perfect calibration) ranged from 0.38 (95%CI 0.37;0.39) to 1.05 (95%CI 1.03;1.07). A simple, population specific recalibration process considerably improved calibration, with slopes then ranging between 0.98 and 1.03. Risk scores performed badly in people with pre-existing CVD (c-statistic ∼0.55). Scores with more predictors did not perform better: for CVD+, QRISK3 (19 variables) had a c-statistic of 0.69 (95%CI 0.68;0.69), compared to 0.71 (95%CI 0.70;0.71) for CHD Basic (8 variables). Conclusions: CVD risk prediction scores performed well in T2DM, irrespective of derivation population and of original predicted outcome. Scores performed poorly in patients with established CVD. Complex scores with multiple variables did not outperform simple scores. A simple population specific recalibration markedly improved score performance and is recommended for future use.
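The calibration slope and the recalibration step mentioned in this abstract can be illustrated with a small logistic recalibration sketch. The abstract does not describe the recalibration procedure, so the approach here (regressing the outcome on the logit of the predicted risk, then applying the fitted intercept and slope) is an assumed, commonly used method rather than the authors' own; the function names and the toy data are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_calibration(pred_risk, outcome, iters=50):
    """Fit outcome ~ sigmoid(a + b * logit(pred_risk)) by Newton's method.

    Returns (a, b): the calibration intercept and slope. A slope of 1 with
    intercept 0 means the predicted risks are already well calibrated.
    """
    x = [logit(p) for p in pred_risk]
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcome):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu              # gradient w.r.t. intercept
            g1 += (yi - mu) * xi       # gradient w.r.t. slope
            h00 += w                   # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


def recalibrate(p, a, b):
    """Map an original predicted risk to its recalibrated value."""
    return sigmoid(a + b * logit(p))


# Toy data: predicted risks that are too extreme (true slope 0.5). Fractional
# outcomes stand in for the observed event proportion in each risk group.
raw = [0.05, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9]
observed = [sigmoid(0.5 * logit(p)) for p in raw]

a, b = fit_calibration(raw, observed)
recalibrated = [recalibrate(p, a, b) for p in raw]
a_after, b_after = fit_calibration(recalibrated, observed)
```

In this constructed example the fitted slope is 0.5 before recalibration and 1 afterwards, which mirrors the direction of the improvement the abstract reports. In practice the recalibration would be fitted on one sample and checked on another.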


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Yixuan Ye ◽  
Xi Chen ◽  
James Han ◽  
Wei Jiang ◽  
Pradeep Natarajan ◽  
...  

Background: Both lifestyle and genetic factors confer risk for cardiovascular diseases, type 2 diabetes, and dyslipidemia. However, the interactions between these 2 groups of risk factors were not comprehensively understood due to previous poor estimation of genetic risk. Here we set out to develop enhanced polygenic risk scores (PRS) and systematically investigate multiplicative and additive interactions between PRS and lifestyle for coronary artery disease, atrial fibrillation, type 2 diabetes, total cholesterol, triglyceride, and LDL-cholesterol. Methods: Our study included 276 096 unrelated White British participants from the UK Biobank. We investigated several PRS methods (P+T, LDpred, PRS continuous shrinkage, and AnnoPred) and showed that AnnoPred achieved consistently improved prediction accuracy for all 6 diseases/traits. With enhanced PRS and combined lifestyle status categorized by smoking, body mass index, physical activity, and diet, we investigated both multiplicative and additive interactions between PRS and lifestyle using regression models. Results: We observed that healthy lifestyle reduced disease incidence by similar multiplicative magnitude across different PRS groups. The absolute risk reduction from lifestyle adherence was, however, significantly greater in individuals with higher PRS. Specifically, for type 2 diabetes, the absolute risk reduction from lifestyle adherence was 12.4% (95% CI, 10.0%–14.9%) in the top 1% PRS versus 2.8% (95% CI, 2.3%–3.3%) in the bottom PRS decile, leading to a ratio of >4.4. We also observed a significant interaction effect between PRS and lifestyle on triglyceride level. Conclusions: By leveraging functional annotations, AnnoPred outperforms state-of-the-art methods on quantifying genetic risk through PRS. Our analyses based on enhanced PRS suggest that individuals with high genetic risk may derive similar relative but greater absolute benefit from lifestyle adherence.
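The distinction between similar relative benefit and greater absolute benefit is simple arithmetic, sketched below. The baseline risks and the relative risk are invented for illustration; only the two absolute risk reductions in the final line (12.4% and 2.8%) come from the abstract. The function name `absolute_risk_reduction` is hypothetical.

```python
def absolute_risk_reduction(baseline_risk, relative_risk):
    """Risk difference when a healthy lifestyle multiplies the baseline
    (unhealthy-lifestyle) risk by relative_risk."""
    return baseline_risk * (1.0 - relative_risk)


# Hypothetical numbers: the same relative risk applied to two PRS groups.
relative_risk = 0.6                 # assumed, identical in both groups
baseline_high_prs = 0.30            # assumed risk in a high-PRS group
baseline_low_prs = 0.07             # assumed risk in a low-PRS group

arr_high = absolute_risk_reduction(baseline_high_prs, relative_risk)  # about 0.12
arr_low = absolute_risk_reduction(baseline_low_prs, relative_risk)    # about 0.028

# With equal relative effects, the ratio of absolute reductions equals the
# ratio of baseline risks.
ratio_hypothetical = arr_high / arr_low

# Ratio of the absolute risk reductions reported in the abstract.
ratio_reported = 12.4 / 2.8
```

The reported ratio works out to roughly 4.43, consistent with the ">4.4" stated in the abstract.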


JAMIA Open ◽  
2020 ◽  
Author(s):  
Jackie Szymonifka ◽  
Sarah Conderino ◽  
Christine Cigolle ◽  
Jinkyung Ha ◽  
Mohammed Kabeto ◽  
...  

Objective: Electronic health records (EHRs) have become a common data source for clinical risk prediction, offering large sample sizes and frequently sampled metrics. There may be notable differences between hospital-based EHR and traditional cohort samples: EHR data often are not population-representative random samples, even for particular diseases, as EHR patients tend to be sicker with higher healthcare utilization, while cohort studies often sample healthier subjects who typically are more likely to participate. We investigate heterogeneities between EHR- and cohort-based inferences including incidence rates, risk factor identifications/quantifications, and absolute risks. Materials and methods: This is a retrospective cohort study of older patients with type 2 diabetes using EHR from New York University Langone Health ambulatory care (NYULH-EHR, years 2009–2017) and from the Health and Retirement Survey (HRS, 1995–2014) to study subsequent cardiovascular disease (CVD) risks. We used the same eligibility criteria, outcome definitions, and demographic covariates/biomarkers in both datasets. We compared subsequent CVD incidence rates, hazard ratios (HRs) of risk factors, and discrimination/calibration performances of CVD risk scores. Results: The estimated subsequent total CVD incidence rate was 37.5 and 90.6 per 1000 person-years since T2DM onset in HRS and NYULH-EHR, respectively. HR estimates were comparable between the datasets for most demographic covariates/biomarkers. Common CVD risk scores underestimated observed total CVD risks in NYULH-EHR. Discussion and conclusion: EHR-estimated HRs of demographic and major clinical risk factors for CVD were mostly consistent with the estimates from a national cohort, despite high incidences and absolute risks of total CVD outcome in the EHR samples.


2021 ◽  
Author(s):  
Tian Ge ◽  
Amit Patki ◽  
Vinodh Srinivasasainagendra ◽  
Yen-Feng Lin ◽  
Marguerite Ryan Irvin ◽  
...  

Abstract: Type 2 diabetes (T2D) is a worldwide scourge caused by both genetic and environmental risk factors that disproportionately afflicts communities of color. Leveraging existing large-scale genome-wide association studies (GWAS), polygenic risk scores (PRS) have shown promise to complement established clinical risk factors and intervention paradigms, and improve early diagnosis and prevention of T2D. However, to date, T2D PRS have been most widely developed and validated in individuals of European descent. Comprehensive assessment of T2D PRS in non-European populations is critical for an equitable deployment of PRS to clinical practice that benefits global populations. Here we integrate T2D GWAS in European, African American and East Asian populations to construct a trans-ancestry T2D PRS using a newly developed Bayesian polygenic modeling method, and evaluate the PRS in the multi-ethnic eMERGE study, four African American cohorts, and the Taiwan Biobank. The trans-ancestry PRS was significantly associated with T2D status across the ancestral groups examined, and the top 2% of the PRS distribution can identify individuals with an approximately 2.5- to 4.5-fold increase in T2D risk, suggesting the potential of using the trans-ancestry PRS as a meaningful index of risk among diverse patients in clinical settings. Our efforts represent the first step towards the implementation of the T2D PRS into routine healthcare.

