Predicting Type 2 Diabetes Through Machine Learning: Performance Analysis in Balanced and Imbalanced Data

Author(s):  
Francisco Mesquita ◽  
Gonçalo Marques
Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1158-P
Author(s):  
LI CHEN ◽  
LINGGE FENG ◽  
CUI TANG ◽  
YI ZHANG

Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1311-P
Author(s):  
XIN CHEN ◽  
GAIL FERNANDES ◽  
JIE CHEN ◽  
ZHIWEN LIU ◽  
RICHARD BAUMGARTNER

Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 1552-P
Author(s):  
KAZUYA FUJIHARA ◽  
MAYUKO H. YAMADA ◽  
YASUHIRO MATSUBAYASHI ◽  
MASAHIKO YAMAMOTO ◽  
TOSHIHIRO IIZUKA ◽  
...  

2016 ◽  
Vol 11 (4) ◽  
pp. 791-799 ◽  
Author(s):  
Rina Kagawa ◽  
Yoshimasa Kawazoe ◽  
Yusuke Ida ◽  
Emiko Shinohara ◽  
Katsuya Tanaka ◽  
...  

Background: Phenotyping is an automated technique that can be used to distinguish patients based on electronic health records. To improve the quality of medical care and advance type 2 diabetes mellitus (T2DM) research, the demand for T2DM phenotyping has been increasing. Some existing phenotyping algorithms are not sufficiently accurate for screening or identifying clinical research subjects. Objective: We propose a practical phenotyping framework using both expert knowledge and a machine learning approach to develop 2 phenotyping algorithms: one is for screening; the other is for identifying research subjects. Methods: We employ expert knowledge as rules to exclude obvious control patients and machine learning to increase accuracy for complicated patients. We developed phenotyping algorithms on the basis of our framework and performed binary classification to determine whether a patient has T2DM. To facilitate development of practical phenotyping algorithms, this study introduces new evaluation metrics: area under the precision-sensitivity curve (AUPS) with a high sensitivity and AUPS with a high positive predictive value. Results: The proposed phenotyping algorithms based on our framework show higher performance than baseline algorithms. Our proposed framework can be used to develop 2 types of phenotyping algorithms depending on the tuning approach: one for screening, the other for identifying research subjects. Conclusions: We develop a novel phenotyping framework that can be easily implemented on the basis of proper evaluation metrics, which are in accordance with users’ objectives. The phenotyping algorithms based on our framework are useful for extraction of T2DM patients in retrospective studies.
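The abstract names the AUPS metrics but does not define them precisely. As a rough illustration only, the sketch below assumes that "AUPS with a high sensitivity" means the trapezoidal area under the precision-sensitivity curve restricted to sensitivity at or above a chosen cut-off. The function names, the `min_sensitivity` parameter, and the linear interpolation at the cut-off are assumptions made for this example, not the authors' definitions.

```python
def precision_sensitivity_points(labels, scores):
    """Return (sensitivity, precision) points, one per distinct score threshold.

    labels: 1 for a T2DM case, 0 for a control (hypothetical encoding).
    scores: classifier scores, where higher means more likely a case.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Patients with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


def partial_aups(labels, scores, min_sensitivity=0.0):
    """Trapezoidal area under the curve where sensitivity >= min_sensitivity.

    The area starts at the first observed point, so the region to the left
    of the lowest observed sensitivity is not counted.
    """
    points = precision_sensitivity_points(labels, scores)
    area = 0.0
    for (s0, p0), (s1, p1) in zip(points, points[1:]):
        if s1 <= min_sensitivity:
            continue  # segment lies entirely below the cut-off
        if s0 < min_sensitivity:
            # Assumption: interpolate precision linearly at the cut-off.
            frac = (min_sensitivity - s0) / (s1 - s0)
            p0 = p0 + frac * (p1 - p0)
            s0 = min_sensitivity
        area += (s1 - s0) * (p0 + p1) / 2
    return area


if __name__ == "__main__":
    example_labels = [1, 1, 0, 1, 0]
    example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    print(partial_aups(example_labels, example_scores, min_sensitivity=0.5))
```

Under this reading, a screening-oriented algorithm would be tuned to maximize the area in the high-sensitivity region. The companion metric for identifying research subjects would presumably restrict the curve to the high positive predictive value region instead.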


Circulation ◽  
2017 ◽  
Vol 135 (suppl_1) ◽  
Author(s):  
Samantha E Berger ◽  
Gordon S Huggins ◽  
Jeanne M McCaffery ◽  
Alice H Lichtenstein

Introduction: The development of type 2 diabetes is strongly associated with excess weight gain and can often be partially ameliorated or reversed by weight loss. While many lifestyle interventions have resulted in successful weight loss, strategies to maintain the weight loss have been considerably less successful. Prior studies have identified multiple predictors of weight regain, but none have synthesized them into one analytic stream. Methods: We developed a prediction model of 4-year weight regain after a one-year lifestyle-induced weight loss intervention followed by a 3-year maintenance intervention in 1791 overweight or obese adults with type 2 diabetes from the Action for Health in Diabetes (Look AHEAD) trial who lost ≥3% of initial weight by the end of year 1. Weight regain was defined as regaining >50% of the weight lost during the intervention by year 4. Using machine learning we integrated factors from several domains, including demographics, psychosocial metrics, health status and behaviors (e.g. physical activity, self-monitoring, medication use and intervention adherence). We used classification trees and stochastic gradient boosting with 10-fold cross-validation to develop and internally validate the prediction model. Results: At the end of four years, 928 individuals maintained ≥50% of their initial weight lost (maintainers), whereas 863 did not meet that criterion (regainers). We identified an interaction between age and several variables in the model, as well as percent initial weight loss. Several factors were significant predictors of weight regain based on variable importance plots, regardless of age or initial weight loss, such as insurance status, physical function score, baseline BMI, meal replacement use and minutes of exercise recorded during year 1. We also identified several factors that were significant predictors depending on age group (45-55y/56-65y/66-76y) and initial weight loss (lost 3-9% vs. ≥10% of initial weight). 
When the variables identified from machine learning were added to a logistic regression model stratified by age and initial weight loss groups, the models showed good prediction (3-9% initial weight loss, ages 45-55y (n=293): ROC AUC=0.78; ≥10% initial weight loss, ages 45-55y (n=242): ROC AUC=0.78; 3-9% initial weight loss, ages 56-65y (n=484): ROC AUC=0.70; ≥10% initial weight loss, ages 56-65y (n=455): ROC AUC=0.74; 3-9% initial weight loss, ages 66-76y (n=150): ROC AUC=0.84; ≥10% initial weight loss, ages 66-76y (n=167): ROC AUC=0.86). Conclusion: The combination of machine learning methodology and logistic regression generates a prediction model that can consider numerous factors simultaneously, can be used to predict weight regain in other populations and can assist in the development of better strategies to prevent post-loss regain.
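To make the stratified reporting above concrete, the following minimal sketch computes a ROC AUC separately for each stratum from predicted scores. It assumes the scores have already been produced by some fitted model. The record layout, the stratum labels, and the function names are hypothetical, and the pairwise (rank-based) AUC formula is a standard equivalent of the area under the ROC curve rather than the authors' own code.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("each stratum needs both classes")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_stratum(records):
    """records: iterable of (stratum, label, score) tuples.

    label is 1 for a regainer and 0 for a maintainer (hypothetical encoding);
    stratum is any hashable key, e.g. ("45-55y", "3-9%").
    """
    grouped = defaultdict(lambda: ([], []))
    for stratum, label, score in records:
        grouped[stratum][0].append(label)
        grouped[stratum][1].append(score)
    return {key: roc_auc(ys, ss) for key, (ys, ss) in grouped.items()}


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    toy = [
        (("45-55y", "3-9%"), 1, 0.9),
        (("45-55y", "3-9%"), 0, 0.1),
        (("45-55y", "3-9%"), 1, 0.4),
        (("45-55y", "3-9%"), 0, 0.6),
    ]
    print(auc_by_stratum(toy))
```

The pairwise loop is quadratic in stratum size, which is acceptable for strata of a few hundred participants like those listed above. A sort-based rank computation would scale better for larger groups.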


Author(s):  
Erika Severeyn ◽  
Sara Wong ◽  
Jesús Velásquez ◽  
Gilberto Perpiñán ◽  
Héctor Herrera ◽  
...  

Author(s):  
Muhammad Younus ◽  
Md Tahsir Ahmed Munna ◽  
Mirza Mohtashim Alam ◽  
Shaikh Muhammad Allayear ◽  
Sheikh Joly Ferdous Ara

2020 ◽  
Author(s):  
Ada Admin ◽  
Jialing Huang ◽  
Cornelia Huth ◽  
Marcela Covic ◽  
Martina Troll ◽  
...  

Early and precise identification of individuals with pre-diabetes and type 2 diabetes (T2D) at risk of progressing to chronic kidney disease (CKD) is essential to prevent complications of diabetes. Here, we identify and evaluate prospective metabolite biomarkers and the best set of predictors of CKD in the longitudinal, population-based Cooperative Health Research in the Region of Augsburg (KORA) cohort by targeted metabolomics and machine learning approaches. Out of 125 targeted metabolites, sphingomyelin (SM) C18:1 and phosphatidylcholine diacyl (PC aa) C38:0 were identified as candidate metabolite biomarkers of incident CKD specifically in hyperglycemic individuals followed for 6.5 years. Sets of predictors for incident CKD developed from 125 metabolites and 14 clinical variables showed highly stable performance in all three machine learning approaches and outperformed the currently established clinical algorithm for CKD. The two metabolites in combination with five clinical variables were identified as the best set of predictors, yielding a mean area under the receiver operating characteristic curve of 0.857. The inclusion of metabolite variables in the clinical prediction of future CKD may thus improve risk prediction in persons with pre-diabetes and T2D. The metabolite link with hyperglycemia-related early kidney dysfunction warrants further investigation.

