Predicting Type 2 Diabetes Using Logistic Regression and Machine Learning Approaches

Author(s):  
Ram D. Joshi ◽  
Chandra K. Dhakal

Diabetes mellitus is one of the most common human diseases worldwide and may cause several health-related complications. It is responsible for considerable morbidity, mortality, and economic loss. A timely diagnosis and prediction of this disease could give patients an opportunity to adopt appropriate preventive and treatment strategies. To improve the understanding of risk factors, we predict type 2 diabetes for Pima Indian women using a logistic regression model and a decision tree, a machine learning algorithm. Our analysis finds five main predictors of type 2 diabetes: glucose, pregnancy, body mass index (BMI), diabetes pedigree function, and age. We further explore a classification tree to complement and validate our analysis. The six-fold classification tree indicates glucose, BMI, and age are important factors, while the ten-node tree implies glucose, BMI, pregnancy, diabetes pedigree function, and age as the significant predictors. Our preferred specification yields a prediction accuracy of 78.26% and a cross-validation error rate of 21.74%. We argue that our model can be applied to make a reasonable prediction of type 2 diabetes, and could potentially be used to complement existing preventive measures to curb the incidence of diabetes and reduce associated costs.

Circulation ◽  
2017 ◽  
Vol 135 (suppl_1) ◽  
Author(s):  
Samantha E Berger ◽  
Gordon S Huggins ◽  
Jeanne M McCaffery ◽  
Alice H Lichtenstein

Introduction: The development of type 2 diabetes is strongly associated with excess weight gain and can often be partially ameliorated or reversed by weight loss. While many lifestyle interventions have resulted in successful weight loss, strategies to maintain the weight loss have been considerably less successful. Prior studies have identified multiple predictors of weight regain, but none have synthesized them into one analytic stream. Methods: We developed a prediction model of 4-year weight regain after a one-year lifestyle-induced weight loss intervention followed by a 3-year maintenance intervention in 1791 overweight or obese adults with type 2 diabetes from the Action for Health in Diabetes (Look AHEAD) trial who lost ≥3% of initial weight by the end of year 1. Weight regain was defined as regaining >50% of the weight lost during the intervention by year 4. Using machine learning, we integrated factors from several domains, including demographics, psychosocial metrics, health status and behaviors (e.g. physical activity, self-monitoring, medication use and intervention adherence). We used classification trees and stochastic gradient boosting with 10-fold cross validation to develop and internally validate the prediction model. Results: At the end of four years, 928 individuals maintained ≥50% of their initial weight lost (maintainers), whereas 863 did not meet that criterion (regainers). We identified an interaction between age and several variables in the model, as well as percent initial weight loss. Several factors were significant predictors of weight regain based on variable importance plots, regardless of age or initial weight loss, such as insurance status, physical function score, baseline BMI, meal replacement use and minutes of exercise recorded during year 1. We also identified several factors that were significant predictors depending on age group (45-55y/56-65y/66-76y) and initial weight loss (lost 3-9% vs. ≥10% of initial weight).
When the variables identified from machine learning were added to a logistic regression model stratified by age and initial weight loss groups, the models showed good prediction (3-9% initial weight loss, ages 45-55y (n=293): ROC AUC=0.78; ≥10% initial weight loss, ages 45-55y (n=242): ROC AUC=0.78; 3-9% initial weight loss, ages 56-65y (n=484): ROC AUC=0.70; ≥10% initial weight loss, ages 56-65y (n=455): ROC AUC=0.74; 3-9% initial weight loss, ages 66-76y (n=150): ROC AUC=0.84; ≥10% initial weight loss, ages 66-76y (n=167): ROC AUC=0.86). Conclusion: The combination of machine learning methodology and logistic regression generates a prediction model that can consider numerous factors simultaneously, can be used to predict weight regain in other populations and can assist in the development of better strategies to prevent post-loss regain.
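The ROC AUC values reported for each stratum can be read as the probability that a randomly chosen regainer receives a higher predicted risk than a randomly chosen maintainer. A minimal sketch of that pairwise definition follows; the scores and labels are hypothetical and are not taken from the Look AHEAD data, and `roc_auc` is an illustrative name.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted probabilities of weight regain (1 = regainer).
predicted = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
observed = [1, 1, 1, 0, 0, 0]
print(roc_auc(predicted, observed))  # 7 of 9 pairs ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the 0.70 to 0.86 range reported above.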


2020 ◽  
Author(s):  
Ada Admin ◽  
Jialing Huang ◽  
Cornelia Huth ◽  
Marcela Covic ◽  
Martina Troll ◽  
...  

Early and precise identification of individuals with pre-diabetes and type 2 diabetes (T2D) at risk of progressing to chronic kidney disease (CKD) is essential to prevent complications of diabetes. Here, we identify and evaluate prospective metabolite biomarkers and the best set of predictors of CKD in the longitudinal, population-based Cooperative Health Research in the Region of Augsburg (KORA) cohort by targeted metabolomics and machine learning approaches. Out of 125 targeted metabolites, sphingomyelin (SM) C18:1 and phosphatidylcholine diacyl (PC aa) C38:0 were identified as candidate metabolite biomarkers of incident CKD specifically in hyperglycemic individuals followed for 6.5 years. Sets of predictors for incident CKD developed from 125 metabolites and 14 clinical variables showed highly stable performance in all three machine learning approaches and outperformed the currently established clinical algorithm for CKD. The two metabolites in combination with five clinical variables were identified as the best set of predictors, yielding a mean area under the receiver operating characteristic curve of 0.857. The inclusion of metabolite variables in the clinical prediction of future CKD may thus improve risk prediction in persons with pre-diabetes and T2D. The metabolite link with hyperglycemia-related early kidney dysfunction warrants further investigation.


2020 ◽  
Vol 4 (Supplement_2) ◽  
pp. 1559-1559
Author(s):  
Wanglong Gou ◽  
Chu-Wen Ling ◽  
Yan He ◽  
Zengliang Jiang ◽  
Yuanqing Fu ◽  
...  

Abstract. Objectives: The gut microbiome-type 2 diabetes (T2D) relationship among human cohorts has been controversial. We hypothesized that this limitation could be addressed by integrating a cutting-edge interpretable machine learning framework and large-scale human cohort studies. Methods: Three independent cohorts with >9000 participants were included in this study. We proposed a new machine learning-based analytic framework: using LightGBM to infer the relationship between incorporated features and T2D, and SHapley Additive exPlanations (SHAP) to identify microbiome features associated with the risk of T2D. We then generated a microbiome risk score (MRS) integrating the threshold and direction of the identified microbiome features to predict T2D risk. Results: We identified 15 microbiome features (two of them indicators of microbial diversity, the others taxa-related features) associated with the risk of T2D. The identified T2D-related gut microbiome features showed superior T2D prediction accuracy compared to host genetics or traditional risk factors. Furthermore, we found that the MRS (per unit change in MRS) consistently showed a positive association with T2D risk in the discovery cohort (RR 1.28, 95%CI 1.23-1.33), external validation cohort 1 (RR 1.23, 95%CI 1.13-1.34) and external validation cohort 2 (GGMP, RR 1.12, 95%CI 1.06-1.18). The MRS could also predict future glucose increment. We subsequently identified dietary and lifestyle factors that could prospectively modulate the microbiome features, and found that body fat distribution may be the key factor modulating the gut microbiome-T2D relationship. Conclusions: Taken together, we proposed a new analytical framework for the investigation of the microbiome-disease relationship. The identified microbiome features may serve as potential drug targets for T2D in the future.
Funding Sources: This study was funded by the National Natural Science Foundation of China (81903316, 81773416), Westlake University (101396021801) and the 5010 Program for Clinical Researches (2007032) of the Sun Yat-sen University (Guangzhou, China).
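The abstract describes the MRS only as "integrating the threshold and direction" of the identified features and does not give the formula. One plausible reading, shown here purely as an assumption, is a count of features that fall on the risk side of their cut-point. The feature names, thresholds and directions below are hypothetical and do not come from the study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FeatureRule:
    name: str         # hypothetical feature name
    threshold: float  # hypothetical cut-point
    direction: int    # +1: values above threshold raise risk; -1: values below do

def risk_score(sample, rules):
    """Count how many features fall on the risk side of their threshold.

    Assumed scoring scheme for illustration; the study's actual MRS
    definition is not given in the abstract.
    """
    score = 0
    for rule in rules:
        value = sample[rule.name]
        if rule.direction > 0 and value > rule.threshold:
            score += 1
        elif rule.direction < 0 and value < rule.threshold:
            score += 1
    return score

# Hypothetical rules and one hypothetical participant.
rules = [
    FeatureRule("taxon_a", 0.10, +1),
    FeatureRule("taxon_b", 0.05, -1),
    FeatureRule("shannon_diversity", 3.0, -1),
]
sample = {"taxon_a": 0.25, "taxon_b": 0.08, "shannon_diversity": 2.4}
print(risk_score(sample, rules))  # taxon_a and shannon_diversity count
```

Under this reading, a "per unit change in MRS" corresponds to one additional feature crossing into its risk range.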


2020 ◽  
Vol 222 (1) ◽  
pp. S228
Author(s):  
Ohad Houri ◽  
Yotam Gil ◽  
Alexandra Berezowsky ◽  
Arnon Wiznitzer ◽  
Eran Hadar ◽  
...  


2021 ◽  
Author(s):  
Shula Shazman

Intermittent fasting (IF) is the cycling between periods of eating and fasting. One form of IF is intermittent energy restriction (IER), whose two most popular variants are the 5:2 diet, characterized by two consecutive or non-consecutive “fast” days, and alternate-day energy restriction, commonly called alternate-day fasting (ADF). The second form is time-restricted feeding (TRF), in which eating is confined to specific time frames, as in the prevalent 16:8 diet, with 16 hours of fasting and 8 hours for eating. It is already known that IF can bring about changes in metabolic parameters related to type 2 diabetes (T2D). Furthermore, IF can be effective in improving health by reducing metabolic disorders and age-related diseases. However, it is not yet clear whether the age at which fasting begins, gender and severity of T2D influence the effectiveness of the different types of IF in reducing metabolic disorders. In this chapter I will present the risk factors of T2D, the different types of IF interventions and the research-based knowledge regarding the effect of IF on T2D. Furthermore, I will describe several machine learning approaches to provide a recommendation system which reveals a set of rules that can assist in selecting a successful IF intervention for a personal case. Finally, I will discuss the question: Can we predict the optimal IF intervention for a prediabetes patient?


Author(s):  
M. Lincy ◽  
A. Meena Kowshalya

Data privacy and security are incredibly important in the healthcare industry. Federated learning is a new way of training a machine learning algorithm using distributed data that is not hosted on a centralized server. Numerous centralized machine learning models exist in the literature, but none offers privacy for users’ data. This paper proposes a federated learning approach, built on a simple federated architecture, for early detection of type 2 diabetes among patients. We compare the proposed federated learning model against our centralised approach. Experimental results show that the federated learning model provides significantly more privacy than the centralised learning model while compromising accuracy only slightly.
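The abstract does not say which aggregation rule the federated architecture uses. A common choice is federated averaging, sketched below as an assumption rather than the authors' method: each site trains locally and shares only its model weights, which the server averages in proportion to local sample counts. The weight vectors and site sizes are made up.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client weight vectors, weighted by local sample counts.

    Only model weights reach the server; raw patient records stay on site.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        for j in range(dim):
            averaged[j] += weights[j] * size / total
    return averaged

# Hypothetical locally trained weights from three sites (made-up values).
hospital_models = [[0.2, -1.0], [0.4, -0.6], [0.1, -0.8]]
hospital_sizes = [100, 300, 100]

global_model = federated_average(hospital_models, hospital_sizes)
print(global_model)
```

In a full training loop the server would send `global_model` back to the sites for another round of local training, repeating until the model converges.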

