Logistic regression has similar performance to optimised machine learning algorithms in a clinical setting: application to the discrimination between type 1 and type 2 diabetes in young adults

2020 ◽  
Vol 4 (1) ◽  
Author(s):  
Anita L. Lynam ◽  
John M. Dennis ◽  
Katharine R. Owen ◽  
Richard A. Oram ◽  
Angus G. Jones ◽  
...  
Author(s):  
Sushma Jaiswal ◽  
Tarun Jaiswal

Introduction: Developing a practical diabetes diagnosis system that draws on advances in computational intelligence is currently regarded as a major objective. Numerous approaches based on artificial neural networks and machine learning have been developed and tested on diabetes datasets, most of them drawn from individuals of Pima Indian descent. Yet despite reported accuracies of up to 99-100% in predicting the correct diabetes diagnosis, none of these methods has reached clinical application so far. Tools such as machine learning (ML) and data mining are used to identify diabetes correctly, and they improve the diagnostic process for type 2 diabetes mellitus (T2DM). T2DM is a major problem in several developing countries, but early diagnosis can enable better treatment and save many lives. Accordingly, a system that diagnoses type 2 diabetes is needed. In this paper, a fuzzy expert system is proposed that uses the Mamdani fuzzy inference system (MFIS) to diagnose type 2 diabetes effectively. To evaluate the proposed system, a comparative study was carried out that compares it with machine learning algorithms, specifically the J48 decision tree (DT), multilayer perceptron (MLP), support vector machine (SVM), and Naïve Bayes (NB), as well as fusion and mixed fusion-based methods. The developed fuzzy expert system (FES) and the machine learning algorithms are validated with real data from the UCI machine learning datasets. Furthermore, the performance of the fuzzy expert system is assessed by comparing it with related work that used the MFIS to detect type 2 diabetes. Objective: This survey paper presents a review of recent advances in machine learning based classification models for the diagnosis of diabetes.
Methods: This paper reviews extensive work on machine learning based classification models for the diagnosis of type 2 diabetes, in which modified fusions of machine learning methods are compared with basic models (radial basis function, K-nearest neighbor, support vector machine, J48, logistic regression, classification and regression trees, etc.) on training and testing data. Results: Fig. 3 and Fig. 4 summarize the prediction accuracy of each classifier on the training and testing data. Conclusion: The fuzzy expert system is the best among its rival classifiers; SVM performs very poorly, with a very low true positive rate, i.e. a very high number of positive cases misclassified as negative (non-diabetic). The evaluation shows that the fuzzy expert system has the highest precision. J48, however, is the least accurate classifier: it has the highest number of false positives among the classifiers in the testing part. The fuzzy expert system has the highest values for both precision and recall, and therefore the highest F-measure on the training and testing datasets. J48 is the second-best classifier on the training dataset, whereas Naïve Bayes ranks second on the testing dataset.
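The precision, recall and F-measure used to rank the classifiers above can be illustrated with a minimal sketch. The function name and the confusion-matrix counts below are hypothetical and are not taken from the paper; the formulas are the standard definitions.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) from confusion-matrix counts.

    tp: diabetic cases correctly flagged as diabetic
    fp: non-diabetic cases wrongly flagged as diabetic
    fn: diabetic cases missed (classified as non-diabetic)
    """
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f = precision_recall_f1(tp=40, fp=10, fn=20)
print(f"precision={p:.3f} recall={r:.3f} F-measure={f:.3f}")
```

A classifier with a very low true positive rate, as reported for SVM, would show up here as a low recall even if its precision were acceptable.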


Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 1290-P
Author(s):  
GIUSEPPE D’ANNUNZIO ◽  
ROBERTO BIASSONI ◽  
MARGHERITA SQUILLARIO ◽  
ELISABETTA UGOLOTTI ◽  
ANNALISA BARLA ◽  
...  

Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 279-OR
Author(s):  
ALLISON SHAPIRO ◽  
DANA DABELEA ◽  
JEANETTE M. STAFFORD ◽  
RALPH DAGOSTINO ◽  
CATHERINE PIHOKER ◽  
...  

Circulation ◽  
2017 ◽  
Vol 135 (suppl_1) ◽  
Author(s):  
Samantha E Berger ◽  
Gordon S Huggins ◽  
Jeanne M McCaffery ◽  
Alice H Lichtenstein

Introduction: The development of type 2 diabetes is strongly associated with excess weight gain and can often be partially ameliorated or reversed by weight loss. While many lifestyle interventions have resulted in successful weight loss, strategies to maintain the weight loss have been considerably less successful. Prior studies have identified multiple predictors of weight regain, but none have synthesized them into one analytic stream. Methods: We developed a prediction model of 4-year weight regain after a 1-year lifestyle-induced weight loss intervention followed by a 3-year maintenance intervention in 1791 overweight or obese adults with type 2 diabetes from the Action for Health in Diabetes (Look AHEAD) trial who lost ≥3% of initial weight by the end of year 1. Weight regain was defined as regaining >50% of the weight lost during the intervention by year 4. Using machine learning we integrated factors from several domains, including demographics, psychosocial metrics, health status and behaviors (e.g. physical activity, self-monitoring, medication use and intervention adherence). We used classification trees and stochastic gradient boosting with 10-fold cross validation to develop and internally validate the prediction model. Results: At the end of four years, 928 individuals maintained ≥50% of their initial weight lost (maintainers), whereas 863 did not meet that criterion (regainers). We identified an interaction between age and several variables in the model, as well as percent initial weight loss. Several factors were significant predictors of weight regain based on variable importance plots, regardless of age or initial weight loss, such as insurance status, physical function score, baseline BMI, meal replacement use and minutes of exercise recorded during year 1. We also identified several factors that were significant predictors depending on age group (45-55y/56-65y/66-76y) and initial weight loss (lost 3-9% vs. ≥10% of initial weight).
When the variables identified from machine learning were added to a logistic regression model stratified by age and initial weight loss groups, the models showed good prediction (3-9% initial weight loss, ages 45-55y (n=293): ROC AUC=0.78; ≥10% initial weight loss, ages 45-55y (n=242): ROC AUC=0.78; 3-9% initial weight loss, ages 56-65y (n=484): ROC AUC=0.70; ≥10% initial weight loss, ages 56-65y (n=455): ROC AUC=0.74; 3-9% initial weight loss, ages 66-76y (n=150): ROC AUC=0.84; ≥10% initial weight loss, ages 66-76y (n=167): ROC AUC=0.86). Conclusion: The combination of machine learning methodology and logistic regression generates a prediction model that can consider numerous factors simultaneously, can be used to predict weight regain in other populations and can assist in the development of better strategies to prevent post-loss regain.
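To make the per-stratum ROC AUC values above concrete, here is a minimal sketch of how an AUC can be computed separately for each age and weight-loss group. It uses the pairwise (Mann-Whitney) definition of AUC rather than whatever software the authors used; the function names, stratum labels and scores are hypothetical.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_stratum(records):
    """records: iterable of (stratum, label, score); returns {stratum: AUC}."""
    groups = defaultdict(lambda: ([], []))
    for stratum, label, score in records:
        groups[stratum][0].append(label)
        groups[stratum][1].append(score)
    return {s: roc_auc(ys, ss) for s, (ys, ss) in groups.items()}


# Hypothetical data: label 1 = regainer, score = predicted probability of regain.
records = [
    ("45-55y, 3-9%", 1, 0.8), ("45-55y, 3-9%", 0, 0.3),
    ("56-65y, 3-9%", 1, 0.5), ("56-65y, 3-9%", 0, 0.5), ("56-65y, 3-9%", 0, 0.2),
]
per_stratum = auc_by_stratum(records)
print(per_stratum)
```

The pairwise form is quadratic in the number of cases, which is acceptable for strata of a few hundred participants; a rank-based implementation would be preferable for large cohorts.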


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Matthijs Blankers ◽  
Louk F. M. van der Post ◽  
Jack J. M. Dekker

Abstract Background Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy and we explore individual predictors of hospitalization. Methods Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were used. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared and we also estimated the relative importance of each predictor variable. The best and worst performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis and the five best performing algorithms were combined in an ensemble model using stacking. Results All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the worst performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%.
Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed about average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was in most cases modest. The results show that a predictive accuracy similar to the best performing model can be achieved when combining multiple algorithms in an ensemble model.
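The net reclassification improvement (NRI) figures above compare how two models sort the same patients into risk categories. The abstract does not state which NRI variant or cut-off was used, so the following is only a sketch of the common two-category form with an assumed threshold of 0.5; the function name and all probabilities are hypothetical.

```python
def net_reclassification_improvement(labels, old_probs, new_probs, threshold=0.5):
    """Two-category NRI of a new model over an old one.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | non-event) - P(up | non-event)]
    where "up" means moving from below to at-or-above the threshold.
    The 0.5 threshold is an assumption, not a value from the study.
    """
    up_e = down_e = up_n = down_n = 0
    n_events = n_nonevents = 0
    for y, old, new in zip(labels, old_probs, new_probs):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if y == 1:  # event, e.g. hospitalized within 12 months
            n_events += 1
            up_e += int(moved_up)
            down_e += int(moved_down)
        else:  # non-event
            n_nonevents += 1
            up_n += int(moved_up)
            down_n += int(moved_down)
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one non-event")
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


# Hypothetical predictions from an "old" and a "new" model on 8 patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
old = [0.6, 0.4, 0.3, 0.7, 0.6, 0.2, 0.55, 0.1]
new = [0.7, 0.6, 0.4, 0.8, 0.4, 0.3, 0.6, 0.2]
nri = net_reclassification_improvement(labels, old, new)
print(f"NRI = {nri:.2f}")
```

A positive value means the new model moves more events upward and more non-events downward than the reverse; it is not directly comparable to a difference in AUC.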


2019 ◽  
Author(s):  
Matthijs Blankers ◽  
Louk F. M. van der Post ◽  
Jack J. M. Dekker

Abstract Background: It is difficult to accurately predict whether a patient on the verge of a potential psychiatric crisis will need to be hospitalized. Machine learning may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate and compare the accuracy of ten machine learning algorithms, including the commonly used generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact, and explore the most important predictor variables of hospitalization. Methods: Data from 2,084 patients with at least one reported psychiatric crisis care contact included in the longitudinal Amsterdam Study of Acute Psychiatry were used. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared. We also estimated the relative importance of each predictor variable. The best and worst performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis. The target variable for the prediction models was whether or not the patient was hospitalized in the 12 months following inclusion in the study. The 39 predictor variables were related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts. Results: We found Gradient Boosting to perform the best (AUC=0.774) and K-Nearest Neighbors to perform the worst (AUC=0.702). The performance of GLM/logistic regression (AUC=0.76) was above average among the tested algorithms. Gradient Boosting outperformed GLM/logistic regression and K-Nearest Neighbors, and GLM outperformed K-Nearest Neighbors in a Net Reclassification Improvement analysis, although the differences between Gradient Boosting and GLM/logistic regression were small. Nine of the top-10 most important predictor variables were related to previous mental health care use.
Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed about average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was modest. Future studies may consider combining multiple algorithms in an ensemble model for optimal performance and to mitigate the risk of choosing suboptimally performing algorithms.

