Machine Learning Model to Diagnose Diabetes Type 2 Based on Health Behavior

Eye-color and Type-2 diabetes phenotype prediction from genotype data using deep learning methods

BMC Bioinformatics ◽ 10.1186/s12859-021-04077-9 ◽ 2021 ◽ Vol 22 (1)

Author(s): Muhammad Muneeb ◽ Andreas Henschel

Keyword(s): Machine Learning ◽ Type 2 Diabetes ◽ Learning Model ◽ Machine Learning Algorithms ◽ Statistical Techniques ◽ Human Beings ◽ Eye Color ◽ Machine Learning Model ◽ Extreme Gradient Boosting

Abstract

Background: Genotype–phenotype predictions are of great importance in genetics. These predictions can help to find genetic mutations causing variations in human beings. There are many approaches for finding the association, which can be broadly categorized into two classes: statistical techniques and machine learning. Statistical techniques are good for finding the actual SNPs causing variation, whereas machine learning techniques are good when we just want to classify people into different categories. In this article, we examined the eye-color and Type-2 diabetes phenotypes. The proposed technique is a hybrid approach, with some parts taken from statistical techniques and the rest from machine learning.

Results: The main dataset for the eye-color phenotype consists of 806 people: 404 people have Blue-Green eyes and 402 people have Brown eyes. After preprocessing we generated 8 different datasets, containing different numbers of SNPs, using the mutation difference and thresholding at each individual SNP. We calculated three types of mutation at each SNP: no mutation, partial mutation, and full mutation. After that, the data is transformed for machine learning algorithms. We used about 9 classifiers (RandomForest, Extreme Gradient Boosting, ANN, LSTM, GRU, BILSTM, 1DCNN, ensembles of ANN, and ensembles of LSTM), which gave best accuracies of 0.91, 0.9286, 0.945, 0.94, 0.94, 0.92, 0.95, and 0.96, respectively. Stacked ensembles of LSTM outperformed the other algorithms for 1560 SNPs, with an overall accuracy of 0.96, AUC = 0.98 for Brown eyes, and AUC = 0.97 for Blue-Green eyes. The main dataset for Type-2 diabetes consists of 107 people, of whom 30 are classified as cases and 74 as controls. We used different linear thresholds to find the optimal number of SNPs for classification. The final model gave an accuracy of 0.97.

Conclusion: Genotype–phenotype predictions are very useful, especially in forensics. These predictions can help to identify SNP variant associations with traits and diseases. Given more datasets, machine learning model predictions can be improved. Moreover, the non-linearity in the machine learning model and the combination of SNP mutations while training the model improve the prediction. We considered binary classification problems, but the proposed approach can be extended to multi-class classification.
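
As a rough illustration of the preprocessing this abstract describes, the sketch below encodes each genotype as a mutation level (0 = no mutation, 1 = partial mutation, 2 = full mutation) and keeps only the SNPs whose mean level differs between the two groups by at least a threshold. The reference-allele encoding, the function names `encode_genotype` and `select_snps`, the toy genotypes, and the threshold value are all assumptions made for illustration; the abstract does not say how the authors actually computed the mutation difference.

```python
from statistics import mean


def encode_genotype(genotype: str, ref: str) -> int:
    """Count non-reference alleles: 0 = no, 1 = partial, 2 = full mutation."""
    return sum(1 for allele in genotype if allele != ref)


def select_snps(cases, controls, threshold):
    """Return indices of SNPs whose mean mutation level differs between
    the two groups by at least `threshold` (an assumed selection rule)."""
    kept = []
    for j in range(len(cases[0])):
        case_mean = mean(row[j] for row in cases)
        control_mean = mean(row[j] for row in controls)
        if abs(case_mean - control_mean) >= threshold:
            kept.append(j)
    return kept


# Hypothetical toy data: three SNPs, two people per group.
ref_alleles = ["A", "C", "G"]
raw_cases = [["GG", "CC", "GG"], ["AG", "CT", "GG"]]
raw_controls = [["AA", "CC", "GG"], ["AA", "CT", "GA"]]


def encode_rows(rows):
    """Encode every genotype in every row against its SNP's reference allele."""
    return [[encode_genotype(g, r) for g, r in zip(row, ref_alleles)]
            for row in rows]


cases = encode_rows(raw_cases)
controls = encode_rows(raw_controls)
selected = select_snps(cases, controls, threshold=1.0)
print(cases, controls, selected)
```

On this toy input only the first SNP passes the threshold. The reduced matrix of 0/1/2 values is the kind of table that could then be passed to any of the classifiers the abstract lists.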

Machine Learning Model for Prediction of Diabetes Mellitus

International Journal of Recent Technology and Engineering - 2 ◽ 10.35940/ijrte.e5916.018520 ◽ 2020 ◽ Vol 8 (5) ◽ pp. 2376-2381

Keyword(s): Diabetes Mellitus ◽ Machine Learning ◽ Learning Model ◽ Endocrine Disease ◽ Machine Learning Model ◽ Novel Method ◽ Prediction Of Diabetes ◽ Accuracy Of Prediction

Today the world is extensively affected by the endocrine disease diabetes mellitus, commonly known as diabetes. There is a need for an effective model that can accurately predict diabetes and its type at an early stage. To improve prediction accuracy and efficiency, a new machine-learning-based model (MLM) is proposed. The MLM can predict diabetes and identify which category the patient has: type 1, type 2, or gestational diabetes. The proposed model is innovative for the diagnosis of diabetes and is more accurate than existing approaches. It is a novel method that combines the power of an expert system with a machine learning environment.

Eye-Color and Type-2 Diabetes Phenotype Prediction From Genotype Data Using Deep Learning Methods

10.21203/rs.3.rs-125397/v1 ◽ 2020

Author(s): Muhammad Muneeb ◽ Andreas Henschel

Keyword(s): Machine Learning ◽ Type 2 Diabetes ◽ Learning Model ◽ Machine Learning Algorithms ◽ Statistical Techniques ◽ Human Beings ◽ Eye Color ◽ Machine Learning Model ◽ Extreme Gradient Boosting

Abstract

Background: Genotype-phenotype predictions are of great importance in genetics. These predictions can help to find genetic mutations causing variations in human beings. There are many approaches for finding the association, which can be broadly categorized into two classes: statistical techniques and machine learning. Statistical techniques are good for finding the actual SNPs causing variation, whereas machine learning techniques are good when we just want to classify people into different categories. In this article, we examined the eye-color and Type-2 diabetes phenotypes. The proposed technique is a hybrid approach, with some parts taken from statistical techniques and the rest from machine learning.

Results: The main dataset for the eye-color phenotype consists of 806 people: 404 people have Blue-Green eyes and 402 people have Brown eyes. After preprocessing we generated 8 different datasets, containing different numbers of SNPs, using the mutation difference and thresholding at each individual SNP. We calculated three types of mutation at each SNP: no mutation, partial mutation, and full mutation. After that, the data is transformed for machine learning algorithms. We used about 9 classifiers (RandomForest, Extreme Gradient Boosting, ANN, LSTM, GRU, BILSTM, 1DCNN, ensembles of ANN, and ensembles of LSTM), which gave best accuracies of 0.91, 0.9286, 0.945, 0.94, 0.94, 0.92, 0.95, and 0.96, respectively. Stacked ensembles of LSTM outperformed the other algorithms for 1560 SNPs, with an overall accuracy of 0.96, AUC = 0.98 for Brown eyes, and AUC = 0.97 for Blue-Green eyes. The main dataset for Type-2 diabetes consists of 107 people, of whom 30 are classified as cases and 74 as controls. We used different linear thresholds to find the optimal number of SNPs for classification. The final model gave an accuracy of 0.97.

Conclusion: Genotype-phenotype predictions are very useful, especially in forensics. These predictions can help to identify SNP variant associations with traits and diseases. Given more datasets, machine learning model predictions can be improved. Moreover, the non-linearity in the machine learning model and the combination of SNP mutations while training the model improve the prediction. We considered binary classification problems, but the proposed approach can be extended to multi-class classification.

Author Correction: Geographically weighted machine learning model for untangling spatial heterogeneity of type 2 diabetes mellitus (T2D) prevalence in the USA

Scientific Reports ◽ 10.1038/s41598-021-97279-3 ◽ 2021 ◽ Vol 11 (1)

Author(s): Sarah Quiñones ◽ Aditya Goyal ◽ Zia U. Ahmed

Keyword(s): Diabetes Mellitus ◽ Machine Learning ◽ Type 2 Diabetes ◽ Type 2 Diabetes Mellitus ◽ Spatial Heterogeneity ◽ Learning Model ◽ Machine Learning Model ◽ The USA

Geographically weighted machine learning model for untangling spatial heterogeneity of type 2 diabetes mellitus (T2D) prevalence in the USA

Scientific Reports ◽ 10.1038/s41598-021-85381-5 ◽ 2021 ◽ Vol 11 (1)

Author(s): Sarah Quiñones ◽ Aditya Goyal ◽ Zia U. Ahmed

Keyword(s): Diabetes Mellitus ◽ Machine Learning ◽ Risk Factors ◽ Type 2 Diabetes ◽ Type 2 Diabetes Mellitus ◽ Spatial Heterogeneity ◽ Learning Model ◽ Machine Learning Model ◽ Non Parametric

Abstract Type 2 diabetes mellitus (T2D) prevalence in the United States varies substantially across spatial and temporal scales, attributable to variations in socioeconomic and lifestyle risk factors. Understanding how these risk factors' contributions to T2D vary would be of great benefit to intervention and treatment approaches that aim to reduce or prevent T2D. Geographically-weighted random forest (GW-RF), a tree-based non-parametric machine learning model, may help explore and visualize the relationships between T2D and risk factors at the county level. GW-RF outputs are compared to global (RF and OLS) and local (GW-OLS) models for the years 2013–2017, using low education, poverty, obesity, physical inactivity, access to exercise, and food environment as inputs. Our results indicate that a non-parametric GW-RF model shows high potential for explaining the spatial heterogeneity of, and predicting, T2D prevalence compared with traditional local and global models when given six major risk factors as inputs. Some of these predictions, however, are marginal. These findings of spatial heterogeneity using GW-RF demonstrate the need to consider local factors in prevention approaches. Spatial analysis of T2D and associated risk factor prevalence offers useful information for targeting geographic areas for prevention and disease interventions.
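
To make the "geographically weighted" idea concrete, here is a minimal sketch that fits a separate weighted model around each target location, down-weighting distant counties with a Gaussian kernel. It deliberately swaps in a one-variable weighted least-squares line (closer in spirit to the GW-OLS baseline than to GW-RF), and the kernel choice, bandwidth, coordinates, and toy obesity and prevalence values are all assumptions for illustration rather than details from the paper.

```python
import math


def gaussian_weights(target, coords, bandwidth):
    """Weight each location by a Gaussian kernel of its distance to `target`."""
    return [math.exp(-0.5 * (math.dist(target, c) / bandwidth) ** 2)
            for c in coords]


def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical counties: a western and an eastern cluster in which the
# obesity-to-T2D relationship differs (slope 0.2 versus slope 0.4).
coords = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)]
obesity = [20, 30, 40, 20, 30, 40]
t2d = [6, 8, 10, 8, 12, 16]

# A global fit uses equal weights and blends the two regimes.
_, global_slope = weighted_linear_fit(obesity, t2d, [1.0] * len(coords))

# Local fits centred on one county in each cluster recover each regime.
_, west_slope = weighted_linear_fit(
    obesity, t2d, gaussian_weights((0, 0), coords, bandwidth=1.0))
_, east_slope = weighted_linear_fit(
    obesity, t2d, gaussian_weights((10, 0), coords, bandwidth=1.0))
print(global_slope, west_slope, east_slope)
```

The global slope averages the two regimes, while each local fit tracks the relationship in its own neighbourhood, which is the kind of spatial heterogeneity a geographically weighted model is meant to expose. A GW-RF would replace the local line with a random forest trained on the same neighbourhood.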

Development and Validation of a Machine Learning Model Using Administrative Health Data to Predict Onset of Type 2 Diabetes

JAMA Network Open ◽ 10.1001/jamanetworkopen.2021.11315 ◽ 2021 ◽ Vol 4 (5) ◽ pp. e2111315

Author(s): Mathieu Ravaut ◽ Vinyas Harish ◽ Hamed Sadeghi ◽ Kin Kwan Leung ◽ Maksims Volkovs ◽ ...

Keyword(s): Machine Learning ◽ Type 2 Diabetes ◽ Learning Model ◽ Health Data ◽ Administrative Health Data ◽ Machine Learning Model ◽ Development And Validation

Predicting and Staging Chronic Kidney Disease of Diabetes (Type-2) Patient using Machine Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.l3572.1081219 ◽ 2019 ◽ Vol 8 (12) ◽ pp. 206-209 ◽ Cited By ~ 1

Keyword(s): Machine Learning ◽ Chronic Kidney Disease ◽ Kidney Disease ◽ Diabetes Type 2 ◽ Diabetes Type ◽ Machine Learning Algorithms ◽ Urine Test ◽ Negative Impacts

Mortality from chronic kidney disease has increased substantially in recent years. About 422 million people currently have diabetes; around 30 percent of patients with Type 1 (juvenile-onset) diabetes and around 10 to 40 percent of those with Type 2 (adult-onset) diabetes will eventually suffer kidney damage. Early detection of Chronic Kidney Disease (CKD) can mitigate the level of damage in adulthood. In this paper, we present a comparative analysis of the performance of five algorithms, Naive Bayes (NB), Instance-Based Learning (IBK), Random Forest (RF), Decision Stump (DS), and Decision Tree (J48), for predicting CKD in diabetes patients from a urine test alone. Among all the algorithms, IBK gives the best result. Our comparison of different algorithms will help people with diabetes find out whether they have CKD.
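
IBK, the best performer reported above, is a k-nearest-neighbour classifier: it labels a new patient by majority vote among the most similar training records. The sketch below shows that idea in plain Python. The two features (imagined as urine-test measurements already scaled to the range 0 to 1), the labels, the value of `k`, and the function name `knn_predict` are hypothetical, since the abstract does not list the attributes or settings the authors used.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical, pre-scaled urine-test features for six patients.
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.70)]
train_y = ["no_ckd", "no_ckd", "no_ckd", "ckd", "ckd", "ckd"]

high_risk = knn_predict(train_x, train_y, (0.82, 0.85))
low_risk = knn_predict(train_x, train_y, (0.12, 0.18))
print(high_risk, low_risk)
```

Because the method stores the training records and compares distances directly, feature scaling matters: an unscaled measurement with a large numeric range would dominate the distance.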
