Abstract
Purpose: To develop and verify an early prediction model of gestational diabetes mellitus (GDM) using machine learning algorithms.
Methods: The dataset was collected from a pregnancy cohort study in eastern China from 2017 to 2019. It was randomly divided into a training dataset (75%) and a test dataset (25%) using the train_test_split function. Using Python, four classic machine learning algorithms and a New-Stacking algorithm were first trained on the training dataset and then verified on the test dataset. The four classic models were Logistic Regression (LR), Random Forest (RF), Artificial Neural Network (ANN) and Support Vector Machine (SVM). Sensitivity, specificity, accuracy, and area under the Receiver Operating Characteristic curve (AUC) were used to assess model performance.
Results: Valid information was obtained from a total of 2811 pregnant women. Model accuracies ranged from 80.09% to 86.91% (highest: RF), sensitivities from 63.30% to 81.65% (highest: SVM), specificities from 79.38% to 97.53% (highest: RF), and AUCs from 0.80 to 0.82 (highest: New-Stacking).
Conclusion: A New-Stacking model was successfully constructed and is theoretically preferable for its better performance in specificity, accuracy and AUC. However, because the SVM model achieved the highest sensitivity, it is recommended as the prediction model for clinical use.
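The four performance measures named in the Methods can be illustrated with a minimal Python sketch. The abstract does not say how the metrics were computed, so the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions for illustration. None of the values come from the study data.

```python
# Illustrative sketch only: names, threshold, and toy data are hypothetical,
# not taken from the study. Labels: 1 = GDM, 0 = no GDM.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity(y_true, y_pred):
    """Share of true GDM cases that the model flags: TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)

def specificity(y_true, y_pred):
    """Share of non-GDM cases that the model clears: TN / (TN + FP)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)

def accuracy(y_true, y_pred):
    """Share of all cases classified correctly."""
    tp, _, tn, _ = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: predicted risk scores for eight hypothetical women.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("sensitivity:", sensitivity(y_true, y_pred))
print("specificity:", specificity(y_true, y_pred))
print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", auc(y_true, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC is computed from the raw scores. This is why one model can rank highest on sensitivity while another ranks highest on AUC, as reported in the Results.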