Machine Learning Models of Acute Kidney Injury Prediction in Acute Pancreatitis Patients

Background. Acute kidney injury (AKI) has long been recognized as a common and important complication of acute pancreatitis (AP). In this study, machine learning (ML) techniques were used to establish predictive models for AKI in AP patients during hospitalization. This is a retrospective review of prospectively collected data from AP patients admitted to our department within one week of the onset of abdominal pain, from January 2014 to January 2019. During hospitalization, 80 patients developed AKI after admission (AKI group) and 254 patients did not (non-AKI group). Using information such as demographic characteristics and laboratory data, support vector machine (SVM), random forest (RF), classification and regression tree (CART), and extreme gradient boosting (XGBoost) models were built to predict AKI, and their predictive performance was compared with that of the classic logistic regression (LR) model. Among the machine learning models, XGBoost performed best in predicting AKI, with an AUC of 91.93%. The AUC of the logistic regression analysis was 87.28%. The present findings suggest that, compared with the classical logistic regression model, machine learning models using features that can be easily obtained at admission performed better in predicting AKI in AP patients.
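The models in this abstract are compared by AUC. As a minimal sketch of what that number measures (not the authors' code), the AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores below are hypothetical illustrations, not data from the study.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC: fraction of positive/negative pairs
    in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = developed AKI, 0 = did not.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 11 of 12 pairs ranked correctly
```

The quadratic pairwise loop is chosen for readability; for large cohorts a rank-based implementation from an established library would normally be preferred.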

Prediction of Post-Intubation Tachycardia Using Machine-Learning Models

Applied Sciences ◽

10.3390/app10031151 ◽

2020 ◽

Vol 10 (3) ◽

pp. 1151

Author(s):

Hanna Kim ◽

Young-Seob Jeong ◽

Ah Reum Kang ◽

Woohyun Jung ◽

Yang Hoon Chung ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Characteristic Curve ◽

Gradient Boosting ◽

Support Vector ◽

Learning Models ◽

Feature Sets ◽

Extreme Gradient Boosting ◽

Post Intubation ◽

Machine Learning Models

Tachycardia is defined as a heart rate greater than 100 bpm for more than 1 min. Tachycardia often occurs after endotracheal intubation and can cause serious complications in patients with cardiovascular disease. The ability to predict post-intubation tachycardia would help clinicians by alerting them to a potential event so that they can pre-treat. In this paper, we predict potential post-intubation tachycardia: given electronic medical records and vital signs collected before tracheal intubation, we predict whether post-intubation tachycardia will occur within 10 min. Of 1931 available patient datasets, 257 remained after filtering out those with inappropriate data such as outliers and inappropriate annotations. Three feature sets were designed using feature selection algorithms, and two additional feature sets were defined by statistical inspection or manual examination. The five feature sets were compared using various machine learning models, including naïve Bayes classifiers, logistic regression, random forest, support vector machines, extreme gradient boosting, and artificial neural networks. Parameters of the models were optimized for each feature set. By 10-fold cross validation, we found that a logistic regression model with eight-dimensional hand-crafted features achieved an accuracy of 80.5%, recall of 85.1%, precision of 79.9%, an F1 score of 79.9%, and an area under the receiver operating characteristic curve of 0.85.
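The abstract reports results from 10-fold cross validation on 257 patient datasets. The sketch below shows one plain way to generate such folds; it is an illustration under stated assumptions, not the authors' procedure. The function name `k_fold_indices`, the fixed seed, and the unstratified shuffle are all assumptions (the abstract does not say whether the folds were stratified).

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.
    Every sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 257 is the number of patient datasets reported in the abstract.
splits = list(k_fold_indices(257, k=10))
```

A model would be fitted on each `train` list and scored on the matching `test` list, and the ten scores would then be averaged.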

Machine learning models to identify low adherence to influenza vaccination among Korean adults with cardiovascular disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-01925-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Moojung Kim ◽

Young Jae Kim ◽

Sung Jin Park ◽

Kwang Gi Kim ◽

Pyung Chun Oh ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Influenza Vaccination ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Age Group ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Abstract Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) had the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM had the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.

Machine Learning for the Prediction of Progression in Patients with Acute Kidney Injury in Critical Care

10.21203/rs.3.rs-412422/v1 ◽

2021 ◽

Author(s):

Lifan Zhang ◽

Canzheng Wei ◽

Yunxia Feng ◽

Aijia Ma ◽

Yan Kang

Keyword(s):

Machine Learning ◽

Acute Kidney Injury ◽

Logistic Regression ◽

Critical Care ◽

Intensive Care ◽

Logistic Regression Model ◽

Kidney Injury ◽

Gradient Boosting ◽

Extreme Gradient Boosting ◽

Stage 1

Abstract Background: Acute kidney injury (AKI) is a severe and harmful syndrome in the intensive care unit. The purpose of this study is to develop a prediction model that predicts whether patients with AKI stage 1/2 will progress to AKI stage 3. Methods: Patients with AKI stage 1/2 at the time they were first diagnosed with AKI in the Medical Information Mart for Intensive Care (MIMIC-III) were included. We excluded patients who had undergone RRT or progressed to AKI stage 3 within 72 hours of the first AKI diagnosis. We also excluded patients with chronic kidney disease (CKD). We used logistic regression and the machine learning method extreme gradient boosting (XGBoost) to build two models that predict which patients will progress to AKI stage 3. The established models were evaluated by cross-validation, receiver operating characteristic (ROC) curves, and precision-recall curves (PRC). Results: We included 25711 patients, of whom 2130 (8.3%) progressed to AKI stage 3. Creatinine, multiple organ failure syndromes (MODS), blood urea nitrogen (BUN), sepsis, and respiratory failure were the most important predictors of AKI progression. The XGBoost model performed better than the logistic regression model in predicting AKI stage 3 progression (AU-ROC, 0.926; 95% CI, 0.917 to 0.931 vs. 0.784; 95% CI, 0.771 to 0.796, respectively). Conclusions: The XGBoost model can better identify patients with AKI progression than the logistic regression model. Machine learning techniques may improve predictive modeling in medical research. Keywords: Acute kidney injury; Critical care; Logistic Models; Extreme gradient boosting

A Comparative Analysis of Machine Learning Models for Prediction of Insurance Uptake in Kenya

10.20944/preprints202010.0186.v1 ◽

2020 ◽

Author(s):

Nelson Yego ◽

Juma Kasozi ◽

Joseph Nkrunziza

Keyword(s):

Machine Learning ◽

Random Forest ◽

Characteristic Curve ◽

Confusion Matrix ◽

Gradient Boosting ◽

Support Vector ◽

Sampled Data ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

The role of insurance in financial inclusion as well as in economic growth is immense. However, low uptake seems to impede the growth of the sector, hence the need for a model that robustly predicts uptake of insurance among potential clients. In this research, we compared the performances of eight (8) machine learning models in predicting the uptake of insurance. The classifiers considered were Logistic Regression, Gaussian Naive Bayes, Support Vector Machines, K Nearest Neighbors, Decision Tree, Random Forest, Gradient Boosting Machines and Extreme Gradient Boosting. The data used in the classification was from the 2016 Kenya FinAccess Household Survey. Because the data were imbalanced, performance was compared on both upsampled and downsampled data. For upsampled data, the Random Forest classifier showed the highest accuracy and precision compared to the other classifiers, but for downsampled data, gradient boosting was optimal. It is noteworthy that for both upsampled and downsampled data, tree-based classifiers were more robust than others in insurance uptake prediction. However, in spite of hyper-parameter optimization, the area under the receiver operating characteristic curve remained highest for Random Forest as compared to other tree-based models. Also, the confusion matrix for Random Forest showed the fewest false positives and the most true positives; hence it could be construed as the most robust model for predicting insurance uptake. Finally, the most important feature in predicting uptake was having a bank product; hence bancassurance could be said to be a plausible channel of distribution of insurance products.
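The abstract compares classifiers on upsampled and downsampled data to deal with class imbalance. The following is a minimal sketch of random upsampling and downsampling, assuming simple random resampling with a fixed seed; the abstract does not state which resampling method the authors used, and the function name `resample` and the toy data are hypothetical.

```python
import random


def resample(rows, labels, mode="up", seed=0):
    """Balance classes by random resampling.
    mode="up": draw extra minority rows with replacement until every
    class matches the largest one. mode="down": keep a random subset of
    each class the size of the smallest one."""
    if mode not in ("up", "down"):
        raise ValueError("mode must be 'up' or 'down'")
    rng = random.Random(seed)
    by_class = {}
    for row, y in zip(rows, labels):
        by_class.setdefault(y, []).append(row)
    sizes = [len(group) for group in by_class.values()]
    target = max(sizes) if mode == "up" else min(sizes)
    out_rows, out_labels = [], []
    for y, group in by_class.items():
        if mode == "up":
            extra = [rng.choice(group) for _ in range(target - len(group))]
            picked = group + extra
        else:
            picked = rng.sample(group, target)
        out_rows.extend(picked)
        out_labels.extend([y] * target)
    return out_rows, out_labels


# Hypothetical toy data: 2 insured (1) versus 8 uninsured (0) respondents.
toy_rows = list(range(10))
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
up_rows, up_labels = resample(toy_rows, toy_labels, mode="up")
down_rows, down_labels = resample(toy_rows, toy_labels, mode="down")
```

Resampling is normally applied to the training portion only, so that the evaluation data keep the original class balance.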

Machine learning models for screening carotid atherosclerosis in asymptomatic adults

Scientific Reports ◽

10.1038/s41598-021-01456-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jian Yu ◽

Yan Zhou ◽

Qiong Yang ◽

Xiaoling Liu ◽

Lili Huang ◽

...

Keyword(s):

Machine Learning ◽

Physical Examination ◽

Carotid Atherosclerosis ◽

Gradient Boosting ◽

Support Vector ◽

Learning Models ◽

Cerebrovascular Events ◽

Extreme Gradient Boosting ◽

Routine Physical Examination ◽

Machine Learning Models

Abstract Carotid atherosclerosis (CAS) is a risk factor for cardiovascular and cerebrovascular events, but duplex ultrasonography is not recommended in routine screening for asymptomatic populations according to medical guidelines. We aim to develop machine learning models to screen for CAS in asymptomatic adults. A total of 2732 asymptomatic subjects undergoing routine physical examination in our hospital were included in the study. We developed machine learning models to classify subjects with or without CAS using decision tree, random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM) and multilayer perceptron (MLP) with 17 candidate features. The performance of the models was assessed on the testing dataset. The model using MLP achieved the highest accuracy (0.748), positive predictive value (0.743), F1 score (0.742), area under the receiver operating characteristic curve (AUC) (0.766) and Kappa score (0.445) among all classifiers. It was followed by the models using XGBoost and SVM. In conclusion, the model using MLP is the best one to screen for CAS in asymptomatic adults based on the results of routine physical examination, followed by the models using XGBoost and SVM. These models may provide an effective and applicable method for physicians and primary care doctors to screen for asymptomatic CAS without risk factors in the general population, and may improve risk prediction and prevention of cardiovascular and cerebrovascular events in asymptomatic adults.

Application of Machine Learning to Predict Acute Kidney Disease in Patients With Sepsis Associated Acute Kidney Injury

Frontiers in Medicine ◽

10.3389/fmed.2021.792974 ◽

2021 ◽

Vol 8 ◽

Author(s):

Jiawei He ◽

Jin Lin ◽

Meili Duan

Keyword(s):

Machine Learning ◽

Acute Kidney Injury ◽

Logistic Regression ◽

Intensive Care ◽

Kidney Injury ◽

Learning Models ◽

Short Term ◽

Acute Kidney Disease ◽

Mimic Iii ◽

Machine Learning Models

Background: Sepsis-associated acute kidney injury (AKI) is frequent in patients admitted to intensive care units (ICU) and may contribute to adverse short-term and long-term outcomes. Acute kidney disease (AKD) reflects the adverse events developing after AKI. We aimed to develop and validate machine learning models to predict the occurrence of AKD in patients with sepsis-associated AKI. Methods: Using clinical data from patients with sepsis in the ICU at Beijing Friendship Hospital (BFH), we studied whether the following three machine learning models could predict the occurrence of AKD using demographic, laboratory, and other related variables: Recurrent Neural Network-Long Short-Term Memory (RNN-LSTM), decision trees, and logistic regression. In addition, we externally validated the results in the Medical Information Mart for Intensive Care III (MIMIC III) database. The outcome was the diagnosis of AKD when defined as AKI prolonged for 7–90 days according to Acute Disease Quality Initiative-16. Results: In this study, 209 patients from BFH were included, with 55.5% of them diagnosed as having AKD. Furthermore, 509 patients were included from the MIMIC III database, of which 46.4% were diagnosed as having AKD. Applying machine learning could successfully achieve very high accuracy (RNN-LSTM AUROC = 1; decision trees AUROC = 0.954; logistic regression AUROC = 0.728), with RNN-LSTM showing the best results. Further analyses revealed that the change of non-renal Sequential Organ Failure Assessment (SOFA) score between the 1st day and 3rd day (Δnon-renal SOFA) is instrumental in predicting the occurrence of AKD. Conclusion: Our results showed that machine learning, particularly RNN-LSTM, can accurately predict AKD occurrence. In addition, Δnon-renal SOFA plays an important role in predicting the occurrence of AKD.

Predicting mortality of patients with acute kidney injury in the ICU using XGBoost model

PLoS ONE ◽

10.1371/journal.pone.0246306 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0246306

Author(s):

Jialin Liu ◽

Jinfa Wu ◽

Siru Liu ◽

Mengdie Li ◽

Kunchang Hu ◽

...

Keyword(s):

Machine Learning ◽

Acute Kidney Injury ◽

Kidney Injury ◽

The Other ◽

Receiver Operating Curve ◽

Support Vector ◽

Research Database ◽

Learning Models ◽

Risk Of Death ◽

Machine Learning Models

Purpose: The goal of this study is to construct a mortality prediction model using the XGBoost (eXtreme Gradient Boosting) decision tree model for AKI (acute kidney injury) patients in the ICU (intensive care unit), and to compare its performance with that of three other machine learning models. Methods: We used the eICU Collaborative Research Database (eICU-CRD) for model development and performance comparison. The prediction performance of the XGBoost model was compared with that of three other machine learning models: LR (logistic regression), SVM (support vector machines), and RF (random forest). In the model comparison, the AUROC (area under receiver operating curve), accuracy, precision, recall, and F1 score were used to evaluate the predictive performance of each model. Results: A total of 7548 AKI patients were analyzed in this study. The overall in-hospital mortality of AKI patients was 16.35%. The best performing algorithm in this study was XGBoost, with the highest AUROC (0.796, p < 0.01), F1 (0.922, p < 0.01) and accuracy (0.860). The precision (0.860) and recall (0.994) of the XGBoost model ranked second among the four models. Conclusion: The XGBoost model had clear performance advantages over the other machine learning models. This will be helpful for risk identification and early intervention for AKI patients at risk of death.
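The F1 score reported above is the harmonic mean of precision and recall, so the three XGBoost figures can be checked against each other. The sketch below does that and also shows how the metrics follow from confusion-matrix counts. The function names and the counts in the second example are hypothetical; only the precision (0.860), recall (0.994) and F1 (0.922) values come from the abstract.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Precision and recall as reported for XGBoost in the abstract.
reported_f1 = f1_score(0.860, 0.994)  # rounds to the reported 0.922

# Hypothetical counts, for illustration only.
toy_metrics = metrics_from_counts(tp=8, fp=2, fn=2, tn=8)
```

When metrics are averaged over cross-validation folds, the averaged F1 need not equal the harmonic mean of the averaged precision and recall, so small mismatches in other papers are not necessarily errors.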

Performance of Statistical and Machine Learning-Based Methods for Predicting Biogeographical Patterns of Fungal Productivity in Forest Ecosystems

10.21203/rs.3.rs-122045/v1 ◽

2020 ◽

Author(s):

Albert Morera ◽

Juan Martínez de Aragón ◽

José Antonio Bonet ◽

Jingjing Liang ◽

Sergio de-Miguel

Keyword(s):

Machine Learning ◽

Random Forest ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Support Vector ◽

Learning Approaches ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models ◽

Modelling Approaches

Abstract Background: The prediction of biogeographical patterns from a large number of driving factors with complex interactions, correlations and non-linear dependences requires advanced analytical methods and modelling tools. This study compares different statistical and machine learning models for predicting fungal productivity biogeographical patterns as a case study for the thorough assessment of the performance of alternative modelling approaches to provide accurate and ecologically-consistent predictions. Methods: We evaluated and compared the performance of two statistical modelling techniques, namely, generalized linear mixed models and geographically weighted regression, and four machine learning models, namely, random forest, extreme gradient boosting, support vector machine and deep learning to predict fungal productivity. We used a systematic methodology based on substitution, random, spatial and climatic blocking combined with principal component analysis, together with an evaluation of the ecological consistency of spatially-explicit model predictions. Results: Fungal productivity predictions were sensitive to the modelling approach and complexity. Moreover, the importance assigned to different predictors varied between machine learning modelling approaches. Decision tree-based models increased prediction accuracy by ~7% compared to other machine learning approaches and by more than 25% compared to statistical ones, and resulted in higher ecological consistency at the landscape level. Conclusions: Whereas a large number of predictors are often used in machine learning algorithms, in this study we show that proper variable selection is crucial to create robust models for extrapolation in biophysically differentiated areas. When dealing with spatial-temporal data in the analysis of biogeographical patterns, climatic blocking is postulated as a highly informative technique to be used in cross-validation to assess the prediction error over larger scales. Random forest was the best approach for prediction both in sampling-like environments and in extrapolation beyond the spatial and climatic range of the modelling data.

Derivation and Validation of Machine Learning Approaches to Predict Acute Kidney Injury after Cardiac Surgery

Journal of Clinical Medicine ◽

10.3390/jcm7100322 ◽

2018 ◽

Vol 7 (10) ◽

pp. 322 ◽

Cited By ~ 31

Author(s):

Hyung-Chul Lee ◽

Hyun-Kyu Yoon ◽

Karam Nam ◽

Youn Cho ◽

Tae Kim ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Acute Kidney Injury ◽

Logistic Regression ◽

Regression Analysis ◽

Logistic Regression Analysis ◽

Kidney Injury ◽

Gradient Boosting ◽

Support Vector ◽

Learning Approaches

Machine learning approaches were introduced to provide predictive ability better than or comparable to that of statistical analysis in predicting postoperative outcomes. We sought to compare the performance of machine learning approaches with that of logistic regression analysis to predict acute kidney injury after cardiac surgery. We retrospectively reviewed 2010 patients who underwent open heart surgery and thoracic aortic surgery. Baseline medical condition, intraoperative anesthesia, and surgery-related data were obtained. The primary outcome was postoperative acute kidney injury (AKI) defined according to the Kidney Disease Improving Global Outcomes criteria. The following machine learning techniques were used: decision tree, random forest, extreme gradient boosting, support vector machine, neural network classifier, and deep learning. The performance of these techniques was compared with that of logistic regression analysis regarding the area under the receiver-operating characteristic curve (AUC). During the first postoperative week, AKI occurred in 770 patients (38.3%). The best performance regarding AUC was achieved by the gradient boosting machine to predict the AKI of all stages (0.78, 95% confidence interval (CI) 0.75–0.80) or stage 2 or 3 AKI. The AUC of logistic regression analysis was 0.69 (95% CI 0.66–0.72). Decision tree, random forest, and support vector machine showed similar performance to logistic regression. In our comprehensive comparison of machine learning approaches with logistic regression analysis, the gradient boosting technique showed the best performance with the highest AUC and lower error rate. We developed an Internet-based risk estimator which could be used for real-time processing of patient data to estimate the risk of AKI at the end of surgery.

Machine Learning Models for Sarcopenia Identification Based on Radiomic Features of Muscles in Computed Tomography

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18168710 ◽

2021 ◽

Vol 18 (16) ◽

pp. 8710

Author(s):

Young Jae Kim

Keyword(s):

Machine Learning ◽

Computed Tomography ◽

Ct Images ◽

Erector Spinae ◽

Support Vector ◽

Gray Level ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Nsclc Patients ◽

Machine Learning Models

The diagnosis of sarcopenia requires accurate muscle quantification. As an alternative to manual muscle mass measurement through computed tomography (CT), artificial intelligence can be leveraged for the automation of these measurements. Although generally difficult to identify with the naked eye, the radiomic features in CT images are informative. In this study, the radiomic features were extracted from L3 CT images of the entire muscle area and partial areas of the erector spinae collected from non-small cell lung carcinoma (NSCLC) patients. The first-order statistics and gray-level co-occurrence, gray-level size zone, gray-level run length, neighboring gray-tone difference, and gray-level dependence matrices were the radiomic features analyzed. The identification performances of the following machine learning models were evaluated: logistic regression, support vector machine (SVM), random forest, and extreme gradient boosting (XGB). Sex, coarseness, skewness, and cluster prominence were selected as the relevant features effectively identifying sarcopenia. The XGB model demonstrated the best performance for the entire muscle, whereas the SVM was the worst-performing model. Overall, the models demonstrated improved performance for the entire muscle compared to the erector spinae. Although further validation is required, the radiomic features presented here could become reliable indicators for quantifying the phenomena observed in the muscles of NSCLC patients, thus facilitating the diagnosis of sarcopenia.
