Development and Validation of an Interpretable 3-day Intensive Care Unit Readmission Prediction Model Using Explainable Boosting Machines

Author(s):  
Stefan Hegselmann ◽  
Christian Ertmer ◽  
Thomas Volkert ◽  
Antje Gottschalk ◽  
Martin Dugas ◽  
...  

Intensive care unit (ICU) readmissions are associated with mortality and poor outcomes. Machine learning could help identify patients at risk and thereby improve discharge decisions. However, many models are black boxes, so dangerous properties might remain unnoticed. In this study, an inherently interpretable model for 3-day ICU readmission prediction was developed. We used a retrospective cohort of 15,589 ICU stays and 169 variables collected between 2006 and 2019. A team of doctors inspected the model, checked the plausibility of each component, and removed problematic parts. Qualitative feedback revealed several challenges for interpretable machine learning in healthcare. The resulting model used 67 features and showed an area under the precision-recall curve of 0.119 ± 0.020 and an area under the receiver operating characteristic curve of 0.680 ± 0.025. This is on par with state-of-the-art gradient boosting machines and outperforms the Simplified Acute Physiology Score II. External validation with the Medical Information Mart for Intensive Care database version IV confirmed our findings. Hence, a machine learning model for readmission prediction with a high level of human control is feasible without sacrificing performance.
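To make the discrimination metric in this abstract concrete, the following minimal Python sketch computes the area under the receiver operating characteristic curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not taken from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for the positive class (e.g. readmitted), 0 otherwise.
    scores: model risk scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 of the 4 positive/negative pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

For rare outcomes, a classifier with no skill has an area under the precision-recall curve roughly equal to the outcome prevalence, which is one reason precision-recall values can look small next to AUROC values.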

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Bongjin Lee ◽  
Kyunghoon Kim ◽  
Hyejin Hwang ◽  
You Sun Kim ◽  
Eun Hee Chung ◽  
...  

Abstract The aim of this study was to develop a predictive model of pediatric mortality in the early stages of intensive care unit (ICU) admission using machine learning. Patients less than 18 years old who were admitted to ICUs at four tertiary referral hospitals were enrolled. Three hospitals were designated as the derivation cohort for machine learning model development and internal validation, and the other hospital was designated as the validation cohort for external validation. We developed a random forest (RF) model that predicts pediatric mortality within 72 h of ICU admission, evaluated its performance, and compared it with the Pediatric Index of Mortality 3 (PIM 3). The area under the receiver operating characteristic curve (AUROC) of the RF model was 0.942 (95% confidence interval [CI] = 0.912–0.972) in the derivation cohort and 0.906 (95% CI = 0.900–0.912) in the validation cohort. In contrast, the AUROC of PIM 3 was 0.892 (95% CI = 0.878–0.906) in the derivation cohort and 0.845 (95% CI = 0.817–0.873) in the validation cohort. The RF model in our study showed improved predictive performance in terms of both internal and external validation and was superior even when compared to PIM 3.


BMJ Open ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. e053548
Author(s):  
Xie Wu ◽  
Qipeng Luo ◽  
Zhanhao Su ◽  
Yinan Li ◽  
Hongbai Wang ◽  
...  

Objectives Identifying high-risk patients in the intensive care unit (ICU) is important given the high mortality rate. However, existing scoring systems lack easily accessible, low-cost and effective inflammatory markers. We aimed to identify inflammatory markers in routine blood tests to predict mortality in ICU patients and evaluate their predictive power. Design Retrospective case–control study. Setting Single secondary care centre. Participants We analysed data from the Medical Information Mart for Intensive Care III database. A total of 21 822 ICU patients were enrolled and divided into survival and death groups based on in-hospital mortality. Primary and secondary outcome measures The predictive values of potential inflammatory markers were evaluated and compared using receiver operating characteristic curve analysis. After identifying the neutrophil-to-lymphocyte ratio (NLR) as having the best predictive ability, patients were redivided into low (≤1), medium (1–6) and high (>6) NLR groups. Univariate and multivariate logistic regression analyses were performed to evaluate the association between the NLR and mortality. The area under the curve (AUC), net reclassification improvement (NRI) and integrated discrimination improvement (IDI) were used to assess whether incorporating the NLR could improve the predictive power of existing scoring systems. Results The NLR had the best predictive ability (AUC: 0.609; p<0.001). In-hospital mortality rates were significantly higher in the low (odds ratio (OR): 2.09; 95% CI 1.64 to 2.66) and high (OR 1.64; 95% CI 1.50 to 1.80) NLR groups than in the medium NLR group. Adding the NLR to the Simplified Acute Physiology Score II improved the AUC from 0.789 to 0.798, with an NRI and IDI of 16.64% and 0.27%, respectively. Conclusions The NLR predicted mortality in ICU patients well. Both low and high NLRs were associated with elevated mortality rates, and including the NLR may improve the predictive power of the Simplified Acute Physiology Score II.
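The three NLR groups described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `nlr_group` is hypothetical, both counts are assumed to be in the same units, and the boundary value 6 is assigned to the medium group because the abstract defines the high group as strictly greater than 6.

```python
def nlr_group(neutrophils: float, lymphocytes: float) -> str:
    """Assign the NLR group using the cut-offs quoted in the abstract.

    low: NLR <= 1, medium: 1 < NLR <= 6, high: NLR > 6.
    Both counts must be expressed in the same units.
    """
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    nlr = neutrophils / lymphocytes
    if nlr <= 1:
        return "low"
    if nlr <= 6:
        return "medium"
    return "high"


# Hypothetical counts, for illustration only.
print(nlr_group(4.0, 2.0))   # medium (NLR = 2)
print(nlr_group(0.8, 1.0))   # low    (NLR = 0.8)
print(nlr_group(9.0, 1.2))   # high   (NLR = 7.5)
```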


2021 ◽  
Vol 2 (4) ◽  
Author(s):  
Y Jones ◽  
J Cleland ◽  
C Li ◽  
P Pellicori ◽  
J Friday

Abstract Background The number of publications using machine learning (ML) to predict cardiovascular outcomes and identify clusters of patients at greater risk has risen dramatically in recent years. However, research papers which use ML often fail to provide sufficient information about their algorithms to enable results to be replicated by others in the same or different datasets. Aim To test the reproducibility of results from ML algorithms given three different levels of information commonly found in publications: model type alone, a description of the model, and complete algorithm. Methods MIMIC-III is a healthcare dataset comprising detailed information from over 60,000 intensive care unit (ICU) admissions from the Beth Israel Deaconess Medical Centre between 2001 and 2012. Access is available to everyone pending approval and completion of a short training course. Using this dataset, three models for predicting all-cause in-hospital mortality were created, two from a PhD student working in ML, and one from an existing research paper which used the same dataset and provided complete model information. A second researcher (a PhD student in ML and cardiology) was given the same dataset and was tasked with reproducing their results. Initially, this second researcher was told what type of model was created in each case, followed by a brief description of the algorithms. Finally, the complete algorithms from each participant were provided. In all three scenarios, recreated models were compared to original models using Area Under the Receiver Operating Characteristic Curve (AUC). Results After excluding those younger than 18 years and events with missing or invalid entries, 21,139 ICU admissions remained from 18,094 patients between 2001 and 2012, including 2,797 in-hospital deaths. Three models were produced: two Recurrent Neural Networks (RNNs) which differed significantly in internal weights and variables, and a Boosted Tree Classifier (BTC). 
The AUC of the first reproduced RNN matched that of the original RNN (Figure 1); however, the second RNN and the BTC could not be reproduced given model type alone. As more information was provided about these algorithms, the results from the reproduced models matched the original results more closely. Conclusions To create clinically useful ML tools with reproducible and consistent results, it is vital that researchers share enough detail about their models. Model type alone is not enough to guarantee reproducibility. Although some models can be recreated with limited information, this is not always the case, and the best results are obtained when the complete algorithm is shared. These findings are highly relevant when trying to apply ML in clinical practice. Funding Acknowledgement Type of funding sources: None.


Circulation ◽  
2021 ◽  
Vol 144 (Suppl_2) ◽  
Author(s):  
Tsung-Chien Lu ◽  
Eric H Chou ◽  
CHIH-HUNG WANG ◽  
Amir Mostafavi ◽  
Mario Tovar ◽  
...  

Introduction: Few models have been developed for stratifying the risk of cardiac arrest in COVID-19 patients presenting to the ED with suspected pneumonia. Using a machine learning (ML) approach, we aimed to develop and validate ML models to predict in-hospital cardiac arrest (IHCA) in patients admitted from the ED. Hypothesis: We hypothesized that an ML approach can serve as a valuable tool in identifying patients at risk of IHCA in a timely fashion. Methods: We included COVID-19 patients admitted from the EDs of five hospitals in Texas between March and November 2020. All adult (≥ 18 years) patients were included if they had a positive RT-PCR for SARS-CoV-2 and also received a CXR examination for suspected pneumonia. Patients' demographics, past medical history, vital signs at ED triage, CXR findings, and laboratory results were retrieved from the EMR system. The primary outcome (IHCA) was identified via a resuscitation code. Patients who presented with OHCA or had no blood testing were excluded. A nonrandom splitting strategy based on location was used to divide the dataset into a training cohort (one urban and two suburban hospitals) and a testing cohort (one urban and one suburban hospital) at a ratio of around 2 to 1. Three supervised ML models were trained, and their performance was evaluated and compared with the National Early Warning Score (NEWS) by the area under the receiver operating characteristic curve (AUC). Results: We included 1,485 records for analysis. Of them, 190 (12.8%) developed IHCA. Of the constructed ML models, Random Forest outperformed the others with the best AUC result (0.930, 95% CI: 0.896-0.958), followed by Gradient Boosting (0.929, 95% CI: 0.891-0.959) and Extra Trees classifier (0.909, 95% CI: 0.875-0.943). All constructed ML models performed significantly better than the NEWS scoring system (AUC: 0.787, 95% CI: 0.725-0.840). 
The top six important features selected were age, oxygen saturation at triage, and lab data of APTT, lactic acid, and LDH. Conclusions: The ML approach showed excellent discriminatory performance in identifying IHCA for patients with COVID-19 and suspected pneumonia. It has the potential to save more lives or support end-of-life decision making if successfully implemented in the EMR system.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Wenjuan Luo ◽  
Rui Xing ◽  
Canmin Wang

Abstract Background Mechanical ventilation (MV) is often applied in critically ill patients in the intensive care unit (ICU) to protect the airway from aspiration and to supply supplemental oxygen. MV may result in ventilator-associated pneumonia (VAP) in ICU patients. This study aimed to estimate the 90-day and 180-day mortality of ICU patients with VAP and to explore the influence of VAP on the outcomes of ICU patients. Methods In total, 8182 patients aged ≥18 years who received MV in the ICU were drawn from the Medical Information Mart for Intensive Care III (MIMIC III) database. All subjects were divided into the VAP group (n = 537) and the non-VAP group (n = 7626) based on the occurrence of VAP. Clinical data of all participants were collected. The effect of VAP on the prognosis of ICU patients was explored by binary logistic regression analysis. Results The 90-day mortality of VAP patients in the ICU was 33.33% and the 180-day mortality was 37.62%. The 90-day and 180-day mortality rates were higher in the VAP group than in the non-VAP group. After adjusting for confounders including age, ethnicity, heart failure, septicemia, simplified acute physiology score II (SAPSII) score, sequential organ failure assessment (SOFA) score, serum lactate, white blood cell (WBC) count, length of ICU stay, length of hospital stay, length of ventilation, antibiotic treatment, Pseudomonas aeruginosa (P. aeruginosa), methicillin-resistant Staphylococcus aureus (MRSA) and other pathogens, the odds of 90-day and 180-day mortality in VAP patients were 1.465 times (OR = 1.465, 95%CI: 1.188–1.807, P < 0.001) and 1.635 times (OR = 1.635, 95%CI: 1.333–2.005, P < 0.001) those in non-VAP patients, respectively. Conclusions Our study revealed that ICU patients with VAP had a poorer prognosis than those without VAP. The results of this study might offer deeper insight into preventing the occurrence of VAP.


2020 ◽  
Author(s):  
Joo Heung Yoon ◽  
Vincent Jeanselme ◽  
Artur Dubrawski ◽  
Marilyn Hravnak ◽  
Michael R. Pinsky ◽  
...  

Abstract Background. Even brief hypotension is associated with increased morbidity and mortality. We developed a machine learning model to predict the initial hypotension event among intensive care unit (ICU) patients, and designed an alert system for bedside implementation. Materials and Methods. Minute-by-minute vital signs were extracted from the Medical Information Mart for Intensive Care III (MIMIC-3) dataset. A hypotension event was defined as at least 5 measurements within a 10-minute period of systolic blood pressure ≤ 90 mmHg and mean arterial pressure ≤ 60 mmHg. A random forest (RF) classifier was used to predict hypotension, and performance was measured with the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Hypotension alerts were generated using risk score thresholds, then a stacked RF model and a lock-out time were applied for real-life implementation. Results. We identified 1307 subjects (1580 ICU stays) as the case (hypotension) group and 1619 subjects (2279 ICU stays) as the control group. The RF model showed AUROC of 0.93 and 0.88 at 15 and 60 minutes respectively before hypotension, and AUPRC of 0.77 at 60 minutes before. Risk score trajectories revealed 80% and > 60% of cases predicted at 15 and 60 minutes before the hypotension, respectively. The stacked model with 15-minute lock-out produced on average 0.79 alerts/subject/hour (sensitivity 92.4%). Conclusion. Clinically significant hypotension events in the ICU can be predicted at least 1 hour before the initial hypotension episode. Developing a highly sensitive and reliable practical alert system is feasible, with a low rate of alerts.
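The hypotension event definition in this abstract (at least 5 low readings within a 10-minute period) can be illustrated with a short Python sketch. It is not the authors' implementation. The function name `first_hypotension_onset`, the tuple layout of the input, and the reading that both thresholds must be met in the same measurement are assumptions made here for illustration.

```python
from typing import Optional, Sequence, Tuple


def first_hypotension_onset(
    samples: Sequence[Tuple[int, float, float]],
    window_min: int = 10,
    min_count: int = 5,
    sbp_max: float = 90.0,
    map_max: float = 60.0,
) -> Optional[int]:
    """Return the minute of the first low reading of the first event, or None.

    samples: (minute, systolic BP in mmHg, mean arterial pressure in mmHg),
    sorted by minute. A reading is "low" when both thresholds are met.
    """
    # Minutes at which the reading satisfies both blood pressure criteria.
    low = [t for t, sbp, mean_ap in samples if sbp <= sbp_max and mean_ap <= map_max]
    # Check each run of `min_count` consecutive low readings: it qualifies
    # if the run fits inside a single window of `window_min` minutes.
    for i in range(len(low) - min_count + 1):
        if low[i + min_count - 1] - low[i] < window_min:
            return low[i]
    return None


# Hypothetical trace: low readings at minutes 3, 5, 7, 9 and 11 (span of 9 minutes).
low_minutes = {3, 5, 7, 9, 11}
trace = [
    (m, 85.0 if m in low_minutes else 120.0, 55.0 if m in low_minutes else 80.0)
    for m in range(20)
]
print(first_hypotension_onset(trace))  # 3
```

Returning the minute of the first low reading is one possible choice of event time; a different convention (for example the minute at which the fifth reading occurs) would shift the prediction horizons accordingly.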


2019 ◽  
Vol 2019 ◽  
pp. 1-10 ◽  
Author(s):  
Jing Xia ◽  
Su Pan ◽  
Min Zhu ◽  
Guolong Cai ◽  
Molei Yan ◽  
...  

In the intensive care unit (ICU), it is essential to predict patient mortality, and mathematical models help improve prognostic accuracy. Recently, recurrent neural networks (RNNs), especially long short-term memory (LSTM) networks, have shown advantages in sequential modeling and promise for clinical prediction. However, ICU data are highly complex due to the diverse patterns of diseases; therefore, instead of a single LSTM model, an ensemble algorithm of LSTMs (eLSTM) is proposed, using the ensemble framework to handle the diversity of clinical data. The eLSTM algorithm was evaluated on the widely used ICU admissions database Medical Information Mart for Intensive Care III (MIMIC-III). An investigation of 18415 cases shows that, compared with the clinical scoring systems SAPS II, SOFA, and APACHE II, the random forest classification algorithm, and the single LSTM classifier, the eLSTM model achieved superior performance, with the largest area under the receiver operating characteristic curve (AUROC) of 0.8451 and the largest area under the precision-recall curve (AUPRC) of 0.4862. Furthermore, it offered an early prognosis of ICU patients. The results demonstrate that the eLSTM is capable of dynamically predicting the mortality of patients in complex clinical situations.


2021 ◽  
Author(s):  
Xuze Zhao ◽  
Bo Qu

Abstract Background: Sepsis is one of the leading causes of in-hospital mortality and morbidity, especially among intensive care unit (ICU) patients. Therefore, a reliable decision-making model for predicting sepsis is of great importance. The purpose of this study was to develop an eXtreme Gradient Boosting (XGBoost) based model and explore whether it performs better than other machine learning (ML) methods in predicting sepsis from the time of ICU admission. Methods: The source data used for model development in this study were from the retrospective Medical Information Mart for Intensive Care (MIMIC) III dataset, restricted to ICU patients aged between 18 and 89. The performance of the XGBoost model was compared to logistic regression (LR), recursive neural network (RNN), and support vector machine (SVM). The performance of the models was evaluated and compared by the area under the curve (AUC) of the receiver operating characteristic (ROC) curves. Results: A total of 6430 MIMIC-III cases were included, of which 3021 had sepsis and 3409 did not. The AUCs were 0.808 (95% CI: 0.767-0.848, DT), 0.802 (95% CI: 0.762-0.842, RNN), 0.790 (95% CI: 0.751-0.830, SVM) and 0.775 (95% CI: 0.736-0.813, LR); XGBoost performed best in predicting sepsis. Conclusions: By using the DT algorithm, a more accurate prediction model can be established. Among the ML methods compared, the XGBoost model demonstrated the best ability to detect sepsis in ICU patients.


2019 ◽  
Author(s):  
Peng Yan ◽  
Siwan Huang ◽  
Tiange Chen ◽  
Xiang Li ◽  
Guoxin Mo ◽  
...  

Abstract Background and objectives: Intensive care unit (ICU) resources are inadequate for the large population in China, so it is essential for physicians to evaluate the condition of patients at admission. In this study, our objective was to construct a machine learning risk prediction model for mortality in respiratory intensive care units (RICUs). Methods: This study involved 817 patients who made 1,063 visits and who were admitted to the RICU from January 1, 2012, to December 31, 2017. Potential predictors such as demographic information, laboratory results, vital signs and clinical characteristics were considered. Among the 1,063 visits, the RICU mortality rate was 13.5%. We constructed eXtreme Gradient Boosting (XGBoost) models and compared predictive performances with random forest models, logistic regression models and clinical scores such as Acute Physiology and Chronic Health Evaluation II (APACHE II) and the sequential organ failure assessment (SOFA) system. Results: For this dataset, XGBoost models achieved the best performance with the area under the receiver operating characteristics curve (AUROC) of 0.923 (95% CI: 0.889 – 0.957) in the test set, which was significantly greater than APACHE II (0.811, 95% CI: 0.778 – 0.844) and SOFA (0.805, 95% CI: 0.770 – 0.840). The Hosmer-Lemeshow statistic was 12.667 with a P-value of 0.124, which indicated a good calibration of our predictive model in the test set. The final model contained variables that were previously known to be associated with mortality, but it also included some features absent from the clinical scores. The mean N-terminal pro-B-type natriuretic peptide (NT-proBNP) of survivors was significantly lower than that of the non-survival group (2066.43 pg/mL vs. 8232.81 pg/mL; p<0.001). 
Conclusions: Our results showed that the XGBoost model could be a suitable model for predicting RICU mortality with easy-to-collect variables at admission and help intensivists improve clinical decision-making for RICU patients. We found that NT-proBNP can be a good indicator of poor prognosis.


2020 ◽  
Author(s):  
Khaled Shawwa ◽  
Erina Ghosh ◽  
Stephanie Lanius ◽  
Emma Schwager ◽  
Larry Eshelman ◽  
...  

Abstract Background Acute kidney injury (AKI) carries a poor prognosis. Its incidence is increasing in the intensive care unit (ICU). Our purpose in this study is to develop and externally validate a model for predicting AKI in the ICU using patient data present prior to ICU admission. Methods We used data of 98 472 adult ICU admissions at Mayo Clinic between 1 January 2005 and 31 December 2017 and 51 801 encounters from Medical Information Mart for Intensive Care III (MIMIC-III) cohort. A gradient-boosting model was trained on 80% of the Mayo Clinic cohort using a set of features to predict AKI acquired in the ICU. Results AKI was identified in 39 307 (39.9%) encounters in the Mayo Clinic cohort. Patients who developed AKI in the ICU were older and had higher ICU and in-hospital mortality compared to patients without AKI. A 30-feature model yielded an area under the receiver operating curve of 0.690 [95% confidence interval (CI) 0.682–0.697] in the Mayo Clinic cohort set and 0.656 (95% CI 0.648–0.664) in the MIMIC-III cohort. Conclusions Using machine learning, AKI among ICU patients can be predicted using information available prior to admission. This model is independent of ICU information, making it valuable for stratifying patients at admission.

