Interpretable Machine Learning Model for Mortality Prediction in ICU: A Multicenter Study
Abstract
Background: Researchers have long sought to improve disease severity scores for predicting mortality in the intensive care unit (ICU). The digitalization of medical health records and advances in computational power have promoted the use of machine learning in critical care. This study aimed to develop an interpretable machine learning model using multicenter datasets and to compare it with APACHE IV in predicting hospital mortality of patients admitted to the ICU.
Method: The data came from the eICU database, which includes 136145 patients across 208 hospitals throughout the U.S., and from 5 ICUs in Hong Kong, which contributed 10909 patients. The two datasets were first combined into one large dataset and then split 80:20, with stratification, into a training set and a test set. The XGBoost machine learning algorithm was chosen to predict hospital mortality. The variables in the model were the same as those included in the APACHE IV score. The discrimination and calibration of the model were assessed. The model was interpreted using SHapley Additive exPlanations (SHAP) values.
Results: Among the 147054 patients in the whole cohort, hospital mortality was 9.3%. The area under the precision-recall curve was 0.57 for the XGBoost algorithm and 0.49 for APACHE IV. Similarly, XGBoost reached an area under the receiver operating characteristic curve (AUROC) of 0.90, while APACHE IV had an AUROC of 0.87. Additionally, the XGBoost algorithm showed better calibration than APACHE IV. The three most important variables were age, heart rate, and whether the patient was on a ventilator.
Conclusions: The severity score developed with a machine learning model on multicenter datasets outperformed APACHE IV in predicting hospital mortality for patients admitted to the ICU.
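The 80:20 stratified split and the AUROC used to assess discrimination can be illustrated with a minimal sketch in standard-library Python. This is not the study's code: the function names `stratified_split` and `auroc` are hypothetical, the labels in the usage note are toy values, and the study itself trained XGBoost on the combined eICU and Hong Kong data. The sketch only shows what "stratified" and "AUROC" mean operationally.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both parts.

    Hypothetical helper for illustration; returns (train_idx, test_idx).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    Labels are assumed to be 0 (survived) or 1 (died).
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

As a usage example with toy labels, `stratified_split([1] * 10 + [0] * 90)` places 2 of the 10 positive cases and 18 of the 90 negative cases in the test set, so both parts keep the same 10% event rate. With an outcome as infrequent as hospital mortality, stratification keeps the test-set event rate close to that of the full cohort.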