Score for Emergency Risk Prediction (SERP): An Interpretable Machine Learning AutoScore–Derived Triage Tool for Predicting Mortality after Emergency Admissions

Author(s):  
Feng Xie ◽  
Marcus Eng Hock Ong ◽  
Johannes Nathaniel Min Hui Liew ◽  
Kenneth Boon Kiat Tan ◽  
Andrew Fu Wah Ho ◽  
...  

Abstract Importance Triage in the emergency department (ED) for admission and appropriate level of hospital care is a complex clinical judgment based on the tacit understanding of the patient’s likely acute course, availability of medical resources, and local practices. While a scoring tool could be valuable in triage, currently available tools have demonstrated limitations. Objective To develop the Score for Emergency Risk Prediction (SERP), a tool based on a parsimonious list of predictors available early at ED triage that provides a simple, early, and accurate estimate of short-term mortality risk, and to evaluate its predictive accuracy relative to published tools. Design, Setting, and Participants We performed a single-site, retrospective study of all ED patients admitted to a tertiary hospital in Singapore between January 2009 and December 2016. SERP was derived using AutoScore, a machine learning framework for developing predictive models, based on six variables easily available early in the ED care process. Using internal validation, SERP was compared to the current triage system, Patient Acuity Category Scale (PACS), Modified Early Warning Score (MEWS), National Early Warning Score (NEWS), Cardiac Arrest Risk Triage (CART), and Charlson Comorbidity Index (CCI) in predicting both primary and secondary outcomes in the study. Main Outcomes and Measures The primary outcome of interest was 30-day mortality. Secondary outcomes included 2-day mortality, inpatient mortality, 30-day post-discharge mortality, and 1-year mortality. SERP’s predictive power was measured using the area under the curve (AUC) in the receiver operating characteristic (ROC) analysis.
Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated under the optimal threshold, defined as the point nearest to the upper-left corner of the ROC curve. Results We included 224,666 ED episodes in the model training cohort, 56,167 episodes in the validation cohort, and 42,676 episodes in the testing cohort. In 18,797 (5.8%) of these episodes, the patient died within 30 days of the ED visit. Evaluated on the testing set, SERP outperformed several benchmark scores in predicting 30-day mortality and other mortality-related outcomes. At a cut-off score of 27, SERP achieved a sensitivity of 72.6% (95% confidence interval [CI]: 70.7-74.3%), a specificity of 77.8% (95% CI: 77.5-78.2%), a PPV of 15.8% (15.4-16.2%), and an NPV of 98% (97.9-98.1%). Conclusions SERP showed better prediction performance than existing triage scores while remaining easy to implement and easy to ascertain in the ED. It has the potential to be widely applied and validated in different circumstances and healthcare settings. Key points Question How does a tool for predicting hospital outcomes based on a machine learning-based automatic clinical score generator, AutoScore, perform in a cohort of individuals admitted to hospital from the emergency department (ED) compared to other published clinical tools? Findings The new tool, the Score for Emergency Risk Prediction (SERP), is parsimonious and point-based. SERP was more accurate in identifying patients who died during short- or long-term care, compared with other point-based clinical tools. Meaning SERP, a tool based on AutoScore, is promising for triaging patients admitted from the ED according to mortality risk.
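The threshold rule described in this abstract (the point nearest to the upper-left corner of the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy scores and labels are hypothetical, and "score >= cutoff" is assumed to mean a positive prediction.

```python
from math import hypot

def confusion(scores, labels, cutoff):
    """Count TP, FP, TN, FN when 'score >= cutoff' is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= cutoff and y == 1)
    fp = sum(1 for s, y in pairs if s >= cutoff and y == 0)
    fn = sum(1 for s, y in pairs if s < cutoff and y == 1)
    tn = sum(1 for s, y in pairs if s < cutoff and y == 0)
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Standard 2x2 metrics; assumes both outcome classes are present."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }

def closest_to_top_left(scores, labels):
    """Return the cutoff whose ROC point is nearest to (FPR=0, TPR=1)."""
    best_cutoff, best_dist = None, float("inf")
    for cutoff in sorted(set(scores)):
        m = metrics(*confusion(scores, labels, cutoff))
        dist = hypot(1 - m["specificity"], 1 - m["sensitivity"])
        if dist < best_dist:
            best_cutoff, best_dist = cutoff, dist
    return best_cutoff

# Hypothetical toy data: risk scores and outcomes (1 = died, 0 = survived).
scores = [5, 10, 15, 20, 25, 30, 35, 40, 45]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

best = closest_to_top_left(scores, labels)
result = metrics(*confusion(scores, labels, best))
print(best, result)
```

On real data the same rule would be applied to the score distribution of the cohort used for threshold selection; the PPV and NPV at the chosen cutoff then depend on the outcome prevalence in that cohort.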

JAMIA Open ◽  
2021 ◽  
Vol 4 (3) ◽  
Author(s):  
Sean C Yu ◽  
Nirmala Shivakumar ◽  
Kevin Betthauser ◽  
Aditi Gupta ◽  
Albert M Lai ◽  
...  

Abstract The objective of this study was to directly compare the ability of commonly used early warning scores (EWS) to identify and predict sepsis early in the general ward setting. For general ward patients at a large, academic medical center between early 2012 and mid-2018, common EWS and patient acuity scoring systems were calculated from electronic health record (EHR) data for patients who met and patients who did not meet Sepsis-3 criteria. For identification of sepsis at index time, National Early Warning Score 2 (NEWS 2) had the highest performance (area under the receiver operating characteristic curve: 0.803 [95% confidence interval (CI): 0.795–0.811], area under the precision-recall curve: 0.130 [95% CI: 0.121–0.140]), followed by NEWS, Modified Early Warning Score, and quick Sequential Organ Failure Assessment (qSOFA). Using validated thresholds, NEWS 2 also had the highest recall (0.758 [95% CI: 0.736–0.778]), but qSOFA had the highest specificity (0.950 [95% CI: 0.948–0.952]), positive predictive value (0.184 [95% CI: 0.169–0.198]), and F1 score (0.236 [95% CI: 0.220–0.253]). While NEWS 2 outperformed all other compared EWS and patient acuity scores, due to the low prevalence of sepsis, all scoring systems were prone to false positives (low positive predictive value without drastic sacrifices in sensitivity), thus leaving room for more computationally advanced approaches.
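The link this abstract draws between low prevalence and low positive predictive value follows from Bayes' rule. The sketch below uses hypothetical sensitivity, specificity, and prevalence values (none are taken from the study) to show how PPV falls as prevalence drops, and how the F1 score combines precision (PPV) and recall (sensitivity).

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def f1(precision, recall):
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical test with 75% sensitivity and 75% specificity,
# applied to a rare condition (2%) and a common one (20%).
ppv_rare = ppv(0.75, 0.75, 0.02)
ppv_common = ppv(0.75, 0.75, 0.20)
print(round(ppv_rare, 3), round(ppv_common, 3))
```

With identical sensitivity and specificity, the hypothetical PPV is roughly 0.06 at 2% prevalence and roughly 0.43 at 20%, which is why a score can rank patients reasonably well and still generate mostly false alarms for a rare outcome.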


10.2196/24246 ◽  
2021 ◽  
Vol 23 (2) ◽  
pp. e24246 ◽  
Author(s):  
Siavash Bolourani ◽  
Max Brenner ◽  
Ping Wang ◽  
Thomas McGinn ◽  
Jamie S Hirsch ◽  
...  

Background Predicting early respiratory failure due to COVID-19 can help triage patients to higher levels of care, allocate scarce resources, and reduce morbidity and mortality by appropriately monitoring and treating the patients at greatest risk for deterioration. Given the complexity of COVID-19, machine learning approaches may support clinical decision making for patients with this disease. Objective Our objective is to derive a machine learning model that predicts respiratory failure within 48 hours of admission based on data from the emergency department. Methods Data were collected from patients with COVID-19 who were admitted to Northwell Health acute care hospitals and were discharged, died, or spent a minimum of 48 hours in the hospital between March 1 and May 11, 2020. Of 11,525 patients, 933 (8.1%) were placed on invasive mechanical ventilation within 48 hours of admission. Variables used by the models included clinical and laboratory data commonly collected in the emergency department. We trained and validated three predictive models (two based on XGBoost and one that used logistic regression) using cross-hospital validation. We compared model performance among all three models as well as an established early warning score (Modified Early Warning Score) using receiver operating characteristic curves, precision-recall curves, and other metrics. Results The XGBoost model had the highest mean accuracy (0.919; area under the curve=0.77), outperforming the other two models as well as the Modified Early Warning Score. Important predictor variables included the type of oxygen delivery used in the emergency department, patient age, Emergency Severity Index level, respiratory rate, serum lactate, and demographic characteristics. Conclusions The XGBoost model had high predictive accuracy, outperforming other early warning scores. 
The clinical plausibility and predictive ability of XGBoost suggest that the model could be used to predict 48-hour respiratory failure in admitted patients with COVID-19.
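The abstract does not spell out how cross-hospital validation was carried out. One common reading, assumed here, is that each hospital in turn is held out for testing while the model is trained on the others. A minimal sketch of that splitting step, with hypothetical hospital labels and a hypothetical function name:

```python
from collections import defaultdict

def leave_one_hospital_out(hospital_ids):
    """Yield (held_out_hospital, train_indices, test_indices) per hospital.

    Assumption: 'cross-hospital validation' means no patient from the
    held-out hospital appears in the training fold.
    """
    by_hospital = defaultdict(list)
    for idx, hospital in enumerate(hospital_ids):
        by_hospital[hospital].append(idx)
    for held_out in sorted(by_hospital):
        test_idx = by_hospital[held_out]
        train_idx = sorted(
            i
            for hospital, indices in by_hospital.items()
            if hospital != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx

# Hypothetical admissions, each tagged with the admitting hospital.
hospital_ids = ["A", "A", "B", "C", "B", "C", "A"]
folds = list(leave_one_hospital_out(hospital_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by hospital rather than by patient tests whether a model transfers to a site it has never seen, which is a stricter check than a random split.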


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11988
Author(s):  
Kuan-Han Wu ◽  
Fu-Jen Cheng ◽  
Hsiang-Ling Tai ◽  
Jui-Cheng Wang ◽  
Yii-Ting Huang ◽  
...  

Background A feasible and accurate risk prediction system for emergency department (ED) patients is urgently required. The Modified Early Warning Score (MEWS) is a widely used tool for predicting clinical outcomes in the ED. The literature shows that machine learning (ML) predicts better than traditional scoring systems in specific patient populations. By analyzing a large multicenter dataset, we aimed to develop an ML model to predict in-hospital mortality of adult non-traumatic ED patients at different time stages, and to compare its performance with other ML models and MEWS. Methods A retrospective observational cohort study was conducted in five Taiwanese EDs, including two tertiary medical centers and three regional hospitals. All consecutive adult (>17 years old) non-traumatic patients admitted to the ED during a 9-year period (January 1, 2008 to December 31, 2016) were included. Exclusion criteria were (1) out-of-hospital cardiac arrest, (2) discharge against medical advice and transfer to another hospital, and (3) missing values in the collected variables. The primary outcome was in-hospital mortality, categorized as mortality within 6, 24, 72, and 168 hours. MEWS was calculated from systolic blood pressure, pulse rate, respiratory rate, body temperature, and level of consciousness. An ensemble supervised stacking ML model was developed and compared to sensitive and unsensitive XGBoost, Random Forest, and AdaBoost. We conducted a performance test and examined both the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) as the comparative measures. Results After excluding 182,001 visits (7.46%), the study group consisted of 2,437,326 ED visits. The dataset was split into 67% training data and 33% test data for ML model development. No statistically significant difference in characteristics was found between the two groups.
For the prediction of 6-, 24-, 72-, and 168-hour in-hospital mortality, the AUROC was 0.897, 0.865, 0.841, and 0.816 for MEWS and 0.939, 0.928, 0.913, and 0.902 for the ML model, respectively. The stacking ML model outperformed the other ML models as well. For the prediction of in-hospital mortality beyond 48 hours, the AUPRC of MEWS dropped below 0.1, while the AUPRC of the ML model was 0.317 at 6 hours and 0.2150 at 168 hours. For each time frame, the ML model achieved statistically significantly higher AUROC and AUPRC than MEWS (all P < 0.001). Both models showed decreasing predictive ability as time elapsed, but the gap in AUROC between the two models gradually increased (P < 0.001). With three MEWS thresholds (score >3, >4, and >5) as baselines for comparison, the ML model consistently showed improved or equal performance in sensitivity, PPV, and NPV, but not in specificity. Conclusion Stacking ML methods predict in-hospital mortality better than MEWS in adult non-traumatic ED patients, especially for delayed mortality.
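The abstract names the five MEWS components but not the cut-offs used. As an illustration only, the sketch below scores them with the bands from one widely cited version of MEWS (Subbe et al., 2001). The bands, the handling of boundary values, and the function names are assumptions, and published variants differ, so this should not be read as the scoring used in this study.

```python
INF = float("inf")

# (upper bound, points) pairs, checked in order. Band edges are an assumption
# taken from one commonly cited MEWS table and expect whole-number vitals.
SBP_BANDS = [(70, 3), (80, 2), (100, 1), (199, 0), (INF, 2)]        # mmHg
HR_BANDS = [(40, 2), (50, 1), (100, 0), (110, 1), (129, 2), (INF, 3)]  # beats/min
RR_BANDS = [(8, 2), (14, 0), (20, 1), (29, 2), (INF, 3)]            # breaths/min
AVPU_POINTS = {"alert": 0, "voice": 1, "pain": 2, "unresponsive": 3}

def band_points(value, bands):
    """Return the points of the first band whose upper bound covers value."""
    for upper, points in bands:
        if value <= upper:
            return points
    raise ValueError(f"value out of range: {value}")

def temp_points(temp_c):
    """0 points for 35.0 to 38.4 degrees Celsius, otherwise 2 (assumed bands)."""
    return 0 if 35.0 <= temp_c < 38.5 else 2

def mews(sbp, hr, rr, temp_c, avpu):
    """Sum the component points into a single MEWS value."""
    return (
        band_points(sbp, SBP_BANDS)
        + band_points(hr, HR_BANDS)
        + band_points(rr, RR_BANDS)
        + temp_points(temp_c)
        + AVPU_POINTS[avpu]
    )

# Hypothetical patients.
stable = mews(sbp=120, hr=80, rr=12, temp_c=37.0, avpu="alert")
unwell = mews(sbp=85, hr=115, rr=24, temp_c=38.7, avpu="voice")
print(stable, unwell, unwell > 3)
```

A threshold such as "score >3" then turns the summed score into a binary flag, which is the form in which MEWS is compared against the ML model's sensitivity, specificity, PPV, and NPV.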


Author(s):  
Jeremy Zhenwen Pong ◽  
Stephanie Fook-Chong ◽  
Zhi Xiong Koh ◽  
Mas’uud Ibnu Samsudin ◽  
Takashi Tagami ◽  
...  

The emergency department (ED) serves as the first point of hospital contact for many septic patients, where risk-stratification would be invaluable. We devised a combination model incorporating demographic, clinical, and heart rate variability (HRV) parameters, alongside individual variables of the Sequential Organ Failure Assessment (SOFA), Acute Physiology and Chronic Health Evaluation II (APACHE II), and Mortality in Emergency Department Sepsis (MEDS) scores for mortality risk-stratification. ED patients fulfilling systemic inflammatory response syndrome criteria were recruited. National Early Warning Score (NEWS), Modified Early Warning Score (MEWS), quick SOFA (qSOFA), SOFA, APACHE II, and MEDS scores were calculated. For the prediction of 30-day in-hospital mortality, the combination model achieved an area under the receiver operating characteristic curve of 0.91 (95% confidence interval [CI]: 0.88–0.95), outperforming NEWS (0.70, 95% CI: 0.63–0.77), MEWS (0.61, 95% CI: 0.53–0.69), qSOFA (0.70, 95% CI: 0.63–0.77), SOFA (0.74, 95% CI: 0.67–0.80), APACHE II (0.76, 95% CI: 0.69–0.82), and MEDS scores (0.86, 95% CI: 0.81–0.90). The combination model had an optimal sensitivity and specificity of 91.4% (95% CI: 81.6–96.5%) and 77.9% (95% CI: 72.6–82.4%), respectively. A combination model incorporating clinical, HRV, and disease severity score variables showed superior predictive ability for the mortality risk-stratification of septic patients presenting at the ED.


2020 ◽  
Vol 7 (Supplement_1) ◽  
pp. S251-S251
Author(s):  
Joanna S Cavalier ◽  
Benjamin Goldstein ◽  
Cara L O’Brien ◽  
Armando Bedoya

Abstract Background The novel coronavirus disease (COVID-19) results in severe illness in a significant proportion of patients, necessitating a way to discern which patients will become critically ill and which will not. In one large case series, 5.0% of patients required care in an intensive care unit (ICU) and 1.4% died. Several models have been developed to assess decompensating patients. However, research examining their applicability to COVID-19 patients is limited. An accurate predictive model for patients at risk of decompensation is critical for health systems to optimally triage emergencies, care for patients, and allocate resources. Methods An early warning score (EWS) algorithm created within a large academic medical center, with methodology previously described, was applied to COVID-19 patients admitted to this institution. A total of 122 COVID-19 patients were included. A decompensation event was defined as inpatient mortality or an unanticipated transfer to an ICU from an intermediate medical ward. The EWS was calculated at 12-hour and 24-hour intervals. Results Of 122 patients admitted with COVID-19, 28 had a decompensation event, yielding an event rate of 23.0%. Eight patients died, 13 were transferred to the ICU, and 6 were both transferred to the ICU and died. Decompensation within 12 and 24 hours was predicted with areas under the curve (AUC) of 0.850 and 0.817, respectively. In a three-tiered risk model, the customized EWS had a positive predictive value of 44.4% and 11.1% and a specificity of 99.3% and 99.6% for patients identified as at high risk of decompensation at the 12- and 24-hour intervals, respectively. Amongst medium-risk patients, the score had a specificity of 85.0% and 85.4%, respectively. Conclusion This EWS allows for prediction of decompensation, defined as transfer to an ICU or death, in COVID-19 patients with excellent specificity and a high positive predictive value.
Clinically, implementation of this score can help to identify patients before they decompensate in order to triage at time of presentation and allocate step-down beds, ICU beds, and treatments such as remdesivir. Disclosures All Authors: No reported disclosures


BMJ Open ◽  
2018 ◽  
Vol 8 (12) ◽  
pp. e024120 ◽  
Author(s):  
Xiaohua Xie ◽  
Wenlong Huang ◽  
Qiongling Liu ◽  
Wei Tan ◽  
Lu Pan ◽  
...  

Objectives This study aimed to validate the performance of the Modified Early Warning Score (MEWS) in a Chinese emergency department and to determine the best cut-off value for in-hospital mortality prediction. Design A prospective, single-centred observational cohort study. Setting This study was conducted at a tertiary hospital in South China. Participants A total of 383 patients aged 18 years or older who presented to the emergency department from 17 May 2017 through 27 September 2017, triaged as category 1, 2 or 3, were enrolled. Outcomes The primary outcome was a composite of in-hospital mortality and admission to the intensive care unit. The secondary outcome was using MEWS to predict hospitalised and discharged patients. Results A total of 383 patients were included in this study. In-hospital mortality was 13.6% (52/383), and transfer to the intensive care unit was 21.7% (83/383). The area under the receiver operating characteristic curve of MEWS for in-hospital mortality prediction was 0.83 (95% CI 0.786 to 0.881). When predicting in-hospital mortality with the cut-off point defined as 3.5, 158 patients had MEWS >3.5, with a specificity of 66%, a sensitivity of 87%, an accuracy of 69%, a positive predictive value of 28% and a negative predictive value of 97%. Conclusion Our findings support the use of MEWS for in-hospital mortality prediction in patients who were triaged category 1, 2 or 3 in a Chinese emergency department. The cut-off value for in-hospital mortality prediction defined in this study was different from that seen in many other studies.
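The figures reported in this abstract are enough to rebuild an approximate 2x2 table and check that the metrics agree with one another. The sketch below is a reader's consistency check, not the authors' analysis. The true-positive count is an assumption: it is not stated in the abstract and is inferred by rounding 87% of the 52 in-hospital deaths.

```python
# Reported in the abstract.
n_patients = 383
deaths = 52          # in-hospital mortality: 52/383
flagged = 158        # patients with MEWS > 3.5
reported_sensitivity = 0.87

# Assumption: true positives inferred from the rounded sensitivity.
tp = round(reported_sensitivity * deaths)
fn = deaths - tp
fp = flagged - tp
tn = n_patients - deaths - fp

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / n_patients
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

for name, value in [
    ("sensitivity", sensitivity),
    ("specificity", specificity),
    ("accuracy", accuracy),
    ("ppv", ppv),
    ("npv", npv),
]:
    print(f"{name}: {value:.1%}")
```

Under that assumption the reconstructed table gives values that round to the reported sensitivity, specificity, accuracy, positive predictive value, and negative predictive value, so the published figures are internally consistent.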

