Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation (Preprint)

BACKGROUND: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking. OBJECTIVE: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. METHODS: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19–positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions. RESULTS: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction. CONCLUSIONS: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.
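The cohort design described in the Methods (training at one hospital up to May 1, external validation at the other four hospitals up to May 1, prospective validation on everyone admitted afterwards) can be illustrated with a minimal sketch. The `Admission` record, the hospital labels, and the `assign_cohort` helper are hypothetical names introduced for illustration; the abstract does not describe the authors' actual data structures or code.

```python
from dataclasses import dataclass
from datetime import date

# Cutoff taken from the abstract: "before or on May 1" vs. "after May 1".
CUTOFF = date(2020, 5, 1)


@dataclass
class Admission:
    """Hypothetical admission record; field names are assumptions."""
    patient_id: str
    hospital: str   # placeholder hospital label, e.g. "H1"
    admitted: date


def assign_cohort(adm: Admission, training_hospital: str) -> str:
    """Return 'train', 'external', or 'prospective' for one admission."""
    if adm.admitted > CUTOFF:
        # All hospitals contribute to the prospective set after the cutoff.
        return "prospective"
    if adm.hospital == training_hospital:
        return "train"
    return "external"


if __name__ == "__main__":
    sample = [
        Admission("p1", "H1", date(2020, 4, 2)),
        Admission("p2", "H3", date(2020, 5, 1)),
        Admission("p3", "H1", date(2020, 5, 10)),
    ]
    for adm in sample:
        print(adm.patient_id, assign_cohort(adm, training_hospital="H1"))
```

Splitting on both site and date keeps the external and prospective sets disjoint from the training data, which is what lets the reported validation scores say something about generalization.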

Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation

Journal of Medical Internet Research; DOI 10.2196/24018; 2020; Vol 22 (11); pp. e24018; Cited By ~ 1

Author(s): Akhil Vaid, Sulaiman Somani, Adam J Russak, Jessica K De Freitas, Fayzan F Chaudhry, ...

Keyword(s): Machine Learning, New York, New York City, York City, Mortality Prediction, Critical Event, Critical Events, Event Prediction, Risk Patients, Machine Learning Models

Background: COVID-19 has infected millions of people worldwide and is responsible for several hundred thousand fatalities. The COVID-19 pandemic has necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods to meet these needs are lacking. Objective: The aims of this study were to analyze the electronic health records (EHRs) of patients who tested positive for COVID-19 and were admitted to hospitals in the Mount Sinai Health System in New York City; to develop machine learning models for making predictions about the hospital course of the patients over clinically meaningful time horizons based on patient characteristics at admission; and to assess the performance of these models at multiple hospitals and time points. Methods: We used Extreme Gradient Boosting (XGBoost) and baseline comparator models to predict in-hospital mortality and critical events at time windows of 3, 5, 7, and 10 days from admission. Our study population included harmonized EHR data from five hospitals in New York City for 4098 COVID-19–positive patients admitted from March 15 to May 22, 2020. The models were first trained on patients from a single hospital (n=1514) before or on May 1, externally validated on patients from four other hospitals (n=2201) before or on May 1, and prospectively validated on all patients after May 1 (n=383). Finally, we established model interpretability to identify and rank variables that drive model predictions. Results: Upon cross-validation, the XGBoost classifier outperformed baseline models, with an area under the receiver operating characteristic curve (AUC-ROC) for mortality of 0.89 at 3 days, 0.85 at 5 and 7 days, and 0.84 at 10 days. XGBoost also performed well for critical event prediction, with an AUC-ROC of 0.80 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. In external validation, XGBoost achieved an AUC-ROC of 0.88 at 3 days, 0.86 at 5 days, 0.86 at 7 days, and 0.84 at 10 days for mortality prediction. Similarly, the unimputed XGBoost model achieved an AUC-ROC of 0.78 at 3 days, 0.79 at 5 days, 0.80 at 7 days, and 0.81 at 10 days. Trends in performance on prospective validation sets were similar. At 7 days, acute kidney injury on admission, elevated LDH, tachypnea, and hyperglycemia were the strongest drivers of critical event prediction, while higher age, anion gap, and C-reactive protein were the strongest drivers of mortality prediction. Conclusions: We externally and prospectively trained and validated machine learning models for mortality and critical events for patients with COVID-19 at different time horizons. These models identified at-risk patients and uncovered underlying relationships that predicted outcomes.
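The AUC-ROC values reported above can be read as the probability that a randomly chosen patient who experienced the outcome receives a higher predicted risk than a randomly chosen patient who did not. The following is a minimal sketch of that pairwise definition in plain Python; the function name `auc_roc` and the toy labels and scores are illustrative assumptions, not the study's data or evaluation code.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC via pairwise comparison of positive and negative scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 1 = outcome occurred within the window, 0 = it did not.
    y_true = [0, 0, 1, 1]
    y_score = [0.10, 0.40, 0.35, 0.80]
    print(auc_roc(y_true, y_score))  # 3 of 4 pairs ranked correctly -> 0.75
```

The quadratic loop is fine for illustration; for cohorts of thousands of patients a rank-based implementation from an established library would be the usual choice.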

Machine Learning to Predict Mortality and Critical Events in COVID-19 Positive New York City Patients

DOI 10.1101/2020.04.26.20073411; 2020; Cited By ~ 5

Author(s): Akhil Vaid, Sulaiman Somani, Adam J Russak, Jessica K De Freitas, Fayzan F Chaudhry, ...

Keyword(s): Machine Learning, New York, New York City, York City, External Validation, Extreme Case, Critical Events, Health Records, Machine Learning Model, Risk Patients

Abstract: Coronavirus 2019 (COVID-19), caused by the SARS-CoV-2 virus, has become the deadliest pandemic in modern history, reaching nearly every country worldwide and overwhelming healthcare institutions. As of April 20, there have been more than 2.4 million confirmed cases with over 160,000 deaths. Extreme case surges coupled with challenges in forecasting the clinical course of affected patients have necessitated thoughtful resource allocation and early identification of high-risk patients. However, effective methods for achieving this are lacking. In this paper, we use electronic health records from over 3,055 New York City confirmed COVID-19 positive patients across five hospitals in the Mount Sinai Health System and present a decision tree-based machine learning model for predicting in-hospital mortality and critical events. This model is first trained on patients from a single hospital and then externally validated on patients from four other hospitals. We achieve strong performance, notably predicting mortality at 1 week with an AUC-ROC of 0.84. Finally, we establish model interpretability by calculating SHAP scores to identify decisive features, including age, inflammatory markers (procalcitonin and LDH), and coagulation parameters (PT, PTT, D-Dimer). To our knowledge, this is one of the first models with external validation to both predict outcomes in COVID-19 patients with strong validation performance and identify key contributors in outcome prediction that may assist clinicians in making effective patient management decisions.

One-Sentence Summary: We identify clinical features that robustly predict mortality and critical events in a large cohort of COVID-19 positive patients in New York City.

Spatiotemporal and Machine Learning-Based Time Series Assessment of Drinking Water Quality Complaints in New York City

World Environmental and Water Resources Congress 2021; DOI 10.1061/9780784483466.089; 2021

Author(s): Jarai Sanneh, Miah Cohall, Juneseok Lee, Yi Wang, Diego Martínez García, ...

Keyword(s): Machine Learning, Water Quality, New York, Time Series, New York City, Drinking Water, York City, Drinking Water Quality

Using machine learning to help vulnerable tenants in New York city

Proceedings of the Conference on Computing & Sustainable Societies - COMPASS 19; DOI 10.1145/3314344.3332484; 2019

Author(s): Teng Ye, Rebecca Johnson, Samantha Fu, Jerica Copeny, Bridgit Donnelly, ...

Keyword(s): Machine Learning, New York, New York City, York City

Applications of Machine Learning Methods to Predict Readmission and Length-of-Stay for Homeless Families: The Case of Win Shelters in New York City

Journal of Technology in Human Services; DOI 10.1080/15228835.2017.1418703; 2018; Vol 36 (1); pp. 89-104; Cited By ~ 2

Author(s): Boyeong Hong, Awais Malik, Jack Lundquist, Ira Bellach, Constantine E. Kontokosta

Keyword(s): Machine Learning, New York, New York City, Length Of Stay, York City, Homeless Families, Learning Methods, Machine Learning Methods, Applications Of Machine Learning

The New York City COVID-19 Spread in the 2020 Spring: A Study on the Potential Role of Particulate Using Time Series Analysis and Machine Learning

Applied Sciences; DOI 10.3390/app11031177; 2021; Vol 11 (3); pp. 1177

Author(s): Silvia Mirri, Marco Roccetti, Giovanni Delnevo

Keyword(s): Machine Learning, New York, Time Series, New York City, York City, Statistical Tests, Ambient Air, Air Pollutant, Machine Learning Algorithms, P Value

This study investigates the potential association between the daily distribution of the PM2.5 air pollutant and the initial spread of COVID-19 in New York City. We study the period from 4 March to 22 March 2020 and apply our analysis to the five counties that make up the city plus seven neighboring counties, covering both urban and peripheral districts. Using the Granger causality methodology, and considering the maximum lag period (14 days) between infection and the corresponding diagnosis, we found that the time series of new daily infections registered in those 12 counties appears to correlate with the time series of the concentrations of PM2.5 particulate circulating in the air, with 33 of 36 statistical tests yielding a p-value less than 0.005, thus confirming this hypothesis. Moreover, looking for further confirmation of this association, we train four different machine learning algorithms on a portion of those time series. These are able to predict whether the number of new daily infections surpasses a given threshold over the remaining portion of the series, with an average accuracy ranging from 84% to 95%, depending on the algorithm and/or the specific county under observation. This is similar to results obtained from several other polluted urban areas, e.g., Wuhan, Xiaogan, and Huanggang in China, and Northern Italy. Our study provides further evidence that ambient air pollutants can be associated with daily COVID-19 infection incidence.

Machine Learning for the New York City Power Grid

IEEE Transactions on Pattern Analysis and Machine Intelligence; DOI 10.1109/tpami.2011.108; 2012; Vol 34 (2); pp. 328-345; Cited By ~ 98

Author(s): C. Rudin, D. Waltz, R. N. Anderson, A. Boulanger, A. Salleb-Aouissi, ...

Keyword(s): Machine Learning, New York, New York City, York City, Power Grid

An Important Armenian MS. with Greek Miniatures

Journal of the Royal Asiatic Society; DOI 10.1017/s0035869x00097884; 1942; Vol 74 (3-4); pp. 155-162

Author(s): H. Kurdian

Keyword(s): New York, New York City, York City, Christian Iconography

In 1941 while in New York City I was fortunate enough to purchase an Armenian MS. which I believe will be of interest to students of Eastern Christian iconography.

Hospitals: N.Y. Appellate Court Denies Move to Privatize Public Hospital

The Journal of Law Medicine & Ethics; DOI 10.1017/s1073110500012961; 1999; Vol 27 (2); pp. 202-203

Author(s): Robert Chatham

Keyword(s): New York, New York City, York City, Public Health System, Municipal Hospital, The Public, Public Benefit, For Profit, Benefit Corporation, The City

The Court of Appeals of New York held, in Council of the City of New York v. Giuliani, slip op. 02634, 1999 WL 179257 (N.Y. Mar. 30, 1999), that New York City may not privatize a public city hospital without state statutory authorization. The court found invalid a sublease of a municipal hospital operated by a public benefit corporation to a private, for-profit entity. The court reasoned that the controlling statute prescribed the operation of a municipal hospital as a government function that must be fulfilled by the public benefit corporation as long as it exists, and nothing short of legislative action could put an end to the corporation's existence.

In 1969, the New York State legislature enacted the Health and Hospitals Corporation Act (HHCA), establishing the New York City Health and Hospitals Corporation (HHC) as an attempt to improve the New York City public health system. Thirty years later, on a renewed perception that the public health system was once again lacking, the city administration approved a sublease of Coney Island Hospital from HHC to PHS New York, Inc. (PHS), a private, for-profit entity.

ASHA Voices: What a New York City SLP's Seen During COVID-19

PodCast Digital Object Group; DOI 10.1044/2020-0730-ashavoices-tami-nyu-icu-covid; 2020

Keyword(s): New York, New York City, York City
