Clinical Characteristics and Prognostic Factors for Intensive Care Unit Admission of Patients With COVID-19: Retrospective Study Using Machine Learning and Natural Language Processing (Preprint)

2020 ◽  
Author(s):  
Jose Luis Izquierdo ◽  
Julio Ancochea ◽  
Joan B Soriano ◽  

BACKGROUND Many factors involved in the onset and clinical course of the ongoing COVID-19 pandemic are still unknown. Although big data analytics and artificial intelligence are widely used in the realms of health and medicine, researchers are only beginning to use these tools to explore the clinical characteristics and predictive factors of patients with COVID-19. OBJECTIVE Our primary objectives are to describe the clinical characteristics and determine the factors that predict intensive care unit (ICU) admission of patients with COVID-19. Determining these factors using a well-defined population can increase our understanding of the real-world epidemiology of the disease. METHODS We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyze the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the Servicio de Salud de Castilla-La Mancha (SESCAM) Health Care Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1 to March 29, 2020. We extracted related clinical information regarding diagnosis, progression, and outcome for all COVID-19 cases. RESULTS A total of 10,504 patients with a clinical or polymerase chain reaction–confirmed diagnosis of COVID-19 were identified; 5519 (52.5%) were male, with a mean age of 58.2 years (SD 19.7). Upon admission, the most common symptoms were cough, fever, and dyspnea; however, all three symptoms occurred in fewer than half of the cases. Overall, 6.1% (83/1353) of hospitalized patients required ICU admission. Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnea was the most parsimonious predictor of ICU admission; patients younger than 56 years, without tachypnea, and with a temperature <39 °C (or >39 °C without respiratory crackles) were not admitted to the ICU. In contrast, patients with COVID-19 aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnea and delayed their visit to the emergency department after being seen in primary care. CONCLUSIONS Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnea with or without respiratory crackles) predicts whether patients with COVID-19 will require ICU admission.
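The rules quoted in the abstract above can be written down as a small decision function. This is only an illustrative sketch of the two branches the abstract states, not the authors' fitted model: the function name `icu_rule`, its parameter names, and the "undetermined" fallback are assumptions introduced here, and the abstract does not say how a temperature of exactly 39 °C is handled.

```python
def icu_rule(age: int, tachypnea: bool, temp_c: float,
             crackles: bool, delayed_ed_visit: bool) -> str:
    """Label a patient using only the two rules quoted in the abstract.

    Hypothetical helper for illustration; not a clinical tool.
    """
    # Low-risk branch: younger than 56, no tachypnea, and either a
    # temperature below 39 C or above 39 C without respiratory crackles.
    # A temperature of exactly 39 C is not covered by the abstract,
    # so it falls through to the later checks.
    fever_ok = temp_c < 39.0 or (temp_c > 39.0 and not crackles)
    if age < 56 and not tachypnea and fever_ok:
        return "not admitted"

    # High-risk branch: aged 40 to 79 with tachypnea and a delayed
    # emergency department visit after being seen in primary care.
    if 40 <= age <= 79 and tachypnea and delayed_ed_visit:
        return "likely admitted"

    # Everything else is not described in the abstract.
    return "undetermined"


if __name__ == "__main__":
    print(icu_rule(45, False, 38.2, False, False))  # not admitted
    print(icu_rule(65, True, 38.0, False, True))    # likely admitted
```

Returning "undetermined" for uncovered combinations keeps the sketch from implying more than the abstract reports.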

10.2196/21801 ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. e21801 ◽  
Author(s):  
Jose Luis Izquierdo ◽  
Julio Ancochea ◽  
Joan B Soriano ◽  

Background Many factors involved in the onset and clinical course of the ongoing COVID-19 pandemic are still unknown. Although big data analytics and artificial intelligence are widely used in the realms of health and medicine, researchers are only beginning to use these tools to explore the clinical characteristics and predictive factors of patients with COVID-19. Objective Our primary objectives are to describe the clinical characteristics and determine the factors that predict intensive care unit (ICU) admission of patients with COVID-19. Determining these factors using a well-defined population can increase our understanding of the real-world epidemiology of the disease. Methods We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyze the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the Servicio de Salud de Castilla-La Mancha (SESCAM) Health Care Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1 to March 29, 2020. We extracted related clinical information regarding diagnosis, progression, and outcome for all COVID-19 cases. Results A total of 10,504 patients with a clinical or polymerase chain reaction–confirmed diagnosis of COVID-19 were identified; 5519 (52.5%) were male, with a mean age of 58.2 years (SD 19.7). Upon admission, the most common symptoms were cough, fever, and dyspnea; however, all three symptoms occurred in fewer than half of the cases. Overall, 6.1% (83/1353) of hospitalized patients required ICU admission. Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnea was the most parsimonious predictor of ICU admission; patients younger than 56 years, without tachypnea, and with a temperature <39 °C (or >39 °C without respiratory crackles) were not admitted to the ICU. In contrast, patients with COVID-19 aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnea and delayed their visit to the emergency department after being seen in primary care. Conclusions Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnea with or without respiratory crackles) predicts whether patients with COVID-19 will require ICU admission.


Author(s):  
Jose L. Izquierdo ◽  
Julio Ancochea ◽  
Joan B. Soriano ◽  

ABSTRACT There remain many unknowns regarding the onset and clinical course of the ongoing COVID-19 pandemic. We used a combination of classic epidemiological methods, natural language processing (NLP), and machine learning (for predictive modeling) to analyse the electronic health records (EHRs) of patients with COVID-19. We explored the unstructured free text in the EHRs within the SESCAM Healthcare Network (Castilla-La Mancha, Spain) from the entire population with available EHRs (1,364,924 patients) from January 1 to March 29, 2020. We extracted related clinical information upon diagnosis, progression and outcome for all COVID-19 cases, focusing on those requiring ICU admission. A total of 10,504 patients with a clinical or PCR-confirmed diagnosis of COVID-19 were identified; 52.5% were male, with a mean age of 58.2±19.7 years. Upon admission, the most common symptoms were cough, fever, and dyspnoea, but all occurred in fewer than half of cases. Overall, 6% of hospitalized patients required ICU admission. Using a machine-learning, data-driven algorithm, we identified that a combination of age, fever, and tachypnoea was the most parsimonious predictor of ICU admission: those younger than 56 years, without tachypnoea, and with a temperature <39°C (or >39°C without respiratory crackles) were not admitted to the ICU. In contrast, COVID-19 patients aged 40 to 79 years were likely to be admitted to the ICU if they had tachypnoea and delayed their visit to the ER after being seen in primary care. Our results show that a combination of easily obtainable clinical variables (age, fever, and tachypnoea with/without respiratory crackles) predicts which COVID-19 patients require ICU admission.


PLoS ONE ◽  
2020 ◽  
Vol 15 (3) ◽  
pp. e0229331 ◽  
Author(s):  
Marta Fernandes ◽  
Rúben Mendes ◽  
Susana M. Vieira ◽  
Francisca Leite ◽  
Carlos Palos ◽  
...  

2018 ◽  
Vol 84 (7) ◽  
pp. 1190-1194 ◽  
Author(s):  
Joshua Parreco ◽  
Antonio Hidalgo ◽  
Robert Kozol ◽  
Nicholas Namias ◽  
Rishi Rattan

The purpose of this study was to use natural language processing of physician documentation to predict mortality in patients admitted to the surgical intensive care unit (SICU). The Medical Information Mart for Intensive Care III (MIMIC-III) database was used to obtain SICU stays with six different severity of illness scores. Natural language processing was performed on the physician notes. Three classifiers for predicting mortality were created: one used only the physician notes, one used only the severity of illness scores, and one used the physician notes together with the severity of illness scores. There were 3838 SICU stays identified during the study period, and 5.4% ended in death. The classifier trained on physician notes together with severity of illness scores achieved the highest area under the curve (0.88 ± 0.05) and accuracy (94.6 ± 1.1%). The most important variable was the Oxford Acute Severity of Illness Score (16.0%). The most important terms were “dilated” (4.3%) and “hemorrhage” (3.7%). This study demonstrates the novel use of artificial intelligence to process physician documentation to predict mortality in the SICU. The classifiers were able to detect the subtle nuances in physician vernacular that predict mortality. These nuances provided improved performance in predicting mortality over physiologic parameters alone.
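The area under the curve reported above can be read as the probability that a randomly chosen stay ending in death receives a higher predicted score than a randomly chosen stay that does not. A minimal sketch of that pairwise definition follows; the function name `auc` and the toy labels and scores are assumptions for illustration and are unrelated to the study's data.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (here, death), 0 otherwise.
    scores: classifier scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 3 of the 4 positive/negative pairs are ordered correctly.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The quadratic loop is chosen for readability; a rank-based formulation gives the same value on large datasets at lower cost.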


2016 ◽  
Vol 07 (01) ◽  
pp. 101-115 ◽  
Author(s):  
Christoph Lehmann ◽  
Daniel Fabbri ◽  
Michael Temple

Summary Discharging patients from the Neonatal Intensive Care Unit (NICU) can be delayed for non-medical reasons including the procurement of home medical equipment, parental education, and the need for children’s services. We previously created a model to identify patients who will be medically ready for discharge in the subsequent 2–10 days. In this study we use Natural Language Processing to improve upon that model and discern why the model performed poorly on certain patients. We retrospectively examined the text of the Assessment and Plan section from daily progress notes of 4,693 patients (103,206 patient-days) from the NICU of a large, academic children’s hospital. A matrix was constructed using words from NICU notes (single words and bigrams) to train a supervised machine learning algorithm to determine the most important words differentiating poorly performing patients from well performing patients in our original discharge prediction model. NLP using a bag of words (BOW) analysis revealed several cohorts that performed poorly in our original model. These included patients with surgical diagnoses, pulmonary hypertension, retinopathy of prematurity, and psychosocial issues. The BOW approach aided in cohort discovery and will allow further refinement of our original discharge prediction model. Adequately identifying patients discharged home on g-tube feeds alone could improve the AUC of our original model by 0.02. Additionally, this approach identified social issues as a major cause of delayed discharge. A BOW analysis provides a method to improve and refine our NICU discharge prediction model and could potentially avoid over 900 (0.9%) hospital days. Abbreviations: AUC, Area under the Curve; CART, Classification And Regression Trees; DTD, Days to Discharge; GI, Gastrointestinal; LOS, Length of Stay; NICU, Neonatal Intensive Care Unit; NS, Neurosurgery; RF, Random Forest.
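The word matrix described above (single words and bigrams counted per note) can be sketched with the Python standard library. This is a generic bag-of-words construction under assumed choices, not the authors' pipeline: the tokenizer, the function names `tokens`, `bag_of_words`, and `build_matrix`, and the example notes are all hypothetical.

```python
import re
from collections import Counter
from typing import List, Tuple


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into words, keeping hyphenated terms."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def bag_of_words(text: str) -> Counter:
    """Count single words and adjacent word pairs (bigrams) in one note."""
    words = tokens(text)
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return Counter(words + bigrams)


def build_matrix(notes: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted vocabulary and one row of term counts per note."""
    counts = [bag_of_words(note) for note in notes]
    vocab = sorted(set().union(*counts))
    # A Counter returns 0 for terms that do not occur in a note.
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


if __name__ == "__main__":
    # Hypothetical note fragments, for illustration only.
    vocab, matrix = build_matrix(["g-tube feeds at home", "home oxygen"])
    print(vocab)
    print(matrix)
```

Each row of the matrix could then serve as the feature vector for a supervised classifier, with the patient's performance group in the original model as the label.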


2021 ◽  
Author(s):  
Francisco Martos Pérez ◽  
Ricardo Gomez Huelgas ◽  
María Dolores Martín Escalante ◽  
José Manuel Casas Rojo

Letter to the Editor. Comment on “Clinical characteristics and prognostic factors for intensive care unit admission of patients with COVID-19: retrospective study using machine learning and natural language processing” published by Izquierdo et al in the Journal of Medical Internet Research. Dear Sir, The article by Izquierdo et al published in a recent issue of the Journal of Medical Internet Research (1) employed a combination of conventional and machine-learning tools to describe the clinical characteristics of patients with COVID-19 and the factors that predict intensive care unit (ICU) admission. We would like to make some comments about its design. The authors should have provided the proportion of patients with a positive microbiological diagnosis. If the artificial intelligence software’s capacity for retrieving this information is limited in some way, this should be explained. The classification bias introduced by the lack of microbiological confirmation may have been significant, since the study includes patients from January 1, 2020. Although some undiagnosed cases were likely present in Castilla-La Mancha before the first declared case (March 1, 2020) (2), it is improbable that there were many of them. ICU admissions are related to many factors not addressed in the study. The decision not to admit a patient to the ICU because of short life expectancy, low quality of life, or a high burden of comorbidities may have had a great impact during the first wave of the COVID-19 pandemic, when ICU beds were scarce in some regions of Spain. The 6.1% ICU admission rate reported by the authors was lower than the 8.3% reported in a national survey of 15,111 patients from 150 hospitals in Spain (3), which is about 36% higher in relative terms. We could hypothesize that the patients included in the study had a milder disease. However, given the absence of microbiological diagnosis in an unknown percentage of patients, inclusion of a significant proportion of patients without a real COVID-19 diagnosis cannot be ruled out. These doubts could have been resolved if a microbiological diagnosis had been a requisite for inclusion. The mortality rate, the most robust and relevant endpoint, should also have been reported and the factors related to it analysed. Artificial intelligence is having an increasing impact on the rate of health care information processing. However, minimization of selection and classification biases should be guaranteed in the design of investigations. In this case, this could have been achieved by including only microbiologically confirmed cases and prolonging the period of inclusion, since most of the COVID-19 cases emerged after the end date of the study inclusion period. These changes in the design would have allowed for a better evaluation of the performance of artificial intelligence techniques, making the results obtained in the sample closer to those of the real population.

Bibliography
1. Izquierdo JL, Ancochea J; Savana COVID-19 Research Group, Soriano JB. Clinical Characteristics and Prognostic Factors for Intensive Care Unit Admission of Patients With COVID-19: Retrospective Study Using Machine Learning and Natural Language Processing. J Med Internet Res. 2020;22(10):e21801. Published 2020 Oct 28. doi:10.2196/21801. PMID: 33090964
2. Europa Press (2020, March 1). Un varón de 62 años ingresado en Guadalajara, primer caso positivo por coronavirus en C-LM. Retrieved 2020, January 8th. https://www.europapress.es/castilla-lamancha/noticia-varon-62-anos-ingresado-guadalajara-primer-caso-positivo-coronavirus-lm-20200301103741.html
3. Casas-Rojo JM, Antón-Santos JM, Millán-Núñez-Cortés J, et al. Clinical characteristics of patients hospitalized with COVID-19 in Spain: Results from the SEMI-COVID-19 Registry. Características clínicas de los pacientes hospitalizados con COVID-19 en España: resultados del Registro SEMI-COVID-19. Rev Clin Esp. 2020;220(8):480-494. doi:10.1016/j.rce.2020.07.003. PMID: 32762922
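The size of the relative difference between the two ICU admission rates quoted in the letter above (6.1% and 8.3%) depends on which rate is taken as the baseline. The short sketch below works through both directions; the helper name `relative_difference` is an assumption introduced here.

```python
def relative_difference(value: float, baseline: float) -> float:
    """Return (value - baseline) / baseline as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline


if __name__ == "__main__":
    study_rate = 6.1     # ICU admission rate (%) quoted for the study
    registry_rate = 8.3  # ICU admission rate (%) quoted for the national survey

    # Study rate relative to the registry rate: about 26.5% lower.
    print(round(relative_difference(study_rate, registry_rate) * 100, 1))

    # Registry rate relative to the study rate: about 36.1% higher.
    print(round(relative_difference(registry_rate, study_rate) * 100, 1))
```

With 8.3% as the baseline the study rate is roughly 27% lower, whereas with 6.1% as the baseline the registry rate is roughly 36% higher.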

