A machine-learning parsimonious multivariable predictive model of mortality risk in patients with Covid-19

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Rita Murri ◽  
Jacopo Lenkowicz ◽  
Carlotta Masciocchi ◽  
Chiara Iacomini ◽  
Massimo Fantoni ◽  
...  

Abstract The COVID-19 pandemic is placing severe strain on the healthcare system. Several prognostic models have been validated, but few of them are implemented in daily practice. The objective of the study was to validate a machine-learning risk prediction model using easy-to-obtain parameters to help identify patients with COVID-19 who are at higher risk of death. The training cohort included all patients admitted to Fondazione Policlinico Gemelli with COVID-19 from March 5, 2020, to November 5, 2020. Afterward, the model was tested on all patients admitted to the same hospital with COVID-19 from November 6, 2020, to February 5, 2021. The primary outcome was in-hospital case-fatality risk. The out-of-sample performance of the model was estimated from the training set in terms of the area under the receiver operating characteristic curve (AUROC) and classification matrix statistics by averaging the results of fivefold cross-validation repeated three times and comparing them with those obtained on the test set. An explanation analysis of the model, based on SHapley Additive exPlanations (SHAP), is also presented. To assess the subsequent time evolution, the change in PaO2/FiO2 (P/F) at 48 h after the baseline measurement was plotted against its baseline value. Among the 921 patients included in the training cohort, 120 died (13%). Variables selected for the model were age, platelet count, SpO2, blood urea nitrogen (BUN), hemoglobin, C-reactive protein, neutrophil count, and sodium. Fivefold cross-validation repeated three times gave an AUROC of 0.87 and the following classification matrix statistics at the Youden index: sensitivity 0.840, specificity 0.774, negative predictive value 0.971. The model was then tested on a new population (n = 1463) in which the case-fatality rate was 22.6%. On this test set, the model showed AUROC 0.818, sensitivity 0.813, specificity 0.650, negative predictive value 0.922. The case-fatality rate was 1.6% in the first quartile of the predicted risk score (low-risk score group), 17.8% in the second and third quartiles (high-risk score group), and 53.5% in the fourth quartile (very high-risk score group). The three risk score groups showed good discrimination for the P/F value at admission, and a positive correlation was found for the low-risk class to P/F at 48 h after admission (adjusted R-squared = 0.48). We developed a predictive model of death for people with SARS-CoV-2 infection by including only easy-to-obtain variables (abnormal blood count, BUN, C-reactive protein, sodium and lower SpO2). It demonstrated good accuracy and high power of discrimination. The simplicity of the model makes the risk prediction applicable for patients in the Emergency Department or during hospitalization. Although it is reasonable to assume that the model is also applicable to non-hospitalized persons, only appropriate studies can assess its accuracy for persons at home.
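The abstract reports sensitivity, specificity and negative predictive value at the threshold given by the Youden index. As a minimal sketch of how such a threshold can be chosen, the snippet below scans every observed score as a candidate cut-off and keeps the one that maximizes J = sensitivity + specificity - 1. The function names and the ten toy patients are hypothetical and are not the study's data or code; the sketch also assumes both outcomes occur in the data.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when death is predicted for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_death = s >= threshold
        if predicted_death and y == 1:
            tp += 1
        elif predicted_death and y == 0:
            fp += 1
        elif not predicted_death and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics_at(y_true, scores, threshold):
    """Sensitivity, specificity, NPV and Youden's J at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # NPV is undefined when nobody is predicted to survive.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return {
        "threshold": threshold,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "youden_j": sensitivity + specificity - 1,
    }


def youden_threshold(y_true, scores):
    """Return the metrics at the observed score that maximizes Youden's J."""
    candidates = sorted(set(scores))
    return max((metrics_at(y_true, scores, t) for t in candidates),
               key=lambda m: m["youden_j"])


# Toy data, invented for illustration only (1 = died, 0 = survived).
toy_outcomes = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.90]
print(youden_threshold(toy_outcomes, toy_scores))
```

In a cross-validated setting such as the one described, these statistics would be computed per fold and then averaged; that step is left out here.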

2021 ◽  
Author(s):  
Rita Murri ◽  
Jacopo Lenkowicz ◽  
Carlotta Masciocchi ◽  
Chiara Iacomini ◽  
Massimo Fantoni ◽  
...  

Abstract Background The COVID-19 pandemic is placing severe strain on the healthcare system. Several prognostic models have been validated, but few of them are implemented in daily practice. The objective of the study was to validate a machine-learning risk prediction model using easy-to-obtain parameters, potentially available at home, to help identify patients with COVID-19 who are at higher risk of death. Methods The training cohort included all patients admitted to Fondazione Policlinico Gemelli with COVID-19 from March 5, 2020 to November 5, 2020. Afterwards, the model was tested on all patients admitted to the same hospital with COVID-19 from November 6, 2020 to February 5, 2021. The primary outcome was in-hospital mortality. The out-of-sample performance of the model was estimated from the training set in terms of the area under the receiver operating characteristic curve (AUROC) and classification matrix statistics by averaging the results of 5-fold cross-validation repeated three times and comparing them with those obtained on the test set. An explanation analysis of the model, based on SHapley Additive exPlanations (SHAP), is also presented. To assess the subsequent time evolution, the change in PaO2/FiO2 (P/F) at 48 hours after the baseline measurement was plotted against its baseline value. Results Among the 921 patients included in the training cohort, 120 died (13%). Variables selected for the model were age, platelet count, SpO2, blood urea nitrogen (BUN), hemoglobin, C-reactive protein, neutrophil count, and sodium. The 5-fold cross-validation repeated three times gave an AUROC of 0.87 and the following classification matrix statistics at the Youden index: sensitivity 0.840, specificity 0.774, negative predictive value 0.971. The model was then tested on a new population (n = 1463) in which the mortality rate was 22.6%. On this test set, the model showed AUROC 0.818, sensitivity 0.813, specificity 0.650, negative predictive value 0.922. The mortality rate was 1.6% in the first quartile of the predicted risk score (low-risk score group), 17.8% in the second and third quartiles (high-risk score group), and 53.5% in the fourth quartile (very high-risk score group). The three risk score groups showed good discrimination for the P/F value at admission, and a positive correlation was found for the low-risk class to P/F at 48 hours after admission (adjusted R-squared = 0.48). Conclusions We developed a predictive model of death for people with SARS-CoV-2 infection by including only easy-to-obtain variables (abnormal blood count, BUN, C-reactive protein, sodium and lower SpO2). It demonstrated good accuracy and high power of discrimination. The simplicity of the model makes the risk prediction applicable for patients at home, in the Emergency Department, or during hospitalization.
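The three risk groups above are defined by quartiles of the predicted risk score: the first quartile is low risk, the second and third together are high risk, and the fourth is very high risk. The following sketch shows one way to form such groups and compute the mortality rate in each. The abstract does not say how the quartile cut points were computed, so the use of `statistics.quantiles` with its default method is an assumption, and the function names and eight toy patients are hypothetical.

```python
import statistics


def risk_groups(scores):
    """Label each score by quartile: Q1 low, Q2-Q3 high, Q4 very high."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    labels = []
    for s in scores:
        if s <= q1:
            labels.append("low")
        elif s <= q3:
            labels.append("high")
        else:
            labels.append("very high")
    return labels


def mortality_by_group(scores, died):
    """Return deaths / patients for each risk group that is present."""
    totals, deaths = {}, {}
    for label, d in zip(risk_groups(scores), died):
        totals[label] = totals.get(label, 0) + 1
        deaths[label] = deaths.get(label, 0) + d
    return {label: deaths[label] / totals[label] for label in totals}


# Toy data, invented for illustration only (1 = died, 0 = survived).
toy_scores = [0.02, 0.05, 0.10, 0.20, 0.30, 0.40, 0.70, 0.90]
toy_died = [0, 0, 0, 1, 0, 0, 1, 1]
print(mortality_by_group(toy_scores, toy_died))
```

Other quantile conventions can place borderline patients in a different group, which matters most in small cohorts.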


2019 ◽  
Author(s):  
Andrea Fabbri ◽  
Giulio Marchesini ◽  
Barbara Benazzi ◽  
Alice Morelli ◽  
Danilo Montesi ◽  
...  

Abstract Background: The burden of sepsis represents a global health care problem. We aimed to assess the case fatality rate (CFR) and its predictors in subjects with sepsis admitted to a general Italian hospital from 2009 to 2016, stratified by risk score. Methods: We performed a retrospective analysis of all sepsis-related hospitalizations after Emergency Department (ED) visit in a public Italian hospital in an 8-year period. A risk score to predict CFR was computed by logistic regression analysis of selected variables in a training set (2009-2012), and then confirmed in the whole study population. A trend analysis of CFR during the study period was performed, dividing patients into high-risk (upper tertile of risk score) and low-risk groups. Results: A total of 2,492 subjects were included. Over time the incidental admission rate (no. of sepsis-related admissions per 100 total admissions) increased from 4.1% (2009-2010) to 5.4% (2015-2016); P<0.001, accompanied by a reduced CFR (from 38.0% to 18.4%; P<0.001). A group of 10 variables (admission to intensive care unit, cardio-vascular dysfunction, HIV infection, diabetes, age ≥80 years, respiratory diseases, number of organ dysfunctions, digestive diseases, dementia and cancer) was selected by the logistic model to predict CFR with good accuracy: AUC 0.873 [0.009]. Over the years CFR decreased from 31.8% (2009-2010) to 25.0% (2015-2016); P = 0.007. The relative proportion of subjects ≥80 years (overall, 52.9% of cases) and classified as high-risk did not change over the years. CFR decreased only in low-risk subjects (from 13.3% to 5.2%; P<0.001), and particularly in those aged ≥80 (from 18.2% to 6.6%; P=0.003), but not in high-risk individuals (from 69.9% to 64.2%; P=0.713). Conclusion: Between 2009 and 2016 the incidence of sepsis-related hospitalization increased in a general Italian hospital, with a downward trend in CFR limited to low-risk patients and particularly to subjects ≥80 years.


2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Andrea Fabbri ◽  
Giulio Marchesini ◽  
Barbara Benazzi ◽  
Alice Morelli ◽  
Danilo Montesi ◽  
...  

Abstract Background The burden of sepsis represents a global health care problem. We aimed to assess the case fatality rate (CFR) and its predictors in subjects with sepsis admitted to a general Italian hospital from 2009 to 2016, stratified by risk score. Methods We performed a retrospective analysis of all sepsis-related hospitalizations after Emergency Department (ED) visit in a public Italian hospital in an 8-year period. A risk score to predict CFR was computed by logistic regression analysis of selected variables in a training set (2009–2012), and then confirmed in the whole study population. A trend analysis of CFR during the study period was performed, dividing patients into high-risk (upper tertile of risk score) and low-risk groups. Results Two thousand four hundred ninety-two subjects were included. Over time the incidental admission rate (no. of sepsis-related admissions per 100 total admissions) increased from 4.1% (2009–2010) to 5.4% (2015–2016); P < 0.001, accompanied by a reduced CFR (from 38.0 to 18.4%; P < 0.001). A group of 10 variables (admission to intensive care unit, cardio-vascular dysfunction, HIV infection, diabetes, age ≥80 years, respiratory diseases, number of organ dysfunctions, digestive diseases, dementia and cancer) was selected by the logistic model to predict CFR with good accuracy: AUC 0.873 [0.009]. Over the years CFR decreased from 31.8% (2009–2010) to 25.0% (2015–2016); P = 0.007. The relative proportion of subjects ≥80 years (overall, 52.9% of cases) and classified as high-risk did not change over the years. CFR decreased only in low-risk subjects (from 13.3 to 5.2%; P < 0.001), and particularly in those aged ≥80 (from 18.2 to 6.6%; P = 0.003), but not in high-risk individuals (from 69.9 to 64.2%; P = 0.713). Conclusion Between 2009 and 2016 the incidence of sepsis-related hospitalization increased in a general Italian hospital, with a downward trend in CFR limited to low-risk patients and particularly to subjects ≥80 years.


Author(s):  
Francesco Fontana ◽  
Francesco Giaroni ◽  
Monica Frisina ◽  
Gaetano Alfano ◽  
Giacomo Mori ◽  
...  

Abstract Background Dialysis patients are considered at high risk for COVID-19 and the infection can easily spread in dialysis units. Methods We conducted an observational single-centre cohort study to describe clinical characteristics, treatments and outcomes of dialysis patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We tested patients who presented symptoms or had contact with a confirmed case. We enrolled 15 patients positive for SARS-CoV-2. Results We tested 37 of 306 dialysis patients. Patients with SARS-CoV-2 infection were older (mean age 75.96 ± 11.09 years) and all had comorbidities. At presentation, most had interstitial infiltrates on chest X-ray, three-quarters had leucopenia and none had respiratory insufficiency. During follow-up, there was an increase in serum C-reactive protein and interleukin-6. Eighty percent of patients received supplemental oxygen; none received non-invasive ventilation, one was intubated. Most patients (80%) were treated with oral hydroxychloroquine for a median time of 6.5 days [interquartile range (IQR) 5–14.5] and 40% received azithromycin; two patients received a short course of antivirals and one received a single dose of tocilizumab. Only two patients did not require hospitalization. Of the nine survivors, eight still tested positive for SARS-CoV-2 a median of 19 days (IQR 9.25–23) after diagnosis. Six patients died (case fatality rate 40%) a median of 5.5 days (IQR 1.75–9.75) after diagnosis. The main reported cause of death was respiratory failure related to COVID-19 (five patients). Conclusions We report a single-centre experience of SARS-CoV-2 infection in dialysis patients. The disease showed a high case fatality rate and most patients required hospitalization. Survivors show prolonged viral shedding.
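The case fatality rate reported above is simple arithmetic: 6 deaths among 15 enrolled patients gives 40%. The sketch below reproduces that figure and, as an addition that is not part of the abstract, computes a Wilson score interval to show how wide the uncertainty around a rate from 15 patients can be. The function names and the choice of the Wilson method with z = 1.96 are assumptions made for illustration.

```python
import math


def case_fatality_rate(deaths, cases):
    """Deaths divided by confirmed cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return deaths / cases


def wilson_interval(deaths, cases, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for about 95%)."""
    p = deaths / cases
    denom = 1 + z ** 2 / cases
    centre = (p + z ** 2 / (2 * cases)) / denom
    half_width = z * math.sqrt(p * (1 - p) / cases
                               + z ** 2 / (4 * cases ** 2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract: 6 deaths among 15 enrolled patients.
cfr = case_fatality_rate(6, 15)
low, high = wilson_interval(6, 15)
print(f"CFR = {cfr:.0%}, approximate 95% interval {low:.0%} to {high:.0%}")
```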


2020 ◽  
Vol 5 ◽  
pp. 56 ◽  
Author(s):  
Rodrigo M. Carrillo-Larco ◽  
Manuel Castillo-Cara

Background: The COVID-19 pandemic has attracted the attention of researchers and clinicians who have provided evidence about risk factors and clinical outcomes. Research on the COVID-19 pandemic benefiting from open-access data and machine learning algorithms is still scarce yet can produce relevant and pragmatic information. With country-level pre-COVID-19-pandemic variables, we aimed to cluster countries into groups with shared profiles of the COVID-19 pandemic. Methods: Unsupervised machine learning algorithms (k-means) were used to define data-driven clusters of countries; the algorithm was informed by disease prevalence estimates, metrics of air pollution, socio-economic status and health system coverage. Using the one-way ANOVA test, we compared the clusters in terms of number of confirmed COVID-19 cases, number of deaths, case fatality rate and order in which the country reported the first case. Results: The model to define the clusters was developed with 155 countries. The model with three principal component analysis parameters and five or six clusters showed the best ability to group countries in relevant sets. There was strong evidence that the model with five or six clusters could stratify countries according to the number of confirmed COVID-19 cases (p<0.001). However, the model could not stratify countries in terms of number of deaths or case fatality rate. Conclusions: A simple data-driven approach using global information available before the COVID-19 pandemic seemed able to classify countries in terms of the number of confirmed COVID-19 cases. The model was not able to stratify countries based on COVID-19 mortality data.
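The clustering step named in the Methods is k-means. As a rough illustration of what that algorithm does, the sketch below implements Lloyd's iteration in plain Python: assign each point to its nearest centroid, then move each centroid to the mean of its members, until nothing changes. It is not the authors' pipeline: the principal component step and the one-way ANOVA comparison are omitted, the starting centroids are supplied by the caller rather than chosen at random, and the six two-feature "countries" are invented values.

```python
import math


def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm; returns (labels, centroids).

    Labels come from the last assignment step that was run.
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)),
                key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Invented two-feature profiles for six hypothetical countries.
toy_points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
              (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(toy_points, [toy_points[0], toy_points[3]])
print(labels, centroids)
```

Results of k-means depend on the starting centroids and on feature scaling, so real analyses typically standardize the inputs and repeat the run from several initializations.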


Blood ◽  
2020 ◽  
Vol 136 (10) ◽  
pp. 1134-1143 ◽  
Author(s):  
Anthony R. Mato ◽  
Lindsey E. Roeker ◽  
Nicole Lamanna ◽  
John N. Allan ◽  
Lori Leslie ◽  
...  

Abstract Given advanced age, comorbidities, and immune dysfunction, chronic lymphocytic leukemia (CLL) patients may be at particularly high risk of infection and poor outcomes related to coronavirus disease 2019 (COVID-19). Robust analysis of outcomes for CLL patients, particularly examining effects of baseline characteristics and CLL-directed therapy, is critical to optimally manage CLL patients through this evolving pandemic. CLL patients diagnosed with symptomatic COVID-19 across 43 international centers (n = 198) were included. Hospital admission occurred in 90%. Median age at COVID-19 diagnosis was 70.5 years. Median Cumulative Illness Rating Scale score was 8 (range, 4-32). Thirty-nine percent were treatment naive (“watch and wait”), while 61% had received ≥1 CLL-directed therapy (median, 2; range, 1-8). Ninety patients (45%) were receiving active CLL therapy at COVID-19 diagnosis, most commonly Bruton tyrosine kinase inhibitors (BTKi’s; n = 68/90 [76%]). At a median follow-up of 16 days, the overall case fatality rate was 33%, though 25% remain admitted. Watch-and-wait and treated cohorts had similar rates of admission (89% vs 90%), intensive care unit admission (35% vs 36%), intubation (33% vs 25%), and mortality (37% vs 32%). CLL-directed treatment with BTKi’s at COVID-19 diagnosis did not impact survival (case fatality rate, 34% vs 35%), though the BTKi was held during the COVID-19 course for most patients. These data suggest that the subgroup of CLL patients admitted with COVID-19, regardless of disease phase or treatment status, are at high risk of death. Future epidemiologic studies are needed to assess severe acute respiratory syndrome coronavirus 2 infection risk, these data should be validated independently, and randomized studies of BTKi’s in COVID-19 are needed to provide definitive evidence of benefit.

