Developing and validating COVID-19 adverse outcome risk prediction models from a bi-national European cohort of 5594 patients

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Espen Jimenez-Solem ◽  
Tonny S. Petersen ◽  
Casper Hansen ◽  
Christian Hansen ◽  
Christina Lioma ◽  
...  

Abstract: Patients with severe COVID-19 have overwhelmed healthcare systems worldwide. We hypothesized that machine learning (ML) models could be used to predict risks at different stages of management and thereby provide insights into drivers and prognostic markers of disease progression and death. From a cohort of approx. 2.6 million citizens in Denmark, SARS-CoV-2 PCR tests were performed on subjects suspected of having COVID-19; 3944 cases had at least one positive test and were subjected to further analysis. SARS-CoV-2 positive cases from the United Kingdom Biobank were used for external validation. The ML models predicted the risk of death with a Receiver Operating Characteristic Area Under the Curve (ROC-AUC) of 0.906 at diagnosis, 0.818 at hospital admission and 0.721 at Intensive Care Unit (ICU) admission. Similar metrics were achieved for predicted risks of hospital and ICU admission and use of mechanical ventilation. Common risk factors included age, body mass index and hypertension, although the top risk features shifted towards markers of shock and organ dysfunction in ICU patients. The external validation indicated fair predictive performance for mortality prediction, but suboptimal performance for predicting ICU admission. ML may be used to identify drivers of progression to more severe disease and for prognostication in patients with COVID-19. We provide access to an online risk calculator based on these findings.
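
The ROC-AUC values quoted in this abstract can be read as the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor. The following is a minimal sketch of that pairwise definition, not the authors' pipeline; the function name `roc_auc` and the toy data are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died, 0 = survived, with model-predicted risks.
died = [0, 0, 1, 0, 1, 1]
risk = [0.10, 0.40, 0.35, 0.20, 0.80, 0.70]
auc = roc_auc(died, risk)  # 8 of the 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the number of patients; for cohorts of the size described here a rank-based implementation would normally be preferred.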

2020 ◽  
Author(s):  
Espen Jimenez-Solem ◽  
Tonny S Petersen ◽  
Casper Hansen ◽  
Christian Hansen ◽  
Christina Lioma ◽  
...  

ABSTRACT Background: Patients with severe COVID-19 have overwhelmed healthcare systems worldwide. We hypothesized that Machine Learning (ML) models could be used to predict risks at different stages of management (at diagnosis, hospital admission and ICU admission) and thereby provide insights into drivers and prognostic markers of disease progression and death. Methods: From a cohort of approx. 2.6 million citizens in the two regions of Denmark, SARS-CoV-2 PCR tests were performed on subjects suspected of having COVID-19; 3944 cases had at least one positive test and were subjected to further analysis. A cohort of SARS-CoV-2 positive cases from the United Kingdom Biobank was used for external validation. Findings: The ML models predicted the risk of death with a Receiver Operating Characteristic Area Under the Curve (ROC-AUC) of 0.904 at diagnosis, 0.818 at hospital admission and 0.723 at Intensive Care Unit (ICU) admission. Similar metrics were achieved for predicted risks of hospital and ICU admission and use of mechanical ventilation. We identified some common risk factors, including age, body mass index (BMI) and hypertension, as driving factors, although the top risk features shifted towards markers of shock and organ dysfunction in ICU patients. The external validation indicated fair predictive performance for mortality prediction, but suboptimal performance for predicting ICU admission. Interpretation: ML may be used to identify drivers of progression to more severe disease and for prognostication in patients with COVID-19. Prognostic features included age, BMI and hypertension, although markers of shock and organ dysfunction became more important in more severe cases. We provide access to an online risk calculator based on these findings. Funding: The study was funded by grants from the Novo Nordisk Foundation to MS (#NNF20SA0062879 and #NNF19OC0055183) and MN (#NNF20SA0062879).
The foundation took no part in project design, data handling and manuscript preparation.


BioMed ◽  
2022 ◽  
Vol 2 (1) ◽  
pp. 13-26
Author(s):  
Avishek Chatterjee ◽  
Guus Wilmink ◽  
Henry Woodruff ◽  
Philippe Lambin

We conducted a systematic survey of COVID-19 endpoint prediction literature to: (a) identify publications that include data that adhere to FAIR (findability, accessibility, interoperability, and reusability) principles and (b) develop and reuse mortality prediction models that best generalize to these datasets. The largest such cohort data we knew of was used for model development. The associated published prediction model was subjected to recursive feature elimination to find a minimal logistic regression model which had statistically and clinically indistinguishable predictive performance. This model could still not be applied to the four external validation sets that were identified, due to complete absence of needed model features in some external sets. Thus, a generalizable model (GM) was built which could be applied to all four external validation sets. An age-only model was used as a benchmark, as it is the simplest, effective, and robust predictor of mortality currently known in COVID-19 literature. While the GM surpassed the age-only model in three external cohorts, for the fourth external cohort, there was no statistically significant difference. This study underscores: (1) the paucity of FAIR data being shared by researchers despite the glut of COVID-19 prediction models and (2) the difficulty of creating any model that consistently outperforms an age-only model due to the cohort diversity of available datasets.


2021 ◽  
Vol 10 (5) ◽  
pp. 992
Author(s):  
Martina Barchitta ◽  
Andrea Maugeri ◽  
Giuliana Favara ◽  
Paolo Marco Riela ◽  
Giovanni Gallo ◽  
...  

Patients in intensive care units (ICUs) are at higher risk of worse prognosis and mortality. Here, we aimed to evaluate the ability of the Simplified Acute Physiology Score (SAPS II) to predict the risk of 7-day mortality, and to test a machine learning algorithm which combines the SAPS II with additional patients’ characteristics at ICU admission. We used data from the “Italian Nosocomial Infections Surveillance in Intensive Care Units” network. A Support Vector Machines (SVM) algorithm was used to classify 3782 patients according to sex, patient’s origin, type of ICU admission, non-surgical treatment for acute coronary disease, surgical intervention, SAPS II, presence of invasive devices, trauma, impaired immunity, antibiotic therapy and onset of HAI. The accuracy of SAPS II for distinguishing patients who died from those who did not was 69.3%, with an Area Under the Curve (AUC) of 0.678. Using the SVM algorithm, instead, we achieved an accuracy of 83.5% and an AUC of 0.896. Notably, SAPS II was the variable that weighed most in the model, and its removal resulted in an AUC of 0.653 and an accuracy of 68.4%. Overall, these findings suggest the present SVM model as a useful tool for early identification of patients at higher risk of death at ICU admission.


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
R Chopard ◽  
D Jimenez ◽  
G Serzian ◽  
F Ecarnot ◽  
N Falvo ◽  
...  

Abstract Background: Renal dysfunction may influence outcomes after pulmonary embolism (PE). We determined the incremental value of adding renal function impairment (estimated glomerular filtration rate, eGFR <60 ml/min/1.73 m²) on top of the 2019 ESC prognostic model, for the prediction of 30-day all-cause mortality in acute PE patients from a prospective, multicenter cohort. Methods and results: We identified which of three eGFR formulae predicted death most accurately. Changes in global model fit, discrimination, calibration and net reclassification index (NRI) were evaluated with addition of eGFR. We prospectively included consecutive adult patients with acute PE diagnosed as per ESC guidelines. Among 1,943 patients (mean age 67.3±17.1, 50.4% women), 107 (5.5% (95% CI 4.5–6.5%)) died during 30-day follow-up. The eGFR-MDRD4 formula was the most accurate for prediction of death. The observed mortality rate was higher for intermediate-low risk (OR 1.8, 95% CI 1.1–3.4) and high-risk PE (OR 10.3, 95% CI 3.6–17.3), and 30-day bleeding was significantly higher (OR 2.1, 95% CI 1.3–3.5) in patients with vs without eGFR-MDRD4 <60 ml/min/1.73 m². The addition of eGFR-MDRD4 information improved model fit, discriminatory capacity, and calibration of the ESC models. NRI was significantly improved (p<0.001), with 18% reclassification of predicted mortality, specifically in intermediate and high-risk PE. External validation using data from the RIETE registry confirmed our findings (Table). Conclusion: Addition of eGFR-MDRD4-derived renal dysfunction on top of the ESC prognostic algorithm yields significant reclassification of risk of death in intermediate and high-risk PE. Impact on therapy remains to be determined. Funding Acknowledgement: Type of funding source: Private grant(s) and/or Sponsorship. Main funding source(s): BMS-Pfizer Alliance, Bayer Healthcare
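
The net reclassification index mentioned in this abstract summarizes how often a new model moves patients to a more appropriate risk category than the old one. The abstract does not state which NRI variant was used, so the sketch below assumes the common categorical form: (share of events moved up minus share moved down) plus (share of non-events moved down minus share moved up). The function name, the category coding (0 = low, 1 = intermediate, 2 = high) and the toy data are all hypothetical.

```python
def net_reclassification_index(old_cat, new_cat, event):
    """Categorical NRI, assuming higher category numbers mean higher risk."""
    up_e = down_e = n_e = 0  # counts among patients with the event
    up_n = down_n = n_n = 0  # counts among patients without the event
    for old, new, had_event in zip(old_cat, new_cat, event):
        if had_event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need both events and non-events")
    nri_events = (up_e - down_e) / n_e
    nri_nonevents = (down_n - up_n) / n_n
    return nri_events + nri_nonevents


# Hypothetical toy data: risk category before and after adding a new marker.
old = [0, 1, 1, 2, 0, 1, 1, 0]
new = [1, 2, 1, 2, 0, 0, 1, 0]
died = [1, 1, 1, 1, 0, 0, 0, 0]
nri = net_reclassification_index(old, new, died)  # 0.5 + 0.25
```

A positive value means that, on balance, the new categories rank patients more appropriately; the event and non-event components are often worth reporting separately.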


Circulation ◽  
2019 ◽  
Vol 140 (Suppl_2) ◽  
Author(s):  
Marinos Kosmopoulos ◽  
Jason A Bartos ◽  
Demetris Yannopoulos

Introduction: Veno-Arterial Extracorporeal Membrane Oxygenation (VA ECMO) has emerged as a prominent tool for management of patients with Inability to Wean Off Cardiopulmonary Bypass (IWOCB), extracorporeal cardiopulmonary resuscitation (eCPR) or refractory cardiogenic shock (RCS). The high mortality that is still associated with these conditions calls for the development of reliable prediction models for mortality after cannulation. The Survival After VA ECMO (SAVE) Score is one of the most widely used prediction tools and the only model with external validation. However, its predictive value is still under debate. Hypothesis: VA ECMO indication affects the predictive value of the SAVE Score. Methods: 317 patients treated with VA ECMO in a quaternary center (n=52 for IWOCB, n=179 for eCPR and n=86 for RCS) were retrospectively assessed for differences in SAVE Score and their primary outcomes. The Receiver Operating Characteristic (ROC) curve for SAVE Score and mortality was calculated separately for each VA ECMO indication. Results: The three groups had significant differences in SAVE Score (p<0.01) without significant differences in mortality (p=0.176). ROC curve calculation indicated significant differences in the predictive value of the SAVE Score for survival among its different indications (Area Under the Curve = 81.69% for IWOCB, 53.79% for eCPR and 69.46% for RCS). Conclusion: VA ECMO indication markedly affects the predictive value of the SAVE Score. Prediction of the primary outcome in IWOCB patients was reliable. In contrast, routine application for survival estimation in eCPR patients is not supported by our results.


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
A Youssef

Abstract Study question: Which models that predict pregnancy outcome in couples with unexplained RPL exist and what is the performance of the most used model? Summary answer: We identified seven prediction models; none followed the recommended prediction model development steps. Moreover, the most used model showed poor predictive performance. What is known already: RPL remains unexplained in 50–75% of couples. For these couples, there is no effective treatment option and clinical management rests on supportive care. An essential part of supportive care consists of counselling on the prognosis of subsequent pregnancies. Indeed, multiple prediction models exist; however, the quality and validity of these models vary. In addition, the prediction model developed by Brigham et al. is the most widely used model, but has never been externally validated. Study design, size, duration: We performed a systematic review to identify prediction models for pregnancy outcome after unexplained RPL. In addition, we performed an external validation of the Brigham model in a retrospective cohort consisting of 668 couples with unexplained RPL that visited our RPL clinic between 2004 and 2019. Participants/materials, setting, methods: A systematic search was performed in December 2020 in Pubmed, Embase, Web of Science and Cochrane library to identify relevant studies. Eligible studies were selected and assessed according to the TRIPOD guidelines, covering topics on model performance and validation statement. The performance of predicting live birth in the Brigham model was evaluated through calibration and discrimination, in which the observed pregnancy rates were compared to the predicted pregnancy rates. Main results and the role of chance: Seven models were compared and assessed according to the TRIPOD statement. This resulted in two studies of low, three of moderate and two of above average reporting quality.
These studies did not follow the recommended steps for model development and did not calculate a sample size. Furthermore, the predictive performance of none of these models was internally or externally validated. We performed an external validation of the Brigham model. Calibration showed overestimation by the model and too extreme predictions, with a negative calibration intercept of –0.52 (CI 95% –0.68 – –0.36) and a calibration slope of 0.39 (CI 95% 0.07 – 0.71). The discriminative ability of the model was very low, with a concordance statistic of 0.55 (CI 95% 0.50 – 0.59). Limitations, reasons for caution: None of the studies are explicitly named as prediction models; therefore, models may have been missed in the selection process. The external validation cohort used a retrospective design, in which only the first pregnancy after intake was registered. Follow-up time was not limited, which is important in counselling unexplained RPL couples. Wider implications of the findings: Currently, there are no suitable models that predict pregnancy outcome after RPL. Moreover, we are in need of a model with several variables, such that prognosis is individualized, and with factors from both the female and the male partner to enable a couple-specific prognosis. Trial registration number: Not applicable
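
The calibration intercept and slope reported here are typically obtained by regressing the observed outcome on the logit of the predicted probability: a slope below 1 signals predictions that are too extreme, and a negative intercept signals overestimation. The sketch below assumes that approach and fits intercept and slope jointly by Newton-Raphson; in practice the intercept is often estimated separately with the slope fixed at 1, and the abstract does not say which was done. The function names and the toy data are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def fit_recalibration(pred_probs, outcomes, iterations=25):
    """Fit outcome ~ a + b * logit(pred) by Newton-Raphson; return (a, b)."""
    x = [logit(p) for p in pred_probs]
    a, b = 0.0, 1.0  # start from "perfectly calibrated"
    for _ in range(iterations):
        g_a = g_b = 0.0            # gradient of the log-likelihood
        h_aa = h_ab = h_bb = 0.0   # observed information (negative Hessian)
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g_a += yi - mu
            g_b += (yi - mu) * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system H * delta = g.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


# Hypothetical toy data: the model predicts 10% or 90%, but the observed
# event rates in those two groups are only 30% and 70% (too extreme).
preds = [0.1] * 10 + [0.9] * 10
events = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
intercept, slope = fit_recalibration(preds, events)
```

With these toy numbers the fitted slope is well below 1, which is the pattern the abstract describes as "too extreme predictions". The sketch assumes predicted probabilities strictly between 0 and 1 and data that are not perfectly separable.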


2020 ◽  
Vol 71 (16) ◽  
pp. 2079-2088 ◽  
Author(s):  
Kun Wang ◽  
Peiyuan Zuo ◽  
Yuwei Liu ◽  
Meng Zhang ◽  
Xiaofang Zhao ◽  
...  

Abstract Background: This study aimed to develop mortality-prediction models for patients with coronavirus disease-2019 (COVID-19). Methods: The training cohort included consecutive COVID-19 patients at the First People’s Hospital of Jiangxia District in Wuhan, China, from 7 January 2020 to 11 February 2020. We selected baseline data through the stepwise Akaike information criterion and ensemble XGBoost (extreme gradient boosting) model to build mortality-prediction models. We then validated these models in randomly selected COVID-19 patients at Union Hospital, Wuhan, from 1 January 2020 to 20 February 2020. Results: A total of 296 COVID-19 patients were enrolled in the training cohort; 19 died during hospitalization and 277 were discharged from the hospital. The clinical model developed using age, history of hypertension, and coronary heart disease showed area under the curve (AUC), 0.88 (95% confidence interval [CI], .80–.95); threshold, −2.6551; sensitivity, 92.31%; specificity, 77.44%; and negative predictive value (NPV), 99.34%. The laboratory model developed using age, high-sensitivity C-reactive protein, peripheral capillary oxygen saturation, neutrophil and lymphocyte count, d-dimer, aspartate aminotransferase, and glomerular filtration rate had a significantly stronger discriminatory power than the clinical model (P = .0157), with AUC, 0.98 (95% CI, .92–.99); threshold, −2.998; sensitivity, 100.00%; specificity, 92.82%; and NPV, 100.00%. In the subsequent validation cohort (N = 44), the AUC (95% CI) was 0.83 (.68–.93) and 0.88 (.75–.96) for the clinical model and laboratory model, respectively. Conclusions: We developed 2 predictive models for the in-hospital mortality of patients with COVID-19 in Wuhan that were validated in patients from another center.


2019 ◽  
Vol 54 (3) ◽  
pp. 1900224 ◽  
Author(s):  
Sanja Stanojevic ◽  
Jenna Sykes ◽  
Anne L. Stephenson ◽  
Shawn D. Aaron ◽  
George A. Whitmore

Introduction: We aimed to develop a clinical tool for predicting 1- and 2-year risk of death for patients with cystic fibrosis (CF). The model considers patients' overall health status as well as risk of intermittent shock events in calculating the risk of death. Methods: Canadian CF Registry data from 1982 to 2015 were used to develop a predictive risk model using threshold regression. The 2-year risk of death was estimated from the conditional probability of surviving the second year given survival through the first year. UK CF Registry data from 2007 to 2013 were used to externally validate the model. Results: The combined effect of CF chronic health status and CF intermittent shock risk provided a simple clinical scoring tool for assessing 1-year and 2-year risk of death for an individual CF patient. At a threshold risk of death of ≥20%, the 1-year model had a sensitivity of 74% and specificity of 96%. The area under the receiver operating characteristic curve (AUC) for the 2-year mortality model was significantly greater than the AUC for a model that predicted survival based on forced expiratory volume in 1 s <30% predicted (AUC 0.95 versus 0.68 respectively, p<0.001). The Canadian-derived model validated well with the UK data and correctly identified 79% of deaths and 95% of survivors in a single year in the UK. Conclusions: The prediction models provide an accurate risk of death over a 1- and 2-year time horizon. The models performed equally well when validated in an independent UK CF population.
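
The sensitivity and specificity quoted at a risk threshold of ≥20% come from treating every patient whose predicted risk reaches the threshold as "predicted to die" and comparing that flag with the observed outcome. A minimal sketch of that calculation follows; the function name and the toy data are hypothetical and are not taken from the registry analysis.

```python
def sensitivity_specificity(risks, died, threshold=0.20):
    """Return (sensitivity, specificity) when risk >= threshold is a positive."""
    tp = fn = tn = fp = 0
    for risk, outcome in zip(risks, died):
        flagged = risk >= threshold
        if outcome and flagged:
            tp += 1   # died and was flagged as high risk
        elif outcome:
            fn += 1   # died but was not flagged
        elif flagged:
            fp += 1   # survived but was flagged
        else:
            tn += 1   # survived and was not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: predicted 1-year risk of death and observed outcome.
risks = [0.05, 0.25, 0.60, 0.10, 0.15, 0.30, 0.02, 0.45]
died = [0, 1, 1, 0, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(risks, died)
```

Raising the threshold generally trades sensitivity for specificity, which is why a threshold is reported alongside both figures.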


2020 ◽  
Author(s):  
Victoria Garcia-Montemayor ◽  
Alejandro Martin-Malo ◽  
Carlo Barbieri ◽  
Francesco Bellocchio ◽  
Sagrario Soriano ◽  
...  

Abstract Background: Besides the classic logistic regression analysis, non-parametric methods based on machine learning techniques such as random forest are presently used to generate predictive models. The aim of this study was to evaluate random forest mortality prediction models in haemodialysis patients. Methods: Data were acquired from incident haemodialysis patients between 1995 and 2015. Prediction of mortality at 6 months, 1 year and 2 years of haemodialysis was calculated using random forest and the accuracy was compared with logistic regression. Baseline data were constructed with the information obtained during the initial period of regular haemodialysis. Aiming to increase accuracy concerning baseline information of each patient, the period of time used to collect data was set at 30, 60 and 90 days after the first haemodialysis session. Results: There were 1571 incident haemodialysis patients included. The mean age was 62.3 years and the average Charlson comorbidity index was 5.99. The mortality prediction models obtained by random forest appear to be adequate in terms of accuracy [area under the curve (AUC) 0.68–0.73] and superior to logistic regression models (ΔAUC 0.007–0.046). Results indicate that random forest and logistic regression develop mortality prediction models using different variables. Conclusions: Random forest is an adequate method, and superior to logistic regression, to generate mortality prediction models in haemodialysis patients.

