Fluid Overload Phenotypes in Critical Illness—A Machine Learning Approach

2022 ◽  
Vol 11 (2) ◽  
pp. 336
Author(s):  
Anna S. Messmer ◽  
Michel Moser ◽  
Patrick Zuercher ◽  
Joerg C. Schefold ◽  
Martin Müller ◽  
...  

Background: The detrimental impact of fluid overload (FO) on intensive care unit (ICU) morbidity and mortality is well known. However, research to identify subgroups of patients particularly prone to fluid overload is scarce. The aim of this cohort study was to derive “FO phenotypes” in the critically ill by using machine learning techniques. Methods: Retrospective single-center study including adult intensive care patients with a length of stay of ≥3 days and sufficient data to compute FO. Data were analyzed by multivariable logistic regression, fast and frugal trees (FFT), classification decision trees (DT), and a random forest (RF) model. Results: Of the 1772 included patients, 387 (21.8%) met the FO definition. The random forest model had the highest area under the curve (AUC) for predicting FO (0.84, 95% CI 0.79–0.86), followed by multivariable logistic regression (0.81, 95% CI 0.77–0.86), FFT (0.75, 95% CI 0.69–0.79), and DT (0.73, 95% CI 0.68–0.78). The most important predictors identified in all models were lactate and bicarbonate at admission and postsurgical ICU admission. Sepsis/septic shock was identified as a risk factor in the multivariable logistic regression and RF analyses. Conclusion: The FO phenotypes consist of patients admitted after surgery or with sepsis/septic shock who present with high lactate and low bicarbonate.
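The AUC values used to rank the four models have a direct rank interpretation: the probability that a randomly chosen FO patient is scored higher than a randomly chosen non-FO patient. A minimal stdlib sketch (illustrative only, not the study's code) computes it by counting pairwise wins:

```python
def auc(labels, scores):
    """Probability that a random positive case scores above a random
    negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Three of four positive/negative pairs are ranked correctly:
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))  # → 0.75
```

A perfect ranking yields 1.0 and a random one hovers around 0.5, which is why the RF model's 0.84 reads as the strongest discriminator here.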

2020 ◽  
Author(s):  
Victoria Garcia-Montemayor ◽  
Alejandro Martin-Malo ◽  
Carlo Barbieri ◽  
Francesco Bellocchio ◽  
Sagrario Soriano ◽  
...  

Abstract Background Besides classic logistic regression analysis, non-parametric methods based on machine learning techniques such as random forest are now used to generate predictive models. The aim of this study was to evaluate random forest mortality prediction models in haemodialysis patients. Methods Data were acquired from incident haemodialysis patients between 1995 and 2015. Prediction of mortality at 6 months, 1 year and 2 years of haemodialysis was calculated using random forest, and the accuracy was compared with that of logistic regression. Baseline data were constructed with the information obtained during the initial period of regular haemodialysis. Aiming to increase the accuracy of each patient's baseline information, the period of time used to collect data was set at 30, 60 and 90 days after the first haemodialysis session. Results There were 1571 incident haemodialysis patients included. The mean age was 62.3 years and the average Charlson comorbidity index was 5.99. The mortality prediction models obtained by random forest appear to be adequate in terms of accuracy [area under the curve (AUC) 0.68–0.73] and superior to logistic regression models (ΔAUC 0.007–0.046). Results indicate that random forest and logistic regression develop mortality prediction models using different variables. Conclusions Random forest is an adequate method, superior to logistic regression, for generating mortality prediction models in haemodialysis patients.
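The ΔAUC comparison the authors report can be reproduced in outline with scikit-learn. This is a sketch on synthetic data (not the dialysis registry, and the study's exact preprocessing and hyperparameters are not specified here), showing the head-to-head AUC comparison:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the baseline haemodialysis features.
X, y = make_classification(n_samples=1500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

auc_rf = roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1])
auc_lr = roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1])
delta_auc = auc_rf - auc_lr  # the ΔAUC the abstract reports (0.007–0.046)
```

In practice, varying the baseline collection window (30, 60, 90 days) would mean rebuilding `X` for each window and repeating the comparison.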


2020 ◽  
Vol 29 (4) ◽  
pp. e70-e80
Author(s):  
Mireia Ladios-Martin ◽  
José Fernández-de-Maya ◽  
Francisco-Javier Ballesta-López ◽  
Adrián Belso-Garzas ◽  
Manuel Mas-Asencio ◽  
...  

Background Pressure injuries are an important problem in hospital care. Detecting the population at risk for pressure injuries is the first step in any preventive strategy. Available tools such as the Norton and Braden scales do not take into account all of the relevant risk factors. Data mining and machine learning techniques have the potential to overcome this limitation. Objectives To build a model to detect pressure injury risk in intensive care unit patients and to put the model into production in a real environment. Methods The sample comprised adult patients admitted to an intensive care unit (N = 6694) at University Hospital of Torrevieja and University Hospital of Vinalopó. A retrospective design was used to train (n = 2508) and test (n = 1769) the model and then a prospective design was used to test the model in a real environment (n = 2417). Data mining was used to extract variables from electronic medical records and a predictive model was built with machine learning techniques. The sensitivity, specificity, area under the curve, and accuracy of the model were evaluated. Results The final model used logistic regression and incorporated 23 variables. The model had sensitivity of 0.90, specificity of 0.74, and area under the curve of 0.89 during the initial test, and thus it outperformed the Norton scale. The model performed well 1 year later in a real environment. Conclusions The model effectively predicts risk of pressure injury. This allows nurses to focus on patients at high risk for pressure injury without increasing workload.
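The sensitivity/specificity/accuracy triple used to evaluate the pressure-injury model comes straight from the confusion matrix at a chosen risk threshold. A minimal stdlib sketch (illustrative, not the hospital's production code):

```python
def confusion_metrics(y_true, y_pred):
    """Binary-classification metrics from hard predictions (1 = at risk)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),      # at-risk patients correctly flagged
        "specificity": tn / (tn + fp),      # low-risk patients correctly cleared
        "accuracy": (tp + tn) / len(y_true),
    }
```

The reported sensitivity of 0.90 with specificity of 0.74 reflects the usual nursing trade-off: the threshold is set so that few at-risk patients are missed, at the cost of some false alarms.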


Author(s):  
Joshua J. Levy ◽  
A. James O’Malley

Abstract Background Machine learning approaches have become increasingly popular modeling techniques, relying on data-driven heuristics to arrive at their solutions. Recent comparisons between these algorithms and traditional statistical modeling techniques have largely ignored the superiority that the former approaches gain from their model-building search algorithms. This has led to the alignment of statistical and machine learning approaches with different types of problems and the under-development of procedures that combine their attributes. In this context, we hoped to understand the domains of applicability for each approach and to identify areas where a marriage between the two is warranted. We then sought to develop a hybrid statistical-machine learning procedure with the best attributes of each. Methods We present three simple examples to illustrate when to use each modeling approach and posit a general framework for combining them into an enhanced logistic regression model building procedure that aids interpretation. We study 556 benchmark machine learning datasets to uncover when machine learning techniques outperformed rudimentary logistic regression models and so are potentially well-equipped to enhance them. We illustrate a software package, InteractionTransformer, which embeds logistic regression with advanced model building capacity by using machine learning algorithms to extract candidate interaction features from a random forest model for inclusion in the model. Finally, we apply our enhanced logistic regression analysis to two real-world biomedical examples, one where predictors vary linearly with the outcome and another with extensive second-order interactions. Results Preliminary statistical analysis demonstrated that across the 556 benchmark datasets, the random forest approach significantly outperformed the logistic regression approach. We found a statistically significant increase in predictive performance when using hybrid procedures, and greater clarity in the association of the acquired terms with the outcome compared to directly interpreting the random forest output. Conclusions When a random forest model is closer to the true model, hybrid statistical-machine learning procedures can substantially enhance the performance of statistical procedures in an automated manner while preserving easy interpretation of the results. Such hybrid methods may help facilitate widespread adoption of machine learning techniques in the biomedical setting.
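The core idea of the hybrid procedure — mine a random forest for candidate interactions, then hand them to an interpretable logistic regression — can be sketched as follows. This is an illustrative simplification on synthetic data, not the actual InteractionTransformer API: here the forest's two most important features are paired into a single product term, whereas the package extracts candidate interactions more systematically.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Step 1: let a random forest surface the two most influential features.
rf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)
i, j = np.argsort(rf.feature_importances_)[-2:]

# Step 2: add their product as a candidate interaction term, so the final
# model stays a plain logistic regression with interpretable coefficients.
def with_interaction(X):
    return np.column_stack([X, X[:, i] * X[:, j]])

lr = LogisticRegression(max_iter=1000).fit(with_interaction(X_tr), y_tr)
auc = roc_auc_score(y_te, lr.predict_proba(with_interaction(X_te))[:, 1])
```

The payoff described in the abstract is exactly this shape: the interaction's coefficient can be read off and tested like any other logistic regression term, rather than being buried inside the forest.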


2021 ◽  
Author(s):  
Roobaea Alroobaea ◽  
Seifeddine Mechti ◽  
Mariem Haoues ◽  
Saeed Rubaiee ◽  
Anas Ahmed ◽  
...  

Abstract Alzheimer's disease is the main cause of dementia and frequently affects older adults. The disease is costly, especially in terms of treatment, and is among the leading causes of death in elderly citizens. Early detection of Alzheimer's disease helps medical staff diagnose it, which can decrease the risk of death; this makes early detection a crucial problem in the healthcare industry. The objective of this research study is to introduce a computer-aided diagnosis system for Alzheimer's disease detection using machine learning techniques. We employed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS) brain datasets. Common supervised machine learning techniques were applied for automatic Alzheimer's disease detection, such as logistic regression, support vector machine, random forest, and linear discriminant analysis, among others. The best accuracy values provided by the machine learning classifiers are 99.43% and 99.10%, given by logistic regression and support vector machine, respectively, on the ADNI dataset, whereas for the OASIS dataset we obtained 84.33% and 83.92%, given by logistic regression and random forest, respectively.
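The head-to-head accuracy comparison across the listed classifiers follows a standard scikit-learn pattern. A sketch on synthetic data (the ADNI/OASIS features and the study's preprocessing are not reproduced here):

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the extracted brain-imaging features.
X, y = make_classification(n_samples=600, n_features=10, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=2)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "support vector machine": SVC(),
    "random forest": RandomForestClassifier(random_state=2),
    "linear discriminant analysis": LinearDiscriminantAnalysis(),
}
# Held-out accuracy per classifier, as reported per dataset in the abstract.
accuracies = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
              for name, m in models.items()}
```

Which classifier wins depends on the dataset, which matches the abstract's result: logistic regression leads on both datasets, but its runner-up differs (SVM on ADNI, random forest on OASIS).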


2021 ◽  
Vol 11 (1) ◽  
pp. 44
Author(s):  
Helen R. Gosselt ◽  
Maxime M. A. Verhoeven ◽  
Maja Bulatović-Ćalasan ◽  
Paco M. Welsing ◽  
Maurits C. F. J. de Rotte ◽  
...  

The goals of this study were to examine whether machine-learning algorithms outperform multivariable logistic regression in the prediction of insufficient response to methotrexate (MTX); secondly, to examine which features are essential for correct prediction; and finally, to investigate whether the best performing model specifically identifies insufficient responders to MTX (combination) therapy. The prediction of insufficient response (3-month Disease Activity Score 28-Erythrocyte-sedimentation rate (DAS28-ESR) > 3.2) was assessed using logistic regression, least absolute shrinkage and selection operator (LASSO), random forest, and extreme gradient boosting (XGBoost). The baseline features of 355 rheumatoid arthritis (RA) patients from the “treatment in the Rotterdam Early Arthritis CoHort” (tREACH) and the U-Act-Early trial were combined for analyses. The model performances were compared using the area under the curve (AUC) of receiver operating characteristic (ROC) curves, 95% confidence intervals (95% CI), and sensitivity and specificity. Finally, the best performing model following feature selection was tested on 101 RA patients starting tocilizumab (TCZ) monotherapy. Logistic regression (AUC = 0.77, 95% CI: 0.68–0.86) performed as well as LASSO (AUC = 0.76, 95% CI: 0.67–0.85), random forest (AUC = 0.71, 95% CI: 0.61–0.81), and XGBoost (AUC = 0.70, 95% CI: 0.61–0.81), yet logistic regression reached the highest sensitivity (81%). The most important features were the baseline DAS28 (components). For all algorithms, models with six features performed similarly to those with 16. When applied to the TCZ-monotherapy group, logistic regression's sensitivity dropped significantly, from 83% to 69% (p = 0.03). In the current dataset, logistic regression performed equally well compared to machine-learning algorithms in the prediction of insufficient response to MTX. Models could be reduced to six features, which is more conducive to clinical implementation.
Interestingly, the prediction model was specific to MTX (combination) therapy response.
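The overlapping 95% CIs on the AUCs are what underpin the "performed as well as" conclusion. The abstract does not state how the CIs were computed; one common choice is the percentile bootstrap, sketched here in pure stdlib Python:

```python
import random

def auc(y, s):
    """Pairwise-ranking AUC (ties count as half a win)."""
    pos = [b for a, b in zip(y, s) if a == 1]
    neg = [b for a, b in zip(y, s) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y, s, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC; resamples patients with replacement."""
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(len(y)) for _ in y]
        yb = [y[i] for i in idx]
        if 0 < sum(yb) < len(yb):          # resample must contain both classes
            stats.append(auc(yb, [s[i] for i in idx]))
    stats.sort()
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]

y = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
s = [0.9, 0.8, 0.4, 0.35, 0.3, 0.2, 0.7, 0.6, 0.55, 0.1]
lo, hi = bootstrap_auc_ci(y, s)
```

Two models whose intervals overlap, as all four do in the abstract (roughly 0.61–0.86), cannot be declared different on AUC alone.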


2021 ◽  
Vol 10 (2) ◽  
pp. 301
Author(s):  
Debdipto Misra ◽  
Venkatesh Avula ◽  
Donna M. Wolk ◽  
Hosam A. Farag ◽  
Jiang Li ◽  
...  

Background: Developing a decision support system based on advances in machine learning is one area for strategic innovation in healthcare. Predicting a patient's progression to septic shock is an active field of translational research. The goal of this study was to develop a working model of a clinical decision support system for predicting septic shock in an acute care setting for up to 6 h from the time of admission in an integrated healthcare setting. Method: Clinical data from the Electronic Health Record (EHR), at the encounter level, were used to build a predictive model for progression from sepsis to septic shock up to 6 h from the time of admission; that is, T = 1, 3, and 6 h from admission. Eight different machine learning algorithms (Random Forest, XGBoost, C5.0, Decision Trees, Boosted Logistic Regression, Support Vector Machine, Logistic Regression, Regularized Logistic, and Bayes Generalized Linear Model) were used for model development. Two adaptive sampling strategies were used to address the class imbalance. Data from two sources (clinical and billing codes) were used to define the case definition (septic shock) using the Centers for Medicare & Medicaid Services (CMS) Sepsis criteria. Model assessment was performed using the area under the receiver operating characteristic curve (AUROC), sensitivity, and specificity. Model predictions for each feature window (1, 3, and 6 h from admission) were consolidated. Results: Retrospective data from April 2005 to September 2018 were extracted from the EHR, insurance claims, billing, and laboratory systems to create a dataset for septic shock detection. The clinical criteria and billing information were used to label patients into two classes (septic shock patients and sepsis patients) at three different time points from admission, creating two different case-control cohorts. Data from 45,425 unique in-patient visits were used to build 96 prediction models comparing the clinical-based definition with billing-based information as the gold standard. Of the 24 consolidated models (based on eight machine learning algorithms and three feature windows), four reached an AUROC greater than 0.9, and all reached an AUROC of at least 0.8820. The best model, based on Random Forest, reached an AUROC of 0.9483 with a sensitivity of 83.9% and a specificity of 88.1%. The sepsis detection window at 6 h outperformed the 1- and 3-h windows. The sepsis definition based on clinical variables outperformed the definition based only on billing information. Conclusion: This study corroborated that machine learning models can be developed to predict septic shock using clinical and administrative data. However, models that used clinical information to define septic shock outperformed models developed from administrative data alone. Intelligent decision support tools can be developed, integrated into the EHR, and used to improve clinical outcomes and facilitate the optimization of resources in real time.
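Septic shock cases are rare relative to sepsis cases, which is why the authors needed sampling strategies for class imbalance. The abstract does not name the two strategies used; one of the simplest candidates, random oversampling of the minority class, looks like this in stdlib Python (an illustrative sketch, not the study's pipeline):

```python
import random

def oversample_minority(X, y, seed=0):
    """Duplicate random minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(y))) + extra
    return [X[i] for i in keep], [y[i] for i in keep]
```

Balancing is applied only to the training split; the AUROC, sensitivity, and specificity reported above must still be measured on the imbalanced held-out data to reflect real-world prevalence.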


2021 ◽  
Vol 10 (5) ◽  
pp. 992
Author(s):  
Martina Barchitta ◽  
Andrea Maugeri ◽  
Giuliana Favara ◽  
Paolo Marco Riela ◽  
Giovanni Gallo ◽  
...  

Patients in intensive care units (ICUs) are at higher risk of poor prognosis and mortality. Here, we aimed to evaluate the ability of the Simplified Acute Physiology Score (SAPS II) to predict the risk of 7-day mortality, and to test a machine learning algorithm that combines the SAPS II with additional patient characteristics at ICU admission. We used data from the “Italian Nosocomial Infections Surveillance in Intensive Care Units” network. A Support Vector Machine (SVM) algorithm was used to classify 3782 patients according to sex, patient origin, type of ICU admission, non-surgical treatment for acute coronary disease, surgical intervention, SAPS II, presence of invasive devices, trauma, impaired immunity, antibiotic therapy, and onset of HAI. The accuracy of SAPS II for discriminating patients who died from those who did not was 69.3%, with an area under the curve (AUC) of 0.678. Using the SVM algorithm, instead, we achieved an accuracy of 83.5% and an AUC of 0.896. Notably, SAPS II was the variable that weighed most heavily in the model: its removal resulted in an AUC of 0.653 and an accuracy of 68.4%. Overall, these findings suggest the present SVM model is a useful tool for the early identification of patients at higher risk of death at ICU admission.
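An SVM of the kind described is typically fit on standardized features, since SVM kernels are sensitive to feature scale. A scikit-learn sketch on synthetic data (the surveillance-network features and the authors' kernel/parameter choices are not specified here):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for SAPS II plus the admission characteristics.
X, y = make_classification(n_samples=1000, n_features=12, random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

# Scaling inside the pipeline keeps the test fold untouched during fitting.
svm = make_pipeline(StandardScaler(), SVC()).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, svm.decision_function(X_te))
```

The abstract's ablation (AUC falling from 0.896 to 0.653 without SAPS II) would correspond to dropping that column from `X` and refitting, a simple way to measure a feature's weight in a kernel model that has no direct coefficients.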


2020 ◽  
Vol 41 (Supplement_2) ◽  
Author(s):  
A.Y Lui ◽  
L Garber ◽  
M Vincent ◽  
L Celi ◽  
J Masip ◽  
...  

Abstract Background Hyperoxia produces reactive oxygen species, apoptosis, and vasoconstriction, and is associated with adverse outcomes in patients with heart failure and cardiac arrest. Our aim was to evaluate the association between hyperoxia and mortality in patients (pts) receiving positive pressure ventilation (PPV) in the cardiac intensive care unit (CICU). Methods Patients admitted to our medical center CICU who received any PPV (invasive or non-invasive) from 2001 through 2012 were included. Hyperoxia was defined as a time-weighted mean PaO2 >120 mmHg and non-hyperoxia as PaO2 ≤120 mmHg during the CICU admission. The primary outcome was in-hospital mortality. Multivariable logistic regression was used to assess the association between hyperoxia and in-hospital mortality adjusted for age, female sex, Oxford Acute Severity of Illness Score, creatinine, lactate, pH, PaO2/FiO2 ratio, PCO2, PEEP, and estimated time spent on PEEP. Results Among 1493 patients, hyperoxia (median PaO2 147 mmHg) during the CICU admission was observed in 702 (47.0%) pts. In-hospital mortality was 29.7% in the non-hyperoxia group and 33.9% in the hyperoxia group (log-rank test, p=0.0282; see figure). Using multivariable logistic regression, hyperoxia was independently associated with in-hospital mortality (OR 1.507, 95% CI 1.311–2.001, p=0.00508). A post-hoc analysis with PaO2 as a continuous variable was consistent with the primary analysis (OR 1.053 per 10 mmHg increase in PaO2, 95% CI 1.024–1.082, p=0.0002). Conclusions In a large CICU cohort, hyperoxia was associated with increased mortality. Trials of titration of supplemental oxygen across the full spectrum of critically ill cardiac patients are warranted. Funding Acknowledgement Type of funding source: None
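The exposure definition hinges on the time-weighted mean PaO2, which weights each measurement by how long it was in effect rather than averaging raw values. A stdlib sketch assuming last-value-carried-forward between blood-gas draws (the abstract does not specify the interpolation scheme):

```python
def time_weighted_mean(samples):
    """samples: (hours_from_admission, PaO2_mmHg) pairs, sorted by time.
    Each value is carried forward until the next measurement."""
    weighted = sum(v0 * (t1 - t0)
                   for (t0, v0), (t1, _) in zip(samples, samples[1:]))
    duration = samples[-1][0] - samples[0][0]
    return weighted / duration

def is_hyperoxic(samples, threshold=120.0):
    """Cohort definition: time-weighted mean PaO2 above 120 mmHg."""
    return time_weighted_mean(samples) > threshold

pao2 = [(0, 100), (2, 200), (4, 100)]
print(time_weighted_mean(pao2))  # → 150.0, so this admission counts as hyperoxia
```

Time-weighting matters here: a single brief PaO2 spike contributes little, whereas sustained elevation drives the mean above the 120 mmHg cutoff.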


2017 ◽  
Vol 107 (10) ◽  
pp. 1187-1198 ◽  
Author(s):  
L. Wen ◽  
C. R. Bowen ◽  
G. L. Hartman

Dispersal of urediniospores by wind is the primary means of spread for Phakopsora pachyrhizi, the cause of soybean rust. Our research focused on the short-distance movement of urediniospores from within the soybean canopy and up to 61 m from field-grown rust-infected soybean plants. Environmental variables were used to develop and compare models, including least absolute shrinkage and selection operator regression, zero-inflated Poisson/regular Poisson regression, random forest, and neural network, to describe the deposition of urediniospores collected in passive and active traps. All four models identified distance of trap from source, humidity, temperature, wind direction, and wind speed as the five most important variables influencing short-distance movement of urediniospores. The random forest model provided the best predictions, explaining 76.1 and 86.8% of the total variation in the passive- and active-trap datasets, respectively. The prediction accuracies, based on the correlation coefficient (r) between predicted and observed values, were 0.83 (P < 0.0001) and 0.94 (P < 0.0001) for the passive- and active-trap datasets, respectively. Overall, multiple machine learning techniques identified the most important variables and made accurate predictions of the short-distance movement of P. pachyrhizi urediniospores.
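The accuracy metric here, Pearson's r between predicted and observed spore counts, and the "variation explained" figures (r² × 100) can be computed with a few lines of stdlib Python (illustrative values, not the trap data):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

predicted = [3.0, 5.1, 7.2, 8.8, 11.0]
observed = [3.2, 5.0, 7.0, 9.1, 10.8]
r = pearson_r(predicted, observed)
# r measures prediction accuracy; r**2 is the share of variation explained,
# the quantity behind the 76.1% and 86.8% figures above.
```

For the active-trap dataset, r = 0.94 corresponds to r² ≈ 0.88, consistent with the 86.8% of variation explained.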


2021 ◽  
pp. 1-10
Author(s):  
I. Krug ◽  
J. Linardon ◽  
C. Greenwood ◽  
G. Youssef ◽  
J. Treasure ◽  
...  

Abstract Background Despite a wide range of proposed risk factors and theoretical models, prediction of eating disorder (ED) onset remains poor. This study undertook the first comparison of two machine learning (ML) approaches [penalised logistic regression (LASSO), and prediction rule ensembles (PREs)] to conventional logistic regression (LR) models to enhance prediction of ED onset and differential ED diagnoses from a range of putative risk factors. Method Data were part of a European Project and comprised 1402 participants, 642 ED patients [52% with anorexia nervosa (AN) and 40% with bulimia nervosa (BN)] and 760 controls. The Cross-Cultural Risk Factor Questionnaire, which assesses retrospectively a range of sociocultural and psychological ED risk factors occurring before the age of 12 years (46 predictors in total), was used. Results All three statistical approaches had satisfactory model accuracy, with an average area under the curve (AUC) of 86% for predicting ED onset and 70% for predicting AN v. BN. Predictive performance was greatest for the two regression methods (LR and LASSO), although the PRE technique relied on fewer predictors with comparable accuracy. The individual risk factors differed depending on the outcome classification (EDs v. non-EDs and AN v. BN). Conclusions Even though the conventional LR performed comparably to the ML approaches in terms of predictive accuracy, the ML methods produced more parsimonious predictive models. ML approaches offer a viable way to modify screening practices for ED risk that balance accuracy against participant burden.
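The parsimony advantage the abstract attributes to the ML methods comes from penalisation: an L1 (LASSO) penalty shrinks uninformative coefficients exactly to zero, so the fitted model uses only a subset of the 46 candidate risk factors. A scikit-learn sketch on synthetic data (the questionnaire predictors and the PRE method are not reproduced; the penalty strength `C=0.1` is an illustrative choice):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 46 candidate predictors with only a few truly informative ones,
# mirroring the risk-factor setting described above.
X, y = make_classification(n_samples=800, n_features=46, n_informative=5,
                           random_state=4)

# Penalised (LASSO-style) vs conventional logistic regression.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
full = LogisticRegression(max_iter=1000).fit(X, y)

n_kept = int(np.sum(lasso.coef_ != 0))  # predictors surviving the penalty
```

The conventional model retains all 46 coefficients; the penalised model keeps far fewer, which is the screening-burden argument the conclusion makes.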

