Early Prediction of Mortality, Severity, and Length of Stay in the Intensive Care Unit of Sepsis Patients Based on Sepsis 3.0 by Machine Learning Models

Background: Early prediction of the clinical outcome of patients with sepsis is of great significance and can guide treatment and reduce the mortality of patients. However, it is clinically difficult for clinicians.Methods: A total of 2,224 patients with sepsis were involved over a 3-year period (2016–2018) in the intensive care unit (ICU) of Peking Union Medical College Hospital. With all the key medical data from the first 6 h in the ICU, three machine learning models, logistic regression, random forest, and XGBoost, were used to predict mortality, severity (sepsis/septic shock), and length of ICU stay (LOS) (>6 days, ≤ 6 days). Missing data imputation and oversampling were completed on the dataset before introduction into the models.Results: Compared to the mortality and LOS predictions, the severity prediction achieved the best classification results, based on the area under the operating receiver characteristics (AUC), with the random forest classifier (sensitivity = 0.65, specificity = 0.73, F1 score = 0.72, AUC = 0.79). The random forest model also showed the best overall performance (mortality prediction: sensitivity = 0.50, specificity = 0.84, F1 score = 0.66, AUC = 0.74; LOS prediction: sensitivity = 0.79, specificity = 0.66, F1 score = 0.69, AUC = 0.76) among the three models. The predictive ability of the SOFA score itself was inferior to that of the above three models.Conclusions: Using the random forest classifier in the first 6 h of ICU admission can provide a comprehensive early warning of sepsis, which will contribute to the formulation and management of clinical decisions and the allocation and management of resources.

Download Full-text

Machine learning can accurately predict pre-admission baseline hemoglobin and creatinine in intensive care patients

npj Digital Medicine ◽

10.1038/s41746-019-0192-z ◽

2019 ◽

Vol 2 (1) ◽

Cited By ~ 2

Author(s):

Antonin Dauvin ◽

Carolina Donado ◽

Patrik Bachtiger ◽

Ke-Chun Huang ◽

Christopher Martin Sauer ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care Unit ◽

Intensive Care ◽

Impaired Renal Function ◽

Kidney Injury ◽

Learning Models ◽

Intensive Care Patients ◽

Using Data ◽

Machine Learning Models ◽

Baseline Hemoglobin

AbstractPatients admitted to the intensive care unit frequently have anemia and impaired renal function, but often lack historical blood results to contextualize the acuteness of these findings. Using data available within two hours of ICU admission, we developed machine learning models that accurately (AUC 0.86–0.89) classify an individual patient’s baseline hemoglobin and creatinine levels. Compared to assuming the baseline to be the same as the admission lab value, machine learning performed significantly better at classifying acute kidney injury regardless of initial creatinine value, and significantly better at predicting baseline hemoglobin value in patients with admission hemoglobin of <10 g/dl.

Download Full-text

Predicting length of stay using regression and Machine Learning models in Intensive Care Unit: a pilot study

2021 11th International Conference on Biomedical Engineering and Technology ◽

10.1145/3460238.3460247 ◽

2021 ◽

Author(s):

Ilaria Picone ◽

Imma Latessa ◽

Antonella Fiorillo ◽

Arianna Scala ◽

Teresa Angela Trunfio ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care Unit ◽

Intensive Care ◽

Pilot Study ◽

Length Of Stay ◽

Learning Models ◽

Machine Learning Models

Download Full-text

Fake News Data Exploration and Analytics

Electronics ◽

10.3390/electronics10192326 ◽

2021 ◽

Vol 10 (19) ◽

pp. 2326

Author(s):

Mazhar Javed Awan ◽

Awais Yasin ◽

Haitham Nobanee ◽

Ahmed Abid Ali ◽

Zain Shahzad ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Random Forest Classifier ◽

The Internet ◽

Fake News ◽

Learning Models ◽

Decision Tree Classifier ◽

Tree Classifier ◽

Machine Learning Models

Before the internet, people acquired their news from the radio, television, and newspapers. With the internet, the news moved online, and suddenly, anyone could post information on websites such as Facebook and Twitter. The spread of fake news has also increased with social media. It has become one of the most significant issues of this century. People use the method of fake news to pollute the reputation of a well-reputed organization for their benefit. The most important reason for such a project is to frame a device to examine the language designs that describe fake and right news through machine learning. This paper proposes models of machine learning that can successfully detect fake news. These models identify which news is real or fake and specify the accuracy of said news, even in a complex environment. After data-preprocessing and exploration, we applied three machine learning models; random forest classifier, logistic regression, and term frequency-inverse document frequency (TF-IDF) vectorizer. The accuracy of the TFIDF vectorizer, logistic regression, random forest classifier, and decision tree classifier models was approximately 99.52%, 98.63%, 99.63%, and 99.68%, respectively. Machine learning models can be considered a great choice to find reality-based results and applied to other unstructured data for various sentiment analysis applications.

Download Full-text

Comparison of machine learning models for the prediction of mortality of patients with unplanned extubation in intensive care units

Scientific Reports ◽

10.1038/s41598-018-35582-2 ◽

2018 ◽

Vol 8 (1) ◽

Cited By ~ 8

Author(s):

Meng Hsuen Hsieh ◽

Meng Ju Hsieh ◽

Chin-Ming Chen ◽

Chia-Chang Hsieh ◽

Chien-Ming Chao ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care ◽

Intensive Care Units ◽

Learning Models ◽

Unplanned Extubation ◽

Prediction Of Mortality ◽

Machine Learning Models

Download Full-text

Developing Machine Learning Models for Predicting Intensive Care Unit Resource Use During the COVID-19 Pandemic: Experiences from a Bi-Regional Health Care System Covering 2.5 Million Citizens in Denmark

SSRN Electronic Journal ◽

10.2139/ssrn.3796914 ◽

2021 ◽

Author(s):

Stephan Sloth Lorenzen ◽

Mads Nielsen ◽

Espen Jimenez-Solem ◽

Tonny Studsgaard Petersen ◽

Anders Perner ◽

...

Keyword(s):

Machine Learning ◽

Health Care ◽

Intensive Care Unit ◽

Intensive Care ◽

Health Care System ◽

Resource Use ◽

Learning Models ◽

Regional Health ◽

Regional Health Care ◽

Machine Learning Models

Download Full-text

Machine learning models for early sepsis recognition in the neonatal intensive care unit using readily available electronic health record data

PLoS ONE ◽

10.1371/journal.pone.0212665 ◽

2019 ◽

Vol 14 (2) ◽

pp. e0212665 ◽

Cited By ~ 24

Author(s):

Aaron J. Masino ◽

Mary Catherine Harris ◽

Daniel Forsyth ◽

Svetlana Ostapenko ◽

Lakshmi Srinivasan ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care Unit ◽

Intensive Care ◽

Neonatal Intensive Care Unit ◽

Neonatal Intensive Care ◽

Health Record ◽

Learning Models ◽

Electronic Health Record Data ◽

Record Data ◽

Machine Learning Models

Download Full-text

Document Preprocessing with TF-IDF to Improve the Polarity Classification Performance of Unstructured Sentiment Analysis

Kinetik Game Technology Information System Computer Network Computing Electronics and Control ◽

10.22219/kinetik.v5i3.1066 ◽

2020 ◽

pp. 235-242

Author(s):

Farrikh Alzami ◽

Erika Devi Udayanti ◽

Dwi Puji Prabowo ◽

Rama Aria Megantara

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Random Forest ◽

Sentiment Analysis ◽

Classification Performance ◽

Document Preparation ◽

Learning Models ◽

Polarity Classification ◽

Negative Sentiment ◽

Machine Learning Models

Sentiment analysis in terms of polarity classification is very important in everyday life, with the existence of polarity, many people can find out whether the respected document has positive or negative sentiment so that it can help in choosing and making decisions. Sentiment analysis usually done manually. Therefore, an automatic sentiment analysis classification process is needed. However, it is rare to find studies that discuss extraction features and which learning models are suitable for unstructured sentiment analysis types with the Amazon food review case. This research explores some extraction features such as Word Bags, TF-IDF, Word2Vector, as well as a combination of TF-IDF and Word2Vector with several machine learning models such as Random Forest, SVM, KNN and Naïve Bayes to find out a combination of feature extraction and learning models that can help add variety to the analysis of polarity sentiments. By assisting with document preparation such as html tags and punctuation and special characters, using snowball stemming, TF-IDF results obtained with SVM are suitable for obtaining a polarity classification in unstructured sentiment analysis for the case of Amazon food review with a performance result of 87,3 percent.

Download Full-text

Random forest and long short-term memory based machine learning models for classification of ion mobility spectrometry spectra

Chemical, Biological, Radiological, Nuclear, and Explosives (CBRNE) Sensing XXII ◽

10.1117/12.2585829 ◽

2021 ◽

Author(s):

Patrick C. Riley ◽

Samir V. Deshpande ◽

Brian S. Ince ◽

Brian C. Hauck ◽

Kyle P. O'Donnell ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Ion Mobility ◽

Short Term Memory ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Machine Learning Models

Download Full-text

Comparison of Resampling Algorithms to Address Class Imbalance when Developing Machine Learning Models to Predict Foodborne Pathogen Presence in Agricultural Water

Frontiers in Environmental Science ◽

10.3389/fenvs.2021.701288 ◽

2021 ◽

Vol 9 ◽

Author(s):

Daniel Lowell Weller ◽

Tanzy M. T. Love ◽

Martin Wiedmann

Keyword(s):

Machine Learning ◽

Random Forest ◽

Predictive Models ◽

Training Data ◽

Agricultural Water ◽

Learning Models ◽

Safety Hazards ◽

E Coli ◽

Resampling Method ◽

Machine Learning Models

Recent studies have shown that predictive models can supplement or provide alternatives to E. coli-testing for assessing the potential presence of food safety hazards in water used for produce production. However, these studies used balanced training data and focused on enteric pathogens. As such, research is needed to determine 1) if predictive models can be used to assess Listeria contamination of agricultural water, and 2) how resampling (to deal with imbalanced data) affects performance of these models. To address these knowledge gaps, this study developed models that predict nonpathogenic Listeria spp. (excluding L. monocytogenes) and L. monocytogenes presence in agricultural water using various combinations of learner (e.g., random forest, regression), feature type, and resampling method (none, oversampling, SMOTE). Four feature types were used in model training: microbial, physicochemical, spatial, and weather. “Full models” were trained using all four feature types, while “nested models” used between one and three types. In total, 45 full (15 learners*3 resampling approaches) and 108 nested (5 learners*9 feature sets*3 resampling approaches) models were trained per outcome. Model performance was compared against baseline models where E. coli concentration was the sole predictor. Overall, the machine learning models outperformed the baseline E. coli models, with random forests outperforming models built using other learners (e.g., rule-based learners). Resampling produced more accurate models than not resampling, with SMOTE models outperforming, on average, oversampling models. Regardless of resampling method, spatial and physicochemical water quality features drove accurate predictions for the nonpathogenic Listeria spp. and L. monocytogenes models, respectively. Overall, these findings 1) illustrate the need for alternatives to existing E. coli-based monitoring programs for assessing agricultural water for the presence of potential food safety hazards, and 2) suggest that predictive models may be one such alternative. Moreover, these findings provide a conceptual framework for how such models can be developed in the future with the ultimate aim of developing models that can be integrated into on-farm risk management programs. For example, future studies should consider using random forest learners, SMOTE resampling, and spatial features to develop models to predict the presence of foodborne pathogens, such as L. monocytogenes, in agricultural water when the training data is imbalanced.

Download Full-text

Toward Optimal Heparin Dosing by Comparing Multiple Machine Learning Methods: Retrospective Study (Preprint)

10.2196/preprints.17648 ◽

2019 ◽

Author(s):

Longxiang Su ◽

Chun Liu ◽

Dongkai Li ◽

Jie He ◽

Fanglan Zheng ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Intensive Care Unit ◽

Intensive Care ◽

Data Set ◽

Machine Learning Methods ◽

Heparin Treatment ◽

Neural Network Algorithm ◽

Mimic Iii ◽

Machine Learning Models

BACKGROUND Heparin is one of the most commonly used medications in intensive care units. In clinical practice, the use of a weight-based heparin dosing nomogram is standard practice for the treatment of thrombosis. Recently, machine learning techniques have dramatically improved the ability of computers to provide clinical decision support and have allowed for the possibility of computer generated, algorithm-based heparin dosing recommendations. OBJECTIVE The objective of this study was to predict the effects of heparin treatment using machine learning methods to optimize heparin dosing in intensive care units based on the predictions. Patient state predictions were based upon activated partial thromboplastin time in 3 different ranges: subtherapeutic, normal therapeutic, and supratherapeutic, respectively. METHODS Retrospective data from 2 intensive care unit research databases (Multiparameter Intelligent Monitoring in Intensive Care III, MIMIC-III; e–Intensive Care Unit Collaborative Research Database, eICU) were used for the analysis. Candidate machine learning models (random forest, support vector machine, adaptive boosting, extreme gradient boosting, and shallow neural network) were compared in 3 patient groups to evaluate the classification performance for predicting the subtherapeutic, normal therapeutic, and supratherapeutic patient states. The model results were evaluated using precision, recall, F1 score, and accuracy. RESULTS Data from the MIMIC-III database (n=2789 patients) and from the eICU database (n=575 patients) were used. In 3-class classification, the shallow neural network algorithm performed the best (F1 scores of 87.26%, 85.98%, and 87.55% for data set 1, 2, and 3, respectively). The shallow neural network algorithm achieved the highest F1 scores within the patient therapeutic state groups: subtherapeutic (data set 1: 79.35%; data set 2: 83.67%; data set 3: 83.33%), normal therapeutic (data set 1: 93.15%; data set 2: 87.76%; data set 3: 84.62%), and supratherapeutic (data set 1: 88.00%; data set 2: 86.54%; data set 3: 95.45%) therapeutic ranges, respectively. CONCLUSIONS The most appropriate model for predicting the effects of heparin treatment was found by comparing multiple machine learning models and can be used to further guide optimal heparin dosing. Using multicenter intensive care unit data, our study demonstrates the feasibility of predicting the outcomes of heparin treatment using data-driven methods, and thus, how machine learning–based models can be used to optimize and personalize heparin dosing to improve patient safety. Manual analysis and validation suggested that the model outperformed standard practice heparin treatment dosing.

Download Full-text