Early Prediction of Ventilator-associated Pneumonia in Critical Care Patients: a Machine Learning Model

Abstract Background: This study was performed to develop and validate machine learning models for the early detection of ventilator-associated pneumonia (VAP) 24 h before diagnosis, so that patients with VAP can receive early intervention and the occurrence of complications can be reduced. Patients and Methods: This retrospective cohort study was based on the MIMIC-III dataset. The random forest algorithm was applied to construct a base classifier, and the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity and specificity of the prediction model were evaluated. A Clinical Pulmonary Infection Score (CPIS)-based model (threshold value ≥ 3) using the same training and test data sets was used as the control model. Results: A total of 38,515 ventilation durations occurred in 61,532 ICU admissions. VAP occurred in 212 of these durations. We incorporated 42 VAP risk factors on admission and routinely measured vital characteristics and laboratory results. Five-fold cross-validation was performed to evaluate the model performance, and the model achieved an AUC of 84.4%±1.7% on validation, with 74.3%±2.5% sensitivity and 70.7.6%±1.2% specificity, 24 h before the gold standard time (at least 48 h after ventilation). Our VAP machine learning model improved the AUC of the CPIS-based model by almost 25%, and the sensitivity and specificity were also improved by almost 14% and 15%, respectively. Conclusions: We developed and internally validated an automated model of VAP prediction in the MIMIC-III cohort. The VAP prediction model achieved high performance for AUC, sensitivity and specificity, and its performance was superior to that of the CPIS model. External validation and prospective interventional or outcome studies using this prediction model are envisioned as future work.
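
The three metrics reported in this abstract (AUC, sensitivity, specificity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 threshold are assumptions made purely for illustration.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify as positive when score >= threshold, then return
    (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 1 = VAP, 0 = no VAP; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(auc_score(labels, scores))
print(sensitivity_specificity(labels, scores, threshold=0.5))
```

In a cross-validated setting such as the one described, these metrics would be computed on each held-out fold and then summarised as mean ± standard deviation.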

Development of a Machine Learning Model for the Prediction of the Real Time Mortality in Patients in the Intensive Care Unit

10.21203/rs.3.rs-1066192/v1 ◽

2021 ◽

Author(s):

Jaeyoung Yang ◽

Hong-Gook Lim ◽

Wonhyeong Park ◽

Dongseok Kim ◽

Jin Sun Yoon ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care Unit ◽

Intensive Care ◽

Prediction Model ◽

Prediction Models ◽

External Validation ◽

Learning Model ◽

Mortality Prediction ◽

Machine Learning Model ◽

Mortality Prediction Model

Abstract Background: Prediction of mortality in intensive care units is very important. Thus, various mortality prediction models have been developed for this purpose. However, they do not accurately reflect the changing condition of the patient in real time. The aim of this study was to develop and evaluate a machine learning model that predicts short-term mortality in the intensive care unit using four easy-to-collect vital signs. Methods: Two independent retrospective observational cohorts were included in this study. The primary training cohort included the data of 1968 patients admitted to the intensive care unit at the Veterans Health Service Medical Center, Seoul, South Korea, from January 2018 to March 2019. The external validation cohort comprised the records of 409 patients admitted to the medical intensive care unit at Seoul National University Hospital, Seoul, South Korea, from January 2019 to December 2019. Datasets of four vital signs (heart rate, systolic blood pressure, diastolic blood pressure, and peripheral capillary oxygen saturation [SpO2]) measured every hour for 10 h were used for the development of the machine learning model. The performances of mortality prediction models generated using five machine learning algorithms, Random Forest (RF), XGBoost, perceptron, convolutional neural network, and Long Short-Term Memory, were calculated and compared using area under the receiver operating characteristic curve (AUROC) values and an external validation dataset. Results: The machine learning model generated using the RF algorithm showed the best performance. Its AUROC was 0.922, which is much better than the 0.8408 of the Acute Physiology and Chronic Health Evaluation II. To investigate the importance of the variables that influence the performance of the machine learning model, separate models were generated for each observation time or vital sign using the RF algorithm. The machine learning model developed using SpO2 showed the best performance (AUROC, 0.89). Conclusions: The mortality prediction model developed in this study using data from only four types of commonly recorded vital signs is simpler than any existing mortality prediction model. This simple yet powerful new mortality prediction model could be useful for early detection of probable mortality and appropriate medical intervention, especially in rapidly deteriorating patients.
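
The abstract does not say how the hourly measurements were encoded for the models. One plausible encoding, shown here only as a sketch, flattens the last 10 hourly records of the four vital signs into a single 40-value feature vector. The field names, `WINDOW_HOURS` and `window_features` are hypothetical.

```python
# Hypothetical field names for the four vital signs named in the abstract.
VITALS = ("heart_rate", "sbp", "dbp", "spo2")
WINDOW_HOURS = 10  # one record per hour for 10 h, as described


def window_features(hourly_records):
    """Flatten the most recent WINDOW_HOURS hourly records into one
    feature vector, ordered hour by hour, then vital sign by vital sign."""
    if len(hourly_records) < WINDOW_HOURS:
        raise ValueError("need at least %d hourly records" % WINDOW_HOURS)
    window = hourly_records[-WINDOW_HOURS:]
    return [float(rec[name]) for rec in window for name in VITALS]


# Toy example: 12 hours of made-up readings for a single patient.
records = [
    {"heart_rate": 60 + h, "sbp": 120, "dbp": 80, "spo2": 97}
    for h in range(12)
]
features = window_features(records)
print(len(features))
```

A vector built this way could be passed to any tabular classifier. Sequence models such as an LSTM would more naturally take the 10 × 4 window unflattened.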

Machine Learning Model to Identify Sepsis Patients in the Emergency Department: Algorithm Development and Validation

Journal of Personalized Medicine ◽

10.3390/jpm11111055 ◽

2021 ◽

Vol 11 (11) ◽

pp. 1055

Author(s):

Pei-Chen Lin ◽

Kuo-Tai Chen ◽

Huan-Chieh Chen ◽

Md. Mohaimenul Islam ◽

Ming-Chin Lin

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Characteristic Curve ◽

External Validation ◽

Model Performance ◽

Learning Model ◽

Gradient Boosting ◽

Machine Learning Model ◽

Extreme Gradient Boosting ◽

Development And Validation

Accurate stratification of sepsis can effectively guide the triage of patient care and shared decision making in the emergency department (ED). However, previous research on sepsis identification models focused mainly on ICU patients, and discrepancies in model performance between the development and external validation datasets are rarely evaluated. The aim of our study was to develop and externally validate a machine learning model to stratify sepsis patients in the ED. We retrospectively collected clinical data from two geographically separate institutes that provided a different level of care at different time periods. The Sepsis-3 criteria were used as the reference standard in both datasets for identifying true sepsis cases. An eXtreme Gradient Boosting (XGBoost) algorithm was developed to stratify sepsis patients, and the performance of the model was compared with two traditional clinical sepsis tools: quick Sequential Organ Failure Assessment (qSOFA) and Systemic Inflammatory Response Syndrome (SIRS). There were 8296 patients (1752 (21%) being septic) in the development and 1744 patients (506 (29%) being septic) in the external validation datasets. The mortality of septic patients in the development and validation datasets was 13.5% and 17%, respectively. In the internal validation, XGBoost achieved an area under the receiver operating characteristic curve (AUROC) of 0.86, exceeding SIRS (0.68) and qSOFA (0.56). The performance of XGBoost deteriorated in the external validation (the AUROC of XGBoost, SIRS and qSOFA was 0.75, 0.57 and 0.66, respectively). Heterogeneity in patient characteristics, such as sepsis prevalence, severity, age, comorbidity and infection focus, could reduce model performance. Our model showed good discriminative capabilities for the identification of sepsis patients and outperformed the existing sepsis identification tools. Implementation of the ML model in the ED can facilitate timely sepsis identification and treatment.
However, dataset discrepancies should be carefully evaluated before implementing the ML approach in clinical practice. This finding reinforces the necessity for future studies to perform external validation to ensure the generalisability of any developed ML approaches.

AN EFFICIENT MACHINE LEARNING MODEL FOR PREDICTION OF ACUTE MYOCARDIAL INFARCTION

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666200325104317 ◽

2020 ◽

Vol 13 ◽

Author(s):

Dhilsath Fathima.M ◽

S. Justin Samuel ◽

R. Hari Haran

Keyword(s):

Machine Learning ◽

Myocardial Infarction ◽

Acute Myocardial Infarction ◽

Logistic Regression ◽

Decision Tree ◽

Learning Model ◽

Training Dataset ◽

Data Set ◽

Machine Learning Model ◽

Proposed Model

Aim: This work aims to develop an improved and robust machine learning model for predicting myocardial infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning-based computer-aided analysis system for early and accurate prediction of MI, using the Framingham Heart Study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to replace missing values in the data set, then applies principal component analysis (PCA) to extract the optimal features and enhance the performance of the classifiers. After PCA, the reduced features are partitioned into a training dataset and a testing dataset: 70% of the data is given as input to four popular classifiers (support vector machine, k-nearest neighbor, logistic regression, and decision tree) for training, and the remaining 30% is used to evaluate the output of the machine learning model using the confusion matrix, classifier accuracy, precision, sensitivity, F1-score, and AUC-ROC curve. Results: The outputs of the classifiers were evaluated using these performance measures. We observed that logistic regression provides higher accuracy than the k-NN, SVM, and decision tree classifiers, and that PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the classifiers (Figures 4 and 5) show that logistic regression exhibits a good AUC-ROC score, around 70%, compared with the k-NN and decision tree algorithms.
Conclusion: From the result analysis, we infer that the proposed machine learning model can act as an optimal decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine learning-based prediction models. It can predict the presence of acute myocardial infarction in humans using heart disease risk factors, helping to decide when to start lifestyle modification and medical treatment to prevent heart disease.
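
Two of the preprocessing steps named in the Methods, mean imputation and the 70/30 partition, can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: the helper names, the use of `None` for a missing value, and the fixed random seed are assumptions. The PCA and classifier steps are omitted.

```python
import random


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            raise ValueError("column %d has no observed values" % j)
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def split_70_30(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Toy data (hypothetical): two risk-factor columns with missing entries.
rows = [[1.0, None], [3.0, 4.0], [None, 8.0]]
print(mean_impute(rows))

train, test = split_70_30([[float(i)] for i in range(10)])
print(len(train), len(test))
```

The sketch imputes before splitting, in the order the abstract describes. A common alternative is to compute the column means on the training part only, so that no information from the test part influences preprocessing.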

MODES: model-based optimization on distributed embedded systems

Machine Learning ◽

10.1007/s10994-021-06014-6 ◽

2021 ◽

Author(s):

Junjie Shi ◽

Jiang Bian ◽

Jakob Richter ◽

Kuan-Hsun Chen ◽

Jörg Rahnenführer ◽

...

Keyword(s):

Machine Learning ◽

Embedded Systems ◽

Learning Model ◽

Black Box ◽

Distributed Embedded Systems ◽

Data Set ◽

Individual Model ◽

Model Based ◽

Machine Learning Model ◽

Distributed Machine Learning

Abstract The predictive performance of a machine learning model highly depends on the corresponding hyper-parameter setting. Hence, hyper-parameter tuning is often indispensable. Normally such tuning requires the dedicated machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low bandwidth connections, it reduces the time available for tuning. Model-Based Optimization (MBO) is one state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning lacks research. This work proposes a framework, MODES, that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data. The goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) MODES-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) MODES-I considers all models as clones of the same black box, which allows it to efficiently parallelize the optimization in a distributed setting. We evaluate MODES by conducting experiments on the optimization of the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in terms of mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability for both modes, MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set.

Development of a machine learning model for predicting pediatric mortality in the early stages of intensive care unit admission

Scientific Reports ◽

10.1038/s41598-020-80474-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Bongjin Lee ◽

Kyunghoon Kim ◽

Hyejin Hwang ◽

You Sun Kim ◽

Eun Hee Chung ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care Unit ◽

Intensive Care ◽

Validation Cohort ◽

External Validation ◽

Learning Model ◽

Derivation Cohort ◽

Icu Admission ◽

Early Stages ◽

Machine Learning Model

Abstract The aim of this study was to develop a predictive model of pediatric mortality in the early stages of intensive care unit (ICU) admission using machine learning. Patients less than 18 years old who were admitted to ICUs at four tertiary referral hospitals were enrolled. Three hospitals were designated as the derivation cohort for machine learning model development and internal validation, and the other hospital was designated as the validation cohort for external validation. We developed a random forest (RF) model that predicts pediatric mortality within 72 h of ICU admission, evaluated its performance, and compared it with the Pediatric Index of Mortality 3 (PIM 3). The area under the receiver operating characteristic curve (AUROC) of the RF model was 0.942 (95% confidence interval [CI] = 0.912–0.972) in the derivation cohort and 0.906 (95% CI = 0.900–0.912) in the validation cohort. In contrast, the AUROC of PIM 3 was 0.892 (95% CI = 0.878–0.906) in the derivation cohort and 0.845 (95% CI = 0.817–0.873) in the validation cohort. The RF model in our study showed improved predictive performance in terms of both internal and external validation and was superior even when compared to PIM 3.

Generation of geometric interpolations of building types with deep variational autoencoders

Design Science ◽

10.1017/dsj.2020.31 ◽

2020 ◽

Vol 6 ◽

Author(s):

Jaime de Miguel Rodríguez ◽

Maria Eugenia Villafañe ◽

Luka Piškorec ◽

Fernando Sancho Caparrini

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Large Data ◽

Learning Model ◽

Large Data Sets ◽

Data Sets ◽

Connectivity Map ◽

Data Set ◽

3D Objects ◽

Machine Learning Model

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.

Development and external validation of a deep learning-based computed tomography classification system for COVID-19

10.31219/osf.io/j6xhb ◽

2021 ◽

Author(s):

Yuki KATAOKA

Keyword(s):

Machine Learning ◽

Computed Tomography ◽

Interstitial Pneumonia ◽

External Validation ◽

Usual Interstitial Pneumonia ◽

Evaluation Process ◽

High Sensitivity ◽

Learning Model ◽

Machine Learning Model ◽

Ablation Study

Rationale: Currently available machine learning models for diagnosing COVID-19 based on computed tomography (CT) images are limited due to concerns regarding methodological flaws or underlying biases in the evaluation process. Objectives: We aimed to develop and externally validate a novel machine learning model that can classify CT image findings as positive or negative for SARS-CoV-2 reverse transcription polymerase chain reaction (RT-PCR). Methods: We used 3128 images from a wide variety of two-gate data sources for the development and ablation study of the machine learning model. A total of 633 COVID-19 cases and 2295 non-COVID-19 cases were included in the study. We randomly divided cases into a development set and ablation set at a ratio of 8:2. For the ablation study, we used another dataset including 150 cases of interstitial pneumonia among non-COVID-19 images. For external validation, we used 893 images from 740 consecutive patients at 11 acute care hospitals suspected of having COVID-19 at the time of diagnosis. The dataset included 343 COVID-19 patients. The reference standard was RT-PCR. Results: In the ablation study using interstitial pneumonia images, the specificity of the model was 0.986 for the usual interstitial pneumonia pattern, 0.820 for the non-specific interstitial pneumonia pattern, and 0.400 for the organizing pneumonia pattern. In the external validation study, the sensitivity and specificity of the model were 0.869 and 0.432, respectively, at the low-level cutoff, and 0.724 and 0.721, respectively, at the high-level cutoff. Conclusions: Our machine learning model exhibited a high sensitivity in external validation datasets and may assist physicians to rule out COVID-19 diagnosis in a timely manner. Further studies are warranted to improve model specificity.

Machine Learning Model for Outcome Prediction of Patients Suffering from Acute Diverticulitis Arriving at the Emergency Department—A Proof of Concept Study

Diagnostics ◽

10.3390/diagnostics11112102 ◽

2021 ◽

Vol 11 (11) ◽

pp. 2102

Author(s):

Eyal Klang ◽

Robert Freeman ◽

Matthew A. Levin ◽

Shelly Soffer ◽

Yiftach Barash ◽

...

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Acute Diverticulitis ◽

External Validation ◽

White Blood Cells ◽

Learning Model ◽

Complicated Diverticulitis ◽

Gradient Boosting ◽

Proof Of Concept ◽

Machine Learning Model

Background & Aims: We aimed to identify specific emergency department (ED) risk factors for developing complicated acute diverticulitis (AD) and to evaluate a machine learning (ML) model for predicting complicated AD. Methods: We analyzed data retrieved from unselected consecutive large bowel AD patients from five hospitals from the Mount Sinai health system, NY. The study time frame was from January 2011 through March 2021. Data were used to train and evaluate a gradient-boosting machine learning model to identify patients with complicated diverticulitis, defined as a need for invasive intervention or in-hospital mortality. The model was trained and evaluated on data from four hospitals and externally validated on held-out data from the fifth hospital. Results: The final cohort included 4997 AD visits. Of them, 129 (2.9%) visits had complicated diverticulitis. Patients with complicated diverticulitis were more likely to be men, black, and arrive by ambulance. Regarding laboratory values, patients with complicated diverticulitis had higher levels of absolute neutrophils (AUC 0.73), white blood cells (AUC 0.70), platelet count (AUC 0.68) and lactate (AUC 0.61), and lower levels of albumin (AUC 0.69), chloride (AUC 0.64), and sodium (AUC 0.61). In the external validation cohort, the ML model showed AUC 0.85 (95% CI 0.78–0.91) for predicting complicated diverticulitis. For Youden’s index, the model showed a sensitivity of 88% with a false positive rate of 1:3.6. Conclusions: An ML model trained on clinical measures provides a proof of concept performance in predicting complications in patients presenting to the ED with AD. Clinically, it implies that an ML model may classify low-risk patients to be discharged from the ED for further treatment under an ambulatory setting.
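
The operating point "for Youden's index" refers to the threshold that maximises J = sensitivity + specificity − 1. A minimal sketch of that selection is given below. The function name and the toy labels and scores are assumptions for illustration and are not taken from the study.

```python
def youden_threshold(labels, scores):
    """Scan every distinct score as a candidate threshold (positive when
    score >= threshold) and return (J, threshold, sensitivity, specificity)
    for the threshold with the largest Youden's J."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best


# Toy data (hypothetical): 1 = complicated diverticulitis, 0 = uncomplicated.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

j, threshold, sens, spec = youden_threshold(labels, scores)
print(j, threshold, sens, spec)
```

Youden's index weights sensitivity and specificity equally. A screening setting that prioritises ruling out disease might instead fix a minimum sensitivity and accept the resulting specificity.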

Assessing the International Transferability of a Machine Learning Model for Detecting Medication Error in the General Internal Medicine Clinic: Multicenter Preliminary Validation Study

JMIR Medical Informatics ◽

10.2196/23454 ◽

2021 ◽

Vol 9 (1) ◽

pp. e23454

Author(s):

Yen Po Harvey Chin ◽

Wenyu Song ◽

Chia En Lien ◽

Chang Ho Yoon ◽

Wei-Chen Wang ◽

...

Keyword(s):

Machine Learning ◽

Medication Error ◽

Model Performance ◽

Learning Model ◽

Study Cohort ◽

Learning Approach ◽

Hospital Data ◽

General Internal ◽

Machine Learning Model ◽

Outpatient Prescriptions

Background Although most current medication error prevention systems are rule-based, these systems may result in alert fatigue because of poor accuracy. Previously, we had developed a machine learning (ML) model based on Taiwan’s local databases (TLD) to address this issue. However, the international transferability of this model is unclear. Objective This study examines the international transferability of a machine learning model for detecting medication errors and whether the federated learning approach could further improve the accuracy of the model. Methods The study cohort included 667,572 outpatient prescriptions from 2 large US academic medical centers. Our ML model was applied to build the original model (O model), the local model (L model), and the hybrid model (H model). The O model was built using the data of 1.34 billion outpatient prescriptions from TLD. A validation set with 8.98% (60,000/667,572) of the prescriptions was first randomly sampled, and the remaining 91.02% (607,572/667,572) of the prescriptions served as the local training set for the L model. With a federated learning approach, the H model used the association values with a higher frequency of co-occurrence among the O and L models. A testing set with 600 prescriptions was classified as substantiated and unsubstantiated by 2 independent physician reviewers and was then used to assess model performance. Results The interrater agreement was significant in terms of classifying prescriptions as substantiated and unsubstantiated (κ=0.91; 95% CI 0.88 to 0.95). With thresholds ranging from 0.5 to 1.5, the alert accuracy ranged from 75%-78% for the O model, 76%-78% for the L model, and 79%-85% for the H model. Conclusions Our ML model has good international transferability among US hospital data. Using the federated learning approach with local hospital data could further improve the accuracy of the model.
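
The interrater agreement above is reported as κ. The abstract does not name the variant, but for two reviewers it is commonly Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The sketch below assumes that variant. The function name and the toy ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal rate, summed over categories.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Toy ratings (hypothetical): "S" = substantiated, "U" = unsubstantiated.
reviewer_1 = ["S"] * 5 + ["U"] * 5
reviewer_2 = ["S"] * 4 + ["U"] * 6

print(cohens_kappa(reviewer_1, reviewer_2))
```

In this toy case the reviewers agree on 9 of 10 items, chance agreement is 0.5, and kappa is therefore lower than the raw agreement rate.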

Identifying Factors Associated With Severe Intellectual Disabilities in Teenagers With Cerebral Palsy Using a Predictive Learning Model

Journal of Child Neurology ◽

10.1177/0883073818822358 ◽

2019 ◽

Vol 34 (4) ◽

pp. 221-229 ◽

Cited By ~ 6

Author(s):

Carlo M. Bertoncelli ◽

Paola Altamura ◽

Edgar Ramos Vieira ◽

Domenico Bertoncelli ◽

Susanne Thummler ◽

...

Keyword(s):

Machine Learning ◽

Cerebral Palsy ◽

Intellectual Disability ◽

Prediction Model ◽

Motor Skills ◽

Adaptive Functioning ◽

Learning Model ◽

Intelligence Scale ◽

Factors Associated ◽

Machine Learning Model

Background: Intellectual disability and impaired adaptive functioning are common in children with cerebral palsy, but there is a lack of studies assessing these issues in teenagers with cerebral palsy. Therefore, the aim of this study was to develop and test a predictive machine learning model to identify factors associated with intellectual disability in teenagers with cerebral palsy. Methods: This was a multicenter controlled cohort study of 91 teenagers with cerebral palsy (53 males, 38 females; mean age ± SD = 17 ± 1 y; range: 12-18 y). Data on etiology, diagnosis, spasticity, epilepsy, clinical history, communication abilities, behaviors, motor skills, eating, and drinking abilities were collected between 2005 and 2015. Intellectual disability was classified as “mild,” “moderate,” “severe,” or “profound” based on adaptive functioning, and according to the DSM-5 after 2013 and DSM-IV before 2013, the Wechsler Intelligence Scale for Children for patients up to ages 16 years, 11 months, and the Wechsler Adult Intelligence Scale for patients ages 17-18. Statistical analysis included Fisher’s exact test and multiple logistic regressions to identify factors associated with intellectual disability. A predictive machine learning model was developed to identify factors associated with having profound intellectual disability. The guidelines of the “Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis Statement” were followed. Results: Poor manual abilities (P ≤ .001), gross motor function (P ≤ .001), and type of epilepsy (intractable: P = .04; well controlled: P = .01) were significantly associated with profound intellectual disability. The average model accuracy, specificity, and sensitivity was 78%. Conclusion: Poor motor skills and epilepsy were associated with profound intellectual disability.
The machine learning prediction model was able to adequately identify high likelihood of severe intellectual disability in teenagers with cerebral palsy.
