Machine learning algorithm to predict delirium from emergency department data

Abstract. Introduction: Delirium is a cerebral dysfunction seen commonly in the acute care setting. Delirium is associated with increased mortality and morbidity and is frequently missed in the emergency department (ED) by clinical gestalt alone. Identifying those at risk of delirium may help prioritize screening and interventions. Objective: Our objective was to identify clinically valuable predictive models for prevalent delirium within the first 24 hours of hospitalization based on the available data by assessing the performance of logistic regression and a variety of machine learning models. Methods: This was a retrospective cohort study to develop and validate a predictive risk model to detect delirium using patient data obtained around an ED encounter. Data from electronic health records for patients hospitalized from the ED between January 1, 2014, and December 31, 2019, were extracted. Eligible patients were aged 65 or older, admitted to an inpatient unit from the emergency department, and had at least one DOSS assessment or CAM-ICU recorded while hospitalized. The outcome measure of this study was delirium within one day of hospitalization determined by a positive DOSS or CAM assessment. We developed the model with and without the Barthel index for activity of daily living, since this was measured after hospital admission. Results: The area under the ROC curves for delirium ranged from 0.69 to 0.77 without the Barthel index. Random forest and gradient-boosted machine showed the highest AUC of 0.77. At the 90% sensitivity threshold, gradient-boosted machine, random forest, and logistic regression achieved a specificity of 35%. After the Barthel index was included, random forest, gradient-boosted machine, and logistic regression models demonstrated the best predictive ability with respective AUCs of 0.85 to 0.86. Conclusion: This study demonstrated the use of machine learning algorithms to identify the combination of variables that are predictive of delirium within 24 hours of hospitalization from the ED.

Implementation of Machine Learning Algorithms for Prediction of Fluidelastic Instability in Tube Arrays

Journal of Pressure Vessel Technology ◽

10.1115/1.4049876 ◽

2021 ◽

Vol 143 (2) ◽

Author(s):

Joaquin E. Moran ◽

Yasser Selima

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Two Phase ◽

Factors Affecting ◽

Logistic Regression Models ◽

Number Of Factors ◽

Tube Arrays ◽

Fluidelastic Instability

Abstract Fluidelastic instability (FEI) in tube arrays has been studied extensively experimentally and theoretically for the last 50 years, due to its potential to cause significant damage in short periods. Incidents similar to those observed at San Onofre Nuclear Generating Station indicate that the problem is not yet fully understood, probably due to the large number of factors affecting the phenomenon. In this study, a new approach for the analysis and interpretation of FEI data using machine learning (ML) algorithms is explored. FEI data for both single and two-phase flows have been collected from the literature and utilized for training a machine learning algorithm in order to either provide estimates of the reduced velocity (single and two-phase) or indicate if the bundle is stable or unstable under certain conditions (two-phase). The analysis included the use of logistic regression as a classification algorithm for two-phase flow problems to determine if specific conditions produce a stable or unstable response. The results of this study provide some insight into the capability and potential of logistic regression models to analyze FEI if appropriate quantities of experimental data are available.

A machine learning approach for identification of gastrointestinal predictors for the risk of COVID-19 related hospitalization

10.1101/2021.08.27.21262728 ◽

2021 ◽

Author(s):

Peter Liptak ◽

Peter Banovcin ◽

Robert Rosolanka ◽

Michal Prokopic ◽

Ivan Kocan ◽

...

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Random Forest ◽

Gastrointestinal Symptoms ◽

Machine Learning Algorithms ◽

Important Predictor ◽

University Hospital ◽

Home Based ◽

Severity Of The Disease ◽

The University

Background and aim: COVID-19 can present with various gastrointestinal symptoms. Shortly after the pandemic outbreak, several machine learning algorithms were implemented to assess new diagnostic and therapeutic methods for this disease. The aim of this study was to assess gastrointestinal and liver-related predictive factors for SARS-CoV-2 associated risk of hospitalization. Methods: Data collection was based on a questionnaire from the COVID-19 outpatient test center and from the emergency department at the University hospital, in combination with data from the internal hospital information system and from the mobile application used for telemedicine follow-up of patients. For statistical analysis, SARS-CoV-2 negative patients were considered as controls to three different SARS-CoV-2 positive patient groups (divided based on severity of the disease). Results: A total of 710 patients were enrolled in the study. The presence of diarrhea and nausea was significantly higher in the emergency department group than in the COVID-19 outpatient test center group. Among liver enzymes, only aspartate transaminase (AST) was significantly elevated in the hospitalized group compared to patients discharged home. Based on the random forest algorithm, AST was identified as the most important predictor, followed by age or diabetes mellitus. Diarrhea and bloating also have predictive importance, although much lower than AST. Conclusion: SARS-CoV-2 positivity is connected with isolated AST elevation, and the level is linked with the severity of the disease. Furthermore, using the machine learning random forest algorithm, we have identified elevated AST as the most important predictor for COVID-19 related hospitalizations.

FLOOD MAPPING USING RANDOM FOREST AND IDENTIFYING THE ESSENTIAL CONDITIONING FACTORS; A CASE STUDY IN FREDERICTON, NEW BRUNSWICK, CANADA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-3-2020-609-2020 ◽

2020 ◽

Vol V-3-2020 ◽

pp. 609-615 ◽

Cited By ~ 1

Author(s):

M. Esfandiari ◽

S. Jabari ◽

H. McGrath ◽

D. Coleman

Keyword(s):

Machine Learning ◽

Random Forest ◽

New Brunswick ◽

Urban Areas ◽

Learning Algorithm ◽

Satellite Image ◽

Machine Learning Algorithms ◽

Slope Aspect ◽

Flood Peak ◽

Conditioning Factors

Abstract. Flooding is one of the most damaging natural hazards in urban areas around the world, including the city of Fredericton, New Brunswick, Canada. Fredericton was recently flooded in two consecutive years, 2018 and 2019. Due to the complicated behaviour of water when a river overflows its banks, estimating the flood extent is challenging. The issue becomes even more challenging when several different factors, such as land texture or surface flatness, affect the water flow with varying degrees of intensity. Recently, machine learning algorithms and statistical methods have been used in many research studies to generate flood susceptibility maps from topographical, hydrological, and geological conditioning factors. One of the major issues that researchers have faced is the complexity and number of features that a machine learning algorithm requires as input to produce acceptable results. In this research, we used Random Forest to model the 2018 flood in Fredericton and analyzed the effect of several combinations of 12 different flood conditioning factors. The factors were tested against a Sentinel-2 optical satellite image available around the flood peak day. The highest accuracy was obtained using only 5 factors, namely altitude, slope, aspect, distance from the river, and land use/cover, with 97.57% overall accuracy and a kappa coefficient of 95.14%.
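The abstract reports both overall accuracy and a kappa coefficient. The sketch below shows how these two figures are commonly derived from a confusion matrix, using the standard definition of Cohen's kappa. The function name `accuracy_and_kappa` and the 2x2 matrix are illustrative assumptions, not the study's results.

```python
def accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("empty confusion matrix")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: sum over classes of (row share * column share).
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical flooded / non-flooded confusion matrix (rows: reference).
toy_confusion = [[45, 5],
                 [10, 40]]
accuracy, kappa = accuracy_and_kappa(toy_confusion)
print(accuracy, kappa)
```

When two reference classes are equally sized, chance agreement is 0.5 and kappa reduces to 2 × accuracy − 1. The reported pair (97.57%, 95.14%) is arithmetically consistent with that case, although the abstract does not state the class balance.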

Predicting Bank Operational Efficiency Using Machine Learning Algorithm: Comparative Study of Decision Tree, Random Forest, and Neural Networks

Advances in Fuzzy Systems ◽

10.1155/2020/8581202 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Peter Appiahene ◽

Yaw Marfo Missah ◽

Ussiph Najim

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Banking Sector ◽

Banking Industry ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Algorithm ◽

And Performance

The financial crisis that hit Ghana from 2015 to 2018 has raised various issues with respect to the efficiency of banks and the safety of depositors in the banking industry. As part of measures to improve the banking sector and restore customers’ confidence, efficiency and performance analysis in the banking industry has become a hot issue. This is because stakeholders have to detect the underlying causes of inefficiencies within the banking industry. Nonparametric methods such as Data Envelopment Analysis (DEA) have been suggested in the literature as a good measure of banks’ efficiency and performance. Machine learning algorithms have also been viewed as a good tool to estimate various nonparametric and nonlinear problems. This paper combines DEA with three machine learning approaches to evaluate bank efficiency and performance using 444 Ghanaian bank branches as Decision Making Units (DMUs). The results were compared with the corresponding efficiency ratings obtained from the DEA. Finally, the prediction accuracies of the three machine learning models were compared. The results suggested that the decision tree (DT) with its C5.0 algorithm provided the best predictive model. It had 100% accuracy in predicting the 134-sample holdout dataset (30% of banks) and a P value of 0.00. The DT was followed closely by the random forest algorithm, with a predictive accuracy of 98.5% and a P value of 0.00, and finally the neural network (86.6% accuracy) with a P value of 0.66. The study concluded that banks in Ghana can use the result of this study to predict their respective efficiencies. All experiments were performed within a simulation environment and conducted in RStudio using R code.

A Daily Covid-19 Cases Prediction System using Data Mining and Machine Learning Algorithm

10.5121/csit.2021.112320 ◽

2021 ◽

Author(s):

Yiqi Jack Gao ◽

Yu Sun

Keyword(s):

Machine Learning ◽

Random Forest ◽

Hospital Admissions ◽

Polynomial Regression ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Policy Makers ◽

Diverse Range ◽

Using Data

The start of 2020 marked the beginning of the deadly COVID-19 pandemic caused by the novel SARS-CoV-2 from Wuhan, China. As of the time of writing, the virus had infected over 150 million people worldwide and resulted in more than 3.5 million deaths. Accurate predictions made through machine learning algorithms can be very useful as a guide for hospitals and policy makers to make adequate preparations and enact effective policies to combat the pandemic. This paper takes a two-pronged approach to analyzing COVID-19. First, the model uses the feature significance of a random forest regressor to select eight of the most significant predictors (date, new tests, weekly hospital admissions, population density, total tests, total deaths, location, and total cases) for predicting daily increases in COVID-19 cases, highlighting potential target areas for efficient pandemic responses. Then it uses machine learning algorithms such as linear regression, polynomial regression, and random forest regression to predict daily COVID-19 cases from this diverse range of predictors; these models proved competent at generating predictions with reasonable accuracy.

Machine learning algorithms can predict tail biting outbreaks in pigs using feeding behaviour records

10.1101/2021.05.11.443554 ◽

2021 ◽

Author(s):

Catherine Ollagnier ◽

Claudia Kasper ◽

Anna Wallenbeck ◽

Linda Keeling ◽

Siavash A Bigdeli

Keyword(s):

Machine Learning ◽

Random Forest ◽

Real Time ◽

Feeding Behaviour ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Algorithm ◽

Tail Biting ◽

Testing Set

Tail biting is a detrimental behaviour that impacts the welfare and health of pigs. Early detection of tail biting precursor signs allows for preventive measures to be taken, thus avoiding the occurrence of the tail biting event. This study aimed to build a machine-learning algorithm for real time detection of upcoming tail biting outbreaks, using feeding behaviour data recorded by an electronic feeder. Prediction capacities of seven machine learning algorithms (e.g., random forest, neural networks) were evaluated from daily feeding data collected from 65 pens originating from 2 herds of grower-finisher pigs (25-100kg), in which 27 tail biting events occurred. Data were divided into training and testing data, either by randomly splitting data into 75% (training set) and 25% (testing set), or by randomly selecting pens to constitute the testing set. The random forest algorithm was able to predict 70% of the upcoming events with an accuracy of 94%, when predicting events in pens for which it had previous data. The detection of events for unknown pens was less sensitive, and the neural network model was able to detect 14% of the upcoming events with an accuracy of 63%. A machine-learning algorithm based on ongoing data collection should be considered for implementation into automatic feeder systems for real time prediction of tail biting events.
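The abstract contrasts two ways of forming the testing set: a random 75%/25% split of the data, and holding out whole pens. The sketch below illustrates the difference under stated assumptions: the record layout (a dict with a `pen` key), the function names and the toy data are hypothetical and not taken from the study.

```python
import random


def split_by_rows(records, test_fraction=0.25, seed=0):
    """Randomly assign individual records to train and test sets.

    The same pen can appear on both sides of the split.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def split_by_pens(records, test_fraction=0.25, seed=0):
    """Hold out whole pens, so test pens are never seen during training."""
    rng = random.Random(seed)
    pens = sorted({record["pen"] for record in records})
    rng.shuffle(pens)
    n_test_pens = max(1, round(len(pens) * test_fraction))
    test_pens = set(pens[:n_test_pens])
    train = [r for r in records if r["pen"] not in test_pens]
    test = [r for r in records if r["pen"] in test_pens]
    return train, test


# Hypothetical toy data: 8 pens with 10 daily feeding records each.
records = [{"pen": pen, "day": day} for pen in range(8) for day in range(10)]

row_train, row_test = split_by_rows(records)
pen_train, pen_test = split_by_pens(records)
print(len(row_train), len(row_test))
print(sorted({r["pen"] for r in pen_test}))
```

A pen-level split estimates performance on pens with no prior data. That is the harder setting described in the abstract, where detection of upcoming events was less sensitive.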

Predicting the Grade of Prostate Cancer Based on a Biparametric MRI Radiomics Signature

Contrast Media & Molecular Imaging ◽

10.1155/2021/7830909 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Li Zhang ◽

Xia Zhe ◽

Min Tang ◽

Jing Zhang ◽

Jialiang Ren ◽

...

Keyword(s):

Prostate Cancer ◽

Machine Learning ◽

Logistic Regression ◽

Learning Algorithm ◽

Area Under The Curve ◽

Machine Learning Algorithms ◽

Support Vector ◽

Low Grade ◽

High Grade ◽

Random Forest Tree

Purpose. This study aimed to investigate the value of biparametric magnetic resonance imaging (bp-MRI)-based radiomics signatures for the preoperative prediction of prostate cancer (PCa) grade compared with visual assessments by radiologists based on the Prostate Imaging Reporting and Data System Version 2.1 (PI-RADS V2.1) scores of multiparametric MRI (mp-MRI). Methods. This retrospective study included 142 consecutive patients with histologically confirmed PCa who underwent mp-MRI before surgery. MRI images were scored and evaluated by two independent radiologists using PI-RADS V2.1. The radiomics workflow was divided into five steps: (a) image selection and segmentation, (b) feature extraction, (c) feature selection, (d) model establishment, and (e) model evaluation. Three machine learning algorithms (random forest tree (RF), logistic regression, and support vector machine (SVM)) were constructed to differentiate high-grade from low-grade PCa. Receiver operating characteristic (ROC) analysis was used to compare the machine learning-based analysis of bp-MRI radiomics models with PI-RADS V2.1. Results. In all, 8 stable radiomics features out of 804 extracted features based on T2-weighted imaging (T2WI) and ADC sequences were selected. Radiomics signatures successfully categorized high-grade and low-grade PCa cases (P < 0.05) in both the training and test datasets. The radiomics model-based RF method (area under the curve, AUC: 0.982; 0.918), logistic regression (AUC: 0.886; 0.886), and SVM (AUC: 0.943; 0.913) in both the training and test cohorts had better diagnostic performance than PI-RADS V2.1 (AUC: 0.767; 0.813) when predicting PCa grade. Conclusions. The results of this clinical study indicate that machine learning-based analysis of bp-MRI radiomic models may be helpful for distinguishing high-grade from low-grade PCa and outperformed the PI-RADS V2.1 scores based on mp-MRI. Among the machine learning algorithms, the RF model performed slightly better.

The Contribution of CD148, CD180 and CD200 Combination in the Diagnosis of Chronic B-Cell Lymphoproliferative Disorders

Blood ◽

10.1182/blood-2021-150810 ◽

2021 ◽

Vol 138 (Supplement 1) ◽

pp. 3520-3520

Author(s):

Laurent Miguet ◽

Caroline Mayeur-Rousse ◽

Alice Eischen ◽

Anne-Cecile Galoisy ◽

Delphine C. M. Rolland ◽

...

Keyword(s):

Machine Learning ◽

Bone Marrow ◽

Flow Cytometry ◽

Logistic Regression ◽

Random Forest ◽

Expression Patterns ◽

Machine Learning Algorithms ◽

Relative Importance ◽

Predictive Values ◽

Molecular Features

Abstract. Introduction: B-cell immunophenotype can be swiftly assessed by flow cytometry on blood samples or bone marrow aspirate specimens. It provides crucial information later refined with histologic, genetic and molecular features to assert accurate diagnosis of chronic B-cell lymphoproliferative disorders (B-CLPD). Besides the Matutes score, we identified additional useful markers, i.e. CD148 and CD180, to classify mantle cell lymphoma (MCL) and marginal zone lymphoma (MZL), respectively. Furthermore, CD200 is known to be highly expressed in chronic lymphoid leukemia (CLL) while absent in MCL. Hypothesis: The determination of CD148, CD180 and CD200 expression on B-cells by flow cytometry on blood samples and/or bone marrow aspirates could be a potent tool to accurately identify B-CLPD. We postulated the existence of the following specific expression patterns in B-CLPD: CD148 dim/CD180 dim/CD200 bright for CLL, CD148 dim/CD180 dim/CD200 dim for lymphoplasmacytic lymphoma (LPL), CD148 bright/CD180 dim/CD200 neg/dim for MCL and CD148 dim/CD180 bright/CD200 dim for MZL. Methods: In a prospective study we investigated the expression of CD148/CD180/CD200 on B-cells from 673 patients at the time of B-CLPD diagnosis in our hospital from 2014 to 2020. We analyzed 440 blood and 233 bone marrow aspirate specimens using a BD FACSCanto II flow cytometry instrument. Based solely on CD148/CD180/CD200 specific expression patterns we postulated a diagnosis of CLL, LPL, MCL or MZL. These postulated diagnoses were later compared with the final diagnoses when all histologic, genetic and molecular features were finalized. Sensitivity, specificity, positive and negative predictive values of the expression profiles were determined. 
In addition, to investigate the relative importance of these three CD markers we then normalized their mean fluorescence intensities (MFI) and applied several supervised machine learning algorithms including Logistic Regression, Random Forest and Light Gradient Boosting Machine (LightGBM). Results: Out of the 673 clinical samples the CD148/CD180/CD200 expression patterns classified 212 specimens as CLL/SLL (30.8%), 160 as LPL (23.8%), 76 as MCL (11.28%) and 169 as MZL (25%). These diagnosis hypotheses were retrospectively compared to the final diagnoses based on all histologic, genetic and molecular features. These diagnosis hypotheses of CLL, LPL, MCL and MZL were consistent with the final diagnosis in 583 out of the 617 corresponding cases (94%) with high positive and negative predictive values. The characteristics of the diagnosis accuracy are detailed in the table below. HCL and FL were not further investigated as their immunophenotypes usually do not overlap with those of other B-CLPD. Seventeen out of 617 patients (17/617, 5.3%) did not display a clear CD148/CD180/CD200 pattern: 9 LPL, 4 CLL and 4 MZL. In sixteen patients (16/617, 5.0%) the diagnosis hypothesis based on this strategy was not confirmed after completion of the exploration including karyotype, MYD88 L265P mutational status, CCND1 overexpression and pathology explorations. We next investigated the relative importance of these 3 markers. We focused on MFI values of CD148, CD180 and CD200 and three categorical "positive or negative" markers (CD5, CD23, FMC7) that were assembled into a composite marker. After Box-Cox normalization of CD148, CD180 and CD200 MFIs, a set of supervised machine learning algorithms including Logistic Regression, Random Forest and Light Gradient Boosting Machine (LightGBM) were applied to the cohort of CLL, LPL, MCL and MZL. 
We established that the highest diagnosis weights were obtained for CD200 in CLL, CD200 and CD148 in MCL (negatively and positively, respectively), and CD180 in MZL. In LPL, CD148, CD180 and CD200 had the highest weights using LightGBM and Random Forest algorithms, while Logistic Regression determined that CD5 and CD23 had the highest (negative) weights. In conclusion, the determination of CD148/CD180/CD200 surface expression patterns by flow cytometry, along with morphology, made it possible to assert an accurate diagnosis hypothesis in CLL, MCL, LPL and MZL with high positive and negative predictive values. Machine learning algorithms made it possible to measure the relative importance of these markers, which could be of great help in case of discordant expression of the main diagnosis markers. Figure 1. Disclosures: No relevant conflicts of interest to declare.

Machine learning in the diagnosis of Myocardial Infarction with Non-Obstructive Coronary Arteries

European Heart Journal ◽

10.1093/eurheartj/ehab724.3067 ◽

2021 ◽

Vol 42 (Supplement_1) ◽

Author(s):

M J Espinosa Pascual ◽

P Vaquero Martinez ◽

V Vaquero Martinez ◽

J Lopez Pais ◽

B Izquierdo Coronel ◽

...

Keyword(s):

Machine Learning ◽

Myocardial Infarction ◽

Support Vector Machine ◽

Logistic Regression ◽

Random Forest ◽

Obstructive Coronary Artery Disease ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Classification Model ◽

Support Vector

Abstract. Introduction: Out of all patients admitted with Myocardial Infarction, 10 to 15% have Myocardial Infarction with Non-Obstructive Coronary Arteries (MINOCA). Classification algorithms based on deep learning substantially exceed traditional diagnostic algorithms. Therefore, numerous machine learning models have been proposed as useful tools for the detection of various pathologies, but to date no study has proposed a diagnostic algorithm for MINOCA. Purpose: The aim of this study was to estimate the diagnostic accuracy of several automated learning algorithms (Support-Vector Machine [SVM], Random Forest [RF] and Logistic Regression [LR]) to discriminate people suffering from MINOCA from those with Myocardial Infarction with Obstructive Coronary Artery Disease (MICAD) at the time of admission and before performing a coronary angiography, whether invasive or not. Methods: A Diagnostic Test Evaluation study was carried out applying the proposed algorithms to a database constituted by 553 consecutive patients admitted to our Hospital with Myocardial Infarction. According to the definitions of the 2016 ESC Position Paper on MINOCA, patients were classified into two groups: MICAD and MINOCA. Out of the total 553 patients, 214 were discarded due to the lack of complete data. The set of machine learning algorithms was trained on 244 patients (training sample: 75%) and tested on 80 patients (test sample: 25%). A total of 64 variables were available for each patient, including demographic, clinical and laboratory features before the angiographic procedure. Finally, the diagnostic precision of each architecture was measured. 
Results: The most accurate classification model was the Random Forest algorithm (Specificity [Sp] 0.88, Sensitivity [Se] 0.57, Negative Predictive Value [NPV] 0.93, Area Under the Curve [AUC] 0.85 [CI 0.83–0.88]), followed by the standard Logistic Regression (Sp 0.76, Se 0.57, NPV 0.92, AUC 0.74) and Support-Vector Machine (Sp 0.84, Se 0.38, NPV 0.90, AUC 0.78) (see graph). The variables that contributed the most to discriminating a MINOCA from a MICAD were the traditional cardiovascular risk factors, biomarkers of myocardial injury, hemoglobin and gender. Results were similar when the 19 patients with Takotsubo syndrome were excluded from the analysis. Conclusion: A prediction system for diagnosing MINOCA before performing coronary angiographies was developed using machine learning algorithms. Results show higher accuracy in diagnosing MINOCA than conventional statistical methods. This study supports the potential of machine learning algorithms in clinical cardiology. However, further studies are required in order to validate our results. Funding Acknowledgement: Type of funding sources: None. Figure: ROC curves of different algorithms.
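The results above are expressed as specificity, sensitivity and negative predictive value. As a small sketch of how these quantities relate to the counts in a 2x2 confusion table, the function below uses the standard definitions. The name `diagnostic_metrics` and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic test metrics from confusion-table counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: non-diseased cases classified correctly/incorrectly.
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
        "ppv": tp / (tp + fp),          # share of positive calls that are right
        "npv": tn / (tn + fn),          # share of negative calls that are right
    }


# Hypothetical counts for a 100-patient test sample.
metrics = diagnostic_metrics(tp=30, fp=10, tn=50, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Predictive values depend on how common the condition is in the sample. A high NPV alongside modest sensitivity, as reported here, is plausible when the target class is a minority of cases.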

ClickbaitTR: Dataset for clickbait detection from Turkish news sites and social media with a comparative analysis via machine learning algorithms

Journal of Information Science ◽

10.1177/01655515211007746 ◽

2021 ◽

pp. 016555152110077

Author(s):

Şura Genç ◽

Elif Surer

Keyword(s):

Machine Learning ◽

Social Media ◽

Logistic Regression ◽

Random Forest ◽

Short Term Memory ◽

Ensemble Classifier ◽

Machine Learning Algorithms ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Clickbait is a strategy that aims to attract people’s attention and direct them to specific content. Clickbait titles, created by the information that is not included in the main content or using intriguing expressions with various text-related features, have become very popular, especially in social media. This study expands the Turkish clickbait dataset that we had constructed for clickbait detection in our proof-of-concept study, written in Turkish. We achieve a 48,060 sample size by adding 8859 tweets and release a publicly available dataset – ClickbaitTR – with its open-source data analysis library. We apply machine learning algorithms such as Artificial Neural Network (ANN), Logistic Regression, Random Forest, Long Short-Term Memory Network (LSTM), Bidirectional Long Short-Term Memory (BiLSTM) and Ensemble Classifier on 48,060 news headlines extracted from Twitter. The results show that the Logistic Regression algorithm has 85% accuracy; the Random Forest algorithm has a performance of 86% accuracy; the LSTM has 93% accuracy; the ANN has 93% accuracy; the Ensemble Classifier has 93% accuracy; and finally, the BiLSTM has 97% accuracy. A thorough discussion is provided for the psychological aspects of clickbait strategy focusing on curiosity and interest arousal. In addition to a successful clickbait detection performance and the detailed analysis of clickbait sentences in terms of language and psychological aspects, this study also contributes to clickbait detection studies with the largest clickbait dataset in Turkish.
