Machine learning: how much does it improve the prediction of unplanned hospital admissions?

Introduction: Risk prediction models can be used to inform decision-making in clinical settings. With large and detailed electronic medical record data, machine learning may improve predictions. The objective of this work is to determine the feasibility and accuracy of machine learning versus logistic regression to predict unplanned hospital admissions. Objectives and Approach: Data from primary care electronic medical records for community-dwelling adults in Alberta, Canada available from the Canadian Primary Care Sentinel Surveillance Network will be linked to acute care administrative health data held by Alberta Health Services. Two regression methods (forward stepwise logistic, LASSO logistic) will be compared with three machine learning methods (classification tree, random forest, gradient boosted trees). Prior primary and acute care use will be used to predict three outcomes: ≥1 unplanned admission within 1 year, ≥1 unplanned admission within 90 days, and ≥1 unplanned admission within 1 year due to an ambulatory care sensitive condition. Results: The results of this work in progress will be presented at the conference. 41,142 patients will have their primary and acute care data linked. We anticipate that the machine learning methods will improve predictive performance but will be more challenging for clinicians and patients to understand, including why a given patient is predicted to be at higher risk. The primary comparison of machine learning and regression methods will be based on positive predictive values corresponding to the top 5% predicted risk threshold, and estimated via 10-fold cross-validation. Conclusion/Implications: This project aims to help researchers decide which statistical methods to use for risk prediction models. When considering machine learning methods, the best approach may be to try multiple methods, compare their predictive accuracy and interpretability, and then choose a final method.
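The primary comparison metric named in the abstract above, positive predictive value (PPV) among patients in the top 5% of predicted risk, could be computed roughly as in the sketch below. This is an illustration only, not the study's code: the function name `ppv_at_top_fraction`, the tie-breaking rule, and the toy data are all assumptions.

```python
import math

def ppv_at_top_fraction(risks, outcomes, fraction=0.05):
    """PPV among the `fraction` of patients with the highest predicted risk.

    risks: predicted probabilities; outcomes: 1 if admitted, else 0.
    Assumptions: ties keep their original order, and at least one
    patient is always flagged.
    """
    if len(risks) != len(outcomes) or not risks:
        raise ValueError("risks and outcomes must be non-empty and equal length")
    n_flagged = max(1, math.ceil(fraction * len(risks)))
    # Sort patient indices from highest to lowest predicted risk.
    order = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    flagged = order[:n_flagged]
    # PPV = admitted patients among those flagged / number flagged.
    return sum(outcomes[i] for i in flagged) / n_flagged

# Hypothetical toy data: 40 patients, so the top 5% is 2 patients.
risks = [i / 40 for i in range(40)]
outcomes = [0] * 38 + [1, 0]  # patient 38 was admitted, patient 39 was not
print(ppv_at_top_fraction(risks, outcomes))  # 0.5
```

In a 10-fold cross-validation, a value like this would be computed on each held-out fold and then averaged; that step is omitted here.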


Machine Learning Methods Applied to the Prediction of Pseudo-nitzschia spp. Blooms in the Galician Rias Baixas (NW Spain)

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10040199 ◽

2021 ◽

Vol 10 (4) ◽

pp. 199

Author(s):

Francisco M. Bellas Aláez ◽

Jesus M. Torres Palenzuela ◽

Evangelos Spyrakos ◽

Luis González Vilas

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Prediction Models ◽

Support Vector ◽

False Alarms ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Rías Baixas ◽

New Algorithms

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.
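The trade-off described above between sensitivity, specificity, and false alarms can be made concrete with confusion-matrix counts. The sketch below uses standard metric definitions; the function name `bloom_metrics` and the counts for the two toy models are hypothetical and are not taken from the paper.

```python
def bloom_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, and precision from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one bloom, one
    non-bloom, and one positive prediction).
    """
    sensitivity = tp / (tp + fn)  # share of real blooms that were predicted
    specificity = tn / (tn + fp)  # share of non-bloom cases correctly rejected
    precision = tp / (tp + fp)    # share of bloom alerts that were real
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "false_alarm_share": 1.0 - precision,  # alerts that were not blooms
    }

# Hypothetical unbalanced dataset: 20 bloom cases, 180 non-bloom cases.
model_a = bloom_metrics(tp=18, fp=54, tn=126, fn=2)   # sensitive, many false alarms
model_b = bloom_metrics(tp=15, fp=15, tn=165, fn=5)   # more balanced
print(model_a)
print(model_b)
```

With rare positives, even a moderate false-positive rate produces many false alarms, which is why precision drops sharply for the more sensitive toy model.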


Machine Learning in Aging: An Example of Developing Prediction Models for Serious Fall Injury in Older Adults

Innovation in Aging ◽

10.1093/geroni/igaa057.859 ◽

2020 ◽

Vol 4 (Supplement_1) ◽

pp. 268-269

Author(s):

Jaime Speiser ◽

Kathryn Callahan ◽

Jason Fanning ◽

Thomas Gill ◽

Anne Newman ◽

...

Keyword(s):

Machine Learning ◽

Older Adults ◽

Random Forest ◽

Decision Tree ◽

Prediction Models ◽

Receiver Operating Curve ◽

Learning Methods ◽

Life Study ◽

Fall Injury ◽

Machine Learning Methods

Abstract. Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty understanding the complex algorithms behind models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. As illustrated with data from the LIFE study, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Machine learning methods may offer improved performance compared to traditional models for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.
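To put the reported areas under the curve in context: the area under the ROC curve equals the probability that a randomly chosen case with the outcome receives a higher predicted score than a randomly chosen case without it, so 0.5 corresponds to chance. The sketch below computes it by direct pairwise comparison. The function name `roc_auc` and the toy scores are assumptions for illustration, not part of the tutorial.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half. Quadratic in the
    number of cases, so suitable only for small examples.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical predicted fall-injury risks and observed outcomes.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(roc_auc(scores, labels))  # 0.75
```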


Advances in Blast-Induced Impact Prediction—A Review of Machine Learning Applications

Minerals ◽

10.3390/min11060601 ◽

2021 ◽

Vol 11 (6) ◽

pp. 601

Author(s):

Nelson K. Dumakor-Dupey ◽

Sampurna Arya ◽

Ankit Jha

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Academic Research ◽

Empirical Models ◽

Rock Breakage ◽

Environmental Implications ◽

Learning Methods ◽

Factors Affecting ◽

Impact Prediction ◽

Machine Learning Methods

Rock fragmentation in mining and construction industries is widely achieved using drilling and blasting technique. The technique remains the most effective and efficient means of breaking down rock mass into smaller pieces. However, apart from its intended purpose of rock breakage, throw, and heave, blasting operations generate adverse impacts, such as ground vibration, airblast, flyrock, fumes, and noise, that have significant operational and environmental implications on mining activities. Consequently, blast impact studies are conducted to determine an optimum blast design that can maximize the desirable impacts and minimize the undesirable ones. To achieve this objective, several blast impact estimation empirical models have been developed. However, despite being the industry benchmark, empirical model results are based on a limited number of factors affecting the outcomes of a blast. As a result, modern-day researchers are employing machine learning (ML) techniques for blast impact prediction. The ML approach can incorporate several factors affecting the outcomes of a blast, and therefore, it is preferred over empirical and other statistical methods. This paper reviews the various blast impacts and their prediction models with a focus on empirical and machine learning methods. The details of the prediction methods for various blast impacts—including their applications, advantages, and limitations—are discussed. The literature reveals that the machine learning methods are better predictors compared to the empirical models. However, we observed that presently these ML models are mainly applied in academic research.


Classification models using circulating neutrophil transcripts can detect unruptured intracranial aneurysm

Journal of Translational Medicine ◽

10.1186/s12967-020-02550-2 ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Kerry E. Poppenberg ◽

Vincent M. Tutino ◽

Lu Li ◽

Muhammad Waqas ◽

Armond June ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Prediction Models ◽

Model Performance ◽

Supervised Machine Learning ◽

Support Vector ◽

Learning Methods ◽

Training Cohort ◽

Network Analyses ◽

Machine Learning Methods

Abstract. Background: Intracranial aneurysms (IAs) are dangerous because of their potential to rupture. We previously found significant RNA expression differences in circulating neutrophils between patients with and without unruptured IAs and trained machine learning models to predict the presence of IA using 40 neutrophil transcriptomes. Here, we aim to develop a predictive model for unruptured IA using neutrophil transcriptomes from a larger population and more robust machine learning methods. Methods: Neutrophil RNA extracted from the blood of 134 patients (55 with IA, 79 IA-free controls) was subjected to next-generation RNA sequencing. In a randomly selected training cohort (n = 94), the Least Absolute Shrinkage and Selection Operator (LASSO) selected transcripts, from which we constructed prediction models via 4 well-established supervised machine-learning algorithms (K-Nearest Neighbors, Random Forest, and Support Vector Machines with Gaussian and cubic kernels). We tested the models in the remaining samples (n = 40) and assessed model performance by receiver-operating-characteristic (ROC) curves. Real-time quantitative polymerase chain reaction (RT-qPCR) of 9 IA-associated genes was used to verify gene expression in a subset of 49 neutrophil RNA samples. We also examined the potential influence of demographics and comorbidities on model prediction. Results: Feature selection using LASSO in the training cohort identified 37 IA-associated transcripts. Models trained using these transcripts had a maximum accuracy of 90% in the testing cohort. The testing performance across all methods had an average area under ROC curve (AUC) = 0.97, an improvement over our previous models. The Random Forest model performed best across both training and testing cohorts. RT-qPCR confirmed expression differences in 7 of 9 genes tested. Gene ontology and IPA network analyses performed on the 37 model genes reflected dysregulated inflammation, cell signaling, and apoptosis processes. In our data, demographics and comorbidities did not affect model performance. Conclusions: We improved upon our previous IA prediction models based on circulating neutrophil transcriptomes by increasing sample size and by implementing LASSO and more robust machine learning methods. Future studies are needed to validate these models in larger cohorts and further investigate the effect of covariates.


learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data

10.1101/2021.12.13.472185 ◽

2021 ◽

Author(s):

Cathy C. Westhues ◽

Henner Simianer ◽

Timothy M. Beissinger

Keyword(s):

Machine Learning ◽

Genomic Prediction ◽

Prediction Models ◽

R Package ◽

Fixed Number ◽

Environmental Data ◽

Weather Data ◽

Learning Methods ◽

Machine Learning Methods ◽

Daily Weather Data

We introduce the R-package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial (MET) breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or can retrieve global meteorological datasets from a NASA database. Daily weather data can be aggregated in daily windows based on naive (for instance, daily windows with a fixed number of days) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient boosted trees, random forests, stacked ensemble models, and multi-layer perceptrons. These prediction models can be evaluated via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with MET experimental data in a user-friendly way. The package is fully open source and accessible on GitHub.
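The "naive" aggregation mentioned above, in which daily weather values are grouped into windows with a fixed number of days, can be illustrated as follows. This is a Python sketch of the general idea only; it is not learnMET's API (the package is written in R), and the function name `aggregate_fixed_windows`, the handling of a trailing partial window, and the temperatures are assumptions.

```python
from statistics import mean

def aggregate_fixed_windows(daily_values, window_days):
    """Average daily values over consecutive windows of `window_days` days.

    Assumption: a trailing partial window is kept and averaged over
    however many days it contains.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    return [
        mean(daily_values[start:start + window_days])
        for start in range(0, len(daily_values), window_days)
    ]

# Hypothetical daily mean temperatures (°C) for one week of a trial.
daily_temperature = [10, 12, 14, 16, 18, 20, 22]
print(aggregate_fixed_windows(daily_temperature, window_days=3))  # [12, 18, 22]
```

A phenological approach would instead place the window boundaries at crop growth stages, so the windows would differ in length between environments.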


Development of a Probabilistic Seismic Performance Assessment Model of Slope Using Machine Learning Methods

Sustainability ◽

10.3390/su12083269 ◽

2020 ◽

Vol 12 (8) ◽

pp. 3269

Author(s):

Shinyoung Kwag ◽

Daegi Hahm ◽

Minkyu Kim ◽

Seunghyun Eem

Keyword(s):

Machine Learning ◽

Regression Analysis ◽

Linear Regression ◽

Seismic Performance ◽

Prediction Models ◽

Linear Regression Analysis ◽

Assessment Model ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

The objective of this study is to propose a model that can predict the seismic performance of a slope relatively accurately and efficiently by using machine learning methods. Probabilistic seismic fragility analyses of the slope had been carried out in other studies, and a closed-form equation for slope seismic performance was proposed through a multiple linear regression analysis. However, the traditional statistical linear regression analysis was limited in that it could not accurately represent such nonlinear slope seismic performance. To overcome this limit, in this study, we used three machine learning methods (i.e., support vector machine (SVM), artificial neural network (ANN), Gaussian process regression (GPR)) to generate prediction models of the slope seismic performance. The models obtained through the machine learning methods generally showed better performance compared to the models of the traditional statistical methods. The results of the SVM showed no significant performance difference compared with the results of the nonlinear regression analysis method, but the results based on the ANN and GPR showed a remarkable improvement in the prediction performance over the other models. Furthermore, this study confirmed that the GPR-based model predicted seismic performance values more accurately than the ANN-based model.


Machine Learning in Aging: An Example of Developing Prediction Models for Serious Fall Injury in Older Adults

The Journals of Gerontology Series A ◽

10.1093/gerona/glaa138 ◽

2020 ◽

Author(s):

Jaime Lynn Speiser ◽

Kathryn E Callahan ◽

Denise K Houston ◽

Jason Fanning ◽

Thomas M Gill ◽

...

Keyword(s):

Machine Learning ◽

Older Adults ◽

Random Forest ◽

Decision Tree ◽

Prediction Models ◽

Learning Methods ◽

Life Study ◽

Fall Injury ◽

Machine Learning Methods ◽

Using Data

Abstract. Background: Advances in computational algorithms and the availability of large datasets with clinically relevant characteristics provide an opportunity to develop machine learning prediction models to aid in diagnosis, prognosis, and treatment of older adults. Some studies have employed machine learning methods for prediction modeling, but skepticism of these methods remains due to lack of reproducibility and difficulty in understanding the complex algorithms that underlie models. We aim to provide an overview of two common machine learning methods: decision tree and random forest. We focus on these methods because they provide a high degree of interpretability. Method: We discuss the underlying algorithms of decision tree and random forest methods and present a tutorial for developing prediction models for serious fall injury using data from the Lifestyle Interventions and Independence for Elders (LIFE) study. Results: Decision tree is a machine learning method that produces a model resembling a flow chart. Random forest consists of a collection of many decision trees whose results are aggregated. In the tutorial example, we discuss evaluation metrics and interpretation for these models. As illustrated using data from the LIFE study, the performance of prediction models for serious fall injury was moderate at best (area under the receiver operating characteristic curve of 0.54 for decision tree and 0.66 for random forest). Conclusions: Machine learning methods offer an alternative to traditional approaches for modeling outcomes in aging, but their use should be justified and output should be carefully described. Models should be assessed by clinical experts to ensure compatibility with clinical practice.


Comparison of Prediction Models for Delays of Trains by using Data Mining and Machine Learning Methods

10.33107/ubt-ic.2018.87 ◽

2018 ◽

Author(s):

Dennis Leser ◽

Matthias Wastian ◽

Matthias Rößler ◽

Michael Landsied

Keyword(s):

Machine Learning ◽

Data Mining ◽

Prediction Models ◽

Learning Methods ◽

Machine Learning Methods ◽

Using Data


Machine Learning Methods Applied to Predict Ventilator-Associated Pneumonia with Pseudomonas aeruginosa Infection via Sensor Array of Electronic Nose in Intensive Care Unit

Sensors ◽

10.3390/s19081866 ◽

2019 ◽

Vol 19 (8) ◽

pp. 1866 ◽

Cited By ~ 13

Author(s):

Liao ◽

Wang ◽

Zhang ◽

Abbod ◽

Shih ◽

...

Keyword(s):

Machine Learning ◽

Intensive Care Unit ◽

Pseudomonas Aeruginosa ◽

Intensive Care ◽

Prediction Models ◽

Support Vector ◽

Ventilator Associated Pneumonia ◽

Learning Methods ◽

Machine Learning Methods ◽

Pseudomonas Aeruginosa Infection

One concern for patients is the off-line detection of pneumonia infection status after ventilator use in the intensive care unit. Hence, machine learning methods for the rapid diagnosis of ventilator-associated pneumonia (VAP) are proposed. A popular device, the Cyranose 320 e-nose, is commonly used in research on lung disease; it is a highly integrated system with an array of 32 sensors made from polymer and carbon black materials. In this study, a total of 24 subjects were involved, 12 of whom were infected with pneumonia and 12 of whom were not. A three-layer back-propagation artificial neural network and support vector machine (SVM) methods were applied to patients' data to predict whether they had VAP with Pseudomonas aeruginosa infection. Furthermore, in order to improve the accuracy and the generalization of the prediction models, the ensemble neural networks (ENN) method was applied. In this study, ENN and SVM prediction models were trained and tested. In order to evaluate the models' performance, a fivefold cross-validation method was applied. The results showed that both the ENN and SVM models have high recognition rates of VAP with Pseudomonas aeruginosa infection, with accuracies of 0.9479 ± 0.0135 and 0.8686 ± 0.0422, sensitivities of 0.9714 ± 0.0131 and 0.9250 ± 0.0423, and positive predictive values of 0.9288 ± 0.0306 and 0.8639 ± 0.0276, respectively. The ENN model showed better performance compared to SVM in the recognition of VAP with Pseudomonas aeruginosa infection. The areas under the receiver operating characteristic curve of the two models were 0.9842 ± 0.0058 and 0.9410 ± 0.0301, respectively, showing that both models are very stable and accurate classifiers. This study aims to provide physicians with a scientific and effective reference for the early detection of Pseudomonas aeruginosa infection or other diseases.
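The fivefold cross-validation mentioned above splits the data into five parts, holds each part out once for testing, and summarizes the five test scores. A minimal sketch follows. The abstract does not say how folds were formed or what the ± values represent, so the contiguous unshuffled folds, the use of the sample standard deviation, and the per-fold accuracies are all assumptions for illustration.

```python
from statistics import mean, stdev

def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds.

    Assumption: no shuffling or stratification; the first
    n_samples % k folds get one extra sample.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

# 24 samples into 5 folds, matching only the subject count in the abstract.
folds = list(kfold_indices(24, 5))
print([len(test) for _, test in folds])  # [5, 5, 5, 5, 4]

# Hypothetical per-fold accuracies, summarized as mean ± standard deviation.
fold_accuracies = [0.95, 0.93, 0.96, 0.94, 0.97]
print(f"{mean(fold_accuracies):.4f} ± {stdev(fold_accuracies):.4f}")
```

With a balanced two-class dataset this small, a stratified split that keeps the class proportions equal in every fold would usually be preferable to the contiguous split shown here.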


Intelligent Road Inspection with Advanced Machine Learning; Hybrid Prediction Models for Smart Mobility and Transportation Maintenance Systems

Energies ◽

10.3390/en13071718 ◽

2020 ◽

Vol 13 (7) ◽

pp. 1718 ◽

Cited By ~ 6

Author(s):

Nader Karballaeezadeh ◽

Farah Zaremotekhases ◽

Shahaboddin Shamshirband ◽

Amir Mosavi ◽

Narjes Nabipour ◽

...

Keyword(s):

Machine Learning ◽

Relative Error ◽

Intelligent Systems ◽

Prediction Models ◽

Rbf Neural Networks ◽

Learning Methods ◽

Machine Learning Methods ◽

Maintenance Systems ◽

Road Inspection ◽

Percent Relative Error

Prediction models in mobility and transportation maintenance systems have been dramatically improved by using machine learning methods. This paper proposes novel machine learning models for intelligent road inspection. Traditional road inspection systems based on the pavement condition index (PCI) are often associated with critical safety, energy, and cost issues. Alternatively, the proposed models utilize surface deflection data from falling weight deflectometer (FWD) tests to predict the PCI. The machine learning methods are single multi-layer perceptron (MLP) and radial basis function (RBF) neural networks as well as their hybrids, i.e., Levenberg–Marquardt (MLP-LM), scaled conjugate gradient (MLP-SCG), imperialist competitive (RBF-ICA), and genetic algorithms (RBF-GA). Furthermore, the committee machine intelligent systems (CMIS) method was adopted to combine the results and improve the accuracy of the modeling. The results of the analysis have been verified using four criteria: average percent relative error (APRE), average absolute percent relative error (AAPRE), root mean square error (RMSE), and standard error (SE). The CMIS model outperforms the other models, with promising results of APRE = 2.3303, AAPRE = 11.6768, RMSE = 12.0056 and SD = 0.0210.
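Three of the error criteria named above can be written out in a few lines. The sketch below uses commonly used definitions, with relative error taken as (observed − predicted) / observed; the paper's exact formulas and sign convention are not given in the abstract, so these definitions, the function names, and the toy PCI values are assumptions.

```python
import math

def percent_relative_errors(observed, predicted):
    """Per-sample percent relative error; assumes no observed value is zero."""
    return [100.0 * (o - p) / o for o, p in zip(observed, predicted)]

def apre(observed, predicted):
    """Average percent relative error (signed, so it reveals bias)."""
    errors = percent_relative_errors(observed, predicted)
    return sum(errors) / len(errors)

def aapre(observed, predicted):
    """Average absolute percent relative error (magnitude only)."""
    errors = percent_relative_errors(observed, predicted)
    return sum(abs(e) for e in errors) / len(errors)

def rmse(observed, predicted):
    """Root mean square error, in the units of the target variable."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical observed and predicted PCI values for three road sections.
observed_pci = [50.0, 80.0, 100.0]
predicted_pci = [55.0, 76.0, 100.0]
print(apre(observed_pci, predicted_pci))
print(aapre(observed_pci, predicted_pci))
print(rmse(observed_pci, predicted_pci))
```

Because positive and negative errors cancel in APRE but not in AAPRE, a small APRE alongside a larger AAPRE indicates errors that are sizeable but roughly unbiased.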
