Development and assessment of a machine learning tool for predicting emergency admission in Scotland

Mapping Intimacies ◽

10.1101/2021.08.06.21261593 ◽

2021 ◽

Author(s):

James Liley ◽

Gergo Bohner ◽

Samuel R Emerson ◽

Bilal A Mateen ◽

Katie Borland ◽

...

Keyword(s):

Machine Learning ◽

Model Updating ◽

Predictive Accuracy ◽

Temporal Stability ◽

Emergency Admission ◽

Healthcare Setting ◽

Emergency Hospital Admission ◽

Scottish Population ◽

Substantial Risk ◽

Learning Score

Avoiding emergency hospital admission (EA) is advantageous to individual health and the healthcare system. We develop a statistical model estimating risk of EA for most of the Scottish population (>4.8M individuals) using electronic health records, such as hospital episodes and prescribing activity. We demonstrate good predictive accuracy (AUROC 0.80), calibration and temporal stability. We find strong prediction of respiratory and metabolic EA, show a substantial risk contribution from socioeconomic decile, and highlight an important problem in model updating. Our work constitutes a rare example of a population-scale machine learning score to be deployed in a healthcare setting.
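As an illustration of the AUROC metric quoted in the abstract above, here is a minimal sketch in plain Python. It computes AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name, labels and scores are hypothetical and are not taken from the paper.

```python
def auroc(labels, scores):
    """AUROC as P(score of random positive > score of random negative).

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = emergency admission, 0 = no admission.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
example_auroc = auroc(labels, scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

The pairwise form is quadratic in the number of cases, so it suits small examples only; at population scale a rank-based implementation would normally be used instead.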

A feature-based hybrid recommender system for risk prediction : Machine learning approach (Preprint)

10.2196/preprints.11010 ◽

2020 ◽

Author(s):

Uzair Bhatti

Keyword(s):

Machine Learning ◽

Risk Prediction ◽

Predictive Accuracy ◽

Correct Diagnosis ◽

Recommendation Systems ◽

Data Integrity ◽

Machine Learning Algorithms ◽

Patient Counseling ◽

Hybrid Filtering ◽

Novel Algorithm

BACKGROUND In the era of health informatics, the exponential growth of information generated by health information systems and healthcare organizations demands expert and intelligent recommendation systems. Such systems have become one of the most valuable tools because they reduce problems such as information overload when selecting and suggesting doctors, hospitals, medicines, diagnoses, etc. according to patients’ interests. OBJECTIVE Hybrid filtering is one of the most popular approaches to recommendation, but its major limitations are selectivity and data integrity issues. Most existing recommendation systems and risk prediction algorithms focus on a single domain; cross-domain hybrid filtering, by contrast, can alleviate selectivity and data integrity problems to a greater extent. METHODS We propose a novel algorithm for recommendation and predictive modeling that uses the KNN algorithm with machine learning algorithms and artificial intelligence (AI). We find the factors that directly impact diseases and propose an approach for predicting the correct diagnosis of different diseases. We have constructed a series of models with good reliability for predicting different surgery complications and identified several novel clinical associations. We propose a novel algorithm, pr-KNN, that uses KNN for prediction and recommendation of diseases. RESULTS We compared the performance of our algorithm with other machine learning algorithms and found that ours performed better, with predictive accuracy improving by 3.61%. CONCLUSIONS The potential to directly integrate these predictive tools into EHRs may enable personalized medicine and decision-making at the point of care for patient counseling and as a teaching tool. CLINICALTRIAL dataset for the trials of patient attached
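The abstract above builds on the KNN algorithm but does not describe pr-KNN itself. The following is a minimal sketch of plain k-nearest-neighbour classification by majority vote, not the authors' pr-KNN. The function name, feature vectors and risk labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to `query`."""
    # Pair each Euclidean distance with its label, then sort by distance.
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature patient records with a risk label.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["low", "low", "low", "high", "high", "high"]

near_low_cluster = knn_predict(train_points, train_labels, (1.1, 1.0))
near_high_cluster = knn_predict(train_points, train_labels, (5.0, 5.0))
```

An odd `k` avoids tied votes when there are two classes, which is why the sketch defaults to `k=3`.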

Development of A Drug Early Warning System Model for Cardiac Arrest Using Deep Learning: Retrospective Cohort Study (Preprint)

10.2196/preprints.26783 ◽

2020 ◽

Author(s):

Hsiao-Ko Chang ◽

Hui-Chih Wang ◽

Chih-Fen Huang ◽

Feipei Lai

Keyword(s):

Machine Learning ◽

Time Series ◽

Cardiac Arrest ◽

Early Warning ◽

Time Series Data ◽

Predictive Accuracy ◽

Vital Signs ◽

Warning System ◽

Series Data ◽

Dynamic Time

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA). OBJECTIVE We seek to develop a Drug Early Warning System Model (DEWSM) that includes drug injections and vital signs as its important features. We use it to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion. METHODS We propose this new model for detecting cardiac arrest via drug classification and a sliding window; we apply learning-based algorithms to time-series data for a DEWSM. By treating drug features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model, we use the area under the receiver operating characteristic curve (AUROC). RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits (intravenous therapy), and replenishers and regulators of water and electrolytes (fluid and electrolyte supplement). The best AUROC for bits is 85%: with bits, a drug feature suggested by the medical experts that affects the vital signs, the model’s ability to correctly classify patients with CPR reaches 85%; that for replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves the accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for the drugs significantly affects prediction. (3) We use a better model: To traditional machine learning, this study adds a new AI technology: the long short-term memory (LSTM) model, which has the best time-series accuracy and is comparable to the traditional random forest (RF) model; both AUROC measures are 85%. It can be seen that the new AI technology achieves results currently comparable in accuracy to the traditional RF, and the LSTM model can be tuned in the future to obtain better results. (4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event. CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. The National Early Warning Score (NEWS) focuses only on the score of vital signs and does not include factors related to drug injections. In this study, the experimental results with drug injections added are better than those with vital signs alone. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, we use traditional machine learning methods and deep learning (with LSTM as the main method for processing time-series data) as the basis for comparison in this research. The proposed DEWSM, which offers 4-hour predictions, is better than the NEWS in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.
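The sliding window mentioned in the abstract above can be sketched as follows. This is a generic windowing helper under assumed parameters, not the authors' pipeline: the function name, window length, step size and heart-rate values are all hypothetical.

```python
def sliding_windows(series, window, step=1):
    """Split a time series into overlapping fixed-length windows."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # The last valid start index is len(series) - window.
    return [
        series[start:start + window]
        for start in range(0, len(series) - window + 1, step)
    ]


# Hypothetical hourly heart-rate readings for one patient.
heart_rate = [80, 82, 85, 90, 110, 130]
windows = sliding_windows(heart_rate, window=4)
```

Each window could then be passed to a classifier as one training example, with the label indicating whether an event followed that window.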

A novel multi-stage ensemble model with multiple K-means-based selective undersampling: An application in credit scoring

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201954 ◽

2021 ◽

Vol 40 (5) ◽

pp. 9471-9484

Author(s):

Yilun Jin ◽

Yanan Liu ◽

Wenyu Zhang ◽

Shuai Zhang ◽

Yu Lou

Keyword(s):

Machine Learning ◽

Predictive Accuracy ◽

Credit Scoring ◽

Imbalanced Data ◽

Ensemble Model ◽

Selective Sampling ◽

Machine Learning Methods ◽

Multi Stage ◽

Proposed Model ◽

New Feature

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.

Application of a Rough Set-Based Inductive Learning System

Fundamenta Informaticae ◽

10.3233/fi-1993-182-409 ◽

1993 ◽

Vol 18 (2-4) ◽

pp. 209-220

Author(s):

Michael Hadjimichael ◽

Anita Wasilewska

Keyword(s):

Machine Learning ◽

Rough Set ◽

Presidential Election ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Inductive Learning ◽

Real Data ◽

Semantic Content ◽

Learning System ◽

Voter Preferences

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.

Prediction of cardiac arrest in critically ill patients presenting to the emergency department using a machine learning score incorporating heart rate variability compared with the modified early warning score

Critical Care ◽

10.1186/cc11396 ◽

2012 ◽

Vol 16 (3) ◽

pp. R108 ◽

Cited By ~ 44

Author(s):

Marcus Eng Hock Ong ◽

Christina Hui Lee Ng ◽

Ken Goh ◽

Nan Liu ◽

Zhi Koh ◽

...

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Heart Rate ◽

Heart Rate Variability ◽

Cardiac Arrest ◽

Critically Ill ◽

Early Warning ◽

Critically Ill Patients ◽

Early Warning Score ◽

Learning Score

A Scalable Feature Selection and Model Updating Approach for Big Data Machine Learning

2016 IEEE International Conference on Smart Cloud (SmartCloud) ◽

10.1109/smartcloud.2016.32 ◽

2016 ◽

Cited By ~ 3

Author(s):

Baijian Yang ◽

Tonglin Zhang

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Big Data ◽

Model Updating

Identifying neuroanatomical signatures of anorexia nervosa: a multivariate machine learning approach

Psychological Medicine ◽

10.1017/s0033291715000768 ◽

2015 ◽

Vol 45 (13) ◽

pp. 2805-2812 ◽

Cited By ~ 15

Author(s):

L. Lavagnino ◽

F. Amianto ◽

B. Mwangi ◽

F. D'Agata ◽

A. Spalatro ◽

...

Keyword(s):

Machine Learning ◽

Anorexia Nervosa ◽

Predictive Accuracy ◽

Third Ventricle ◽

Healthy Controls ◽

Drive For Thinness ◽

Individual Subject ◽

Machine Learning Approach ◽

Scan Data ◽

Selection Operator

Background There are currently no neuroanatomical biomarkers of anorexia nervosa (AN) available to make clinical inferences at an individual subject level. We present results of a multivariate machine learning (ML) approach utilizing structural neuroanatomical scan data to differentiate AN patients from matched healthy controls at an individual subject level. Method Structural neuroimaging scans were acquired from 15 female patients with AN (age = 20, s.d. = 4 years) and 15 demographically matched female controls (age = 22, s.d. = 3 years). Neuroanatomical volumes were extracted using the FreeSurfer software and input into the Least Absolute Shrinkage and Selection Operator (LASSO) multivariate ML algorithm. LASSO was ‘trained’ to identify ‘novel’ individual subjects as either AN patients or healthy controls. Furthermore, the model estimated the probability that an individual subject belonged to the AN group based on an individual scan. Results The model correctly predicted 25 out of 30 subjects, translating into 83.3% accuracy (sensitivity 86.7%, specificity 80.0%) (p < 0.001; χ² test). Six neuroanatomical regions (cerebellum white matter, choroid plexus, putamen, accumbens, the diencephalon and the third ventricle) were found to be relevant in distinguishing individual AN patients from healthy controls. The predicted probabilities showed a linear relationship with drive for thinness clinical scores (r = 0.52, p < 0.005) and with body mass index (BMI) (r = −0.45, p = 0.01). Conclusions The model achieved a good predictive accuracy and drive for thinness showed a strong neuroanatomical signature. These results indicate that neuroimaging scans coupled with ML techniques have the potential to provide information at an individual subject level that might be relevant to clinical outcomes.
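The accuracy, sensitivity and specificity reported in the abstract above can be checked with a short worked example. The confusion-matrix counts below are an assumption: they are inferred from the reported percentages and the two groups of 15 subjects (13 of 15 patients and 12 of 15 controls classified correctly), not stated in the abstract. The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # all correct / all subjects
        "sensitivity": tp / (tp + fn),      # correctly identified patients
        "specificity": tn / (tn + fp),      # correctly identified controls
    }


# Inferred counts (assumption): 13 of 15 AN patients and 12 of 15 controls correct.
metrics = classification_metrics(tp=13, fn=2, tn=12, fp=3)
percentages = {name: round(value * 100, 1) for name, value in metrics.items()}
```

With these counts, 13 + 12 = 25 correct predictions out of 30, which matches the 25 of 30 stated in the abstract.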

Remote sensing inversion of water quality in coastal sea area based on machine learning: a case study of Shenzhen bay, China

10.5194/egusphere-egu21-1972 ◽

2021 ◽

Author(s):

Xiaotong Zhu ◽

Jinhui Jeanne Huang

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Water Quality ◽

Predictive Accuracy ◽

Water Environment ◽

Quality Parameters ◽

Machine Learning Algorithms ◽

Dynamic Monitoring ◽

Support Vector ◽

Seawater Quality

Remote sensing monitoring offers a wide monitoring range, speed and low cost for long-term dynamic monitoring of the water environment. With the rise of artificial intelligence, machine learning has enabled remote sensing inversion of seawater quality to achieve higher prediction accuracy. However, owing to the physicochemical properties of the water quality parameters, the performance of the algorithms varies considerably. To improve the predictive accuracy of seawater quality parameters, we propose a technical framework to identify the optimal machine learning algorithms using Sentinel-2 satellite data and in-situ seawater sample data. In the study, we select three algorithms, i.e. support vector regression (SVR), XGBoost and deep learning (DL), and four seawater quality parameters, i.e. dissolved oxygen (DO), total dissolved solids (TDS), turbidity (TUR) and chlorophyll-a (Chla). The results show that SVR is the more precise algorithm for DO inversion (R² = 0.81). XGBoost has the best accuracy for Chla and TUR inversion (R² = 0.75 and 0.78, respectively), while DL performs better for TDS (R² = 0.789). Overall, this research provides theoretical support for high-precision remote sensing inversion of offshore seawater quality parameters based on machine learning.
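The abstract above compares algorithms by the coefficient of determination. As a minimal sketch, assuming the standard definition of one minus the ratio of residual to total sum of squares, it can be computed as below. The function name and the observed and predicted values are hypothetical, not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical in-situ measurements and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
example_r2 = r_squared(observed, predicted)  # SS_res = 0.10, SS_tot = 5.0
```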

Application of machine learning in predicting construction project profit in Ghana using Support Vector Regression Algorithm (SVRA)

Engineering Construction & Architectural Management ◽

10.1108/ecam-08-2020-0618 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Emmanuel Adinyira ◽

Emmanuel Akoi-Gyebi Adjei ◽

Kofi Agyekum ◽

Frank Desmond Kofi Fugar

Keyword(s):

Machine Learning ◽

Support Vector Regression ◽

Cash Flow ◽

Predictive Accuracy ◽

Model Development ◽

Construction Project ◽

Support Vector ◽

Sensitivity Index ◽

Content Type ◽

Hyperparameter Selection

Purpose Knowledge of the effect of various cash-flow factors on expected project profit is important to effectively manage productivity on construction projects. This study was conducted to develop and test the sensitivity of a Machine Learning Support Vector Regression Algorithm (SVRA) to predict construction project profit in Ghana. Design/methodology/approach The study relied on data from 150 institutional projects executed within the past five years (2014–2018) in developing the model. Eighty percent (80%) of the data from the 150 projects was used at hyperparameter selection and final training phases of the model development and the remaining 20% for model testing. Using MATLAB for Support Vector Regression, the parameters available for tuning were the epsilon values, the kernel scale, the box constraint and standardisations. The sensitivity index was computed to determine the degree to which the independent variables impact the dependent variable. Findings The developed model's predictions perfectly fitted the data and explained all the variability of the response data around its mean. Average predictive accuracy of 73.66% was achieved with all the variables on the different projects in validation. The developed SVR model was sensitive to labour and loan. Originality/value The developed SVRA combines variation, defective works and labour with other financial constraints, which have been the variables used in previous studies. It will aid contractors in predicting profit on completion at commencement and also provide information on the effect of changes to cash-flow factors on profit.
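The 80/20 division of the 150 projects described in the abstract above can be illustrated with a short sketch. The study itself used MATLAB; this Python version only shows the arithmetic of the split. The function name, the random shuffling and the seed are assumptions, since the abstract does not say how projects were assigned to each set.

```python
import random


def train_test_split_indices(n_items, train_fraction=0.8, seed=0):
    """Shuffle item indices and split them into training and test sets."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_items * train_fraction)
    return indices[:n_train], indices[n_train:]


# 150 projects, 80% for hyperparameter selection and training, 20% for testing.
train_idx, test_idx = train_test_split_indices(150)
```

With 150 projects this yields 120 for training and 30 for testing.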

A proof-of-concept study applying machine learning methods to putative risk factors for eating disorders: results from the multi-centre European project on healthy eating

Psychological Medicine ◽

10.1017/s003329172100489x ◽

2021 ◽

pp. 1-10

Author(s):

I. Krug ◽

J. Linardon ◽

C. Greenwood ◽

G. Youssef ◽

J. Treasure ◽

...

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Logistic Regression ◽

Predictive Accuracy ◽

Area Under The Curve ◽

Prediction Rule ◽

Predictive Performance ◽

Individual Risk ◽

European Project ◽

Wide Range

Abstract Background Despite a wide range of proposed risk factors and theoretical models, prediction of eating disorder (ED) onset remains poor. This study undertook the first comparison of two machine learning (ML) approaches [penalised logistic regression (LASSO), and prediction rule ensembles (PREs)] to conventional logistic regression (LR) models to enhance prediction of ED onset and differential ED diagnoses from a range of putative risk factors. Method Data were part of a European Project and comprised 1402 participants, 642 ED patients [52% with anorexia nervosa (AN) and 40% with bulimia nervosa (BN)] and 760 controls. The Cross-Cultural Risk Factor Questionnaire, which assesses retrospectively a range of sociocultural and psychological ED risk factors occurring before the age of 12 years (46 predictors in total), was used. Results All three statistical approaches had satisfactory model accuracy, with an average area under the curve (AUC) of 86% for predicting ED onset and 70% for predicting AN v. BN. Predictive performance was greatest for the two regression methods (LR and LASSO), although the PRE technique relied on fewer predictors with comparable accuracy. The individual risk factors differed depending on the outcome classification (EDs v. non-EDs and AN v. BN). Conclusions Even though the conventional LR performed comparably to the ML approaches in terms of predictive accuracy, the ML methods produced more parsimonious predictive models. ML approaches offer a viable way to modify screening practices for ED risk that balance accuracy against participant burden.
