Development and validation of a predictive model for 90-day readmission following elective spine surgery

2018 ◽  
Vol 29 (3) ◽  
pp. 327-331 ◽  
Author(s):  
Scott L. Parker ◽  
Ahilan Sivaganesan ◽  
Silky Chotai ◽  
Matthew J. McGirt ◽  
Anthony L. Asher ◽  
...  

OBJECTIVE Hospital readmissions lead to a significant increase in the total cost of care in patients undergoing elective spine surgery. Understanding factors associated with an increased risk of postoperative readmission could facilitate a reduction in such occurrences. The aims of this study were to develop and validate a predictive model for 90-day hospital readmission following elective spine surgery. METHODS All patients undergoing elective spine surgery for degenerative disease were enrolled in a prospective longitudinal registry. All 90-day readmissions were prospectively recorded. For predictive modeling, covariates were selected by choosing variables significantly associated with readmission and by incorporating other relevant variables based on clinical intuition and the Akaike information criterion. Eighty percent of the sample was randomly selected for model development and 20% for model validation. Multiple logistic regression analysis was performed with Bayesian model averaging (BMA) to model the odds of 90-day readmission. Goodness of fit was assessed via the C-statistic, that is, the area under the receiver operating characteristic curve (AUC), using the training data set. Discrimination (predictive performance) was assessed using the C-statistic applied to the 20% validation data set. RESULTS A total of 2803 consecutive patients were enrolled in the registry, and their data were analyzed for this study. Of this cohort, 227 (8.1%) patients were readmitted to the hospital (for any cause) within 90 days postoperatively. Variables significantly associated with an increased risk of readmission were as follows (OR [95% CI]): lumbar surgery 1.8 [1.1–2.8], government-issued insurance 2.0 [1.4–3.0], hypertension 2.1 [1.4–3.3], prior myocardial infarction 2.2 [1.2–3.8], diabetes 2.5 [1.7–3.7], and coagulation disorder 3.1 [1.6–5.8]. These variables, in addition to others determined a priori to be clinically relevant, comprised 32 inputs in the predictive model constructed using BMA. The AUC value was 0.77 for model development and 0.76 for model validation. CONCLUSIONS Identification of high-risk patients is feasible with the novel predictive model presented herein. Appropriate allocation of resources to reduce the postoperative incidence of readmission may reduce the readmission rate and the associated health care costs.
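
For readers less familiar with this kind of split-sample workflow, the sketch below shows the general shape of an 80/20 development/validation split with logistic regression and AUC reporting in Python. The column names, the file path, and the omission of Bayesian model averaging are all simplifications for illustration, not the authors' actual pipeline.

```python
# Minimal sketch of the 80/20 split-sample logistic-regression workflow described
# above. Column names and the CSV path are placeholders; the Bayesian model
# averaging step used in the study is omitted here for brevity.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("registry.csv")  # hypothetical registry export
predictors = ["lumbar_surgery", "government_insurance", "hypertension",
              "prior_mi", "diabetes", "coagulation_disorder"]  # subset of the 32 inputs
X, y = df[predictors], df["readmit_90d"]

# 80% development / 20% validation split, as in the study design
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("training AUC:  ", roc_auc_score(y_train, model.predict_proba(X_train)[:, 1]))
print("validation AUC:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))
```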

2010 ◽  
Vol 2 (2) ◽  
pp. 38-51 ◽  
Author(s):  
Marc Halbrügge

Keep it simple - A case study of model development in the context of the Dynamic Stocks and Flows (DSF) task
This paper describes the creation of a cognitive model submitted to the ‘Dynamic Stocks and Flows’ (DSF) modeling challenge. This challenge aims at comparing computational cognitive models of human behavior during an open-ended control task. Participants in the modeling competition were provided with a simulation environment and training data for benchmarking their models, while the actual specification of the competition task was withheld. To meet this challenge, the cognitive model described here was designed and optimized for generalizability. Only two simple assumptions about human problem solving were used to explain the empirical findings in the training data. In-depth analysis of the data set prior to the development of the model led to the dismissal of correlations and other parametric statistics as goodness-of-fit indicators. A new statistical measure based on rank orders and sequence-matching techniques is proposed instead. This measure, when applied to the human sample, also identifies clusters of subjects that use different strategies for the task. The acceptability of the fits achieved by the model is verified using permutation tests.
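
The paper's exact statistic is not reproduced in this listing, but a minimal sketch of what a rank-order, sequence-matching fit measure can look like is given below; the reduction to ordinal ranks and the longest-matching-subsequence ratio are illustrative choices, not the author's definitions.

```python
# Illustrative sketch of a rank-order, sequence-matching fit measure in the spirit
# of the one proposed above. Both series are reduced to ordinal ranks and compared
# with a matching-subsequence ratio, which ignores absolute scale and rewards
# similar orderings over time.
from difflib import SequenceMatcher
from scipy.stats import rankdata

def rank_sequence_similarity(model_series, human_series):
    model_ranks = tuple(rankdata(model_series, method="ordinal"))
    human_ranks = tuple(rankdata(human_series, method="ordinal"))
    return SequenceMatcher(None, model_ranks, human_ranks).ratio()

# Toy example: two short control trajectories with a similar ordering
print(rank_sequence_similarity([1.0, 3.2, 2.1, 5.0], [0.9, 2.8, 2.9, 4.7]))
```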


Blood ◽  
2006 ◽  
Vol 108 (11) ◽  
pp. 460-460
Author(s):  
Gary H. Lyman ◽  
David C. Dale ◽  
Nicole M. Kuderer ◽  
Debra A. Wolff ◽  
Eva Culakova ◽  
...  

Abstract Anemia represents the most common hematological toxicity in cancer patients receiving chemotherapy and is associated with considerable morbidity and cost. ASH/ASCO guidelines call for intervention at a hemoglobin (hgb)<10 g/dL. A meta-analysis has demonstrated the clinical value of early (hgb≥10 g/dL) versus late (hgb<10 g/dL) intervention with an erythroid stimulating protein (ESP). An anemia predictive model may help guide intervention sufficiently early in the course of chemotherapy when it can be most effective. A prospective, nationwide study was undertaken to develop and validate risk models for hematologic toxicities of chemotherapy. The analysis presented here is based on 3,640 patients with cancer of the breast, lung, colon and ovary or malignant lymphoma receiving a new regimen prospectively registered at 117 randomly selected U.S. practices. A logistic regression model for hgb<10 g/dL was developed and validated using a 2:1 random selection split sample methodology. Predictive performance characteristics were estimated [±95% CL]. Nadir hgb over 4 cycles of chemotherapy was <8 g/dL in 113 (3%), 8–10 in 959 (26%), 10–12 in 1,847 (51%), and ≥12 in 721 (20%). No significant differences were observed between the two populations. Independent risk factors for nadir hgb<10 g/dL (ORs) were: female gender (1.66); ECOG >1 (1.70); CHF (1.54); history of vascular disease (2.66); ulcer disease (2.58); COPD (1.29); connective tissue disease (1.84); advanced cancer stage (1.19); cancer type and chemotherapy based on anthracyclines (2.15), carboplatin (2.40), gemcitabine (2.48), cyclophosphamide (1.60), etoposide (2.84), topotecan (4.21), or trastuzumab (1.43), planned cycle length >1 week (2.0), while normal baseline hemoglobin, platelet count and GFR were associated with a reduced risk. Model fit was good (P<.001), R2 = 0.35 and c-statistic = 0.81 [.79–.83, P<.0001]. Mean and median predicted risk for hgb<10 g/dL were 0.29 and 0.22, respectively. An increasing risk cutpoint was associated with lower sensitivity and higher specificity. In the highest risk half, quarter and quintile of patients, hgb<10 g/dL was experienced by 47% [45–50], 64% [60–68], and 70% [65–74], respectively. Model performance characteristics at the median risk included: sensitivity: 82% [78–84]; specificity: 64% [62–66]; and diagnostic odds ratio: 7.80 [6.28–9.68]. Most covariates significant in the derivation model remained significant in the validation population. Model fit was good [P<.001] with an R2=.40 and a c-statistic of 0.83 [.81–.86; P<.001]. In the highest risk half, quarter and quintile of patients, hgb<10 g/dL was experienced by 50% [46–54], 67% [62–71], and 70% [65–75], respectively. Test performance of the validation model at the median risk included: sensitivity of 83% [79–86], specificity of 62% [59–66], and a diagnostic odds ratio of 7.90 [5.84–10.69]. Based on good performance characteristics, this validated prediction model identified chemotherapy patients at increased risk for developing clinically significant anemia who may be candidates for early targeted intervention with an ESP. A conditional risk model for subsequent risk of hgb<10 g/dL which includes changes during cycle 1 of chemotherapy has also been developed and will be presented.
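
As a companion to the reported test characteristics, the sketch below shows how sensitivity, specificity, and the diagnostic odds ratio can be computed at a chosen risk cutpoint such as the median predicted risk; the risk scores and outcomes are randomly generated placeholders rather than study data.

```python
# Sketch of the test characteristics reported above (sensitivity, specificity,
# diagnostic odds ratio) evaluated at a chosen risk cutpoint, e.g. the median
# predicted risk. `risk` and `outcome` below are placeholder arrays standing in
# for predicted probabilities and observed nadir hgb < 10 g/dL indicators.
import numpy as np

def test_characteristics(risk, outcome, cutpoint):
    risk, outcome = np.asarray(risk), np.asarray(outcome).astype(bool)
    positive = risk >= cutpoint                 # classified as high risk
    tp = np.sum(positive & outcome)
    fp = np.sum(positive & ~outcome)
    fn = np.sum(~positive & outcome)
    tn = np.sum(~positive & ~outcome)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    diagnostic_or = (tp * tn) / (fp * fn)       # odds of outcome if high risk vs not
    return sensitivity, specificity, diagnostic_or

risk = np.random.rand(1000)                     # placeholder predicted risks
outcome = np.random.rand(1000) < risk           # placeholder outcomes tied to risk
print(test_characteristics(risk, outcome, np.median(risk)))
```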


BMJ Open ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. e050059
Author(s):  
Ron M C Herings ◽  
Karin M A Swart ◽  
Bernard A M van der Zeijst ◽  
Amber A van der Heijden ◽  
Koos van der Velden ◽  
...  

Objective: To develop an algorithm (sCOVID) to predict the risk of severe complications of COVID-19 in a community-dwelling population to optimise vaccination scenarios. Design: Population-based cohort study. Setting: 264 Dutch general practices contributing to the NL-COVID database. Participants: 6074 people aged 0–99 diagnosed with COVID-19. Main outcomes: Severe complications (hospitalisation, institutionalisation, death). The algorithm was developed from a training data set comprising 70% of the patients and validated in the remaining 30%. Potential predictor variables included age, sex, chronic comorbidity score (CCS) based on risk factors for COVID-19 complications, obesity, neighbourhood deprivation score (NDS), first or second COVID-19 wave and confirmation test. Six population vaccination scenarios were explored: (1) random (naive), (2) random for persons above 60 years (60plus), (3) oldest patients first in age bands of 5 years (oldest first), (4) target population of the annual influenza vaccination programme (influenza), (5) those 25–65 years of age first (worker), and (6) risk based using the prediction algorithm (sCOVID). Results: Severe complications were reported in 243 (4.8%) people, with 59 (20.3%) nursing home admissions, 181 (62.2%) hospitalisations and 51 (17.5%) deaths. The algorithm included age, sex, CCS, NDS, wave and confirmation test (c-statistic=0.91, 95% CI 0.88 to 0.94 in the validation set). Applied to the different vaccination scenarios, the proportion of people needing to be vaccinated to reach a 50% reduction of severe complications was 67.5%, 50.0%, 26.1%, 16.0%, 10.0% and 8.4% for the worker, naive, influenza, 60plus, oldest first and sCOVID scenarios, respectively. Conclusion: The sCOVID algorithm performed well in predicting the risk of severe complications of COVID-19 in the first and second waves of COVID-19 infections in this Dutch population. The regression estimates can, and need to, be adjusted for future predictions. The algorithm can be applied to identify persons with the highest risks from data in the electronic health records of general practitioners (GPs).
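
The scenario comparison above amounts to ranking people by predicted risk and asking how many must be vaccinated to avert a target share of the expected severe complications. A minimal sketch of that calculation is shown below, assuming for illustration a fully effective vaccine and a made-up risk distribution in place of the sCOVID output.

```python
# Sketch of the scenario calculation reported above: under a risk-based strategy,
# what fraction of the population must be vaccinated to prevent 50% of expected
# severe complications? Assumes, for illustration only, a fully effective vaccine
# and randomly generated predicted risks instead of the sCOVID algorithm's output.
import numpy as np

def fraction_needed(predicted_risk, target_reduction=0.5):
    order = np.argsort(predicted_risk)[::-1]        # vaccinate highest risk first
    averted = np.cumsum(predicted_risk[order])      # expected complications averted so far
    total = predicted_risk.sum()
    n_needed = np.searchsorted(averted, target_reduction * total) + 1
    return n_needed / predicted_risk.size

risk = np.random.beta(0.5, 10, size=6074)           # placeholder risk distribution
print(f"{fraction_needed(risk):.1%} of the population vaccinated")
```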


2019 ◽  
Vol 28 (4) ◽  
pp. 418-429
Author(s):  
Manuel Castejón-Limas ◽  
Hector Alaiz-Moreton ◽  
Laura Fernández-Robles ◽  
Javier Alfonso-Cendón ◽  
Camino Fernández-Llamas ◽  
...  

Abstract This paper reports the experience of using the PAELLA algorithm as a helper tool in robust regression, rather than for its original purpose of outlier identification and removal. This novel usage takes advantage of the occurrence vector calculated by the algorithm to strengthen the effect of the more reliable samples and lessen the impact of those that would otherwise be considered outliers. With that aim, a series of experiments is conducted to learn how to better use the information contained in the occurrence vector. Using a contrived, deliberately difficult artificial data set, a reference predictive model is first fit to the whole raw data set. The second experiment reports the results of fitting a similar predictive model while discarding the samples marked as outliers by PAELLA. The third experiment uses the occurrence vector provided by PAELLA to classify the observations into multiple bins and fits every possible model, varying which bins are used for fitting and which are discarded in each particular model. The fourth experiment introduces a sampling process before fitting, in which the occurrence vector represents the likelihood of being included in the training data set. The fifth experiment performs that sampling as an internal step interleaved between training epochs. The last experiment compares our approach, using weighted neural networks, to a state-of-the-art method.
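
A minimal sketch of the weighting idea is given below: the occurrence vector serves either as per-sample weights in the fit or as sampling probabilities when drawing the training subset, in the spirit of the fourth experiment. The occurrence vector here is a placeholder array, not actual PAELLA output, and a plain linear regression stands in for the paper's models.

```python
# Minimal sketch of using an occurrence vector both as per-sample weights in a fit
# and as sampling probabilities when drawing a training subset. The occurrence
# vector below is a placeholder, not output of the PAELLA algorithm.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=500)
occurrence = rng.integers(0, 20, size=500)      # placeholder occurrence counts

# (a) more reliable samples get larger weights in the fit
weighted_fit = LinearRegression().fit(X, y, sample_weight=occurrence)

# (b) draw a training subset with probability proportional to occurrence
p = occurrence / occurrence.sum()
idx = rng.choice(len(y), size=300, replace=True, p=p)
sampled_fit = LinearRegression().fit(X[idx], y[idx])

print(weighted_fit.coef_, sampled_fit.coef_)
```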


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242028
Author(s):  
Hiroaki Haga ◽  
Hidenori Sato ◽  
Ayumi Koseki ◽  
Takafumi Saito ◽  
Kazuo Okumoto ◽  
...  

In recent years, the development of diagnostics using artificial intelligence (AI) has been remarkable. AI algorithms can go beyond human reasoning and build diagnostic models from a number of complex combinations. Using next-generation sequencing technology, we identified hepatitis C virus (HCV) variants resistant to direct-acting antivirals (DAA) by whole-genome sequencing of full-length HCV genomes, and applied these variants to various machine-learning algorithms to evaluate a preliminary predictive model. HCV genomic RNA was extracted from the serum of 173 patients (109 with subsequent sustained virological response [SVR] and 64 without) before DAA treatment. HCV genomes from the 109 SVR and 64 non-SVR patients were randomly divided into a training data set (57 SVR and 29 non-SVR) and a validation data set (52 SVR and 35 non-SVR). The training data set was subjected to nine machine-learning algorithms selected to identify the optimized combination of functional variants in relation to SVR status following DAA therapy. Subsequently, the prediction model was tested on the validation data set. The most accurate learning method was the support vector machine (SVM) algorithm (validation accuracy, 0.95; kappa statistic, 0.90; F-value, 0.94). The second-most accurate learning algorithm was the multilayer perceptron. Unfortunately, the Decision Tree and Naive Bayes algorithms could not be fitted to our data set owing to low accuracy (<0.8). In conclusion, with an accuracy rate of 95.4% in the generalization performance evaluation, SVM was identified as the best algorithm. Analytical methods based on genomic analysis and the construction of a predictive model by machine learning may be applicable to the selection of the optimal treatment for other viral infections and cancer.
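
The evaluation step reported above follows a standard train/validate pattern; the sketch below illustrates it with scikit-learn, using a placeholder variant matrix in place of the sequenced HCV genomes and reporting accuracy, kappa, and F-value on the held-out split.

```python
# Sketch of the SVM evaluation step described above. The variant matrix is a
# placeholder binary feature matrix (functional variants per genome); accuracy,
# kappa, and F-value are computed on a held-out validation split as in the study.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(173, 200))       # placeholder variant presence/absence
y = np.array([1] * 109 + [0] * 64)            # 1 = SVR, 0 = non-SVR

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.5, stratify=y, random_state=1)

clf = SVC(kernel="linear").fit(X_tr, y_tr)
pred = clf.predict(X_va)
print("accuracy:", accuracy_score(y_va, pred))
print("kappa:   ", cohen_kappa_score(y_va, pred))
print("F-value: ", f1_score(y_va, pred))
```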


Buildings ◽  
2019 ◽  
Vol 9 (11) ◽  
pp. 233 ◽  
Author(s):  
Janghyun Kim ◽  
Stephen Frank ◽  
James E. Braun ◽  
David Goldwasser

Small commercial buildings (those with less than approximately 1000 m2 of total floor area) often do not have access to cost-effective automated fault detection and diagnosis (AFDD) tools for maintaining efficient building operations. AFDD tools based on machine-learning algorithms hold promise for lowering cost barriers for AFDD in small commercial buildings; however, such algorithms require access to high-quality training data that is often difficult to obtain. To fill the gap in this research area, this study covers the development (Part I) and validation (Part II) of fault models that can be used with the building energy modeling software EnergyPlus® and OpenStudio® to generate a cost-effective training data set for developing AFDD algorithms. Part I (this paper) presents a library of fault models, including detailed descriptions of each fault model structure and their implementation with EnergyPlus. This paper also discusses a case study of training data set generation, representing an actual building.
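
The training-data-generation idea can be pictured as a loop over fault configurations, with each simulation run labeled by the fault it contains. The sketch below is generic and hypothetical: simulate_building stands in for an EnergyPlus/OpenStudio workflow, and the fault library shown is illustrative only, not the library presented in the paper.

```python
# Generic sketch of training-data generation for AFDD: run a building model once
# per fault configuration and label the resulting output. `simulate_building` is a
# hypothetical wrapper around an EnergyPlus/OpenStudio workflow, not a real API,
# and the fault library shown is purely illustrative.
import pandas as pd

FAULT_LIBRARY = {
    "none": {},                                     # baseline, fault-free run
    "stuck_damper": {"oa_damper_position": 1.0},    # illustrative fault parameters
    "biased_thermostat": {"sensor_bias_C": 2.0},
}

def simulate_building(fault_params):
    """Hypothetical: configure the model, run the simulation, return hourly output."""
    raise NotImplementedError("wrap your OpenStudio/EnergyPlus workflow here")

def build_training_set():
    frames = []
    for label, params in FAULT_LIBRARY.items():
        output = simulate_building(params)   # hourly meter/sensor data as a DataFrame
        output["fault_label"] = label        # supervised label for AFDD training
        frames.append(output)
    return pd.concat(frames, ignore_index=True)
```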


2019 ◽  
Vol 12 (2) ◽  
pp. 120-127 ◽  
Author(s):  
Wael Farag

Background: In this paper, a Convolutional Neural Network (CNN) that learns safe driving behavior and smooth steering manoeuvring is proposed in support of autonomous driving technologies. The training data are collected from a front-facing camera together with the steering commands issued by an experienced driver driving in traffic as well as on urban roads. Methods: These data are then used to train the proposed CNN to perform what is called “Behavioral Cloning”. The proposed behavioral-cloning CNN is named “BCNet”, and its deep seventeen-layer architecture was selected after extensive trials. BCNet was trained using the Adam optimization algorithm, a variant of the Stochastic Gradient Descent (SGD) technique. Results: The paper goes through the development and training process in detail and shows the image-processing pipeline harnessed in the development. Conclusion: The proposed approach proved successful in cloning the driving behavior embedded in the training data set after extensive simulations.
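
For orientation, the sketch below shows a minimal behavioral-cloning setup in Keras: a small convolutional network regressing a steering angle from camera frames, trained with the Adam optimizer. It is not the seventeen-layer BCNet architecture, only an illustration of the approach.

```python
# Minimal behavioral-cloning sketch: a small CNN regressing a steering angle from
# camera frames, trained with the Adam optimizer. This is not the paper's
# seventeen-layer BCNet, only an illustration of the general setup.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(66, 200, 3)),            # cropped/resized camera frame
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(24, 5, strides=2, activation="relu"),
    layers.Conv2D(36, 5, strides=2, activation="relu"),
    layers.Conv2D(48, 5, strides=2, activation="relu"),
    layers.Flatten(),
    layers.Dense(100, activation="relu"),
    layers.Dense(1),                             # predicted steering angle
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")

# model.fit(frames, steering_angles, epochs=10, validation_split=0.2)
```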


Author(s):  
Ritu Khandelwal ◽  
Hemlata Goyal ◽  
Rajveer Singh Shekhawat

Introduction: Machine learning is an intelligent technology that works as a bridge between business and data science. With the involvement of data science, the business goal focuses on findings that yield valuable insights from the available data. A large part of Indian cinema is Bollywood, a multi-million-dollar industry. This paper attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average or Flop, applying machine-learning techniques for classification and prediction. The first step in building a classifier or prediction model is the learning stage, in which the training data set is used to train the model with a chosen technique or algorithm; the rules generated in this stage form the model used to predict future trends in different types of organizations. Methods: Classification and prediction techniques, namely Support Vector Machine (SVM), Random Forest, Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost and KNN, are applied to find efficient and effective results. All of these functionalities are available through GUI-based workflows organised into categories such as Data, Visualize, Model and Evaluate. Result: In the learning stage, the training data set is used to train the model with the chosen technique or algorithm, and the rules generated form the model used to predict future trends in different types of organizations. Conclusion: This paper focuses on a comparative analysis performed on parameters such as accuracy and the confusion matrix to identify the best possible model for predicting movie success. Using the predicted success rate, production houses can plan advertisement propaganda and the best time to release the movie to gain higher benefits. Discussion: Data mining is the process of discovering patterns in large data sets, and the relationships discovered help solve business problems and predict forthcoming trends. These predictions can help production houses plan advertisement propaganda and their costs, making a movie more profitable.
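
A comparative analysis of this kind is straightforward to express with scikit-learn; the sketch below evaluates the listed classifiers on a common split with accuracy and a confusion matrix, using synthetic placeholder data rather than the Bollywood movie features.

```python
# Sketch of the comparative-analysis step described above: several classifiers
# evaluated on the same split with accuracy and a confusion matrix. The feature
# matrix and five-class labels are synthetic placeholders for the movie data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=500, n_features=12, n_classes=5,
                           n_informative=8, random_state=0)  # placeholder data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

classifiers = {
    "SVM": SVC(), "RandomForest": RandomForestClassifier(),
    "DecisionTree": DecisionTreeClassifier(), "NaiveBayes": GaussianNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "AdaBoost": AdaBoostClassifier(), "KNN": KNeighborsClassifier(),
}
for name, clf in classifiers.items():
    pred = clf.fit(X_tr, y_tr).predict(X_te)
    print(name, accuracy_score(y_te, pred))
    print(confusion_matrix(y_te, pred))
```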


2019 ◽  
Vol 9 (6) ◽  
pp. 1128 ◽  
Author(s):  
Yundong Li ◽  
Wei Hu ◽  
Han Dong ◽  
Xueyan Zhang

Using aerial cameras, satellite remote sensing, or unmanned aerial vehicles (UAVs) equipped with cameras can facilitate search and rescue tasks after disasters. The traditional manual interpretation of huge volumes of aerial images is inefficient and could be replaced by machine learning-based methods combined with image processing techniques. Given the development of machine learning, researchers have found that convolutional neural networks can effectively extract features from images. Some target detection methods based on deep learning, such as the single-shot multibox detector (SSD) algorithm, can achieve better results than traditional methods. However, the impressive performance of machine learning-based methods depends on numerous labeled samples, and given the complexity of post-disaster scenarios, obtaining many samples in the aftermath of a disaster is difficult. To address this issue, a damaged-building assessment method using SSD with pretraining and data augmentation is proposed in the current study, highlighting the following aspects. (1) Objects can be detected and classified into undamaged buildings, damaged buildings, and ruins. (2) A convolutional auto-encoder (CAE) that consists of VGG16 is constructed and trained using unlabeled post-disaster images. As a transfer learning strategy, the weights of the SSD model are initialized using the weights of the CAE counterpart. (3) Data augmentation strategies, such as image mirroring, rotation, Gaussian blur, and Gaussian noise processing, are utilized to augment the training data set. As a case study, aerial images of Hurricane Sandy in 2012 were used to validate the proposed method’s effectiveness. Experiments show that the pretraining strategy can improve overall accuracy by 10% compared with the SSD trained from scratch. These experiments also demonstrate that using data augmentation strategies can improve mAP and mF1 by 72% and 20%, respectively. Finally, the approach is further verified on another data set, from Hurricane Irma, and the proposed method is concluded to be feasible.
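
Of the three aspects above, the data augmentation step is the simplest to illustrate; the sketch below applies mirroring, rotation, Gaussian blur, and Gaussian noise to a single placeholder image patch with NumPy/SciPy. The SSD training and CAE weight transfer are not shown.

```python
# Sketch of the data-augmentation strategies mentioned above (mirroring, rotation,
# Gaussian blur, Gaussian noise), applied to one placeholder aerial image patch.
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

def augment(image, rng):
    variants = [image, np.fliplr(image)]                          # horizontal mirror
    variants.append(rotate(image, angle=rng.uniform(-15, 15),
                           axes=(0, 1), reshape=False, mode="nearest"))  # small rotation
    variants.append(gaussian_filter(image, sigma=(1.0, 1.0, 0)))  # blur, channels untouched
    noisy = image + rng.normal(scale=0.02, size=image.shape)      # additive Gaussian noise
    variants.append(np.clip(noisy, 0.0, 1.0))
    return variants

rng = np.random.default_rng(0)
patch = rng.random((128, 128, 3))          # placeholder image patch in [0, 1]
augmented = augment(patch, rng)
print(len(augmented), augmented[0].shape)
```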


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ryoya Shiode ◽  
Mototaka Kabashima ◽  
Yuta Hiasa ◽  
Kunihiro Oka ◽  
Tsuyoshi Murase ◽  
...  

Abstract The purpose of the study was to develop a deep learning network for estimating and constructing highly accurate 3D bone models directly from actual X-ray images and to verify its accuracy. The data used were 173 computed tomography (CT) images and 105 actual X-ray images of a healthy wrist joint. To compensate for the small size of the dataset, digitally reconstructed radiography (DRR) images generated from CT were used as training data instead of actual X-ray images. At test time, DRR-like images were generated from the actual X-ray images and fed to the network, making high-accuracy estimation of a 3D bone model from a small data set possible. The 3D shapes of the radius and ulna were estimated from actual X-ray images with accuracies of 1.05 ± 0.36 and 1.45 ± 0.41 mm, respectively.
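
The core idea behind DRR-based training data is to project a CT volume into an X-ray-like image; a deliberately simplified sketch is shown below, using a parallel-ray sum over a random placeholder volume. A real DRR pipeline would also model the source geometry and attenuation, and the paper's actual generation method is not reproduced here.

```python
# Deliberately simplified sketch of digitally reconstructed radiography (DRR):
# project a CT volume along one axis to obtain an X-ray-like image that can serve
# as training data. The volume below is a random placeholder, not real CT data.
import numpy as np

def simple_drr(ct_volume_hu, axis=1):
    mu = np.clip(ct_volume_hu + 1000.0, 0.0, None)        # rough attenuation from HU
    line_integrals = mu.sum(axis=axis)                     # parallel-ray projection
    drr = np.exp(-line_integrals / line_integrals.max())   # Beer-Lambert-style intensity
    return (drr - drr.min()) / (drr.max() - drr.min())     # normalize to [0, 1]

ct = np.random.normal(loc=-500, scale=400, size=(128, 128, 128))  # placeholder HU volume
print(simple_drr(ct).shape)
```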

