How to Learn from Others: Transfer Machine Learning with Additive Regression Models to Improve Sales Forecasting

Background: Axial spondyloarthritis (axSpA) is a chronic rheumatic disease that encompasses various clinical presentations: inflammatory chronic back pain, peripheral manifestations and extra-articular manifestations. The current nomenclature divides axSpA into radiographic (in the presence of radiographic sacroiliitis) and non-radiographic (in the absence of radiographic sacroiliitis, with or without MRI sacroiliitis) forms. Given that the functional burden of the disease appears to be greater in patients with radiographic forms, it seems crucial to be able to predict which patients will be more likely to develop structural damage over time. Predictive factors for radiographic progression in axSpA have been identified through use of traditional statistical models like logistic regression. However, these models present some limitations. In order to overcome these limitations and to improve the predictive performance, machine learning (ML) methods have been developed.

Objectives: To compare ML models with traditional models for predicting radiographic progression in patients with early axSpA.

Methods: Study design: prospective French multicentric cohort study (DESIR cohort) with 5 years of follow-up. Patients: all patients included in the cohort, i.e. 708 patients with inflammatory back pain for >3 months but <3 years, highly suggestive of axSpA. Data on the first 5 years of follow-up were used. Statistical analyses: radiographic progression was defined as progression either at the spine (increase of at least 1 point per 2 years of mSASSS scores) or at the sacroiliac joint (worsening of at least one grade of the mNY score between 2 visits). Traditional modelling: we first performed a bivariate analysis between our outcome (radiographic progression) and explanatory variables at baseline to select the variables to be included in our models and then built a logistic regression model (M1). Variable selection for traditional models was performed with 2 different methods: stepwise selection based on the Akaike Information Criterion (stepAIC) method (M2), and the Least Absolute Shrinkage and Selection Operator (LASSO) method (M3). We also performed a sensitivity analysis on all patients with the manual backward method (M4) after multiple imputation of missing data. Machine learning modelling: using the “SuperLearner” package in R, we modelled radiographic progression with stepAIC, LASSO, random forest, Discrete Bayesian Additive Regression Trees Samplers (DBARTS), Generalized Additive Models (GAM), multivariate adaptive polynomial spline regression (polymars), Recursive Partitioning And Regression Trees (RPART) and Super Learner. Finally, the accuracy of traditional and ML models was compared based on their 10-fold cross-validated AUC (cv-AUC).

Results: The 10-fold cv-AUCs of the traditional models M2 and M3 were 0.79 and 0.78, respectively. The 3 best models in the ML approach were the GAM, the DBARTS and the Super Learner models, with 10-fold cv-AUCs of 0.77, 0.76 and 0.74, respectively (Table 1).

Table 1. Comparison of 10-fold cross-validated AUC between best traditional and machine learning models.

Best models                                                           Cross-validated AUC
Traditional models
  M2 (stepAIC method)                                                 0.79
  M3 (LASSO method)                                                   0.78
Machine learning approach
  SL Discrete Bayesian Additive Regression Trees Samplers (DBARTS)    0.76
  SL Generalized Additive Models (GAM)                                0.77
  Super Learner                                                       0.74

AUC: Area Under the Curve; AIC: Akaike Information Criterion; LASSO: Least Absolute Shrinkage and Selection Operator; SL: SuperLearner. N = 295.

Conclusion: Traditional models predicted radiographic progression better than ML models in this early axSpA population. Further ML algorithms, image-based or using other artificial intelligence methods (e.g. deep learning), might perform better than traditional models in this setting.

Acknowledgments: Thanks to the French National Society of Rheumatology and the DESIR cohort.

Disclosure of Interests: Romain Garofoli: None declared, Matthieu Resche-Rigon: None declared, Maxime Dougados Grant/research support from: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Consultant of: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Speakers bureau: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Désirée van der Heijde Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead Sciences, Inc., Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Director of Imaging Rheumatology BV, Christian Roux: None declared, Anna Moltó Grant/research support from: Pfizer, UCB, Consultant of: Abbvie, BMS, MSD, Novartis, Pfizer, UCB
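
The comparison metric used in this abstract, 10-fold cross-validated AUC, can be illustrated with a minimal sketch. The study itself used R and the SuperLearner package; the Python below is a stand-in, not the authors' code. The stratified fold assignment, the single-feature scorer `make_single_feature_fit`, and the synthetic data are all assumptions introduced for illustration only.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, seed=0):
    """Deal each class round-robin into k folds so every fold holds both classes.

    Stratification is an assumption; the abstract does not say how folds were built.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, yi in enumerate(y) if yi == label]
        rng.shuffle(members)
        for position, i in enumerate(members):
            folds[position % k].append(i)
    return folds


def cv_auc(X, y, fit, k=10, seed=0):
    """Mean held-out AUC over k folds; `fit(X_train, y_train)` returns a scorer."""
    fold_aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        scorer = fit([X[i] for i in train], [y[i] for i in train])
        scores = [scorer(X[i]) for i in held_out]
        fold_aucs.append(auc([y[i] for i in held_out], scores))
    return sum(fold_aucs) / k


def make_single_feature_fit(j):
    """Hypothetical toy model: score by feature j, oriented by the class means."""
    def fit(X_train, y_train):
        pos = [x[j] for x, yi in zip(X_train, y_train) if yi == 1]
        neg = [x[j] for x, yi in zip(X_train, y_train) if yi == 0]
        sign = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
        return lambda x: sign * x[j]
    return fit


if __name__ == "__main__":
    rng = random.Random(42)
    y = [i % 2 for i in range(200)]
    # Feature 0 carries signal, feature 1 is pure noise (synthetic data).
    X = [[yi + rng.gauss(0, 0.5), rng.gauss(0, 1)] for yi in y]
    print("cv-AUC, informative feature:", cv_auc(X, y, make_single_feature_fit(0)))
    print("cv-AUC, noise feature:", cv_auc(X, y, make_single_feature_fit(1)))
```

Because every candidate model is scored only on patients it did not see during fitting, the averaged AUCs are directly comparable across model families, which is what makes a ranking such as the one in Table 1 meaningful.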


Comparison study of Regression Models for the prediction of post-Graduation admissions using Machine Learning Techniques

2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence) ◽

10.1109/confluence51648.2021.9377162 ◽

2021 ◽

Author(s):

Naveen S. Sapare ◽

Sahana M. Beelagi

Keyword(s):

Machine Learning ◽

Regression Models ◽

Machine Learning Techniques ◽

Comparison Study ◽

Learning Techniques


Searching for improvements in predicting human eye colour from DNA

International Journal of Legal Medicine ◽

10.1007/s00414-021-02645-5 ◽

2021 ◽

Author(s):

Magdalena Kukla-Bartoszek ◽

Paweł Teisseyre ◽

Ewelina Pośpiech ◽

Joanna Karłowska-Pik ◽

Piotr Zieliński ◽

...

Keyword(s):

Machine Learning ◽

Regression Models ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Sequencing Analysis ◽

Learning Approaches ◽

Human Eye ◽

Software Analysis ◽

Whole Exome ◽

Eye Colour

Abstract: Increasing understanding of human genome variability allows for better use of the predictive potential of DNA. An obvious direct application is the prediction of physical phenotypes. Significant success has been achieved, especially in predicting pigmentation characteristics, but the inference of some phenotypes is still challenging. In search of further improvements in predicting human eye colour, we conducted whole-exome (enriched in regulome) sequencing of 150 Polish samples to discover new markers. For this, we adopted quantitative characterization of eye colour phenotypes using high-resolution photographic images of the iris in combination with DIAT software analysis. An independent set of 849 samples was used for subsequent predictive modelling. Newly identified candidates, 114 additional SNPs selected from the literature for their previous association with pigmentation, and advanced machine learning algorithms were used. Whole-exome sequencing analysis found 27 previously unreported candidate SNP markers for eye colour. The highest overall prediction accuracies were achieved with LASSO-regularized and BIC-based selected regression models. A new candidate variant, rs2253104, located in the ARFIP2 gene and identified with the HyperLasso method, revealed predictive potential and was included in the best-performing regression models. Advanced machine learning approaches showed a significant increase in sensitivity of intermediate eye colour prediction (up to 39%) compared to 0% obtained for the original IrisPlex model. We identified a new potential predictor of eye colour and evaluated several widely used advanced machine learning algorithms in predictive analysis of this trait. Our results provide useful hints for developing future predictive models for eye colour in forensic and anthropological studies.


Utilization of Blood Culture in South Asia for the Diagnosis and Treatment of Febrile Illness

Clinical Infectious Diseases ◽

10.1093/cid/ciaa1322 ◽

2020 ◽

Vol 71 (Supplement_3) ◽

pp. S266-S275

Author(s):

Caitlin Hemlock ◽

Stephen P Luby ◽

Shampa Saha ◽

Farah Qamar ◽

Jason R Andrews ◽

...

Keyword(s):

Blood Culture ◽

Regression Models ◽

Enteric Fever ◽

Febrile Illness ◽

Improve Patient Care ◽

Initial Therapy ◽

Linear Regression Models ◽

Current Standard ◽

Middle Income ◽

Additive Regression

Abstract Background: Blood culture is the current standard for diagnosing bacteremic illnesses, yet it is not clear how physicians in many low- and middle-income countries utilize blood culture for diagnostic purposes and to inform treatment decisions. Methods: We screened suspected enteric fever cases from 6 hospitals in Bangladesh, Nepal, and Pakistan, and enrolled patients if blood culture was prescribed by the treating physician. We used generalized additive regression models to analyze the probability of receiving blood culture by age, and linear regression models to analyze changes by month to the proportion of febrile cases prescribed a blood culture compared with the burden of febrile illness, stratified by hospital. We used logistic regression to analyze predictors for receiving antibiotics empirically. We descriptively reviewed changes in antibiotic therapy by susceptibility patterns and coverage, stratified by country. Results: We screened 30 809 outpatients resulting in 1819 enteric fever cases; 1935 additional cases were enrolled from other hospital locations. Younger outpatients were less likely to receive a blood culture. The association between the number of febrile outpatients and the proportion prescribed blood culture varied by hospital. Antibiotics prescribed empirically were associated with severity and provisional diagnoses, but 31% (1147/3754) of enteric fever cases were not covered by initial therapy; this was highest in Pakistan (50%) as many isolates were resistant to cephalosporins, which were commonly prescribed empirically. Conclusions: Understanding hospital-level communication between laboratories and physicians may improve patient care and timeliness of appropriate antibiotics, which is important considering the rise of antimicrobial resistance.
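
The denominator of the coverage figure in the Results is not stated separately, but it matches the sum of the two enrollment counts reported just before it. As a quick check of the arithmetic, using only numbers from the abstract:

```latex
\[
1819 + 1935 = 3754, \qquad \frac{1147}{3754} \approx 0.306 \approx 31\%
\]
```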


Application of multi-linear regression models and machine learning techniques for online voltage stability margin estimation

2010 IREP Symposium Bulk Power System Dynamics and Control - VIII (IREP) ◽

10.1109/irep.2010.5563288 ◽

2010 ◽

Cited By ~ 3

Author(s):

Bruno Leonardi ◽

Venkataramana Ajjarapu ◽

Miodrag Djukanovic ◽

Pei Zhang

Keyword(s):

Machine Learning ◽

Linear Regression ◽

Regression Models ◽

Voltage Stability ◽

Stability Margin ◽

Machine Learning Techniques ◽

Linear Regression Models ◽

Voltage Stability Margin ◽

Learning Techniques ◽

Multi Linear Regression


Semiparametric mixture of additive regression models

Communications in Statistics - Theory and Methods ◽

10.1080/03610926.2017.1310243 ◽

2017 ◽

Vol 47 (3) ◽

pp. 681-697 ◽

Cited By ~ 1

Author(s):

Yi Zhang ◽

Qingle Zheng

Keyword(s):

Regression Models ◽

Semiparametric Mixture ◽

Additive Regression


45 Application of Machine Learning Models to Thermal Burn Patient Outcome Predictions in the Aftermath of a Nuclear Event

Journal of Burn Care & Research ◽

10.1093/jbcr/irab032.049 ◽

2021 ◽

Vol 42 (Supplement_1) ◽

pp. S33-S34

Author(s):

Morgan A Taylor ◽

Randy D Kearns ◽

Jeffrey E Carter ◽

Mark H Ebell ◽

Curt A Harris

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Length Of Stay ◽

Regression Models ◽

Large Scale ◽

Prediction Models ◽

Burn Patients ◽

Thermal Burn ◽

Logistic Regression Models ◽

Burn Patient

Abstract Introduction: A nuclear disaster would generate an unprecedented volume of thermal burn patients from the explosion and subsequent mass fires (Figure 1). Prediction models characterizing outcomes for these patients may better equip healthcare providers and other responders to manage large scale nuclear events. Logistic regression models have traditionally been employed to develop prediction scores for mortality of all burn patients. However, other healthcare disciplines have increasingly transitioned to machine learning (ML) models, which are automatically generated and continually improved, potentially increasing predictive accuracy. Preliminary research suggests ML models can predict burn patient mortality more accurately than commonly used prediction scores. The purpose of this study is to examine the efficacy of various ML methods in assessing thermal burn patient mortality and length of stay in burn centers. Methods: This retrospective study identified patients with fire/flame burn etiologies in the National Burn Repository between the years 2009 and 2018. Patients were randomly partitioned into a 67%/33% split for training and validation. A random forest model (RF) and an artificial neural network (ANN) were then constructed for each outcome, mortality and length of stay. These models were then compared to logistic regression models and previously developed prediction tools with similar outcomes using a combination of classification and regression metrics. Results: During the study period, 82,404 burn patients with a thermal etiology were identified in the analysis. The ANN models will likely tend to overfit the data, which can be resolved by ending the model training early or adding additional regularization parameters. Further exploration of the advantages and limitations of these models is forthcoming as metric analyses become available. Conclusions: In this proof-of-concept study, we anticipate that at least one ML model will predict the targeted outcomes of thermal burn patient mortality and length of stay as judged by the fidelity with which it matches the logistic regression analysis. These advancements can then help disaster preparedness programs consider resource limitations during catastrophic incidents resulting in burn injuries.


Predicting Survived Events in Nontraumatic Out-of-Hospital Cardiac Arrest: A Comparison Study on Machine Learning and Regression Models

Journal of Emergency Medicine ◽

10.1016/j.jemermed.2021.07.058 ◽

2021 ◽

Author(s):

Yat Hei Lo ◽

Yuet Chung Axel Siu

Keyword(s):

Machine Learning ◽

Cardiac Arrest ◽

Regression Models ◽

Comparison Study ◽

Hospital Cardiac Arrest


Metalearners for estimating heterogeneous treatment effects using machine learning

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1804597116 ◽

2019 ◽

Vol 116 (10) ◽

pp. 4156-4165 ◽

Cited By ~ 33

Author(s):

Sören R. Künzel ◽

Jasjeet S. Sekhon ◽

Peter J. Bickel ◽

Bin Yu

Keyword(s):

Machine Learning ◽

Treatment Effects ◽

Field Experiments ◽

Average Treatment Effect ◽

Regularity Conditions ◽

Heterogeneous Treatment Effects ◽

Lipschitz Continuous ◽

Additive Regression ◽

Underlying Mechanisms ◽

And Control

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies. We describe a number of metaalgorithms that can take advantage of any supervised learning or regression method in machine learning and statistics to estimate the conditional average treatment effect (CATE) function. Metaalgorithms build on base algorithms—such as random forests (RFs), Bayesian additive regression trees (BARTs), or neural networks—to estimate the CATE, a function that the base algorithms are not designed to estimate directly. We introduce a metaalgorithm, the X-learner, that is provably efficient when the number of units in one treatment group is much larger than in the other and can exploit structural properties of the CATE function. For example, if the CATE function is linear and the response functions in treatment and control are Lipschitz-continuous, the X-learner can still achieve the parametric rate under regularity conditions. We then introduce versions of the X-learner that use RF and BART as base learners. In extensive simulation studies, the X-learner performs favorably, although none of the metalearners is uniformly the best. In two persuasion field experiments from political science, we demonstrate how our X-learner can be used to target treatment regimes and to shed light on underlying mechanisms. A software package is provided that implements our methods.
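
The abstract names the X-learner without spelling out its steps, so a minimal sketch may help. It follows the commonly described three-stage formulation of the method: fit outcome models per arm, impute individual effects, then fit and blend two effect models. Several things here are assumptions for illustration and not the authors' implementation: the base learner is a one-dimensional least-squares line (`fit_line`, a hypothetical helper) instead of RF or BART, the weighting function defaults to the constant treated fraction, and the data are synthetic and noise-free.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def x_learner(x, w, y, propensity=None):
    """Estimate the CATE function from covariates x, treatment flags w, outcomes y."""
    x0 = [xi for xi, wi in zip(x, w) if wi == 0]
    y0 = [yi for yi, wi in zip(y, w) if wi == 0]
    x1 = [xi for xi, wi in zip(x, w) if wi == 1]
    y1 = [yi for yi, wi in zip(y, w) if wi == 1]

    # Stage 1: one outcome model per arm.
    mu0 = fit_line(x0, y0)
    mu1 = fit_line(x1, y1)

    # Stage 2: impute individual effects using the other arm's model,
    # then regress the imputed effects on the covariate within each arm.
    d1 = [yi - mu0(xi) for xi, yi in zip(x1, y1)]
    d0 = [mu1(xi) - yi for xi, yi in zip(x0, y0)]
    tau1 = fit_line(x1, d1)
    tau0 = fit_line(x0, d0)

    # Stage 3: blend the two effect models. A constant weight equal to the
    # treated fraction is an assumption; an estimated propensity score could
    # be passed in through `propensity`.
    g = propensity or (lambda _q: len(x1) / len(x))
    return lambda q: g(q) * tau0(q) + (1.0 - g(q)) * tau1(q)


if __name__ == "__main__":
    # Synthetic, noise-free data with only two treated units out of ten.
    x = [float(i) for i in range(10)]
    w = [1 if i in (2, 7) else 0 for i in range(10)]
    y = [1.0 + 2.0 * xi + wi * (0.5 + xi) for xi, wi in zip(x, w)]
    cate = x_learner(x, w, y)
    print("estimated CATE at x = 4:", cate(4.0))
```

The toy data place most units in the control arm, the unbalanced setting the abstract highlights: the control-arm outcome model is fitted on many points, and the imputed effects for the few treated units borrow that precision.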


Optimal estimation in additive regression models

Bernoulli ◽

10.3150/bj/1145993975 ◽

2006 ◽

Vol 12 (2) ◽

pp. 271-298 ◽

Cited By ~ 29

Author(s):

Joel Horowitz ◽

Jussi Klemelä ◽

Enno Mammen

Keyword(s):

Regression Models ◽

Optimal Estimation ◽

Additive Regression
