SAT0587 MACHINE-LEARNING DERIVED ALGORITHMS FOR OUTCOMES PREDICTION IN RHEUMATIC DISEASES: APPLICATION TO RADIOGRAPHIC PROGRESSION IN EARLY AXIAL SPONDYLOARTHRITIS

2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 1252.2-1253
Author(s):  
R. Garofoli ◽  
M. Resche-Rigon ◽  
M. Dougados ◽  
D. Van der Heijde ◽  
C. Roux ◽  
...  

Background: Axial spondyloarthritis (axSpA) is a chronic rheumatic disease that encompasses various clinical presentations: chronic inflammatory back pain, peripheral manifestations and extra-articular manifestations. The current nomenclature divides axSpA into radiographic (in the presence of radiographic sacroiliitis) and non-radiographic (in the absence of radiographic sacroiliitis, with or without MRI sacroiliitis). Given that the functional burden of the disease appears to be greater in patients with radiographic forms, it seems crucial to be able to predict which patients will be more likely to develop structural damage over time. Predictive factors for radiographic progression in axSpA have been identified through the use of traditional statistical models like logistic regression. However, these models present some limitations. In order to overcome these limitations and to improve the predictive performance, machine learning (ML) methods have been developed.
Objectives: To compare ML models to traditional models to predict radiographic progression in patients with early axSpA.
Methods: Study design: prospective French multicentric cohort study (DESIR cohort) with 5 years of follow-up. Patients: all patients included in the cohort, i.e. 708 patients with inflammatory back pain for >3 months but <3 years, highly suggestive of axSpA. Data on the first 5 years of follow-up were used. Statistical analyses: radiographic progression was defined as progression either at the spine (increase of at least 1 point per 2 years of mSASSS scores) or at the sacroiliac joint (worsening of at least one grade of the mNY score between 2 visits). Traditional modelling: we first performed a bivariate analysis between our outcome (radiographic progression) and explanatory variables at baseline to select the variables to be included in our models and then built a logistic regression model (M1).
Variable selection for traditional models was performed with 2 different methods: stepwise selection based on the Akaike Information Criterion (stepAIC) method (M2), and the Least Absolute Shrinkage and Selection Operator (LASSO) method (M3). We also performed a sensitivity analysis on all patients with a manual backward method (M4) after multiple imputation of missing data. Machine learning modelling: using the “SuperLearner” package in R, we modelled radiographic progression with stepAIC, LASSO, random forest, Discrete Bayesian Additive Regression Trees Samplers (DBARTS), Generalized Additive Models (GAM), multivariate adaptive polynomial spline regression (polymars), Recursive Partitioning And Regression Trees (RPART) and Super Learner. Finally, the accuracy of traditional and ML models was compared based on their 10-fold cross-validated AUC (cv-AUC).
Results: 10-fold cv-AUC for traditional models were 0.79 and 0.78 for M2 and M3, respectively. The 3 best models in the ML algorithm were the GAM, the DBARTS and the Super Learner models, with 10-fold cv-AUC of 0.77, 0.76 and 0.74, respectively (Table 1).
Table 1. Comparison of 10-fold cross-validated AUC between best traditional and machine learning models.
Best models                                                         Cross-validated AUC
Traditional models
  M2 (stepAIC method)                                               0.79
  M3 (LASSO method)                                                 0.78
Machine learning approach
  SL Discrete Bayesian Additive Regression Trees Samplers (DBARTS)  0.76
  SL Generalized Additive Models (GAM)                              0.77
  Super Learner                                                     0.74
AUC: Area Under the Curve; AIC: Akaike Information Criterion; LASSO: Least Absolute Shrinkage and Selection Operator; SL: SuperLearner. N = 295.
Conclusion: Traditional models predicted radiographic progression better than ML models in this early axSpA population. Further ML algorithms, image-based or using other artificial intelligence methods (e.g. deep learning), might perform better than traditional models in this setting.
Acknowledgments: Thanks to the French National Society of Rheumatology and the DESIR cohort.
Disclosure of Interests: Romain Garofoli: None declared, Matthieu Resche-Rigon: None declared, Maxime Dougados Grant/research support from: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Consultant of: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Speakers bureau: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Désirée van der Heijde Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead Sciences, Inc., Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Director of Imaging Rheumatology BV, Christian Roux: None declared, Anna Moltó Grant/research support from: Pfizer, UCB, Consultant of: Abbvie, BMS, MSD, Novartis, Pfizer, UCB
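
The comparison metric used in this abstract, 10-fold cross-validated AUC, can be illustrated with a minimal Python sketch. It is not the authors' analysis, which used the SuperLearner package in R on DESIR data. The synthetic data, the `fit_mean_difference` scorer, and the helper names `auc`, `stratified_folds` and `cv_auc` are all assumptions made for illustration.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, seed=0):
    """Assign every index to one of k folds, keeping class balance roughly equal."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def cv_auc(fit, X, labels, k=10, seed=0):
    """Mean held-out AUC over k folds; fit(X_train, y_train) returns a scoring function."""
    aucs = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        score = fit([X[i] for i in train], [labels[i] for i in train])
        aucs.append(auc([score(X[i]) for i in held_out],
                        [labels[i] for i in held_out]))
    return sum(aucs) / len(aucs)

def fit_mean_difference(X_train, y_train):
    """Hypothetical one-feature 'model': orient the feature toward the positive class."""
    pos = [x for x, y in zip(X_train, y_train) if y == 1]
    neg = [x for x, y in zip(X_train, y_train) if y == 0]
    direction = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: direction * x

# Synthetic data (an assumption): one feature, shifted upward for progressors.
rng = random.Random(1)
labels = [0] * 50 + [1] * 50
X = [rng.gauss(2.0 * y, 1.0) for y in labels]
mean_auc = cv_auc(fit_mean_difference, X, labels, k=10)
print(f"10-fold cv-AUC: {mean_auc:.2f}")
```

Any model can be plugged in through `fit`, as long as it returns a function that scores a single observation. Scoring only the held-out fold is what keeps the estimate from rewarding overfitting, which matters when flexible learners are compared with a logistic regression.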

Author(s):  
Mark David Walker ◽  
Mihály Sulyok

Abstract Background Restrictions on social interaction and movement were implemented by the German government in March 2020 to reduce the transmission of coronavirus disease 2019 (COVID-19). Apple's “Mobility Trends” (AMT) data detail levels of community mobility; they are a novel resource of potential use to epidemiologists. Objective The aim of the study is to use AMT data to examine the relationship between mobility and COVID-19 case occurrence for Germany. Is a change in mobility apparent following COVID-19 and the implementation of social restrictions? Is there a relationship between mobility and COVID-19 occurrence in Germany? Methods AMT data illustrate mobility levels throughout the epidemic, allowing the relationship between mobility and disease to be examined. Generalized additive models (GAMs) were established for Germany, with mobility categories and date as explanatory variables, and case numbers as the response. Results Clear reductions in mobility occurred following the implementation of movement restrictions. There was a negative correlation between mobility and confirmed case numbers. A GAM using all three categories of mobility data accounted for case occurrence as well and was preferable (Akaike Information Criterion, AIC: 2504) to models using the categories separately (AIC with “driving,” 2511; “transit,” 2513; “walking,” 2508). Conclusion These results suggest an association between mobility and case occurrence. Further examination of the relationship between movement restrictions and COVID-19 transmission may be pertinent. The study shows how new sources of online data can be used to investigate problems in epidemiology.
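
The model comparison in this abstract rests on the Akaike Information Criterion, AIC = 2k - 2 ln(L), where lower values indicate a better trade-off between fit and complexity. The Python sketch below ranks the four AIC values quoted in the abstract. The `aic` helper and the variable names are illustrative assumptions, and for a GAM the parameter count k is usually taken as the effective degrees of freedom rather than a raw coefficient count.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# AIC values as reported in the abstract, one per candidate GAM.
reported_aic = {
    "all three categories": 2504,
    "driving": 2511,
    "transit": 2513,
    "walking": 2508,
}

# The preferred model is the one with the lowest AIC.
best = min(reported_aic, key=reported_aic.get)

# Differences from the best model show how far each alternative falls behind.
delta_aic = {name: value - reported_aic[best] for name, value in reported_aic.items()}

for name, delta in sorted(delta_aic.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {reported_aic[name]}, delta = {delta}")
```

Only differences in AIC between models fitted to the same data are meaningful; the absolute values carry no interpretation on their own.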


2019 ◽  
Author(s):  
Fred Hohman ◽  
Arjun Srinivasan ◽  
Steven M. Drucker

While machine learning (ML) continues to find success in solving previously-thought hard problems, interpreting and exploring ML models remains challenging. Recent work has shown that visualizations are a powerful tool to aid debugging, analyzing, and interpreting ML models. However, depending on the complexity of the model (e.g., number of features), interpreting these visualizations can be difficult and may require additional expertise. Alternatively, textual descriptions, or verbalizations, can be a simple, yet effective way to communicate or summarize key aspects about a model, such as the overall trend in a model’s predictions or comparisons between pairs of data instances. With the potential benefits of visualizations and verbalizations in mind, we explore how the two can be combined to aid ML interpretability. Specifically, we present a prototype system, TeleGam, that demonstrates how visualizations and verbalizations can collectively support interactive exploration of ML models, for example, generalized additive models (GAMs). We describe TeleGam’s interface and underlying heuristics to generate the verbalizations. We conclude by discussing how TeleGam can serve as a platform to conduct future studies for understanding user expectations and designing novel interfaces for interpretable ML.


2021 ◽  
Author(s):  
Francesco Battaglioli ◽  
Pieter Groenemeijer ◽  
Tomas Pucik ◽  
Uwe Ulbrich ◽  
Henning Rust ◽  
...  

Convective hazards such as large hail, severe wind gusts, tornadoes, and heavy rainfall cause high economic damages, fatalities, and injuries across Europe. There are insufficient observations to determine whether trends in such local phenomena exist; however, recent studies suggest that the conditions supporting such hazards have become more frequent across large parts of Europe in recent decades.
We model the occurrence of these hazards using Generalized Additive Models (GAM) to investigate the existence of such long-term trends, and to enable objective probabilistic forecasts of these hazards. The models are trained with storm reports from the European Severe Weather Database (ESWD), lightning observations from the EUCLID network, and predictor parameters derived from the ERA5 reanalysis. Our work is based on the framework AR-CHaMo (Additive Regression Convective Hazard Models), previously developed at ESSL.
Preliminary results include a spatial depiction of the environmental conditions giving rise to convective hazards at a higher resolution than was possible before. The skill of hail models developed using AR-CHaMo has been shown to be superior to composite parameters used by weather forecasters for the occurrence of large hail, such as the Supercell Composite Parameter (SCP) and the Significant Hail Parameter (SHP). Likewise, for tornadoes, more skillful models can be constructed using the AR-CHaMo framework than predictors such as the Significant Tornado Parameter (STP).
The developed models have use both in climate studies and in medium-range severe weather forecasting. We will report on initial efforts to detect long-term (1979-2019) trends of convective hazards and present how these models can be used to support severe weather forecasting using medium-range ensemble forecasts.


2016 ◽  
Vol 20 (7) ◽  
pp. 2611-2628 ◽  
Author(s):  
Julie E. Shortridge ◽  
Seth D. Guikema ◽  
Benjamin F. Zaitchik

Abstract. In the past decade, machine learning methods for empirical rainfall–runoff modeling have seen extensive development and been proposed as a useful complement to physical hydrologic models, particularly in basins where data to support process-based models are limited. However, the majority of research has focused on a small number of methods, such as artificial neural networks, despite the development of multiple other approaches for non-parametric regression in recent years. Furthermore, this work has often evaluated model performance based on predictive accuracy alone, while not considering broader objectives, such as model interpretability and uncertainty, that are important if such methods are to be used for planning and management decisions. In this paper, we use multiple regression and machine learning approaches (including generalized additive models, multivariate adaptive regression splines, artificial neural networks, random forests, and M5 cubist models) to simulate monthly streamflow in five highly seasonal rivers in the highlands of Ethiopia and compare their performance in terms of predictive accuracy, error structure and bias, model interpretability, and uncertainty when faced with extreme climate conditions. While the relative predictive performance of models differed across basins, data-driven approaches were able to achieve reduced errors when compared to physical models developed for the region. Methods such as random forests and generalized additive models may have advantages in terms of visualization and interpretation of model structure, which can be useful in providing insights into physical watershed function. However, the uncertainty associated with model predictions under extreme climate conditions should be carefully evaluated, since certain models (especially generalized additive models and multivariate adaptive regression splines) become highly variable when faced with high temperatures.


2019 ◽  
Author(s):  
Duarte S. Viana ◽  
Petr Keil ◽  
Alienor Jeliazkov

Abstract. Community ecologists and macroecologists have long sought to evaluate the importance of environmental conditions and other drivers in determining species composition across sites. Different methods have been used to estimate species-environment relationships while accounting for or partitioning the variation attributed to environment and spatial autocorrelation, but their differences and respective reliability remain poorly known. We compared the performance of four families of statistical methods in estimating the contribution of the environment and space to explain variation in multi-species occurrence and abundance. These methods included distance-based regression (MRM), constrained ordination (RDA and CCA), generalised linear and additive models (GLM, GAM), and tree-based machine learning (regression trees, boosted regression trees, and random forests). Depending on the method, the spatial model consisted of either Moran’s Eigenvector Maps (MEM; in constrained ordination and GLM), smooth spatial splines (in GAM), or tree-based non-linear modelling of spatial coordinates (in machine learning). We simulated typical ecological data to assess the methods’ performance in (1) fitting environmental and spatial effects, and (2) partitioning the variation explained by the environmental and spatial effects. Differences in the fitting performance among major model types – (G)LM, GAM, machine learning – were reflected in the variation partitioning performance of the different methods. Machine learning methods, namely boosted regression trees, performed overall better. GAM performed similarly well, though likelihood optimisation did not converge for some empirical test data. The remaining methods performed worse under most simulated data variations (depending on the type of species data, sample size and coverage, autocorrelation range, and response shape).
Our results suggest that tree-based machine learning is a robust approach that can be widely used for variation partitioning. Our recommendations apply to single-species niche models, community ecology, and macroecology studies aiming at disentangling the relative contributions of space vs. environment and other drivers of variation in site-by-species matrices.
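
Variation partitioning of the kind discussed in this abstract is commonly computed from the R² of three fitted models: environment only, space only, and both together. The Python sketch below shows that arithmetic. The function name `partition_variation` and the example R² values are hypothetical and are not taken from the study.

```python
def partition_variation(r2_env, r2_space, r2_both):
    """Split variation into pure, shared and unexplained fractions.

    r2_env:   R^2 of the model with environmental predictors only
    r2_space: R^2 of the model with spatial predictors only
    r2_both:  R^2 of the model with both sets of predictors
    """
    return {
        "pure_environment": r2_both - r2_space,  # explained only by environment
        "pure_space": r2_both - r2_env,          # explained only by space
        "shared": r2_env + r2_space - r2_both,   # jointly explained
        "unexplained": 1.0 - r2_both,            # residual variation
    }

# Hypothetical R^2 values, chosen only to show the calculation.
fractions = partition_variation(r2_env=0.4, r2_space=0.3, r2_both=0.5)
for name, value in fractions.items():
    print(f"{name}: {value:.2f}")
```

The four fractions always sum to 1, and the shared fraction can come out negative when the two predictor sets suppress each other. In practice, adjusted R² is often used in place of raw R² so that models with more predictors are not favoured automatically.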

