TeleGam: Combining Visualization and Verbalization for Interpretable Machine Learning

While machine learning (ML) continues to find success in solving problems previously thought to be hard, interpreting and exploring ML models remains challenging. Recent work has shown that visualizations are a powerful tool to aid debugging, analyzing, and interpreting ML models. However, depending on the complexity of the model (e.g., number of features), interpreting these visualizations can be difficult and may require additional expertise. Alternatively, textual descriptions, or verbalizations, can be a simple yet effective way to communicate or summarize key aspects of a model, such as the overall trend in a model’s predictions or comparisons between pairs of data instances. With the potential benefits of visualizations and verbalizations in mind, we explore how the two can be combined to aid ML interpretability. Specifically, we present a prototype system, TeleGam, that demonstrates how visualizations and verbalizations can collectively support interactive exploration of ML models such as generalized additive models (GAMs). We describe TeleGam’s interface and the underlying heuristics used to generate the verbalizations. We conclude by discussing how TeleGam can serve as a platform to conduct future studies for understanding user expectations and designing novel interfaces for interpretable ML.
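
The abstract does not spell out TeleGam's heuristics, so the following is only a minimal sketch of what a verbalization of a GAM feature's "overall trend" might look like. The function name `describe_trend`, its inputs (sampled feature values and the matching shape-function contributions), and the wording of the sentence are all hypothetical and are not taken from the paper.

```python
def describe_trend(feature, xs, contributions, tolerance=1e-9):
    """Return a one-sentence summary of how a shape function moves.

    Hypothetical heuristic (not TeleGam's): look at the sign of each step
    between consecutive sampled points of one feature's contribution curve.
    """
    # Order the sampled points by feature value before comparing neighbours.
    pairs = sorted(zip(xs, contributions))
    steps = [b[1] - a[1] for a, b in zip(pairs, pairs[1:])]
    rising = sum(1 for s in steps if s > tolerance)
    falling = sum(1 for s in steps if s < -tolerance)

    if rising and not falling:
        trend = "increases"
    elif falling and not rising:
        trend = "decreases"
    elif not rising and not falling:
        trend = "stays flat"
    else:
        trend = "varies non-monotonically"
    return f"As {feature} grows, its contribution to the prediction {trend}."


# Example: a contribution curve that rises at every step.
print(describe_trend("age", [20, 40, 60], [-0.3, 0.1, 0.5]))
```

Because a GAM's prediction is a sum of per-feature curves, a sentence like this can be produced for each feature independently; a real system would likely need richer rules (for example, describing where a curve peaks).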

Interpretable Machine Learning with Bitonic Generalized Additive Models and Automatic Feature Construction

Discovery Science - Lecture Notes in Computer Science ◽

10.1007/978-3-030-61527-7_26 ◽

2020 ◽

pp. 386-402

Author(s):

Noëlie Cherrier ◽

Michael Mayo ◽

Jean-Philippe Poli ◽

Maxime Defurne ◽

Franck Sabatié

Keyword(s):

Machine Learning ◽

Generalized Additive Models ◽

Additive Models ◽

Feature Construction ◽

Interpretable Machine Learning

Three-dimensional modelling of alteration zones based on geochemical exploration data: An interpretable machine-learning approach via generalized additive models

Applied Geochemistry ◽

10.1016/j.apgeochem.2020.104781 ◽

2020 ◽

Vol 123 ◽

pp. 104781

Author(s):

Jin Chen ◽

Xiancheng Mao ◽

Hao Deng ◽

Zhankun Liu ◽

Qi Wang

Keyword(s):

Machine Learning ◽

Generalized Additive Models ◽

Three Dimensional ◽

Additive Models ◽

Learning Approach ◽

Geochemical Exploration ◽

Alteration Zones ◽

Interpretable Machine Learning ◽

Machine Learning Approach

SAT0587 MACHINE-LEARNING DERIVED ALGORITHMS FOR OUTCOMES PREDICTION IN RHEUMATIC DISEASES: APPLICATION TO RADIOGRAPHIC PROGRESSION IN EARLY AXIAL SPONDYLOARTHRITIS

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.431 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 1252.2-1253

Author(s):

R. Garofoli ◽

M. Resche-Rigon ◽

M. Dougados ◽

D. Van der Heijde ◽

C. Roux ◽

...

Keyword(s):

Machine Learning ◽

Radiographic Progression ◽

Generalized Additive Models ◽

Regression Trees ◽

Information Criterion ◽

Additive Models ◽

Super Learner ◽

Additive Regression ◽

Selection Operator ◽

Lasso Method

Background: Axial spondyloarthritis (axSpA) is a chronic rheumatic disease that encompasses various clinical presentations: inflammatory chronic back pain, peripheral manifestations and extra-articular manifestations. The current nomenclature divides axSpA into radiographic (in the presence of radiographic sacroiliitis) and non-radiographic (in the absence of radiographic sacroiliitis, with or without MRI sacroiliitis) forms. Given that the functional burden of the disease appears to be greater in patients with radiographic forms, it seems crucial to be able to predict which patients are more likely to develop structural damage over time. Predictive factors for radiographic progression in axSpA have been identified using traditional statistical models such as logistic regression. However, these models present some limitations. To overcome these limitations and to improve predictive performance, machine learning (ML) methods have been developed.

Objectives: To compare ML models with traditional models for predicting radiographic progression in patients with early axSpA.

Methods: Study design: prospective French multicentric cohort study (DESIR cohort) with 5 years of follow-up. Patients: all patients included in the cohort, i.e. 708 patients with inflammatory back pain for >3 months but <3 years, highly suggestive of axSpA. Data on the first 5 years of follow-up were used. Statistical analyses: radiographic progression was defined as progression either at the spine (increase of at least 1 point per 2 years of mSASSS scores) or at the sacroiliac joint (worsening of at least one grade of the mNY score between 2 visits). Traditional modelling: we first performed a bivariate analysis between our outcome (radiographic progression) and explanatory variables at baseline to select the variables to be included in our models and then built a logistic regression model (M1). Variable selection for traditional models was performed with 2 different methods: stepwise selection based on the Akaike Information Criterion (stepAIC) method (M2), and the Least Absolute Shrinkage and Selection Operator (LASSO) method (M3). We also performed a sensitivity analysis on all patients with a manual backward method (M4) after multiple imputation of missing data. Machine learning modelling: using the “SuperLearner” package in R, we modelled radiographic progression with stepAIC, LASSO, random forest, Discrete Bayesian Additive Regression Trees Samplers (DBARTS), Generalized Additive Models (GAM), multivariate adaptive polynomial spline regression (polymars), Recursive Partitioning And Regression Trees (RPART) and Super Learner. Finally, the accuracy of traditional and ML models was compared based on their 10-fold cross-validated AUC (cv-AUC).

Results: The 10-fold cv-AUC for traditional models was 0.79 and 0.78 for M2 and M3, respectively. The 3 best ML models were the GAM, the DBARTS and the Super Learner models, with 10-fold cv-AUC of 0.77, 0.76 and 0.74, respectively (Table 1).

Table 1. Comparison of 10-fold cross-validated AUC between best traditional and machine learning models.

Best models                                                          Cross-validated AUC
Traditional models
  M2 (stepAIC method)                                                0.79
  M3 (LASSO method)                                                  0.78
Machine learning approach
  SL Discrete Bayesian Additive Regression Trees Samplers (DBARTS)   0.76
  SL Generalized Additive Models (GAM)                               0.77
  Super Learner                                                      0.74

AUC: Area Under the Curve; AIC: Akaike Information Criterion; LASSO: Least Absolute Shrinkage and Selection Operator; SL: SuperLearner. N = 295.

Conclusion: Traditional models predicted radiographic progression better than ML models in this early axSpA population. Further ML algorithms, image-based or using other artificial intelligence methods (e.g. deep learning), might perform better than traditional models in this setting.

Acknowledgments: Thanks to the French National Society of Rheumatology and the DESIR cohort.

Disclosure of Interests: Romain Garofoli: None declared, Matthieu Resche-Rigon: None declared, Maxime Dougados Grant/research support from: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Consultant of: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Speakers bureau: AbbVie, Eli Lilly, Merck, Novartis, Pfizer and UCB Pharma, Désirée van der Heijde Consultant of: AbbVie, Amgen, Astellas, AstraZeneca, BMS, Boehringer Ingelheim, Celgene, Cyxone, Daiichi, Eisai, Eli-Lilly, Galapagos, Gilead Sciences, Inc., Glaxo-Smith-Kline, Janssen, Merck, Novartis, Pfizer, Regeneron, Roche, Sanofi, Takeda, UCB Pharma; Director of Imaging Rheumatology BV, Christian Roux: None declared, Anna Moltó Grant/research support from: Pfizer, UCB, Consultant of: Abbvie, BMS, MSD, Novartis, Pfizer, UCB
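
The models above are ranked by 10-fold cross-validated AUC. The study used the SuperLearner package in R; the Python sketch below is not that pipeline and only illustrates the general idea. It assumes stratified folds and reports the mean of the per-fold AUCs, neither of which is stated in the abstract, and the names `auc`, `cross_validated_auc` and the `fit` callback are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validated_auc(rows, labels, fit, k=10, seed=0):
    """Mean held-out AUC over k stratified folds.

    `fit(train_rows, train_labels)` must return a function that maps one
    row to a risk score. Stratification is an assumption of this sketch.
    """
    rng = random.Random(seed)
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    if len(pos_idx) < k or len(neg_idx) < k:
        raise ValueError("each class needs at least k cases")
    rng.shuffle(pos_idx)
    rng.shuffle(neg_idx)
    # Deal positives and negatives round-robin so every fold holds both classes.
    folds = [pos_idx[f::k] + neg_idx[f::k] for f in range(k)]

    fold_aucs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        score = fit([rows[i] for i in train], [labels[i] for i in train])
        fold_aucs.append(auc([labels[i] for i in held_out],
                             [score(rows[i]) for i in held_out]))
    return sum(fold_aucs) / k
```

Each model is fitted only on the training folds and scored on the held-out fold, so the resulting figure estimates out-of-sample discrimination instead of fit to the training data.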

Machine learning methods for empirical streamflow simulation: a comparison of model accuracy, interpretability, and uncertainty in seasonal watersheds

Hydrology and Earth System Sciences ◽

10.5194/hess-20-2611-2016 ◽

2016 ◽

Vol 20 (7) ◽

pp. 2611-2628 ◽

Cited By ~ 70

Author(s):

Julie E. Shortridge ◽

Seth D. Guikema ◽

Benjamin F. Zaitchik

Keyword(s):

Machine Learning ◽

Predictive Accuracy ◽

Generalized Additive Models ◽

Additive Models ◽

Multivariate Adaptive Regression Splines ◽

Regression Splines ◽

Climate Conditions ◽

Machine Learning Methods ◽

Adaptive Regression ◽

Adaptive Regression Splines

Abstract. In the past decade, machine learning methods for empirical rainfall–runoff modeling have seen extensive development and been proposed as a useful complement to physical hydrologic models, particularly in basins where data to support process-based models are limited. However, the majority of research has focused on a small number of methods, such as artificial neural networks, despite the development of multiple other approaches for non-parametric regression in recent years. Furthermore, this work has often evaluated model performance based on predictive accuracy alone, while not considering broader objectives, such as model interpretability and uncertainty, that are important if such methods are to be used for planning and management decisions. In this paper, we use multiple regression and machine learning approaches (including generalized additive models, multivariate adaptive regression splines, artificial neural networks, random forests, and M5 cubist models) to simulate monthly streamflow in five highly seasonal rivers in the highlands of Ethiopia and compare their performance in terms of predictive accuracy, error structure and bias, model interpretability, and uncertainty when faced with extreme climate conditions. While the relative predictive performance of models differed across basins, data-driven approaches were able to achieve reduced errors when compared to physical models developed for the region. Methods such as random forests and generalized additive models may have advantages in terms of visualization and interpretation of model structure, which can be useful in providing insights into physical watershed function. However, the uncertainty associated with model predictions under extreme climate conditions should be carefully evaluated, since certain models (especially generalized additive models and multivariate adaptive regression splines) become highly variable when faced with high temperatures.
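
The abstract compares models on predictive accuracy and bias without naming the metrics used. As a minimal sketch, assuming root-mean-square error and mean bias as the measures (both are assumptions of this example, as are the function names), observed and simulated monthly flows could be compared like this:

```python
import math


def _check(observed, simulated):
    # Both series must cover the same, non-empty set of time steps.
    if not observed or len(observed) != len(simulated):
        raise ValueError("series must be non-empty and of equal length")


def rmse(observed, simulated):
    """Root-mean-square error between observed and simulated flows."""
    _check(observed, simulated)
    squared = sum((s - o) ** 2 for o, s in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def mean_bias(observed, simulated):
    """Average of simulated minus observed; positive means over-prediction."""
    _check(observed, simulated)
    return sum(s - o for o, s in zip(observed, simulated)) / len(observed)
```

Reporting both values separates overall error size from systematic over- or under-prediction, which is one way to look at the "error structure and bias" the abstract mentions.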

Prediction of PM2.5 concentrations at the locations of monitoring sites measuring PM10 and NOx, using generalized additive models and machine learning methods: A case study in London

Atmospheric Environment ◽

10.1016/j.atmosenv.2020.117757 ◽

2020 ◽

Vol 240 ◽

pp. 117757 ◽

Cited By ~ 1

Author(s):

Antonis Analitis ◽

Benjamin Barratt ◽

David Green ◽

Andrew Beddows ◽

Evangelia Samoli ◽

...

Keyword(s):

Machine Learning ◽

Generalized Additive Models ◽

Additive Models ◽

Learning Methods ◽

Machine Learning Methods

Machine learning in neurosurgery: a global survey

Acta Neurochirurgica ◽

10.1007/s00701-020-04532-1 ◽

2020 ◽

Vol 162 (12) ◽

pp. 3081-3091 ◽

Cited By ~ 1

Author(s):

Victor E. Staartjes ◽

Vittorio Stumpo ◽

Julius M. Kernbach ◽

Anita M. Klukowska ◽

Pravesh S. Gadjradj ◽

...

Keyword(s):

Machine Learning ◽

North America ◽

Online Survey ◽

Response Rate ◽

Future Studies ◽

Factors Associated ◽

Technological Advances ◽

Academic Settings ◽

Comprehensive Survey ◽

Potential Benefits

Abstract.

Background: Recent technological advances have led to the development and implementation of machine learning (ML) in various disciplines, including neurosurgery. Our goal was to conduct a comprehensive survey of neurosurgeons to assess the acceptance of and attitudes toward ML in neurosurgical practice and to identify factors associated with its use.

Methods: The online survey consisted of nine or ten mandatory questions and was distributed in February and March 2019 through the European Association of Neurosurgical Societies (EANS) and the Congress of Neurosurgeons (CNS).

Results: Out of 7280 neurosurgeons who received the survey, we received 362 responses, with a response rate of 5%, mainly in Europe and North America. In total, 103 neurosurgeons (28.5%) reported using ML in their clinical practice, and 31.1% in research. Adoption rates of ML were relatively evenly distributed, with 25.6% for North America, 30.9% for Europe, 33.3% for Latin America and the Middle East, 44.4% for Asia and Pacific and 100% for Africa with only two responses. No predictors of clinical ML use were identified, although academic settings and subspecialties neuro-oncology, functional, trauma and epilepsy predicted use of ML in research. The most common applications were for predicting outcomes and complications, as well as interpretation of imaging.

Conclusions: This report provides a global overview of the neurosurgical applications of ML. A relevant proportion of the surveyed neurosurgeons reported clinical experience with ML algorithms. Future studies should aim to clarify the role and potential benefits of ML in neurosurgery and to reconcile these potential advantages with bioethical considerations.
