Ada-WHIPS: Explaining AdaBoost Classification with Applications in the Health Sciences

10.21203/rs.2.19113/v2 ◽

2020 ◽

Cited By ~ 1

Author(s):

Julian Hatwell ◽

Mohamed Medhat Gaber ◽

R.M. Atif Azad

Keyword(s):

Machine Learning ◽

Statistical Tests ◽

Black Box ◽

Classification Rule ◽

Diagnostic Tools ◽

Rule Based ◽

Medical Practitioners ◽

Research Areas ◽

Computer Aided ◽

Novel Algorithm

Abstract Background Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients' disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning (ML) models and high dimensional data sources (electronic health records, MRI scans, cardiotocograms, etc). These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years, because it addresses the interpretability and trust concerns of medical practitioners and other critical decision makers. Method In this work, we focus on AdaBoost, a black box model that has been widely adopted in the CAD literature. We address the challenge -- to explain AdaBoost classification -- with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, \textit{Adaptive-Weighted High Importance Path Snippets} (Ada-WHIPS), makes use of AdaBoost's adaptive classifier weights; using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes at the internals of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model's decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure \textit{stability} that is better suited to the XAI setting. Results In this paper, our experimental results demonstrate the benefits of using our novel algorithm for explaining AdaBoost classification. The simple rule-based explanations have better generalisation (mean coverage 15\%-68\%) while remaining competitive for specificity (mean precision 80\%-99\%). A very small trade-off in specificity is shown to guard against over-fitting. Conclusions This research demonstrates that interpretable, classification rule-based explanations can be generated for computer aided diagnostic tools based on AdaBoost, and that a tightly coupled, AdaBoost-specific approach can outperform model-agnostic methods.

Download Full-text

Ada-WHIPS: Explaining AdaBoost Classification with Applications in the Health Sciences

10.21203/rs.2.19113/v4 ◽

2020 ◽

Author(s):

Julian Hatwell ◽

Mohamed Medhat Gaber ◽

R.M. Atif Azad

Keyword(s):

Machine Learning ◽

Statistical Tests ◽

Black Box ◽

Decision Makers ◽

Medical Practitioners ◽

Research Areas ◽

Novel Approach ◽

Mri Scans ◽

Practice Methods ◽

Logical Rules

Abstract Background Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients' disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning (ML) models and high dimensional data sources (electronic health records, MRI scans, cardiotocograms, etc). These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. Methods In this work, we focus on AdaBoost, a black box ML model that has been widely adopted in the CAD literature. We address the challenge -- to explain AdaBoost classification -- with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost's adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees (DT) of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model's decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure stability that is better suited to the XAI setting .

Download Full-text

Ada-WHIPS: Explaining AdaBoost Classification with Applications in the Health Sciences

10.21203/rs.2.19113/v3 ◽

2020 ◽

Author(s):

Julian Hatwell ◽

Mohamed Medhat Gaber ◽

R.M. Atif Azad

Keyword(s):

Machine Learning ◽

Statistical Tests ◽

Black Box ◽

Decision Makers ◽

Medical Practitioners ◽

Research Areas ◽

Novel Approach ◽

Mri Scans ◽

Practice Methods ◽

Logical Rules

Abstract Background Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients' disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning (ML) models and high dimensional data sources (electronic health records, MRI scans, cardiotocograms, etc). These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. Methods In this work, we focus on AdaBoost, a black box ML model that has been widely adopted in the CAD literature. We address the challenge -- to explain AdaBoost classification -- with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost's adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees (DT) of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model's decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure stability that is better suited to the XAI setting .

Download Full-text

Ada-WHIPS: explaining AdaBoost classification with applications in the health sciences

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01201-2 ◽

2020 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Julian Hatwell ◽

Mohamed Medhat Gaber ◽

R. Muhammad Atif Azad

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Statistical Tests ◽

Black Box ◽

The State ◽

Related Data ◽

Research Areas ◽

Novel Approach ◽

Tightly Coupled ◽

Novel Algorithm

Abstract Background Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients’ disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning models and high dimensional data sources such as electronic health records, magnetic resonance imaging scans, cardiotocograms, etc. These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. Methods In this work, we focus on AdaBoost, a black box model that has been widely adopted in the CAD literature. We address the challenge – to explain AdaBoost classification – with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost’s adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model’s decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure stability that is better suited to the XAI setting. Results Experiments on 9 CAD-related data sets showed that Ada-WHIPS explanations consistently generalise better (mean coverage 15%-68%) than the state of the art while remaining competitive for specificity (mean precision 80%-99%). A very small trade-off in specificity is shown to guard against over-fitting which is a known problem in the state of the art methods. Conclusions The experimental results demonstrate the benefits of using our novel algorithm for explaining CAD AdaBoost classifiers widely found in the literature. Our tightly coupled, AdaBoost-specific approach outperforms model-agnostic explanation methods and should be considered by practitioners looking for an XAI solution for this class of models.

Download Full-text

Ada-WHIPS: Explaining AdaBoost Classification with Applications in the Health Sciences

10.21203/rs.2.19113/v5 ◽

2020 ◽

Author(s):

Julian Hatwell ◽

Mohamed Medhat Gaber ◽

R.M. Atif Azad

Keyword(s):

Machine Learning ◽

State Of The Art ◽

Statistical Tests ◽

Black Box ◽

The State ◽

Related Data ◽

Research Areas ◽

Novel Approach ◽

Tightly Coupled ◽

Novel Algorithm

Abstract Background Computer Aided Diagnostics (CAD) can support medical practitioners to make critical decisions about their patients' disease conditions. Practitioners require access to the chain of reasoning behind CAD to build trust in the CAD advice and to supplement their own expertise. Yet, CAD systems might be based on black box machine learning (ML) models and high dimensional data sources (electronic health records, MRI scans, cardiotocograms, etc). These foundations make interpretation and explanation of the CAD advice very challenging. This challenge is recognised throughout the machine learning research community. eXplainable Artificial Intelligence (XAI) is emerging as one of the most important research areas of recent years because it addresses the interpretability and trust concerns of critical decision makers, including those in clinical and medical practice. Methods In this work, we focus on AdaBoost, a black box ML model that has been widely adopted in the CAD literature. We address the challenge -- to explain AdaBoost classification -- with a novel algorithm that extracts simple, logical rules from AdaBoost models. Our algorithm, Adaptive-Weighted High Importance Path Snippets (Ada-WHIPS), makes use of AdaBoost's adaptive classifier weights. Using a novel formulation, Ada-WHIPS uniquely redistributes the weights among individual decision nodes of the internal decision trees (DT) of the AdaBoost model. Then, a simple heuristic search of the weighted nodes finds a single rule that dominated the model's decision. We compare the explanations generated by our novel approach with the state of the art in an experimental study. We evaluate the derived explanations with simple statistical tests of well-known quality measures, precision and coverage, and a novel measure stability that is better suited to the XAI setting.Results Experiments on 9 CAD-related data sets showed that Ada-WHIPS explanations consistently generalise better (mean coverage 15%-68%) than the state of the art while remaining competitive for speciﬁcity (mean precision 80%-99%). A very small trade-oﬀ in speciﬁcity is shown to guard againstover-ﬁtting which is a known problem in the state of the art methods.Conclusions The experimental results demonstrate the beneﬁts of using our novel algorithm for explaining CAD AdaBoost classiﬁers widely found in the literature. Our tightly coupled, AdaBoost-speciﬁc approach outperforms model-agnostic explanation methods and should be considered by practitioners looking for an XAI solution for this class of models.

Download Full-text

Computer-Aided Diagnosis in Colorectal Cancer: Current Concepts and Future Prospects

Journal of Interdisciplinary Medicine ◽

10.1515/jim-2017-0057 ◽

2017 ◽

Vol 2 (3) ◽

pp. 245-249

Author(s):

Andrei-Constantin Ioanovici ◽

Andrei-Marian Feier ◽

Ioan Țilea ◽

Daniela Dobru

Keyword(s):

Colorectal Cancer ◽

Machine Learning ◽

Computational Models ◽

Positive Impact ◽

Screening Method ◽

Colorectal Polyps ◽

Machine Learning Algorithms ◽

Diagnostic Tools ◽

Computer Aided ◽

Important Health

Abstract Colorectal cancer is an important health issue, both in terms of the number of people affected and the associated costs. Colonoscopy is an important screening method that has a positive impact on the survival of patients with colorectal cancer. The association of colonoscopy with computer-aided diagnostic tools is currently under researchers’ focus, as various methods have already been proposed and show great potential for a better management of this disease. We performed a review of the literature and present a series of aspects, such as the basics of machine learning algorithms, different computational models as well as their benchmarks expressed through measurements such as positive prediction value and accuracy of detection, and the classification of colorectal polyps. Introducing computer-aided diagnostic tools can help clinicians obtain results with a high degree of confidence when performing colonoscopies. The growing field of machine learning in medicine will have a big impact on patient management in the future.

Download Full-text

Competence region estimation for black-box surrogate models

The International FLAIRS Conference Proceedings ◽

10.32473/flairs.v34i1.128571 ◽

2021 ◽

Vol 34 (1) ◽

Author(s):

Tapan Shah

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Black Box ◽

Future Research ◽

Stochastic Quantization ◽

Learning Models ◽

Performance Loss ◽

Research Areas ◽

Upper Level ◽

Machine Learning Models

With advances in edge applications for industry andhealthcare, machine learning models are increasinglytrained on the edge. However, storage and memory in-frastructure at the edge are often primitive, due to costand real-estate constraints. A simple, effective methodis to learn machine learning models from quantized datastored with low arithmetic precision (1-8 bits). In thiswork, we introduce two stochastic quantization meth-ods, dithering and stochastic rounding. In dithering, ad-ditive noise from a uniform distribution is added tothe sample before quantization. In stochastic rounding,each sample is quantized to the upper level with prob-ability p and to a lower level with probability 1-p. Thekey contributions of the paper are For 3 standard machine learning models, Support Vec-tor Machines, Decision Trees and Linear (Logistic)Regression, we compare the performance loss for astandard static quantization and stochastic quantiza-tion for 55 classification and 30 regression datasetswith 1-8 bits quantization. We showcase that for 4- and 8-bit quantization overregression datasets, stochastic quantization demon-strates statistically significant improvement. We investigate the performance loss as a function ofdataset attributes viz. number of features, standard de-viation, skewness. This helps create a transfer functionwhich will recommend the best quantizer for a givendataset. We propose 2 future research areas, a) dynamic quan-tizer update where the model is trained using stream-ing data and the quantizer is updated after each batchand b) precision re-allocation under budget constraintswhere different precision is used for different features.

Download Full-text

Ranking Rule-Based Automatic Explanations for Machine Learning Predictions on Asthma Hospital Encounters in Patients With Asthma: Retrospective Cohort Study

JMIR Medical Informatics ◽

10.2196/28287 ◽

2021 ◽

Vol 9 (8) ◽

pp. e28287

Author(s):

Xiaoyi Zhang ◽

Gang Luo

Keyword(s):

Machine Learning ◽

Health Care ◽

Health Care Systems ◽

Secondary Analysis ◽

Model Performance ◽

Black Box ◽

Ranking Method ◽

Test Case ◽

Record System ◽

Rule Based

Background Asthma hospital encounters impose a heavy burden on the health care system. To improve preventive care and outcomes for patients with asthma, we recently developed a black-box machine learning model to predict whether a patient with asthma will have one or more asthma hospital encounters in the succeeding 12 months. Our model is more accurate than previous models. However, black-box machine learning models do not explain their predictions, which forms a barrier to widespread clinical adoption. To solve this issue, we previously developed a method to automatically provide rule-based explanations for the model’s predictions and to suggest tailored interventions without sacrificing model performance. For an average patient correctly predicted by our model to have future asthma hospital encounters, our explanation method generated over 5000 rule-based explanations, if any. However, the user of the automated explanation function, often a busy clinician, will want to quickly obtain the most useful information for a patient by viewing only the top few explanations. Therefore, a methodology is required to appropriately rank the explanations generated for a patient. However, this is currently an open problem. Objective The aim of this study is to develop a method to appropriately rank the rule-based explanations that our automated explanation method generates for a patient. Methods We developed a ranking method that struck a balance among multiple factors. Through a secondary analysis of 82,888 data instances of adults with asthma from the University of Washington Medicine between 2011 and 2018, we demonstrated our ranking method on the test case of predicting asthma hospital encounters in patients with asthma. Results For each patient predicted to have asthma hospital encounters in the succeeding 12 months, the top few explanations returned by our ranking method typically have high quality and low redundancy. Many top-ranked explanations provide useful insights on the various aspects of the patient’s situation, which cannot be easily obtained by viewing the patient’s data in the current electronic health record system. Conclusions The explanation ranking module is an essential component of the automated explanation function, and it addresses the interpretability issue that deters the widespread adoption of machine learning predictive models in clinical practice. In the next few years, we plan to test our explanation ranking method on predictive modeling problems addressing other diseases as well as on data from other health care systems. International Registered Report Identifier (IRRID) RR2-10.2196/5039

Download Full-text

Novel algorithm to screen for heart murmurs using computer-aided auscultation in neonates: a prospective single center pilot observational study

Minerva Pediatrica ◽

10.23736/s0026-4946.18.04974-5 ◽

2019 ◽

Vol 71 (3) ◽

Author(s):

Renata Grgic-Mustafic ◽

Nariae Baik-Schneditz ◽

Bernhard Schwaberger ◽

Lukas Mileder ◽

Corinna Binder-Heschl ◽

...

Keyword(s):

Observational Study ◽

Single Center ◽

Heart Murmurs ◽

Computer Aided ◽

Novel Algorithm

Download Full-text

A Brief Survey on Text Classification Using Various Machine Learning Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v8i1.521 ◽

2018 ◽

Vol 8 (1) ◽

pp. 14

Author(s):

Padmavathi .S ◽

M. Chidambaram

Keyword(s):

Machine Learning ◽

Text Classification ◽

Fixed Number ◽

Machine Learning Techniques ◽

Online Information ◽

Rule Based ◽

Learning Techniques ◽

Machine Learning Approach ◽

Rule Based Approach

Text classification has grown into more significant in managing and organizing the text data due to tremendous growth of online information. It does classification of documents in to fixed number of predefined categories. Rule based approach and Machine learning approach are the two ways of text classification. In rule based approach, classification of documents is done based on manually defined rules. In Machine learning based approach, classification rules or classifier are defined automatically using example documents. It has higher recall and quick process. This paper shows an investigation on text classification utilizing different machine learning techniques.

Download Full-text