Uncovering and Correcting Shortcut Learning in Machine Learning Models for Skin Cancer Diagnosis

Diagnostics ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 40
Author(s):  
Meike Nauta ◽  
Ricky Walsh ◽  
Adam Dubowski ◽  
Christin Seifert

Machine learning models have been successfully applied to the analysis of skin images. However, due to the black box nature of such deep learning models, it is difficult to understand their underlying reasoning. This prevents a human from validating whether the model is right for the right reasons. Spurious correlations and other biases in data can cause a model to base its predictions on such artefacts rather than on the truly relevant information. These learned shortcuts can in turn cause incorrect performance estimates and unexpected outcomes when the model is applied in clinical practice. This study presents a method to detect and quantify such shortcut learning in trained classifiers for skin cancer diagnosis, since dermoscopy images are known to contain artefacts. Specifically, we train a standard VGG16-based skin cancer classifier on the public ISIC dataset, in which colour calibration charts (elliptical, coloured patches) occur only in benign images and not in malignant ones. Our methodology artificially inserts those patches into images and uses inpainting to automatically remove them, in order to assess the resulting changes in predictions. We find that our standard classifier partly bases its predictions of benign images on the presence of such a coloured patch. More importantly, by artificially inserting coloured patches into malignant images, we show that shortcut learning results in a significant increase in misdiagnoses, making the classifier unreliable for clinical practice. With our results, we therefore want to increase awareness of the risks of using black box machine learning models trained on potentially biased datasets. Finally, we present a model-agnostic method to neutralise shortcut learning: we remove the bias in the training dataset by replacing coloured patches with benign skin tissue using image inpainting, and re-train the classifier on this de-biased dataset.
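The probing idea in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: `paste_elliptical_patch` inserts a coloured elliptical patch (mimicking a colour calibration chart) into an image array, and `shortcut_shift` measures how much a classifier's predicted malignancy score drops once the patch appears. The scoring function `predict_proba` is a hypothetical stand-in for any trained model.

```python
# Sketch of probing a trained skin-cancer classifier for shortcut learning:
# paste a coloured elliptical patch into images and measure the shift in the
# predicted malignancy score. A large positive shift suggests the model treats
# the patch as a "benign" shortcut. `predict_proba` is a stand-in for any
# model's image -> probability scoring function (an assumption, not a real API).
import numpy as np

def paste_elliptical_patch(img, center, radii, colour):
    """Return a copy of `img` (H, W, 3) with a coloured ellipse pasted in."""
    out = img.copy()
    h, w = img.shape[:2]
    yy, xx = np.ogrid[:h, :w]          # index grids for the ellipse mask
    cy, cx = center
    ry, rx = radii
    mask = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    out[mask] = colour                 # overwrite pixels inside the ellipse
    return out

def shortcut_shift(predict_proba, images, **patch_kwargs):
    """Mean drop in predicted malignancy after inserting the patch."""
    shifts = []
    for img in images:
        patched = paste_elliptical_patch(img, **patch_kwargs)
        shifts.append(predict_proba(img) - predict_proba(patched))
    return float(np.mean(shifts))
```

The abstract's de-biasing step goes in the opposite direction: inpainting replaces existing patches with plausible skin tissue before re-training, so neither class carries the artefact.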

Entropy ◽  
2020 ◽  
Vol 23 (1) ◽  
pp. 18
Author(s):  
Pantelis Linardatos ◽  
Vasilis Papastefanopoulos ◽  
Sotiris Kotsiantis

Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption, with machine learning systems demonstrating superhuman performance in a significant number of tasks. However, this surge in performance has often been achieved through increased model complexity, turning such systems into “black box” approaches and causing uncertainty regarding the way they operate and, ultimately, the way they come to decisions. This ambiguity has made it problematic for machine learning systems to be adopted in sensitive yet critical domains where their value could be immense, such as healthcare. As a result, scientific interest in Explainable Artificial Intelligence (XAI), a field concerned with the development of new methods that explain and interpret machine learning models, has been strongly reignited in recent years. This study focuses on machine learning interpretability methods; more specifically, it presents a literature review and taxonomy of these methods, as well as links to their programming implementations, in the hope that this survey will serve as a reference point for both theorists and practitioners.


2020 ◽  
Vol 90 ◽  
pp. 101698
Author(s):  
Yizhi Ren ◽  
Qi Zhou ◽  
Zhen Wang ◽  
Ting Wu ◽  
Guohua Wu ◽  
...  

2020 ◽  
Vol 11 (40) ◽  
pp. 8-23
Author(s):  
Pius MARTHIN ◽  
Duygu İÇEN

Online product reviews have become a valuable source of information that facilitates customer decisions about a particular product. Given the wealth of information on users' satisfaction with and experiences of a particular drug, pharmaceutical companies make use of online drug reviews to improve the quality of their products. Machine learning has enabled scientists to train more efficient models which facilitate decision making in various fields. In this manuscript we applied the drug review dataset used by Gräßer, Kallumadi, Malberg, and Zaunseder (2018), freely available from the University of California, Irvine (UCI) Machine Learning Repository, to identify the machine learning model that best predicts overall drug performance from users' reviews. Apart from several manipulations done to improve model accuracy, all procedures required for text analysis were followed, including text cleaning and the transformation of texts to a numeric format suitable for training machine learning models. Prior to modelling, we obtained overall sentiment scores for the reviews. Customers' reviews were summarised and visualised using a bar plot and a word cloud to explore the most frequent terms. Due to scalability issues, we were able to use only a sample of the dataset: 15,000 observations were randomly sampled from the 161,297-observation training set and 10,000 from the 53,766-observation testing set. Several machine learning models were trained using 10-fold cross-validation performed under stratified random sampling. The trained models include Classification and Regression Trees (CART), C5.0 classification trees, logistic regression (GLM), Multivariate Adaptive Regression Splines (MARS), support vector machines (SVM) with both radial and linear kernels, and random forests. Model selection was done through a comparison of accuracy and computational efficiency.
The SVM with a linear kernel was significantly better than the rest, with an accuracy of 83%. Using only a small portion of the dataset, we attained reasonable accuracy by applying the TF-IDF transformation and the Latent Semantic Analysis (LSA) technique to our term-document matrix (TDM).
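The winning pipeline (TF-IDF features, LSA via truncated SVD, then a linear-kernel SVM) can be sketched with scikit-learn. This is an illustrative reconstruction, not the authors' code, and the toy reviews below are invented stand-ins for the UCI drug-review data.

```python
# Sketch of the pipeline described above: TF-IDF on the term-document matrix,
# LSA (truncated SVD) for dimensionality reduction, then a linear-kernel SVM.
# The four toy reviews are illustrative stand-ins for the real dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

reviews = [
    "great drug, it relieved my pain quickly",
    "great results, this drug relieved my headache",
    "terrible drug, severe nausea and no relief",
    "terrible side effects, severe nausea all day",
]
labels = [1, 1, 0, 0]  # 1 = positive review, 0 = negative

model = make_pipeline(
    TfidfVectorizer(stop_words="english"),          # text -> TF-IDF matrix
    TruncatedSVD(n_components=2, random_state=0),   # LSA on the TF-IDF matrix
    LinearSVC(),                                    # SVM with linear kernel
)
model.fit(reviews, labels)
pred = model.predict(["great drug, relieved my headache fast"])
```

In practice one would tune `n_components` (typically hundreds of latent dimensions for tens of thousands of reviews) via the same 10-fold cross-validation used for model selection.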


2021 ◽  
Author(s):  
Alexander Kanonirov ◽  
Ksenia Balabaeva ◽  
Sergey Kovalchuk

The relevance of this study lies in improving the understanding of machine learning models. We present a method for interpreting clustering results and apply it to the modelling of clinical pathways. The method is based on statistical inference and yields a description of the clusters by determining the influence of particular features on the differences between them. Based on the proposed approach, it is possible to determine the characteristic features of each cluster. Finally, we compare the method with a Bayesian inference explanation and with the interpretation of medical experts [1].
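The general idea of characterising clusters through statistical inference can be sketched as follows. This is a generic illustration, not the authors' exact procedure: for each cluster, run a two-sample test per feature (cluster members vs. the rest) and keep the features whose distributions differ significantly; the Mann-Whitney U test here is one reasonable choice of test.

```python
# Generic sketch: find "characteristic features" of a cluster by testing,
# feature by feature, whether the cluster's values differ significantly
# from those of all other points (two-sample Mann-Whitney U test).
import numpy as np
from scipy.stats import mannwhitneyu

def characteristic_features(X, labels, cluster, alpha=0.05):
    """Return {feature_index: p_value} for features that significantly
    separate `cluster` from the remaining points."""
    in_c = X[labels == cluster]      # rows belonging to the cluster
    out_c = X[labels != cluster]     # everything else
    result = {}
    for j in range(X.shape[1]):
        _, p = mannwhitneyu(in_c[:, j], out_c[:, j])
        if p < alpha:
            result[j] = p
    return result
```

With many features, a multiple-testing correction (e.g. Bonferroni on `alpha`) would normally be applied before reporting the characteristic features.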


2018 ◽  
Vol 66 (4) ◽  
pp. 283-290 ◽  
Author(s):  
Johannes Brinkrolf ◽  
Barbara Hammer

Abstract Classification by means of machine learning models constitutes a relevant technology in process automation and predictive maintenance. However, common techniques such as deep networks or random forests suffer from their black box characteristics and possible adversarial examples. In this contribution, we give an overview of a popular alternative technology from machine learning, namely modern variants of learning vector quantization (LVQ), which, due to their combined discriminative and generative nature, offer interpretability and the possibility of explicit reject options for irregular samples. We give an explicit bound on the minimum change required to alter the classification in the case of LVQ networks with a reject option, and we demonstrate the efficiency of reject options in two examples.
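A minimal sketch of the reject mechanism described above, under the assumption of a distance-based rejection rule (the abstract does not specify the exact measure): an LVQ model classifies a sample by its nearest prototype, and rejects when the closest prototypes of two different classes are almost equally far away, i.e. the sample lies near a decision boundary. The prototypes here are given directly; in a trained LVQ model they would come from the learning procedure.

```python
# Nearest-prototype (LVQ-style) classification with a reject option based on
# the relative distance margin between the closest prototype and the closest
# prototype of any other class. Margin near 0 => boundary sample => reject.
import numpy as np

def lvq_predict_with_reject(x, prototypes, proto_labels, theta=0.2):
    """Return the label of the closest prototype, or None (reject) when the
    relative margin (d_other - d_best) / (d_other + d_best) < theta."""
    d = np.linalg.norm(prototypes - x, axis=1)   # distance to every prototype
    best = np.argmin(d)
    d_best = d[best]
    d_other = d[proto_labels != proto_labels[best]].min()
    margin = (d_other - d_best) / (d_other + d_best)  # lies in [0, 1]
    return proto_labels[best] if margin >= theta else None
```

The threshold `theta` trades off coverage against error rate: raising it rejects more boundary samples but leaves the accepted predictions more reliable.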

