Explaining Machine Learning by Bootstrapping Partial Dependence Functions and Shapley Values

Author(s):  
Thomas R. Cook ◽  
Greg Gupton ◽  
Zach Modig ◽  
Nathan M. Palmer
2021 ◽  

2021 ◽  
Vol 4 ◽  
Author(s):  
Mustafa Y. Topaloglu ◽  
Elisabeth M. Morrell ◽  
Suraj Rajendran ◽  
Umit Topaloglu

Artificial Intelligence and its subdomain, Machine Learning (ML), have shown the potential to make an unprecedented impact in healthcare. Federated Learning (FL) has been introduced to alleviate some of the limitations of ML, particularly the capability to train on larger datasets for improved performance, which is usually cumbersome for inter-institutional collaborations due to existing patient protection laws and regulations. Moreover, FL may also play a crucial role in circumventing ML’s exigent bias problem by accessing underrepresented groups’ data spanning geographically distributed locations. In this paper, we have discussed three FL challenges, namely: privacy of the model exchange, ethical perspectives, and legal considerations. Lastly, we have proposed a model that could aid in assessing data contributions in an FL implementation. In light of the expediency and adaptability of the Sørensen–Dice Coefficient over the more limited (e.g., horizontal FL) and computationally expensive Shapley values, we sought to demonstrate a new paradigm that we hope will become invaluable for sharing any profit and responsibilities that may accompany an FL endeavor.
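The Sørensen–Dice coefficient at the heart of the proposed contribution-assessment model is straightforward to compute. A minimal sketch of the set-based form; the site feature lists below are hypothetical, not from the paper:

```python
def dice_coefficient(a, b):
    """Sørensen-Dice coefficient between two sets: 2|A∩B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are conventionally identical
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical example: feature codes observed at two federated sites.
site_a = {"age", "bmi", "hba1c", "ldl"}
site_b = {"age", "bmi", "smoking"}
print(dice_coefficient(site_a, site_b))  # 2*2 / (4+3) ≈ 0.571
```

Unlike Shapley values, which require retraining or re-evaluating the model over feature coalitions, this score needs only one pass over the two sets.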


Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1784
Author(s):  
Shih-Chieh Chang ◽  
Chan-Lin Chu ◽  
Chih-Kuang Chen ◽  
Hsiang-Ning Chang ◽  
Alice M. K. Wong ◽  
...  

Prediction of post-stroke functional outcomes is crucial for allocating medical resources. In this study, a total of 577 patients were enrolled in the Post-Acute Care-Cerebrovascular Disease (PAC-CVD) program, and 77 predictors were collected at admission. The outcome was whether a patient could achieve a Barthel Index (BI) score of >60 upon discharge. Eight machine-learning (ML) methods were applied, and their results were integrated using a stacking method. The area under the curve (AUC) of the eight ML models ranged from 0.83 to 0.887, with random forest, stacking, logistic regression, and support vector machine demonstrating superior performance. The feature importance analysis indicated that the initial Berg Balance Scale (BBS-I), initial BI (BI-I), and initial Concise Chinese Aphasia Test (CCAT-I) were the top three predictors of BI scores at discharge. The partial dependence plot (PDP) and individual conditional expectation (ICE) plot indicated that the predictors’ influence on the outcome was most pronounced within specific value ranges (e.g., BBS-I < 40 and BI-I < 60). BI at discharge could thus be predicted from information collected at admission with the aid of various ML models, with the PDP and ICE plots revealing the value ranges over which the predictors were most informative.
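A partial dependence curve of the kind behind the PDP/ICE analysis can be sketched in a few lines: fix the feature of interest at each grid value in turn, average the model's predictions over the data, and plot the resulting curve. The toy model and thresholds below are illustrative stand-ins, not the study's fitted models:

```python
def partial_dependence(predict, X, j, grid):
    """Partial dependence of `predict` on feature j: for each grid value,
    force column j to that value in every row and average the predictions.
    (An ICE plot keeps the per-row curves instead of averaging them.)"""
    curve = []
    for v in grid:
        modified = [row[:j] + [v] + row[j + 1:] for row in X]
        curve.append(sum(predict(r) for r in modified) / len(modified))
    return curve

# Hypothetical toy model standing in for the trained classifier:
# predicted probability is 1 when feature 0 (BBS-I) > 40 and feature 1 (BI-I) > 60.
def toy_model(row):
    return 1.0 if row[0] > 40 and row[1] > 60 else 0.0

X = [[30, 70], [50, 80], [45, 50], [60, 90]]
grid = [20, 40, 60]
print(partial_dependence(toy_model, X, 0, grid))  # [0.0, 0.0, 0.75]
```

The flat-then-rising shape of such a curve is what makes threshold effects like "BBS-I < 40" visible in a PDP.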


2020 ◽  
Author(s):  
Xiaoyong Zhao ◽  
Ningning Wang

Abstract Background: According to the World Health Organization (WHO), infectious diseases continue to be one of the leading causes of death worldwide. Because the human core microbiota is highly diverse and subject to horizontal gene transfer (HGT), it is very challenging to determine whether a particular bacterial strain is commensal or pathogenic to humans. With the latest advances in next-generation sequencing (NGS) technology, bioinformatics tools and techniques using NGS data have increasingly been used for the diagnosis and monitoring of infectious diseases. Even when the biological background is unavailable, machine learning methods can infer the pathogenic phenotype from NGS reads, independent of databases of known organisms, and this approach is being studied intensively. However, previous methods have not considered opportunistic pathogenicity or the interpretability of black-box models, and are therefore not well suited to clinical requirements. Results: In this study, we proposed a novel interpretable machine learning approach (IMLA) to identify the pathogenicity of bacterial genomes: human pathogenic (HP), opportunistically pathogenic (OHP), or non-pathogenic (NHP). Because model interpretability is essential for healthcare applications, we then used the following model-agnostic interpretation methods to interpret the model: feature importance, accumulated local effects, and Shapley values. To our knowledge, our paper is the first attempt to infer opportunistic pathogenicity and explain the model. Conclusions: According to the simulation results, our approach IMLA can be a great addition for detecting novel pathogens. Keywords: interpretable machine learning; bacterial pathogen
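Of the interpretation methods listed, accumulated local effects (ALE) may be the least familiar. A minimal first-order ALE sketch: bin the feature, average the prediction change from each bin's lower to upper edge over the points falling in that bin, then accumulate (the linear model and bin edges below are illustrative, not from IMLA):

```python
def ale_curve(predict, X, j, edges):
    """First-order accumulated local effects (ALE) for feature j.
    Within each bin [lo, hi), average the change in prediction when
    feature j moves from lo to hi for the rows in that bin, then
    accumulate the averages into a curve over the bin edges."""
    ale, total = [0.0], 0.0
    for lo, hi in zip(edges, edges[1:]):
        in_bin = [r for r in X if lo <= r[j] < hi]
        if in_bin:
            diffs = [
                predict(r[:j] + [hi] + r[j + 1:]) - predict(r[:j] + [lo] + r[j + 1:])
                for r in in_bin
            ]
            total += sum(diffs) / len(diffs)
        ale.append(total)
    return ale

# Hypothetical linear model: the ALE curve should rise by 2 per unit of feature 0,
# regardless of the (correlated or not) values of feature 1.
predict = lambda r: 2 * r[0] + 5 * r[1]
X = [[0.5, 3], [1.5, 7], [2.5, 1]]
print(ale_curve(predict, X, 0, edges=[0, 1, 2, 3]))  # [0.0, 2.0, 4.0, 6.0]
```

Because each difference is taken only at rows that actually fall in the bin, ALE avoids the unrealistic feature combinations that plain partial dependence can average over, which matters for correlated genomic features.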



2020 ◽  
Author(s):  
Laura Marika Vowels ◽  
Matthew J Vowels ◽  
Kristen P Mark

Infidelity is a common occurrence in relationships and can have a devastating impact on both partners’ well-being. A large body of literature has attempted to identify factors that explain or predict infidelity but has been unable to estimate the relative importance of each predictor. We used a machine learning algorithm, random forest (an interpretable ensemble of decision trees able to capture highly non-linear relationships), to predict in-person and online infidelity and intentions toward future infidelity across three samples (two dyadic samples; N = 1846). We also used a game-theoretic explanation technique, Shapley values, which allowed us to estimate the effect size of each predictor variable on infidelity. The present study showed that infidelity was somewhat predictable overall, with interpersonal factors (relationship satisfaction, love, desire, relationship length) being the most predictive. The results suggest that addressing relationship difficulties early in the relationship can help prevent future infidelity.
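The game-theoretic idea behind Shapley values can be made concrete with a brute-force sketch: enumerate all feature coalitions and weight each feature's marginal contribution to a single prediction. Out-of-coalition features are replaced with baseline values, a common approximation; the additive model below is illustrative, not the study's random forest:

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction of `predict` at point x.
    v(S) is the prediction with features outside coalition S set to
    their baseline values; phi[i] is the Shapley-weighted average of
    feature i's marginal contribution v(S ∪ {i}) - v(S)."""
    n = len(x)

    def value(S):
        z = [x[i] if i in S else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (value(set(S) | {i}) - value(set(S)))
    return phi

# Hypothetical additive model: Shapley values should recover the weights.
predict = lambda z: 2 * z[0] + 3 * z[1]
print(shapley_values(predict, x=[1, 1], baseline=[0, 0]))  # [2.0, 3.0]
```

The exponential number of coalitions is why practical tools approximate this computation (e.g., by sampling), but the attributions always sum to the gap between the prediction and the baseline prediction.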


2021 ◽  
Author(s):  
Anton Georgievich Voskresenskiy ◽  
Nikita Vladimirovich Bukhanov ◽  
Maria Alexandrovna Kuntsevich ◽  
Oksana Anatolievna Popova ◽  
Alexey Sergeevich Goncharov

Abstract We propose a methodology to improve rock type classification using machine learning (ML) techniques and to reveal causal inferences between reservoir quality and well log measurements. Rock type classification is an essential step in accurate reservoir modeling and forecasting. Machine learning approaches make it possible to automate rock type classification based on different well logs and core data. To choose the best model, one that does not propagate uncertainty further into the workflow, it is important to interpret machine learning results; feature importance and feature selection methods are usually employed for that. We propose an extension to existing approaches: a model-agnostic sensitivity algorithm based on Shapley values. The paper describes a full workflow for rock type prediction using well log data, from data preparation, model building, and feature selection to causal inference analysis. We built ML models that classify rock types using well logs (sonic, gamma, density, photoelectric, and resistivity) from 21 wells as predictors and conducted a causal inference analysis between reservoir quality and well log responses using Shapley values (a concept from game theory). As a result of feature selection, we obtained predictors that are statistically significant and at the same time relevant in a causal-relation context. The macro F1-scores of the best obtained models for the two cases are 0.79 and 0.85, respectively. It was found that the ML models can infer domain knowledge, which allows us to confirm the adequacy of the built ML model for rock type prediction. Our insight was to recognize the need to properly account for the underlying causal structure between the features and rock types in order to derive meaningful and relevant predictors that carry a significant amount of information contributing to the final outcome.
We also demonstrate the robustness of the revealed patterns by applying the Shapley values methodology to a number of ML models and show consistency in the ordering of the most important predictors. Our analysis shows that machine learning classifiers achieving high accuracy tend to mimic the physical principles behind different logging tools; in particular, the longer the travel time of an acoustic wave, the higher the probability that the medium is reservoir rock, and vice versa. Likewise, lower values of natural radioactivity and rock density indicate the presence of a reservoir. The article presents a causal inference analysis of ML classification models using Shapley values on two real-world reservoirs. The rock class labels from core data are used to train a supervised machine learning algorithm to predict classes from well log responses. The aim of supervised learning is to label a small portion of a dataset and allow the algorithm to automate the rest. Such data-driven analysis may optimize well logging, coring, and core analysis programs. This algorithm can be extended to any other reservoir to improve rock type prediction. The novelty of the paper is that such analysis reveals the nature of the decisions made by the ML model and makes it possible to apply truly robust, reliable, petrophysics-consistent ML models for rock type classification.


Author(s):  
Yunsheng Chen ◽  
Dionne M Aleman ◽  
Thomas G Purdie ◽  
Chris McIntosh

Abstract The complexity of generating radiotherapy treatments demands a rigorous quality assurance (QA) process to ensure patient safety and to avoid clinically significant errors. Machine learning classifiers have been explored to augment the scope and efficiency of the traditional radiotherapy treatment planning QA process. However, one important gap in relying on classifiers for QA of radiotherapy treatment plans is the lack of understanding behind a specific classifier prediction. We develop explanation methods to understand the decisions of two automated QA classifiers: (1) a region of interest (ROI) segmentation/labeling classifier, and (2) a treatment plan acceptance classifier. For each classifier, a local interpretable model-agnostic explanation (LIME) framework and a novel adaptation of the team-based Shapley values framework are constructed. We test these methods on datasets for two radiotherapy treatment sites (prostate and breast), and demonstrate the importance of evaluating QA classifiers using interpretable machine learning approaches. We additionally develop a notion of explanation consistency to assess classifier performance. Our explanation method allows for easy visualization and human expert assessment of classifier decisions in radiotherapy QA. Notably, we find that our team-based Shapley approach is more consistent than LIME. The ability to explain and validate automated decision-making is critical in medical treatments. This analysis allows us to conclude that both QA classifiers are moderately trustworthy and can be used to confirm expert decisions, though the current QA classifiers should not be viewed as a replacement for the human QA process.
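The LIME idea the authors build on can be sketched in one dimension: sample perturbations around the instance, weight them by proximity, and fit a weighted linear surrogate whose slope serves as the local explanation. The quadratic model below is a hypothetical stand-in for a QA classifier's score, not the paper's models:

```python
import math
import random

def local_surrogate_slope(predict, x0, scale=0.5, n=200, seed=0):
    """Minimal LIME-style sketch for a 1-D input: draw samples near x0,
    weight them with a Gaussian proximity kernel, and return the slope of
    the weighted least-squares linear surrogate fitted to the model."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0, scale) for _ in range(n)]
    ys = [predict(x) for x in xs]
    ws = [math.exp(-((x - x0) ** 2) / (2 * scale ** 2)) for x in xs]
    wsum = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / wsum
    ybar = sum(w * y for w, y in zip(ws, ys)) / wsum
    num = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return num / den

# Near x0 = 2 the model x**2 behaves locally like a line of slope 2*x0 = 4,
# so the surrogate slope should come out close to 4.
print(local_surrogate_slope(lambda x: x * x, 2.0))
```

The consistency question the paper raises is visible even here: rerunning with a different seed gives a slightly different slope, whereas an exact Shapley computation on the same model is deterministic.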


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 763
Author(s):  
Rafael García-Carretero ◽  
Roberto Holgado-Cuadrado ◽  
Óscar Barquero-Pérez

Nonalcoholic fatty liver disease (NAFLD) is the hepatic manifestation of metabolic syndrome and is the most common cause of chronic liver disease in developed countries. Certain conditions, including mild inflammation biomarkers, dyslipidemia, and insulin resistance, can trigger a progression to nonalcoholic steatohepatitis (NASH), a condition characterized by inflammation and liver cell damage. We demonstrate the usefulness of machine learning with a case study analyzing the most important features in random forest (RF) models for predicting patients at risk of developing NASH. We collected data from patients who attended the Cardiovascular Risk Unit of Mostoles University Hospital (Madrid, Spain) from 2005 to 2021. We reviewed electronic health records to assess the presence of NASH, which was used as the outcome. We chose RF as the algorithm to develop six models using different pre-processing strategies. The performance metrics were evaluated to choose an optimized model. Finally, several interpretability techniques, such as feature importance, the contribution of each feature to predictions, and partial dependence plots, were used to understand and explain the model and obtain a better understanding of machine learning-based predictions. In total, 1525 patients met the inclusion criteria. The mean age was 57.3 years, and 507 patients had NASH (prevalence of 33.2%). Filter methods (the chi-square and Mann–Whitney–Wilcoxon tests) did not produce additional insight in terms of interactions, contributions, or relationships among variables and their outcomes. The random forest model correctly classified patients with NASH with an accuracy of 0.87 in the best model and 0.79 in the worst one. Four features were the most relevant: insulin resistance, ferritin, serum levels of insulin, and triglycerides. The contribution of each feature was assessed via partial dependence plots.
Random forest-based modeling demonstrated that machine learning can be used to improve interpretability, produce an understanding of the modeled behavior, and show the extent to which certain features contribute to predictions.
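Feature importance of the kind reported across these studies is often estimated model-agnostically by permutation: shuffle one feature's column and measure the resulting drop in accuracy. A sketch under illustrative data; the one-feature rule below stands in for a fitted random forest and is not the article's model:

```python
import random

def permutation_importance(predict, X, y, j, n_repeats=20, seed=0):
    """Mean drop in accuracy when column j is shuffled: a feature the
    model truly relies on produces a large drop, an ignored feature none."""
    rng = random.Random(seed)

    def accuracy(rows):
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    base = accuracy(X)
    drops = []
    for _ in range(n_repeats):
        col = [row[j] for row in X]
        rng.shuffle(col)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        drops.append(base - accuracy(shuffled))
    return sum(drops) / n_repeats

# Hypothetical rule standing in for the fitted RF: NASH predicted when
# insulin resistance (feature 0) is high; feature 1 is pure noise.
predict = lambda r: int(r[0] > 2.5)
X = [[1.0, 5], [2.0, 1], [3.0, 9], [4.0, 2]] * 25
y = [int(r[0] > 2.5) for r in X]
print(permutation_importance(predict, X, y, 0) > permutation_importance(predict, X, y, 1))  # True
```

Shuffling the informative column destroys the model's accuracy, while shuffling the noise column changes nothing, which is exactly the ranking behavior a feature importance table summarizes.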

