Interpretable Machine Learning: Shapley Values (Seminar Slides)

2020 ◽  
Author(s):  
Marcos López de Prado
2020 ◽  
Author(s):  
Xiaoyong Zhao ◽  
Ningning Wang

Abstract Background: According to the World Health Organization (WHO), infectious diseases continue to be one of the leading causes of death worldwide. Because the core human microbiota is highly diverse and subject to horizontal gene transfer (HGT), it is very challenging to determine whether a particular bacterial strain is commensal or pathogenic to humans. With the latest advances in next-generation sequencing (NGS) technology, bioinformatics tools and techniques using NGS data have increasingly been used for the diagnosis and monitoring of infectious diseases. Even when the biological background is not available, machine learning methods can still infer the pathogenic phenotype from NGS reads, independently of databases of known organisms, and such methods are being studied intensively. However, previous methods have not considered opportunistic pathogenicity or the interpretability of black-box models, and so are not well suited to clinical requirements. Results: In this study, we propose a novel interpretable machine learning approach (IMLA) that classifies bacterial genomes as human pathogenic (HP), opportunistic pathogenic (OHP), or non-pathogenic (NHP), and then interprets the model with the following model-agnostic methods: feature importance, accumulated local effects, and Shapley values. We do so because model interpretability is essential for healthcare applications. To our knowledge, our paper is the first attempt to infer opportunistic pathogenicity and explain the model. Conclusions: According to the simulation results, IMLA can be a valuable addition to the detection of novel pathogens. Keywords: interpretable; machine learning; bacterial pathogen


2019 ◽  
Vol 63 (1) ◽  
pp. 68-77 ◽  
Author(s):  
Mengnan Du ◽  
Ninghao Liu ◽  
Xia Hu

2021 ◽  
Vol 428 ◽  
pp. 110074
Author(s):  
Rem-Sophia Mouradi ◽  
Cédric Goeury ◽  
Olivier Thual ◽  
Fabrice Zaoui ◽  
Pablo Tassi

2019 ◽  
Vol 333 ◽  
pp. 273-283 ◽  
Author(s):  
Yawen Li ◽  
Liu Yang ◽  
Bohan Yang ◽  
Ning Wang ◽  
Tian Wu

2021 ◽  
Author(s):  
Spiridon Kasapis ◽  
Lulu Zhao ◽  
Yang Chen ◽  
Xiantong Wang ◽  
Monica Bobra ◽  
...  
