Explaining decisions of Graph Convolutional Neural Networks: patient-specific molecular subnetworks responsible for metastasis prediction in breast cancer

Abstract

Motivation: Contemporary deep learning approaches show cutting-edge performance in a variety of complex prediction tasks. Nonetheless, the application of deep learning in healthcare remains limited because deep learning methods are often regarded as non-interpretable black-box models. Layer-wise Relevance Propagation (LRP) is a technique to explain decisions of deep learning methods. It is widely used to interpret Convolutional Neural Networks (CNNs) applied to image data. Recently, CNNs have been extended to non-Euclidean domains such as graphs. Molecular networks are commonly represented as graphs detailing interactions between molecules. Gene expression data can be assigned to the vertices of these graphs. In other words, gene expression data can be structured by using molecular network information as prior knowledge. Graph-CNNs can be applied to structured gene expression data, for example, to predict metastatic events in breast cancer. Therefore, there is a need for explanations showing which part of a molecular network is relevant for predicting an event, e.g., distant metastasis in cancer, for each individual patient.

Results: We extended the procedure of LRP to make it available for Graph-CNN and tested its applicability on a large breast cancer dataset. We present Graph Layer-wise Relevance Propagation (GLRP) as a new method to explain the decisions made by Graph-CNNs. We demonstrate a sanity check of the developed GLRP on a hand-written digits dataset and then apply the method to gene expression data. We show that GLRP provides patient-specific molecular subnetworks that largely agree with clinical knowledge and identify common as well as novel, and potentially druggable, drivers of tumor progression. As a result, this method could be highly useful for interpreting classification results at the individual patient level, for example in precision medicine approaches or a molecular tumor board.

Availability: https://gitlab.gwdg.de/UKEBpublic/graph-lrp and https://frankkramer-lab.github.io/MetaRelSubNetVis/

[email protected]


Explaining decisions of graph convolutional neural networks: patient-specific molecular subnetworks responsible for metastasis prediction in breast cancer

Genome Medicine ◽

10.1186/s13073-021-00845-7 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Hryhorii Chereda ◽

Annalen Bleckmann ◽

Kerstin Menck ◽

Júlia Perera-Bel ◽

Philip Stegmaier ◽

...

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Neural Networks ◽

Deep Learning ◽

Precision Medicine ◽

Prior Knowledge ◽

Gene Expression Data ◽

Molecular Networks ◽

Patient Specific ◽

Expression Data

Abstract

Background: Contemporary deep learning approaches show cutting-edge performance in a variety of complex prediction tasks. Nonetheless, the application of deep learning in healthcare remains limited because deep learning methods are often regarded as non-interpretable black-box models. However, the machine learning community has recently developed interpretability methods that explain the decisions of deep learning techniques for individual data points. We believe that such explanations can support personalized precision medicine decisions by explaining patient-specific predictions.

Methods: Layer-wise Relevance Propagation (LRP) is a technique to explain decisions of deep learning methods. It is widely used to interpret Convolutional Neural Networks (CNNs) applied to image data. Recently, CNNs have been extended to non-Euclidean domains such as graphs. Molecular networks are commonly represented as graphs detailing interactions between molecules. Gene expression data can be assigned to the vertices of these graphs. In other words, gene expression data can be structured by using molecular network information as prior knowledge. Graph-CNNs can be applied to structured gene expression data, for example, to predict metastatic events in breast cancer. Therefore, there is a need for explanations showing which part of a molecular network is relevant for predicting an event, e.g., distant metastasis in cancer, for each individual patient.

Results: We extended the procedure of LRP to make it available for Graph-CNN and tested its applicability on a large breast cancer dataset. We present Graph Layer-wise Relevance Propagation (GLRP) as a new method to explain the decisions made by Graph-CNNs. We demonstrate a sanity check of the developed GLRP on a hand-written digits dataset and then apply the method to gene expression data. We show that GLRP provides patient-specific molecular subnetworks that largely agree with clinical knowledge and identify common as well as novel, and potentially druggable, drivers of tumor progression.

Conclusions: The developed method could be highly useful for interpreting classification results in the context of different omics data and prior knowledge molecular networks at the individual patient level, for example in precision medicine approaches or a molecular tumor board.
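
The relevance redistribution that LRP performs can be illustrated with a minimal sketch. It assumes a tiny bias-free fully connected ReLU network with random weights and the commonly used epsilon stabilisation rule; it is not the authors' GLRP for graph convolutional layers, and the helper name lrp_dense, the layer sizes, and the weights are hypothetical choices made only for illustration.

```python
import numpy as np

def lrp_dense(a, W, R_out, eps=1e-9):
    """Epsilon-rule LRP for one bias-free dense layer (illustrative helper).

    a     : activations entering the layer, shape (n_in,)
    W     : weight matrix, shape (n_in, n_out)
    R_out : relevance assigned to the layer outputs, shape (n_out,)
    Returns the relevance assigned to the layer inputs, shape (n_in,).
    """
    z = a @ W                                   # pre-activations of the layer
    z = z + eps * np.where(z >= 0, 1.0, -1.0)   # stabiliser avoids division by zero
    s = R_out / z                               # relevance per unit of pre-activation
    return a * (W @ s)                          # each input gets its share of the contributions

# Hypothetical toy network: 4 inputs -> 3 hidden ReLU units -> 1 output.
rng = np.random.default_rng(0)
x = rng.random(4)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(3, 1))

h = np.maximum(0.0, x @ W1)   # hidden activations
y = h @ W2                    # network output, used as the starting relevance

R_h = lrp_dense(h, W2, y)     # relevance of the hidden units
R_x = lrp_dense(x, W1, R_h)   # relevance of the input features
```

In this bias-free setting the input relevances sum (up to the stabiliser) to the network output, which is the conservation property that makes per-feature relevance scores interpretable as shares of the prediction.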


A convolutional neural network for predicting transcriptional regulators of genes in Arabidopsis transcriptome data reveals classification based on positive regulatory interactions

10.1101/618926 ◽

2019 ◽

Cited By ~ 3

Author(s):

Dan MacLean

Keyword(s):

Neural Network ◽

Gene Expression ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Convolutional Neural Networks ◽

Expression Profiles ◽

Biological Data ◽

Expression Data ◽

Data Set

Abstract: Gene regulatory networks that control gene expression are widely studied, yet the interactions that make them up are difficult to predict from high-throughput data. Deep learning methods such as convolutional neural networks can perform surprisingly good classifications on a variety of data types, and the matrix-like gene expression profiles would seem to be ideal input data for deep learning approaches. In this short study I compiled training sets of expression data using the Arabidopsis AtGenExpress global stress expression data set and known transcription factor-target interactions from the Arabidopsis PLACE database. I built and optimised convolutional neural networks, with the best model providing 95% classification accuracy on a held-out validation set. Investigation of the activations within this model revealed that classification was based on positive correlation of expression profiles in short sections. This result shows that a convolutional neural network can be used to make classifications and reveal the basis of those classifications for gene expression data sets, indicating that a convolutional neural network is a useful and interpretable tool for exploratory classification of biological data. The final model is available for download and as a web application.


A review on convolutional neural network based deep learning methods in gene expression data for disease diagnosis

Materials Today Proceedings ◽

10.1016/j.matpr.2020.10.263 ◽

2020 ◽

Author(s):

C. Gunavathi ◽

K. Sivasubramanian ◽

P. Keerthika ◽

C. Paramasivam

Keyword(s):

Neural Network ◽

Gene Expression ◽

Deep Learning ◽

Convolutional Neural Network ◽

Gene Expression Data ◽

Disease Diagnosis ◽

Expression Data ◽

Learning Methods


A Review on Recent Progress in Machine Learning and Deep Learning Methods for Cancer Classification on Gene Expression Data

Processes ◽

10.3390/pr9081466 ◽

2021 ◽

Vol 9 (8) ◽

pp. 1466

Author(s):

Aina Umairah Mazlan ◽

Noor Azida Sahabudin ◽

Muhammad Akmal Remli ◽

Nor Syahidatul Nadiah Ismail ◽

Mohd Saberi Mohamad ◽

...

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Deep Learning ◽

Gene Expression Data ◽

Recent Progress ◽

Cancer Classification ◽

Expression Data ◽

Classification Methods ◽

Healthcare Applications ◽

Learning Methods

Data-driven models with predictive ability are important in medicine and healthcare. However, the most challenging task in predictive modeling is constructing the prediction model itself, which can be addressed using machine learning (ML) methods. These methods learn and train a model from a gene expression dataset without being explicitly programmed. Due to the vast amount of gene expression data, this task becomes complex and time consuming. This paper reviews recent progress in ML and deep learning (DL) for cancer classification, which has received increasing attention in bioinformatics and computational biology. The review focuses mainly on the development of cancer classification methods based on ML and DL. Although many methods have been applied to the cancer classification problem, recent progress shows that most of the successful techniques are those based on supervised and DL methods. In addition, the sources of the healthcare datasets are described. The development of many machine learning methods for insight analysis in cancer classification has brought considerable improvement in healthcare. Currently, there appears to be high demand for further development of efficient classification methods to address the expansion of healthcare applications.


Transfer learning with convolutional neural networks for cancer survival prediction using gene-expression data

PLoS ONE ◽

10.1371/journal.pone.0230536 ◽

2020 ◽

Vol 15 (3) ◽

pp. e0230536

Author(s):

Guillermo López-García ◽

José M. Jerez ◽

Leonardo Franco ◽

Francisco J. Veredas

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Transfer Learning ◽

Convolutional Neural Networks ◽

Gene Expression Data ◽

Cancer Survival ◽

Survival Prediction ◽

Expression Data


A novel method for classification of tabular data using convolutional neural networks

10.1101/2020.05.02.074203 ◽

2020 ◽

Cited By ~ 1

Author(s):

Ljubomir Buturović ◽

Dejan Miljković

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Gene Expression Data ◽

Viral Infections ◽

Expression Data ◽

Tabular Data ◽

Fixed Base ◽

Novel Method

Abstract: Convolutional neural networks (CNNs) represent a major breakthrough in image classification. However, there has not been similar progress in applying CNNs, or neural networks of any kind, to classification of tabular data. We developed and evaluated a novel method, TAbular Convolution (TAC), for classification of such data using CNNs by transforming tabular data to images and then classifying the images using CNNs. The transformation is performed by treating each row of tabular data (i.e., vector of features) as an image filter (kernel), and applying the filter to a fixed base image. A CNN is then trained to classify the filtered images. We applied TAC to classification of gene expression data derived from blood samples of patients with bacterial or viral infections. Our results demonstrate that off-the-shelf ResNet can classify the gene expression data as accurately as the current non-CNN state-of-the-art classifiers.
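
The row-to-image transformation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' TAC implementation: the feature vector is assumed to have a square length so it can be reshaped into a k x k kernel, the base image is random, the filter is applied as an unpadded cross-correlation, and the function name row_to_image is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def row_to_image(row, base_image, k):
    """Turn one tabular row into an image by using it as a k x k filter.

    Assumption: len(row) == k * k, so the row reshapes into a square kernel.
    The kernel is slid over the fixed base image without padding.
    """
    kernel = np.asarray(row, dtype=float).reshape(k, k)
    # windows[i, j] is the k x k patch of base_image whose top-left corner is (i, j)
    windows = sliding_window_view(base_image, (k, k))
    # Weighted sum of every patch with the kernel (cross-correlation)
    return np.einsum("ijkl,kl->ij", windows, kernel)

# Hypothetical data: one 9-feature sample and a fixed 8 x 8 base image.
rng = np.random.default_rng(0)
base_image = rng.random((8, 8))
row = rng.normal(size=9)

filtered = row_to_image(row, base_image, k=3)   # 6 x 6 image for this sample
```

Because the base image is fixed, differences between filtered images come only from the feature values, so an image classifier trained on them is in effect classifying the tabular rows.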


Recursive Convolutional Neural Networks for Epigenomics

10.1101/2020.04.02.021519 ◽

2020 ◽

Author(s):

Aikaterini Symeonidi ◽

Anguelos Nicolaou ◽

Frank Johannes ◽

Vincent Christlein

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Deep Learning ◽

Functional Genomics ◽

Convolutional Neural Networks ◽

Histone Modifications ◽

State Of The Art ◽

Learning Methods ◽

Arbitrary Size

Abstract: Deep learning methods have proved to be powerful classification tools in the fields of structural and functional genomics. In this paper, we introduce a Recursive Convolutional Neural Network (RCNN) for the analysis of epigenomic data. We focus on the task of predicting gene expression from the intensity of histone modifications. The proposed RCNN architecture can be applied to data of arbitrary size and has a single meta-parameter that quantifies the model's capacity, making it flexible for experimentation. The proposed architecture outperforms state-of-the-art systems while having several orders of magnitude fewer parameters.


Gene Expression Data Based Deep Learning Model for Accurate Prediction of Drug-Induced Liver Injury in Advance

Journal of Chemical Information and Modeling ◽

10.1021/acs.jcim.9b00143 ◽

2019 ◽

Vol 59 (7) ◽

pp. 3240-3250 ◽

Cited By ~ 3

Author(s):

Chunlai Feng ◽

Hengwei Chen ◽

Xianqin Yuan ◽

Mengqiu Sun ◽

Kexin Chu ◽

...

Keyword(s):

Gene Expression ◽

Deep Learning ◽

Liver Injury ◽

Gene Expression Data ◽

Learning Model ◽

Accurate Prediction ◽

Expression Data ◽

Drug Induced ◽

Drug Induced Liver Injury ◽

Deep Learning Model


Imaging Biomarkers and Gene Expression Data Correlation Framework for Lung Cancer Radiogenomics Analysis Based on Deep Learning

IEEE Access ◽

10.1109/access.2021.3071466 ◽

2021 ◽

pp. 1-1

Author(s):

Dong Sui ◽

Maozu Guo ◽

Xiaoxuan Ma ◽

Julian Baptiste ◽

Lei Zhang

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Deep Learning ◽

Gene Expression Data ◽

Imaging Biomarkers ◽

Expression Data ◽

Data Correlation


Immune microenvironment and intrinsic subtyping in hormone receptor-positive/HER2-negative breast cancer

npj Breast Cancer ◽

10.1038/s41523-021-00223-x ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Gaia Griguolo ◽

Maria Vittoria Dieci ◽

Laia Paré ◽

Federica Miglietta ◽

Daniele Giulio Generali ◽

...

Keyword(s):

Breast Cancer ◽

Gene Expression ◽

Gene Expression Data ◽

Hormone Receptor ◽

Tumor Biology ◽

Expression Data ◽

Immune Microenvironment ◽

Immune Infiltrate ◽

Hormone Receptor Positive ◽

Her2 Breast Cancer

Abstract: Little is known regarding the interaction between immune microenvironment and tumor biology in hormone receptor (HR)+/HER2− breast cancer (BC). Here we assess pretreatment gene-expression data from 66 HR+/HER2− early BCs from the LETLOB trial and show that non-luminal tumors (HER2-enriched, Basal-like) present higher tumor-infiltrating lymphocyte levels than luminal tumors. Moreover, significant differences in immune infiltrate composition, assessed by CIBERSORT, were observed: non-luminal tumors showed a more proinflammatory antitumor immune infiltrate composition than luminal ones.
