Interpreting Deep Neural Networks Beyond Attribution Methods: Quantifying Global Importance of Genomic Features

Abstract: Although deep neural networks (DNNs) have found great success at improving performance on various prediction tasks in computational genomics, it remains difficult to understand why they make any given prediction. In genomics, the main approaches to interpreting a high-performing DNN are to visualize learned representations via weight visualizations and attribution methods. While these methods can be informative, each has strong limitations. For instance, attribution methods only uncover the independent contribution of single nucleotide variants in a given sequence. Here we discuss and argue for global importance analysis, which can quantify the population-level importance of putative features and their interactions learned by a DNN. We highlight recent work that has benefited from this interpretability approach and then discuss connections between global importance analysis and causality.

Predicting the impact of single nucleotide variants on splicing via sequence‐based deep neural networks and genomic features

Human Mutation ◽

10.1002/humu.23794 ◽

2019 ◽

Vol 40 (9) ◽

pp. 1261-1269 ◽

Cited By ~ 1

Author(s):

Tatsuhiko Naito

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Genomic Features ◽

The Impact

Global importance analysis: An interpretability method to quantify importance of genomic features in deep neural networks

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008925 ◽

2021 ◽

Vol 17 (5) ◽

pp. e1008925

Author(s):

Peter K. Koo ◽

Antonio Majdandzic ◽

Matthew Ploenzke ◽

Praveen Anand ◽

Steffan B. Paul

Keyword(s):

Neural Networks ◽

Protein Interactions ◽

Effect Size ◽

Deep Neural Networks ◽

Rna Binding ◽

Rna Binding Proteins ◽

Population Level ◽

Sequence Motifs ◽

Convolutional Network ◽

Importance Analysis

Deep neural networks have demonstrated improved performance at predicting the sequence specificities of DNA- and RNA-binding proteins compared to previous methods that rely on k-mers and position weight matrices. To gain insights into why a DNN makes a given prediction, model interpretability methods, such as attribution methods, can be employed to identify motif-like representations along a given sequence. Because explanations are given on an individual sequence basis and can vary substantially across sequences, deducing generalizable trends across the dataset and quantifying their effect size remains a challenge. Here we introduce global importance analysis (GIA), a model interpretability method that quantifies the population-level effect size that putative patterns have on model predictions. GIA provides an avenue to quantitatively test hypotheses of putative patterns and their interactions with other patterns, as well as map out specific functions the network has learned. As a case study, we demonstrate the utility of GIA on the computational task of predicting RNA-protein interactions from sequence. We first introduce a convolutional network, which we call ResidualBind, and benchmark its performance against previous methods on RNAcompete data. Using GIA, we then demonstrate that in addition to sequence motifs, ResidualBind learns a model that considers the number of motifs, their spacing, and sequence context, such as RNA secondary structure and GC-bias.
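
The population-level effect size described in this abstract can be illustrated with a minimal sketch. It assumes that the importance of a pattern is estimated as the average change in model output when the pattern is embedded at a fixed position in a set of background sequences. The uniform background, the function names, and `toy_model` (a hypothetical stand-in that counts exact motif matches) are assumptions made for illustration; they are not ResidualBind or the authors' implementation.

```python
import numpy as np

ALPHABET = "ACGU"


def one_hot(seq):
    """One-hot encode an RNA string into an array of shape (len(seq), 4)."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    return np.eye(len(ALPHABET))[idx]


def sample_background(n, length, rng):
    """Draw n uniformly random one-hot sequences (an assumed background)."""
    idx = rng.integers(0, len(ALPHABET), size=(n, length))
    return np.eye(len(ALPHABET))[idx]  # shape (n, length, 4)


def embed(seqs, motif, position):
    """Return a copy of seqs with the motif written in at the given position."""
    out = seqs.copy()
    out[:, position:position + len(motif), :] = one_hot(motif)
    return out


def global_importance(model, seqs, motif, position):
    """Mean change in model output caused by embedding the motif."""
    return float(np.mean(model(embed(seqs, motif, position)) - model(seqs)))


def toy_model(x, target="UGCAUG"):
    """Hypothetical model: counts exact occurrences of `target` per sequence."""
    m = one_hot(target)
    k = len(target)
    scores = np.zeros(x.shape[0])
    for i in range(x.shape[1] - k + 1):
        scores += np.sum(x[:, i:i + k, :] * m, axis=(1, 2)) == k
    return scores


rng = np.random.default_rng(0)
background = sample_background(n=200, length=40, rng=rng)
gia = global_importance(toy_model, background, "UGCAUG", position=10)
control = global_importance(toy_model, background, "AAAAAA", position=10)
print(gia, control)
```

Because the same background sequences are scored with and without the pattern, sequence-specific context averages out and the remaining difference can be read as a population-level effect. Interactions could be probed in the same way by embedding two patterns and comparing the joint effect with the sum of the individual effects.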

Global Importance Analysis: A Method to Quantify Importance of Genomic Features in Deep Neural Networks

10.1101/2020.09.08.288068 ◽

2020 ◽

Author(s):

Peter K. Koo ◽

Matthew Ploenzke ◽

Praveen Anand ◽

Steffan B. Paul ◽

Antonio Majdandzic

Keyword(s):

Neural Networks ◽

Effect Size ◽

Deep Neural Networks ◽

Rna Binding ◽

Rna Binding Proteins ◽

Sequence Motifs ◽

Single Nucleotide Variants ◽

Convolutional Network ◽

Model Predictions ◽

Importance Analysis

Abstract: Deep neural networks have demonstrated improved performance at predicting the sequence specificities of DNA- and RNA-binding proteins compared to previous methods that rely on k-mers and position weight matrices. For model interpretability, attribution methods have been employed to reveal learned patterns that resemble sequence motifs. First-order attribution methods only quantify the independent importance of single nucleotide variants in a given sequence; they do not provide the effect size of motifs (or their interactions with other patterns) on model predictions. Here we introduce global importance analysis (GIA), a new model interpretability method that quantifies the population-level effect size that putative patterns have on model predictions. GIA provides an avenue to quantitatively test hypotheses of putative patterns and their interactions with other patterns, as well as map out specific functions the network has learned. As a case study, we demonstrate the utility of GIA on the computational task of predicting RNA-protein interactions from sequence. We first introduce a new convolutional network, which we call ResidualBind, and benchmark its performance against previous methods on RNAcompete data. Using GIA, we then demonstrate that in addition to sequence motifs, ResidualBind learns a model that considers the number of motifs, their spacing, and sequence context, such as RNA secondary structure and GC-bias.

Evaluating Dropout Placements in Bayesian Regression Resnet

Journal of Artificial Intelligence and Soft Computing Research ◽

10.2478/jaiscr-2022-0005 ◽

2021 ◽

Vol 12 (1) ◽

pp. 61-73

Author(s):

Lei Shi ◽

Cosmin Copot ◽

Steve Vanlanduit

Keyword(s):

Neural Networks ◽

Model Uncertainty ◽

Coverage Probability ◽

Deep Neural Networks ◽

Bayesian Regression ◽

Great Success ◽

Network Architectures ◽

Deep Architecture ◽

Bayesian Approximation ◽

Interval Coverage

Abstract: Deep Neural Networks (DNNs) have shown great success in many fields. Various network architectures have been developed for different applications. Regardless of the complexity of the networks, DNNs do not provide model uncertainty. Bayesian Neural Networks (BNNs), on the other hand, are able to make probabilistic inferences. Among the various types of BNNs, Dropout as a Bayesian Approximation converts a Neural Network (NN) to a BNN by adding a dropout layer after each weight layer in the NN. This technique provides a simple transformation from an NN to a BNN. However, for DNNs, adding a dropout layer to each weight layer would lead to strong regularization due to the deep architecture. Previous studies [1, 2, 3] have shown that adding a dropout layer after each weight layer in a DNN is unnecessary. However, how to place dropout layers in a ResNet for regression tasks is less explored. In this work, we perform an empirical study on how different dropout placements affect the performance of a Bayesian DNN. We use a regression model modified from ResNet as the DNN and place the dropout layers at different places in the regression ResNet. Our experimental results show that it is not necessary to add a dropout layer after every weight layer in the regression ResNet to enable Bayesian inference. Placing dropout layers between the stacked blocks, i.e. Dense+Identity+Identity blocks, gives the best performance in Predictive Interval Coverage Probability (PICP). Placing a dropout layer after each stacked block gives the best performance in Root Mean Square Error (RMSE).
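
The two evaluation metrics named in this abstract can be sketched as follows. The sketch assumes that predictive uncertainty comes from T stochastic forward passes with dropout kept active, and that the predictive interval is a Gaussian interval of mean plus or minus 1.96 standard deviations. The abstract does not state how the intervals are built, so `mc_interval` and the synthetic `samples` array are assumptions for illustration only.

```python
import numpy as np


def mc_interval(samples, z=1.96):
    """Summarize stochastic forward passes of shape (T, N).

    Returns the predictive mean and an assumed Gaussian interval per input.
    """
    mean = samples.mean(axis=0)
    std = samples.std(axis=0)
    return mean, mean - z * std, mean + z * std


def picp(y_true, lower, upper):
    """Predictive Interval Coverage Probability: share of targets inside the interval."""
    y_true = np.asarray(y_true)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(np.mean(inside))


def rmse(y_true, y_pred):
    """Root Mean Square Error between targets and point predictions."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic stand-in for 100 dropout-enabled forward passes over 50 inputs.
rng = np.random.default_rng(0)
y_true = np.sin(np.linspace(0.0, 3.0, 50))
samples = y_true + rng.normal(0.0, 0.1, size=(100, 50))

mean, lower, upper = mc_interval(samples)
coverage = picp(y_true, lower, upper)
error = rmse(y_true, mean)
print(coverage, error)
```

PICP rewards well-calibrated intervals while RMSE only scores the point prediction, which is one plausible reason the two metrics can favor different dropout placements.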

Chaotic System Prediction Using Data Assimilation and Machine Learning

E3S Web of Conferences ◽

10.1051/e3sconf/202018502025 ◽

2020 ◽

Vol 185 ◽

pp. 02025

Author(s):

Guo Yanan ◽

Cao Xiaoqun ◽

Peng Kecheng

Keyword(s):

Machine Learning ◽

Numerical Simulation ◽

Neural Networks ◽

Data Assimilation ◽

Deep Neural Networks ◽

Chaotic Systems ◽

Prediction Method ◽

Great Success ◽

Simulation Methods ◽

Numerical Simulation Methods

Atmospheric systems are typically chaotic, and their chaotic nature is an important limiting factor for weather forecasting and climate prediction. So far, there have been many studies on the simulation and prediction of chaotic systems using numerical simulation methods. However, there are many intractable problems in predicting chaotic systems using numerical simulation methods, such as initial value sensitivity, error accumulation, and unreasonable parameterization of physical processes, which often lead to forecast failure. With the continuous improvement of observational techniques, data assimilation has gradually become an effective method to improve predictions based on numerical simulation. In addition, with the advent of big data and the enhancement of computing resources, machine learning has achieved great success. Studies have shown that deep neural networks are capable of mining and extracting the complex physical relationships behind large amounts of data to build very good forecasting models. Therefore, in this paper, we propose a prediction method for chaotic systems that combines deep neural networks and data assimilation. To test the effectiveness of the method, we use the model to perform forecasting experiments on the Lorenz96 model. The experimental results show that the prediction method that combines neural networks and data assimilation is very effective in predicting the state of the Lorenz96 system. However, Lorenz96 is a relatively simple model, and our next step will be to continue the experiments on more complex system models to test the effectiveness of the proposed method and to further optimize and improve it.
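
For readers unfamiliar with the test bed, the Lorenz96 system is commonly written as dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices. The sketch below integrates it with a fourth-order Runge-Kutta step to produce a reference trajectory. The dimension (40), forcing (F = 8), and step size (0.01) are conventional choices assumed here, not values taken from the paper, and the neural network and data assimilation components are not reproduced.

```python
import numpy as np


def lorenz96_rhs(x, forcing=8.0):
    """Right-hand side of Lorenz96 with cyclic boundary conditions."""
    # np.roll(x, -1)[i] = x[i+1], np.roll(x, 2)[i] = x[i-2], np.roll(x, 1)[i] = x[i-1]
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical Runge-Kutta step."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate(x0, n_steps, dt=0.01, forcing=8.0):
    """Return the trajectory as an array of shape (n_steps + 1, len(x0))."""
    traj = np.empty((n_steps + 1, x0.size))
    traj[0] = x0
    for t in range(n_steps):
        traj[t + 1] = rk4_step(traj[t], dt, forcing)
    return traj


# Perturb the unstable fixed point x_i = F slightly to trigger chaotic behavior.
x0 = np.full(40, 8.0)
x0[0] += 0.01
trajectory = simulate(x0, n_steps=500)
print(trajectory.shape)
```

A trajectory like this would typically serve as the "truth" from which noisy observations are drawn for assimilation and against which forecasts are scored.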

Improved Training of Deep Convolutional Networks via Minimum-Variance Regularized Adaptive Sampling

10.21203/rs.3.rs-983472/v1 ◽

2021 ◽

Author(s):

Alfonso Rojas-Domínguez ◽

Ivvan Valdez ◽

Manuel Ornelas-Rodríguez ◽

Martín Carpio

Keyword(s):

Neural Networks ◽

Adaptive Sampling ◽

Sampling Method ◽

Deep Neural Networks ◽

Computational Cost ◽

Stochastic Gradient Descent ◽

Experimental Comparison ◽

Great Success ◽

Convolutional Networks ◽

Training Examples

Abstract: Fostered by technological and theoretical developments, deep neural networks have achieved great success in many applications, but their training by means of mini-batch stochastic gradient descent (SGD) can be very costly due to the possibly tens of millions of parameters to be optimized and the large number of training examples that must be processed. This computational cost is exacerbated by the inefficiency of the uniform sampling method typically used by SGD to form the training mini-batches: since not all training examples are equally relevant for training, sampling these under a uniform distribution is far from optimal. A better strategy is to form the mini-batches by sampling the training examples under a distribution where the probability of being selected is proportional to the relevance of each individual example. This can be achieved through Importance Sampling (IS), which also minimizes the gradients’ variance w.r.t. the network parameters, further improving convergence. In this paper, an IS-based adaptive sampling method is studied that exploits side information to construct the required probability distribution. This method is modified to enable its application to deep neural networks, and the improved method is dubbed Regularized Adaptive Sampling (RAS). An experimental comparison (using deep convolutional networks for classification of the MNIST and CIFAR-10 datasets) of RAS against SGD and against another state-of-the-art sampling method shows that RAS achieves relative improvements in the training process without incurring significant overhead or affecting the accuracy of the networks.
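
The sampling idea in this abstract can be sketched generically: examples are drawn with probability proportional to a relevance score, and each drawn example is reweighted by 1 / (N p_i) so that the weighted mini-batch average remains an unbiased estimate of the full-dataset average. The `smoothing` parameter, which mixes in a uniform distribution, is a hypothetical illustration and is not claimed to be the regularization used by RAS; the relevance scores are made-up inputs.

```python
import numpy as np


def sampling_distribution(relevance, smoothing=0.0):
    """Turn non-negative relevance scores into selection probabilities.

    `smoothing` in [0, 1] mixes in a uniform distribution (hypothetical knob).
    """
    relevance = np.asarray(relevance, dtype=float)
    p = relevance / relevance.sum()
    uniform = np.full_like(p, 1.0 / p.size)
    return (1.0 - smoothing) * p + smoothing * uniform


def sample_minibatch(p, batch_size, rng):
    """Draw example indices under p and return importance weights 1 / (N * p_i)."""
    idx = rng.choice(p.size, size=batch_size, replace=True, p=p)
    weights = 1.0 / (p.size * p[idx])
    return idx, weights


rng = np.random.default_rng(0)
relevance = np.array([1.0, 2.0, 3.0, 4.0])       # made-up per-example scores
per_example_value = np.array([0.5, -1.0, 2.0, 0.25])  # stand-in for gradients

p = sampling_distribution(relevance, smoothing=0.1)
idx, weights = sample_minibatch(p, batch_size=8, rng=rng)

# Weighted mini-batch estimate of the full-dataset mean.
estimate = float(np.mean(weights * per_example_value[idx]))
# Exact expectation of one weighted draw, which equals the plain mean.
expected = float(np.sum(p * per_example_value / (p.size * p)))
print(estimate, expected, per_example_value.mean())
```

The reweighting is what keeps the estimator unbiased; the choice of p then only affects its variance, which is the quantity importance sampling tries to reduce.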

Viral infection and transmission in a large, well-traced outbreak caused by the SARS-CoV-2 Delta variant

10.21203/rs.3.rs-738164/v1 ◽

2021 ◽

Author(s):

Jing Lu ◽

Baisheng Li ◽

Aiping Deng ◽

Kuibiao Li ◽

Yao Hu ◽

...

Keyword(s):

Mainland China ◽

Population Level ◽

Epidemiological Data ◽

Single Nucleotide Variants ◽

Symptomatic Infection ◽

Single Nucleotide ◽

Viral Loads ◽

Rapid Spread ◽

B Lineage ◽

Local Transmission

Abstract: We report the first local transmission of the SARS-CoV-2 Delta variant in mainland China. All 167 infections could be traced back to the first index case. Daily sequential PCR testing of the quarantined subjects indicated that the viral loads of Delta infections, when they first became PCR+, were on average ~1000 times greater than those of A/B lineage infections during the initial epidemic wave in China in early 2020, suggesting potentially faster viral replication and greater infectiousness of Delta during early infection. We performed high-quality sequencing on samples from 126 individuals. Reliable epidemiological data meant that, for 111 transmission events, the donor and recipient cases were known. The estimated transmission bottleneck size was 1-3 virions, with most minor intra-host single nucleotide variants (iSNVs) failing to transmit to the recipients. However, transmission heterogeneity of SARS-CoV-2 was also observed. The transmission of minor iSNVs resulted in at least 4 of the 30 substitutions identified in the outbreak, highlighting the contribution of intra-host variants to population-level viral diversity during rapid spread. Disease control activities, such as the frequency of population testing, quarantine during pre-symptomatic infection, and the level of virus genomic surveillance, should be adjusted to account for the increasing prevalence of the Delta variant worldwide.

Medical Knowledge Graph in Chinese Using Deep Semantic Mobile Computation Based on IoT and WoT

Wireless Communications and Mobile Computing ◽

10.1155/2021/5590754 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Wanheng Liu ◽

Ling Yin ◽

Cong Wang ◽

Fulin Liu ◽

Zhiyu Ni

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

State Of The Art ◽

Medical Knowledge ◽

Disease Diagnosis ◽

Knowledge Graph ◽

Great Success ◽

Smart Healthcare ◽

Made In

This paper presents a novel approach to building a Chinese-language medical knowledge graph for smart healthcare based on IoT and WoT. The approach uses deep neural networks combined with self-attention to generate the knowledge graph, making it more convenient to perform disease diagnosis and provide treatment advice. Although recent studies on medical knowledge graphs have achieved great success, the need for a comprehensive Chinese medical knowledge graph suited to telemedicine and mobile devices has been ignored. Our approach is a working theory based on semantic mobile computing and deep learning. Experiments demonstrate that it performs well in generating various types of Chinese medical knowledge graphs, with performance similar to that of the state of the art. Its accuracy and comprehensiveness are also high and highly consistent with the predictions of the theoretical model. We find these results encouraging and expect that work on Chinese medical knowledge graphs can stimulate the development of smart healthcare.

Tumour mutations in long noncoding RNAs that enhance cell fitness

10.1101/2021.11.06.467555 ◽

2021 ◽

Author(s):

Roberta Esposito ◽

Andres Lanzos ◽

Taisia Polidori ◽

Hugo Guillen-Ramirez ◽

Bernard Merlin ◽

...

Keyword(s):

Noncoding Rnas ◽

Long Noncoding Rnas ◽

Driver Mutations ◽

Cancer Genes ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Protein Coding ◽

Genomic Features ◽

Coding Regions ◽

Lncrna Neat1

Tumour DNA contains thousands of single nucleotide variants (SNVs) in non-protein-coding regions, yet it remains unclear which are driver mutations that promote cell fitness. Amongst the most highly mutated non-coding elements are long noncoding RNAs (lncRNAs), which can promote cancer and may be targeted therapeutically. We here searched for evidence that driver mutations may act through alteration of lncRNA function. Using an integrative driver discovery algorithm, we analysed single nucleotide variants (SNVs) from 2583 primary tumours and 3527 metastases to reveal 54 candidate driver lncRNAs (FDR<0.1). Their relevance is supported by enrichment for previously-reported cancer genes and by clinical and genomic features. Using knockdown and transgene overexpression, we show that tumour SNVs in two novel lncRNAs can boost cell fitness. Researchers have noted particularly high yet unexplained mutation rates in the iconic cancer lncRNA, NEAT1. We apply in cellulo mutagenesis by CRISPR-Cas9 to identify vulnerable regions of NEAT1 where SNVs reproducibly increase cell fitness in both transformed and normal backgrounds. In particular, mutations in the 5-prime region of NEAT1 alter ribonucleoprotein assembly and boost the population of subnuclear paraspeckles. Together, this work reveals function-altering somatic lncRNA mutations as a new route to enhanced cell fitness during transformation and metastasis.

The telomere length landscape of prostate cancer

Nature Communications ◽

10.1038/s41467-021-27223-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Julie Livingstone ◽

Yu-Jia Shiah ◽

Takafumi N. Yamaguchi ◽

Lawrence E. Heisler ◽

Vincent Huang ◽

...

Keyword(s):

Prostate Cancer ◽

Telomere Length ◽

Localized Prostate Cancer ◽

Structural Variants ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Genomic Features ◽

Multi Level ◽

Telomere Lengthening

Abstract: Replicative immortality is a hallmark of cancer and can be achieved through telomere lengthening and maintenance. Although the role of telomere length in cancer has been well studied, its association with genomic features is less well known. Here, we report the telomere lengths of 392 localized prostate cancer tumours and characterize their relationship to genomic, transcriptomic and proteomic features. Shorter tumour telomere lengths are associated with elevated genomic instability, including single-nucleotide variants, indels and structural variants. Genes involved in cell proliferation and signaling are correlated with tumour telomere length at all levels of the central dogma. Telomere length is also associated with multiple clinical features of a tumour. Longer telomere lengths in non-tumour samples are associated with a lower rate of biochemical relapse. In summary, we describe the multi-level integration of telomere length, genomics, transcriptomics and proteomics in localized prostate cancer.
