Representation learning of genomic sequence motifs with convolutional neural networks

2019 · Vol 15 (12) · pp. e1007560
Author(s): Peter K. Koo, Sean R. Eddy

Abstract: Although convolutional neural networks (CNNs) have been applied to a variety of computational genomics problems, there remains a large gap in our understanding of how they build representations of regulatory genomic sequences. Here we perform systematic experiments on synthetic sequences to reveal how CNN architecture, specifically convolutional filter size and max-pooling, influences the extent to which sequence motif representations are learned by first-layer filters. We find that CNNs designed to foster hierarchical representation learning of sequence motifs, assembling partial features into whole features in deeper layers, tend to learn distributed representations, i.e. partial motifs. On the other hand, CNNs designed to limit the ability to hierarchically build sequence motif representations in deeper layers tend to learn more interpretable localist representations, i.e. whole motifs. We then validate that this representation-learning principle, established on synthetic sequences, generalizes to in vivo sequences.
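The architectural principle described above can be made concrete with a short sketch. The PyTorch code below (filter counts, filter sizes, and pool sizes are illustrative assumptions, not the paper's exact configuration) contrasts a small first-layer max-pool, which leaves deeper layers room to assemble partial motifs, with an aggressive max-pool, which pushes whole-motif learning into the first-layer filters:

```python
# Minimal sketch contrasting two first-layer designs for one-hot DNA input
# of shape (batch, 4, length). Hyperparameters are assumptions for illustration.
import torch
import torch.nn as nn

def make_cnn(pool_size: int, num_classes: int = 12) -> nn.Sequential:
    """First-layer conv + max-pool; pool_size controls how much positional
    information deeper layers can use to assemble partial motifs."""
    return nn.Sequential(
        nn.Conv1d(4, 30, kernel_size=19, padding=9),  # first-layer motif filters
        nn.ReLU(),
        nn.MaxPool1d(pool_size),   # small pool -> distributed; large pool -> localist
        nn.Conv1d(30, 128, kernel_size=5, padding=2),
        nn.ReLU(),
        nn.AdaptiveMaxPool1d(1),
        nn.Flatten(),
        nn.Linear(128, num_classes),
    )

# Small pooling preserves fine positional detail, so deeper layers can stitch
# partial motifs together; first-layer filters tend to learn partial motifs.
distributed_net = make_cnn(pool_size=2)

# Aggressive pooling destroys the positional detail needed for hierarchical
# assembly, pushing whole-motif representations into the first layer.
localist_net = make_cnn(pool_size=50)

x = torch.randn(8, 4, 200)       # stand-in for one-hot encoded sequences
print(distributed_net(x).shape)  # torch.Size([8, 12])
print(localist_net(x).shape)     # torch.Size([8, 12])
```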


2020 · Vol 10 (3) · pp. 955
Author(s): Taejun Kim, Han-joon Kim

Researchers frequently use visualizations such as scatter plots when trying to understand how random variables are related to each other, because a single image can represent numerous pieces of information. Dependency measures have been widely used to detect dependencies automatically, but they capture only a few properties of a relationship, such as the strength and direction of the dependency. Based on advances in the application of deep learning to vision, we believe that convolutional neural networks (CNNs) can learn to recognize dependencies by analyzing visualizations, as humans do. In this paper, we propose a method that uses CNNs to extract dependency representations from 2D histograms. We carried out three sets of experiments and found that CNNs can learn from such visual representations. First, using a synthetic dataset, we show that CNNs can perfectly classify eight types of dependency. Second, we show that CNNs can predict correlations from 2D histograms of real datasets, and we visualize the learned dependency representation space. Finally, we apply our method to feature generation and demonstrate that it outperforms the AutoLearn algorithm in average classification accuracy while generating half as many features.
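As a rough illustration of the pipeline this abstract describes, the sketch below renders a pair of variables as a 2D histogram and feeds it to a small CNN with eight outputs, one per dependency type. The bin count, network shape, and the rank-normalization step are all assumptions, not the authors' exact method:

```python
# Minimal sketch: variable pair -> 2D histogram image -> CNN classifier.
import numpy as np
import torch
import torch.nn as nn

def pair_to_histogram(x: np.ndarray, y: np.ndarray, bins: int = 32) -> torch.Tensor:
    """Rank-normalize both variables to [0, 1], then bin into a bins x bins image."""
    rx = np.argsort(np.argsort(x)) / (len(x) - 1)
    ry = np.argsort(np.argsort(y)) / (len(y) - 1)
    hist, _, _ = np.histogram2d(rx, ry, bins=bins, range=[[0, 1], [0, 1]])
    hist = hist / hist.max()                             # scale counts to [0, 1]
    return torch.from_numpy(hist).float().unsqueeze(0)   # (1, bins, bins)

# Small image classifier over 1-channel histograms; the 8 output classes mirror
# the eight dependency types in the abstract (exact classes are assumptions).
classifier = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 8),
)

x = np.random.randn(1000)
y = x ** 2 + 0.1 * np.random.randn(1000)                 # a quadratic dependency
logits = classifier(pair_to_histogram(x, y).unsqueeze(0))  # batch of one
print(logits.shape)  # torch.Size([1, 8])
```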


2016 · Vol 127 · pp. 248-257
Author(s): John Arevalo, Fabio A. González, Raúl Ramos-Pollán, Jose L. Oliveira, Miguel Angel Guevara Lopez

2020 · Vol 30 (3) · pp. 145-160
Author(s): Wang Gao, Yuan Fang, Fan Zhang, Zhifeng Yang

Author(s): Peter K. Koo, Matt Ploenzke

Abstract: Deep convolutional neural networks (CNNs) trained on regulatory genomic sequences tend to build representations in a distributed manner, making it a challenge to extract learned features that are biologically meaningful, such as sequence motifs. Here we perform a comprehensive analysis on synthetic sequences to investigate the role that CNN activations play in model interpretability. We show that employing an exponential activation in first-layer filters consistently leads to interpretable and robust representations of motifs compared to other commonly used activations. Strikingly, we demonstrate that CNNs with better test performance do not necessarily yield more interpretable representations with attribution methods. We find that CNNs with exponential activations significantly improve the efficacy of recovering biologically meaningful representations with attribution methods. We demonstrate that these results generalize to real DNA sequences across several in vivo datasets. Together, this work demonstrates how a small modification to existing CNNs, i.e. using exponential activations in the first layer, can significantly improve the robustness and interpretability of learned representations, both directly in convolutional filters and indirectly with attribution methods.
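The modification this abstract highlights is small enough to show directly. In the minimal PyTorch sketch below, only the first-layer exponential activation reflects the paper's proposal; the surrounding architecture and hyperparameters are illustrative assumptions:

```python
# Minimal sketch: exponential activation on first-layer filters, ReLU elsewhere.
import torch
import torch.nn as nn

class ExpActivation(nn.Module):
    """Exponential first-layer activation; amplifies strong motif matches and
    suppresses weak partial matches relative to ReLU."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.exp(x)

class MotifCNN(nn.Module):
    def __init__(self, num_classes: int = 1):
        super().__init__()
        self.first = nn.Sequential(
            nn.Conv1d(4, 32, kernel_size=19, padding=9),
            ExpActivation(),   # the swap: exp instead of ReLU in layer 1
            nn.MaxPool1d(25),
        )
        self.rest = nn.Sequential(
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),         # deeper layers keep a standard activation
            nn.AdaptiveMaxPool1d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.rest(self.first(x))

model = MotifCNN()
x = torch.randn(8, 4, 200)  # stand-in for one-hot encoded DNA
print(model(x).shape)       # torch.Size([8, 1])
```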

