Discovering epistatic feature interactions from neural network models of regulatory DNA sequences

Abstract: The relationship between noncoding DNA sequence and gene expression is not well understood. Massively parallel reporter assays (MPRAs), which quantify the regulatory activity of large libraries of DNA sequences in parallel, are a powerful approach to characterize this relationship. We present MPRA-DragoNN, a convolutional neural network (CNN)-based framework to predict and interpret the regulatory activity of DNA sequences as measured by MPRAs. While our method is generally applicable to a variety of MPRA designs, here we trained our model on the Sharpr-MPRA dataset that measures the activity of ~500,000 constructs tiling 15,720 regulatory regions in human K562 and HepG2 cell lines. MPRA-DragoNN predictions were moderately correlated (Spearman ρ = 0.28) with measured activity and were within range of replicate concordance of the assay. State-of-the-art model interpretation methods revealed high-resolution predictive regulatory sequence features that overlapped transcription factor (TF) binding motifs. We used the model to investigate the cell type and chromatin state preferences of predictive TF motifs. We explored the ability of our model to predict the allelic effects of regulatory variants in an independent MPRA experiment and fine map putative functional SNPs in loci associated with lipid traits. Our results suggest that interpretable deep learning models trained on MPRA data have the potential to reveal meaningful patterns in regulatory DNA sequences and prioritize regulatory genetic variants, especially as larger, higher-quality datasets are produced.
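The Spearman ρ quoted above is a rank correlation between predicted and measured activity. A minimal sketch of how such a value can be computed follows; the function names and the toy numbers are illustrative assumptions and are not taken from MPRA-DragoNN or the Sharpr-MPRA data.

```python
def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho: the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Toy (made-up) measured and predicted activities for five constructs.
measured = [0.1, 0.4, 0.35, 0.8, 0.7]
predicted = [0.2, 0.3, 0.5, 0.9, 0.6]
rho = spearman(measured, predicted)
```

Because only ranks enter the calculation, any monotone rescaling of the predictions leaves ρ unchanged, which is one reason rank correlation is a common choice for noisy assay readouts.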

Discovering epistatic feature interactions from neural network models of regulatory DNA sequences

10.1101/302711 ◽

2018 ◽

Cited By ~ 2

Author(s):

Peyton Greenside ◽

Tyler Shimko ◽

Polly Fordyce ◽

Anshul Kundaje

Keyword(s):

Dna Sequence ◽

Dna Sequences ◽

Chromatin Accessibility ◽

Feature Interaction ◽

Core Motif ◽

Feature Interactions ◽

Binding Models ◽

Regulatory Dna Sequences ◽

Regulatory Dna

Abstract

Motivation: Transcription factors bind regulatory DNA sequences in a combinatorial manner to modulate gene expression. Deep neural networks (DNNs) can learn the cis-regulatory grammars encoded in regulatory DNA sequences associated with transcription factor binding and chromatin accessibility. Several feature attribution methods have been developed for estimating the predictive importance of individual features (nucleotides or motifs) in any input DNA sequence to its associated output prediction from a DNN model. However, these methods do not reveal higher-order feature interactions encoded by the models.

Results: We present a new method called Deep Feature Interaction Maps (DFIM) to efficiently estimate interactions between all pairs of features in any input DNA sequence. DFIM accurately identifies ground truth motif interactions embedded in simulated regulatory DNA sequences. DFIM identifies synergistic interactions between GATA1 and TAL1 motifs from in vivo TF binding models. DFIM reveals epistatic interactions involving nucleotides flanking the core motif of the Cbf1 TF in yeast from in vitro TF binding models. We also apply DFIM to regulatory sequence models of in vivo chromatin accessibility to reveal interactions between regulatory genetic variants and proximal motifs of target TFs as validated by TF binding quantitative trait loci. Our approach makes significant strides in improving the interpretability of deep learning models for genomics.

Availability: Code is available at https://github.com/kundajelab/dfim.

Contact: [email protected]
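The abstract does not spell out how DFIM scores an interaction, so the following is only a generic sketch of what a pairwise (epistatic) interaction between two sequence features means: the deviation from additivity when both features are knocked out. The scoring function `toy_score` is a hypothetical stand-in for a trained model, and the motif strings, weights, and helper names are all assumptions made for illustration.

```python
def toy_score(seq):
    """Hypothetical model output: two motifs plus a synergy bonus."""
    has_a = "GATA" in seq      # stand-in for a GATA-like motif
    has_b = "CAGCTG" in seq    # stand-in for an E-box-like motif
    return 1.0 * has_a + 0.5 * has_b + 2.0 * (has_a and has_b)


def knock_out(seq, start, length, filler="N"):
    """Replace seq[start:start+length] with an uninformative filler."""
    return seq[:start] + filler * length + seq[start + length:]


def pairwise_interaction(score, seq, feat_a, feat_b):
    """Second-order difference: wild type - single KOs + double KO.

    Each feature is a (start, length) pair. The result is zero when the
    two features contribute additively and positive when they are
    synergistic under this sign convention.
    """
    wt = score(seq)
    only_a_removed = score(knock_out(seq, *feat_a))
    only_b_removed = score(knock_out(seq, *feat_b))
    both_removed = score(knock_out(knock_out(seq, *feat_a), *feat_b))
    return wt - only_a_removed - only_b_removed + both_removed


seq = "TTGATAACCTCAGCTGTT"
motif_a = (2, 4)    # position of "GATA"
motif_b = (10, 6)   # position of "CAGCTG"
spacer = (6, 4)     # "ACCT", which the toy model ignores

synergy = pairwise_interaction(toy_score, seq, motif_a, motif_b)
control = pairwise_interaction(toy_score, seq, motif_a, spacer)
```

In this toy setting the interaction between the two motifs recovers the synergy term of `toy_score`, while the motif and spacer pair scores zero. Scoring all pairs this way costs a number of model evaluations quadratic in the number of features, which is the kind of cost an efficient method would aim to reduce.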

Deep exploration networks for rapid engineering of functional DNA sequences

10.1101/864363 ◽

2019 ◽

Cited By ~ 2

Author(s):

Johannes Linder ◽

Nicholas Bogard ◽

Alexander B. Rosenberg ◽

Georg Seelig

Keyword(s):

Neural Network ◽

Dna Sequences ◽

Network Models ◽

Regulatory Elements ◽

Differential Splicing ◽

Neural Network Models ◽

Gradient Ascent ◽

Functional Dna ◽

The Cost ◽

Deep Exploration

Engineering gene sequences with defined functional properties is a major goal of synthetic biology. Deep neural network models, together with gradient ascent-style optimization, show promise for sequence generation. The generated sequences can, however, get stuck in local minima and have low diversity, and their fitness depends heavily on initialization. Here, we develop deep exploration networks (DENs), a type of generative model tailor-made for searching a sequence space to minimize the cost of a neural network fitness predictor. By making the network compete with itself to control sequence diversity during training, we obtain generators capable of sampling hundreds of thousands of high-fitness sequences. We demonstrate the power of DENs in the context of engineering RNA isoforms, including polyadenylation and cell type-specific differential splicing. Using DENs, we engineered polyadenylation signals with more than 10-fold higher selection odds than the best gradient ascent-generated patterns and identified splice regulatory elements predicted to result in highly differential splicing between cell lines.
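To make the gradient ascent-style baseline mentioned above concrete, here is a minimal sketch, not the DEN method itself. It relaxes a sequence to per-position logits over A/C/G/T and climbs the gradient of a toy fitness predictor. The linear predictor (`weights`), step size, and function names are assumptions for illustration; a real predictor would be a trained neural network, where the local optima and initialization sensitivity described in the abstract can arise.

```python
import math

BASES = "ACGT"


def softmax(z):
    """Numerically stable softmax over one position's four logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def fitness(logits, weights):
    """Toy linear predictor: expected per-position weight under softmax."""
    return sum(
        p * w
        for z, w_row in zip(logits, weights)
        for p, w in zip(softmax(z), w_row)
    )


def gradient_ascent(weights, steps=200, lr=0.5):
    """Maximize the toy fitness with respect to the sequence logits."""
    logits = [[0.0] * 4 for _ in weights]
    for _ in range(steps):
        for z, w in zip(logits, weights):
            p = softmax(z)
            mean_w = sum(pi * wi for pi, wi in zip(p, w))
            # d(fitness)/d(z_k) = p_k * (w_k - sum_j p_j * w_j)
            for k in range(4):
                z[k] += lr * p[k] * (w[k] - mean_w)
    return logits


def decode(logits):
    """Pick the highest-scoring base at each position."""
    return "".join(BASES[max(range(4), key=lambda k: z[k])] for z in logits)


# Toy predictor that rewards G, A, T, A at positions 0..3.
weights = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
start = [[0.0] * 4 for _ in weights]
optimized = gradient_ascent(weights)
best = decode(optimized)
```

With a linear predictor the objective has a single optimum, so every run converges to the same sequence. That lack of diversity in the output is one of the limitations of plain gradient ascent that the abstract points to.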

Method of complex copper-zinc ore typification using neural network models

MINING INFORMATIONAL AND ANALYTICAL BULLETIN ◽

10.25018/0236-1493-2020-5-0-140-147 ◽

2020 ◽

Vol 5 ◽

pp. 140-147 ◽

Cited By ~ 1

Author(s):

T.N. Aleksandrova ◽

E.K. Ushakov ◽

A.V. Orlova ◽

...

Keyword(s):

Neural Network ◽

Network Models ◽

Neural Network Models ◽

Copper Zinc ◽

Complex Copper

Digital twin of equipment as a basis for the consumer in digital production

Automation. Modern Technologies ◽

10.36652/0869-4931-2020-74-9-394-402 ◽

2020 ◽

Keyword(s):

Neural Network ◽

Tool Wear ◽

Chip Formation ◽

Network Models ◽

Machining Accuracy ◽

Neural Network Models ◽

Digital Twin ◽

The Neural Network ◽

Digital Production ◽

Cyberphysical System

A series of neural network models used in the development of an aggregated digital twin of equipment as a cyber-physical system is presented. The twins of machining accuracy, chip formation, and tool wear are examined in detail. On their basis, systems for stabilizing the chip formation process during cutting and for diagnosing cutting tool wear are developed. Keywords: cyberphysical system; neural network model of equipment; big data; digital twin of chip formation; digital twin of tool wear; digital twin of nanostructured coating choice

Universal approximation with error bounds for dynamic artificial neural network models: A tutorial and some new results

2011 IEEE International Symposium on Computer-Aided Control System Design (CACSD) ◽

10.1109/cacsd.2011.6044542 ◽

2011 ◽

Cited By ~ 4

Author(s):

Kwang Ki Kevin Kim ◽

Ernesto Rios Patron ◽

Richard D. Braatz

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Error Bounds ◽

Network Models ◽

Universal Approximation ◽

Neural Network Models ◽

Artificial Neural ◽

Artificial Neural Network Models

Bridging the Analytical and Artificial Neural Network Models for Keyhole Formation with Experimental Verification in Laser-melting Deposition: A Novel Approach

Results in Physics ◽

10.1016/j.rinp.2021.104440 ◽

2021 ◽

pp. 104440

Author(s):

Muhammad Arif Mahmood ◽

Andrei C. Popescu ◽

Mihai Oane ◽

Asma Channa ◽

Sabin Mihai ◽

...

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Experimental Verification ◽

Network Models ◽

Laser Melting ◽

Neural Network Models ◽

Novel Approach ◽

Laser Melting Deposition ◽

Artificial Neural ◽

Artificial Neural Network Models

Prediction of Stress in Power Transformer Winding Conductors Using Artificial Neural Networks: Hyperparameter Analysis

Energies ◽

10.3390/en14144242 ◽

2021 ◽

Vol 14 (14) ◽

pp. 4242

Author(s):

Fausto Valencia ◽

Hugo Arcos ◽

Franklin Quilumba

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Finite Element ◽

Power Transformer ◽

Network Models ◽

Activation Function ◽

Neural Network Models ◽

Transformer Winding ◽

Artificial Neural ◽

Femm Software

The purpose of this research is to evaluate artificial neural network models for predicting the stresses in a 400 MVA power transformer winding conductor caused by the circulation of fault currents. The models were compared on the behavior of their training, validation, and test errors. Different combinations of hyperparameters were analyzed by varying architectures, optimizers, and activation functions. The data for the process was created from finite element simulations performed in the FEMM software. The design of the Artificial Neural Network was performed using the Keras framework. As a result, a model with one hidden layer was the best-suited architecture for the problem at hand, with the optimizer Adam and the activation function ReLU. The final Artificial Neural Network model predictions were compared with the Finite Element Method results, showing good agreement but with a much shorter solution time.

The Effectiveness of Ensemble-Neural Network Techniques to Predict Peak Uplift Resistance of Buried Pipes in Reinforced Sand

Applied Sciences ◽

10.3390/app11030908 ◽

2021 ◽

Vol 11 (3) ◽

pp. 908

Author(s):

Jie Zeng ◽

Panagiotis G. Asteris ◽

Anna P. Mamou ◽

Ahmed Salih Mohammed ◽

Emmanuil A. Golias ◽

...

Keyword(s):

Neural Network ◽

Numerical Models ◽

Correlation Coefficients ◽

Network Models ◽

Small Scale ◽

Offshore Platforms ◽

Neural Network Models ◽

Buried Pipes ◽

Ensemble Techniques ◽

Uplift Resistance

Buried pipes are extensively used for oil transportation from offshore platforms. Under unfavorable loading combinations, the pipe’s uplift resistance may be exceeded, which may result in excessive deformations and significant disruptions. This paper presents findings from a series of small-scale tests performed on pipes buried in geogrid-reinforced sands, with the measured peak uplift resistance being used to calibrate advanced numerical models employing neural networks. Multilayer perceptron (MLP) and Radial Basis Function (RBF) primary structure types have been used to train two neural network models, which were then further developed using bagging and boosting ensemble techniques. Correlation coefficients in excess of 0.954 between the measured and predicted peak uplift resistance have been achieved. The results show that the design of pipelines can be significantly improved using the proposed novel, reliable and robust soft computing models.
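As a rough illustration of the bagging idea mentioned above, the sketch below trains several copies of a base learner on bootstrap resamples and averages their predictions. The paper's base learners are MLP and RBF networks; a one-variable least-squares line is swapped in here purely to keep the example short, and the data, function names, and parameters are all assumptions.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (the base learner)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample with a single distinct x: predict the mean.
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit one base learner per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the predictions of all base learners."""
    return sum(slope * x + intercept for slope, intercept in models) / len(models)


# Synthetic, noise-free data (not from the paper): y = 2x + 1.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
models = bagging_fit(xs, ys)
prediction = bagging_predict(models, 20)
```

Averaging over resamples mainly reduces the variance of unstable base learners such as neural networks. Boosting, the other ensemble technique named in the abstract, instead fits learners sequentially on the errors of earlier ones and is not shown here.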
