scholarly journals Multi-resBind: a residual network-based multi-label classifier for in vivo RNA binding prediction and preference visualization

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shitao Zhao ◽  
Michiaki Hamada

Abstract Background Protein-RNA interactions play key roles in many processes regulating gene expression. To understand the underlying binding preference, ultraviolet cross-linking and immunoprecipitation (CLIP)-based methods have been used to identify the binding sites for hundreds of RNA-binding proteins (RBPs) in vivo. Using these large-scale experimental data to infer RNA binding preference and predict missing binding sites has become a great challenge. Some existing deep-learning models have demonstrated high prediction accuracy for individual RBPs. However, it remains difficult to avoid significant bias due to the experimental protocol. The DeepRiPe method was recently developed to solve this problem via introducing multi-task or multi-label learning into this field. However, this method has not reached an ideal level of prediction power due to the weak neural network architecture. Results Compared to the DeepRiPe approach, our Multi-resBind method demonstrated substantial improvements using the same large-scale PAR-CLIP dataset with respect to an increase in the area under the receiver operating characteristic curve and average precision. We conducted extensive experiments to evaluate the impact of various types of input data on the final prediction accuracy. The same approach was used to evaluate the effect of loss functions. Finally, a modified integrated gradient was employed to generate attribution maps. The patterns disentangled from relative contributions according to context offer biological insights into the underlying mechanism of protein-RNA interactions. Conclusions Here, we propose Multi-resBind as a new multi-label deep-learning approach to infer protein-RNA binding preferences and predict novel interactions. The results clearly demonstrate that Multi-resBind is a promising tool to predict unknown binding sites in vivo and gain biology insights into why the neural network makes a given prediction.

2018 ◽  
Author(s):  
Kaiming Zhang ◽  
Xiaoyong Pan ◽  
Yang Yang ◽  
Hong-Bin Shen

AbstractCircular RNAs (circRNAs), with their crucial roles in gene regulation and disease development, have become a rising star in the RNA world. A lot of previous wet-lab studies focused on the interaction mechanisms between circRNAs and RNA-binding proteins (RBPs), as the knowledge of circRNA-RBP association is very important for understanding functions of circRNAs. Recently, the abundant CLIP-Seq experimental data has made the large-scale identification and analysis of circRNA-RBP interactions possible, while no computational tool based on machine learning has been developed yet.We present a new deep learning-based method, CRIP (CircRNAs Interact with Proteins), for the prediction of RBP binding sites on circRNAs, using only the RNA sequences. In order to fully exploit the sequence information, we propose a stacked codon-based encoding scheme and a hybrid deep learning architecture, in which a convolutional neural network (CNN) learns high-level abstract features and a recurrent neural network (RNN) learns long dependency in the sequences. We construct 37 datasets including sequence fragments of binding sites on circRNAs, and each set corresponds to one RBP. The experimental results show that the new encoding scheme is superior to the existing feature representation methods for RNA sequences, and the hybrid network outperforms conventional classifiers by a large margin, where both the CNN and RNN components contribute to the performance improvement. To the best of our knowledge, CRIP is the first machine learning-based tool specialized in the prediction of circRNA-RBP interactions, which is expected to play an important role for large-scale function analysis of circRNAs.


2017 ◽  
Author(s):  
Žiga Avsec ◽  
Mohammadamin Barekatain ◽  
Jun Cheng ◽  
Julien Gagneur

AbstractMotivationRegulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as transcription start site, exon boundaries, or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength to learn complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed.ResultsHere we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances. Modeling distances to various genomic landmarks with spline transformations significantly increased state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 114 out of 123 proteins. We also developed a deep neural network for human splice branchpoint based on spline transformations that outperformed the current best, already distance-based, machine learning model. Compared to piecewise linear transformation, as obtained by composition of rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. As spline transformation can be applied to further quantities beyond distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox.AvailabilitySpline transformation is implemented as a Keras layer in the CONCISE python package: https://github.com/gagneurlab/concise. Analysis code is available at goo.gl/[email protected]; [email protected]


2018 ◽  
Author(s):  
Ilan Ben-Bassat ◽  
Benny Chor ◽  
Yaron Orenstein

AbstractMotivationThe complexes formed by binding of proteins to RNAs play key roles in many biological processes, such as splicing, gene expression regulation, translation, and viral replication. Understanding protein-RNA binding may thus provide important insights to the functionality and dynamics of many cellular processes. This has sparked substantial interest in exploring protein-RNA binding experimentally, and predicting it computationally. The key computational challenge is to efficiently and accurately infer RNA-binding models that will enable prediction of novel protein-RNA interactions to additional transcripts of interest.ResultsWe developed DLPRB, a new deep neural network (DNN) approach for learning protein-RNA binding preferences and predicting novel interactions. We present two different network architectures: a convolutional neural network (CNN), and a recurrent neural network (RNN). The novelty of our network hinges upon two key aspects: (i) the joint analysis of both RNA sequence and structure, which is represented as a probability vector of different RNA structural contexts; (ii) novel features in the architecture of the networks, such as the application of RNNs to RNA-binding prediction, and the combination of hundreds of variable-length filters in the CNN. Our results in inferring accurate RNA-binding models from high-throughput in vitro data exhibit substantial improvements, compared to all previous approaches for protein-RNA binding prediction (both DNN and non-DNN based). A highly significant improvement is achieved for in vitro binding prediction, and a more modest, yet statistically significant,improvement for in vivo binding prediction. When incorporating experimentally-measured RNA structure compared to predicted one, the improvement on in vivo data increases. By visualizing the binding specificities, we can gain novel biological insights underlying the mechanism of protein RNA-binding.AvailabilityThe source code is publicly available at https://github.com/ilanbb/[email protected] informationSupplementary data are available at Bioinformatics online.


2016 ◽  
Author(s):  
Xiaoyong Pan ◽  
Hong-Bin Shen

AbstractBackgroundRNAs play key roles in cells through the interactions with proteins known as the RNA-binding proteins (RBP) and their binding motifs enable crucial understanding of the post-transcriptional regulation of RNAs. How the RBPs correctly recognize the target RNAs and why they bind specific positions is still far from clear. Machine learning-based algorithms are widely acknowledged to be capable of speeding up this process. Although many automatic tools have been developed to predict the RNA-protein binding sites from the rapidly growing multi-resource data, e.g. sequence, structure, their domain specific features and formats have posed significant computational challenges. One of current difficulties is that the cross-source shared common knowledge is at a higher abstraction level beyond the observed data, resulting in a low efficiency of direct integration of observed data across domains. The other difficulty is how to interpret the prediction results. Existing approaches tend to terminate after outputting the potential discrete binding sites on the sequences, but how to assemble them into the meaningful binding motifs is a topic worth of further investigation.ResultsIn viewing of these challenges, we propose a deep learning-based framework (iDeep) by using a novel hybrid convolutional neural network and deep belief network to predict the RBP interaction sites and motifs on RNAs. This new protocol is featured by transforming the original observed data into a high-level abstraction feature space using multiple layers of learning blocks, where the shared representations across different domains are integrated. To validate our iDeep method, we performed experiments on 31 large-scale CLIP-seq datasets, and our results show that by integrating multiple sources of data, the average AUC can be improved by 8% compared to the best single-source-based predictor; and through cross-domain knowledge integration at an abstraction level, it outperforms the state-of-the-art predictors by 6%. Besides the overall enhanced prediction performance, the convolutional neural network module embedded in iDeep is also able to automatically capture the interpretable binding motifs for RBPs. Large-scale experiments demonstrate that these mined binding motifs agree well with the experimentally verified results, suggesting iDeep is a promising approach in the real-world applications.ConclusionThe iDeep framework not only can achieve promising performance than the state-of-the-art predictors, but also easily capture interpretable binding motifs. iDeep is available at http://www.csbio.sjtu.edu.cn/bioinf/iDeep


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 414 ◽  
Author(s):  
Nagaraj Prabhu ◽  
Desmond Loy Jia Jun ◽  
Putu Dananjaya ◽  
Wen Lew ◽  
Eng Toh ◽  
...  

In this work, we explore the use of the resistive random access memory (RRAM) device as a synapse for mimicking the trained weights linking neurons in a deep learning neural network (DNN) (AlexNet). The RRAM devices were fabricated in-house and subjected to 1000 bipolar read-write cycles to measure the resistances recorded for Logic-0 and Logic-1 (we demonstrate the feasibility of achieving eight discrete resistance states in the same device depending on the RESET stop voltage). DNN simulations have been performed to compare the relative error between the output of AlexNet Layer 1 (Convolution) implemented with the standard backpropagation (BP) algorithm trained weights versus the weights that are encoded using the measured resistance distributions from RRAM. The IMAGENET dataset is used for classification purpose here. We focus only on the Layer 1 weights in the AlexNet framework with 11 × 11 × 96 filters values coded into a binary floating point and substituted with the RRAM resistance values corresponding to Logic-0 and Logic-1. The impact of variability in the resistance states of RRAM for the low and high resistance states on the accuracy of image classification is studied by formulating a look-up table (LUT) for the RRAM (from measured I-V data) and comparing the convolution computation output of AlexNet Layer 1 with the standard outputs from the BP-based pre-trained weights. This is one of the first studies dedicated to exploring the impact of RRAM device resistance variability on the prediction accuracy of a convolutional neural network (CNN) on an AlexNet platform through a framework that requires limited actual device switching test data.


2021 ◽  
Author(s):  
Neel Patel ◽  
Haimeng Bai ◽  
William Bush

A large proportion of non-coding variants are present within binding sites of transcription factors(TFs), which play a significant role in gene regulation. Thus, deriving the impact of non-coding variants on TF binding is the first step towards unravelling their regulatory roles within their associated disease traits. Most of the modern algorithms used for this purpose are based on convolutional neural network(CNN) architectures. However, these models are incapable of capturing the positional effect of different sub-sequences within the TF binding sites on the binding affinity. In this paper, we utilize the attentive gated neural network(AGNet) architecture to build a set of TF-AGNet models for predicting in vivo TF binding intensities in the GM12878 lymphoblastoid cells. These models have novel layers capable of deriving the impact of relative positions of different DNA sub-sequences, within a binding site, on TF binding affinity, and of extracting the most relevant prediction features. We show that the TF-AGNet models are able to outperform conventional CNNs for predicting continuous values of TF binding affinity. We also train additional TF-AGNet models for 20 TFs using data from 4 other cell-lines to assess the generalizability of their prediction accuracy. Lastly, we show that the TF-AGNet based models more accurately classify non-coding variants that significantly affect TF binding compared to models based on 7 variant annotation tools. This accuracy can be leveraged to derive gene regulatory roles of millions of non-coding variants across the genome to further examine their mechanistic associations with complex disease traits.


Cell Reports ◽  
2013 ◽  
Vol 3 (2) ◽  
pp. 301-308 ◽  
Author(s):  
Fadia Ibrahim ◽  
Manolis Maragkakis ◽  
Panagiotis Alexiou ◽  
Margaret A. Maronski ◽  
Marc A. Dichter ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (15) ◽  
pp. 4953
Author(s):  
Sara Al-Emadi ◽  
Abdulla Al-Ali ◽  
Abdulaziz Al-Ali

Drones are becoming increasingly popular not only for recreational purposes but in day-to-day applications in engineering, medicine, logistics, security and others. In addition to their useful applications, an alarming concern in regard to the physical infrastructure security, safety and privacy has arisen due to the potential of their use in malicious activities. To address this problem, we propose a novel solution that automates the drone detection and identification processes using a drone’s acoustic features with different deep learning algorithms. However, the lack of acoustic drone datasets hinders the ability to implement an effective solution. In this paper, we aim to fill this gap by introducing a hybrid drone acoustic dataset composed of recorded drone audio clips and artificially generated drone audio samples using a state-of-the-art deep learning technique known as the Generative Adversarial Network. Furthermore, we examine the effectiveness of using drone audio with different deep learning algorithms, namely, the Convolutional Neural Network, the Recurrent Neural Network and the Convolutional Recurrent Neural Network in drone detection and identification. Moreover, we investigate the impact of our proposed hybrid dataset in drone detection. Our findings prove the advantage of using deep learning techniques for drone detection and identification while confirming our hypothesis on the benefits of using the Generative Adversarial Networks to generate real-like drone audio clips with an aim of enhancing the detection of new and unfamiliar drones.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2852
Author(s):  
Parvathaneni Naga Srinivasu ◽  
Jalluri Gnana SivaSai ◽  
Muhammad Fazal Ijaz ◽  
Akash Kumar Bhoi ◽  
Wonjoon Kim ◽  
...  

Deep learning models are efficient in learning the features that assist in understanding complex patterns precisely. This study proposed a computerized process of classifying skin disease through deep learning based MobileNet V2 and Long Short Term Memory (LSTM). The MobileNet V2 model proved to be efficient with a better accuracy that can work on lightweight computational devices. The proposed model is efficient in maintaining stateful information for precise predictions. A grey-level co-occurrence matrix is used for assessing the progress of diseased growth. The performance has been compared against other state-of-the-art models such as Fine-Tuned Neural Networks (FTNN), Convolutional Neural Network (CNN), Very Deep Convolutional Networks for Large-Scale Image Recognition developed by Visual Geometry Group (VGG), and convolutional neural network architecture that expanded with few changes. The HAM10000 dataset is used and the proposed method has outperformed other methods with more than 85% accuracy. Its robustness in recognizing the affected region much faster with almost 2× lesser computations than the conventional MobileNet model results in minimal computational efforts. Furthermore, a mobile application is designed for instant and proper action. It helps the patient and dermatologists identify the type of disease from the affected region’s image at the initial stage of the skin disease. These findings suggest that the proposed system can help general practitioners efficiently and effectively diagnose skin conditions, thereby reducing further complications and morbidity.


Author(s):  
Leonardo Tanzi ◽  
Pietro Piazzolla ◽  
Francesco Porpiglia ◽  
Enrico Vezzetti

Abstract Purpose The current study aimed to propose a Deep Learning (DL) and Augmented Reality (AR) based solution for a in-vivo robot-assisted radical prostatectomy (RARP), to improve the precision of a published work from our group. We implemented a two-steps automatic system to align a 3D virtual ad-hoc model of a patient’s organ with its 2D endoscopic image, to assist surgeons during the procedure. Methods This approach was carried out using a Convolutional Neural Network (CNN) based structure for semantic segmentation and a subsequent elaboration of the obtained output, which produced the needed parameters for attaching the 3D model. We used a dataset obtained from 5 endoscopic videos (A, B, C, D, E), selected and tagged by our team’s specialists. We then evaluated the most performing couple of segmentation architecture and neural network and tested the overlay performances. Results U-Net stood out as the most effecting architectures for segmentation. ResNet and MobileNet obtained similar Intersection over Unit (IoU) results but MobileNet was able to elaborate almost twice operations per seconds. This segmentation technique outperformed the results from the former work, obtaining an average IoU for the catheter of 0.894 (σ = 0.076) compared to 0.339 (σ = 0.195). This modifications lead to an improvement also in the 3D overlay performances, in particular in the Euclidean Distance between the predicted and actual model’s anchor point, from 12.569 (σ= 4.456) to 4.160 (σ = 1.448) and in the Geodesic Distance between the predicted and actual model’s rotations, from 0.266 (σ = 0.131) to 0.169 (σ = 0.073). Conclusion This work is a further step through the adoption of DL and AR in the surgery domain. In future works, we will overcome the limits of this approach and finally improve every step of the surgical procedure.


Sign in / Sign up

Export Citation Format

Share Document