DeCban: Prediction of circRNA-RBP Interaction Sites by Using Double Embeddings and Cross-Branch Attention Networks

2021 ◽  
Vol 11 ◽  
Author(s):  
Liangliang Yuan ◽  
Yang Yang

Circular RNAs (circRNAs), as a rising star in the RNA world, play important roles in various biological processes. Understanding the interactions between circRNAs and RNA binding proteins (RBPs) can help reveal the functions of circRNAs. Over the past decade, the emergence of high-throughput experimental data, like CLIP-Seq, has made the computational identification of RNA-protein interactions (RPIs) possible based on machine learning methods. However, as the underlying mechanisms of RPIs are not yet fully understood and the information sources of circRNAs are limited, computational tools for predicting circRNA-RBP interactions remain very few. In this study, we propose a deep learning method to identify circRNA-RBP interactions, called DeCban, which features hybrid double embeddings for representing RNA sequences and a cross-branch attention neural network for classification. To capture more information from RNA sequences, the double embeddings include pre-trained embedding vectors for both RNA segments and their converted amino acids. Meanwhile, the cross-branch attention network aims to address the learning of very long sequences by integrating features of different scales and focusing on important information. The experimental results on 37 benchmark datasets show that both double embeddings and the cross-branch attention model contribute to the improvement of performance. DeCban outperforms the mainstream deep learning-based methods on not only prediction accuracy but also computational efficiency. The data sets and source code of this study are freely available at: https://github.com/AaronYll/DECban.
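The two token streams behind the "double embeddings" (RNA segments and their converted amino acids) can be illustrated with a minimal sketch. The abstract does not state the segment length, stride, or conversion table, so the 3-mer window, the standard genetic code, and the function names `kmers` and `translate` below are assumptions for illustration only, not DeCban's actual preprocessing.

```python
from itertools import product

# Standard genetic code in UCAG order (assumption: the abstract does not say
# which table is used for the amino-acid conversion).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def kmers(seq, k=3, stride=1):
    """Split a sequence into k-length segments taken every `stride` positions."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def translate(seq, frame=0):
    """Convert an RNA sequence to amino acids, reading codons from `frame`.

    Codons containing unknown symbols map to 'X'; stop codons map to '*'.
    """
    codons = kmers(seq[frame:], k=3, stride=3)
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


rna = "AUGGCUUAA"
rna_tokens = kmers(rna)        # first token stream: overlapping RNA segments
protein_tokens = translate(rna)  # second token stream: converted amino acids
```

Each stream would then be looked up in its own pre-trained embedding table before being passed to the classifier.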

BMC Genomics ◽  
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Xiaoyong Pan ◽  
Yi Fang ◽  
Xianfeng Li ◽  
Yang Yang ◽  
Hong-Bin Shen

Abstract. Background: RNA-binding proteins (RBPs) play crucial roles in various biological processes. Deep learning-based methods have proven powerful for predicting RBP sites on RNAs. However, training deep learning models is time-consuming and computationally intensive. Results: Here we present RBPsuite, an easy-to-use deep learning-based webserver for predicting RBP binding sites on linear and circular RNAs. For linear RNAs, RBPsuite predicts RBP binding scores using our updated iDeepS. For circular RNAs (circRNAs), RBPsuite predicts RBP binding scores using our previously developed CRIP. RBPsuite first breaks the input RNA sequence into segments of 101 nucleotides and scores the interaction between the segments and the RBPs. RBPsuite further detects the verified motifs on the binding segments and gives the distribution of binding scores along the full-length sequence. Conclusions: RBPsuite is an easy-to-use online webserver for predicting RBP binding sites and is freely available at http://www.csbio.sjtu.edu.cn/bioinf/RBPsuite/.
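The segmentation step described above (splitting an input RNA into 101-nucleotide segments before scoring) might look roughly like the following sketch. The abstract does not say whether segments overlap or how a short final segment is handled, so the non-overlapping default, the `N` padding, and the function name `segment` are assumptions rather than RBPsuite's actual behavior.

```python
def segment(seq, width=101, stride=101, pad="N"):
    """Break `seq` into fixed-width windows, returning (start, window) pairs.

    Assumptions (not stated in the abstract): windows are non-overlapping by
    default, and a final window shorter than `width` is right-padded with `pad`.
    """
    segments = []
    for start in range(0, len(seq), stride):
        chunk = seq[start:start + width]
        segments.append((start, chunk.ljust(width, pad)))
        if start + width >= len(seq):
            break  # this window already reaches the end of the sequence
    return segments


windows = segment("ACGU" * 60)  # a 240-nt toy sequence
```

Keeping the start offset with each window makes it straightforward to place per-segment scores back along the full-length sequence.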


2018 ◽  
Author(s):  
Kaiming Zhang ◽  
Xiaoyong Pan ◽  
Yang Yang ◽  
Hong-Bin Shen

Abstract. Circular RNAs (circRNAs), with their crucial roles in gene regulation and disease development, have become a rising star in the RNA world. A lot of previous wet-lab studies focused on the interaction mechanisms between circRNAs and RNA-binding proteins (RBPs), as the knowledge of circRNA-RBP association is very important for understanding functions of circRNAs. Recently, the abundant CLIP-Seq experimental data has made the large-scale identification and analysis of circRNA-RBP interactions possible, while no computational tool based on machine learning has been developed yet. We present a new deep learning-based method, CRIP (CircRNAs Interact with Proteins), for the prediction of RBP binding sites on circRNAs, using only the RNA sequences. In order to fully exploit the sequence information, we propose a stacked codon-based encoding scheme and a hybrid deep learning architecture, in which a convolutional neural network (CNN) learns high-level abstract features and a recurrent neural network (RNN) learns long dependency in the sequences. We construct 37 datasets including sequence fragments of binding sites on circRNAs, and each set corresponds to one RBP. The experimental results show that the new encoding scheme is superior to the existing feature representation methods for RNA sequences, and the hybrid network outperforms conventional classifiers by a large margin, where both the CNN and RNN components contribute to the performance improvement. To the best of our knowledge, CRIP is the first machine learning-based tool specialized in the prediction of circRNA-RBP interactions, which is expected to play an important role for large-scale function analysis of circRNAs.


2007 ◽  
Vol 28 (2) ◽  
pp. 678-686 ◽  
Author(s):  
Raymond A. Lewis ◽  
James A. Gagnon ◽  
Kimberly L. Mowry

ABSTRACT Transport of specific mRNAs to defined regions within the cell cytoplasm is a fundamental mechanism for regulating cell and developmental polarity. In the Xenopus oocyte, Vg1 RNA is transported to the vegetal cytoplasm, where localized expression of the encoded protein is critical for embryonic polarity. The Vg1 localization pathway is directed by interactions between key motifs within Vg1 RNA and protein factors recognizing those RNA sequences. We have investigated how RNA-protein interactions could be modulated to trigger distinct steps in the localization pathway and found that the Vg1 RNP is remodeled during cytoplasmic RNA transport. Our results implicate two RNA-binding proteins with key roles in Vg1 RNA localization, PTB/hnRNP I and Vg1RBP/vera, in this process. We show that PTB/hnRNP I is required for remodeling of the interaction between Vg1 RNA and Vg1RBP/vera. Critically, mutations that block this remodeling event also eliminate vegetal localization of the RNA, suggesting that RNP remodeling is required for localization.


2004 ◽  
Vol 165 (2) ◽  
pp. 203-211 ◽  
Author(s):  
Tracy L. Kress ◽  
Young J. Yoon ◽  
Kimberly L. Mowry

Cytoplasmic localization of mRNAs is a widespread mechanism for generating cell polarity and can provide the basis for patterning during embryonic development. A prominent example of this is localization of maternal mRNAs in Xenopus oocytes, a process requiring recognition of essential RNA sequences by protein components of the localization machinery. However, it is not yet clear how and when such protein factors associate with localized RNAs to carry out RNA transport. To trace the RNA–protein interactions that mediate RNA localization, we analyzed RNP complexes from the nucleus and cytoplasm. We find that an early step in the localization pathway is recognition of localized RNAs by specific RNA-binding proteins in the nucleus. After transport into the cytoplasm, the RNP complex is remodeled and additional transport factors are recruited. These results suggest that cytoplasmic RNA localization initiates in the nucleus and that binding of specific RNA-binding proteins in the nucleus may act to target RNAs to their appropriate destinations in the cytoplasm.


2021 ◽  
Vol 7 (3) ◽  
pp. 48
Author(s):  
Arundhati Das ◽  
Tanvi Sinha ◽  
Sharmishtha Shyamal ◽  
Amaresh Chandra Panda

Circular RNAs (circRNAs) are emerging as novel regulators of gene expression in various biological processes. CircRNAs regulate gene expression by interacting with cellular regulators such as microRNAs and RNA binding proteins (RBPs) to regulate downstream gene expression. The accumulation of high-throughput RNA–protein interaction data revealed the interaction of RBPs with the coding and noncoding RNAs, including recently discovered circRNAs. RBPs are a large family of proteins known to play a critical role in gene expression by modulating RNA splicing, nuclear export, mRNA stability, localization, and translation. However, the interaction of RBPs with circRNAs and their implications on circRNA biogenesis and function has been emerging in the last few years. Recent studies suggest that circRNA interaction with target proteins modulates the interaction of the protein with downstream target mRNAs or proteins. This review outlines the emerging mechanisms of circRNA–protein interactions and their functional role in cell physiology.


2020 ◽  
Author(s):  
Trine Line Hauge Okholm ◽  
Shashank Sathe ◽  
Samuel S. Park ◽  
Andreas Bjerregaard Kamstrup ◽  
Asta Mannstaedt Rasmussen ◽  
...  

Abstract. Circular RNAs (circRNAs) are stable, often highly expressed RNA transcripts with potential to modulate other regulatory RNAs. A few circRNAs have been shown to bind RNA binding proteins (RBPs); however, little is known about the prevalence and strength of these interactions in different biological contexts. Here, we comprehensively evaluate the interplay between circRNAs and RBPs in the ENCODE cell lines, HepG2 and K562, by profiling the expression of circRNAs in fractionated total RNA-sequencing samples and analyzing binding sites of 150 RBPs in large eCLIP data sets. We show that KHSRP binding sites are enriched in flanking introns of circRNAs in both HepG2 and K562 cells, and that KHSRP depletion affects circRNA biogenesis. Additionally, we show that exons forming circRNAs are generally enriched with RBP binding sites compared to non-circularizing exons. To detect individual circRNAs with regulatory potency, we computationally identify circRNAs that are highly covered by RBP binding sites and experimentally validate circRNA-RBP interactions by RNA immunoprecipitations. We characterize circCDYL, a highly expressed circRNA with clinical and functional implications in bladder cancer, which is covered with GRWD1 binding sites. We confirm that circCDYL binds GRWD1 in vivo and functionally characterize the effect of circCDYL-GRWD1 interactions on target genes in HepG2. Furthermore, we confirm interactions between circCDYL and RBPs in bladder cancer cells and demonstrate that circCDYL depletion affects hallmarks of cancer and perturbs the expression of key cancer genes, e.g. TP53 and MYC. Finally, we show that elevated levels of highly RBP-covered circRNAs, including circCDYL, are associated with overall survival of bladder cancer patients. Our study demonstrates transcriptome-wide and cell-type-specific circRNA-RBP interactions that could play important regulatory roles in tumorigenesis.


2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Trine Line Hauge Okholm ◽  
Shashank Sathe ◽  
Samuel S. Park ◽  
Andreas Bjerregaard Kamstrup ◽  
Asta Mannstaedt Rasmussen ◽  
...  

Abstract. Background: Circular RNAs (circRNAs) are stable, often highly expressed RNA transcripts with potential to modulate other regulatory RNAs. A few circRNAs have been shown to bind RNA-binding proteins (RBPs); however, little is known about the prevalence and distribution of these interactions in different biological contexts. Methods: We conduct an extensive screen of circRNA-RBP interactions in the ENCODE cell lines HepG2 and K562. We profile circRNAs in deep-sequenced total RNA samples and analyze circRNA-RBP interactions using a large set of eCLIP data with binding sites of 150 RBPs. We validate interactions for select circRNAs and RBPs by performing RNA immunoprecipitation and functionally characterize our most interesting candidates by conducting knockdown studies followed by RNA-Seq. Results: We generate a comprehensive catalog of circRNA-RBP interactions in HepG2 and K562 cells. We show that KHSRP binding sites are enriched in flanking introns of circRNAs and that KHSRP depletion affects circRNA biogenesis. We identify circRNAs that are highly covered by RBP binding sites and experimentally validate individual circRNA-RBP interactions. We show that circCDYL, a highly expressed circRNA with clinical and functional implications in bladder cancer, is almost completely covered with GRWD1 binding sites in HepG2 cells, and that circCDYL depletion counteracts the effect of GRWD1 depletion. Furthermore, we confirm interactions between circCDYL and RBPs in bladder cancer cells and demonstrate that circCDYL depletion affects hallmarks of cancer and perturbs the expression of key cancer genes, e.g., TP53. Finally, we show that elevated levels of circCDYL are associated with overall survival of bladder cancer patients. Conclusions: Our study demonstrates transcriptome-wide and cell-type-specific circRNA-RBP interactions that could play important regulatory roles in tumorigenesis.


2021 ◽  
Author(s):  
Keisuke Yamada ◽  
Michiaki Hamada

Abstract. Motivation: The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require information in addition to sequences. Bidirectional encoder representations from Transformer (BERT) is a language-based deep learning model that is highly interpretable. Therefore, a model based on BERT architecture can potentially overcome such limitations. Results: Here, we propose BERT-RBP as a model to predict RNA-RBP interactions by adapting the BERT architecture pre-trained on a human reference genome. Our model outperformed state-of-the-art prediction models using the eCLIP-seq data of 154 RBPs. The detailed analysis further revealed that BERT-RBP could recognize both the transcript region type and RNA secondary structure only from sequential information. Overall, the results provide insights into the fine-tuning mechanism of BERT in biological contexts and provide evidence of the applicability of the model to other RNA-related problems. Availability: Python source codes are freely available at https://github.com/kkyamada/[email protected]


Author(s):  
Yuning Yang ◽  
Zilong Hou ◽  
Zhiqiang Ma ◽  
Xiangtao Li ◽  
Ka-Chun Wong

Abstract. Circular RNAs (circRNAs) are widely expressed in eukaryotes. The genome-wide interactions between circRNAs and RNA-binding proteins (RBPs) can be probed from cross-linking immunoprecipitation with sequencing data. Therefore, computational methods have been developed for identifying RBP binding sites on circRNAs. Unfortunately, those computational methods often suffer from the low discriminative power of feature representations, numerical instability and poor scalability. To address those limitations, we propose a novel computational method called iCircRBP-DHN, using a deep hierarchical network for discriminating circRNA-RBP binding sites. The network architecture can be regarded as a deep multi-scale residual network followed by bidirectional gated recurrent units (BiGRUs) with the self-attention mechanism, which can simultaneously extract local and global contextual information. Meanwhile, we propose novel encoding schemes by integrating CircRNA2Vec and the K-tuple nucleotide frequency pattern to represent different degrees of nucleotide dependencies. To validate the effectiveness of our proposed iCircRBP-DHN, we compared its performance with other computational methods on 37 circRNA datasets and 31 linear RNA datasets, respectively. The experimental results reveal that iCircRBP-DHN can achieve superior performance over those state-of-the-art algorithms. Moreover, we perform motif analysis on circRNAs bound by those different RBPs, demonstrating that our proposed CircRNA2Vec encoding scheme can be promising. The iCircRBP-DHN method is made available at https://github.com/houzl3416/iCircRBP-DHN.
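A K-tuple nucleotide frequency feature of the kind mentioned above can be sketched as a normalized count over all possible k-mers. The abstract does not give the exact definition used by iCircRBP-DHN, so the normalization by window count, the `ACGU` alphabet, and the function name `ktuple_frequencies` are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def ktuple_frequencies(seq, k=2, alphabet="ACGU"):
    """Return the frequency of every possible k-tuple in `seq`.

    The result is a list of 4**k values in lexicographic k-tuple order; each
    value is the tuple's count divided by the number of length-k windows.
    Windows containing symbols outside `alphabet` are counted in the
    denominator but contribute to no feature (an assumption of this sketch).
    """
    keys = ["".join(p) for p in product(alphabet, repeat=k)]
    total = len(seq) - k + 1
    if total <= 0:
        return [0.0] * len(keys)
    counts = Counter(seq[i:i + k] for i in range(total))
    return [counts[key] / total for key in keys]


features = ktuple_frequencies("AUGGCUAUGC", k=2)  # 16-dimensional vector
```

Concatenating the vectors for several values of k gives one simple way to capture different degrees of nucleotide dependency.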

