Identifying the sequence specificities of circRNA-binding proteins based on a capsule network architecture

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zhengfeng Wang ◽  
Xiujuan Lei

Abstract Background Circular RNAs (circRNAs) are widely expressed in cells and tissues and are involved in biological processes and human diseases. Recent studies have demonstrated that circRNAs can interact with RNA-binding proteins (RBPs), which is considered an important aspect of investigating the function of circRNAs. Results In this study, we design a slight variant of the capsule network, called circRB, to identify the sequence specificities of circRNAs binding to RBPs. In this model, the sequence features of circRNAs are extracted by convolution operations, and two dynamic routing algorithms in a capsule network are then employed to discriminate between different binding sites by analysing their convolution features. The experimental results show that the circRB method outperforms the existing computational methods. The trained models are then applied to detect sequence motifs in the seven circRNA-RBP bound sequence datasets, and the detected motifs are matched to known human RNA motifs. Some motifs on circular RNAs overlap with those on linear RNAs. Finally, we also predict binding sites on the reported full-length sequences of circRNAs interacting with RBPs, attempting to assist current studies. We hope that our model will contribute to a better understanding of the mechanisms of the interactions between RBPs and circRNAs. Conclusion Given the scarcity of studies on the sequence specificities of circRNA-binding proteins, we designed a classification framework called circRB based on the capsule network. The results show that circRB is effective and achieves higher prediction accuracy than other methods.

Author(s):  
Yuning Yang ◽  
Zilong Hou ◽  
Zhiqiang Ma ◽  
Xiangtao Li ◽  
Ka-Chun Wong

Abstract Circular RNAs (circRNAs) are widely expressed in eukaryotes. The genome-wide interactions between circRNAs and RNA-binding proteins (RBPs) can be probed from cross-linking immunoprecipitation with sequencing data. Therefore, computational methods have been developed for identifying RBP binding sites on circRNAs. Unfortunately, those computational methods often suffer from the low discriminative power of feature representations, numerical instability and poor scalability. To address those limitations, we propose a novel computational method called iCircRBP-DHN, which uses a deep hierarchical network to discriminate circRNA-RBP binding sites. The network architecture can be regarded as a deep multi-scale residual network followed by bidirectional gated recurrent units (BiGRUs) with a self-attention mechanism, which can simultaneously extract local and global contextual information. Meanwhile, we propose novel encoding schemes by integrating CircRNA2Vec and the K-tuple nucleotide frequency pattern to represent different degrees of nucleotide dependencies. To validate the effectiveness of our proposed iCircRBP-DHN, we compared its performance with other computational methods on 37 circRNA datasets and 31 linear RNA datasets. The experimental results reveal that iCircRBP-DHN can achieve superior performance over those state-of-the-art algorithms. Moreover, we perform motif analysis on circRNAs bound by those different RBPs, demonstrating that our proposed CircRNA2Vec encoding scheme is promising. The iCircRBP-DHN method is made available at https://github.com/houzl3416/iCircRBP-DHN.
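
The K-tuple nucleotide frequency pattern named in this abstract can be pictured with a small sketch. It assumes overlapping k-tuples over the alphabet ACGU and counts normalized to sum to one; the function name `ktuple_frequencies` and these choices are illustrative assumptions, not the encoding actually implemented in iCircRBP-DHN.

```python
from collections import Counter
from itertools import product


def ktuple_frequencies(seq, k=2, alphabet="ACGU"):
    """Return normalized frequencies of all overlapping k-tuples in seq.

    Hypothetical helper for illustration; the published encoding may differ.
    """
    # Accept DNA-style input by mapping T to U.
    seq = seq.upper().replace("T", "U")
    # Slide a window of width k over the sequence, one position at a time.
    windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    # Skip windows containing symbols outside the alphabet (e.g. N).
    counts = Counter(w for w in windows if set(w) <= set(alphabet))
    total = sum(counts.values())
    # Emit every possible k-tuple so the feature vector has a fixed length.
    keys = ["".join(p) for p in product(alphabet, repeat=k)]
    return {key: (counts[key] / total if total else 0.0) for key in keys}
```

For k = 2 this yields a 16-dimensional vector per sequence; larger k captures longer-range nucleotide dependencies at the cost of a sparser vector.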


2021 ◽  
Vol 22 (14) ◽  
pp. 7477
Author(s):  
Rok Razpotnik ◽  
Petra Nassib ◽  
Tanja Kunej ◽  
Damjana Rozman ◽  
Tadeja Režen

Circular RNAs (circRNAs) are increasingly recognized as having a role in cancer development. Their expression is modified in numerous cancers, including hepatocellular carcinoma (HCC); however, little is known about the mechanisms of their regulation. The aim of this study was to identify regulators of circRNAome expression in HCC. Using publicly available datasets, we identified RNA binding proteins (RBPs) with enriched motifs around the splice sites of differentially expressed circRNAs in HCC. We confirmed the binding of some of the candidate RBPs using ChIP-seq and eCLIP datasets in the ENCODE database. Several of the identified RBPs were found to be differentially expressed in HCC and/or correlated with the overall survival of HCC patients. According to our bioinformatics analyses and published evidence, we propose that NONO, PCPB2, PCPB1, ESRP2, and HNRNPK are candidate regulators of circRNA expression in HCC. We confirmed that knocking down the epithelial splicing regulatory protein 2 (ESRP2), known to be involved in the maintenance of the adult liver phenotype, significantly changed the expression of candidate circRNAs in a model HCC cell line. By understanding the systemic changes in transcriptome splicing, we can identify new proteins involved in the molecular pathways leading to HCC development and progression.


2018 ◽  
Author(s):  
Alina Munteanu ◽  
Neelanjan Mukherjee ◽  
Uwe Ohler

Abstract Motivation RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. Results We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3′UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/
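
To make the idea of a consensus string over a degenerate alphabet concrete, the sketch below expands the standard IUPAC nucleotide codes into a regular expression. It covers only the sequence part: SSMART's extension of the alphabet to secondary-structure preferences is not reproduced, and the names `IUPAC_RNA`, `consensus_to_regex` and `find_matches` are hypothetical, not part of SSMART.

```python
import re

# Standard IUPAC nucleotide codes, written for the RNA alphabet.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC", "B": "CGU", "D": "AGU",
    "H": "ACU", "V": "ACG", "N": "ACGU",
}


def consensus_to_regex(consensus):
    """Translate a degenerate consensus string into a compiled regex."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC_RNA[code]  # raises KeyError on an unknown code
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def find_matches(consensus, seq):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = consensus_to_regex(consensus).pattern
    # A lookahead makes each match zero-width, so overlapping hits are kept.
    lookahead = re.compile("(?=(" + pattern + "))")
    return [m.start() for m in lookahead.finditer(seq.upper())]
```

For example, the consensus `GUAR` expands to the pattern `GUA[AG]`, since `R` denotes a purine (A or G).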


PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0250592
Author(s):  
Hiren Banerjee ◽  
Ravinder Singh

Background Downstream targets for a large number of RNA-binding proteins remain to be identified. The Drosophila master sex-switch protein Sex-lethal (SXL) is an RNA-binding protein that controls splicing, polyadenylation, or translation of certain mRNAs to mediate female-specific sexual differentiation. Whereas some targets of SXL are known, previous studies indicate that additional targets of SXL have escaped genetic screens. Methodology/Principal findings Here, we have used an alternative molecular approach of GEnomic Selective Enrichment of Ligands by Exponential enrichment (GESELEX) using both the genomic DNA and cDNA pools from several Drosophila developmental stages to identify new potential targets of SXL. Our systematic analysis provides a comprehensive view of the Drosophila transcriptome for potential SXL-binding sites. Conclusion/Significance We have successfully identified new SXL-binding sites in the Drosophila transcriptome. We discuss the significance of our analysis and how the newly identified binding sites and sequences could serve as a useful resource for the research community. This approach should also be applicable to other RNA-binding proteins for which downstream targets are unknown.


2004 ◽  
Vol 24 (14) ◽  
pp. 6241-6252 ◽  
Author(s):  
Kristina L. Carroll ◽  
Dennis A. Pradhan ◽  
Josh A. Granek ◽  
Neil D. Clarke ◽  
Jeffry L. Corden

ABSTRACT RNA polymerase II (Pol II) termination is triggered by sequences present in the nascent transcript. Termination of pre-mRNA transcription is coupled to recognition of cis-acting sequences that direct cleavage and polyadenylation of the pre-mRNA. Termination of nonpolyadenylated [non-poly(A)] Pol II transcripts in Saccharomyces cerevisiae requires the RNA-binding proteins Nrd1 and Nab3. We have used a mutational strategy to characterize non-poly(A) termination elements downstream of the SNR13 and SNR47 snoRNA genes. This approach detected two common RNA sequence motifs, GUA[AG] and UCUU. The first motif corresponds to the known Nrd1-binding site, which we have verified here by gel mobility shift assays. We also show that Nab3 protein binds specifically to RNA containing the UCUU motif. Taken together, our data suggest that Nrd1 and Nab3 binding sites play a significant role in defining non-poly(A) terminators. As is the case with poly(A) terminators, there is no strong consensus for non-poly(A) terminators, and the arrangement of Nrd1p and Nab3p binding sites varies considerably. In addition, the organization of these sequences is not strongly conserved among even closely related yeasts. This indicates a large degree of genetic variability. Despite this variability, we were able to use a computational model to show that the binding sites for Nrd1 and Nab3 can identify genes for which transcription termination is mediated by these proteins.
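
The two motifs reported in this abstract, GUA[AG] for Nrd1 and UCUU for Nab3, are short enough to locate with a plain pattern scan. The sketch below is only an illustration of such a scan: the function name `scan_motifs` and the example sequence are made up, and this is not the computational model the authors used.

```python
import re

# Motifs as given in the abstract; [AG] means A or G at that position.
MOTIFS = {"Nrd1": "GUA[AG]", "Nab3": "UCUU"}


def scan_motifs(rna):
    """Return 0-based start positions of each motif in an RNA sequence."""
    # Accept DNA-style input by mapping T to U.
    rna = rna.upper().replace("T", "U")
    hits = {}
    for protein, motif in MOTIFS.items():
        # Lookahead keeps overlapping occurrences.
        pattern = re.compile("(?=(" + motif + "))")
        hits[protein] = [m.start() for m in pattern.finditer(rna)]
    return hits


# Made-up sequence, used only to show the output shape.
example_hits = scan_motifs("UCUUGUAAUCUUU")
```

Because the abstract notes that the arrangement of these sites varies considerably between terminators, a scan like this reports positions only and makes no assumption about spacing or order.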


1993 ◽  
Vol 13 (9) ◽  
pp. 5323-5330 ◽  
Author(s):  
S A Amero ◽  
M J Matunis ◽  
E L Matunis ◽  
J W Hockensmith ◽  
G Raychaudhuri ◽  
...  

The protein on ecdysone puffs (PEP) is associated preferentially with active ecdysone-inducible puffs on Drosophila polytene chromosomes and contains sequence motifs characteristic of transcription factors and RNA-binding proteins (S. A. Amero, S. C. R. Elgin, and A. L. Beyer, Genes Dev. 5:188-200, 1991). PEP is associated with RNA in vivo, as demonstrated here by the sensitivity of PEP-specific chromosomal immunostaining in situ to RNase digestion and by the immunopurification of PEP in Drosophila cell extract with heterogeneous nuclear ribonucleoprotein (hnRNP) complexes. As revealed by sequential immunostaining, PEP is found on a subset of chromosomal sites bound by the HRB (heterogeneous nuclear RNA-binding) proteins, which are basic Drosophila hnRNPs. These observations lead us to suggest that a unique, PEP-containing hnRNP complex assembles preferentially on the transcripts of ecdysone-regulated genes in Drosophila melanogaster, presumably to expedite the transcription and/or processing of these transcripts.

