Deep neural networks for interpreting RNA binding protein target preferences

2019 ◽  
Author(s):  
Mahsa Ghanbari ◽  
Uwe Ohler

Abstract Deep learning has become a powerful paradigm for analyzing the binding sites of regulatory factors, including RNA-binding proteins (RBPs), owing to its ability to learn complex features from possibly multiple sources of raw data. However, the interpretability of these models, which is crucial to improve our understanding of RBP binding preferences and functions, has not yet been investigated in significant detail. We have designed a multitask and multimodal deep neural network for characterizing in vivo RBP binding preferences. The model incorporates not only the sequence but also the region type of the binding sites as input, which boosts prediction performance. To interpret the model, we quantified the contribution of the input features to the predictive score of each RBP. Learning across multiple RBPs at once, we are able to avoid experimental biases and to identify the RNA sequence motifs and transcript context patterns that are the most important for the predictions of each individual RBP. Our findings are consistent with known motifs and binding behaviors of RBPs and can provide new insights about the regulatory functions of RBPs.
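The abstract above describes a model that takes both the sequence and the region type of a binding site as input. A minimal sketch of how such a two-modality input might be encoded is shown here. The alphabet order, the region label set (`REGION_TYPES`), and the function names are all assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical input encoding; alphabet order and region labels are assumptions.
NUCLEOTIDES = "ACGU"
REGION_TYPES = ["5UTR", "CDS", "3UTR", "intron"]  # assumed label set


def one_hot_sequence(seq):
    """Return one row per nucleotide; unknown bases (e.g. N) become all zeros."""
    seq = seq.upper().replace("T", "U")
    return [[1 if base == nt else 0 for nt in NUCLEOTIDES] for base in seq]


def one_hot_regions(regions):
    """Return one row per position, marking the region type at that position."""
    return [[1 if r == label else 0 for label in REGION_TYPES] for r in regions]


def encode_site(seq, regions):
    """Keep the two per-position encodings as separate input modalities."""
    if len(seq) != len(regions):
        raise ValueError("sequence and region annotation must have equal length")
    return {"sequence": one_hot_sequence(seq), "region": one_hot_regions(regions)}


site = encode_site("ACGUN", ["5UTR", "5UTR", "CDS", "CDS", "CDS"])
```

Keeping the two encodings in separate entries, rather than concatenating them, mirrors the idea of a multimodal network with one input branch per data source; whether the original model does this is not stated in the abstract.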

2018 ◽  
Author(s):  
Alina Munteanu ◽  
Neelanjan Mukherjee ◽  
Uwe Ohler

Abstract Motivation RNA-binding proteins (RBPs) regulate every aspect of RNA metabolism and function. There are hundreds of RBPs encoded in the eukaryotic genomes, and each recognizes its RNA targets through a specific mixture of RNA sequence and structure properties. For most RBPs, however, only a primary sequence motif has been determined, while the structure of the binding sites is uncharacterized. Results We developed SSMART, an RNA motif finder that simultaneously models the primary sequence and the structural properties of the RNA target sites. The sequence-structure motifs are represented as consensus strings over a degenerate alphabet, extending the IUPAC codes for nucleotides to account for secondary structure preferences. Evaluation on synthetic data showed that SSMART is able to recover both sequence and structure motifs implanted into 3′UTR-like sequences, for various degrees of structured/unstructured binding sites. In addition, we successfully used SSMART on high-throughput in vivo and in vitro data, showing that we not only recover the known sequence motif, but also gain insight into the structural preferences of the RBP. Availability SSMART is freely available at https://ohlerlab.mdc-berlin.de/software/SSMART_137/
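To illustrate what a consensus string over a degenerate alphabet means in practice, the sketch below expands standard IUPAC nucleotide codes into a regular expression and reports where a consensus matches. It covers only the standard sequence codes; SSMART's extended sequence-structure alphabet is not reproduced here, and the helper names are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC_RNA[code]  # raises KeyError for non-IUPAC symbols
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))


def find_sites(consensus, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead makes every start position eligible, so overlaps are reported.
    pattern = re.compile("(?=" + consensus_to_regex(consensus).pattern + ")")
    rna = sequence.upper().replace("T", "U")
    return [m.start() for m in pattern.finditer(rna)]


sites = find_sites("UGUR", "AUGUAUGUGUC")
```

Here `UGUR` expands to `UGU[AG]`, so the call finds the sites starting at positions 1 and 5 of the example sequence.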


2004 ◽  
Vol 24 (14) ◽  
pp. 6241-6252 ◽  
Author(s):  
Kristina L. Carroll ◽  
Dennis A. Pradhan ◽  
Josh A. Granek ◽  
Neil D. Clarke ◽  
Jeffry L. Corden

ABSTRACT RNA polymerase II (Pol II) termination is triggered by sequences present in the nascent transcript. Termination of pre-mRNA transcription is coupled to recognition of cis-acting sequences that direct cleavage and polyadenylation of the pre-mRNA. Termination of nonpolyadenylated [non-poly(A)] Pol II transcripts in Saccharomyces cerevisiae requires the RNA-binding proteins Nrd1 and Nab3. We have used a mutational strategy to characterize non-poly(A) termination elements downstream of the SNR13 and SNR47 snoRNA genes. This approach detected two common RNA sequence motifs, GUA[AG] and UCUU. The first motif corresponds to the known Nrd1-binding site, which we have verified here by gel mobility shift assays. We also show that Nab3 protein binds specifically to RNA containing the UCUU motif. Taken together, our data suggest that Nrd1 and Nab3 binding sites play a significant role in defining non-poly(A) terminators. As is the case with poly(A) terminators, there is no strong consensus for non-poly(A) terminators, and the arrangement of Nrd1p and Nab3p binding sites varies considerably. In addition, the organization of these sequences is not strongly conserved among even closely related yeasts. This indicates a large degree of genetic variability. Despite this variability, we were able to use a computational model to show that the binding sites for Nrd1 and Nab3 can identify genes for which transcription termination is mediated by these proteins.
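As a rough illustration of how the two reported motifs, GUA[AG] and UCUU, could be located in a transcript, the sketch below lists their positions and flags sequences that contain both. This is not the computational model used in the study, whose details the abstract does not give; the protein-to-motif assignment follows the abstract, while the function names and the example sequence are made up.

```python
import re

# Motifs as reported in the abstract; lookaheads allow overlapping sites.
MOTIFS = {
    "Nrd1": re.compile(r"(?=GUA[AG])"),
    "Nab3": re.compile(r"(?=UCUU)"),
}


def motif_positions(rna):
    """Return 0-based start positions of each motif in an RNA (or DNA) string."""
    rna = rna.upper().replace("T", "U")
    return {name: [m.start() for m in pat.finditer(rna)]
            for name, pat in MOTIFS.items()}


def has_both_sites(rna):
    """True when at least one Nrd1 motif and one Nab3 motif are present."""
    hits = motif_positions(rna)
    return all(hits[name] for name in MOTIFS)


example = "AAGUAAUCUUCUUGGUAG"  # invented sequence, for illustration only
positions = motif_positions(example)
```

Because the abstract stresses that the arrangement of sites varies considerably, this sketch deliberately records positions without assuming any fixed spacing or order between the two motifs.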


1993 ◽  
Vol 13 (9) ◽  
pp. 5323-5330 ◽  
Author(s):  
S A Amero ◽  
M J Matunis ◽  
E L Matunis ◽  
J W Hockensmith ◽  
G Raychaudhuri ◽  
...  

The protein on ecdysone puffs (PEP) is associated preferentially with active ecdysone-inducible puffs on Drosophila polytene chromosomes and contains sequence motifs characteristic of transcription factors and RNA-binding proteins (S. A. Amero, S. C. R. Elgin, and A. L. Beyer, Genes Dev. 5:188-200, 1991). PEP is associated with RNA in vivo, as demonstrated here by the sensitivity of PEP-specific chromosomal immunostaining in situ to RNase digestion and by the immunopurification of PEP in Drosophila cell extract with heterogeneous nuclear ribonucleoprotein (hnRNP) complexes. As revealed by sequential immunostaining, PEP is found on a subset of chromosomal sites bound by the HRB (heterogeneous nuclear RNA-binding) proteins, which are basic Drosophila hnRNPs. These observations lead us to suggest that a unique, PEP-containing hnRNP complex assembles preferentially on the transcripts of ecdysone-regulated genes in Drosophila melanogaster presumably to expedite the transcription and/or processing of these transcripts.


2020 ◽  
Author(s):  
Clémentine Delan-Forino ◽  
Christos Spanos ◽  
Juri Rappsilber ◽  
David Tollervey

ABSTRACT During nuclear surveillance in yeast, the RNA exosome functions together with the TRAMP complexes. These include the DEAH-box RNA helicase Mtr4 together with an RNA-binding protein (Air1 or Air2) and a poly(A) polymerase (Trf4 or Trf5). To better determine how RNA substrates are targeted, we analyzed protein and RNA interactions for TRAMP components. Mass spectrometry identified three distinct TRAMP complexes formed in vivo. These complexes preferentially assemble on different classes of transcripts. Unexpectedly, on many substrates, including pre-rRNAs and pre-mRNAs, binding specificity was apparently conferred by Trf4 and Trf5. Clustering of mRNAs by TRAMP association showed co-enrichment for mRNAs with functionally related products, supporting the significance of surveillance in regulating gene expression. We compared binding sites of TRAMP components with multiple nuclear RNA binding proteins, revealing preferential colocalization of subsets of factors. TRF5 deletion reduced Mtr4 recruitment and increased RNA abundance for mRNAs specifically showing high Trf5 binding.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zhengfeng Wang ◽  
Xiujuan Lei

Abstract Background Circular RNAs (circRNAs) are widely expressed in cells and tissues and are involved in biological processes and human diseases. Recent studies have demonstrated that circRNAs can interact with RNA-binding proteins (RBPs), which is considered an important aspect for investigating the function of circRNAs. Results In this study, we design a slight variant of the capsule network, called circRB, to identify the sequence specificities of circRNAs binding to RBPs. In this model, the sequence features of circRNAs are extracted by convolution operations, and then two dynamic routing algorithms in a capsule network are employed to discriminate between different binding sites by analysing the convolution features of binding sites. The experimental results show that the circRB method outperforms the existing computational methods. Afterwards, the trained models are applied to detect the sequence motifs on the seven circRNA-RBP bound sequence datasets, and these motifs are matched to known human RNA motifs. Some motifs on circular RNAs overlap with those on linear RNAs. Finally, we also predict binding sites on the reported full-length sequences of circRNAs interacting with RBPs, attempting to assist current studies. We hope that our model will contribute to a better understanding of the mechanisms of the interactions between RBPs and circRNAs. Conclusion Given the scarcity of studies on the sequence specificities of circRNA-binding proteins, we designed a classification framework called circRB based on the capsule network. The results show that the circRB method is effective and achieves higher prediction accuracy than other methods.


2017 ◽  
Author(s):  
Jonathan M. Howard ◽  
Hai Lin ◽  
Garam Kim ◽  
Jolene M Draper ◽  
Maximilian Haeussler ◽  
...  

Abstract Alternative pre-mRNA splicing plays a major role in expanding the transcript output of human genes. This process is regulated, in part, by the interplay of trans-acting RNA binding proteins (RBPs) with myriad cis-regulatory elements scattered throughout pre-mRNAs. These molecular recognition events are critical for defining the protein coding sequences (exons) within pre-mRNAs and directing spliceosome assembly on non-coding regions (introns). One of the earliest events in this process is recognition of the 3’ splice site by U2 small nuclear RNA auxiliary factor 2 (U2AF2). Splicing regulators, such as the heterogeneous nuclear ribonucleoprotein A1 (HNRNPA1), influence spliceosome assembly both in vitro and in vivo, but their mechanisms of action remain poorly described on a global scale. HNRNPA1 also promotes proofreading of 3’ss sequences through a direct interaction with the U2AF heterodimer. To determine how HNRNPA1 regulates U2AF-RNA interactions in vivo, we analyzed U2AF2 RNA binding specificity using individual-nucleotide resolution crosslinking immunoprecipitation (iCLIP) in control and HNRNPA1 over-expression cells. We observed changes in the distribution of U2AF2 crosslinking sites relative to the 3’ splice sites of alternative cassette exons but not constitutive exons upon HNRNPA1 over-expression. A subset of these events shows a concomitant increase of U2AF2 crosslinking at distal intronic regions, suggesting a shift of U2AF2 to “decoy” binding sites. Of the many non-canonical U2AF2 binding sites, Alu-derived RNA sequences represented one of the most abundant classes of HNRNPA1-dependent decoys. Splicing reporter assays demonstrated that mutation of U2AF2 decoy sites inhibited HNRNPA1-dependent exon skipping in vivo. We propose that HNRNPA1 regulates exon definition by modulating the interaction of U2AF2 with decoy or bona fide 3’ splice sites.


2018 ◽  
Author(s):  
Michael A. Rieger ◽  
Dana M. King ◽  
Barak A. Cohen ◽  
Joseph D. Dougherty

Abstract CELF6 is an RNA-binding protein in a family of proteins with roles in human health and disease; however, little is known about the mRNA targets or in vivo function of this protein. We utilized CLIP-Seq to identify, for the first time, in vivo targets of CELF6 and identify hundreds of transcripts bound by CELF6 in the brain. We found these are disproportionately mRNAs coding for synaptic proteins. We then conducted functional validation of these targets, testing more than 400 CELF6-bound sequence elements for their activity, applying a massively parallel reporter assay framework to the evaluation of the CLIP data. We also mutated potential binding motifs within these elements and tested their impact. This comprehensive analysis led us to ascribe a previously unknown function to CELF6: we found that bound elements were generally repressive of translation, that CELF6 further enhances this repression by decreasing RNA abundance, and that this process was dependent on UGU-rich sequence motifs. This greatly extends the known role for CELF6, which had previously been defined only as a splicing factor. We further extend these findings by demonstrating the same function for CELF3, CELF4, and CELF5. Finally, we demonstrate that the CELF6 targets are derepressed in CELF6 mutant mice in vivo, confirming this new role in the brain. Thus, our study demonstrates that CELF6 and other sub-family members are repressive CNS RNA-binding proteins, and CELF6 downregulates specific mRNAs in vivo.


2020 ◽  
Author(s):  
Kotaro Chihara ◽  
Lars Barquist ◽  
Kenichi Takasugi ◽  
Naohiro Noda ◽  
Satoshi Tsuneda

ABSTRACT Posttranscriptional regulation of gene expression in bacteria is performed by a complex and hierarchical signaling cascade. Pseudomonas aeruginosa harbors two redundant RNA-binding proteins RsmA/RsmN (RsmA/N), which play a critical role in balancing acute and chronic infections. However, in vivo binding sites on target transcripts and the overall impact on the physiology remain unclear. In this study, we applied in vivo UV crosslinking immunoprecipitation followed by RNA-sequencing (UV CLIP-seq) to detect RsmA/N binding sites at single-nucleotide resolution and mapped more than 500 peaks to approximately 400 genes directly bound by RsmA/N in P. aeruginosa. This also demonstrated the ANGGA sequence in apical loops skewed towards 5’UTRs as a consensus motif for RsmA/N binding. Genetic analysis combined with CLIP-seq results identified previously unrecognized RsmA/N targets involved in LPS modification. Moreover, the small non-coding RNAs RsmY/RsmZ, which sequester RsmA/N away from target mRNAs, are positively regulated by the RsmA/N-mediated translational repression of hptB, encoding a histidine phosphotransfer protein, and cafA, encoding a cytoplasmic axial filament protein, thus providing a possible mechanistic explanation for homeostasis of the Rsm system. Our findings present the global RsmA/N-RNA interaction network that exerts pleiotropic effects on gene expression in P. aeruginosa.
IMPORTANCE The ubiquitous bacterium Pseudomonas aeruginosa is notorious as an opportunistic pathogen causing life-threatening acute and chronic infections in immunocompromised patients. P. aeruginosa infection processes are governed by two major gene regulatory systems, namely, the GacA/GacS (GacAS) two-component system and the RNA-binding proteins RsmA/RsmN (RsmA/N). RsmA/N function as translational repressors or activators directly by competing with the ribosome. In this study, we identified hundreds of RsmA/N regulatory target RNAs and the consensus motifs for RsmA/N binding by UV crosslinking in vivo. Moreover, our CLIP-seq revealed that RsmA/N posttranscriptionally regulate cell wall organization and exert feedback control on GacAS-RsmA/N systems. Many genes including small regulatory RNAs identified in this study are attractive targets for further elucidating the regulatory mechanisms of RsmA/N in P. aeruginosa.


2020 ◽  
Author(s):  
Benjamin Lang ◽  
Jae-Seong Yang ◽  
Mireia Garriga-Canut ◽  
Silvia Speroni ◽  
Maria Gili ◽  
...  

Abstract RNA-binding proteins (RBPs) are crucial factors of post-transcriptional gene regulation and their modes of action are intensely investigated. At the center of attention are RNA motifs that guide where RBPs bind. However, sequence motifs are often poor predictors of RBP-RNA interactions in vivo. It is hence believed that many RBPs recognize RNAs as complexes, to increase specificity and regulatory possibilities. To probe the potential for complex formation among RBPs, we assembled a library of 978 mammalian RBPs and used rec-Y2H screening to detect direct interactions between RBPs, sampling > 600 K interactions. We discovered 1994 new interactions and demonstrate that interacting RBPs bind RNAs adjacently in vivo. We further find that the mRNA binding region and motif preferences of RBPs can deviate, depending on their adjacently binding interaction partners. Finally, we reveal novel RBP interaction networks among major RNA processing steps and show that splicing-impairing RBP mutations observed in cancer rewire spliceosomal interaction networks.


2020 ◽  
Author(s):  
Trine Line Hauge Okholm ◽  
Shashank Sathe ◽  
Samuel S. Park ◽  
Andreas Bjerregaard Kamstrup ◽  
Asta Mannstaedt Rasmussen ◽  
...  

Abstract Circular RNAs (circRNAs) are stable, often highly expressed RNA transcripts with potential to modulate other regulatory RNAs. A few circRNAs have been shown to bind RNA binding proteins (RBPs); however, little is known about the prevalence and strength of these interactions in different biological contexts. Here, we comprehensively evaluate the interplay between circRNAs and RBPs in the ENCODE cell lines, HepG2 and K562, by profiling the expression of circRNAs in fractionated total RNA-sequencing samples and analyzing binding sites of 150 RBPs in large eCLIP data sets. We show that KHSRP binding sites are enriched in flanking introns of circRNAs in both HepG2 and K562 cells, and that KHSRP depletion affects circRNA biogenesis. Additionally, we show that exons forming circRNAs are generally enriched with RBP binding sites compared to non-circularizing exons. To detect individual circRNAs with regulatory potency, we computationally identify circRNAs that are highly covered by RBP binding sites and experimentally validate circRNA-RBP interactions by RNA immunoprecipitations. We characterize circCDYL, a highly expressed circRNA with clinical and functional implications in bladder cancer, which is covered with GRWD1 binding sites. We confirm that circCDYL binds GRWD1 in vivo and functionally characterize the effect of circCDYL-GRWD1 interactions on target genes in HepG2. Furthermore, we confirm interactions between circCDYL and RBPs in bladder cancer cells and demonstrate that circCDYL depletion affects hallmarks of cancer and perturbs the expression of key cancer genes, e.g. TP53 and MYC. Finally, we show that elevated levels of highly RBP-covered circRNAs, including circCDYL, are associated with overall survival of bladder cancer patients. Our study demonstrates transcriptome-wide and cell-type-specific circRNA-RBP interactions that could play important regulatory roles in tumorigenesis.
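The abstract above refers to circRNAs that are "highly covered" by RBP binding sites without defining the measure. One plausible reading, sketched here purely as an assumption, is the fraction of a transcript's positions that fall inside at least one binding-site interval. The function name, the half-open coordinate convention, and the example intervals are all hypothetical.

```python
def covered_fraction(length, sites):
    """Fraction of positions in [0, length) covered by at least one site.

    `sites` holds half-open (start, end) intervals in transcript coordinates.
    Overlapping intervals are counted once; parts outside the transcript are clipped.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    covered = 0
    current_end = 0  # right edge of the region already counted
    for start, end in sorted(sites):
        start = max(start, current_end, 0)  # skip the part already counted
        end = min(end, length)              # clip to the transcript
        if end > start:
            covered += end - start
            current_end = end
    return covered / length


# Invented intervals: two overlapping sites and one that runs past the end.
fraction = covered_fraction(100, [(10, 30), (20, 40), (90, 120)])
```

In this example the union of the sites covers positions 10 to 40 and 90 to 100, so 40 of 100 positions are covered. Ranking transcripts by this value would be one simple way to shortlist candidates, although the study may well have used a different definition.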

