CASowary: CRISPR-Cas13 guide RNA predictor for transcript depletion

2021 ◽  
Author(s):  
Alexander Krohannon ◽  
Mansi Srivastava ◽  
Simone Rauch ◽  
Rajneesh Srivastava ◽  
Bryan Dickinson ◽  
...  

The recent discovery of the gene-editing system based on CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and its associated (Cas) proteins has led to its widespread use for improving our understanding of a variety of biological systems. Cas13, a lesser-studied Cas protein, has been repurposed to allow efficient and precise editing of RNA molecules. The Cas13 system uses base complementarity between a crRNA/sgRNA (CRISPR RNA or single guide RNA) and a target RNA transcript to bind preferentially to the target transcript. Unlike the upstream regulatory regions of protein-coding genes in the genome, the transcriptome is significantly more redundant, so many transcripts share long stretches of identical nucleotide sequence. Transcripts also adopt complex three-dimensional structures and interact with an array of RNA-binding proteins (RBPs), both of which further limit the scope of effective target sequences. As a result, no method currently exists to predict whether a specific sgRNA will effectively knock down a transcript. Here we present a novel machine learning-based computational tool, CASowary, to predict the efficacy of an sgRNA. We used publicly available RNA knockdown data from Cas13 characterization experiments for 555 sgRNAs targeting the transcriptome in HEK293 cells, in conjunction with transcriptome-wide protein occupancy information on RNA. Our model uses a decision tree architecture with 112 sequence and target-availability features to classify sgRNA efficacy into one of four classes based on the expected level of target transcript knockdown. After accounting for noise in the training data set, the noise-normalized accuracy exceeds 70%. Additionally, highly effective sgRNA predictions have been experimentally validated using an independent RNA-targeting Cas system, CIRTS, confirming the robustness and reproducibility of our model's sgRNA predictions. Using a transcriptome-wide protein occupancy map generated with POP-seq in HeLa cells, compared against a publicly available protein-RNA interaction map in HEK293 cells, we show that CASowary can predict high-quality guides for numerous transcripts in a cell line-specific manner. Application of CASowary to whole transcriptomes should enable rapid deployment of CRISPR/Cas13 systems, facilitating the development of therapeutic interventions linked with aberrations in RNA regulatory processes.
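To make the idea of "sequence and target availability features" concrete, here is a minimal Python sketch of how a few such features might be computed for one guide. It is illustrative only: the function names, the feature names, and the choice of features (nucleotide fractions, GC content, and the fraction of the target site covered by protein-occupancy peaks) are assumptions for this example and are not taken from the CASowary feature set, which the abstract does not enumerate.

```python
from typing import Dict, Iterable, Tuple

Interval = Tuple[int, int]  # half-open [start, end) transcript coordinates


def gc_content(seq: str) -> float:
    """Fraction of G or C bases in a guide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("guide sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def occupied_fraction(site: Interval, peaks: Iterable[Interval]) -> float:
    """Fraction of the target site covered by protein-occupancy peaks."""
    start, end = site
    if end <= start:
        raise ValueError("site must have positive length")
    covered, cursor = 0, start
    # Sweep peaks in order; the cursor prevents double counting overlaps.
    for p_start, p_end in sorted(peaks):
        lo, hi = max(p_start, cursor), min(p_end, end)
        if hi > lo:
            covered += hi - lo
            cursor = hi
    return covered / (end - start)


def guide_features(guide: str, site: Interval,
                   peaks: Iterable[Interval]) -> Dict[str, float]:
    """Hypothetical feature vector: base composition plus site occupancy."""
    guide = guide.upper().replace("T", "U")
    features = {"gc_content": gc_content(guide)}  # also rejects empty guides
    for base in "ACGU":
        features[f"frac_{base}"] = guide.count(base) / len(guide)
    features["occupied_fraction"] = occupied_fraction(site, peaks)
    return features


if __name__ == "__main__":
    # Toy guide and peaks, invented for the example.
    print(guide_features("GGCCAAUU", (100, 128), [(90, 107), (105, 114)]))
```

A feature dictionary of this kind could then be passed to any decision tree implementation; the classifier itself is left out here because the abstract does not specify its training details.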

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mansi Srivastava ◽  
Rajneesh Srivastava ◽  
Sarath Chandra Janga

Abstract Interaction between proteins and RNA is critical for post-transcriptional regulatory processes. Existing high-throughput methods based on crosslinking of protein–RNA complexes and poly-A pulldown are reported to introduce biases and are not readily amenable to identifying interaction sites on non-poly-A RNAs. We present Protein Occupancy Profile-Sequencing (POP-seq), a phase separation-based method in three versions, one of which does not require crosslinking, thus providing unbiased protein occupancy profiles on the whole-cell transcriptome without the requirement of poly-A pulldown. Our study demonstrates that ~68% of the total POP-seq peaks overlapped with publicly available protein–RNA interaction profiles of 97 RNA-binding proteins (RBPs) in K562 cells. We show that POP-seq variants consistently capture protein–RNA interaction sites across a broad range of genes, including transcripts encoding transcription factors (TFs) and RBPs, as well as long non-coding RNAs (lncRNAs). POP-seq-identified peaks exhibited a significant enrichment (p value < 2.2e−16) for GWAS SNPs and for phenotypic, clinically relevant germline and somatic variants reported in cancer genomes, suggesting the prevalence of uncharacterized genomic variation in protein-occupied sites on RNA. We demonstrate that the abundance of POP-seq peaks increases with increasing lncRNA expression, suggesting that highly expressed lncRNAs are likely to act as sponges for RBPs, contributing to the rewiring of the protein–RNA interaction network in cancer cells. Overall, our data support POP-seq as a robust and cost-effective method that could be applied to primary tissues for mapping global protein occupancies.
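The overlap statistic quoted above (the share of POP-seq peaks that coincide with known RBP binding sites) can be illustrated with a small Python sketch. The interval representation, the function names, and the "any shared base" overlap criterion are assumptions made for this example; the abstract does not state the exact overlap rule the authors used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open


def index_by_chrom(regions: Iterable[Region]) -> Dict[str, List[Tuple[int, int]]]:
    """Group reference intervals by chromosome for faster lookup."""
    index: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in regions:
        index[chrom].append((start, end))
    return index


def fraction_overlapping(peaks: List[Region], reference: Iterable[Region]) -> float:
    """Fraction of peaks sharing at least one base with any reference interval."""
    if not peaks:
        raise ValueError("no peaks supplied")
    index = index_by_chrom(reference)
    hits = 0
    for chrom, start, end in peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < r_end and r_start < end for r_start, r_end in index.get(chrom, [])):
            hits += 1
    return hits / len(peaks)


if __name__ == "__main__":
    # Toy coordinates, invented for the example.
    toy_peaks = [("chr1", 100, 200), ("chr1", 300, 400),
                 ("chr2", 50, 80), ("chr2", 85, 120)]
    toy_reference = [("chr1", 150, 160), ("chr2", 80, 90)]
    print(fraction_overlapping(toy_peaks, toy_reference))
```

The linear scan per peak keeps the sketch short; for genome-scale peak sets an interval tree or a sorted sweep would be the more practical choice.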


2020 ◽  
Author(s):  
Mansi Srivastava ◽  
Rajneesh Srivastava ◽  
Sarath Chandra Janga

Abstract Interaction between proteins and RNA is critical for post-transcriptional regulatory processes. Existing high-throughput methods based on crosslinking of protein-RNA complexes and polyA pulldown are reported to introduce biases and are not readily amenable to identifying interaction sites on non-polyA RNAs. We present Protein Occupancy Profile-Sequencing (POP-seq), a phase separation-based method in three versions, one of which does not require crosslinking, thus providing unbiased protein occupancy profiles on the whole-cell transcriptome without the requirement of polyA pulldown. Our study demonstrates that ~68% of the total POP-seq peaks overlapped with publicly available protein-RNA interaction profiles of 97 RNA-binding proteins (RBPs) in K562 cells. We show that POP-seq variants consistently capture protein-RNA interaction sites across a broad range of genes, including transcripts encoding transcription factors (TFs) and RBPs, as well as long non-coding RNAs (lncRNAs). POP-seq-identified peaks exhibited a significant enrichment (p value < 2.2e-16) for GWAS SNPs and for phenotypic, clinically relevant germline and somatic variants reported in cancer genomes, suggesting the prevalence of uncharacterized genomic variation in protein-occupied sites on RNA. We demonstrate that the abundance of POP-seq peaks increases with increasing lncRNA expression, suggesting that highly expressed lncRNAs are likely to act as sponges for RBPs, contributing to the rewiring of the protein-RNA interaction network in cancer cells. Overall, our data support POP-seq as a robust and cost-effective method that could be applied to primary tissues for mapping global protein occupancies.


2018 ◽  
Author(s):  
Emad Bahrami-Samani ◽  
Yi Xing

Abstract Gene expression is tightly regulated at the post-transcriptional level through splicing, transport, translation, and decay. RNA-binding proteins (RBPs) play key roles in post-transcriptional gene regulation, and genetic variants that alter RBP-RNA interactions can affect gene products and functions. We developed a computational method, ASPRIN (Allele-Specific Protein-RNA Interaction), that uses a joint analysis of CLIP-seq (cross-linking and immunoprecipitation followed by high-throughput sequencing) and RNA-seq data to identify genetic variants that alter RBP-RNA interactions by directly comparing the allelic preference of an RBP observed in CLIP-seq experiments with the allelic ratio observed in RNA-seq. We used ASPRIN to systematically analyze CLIP-seq and RNA-seq data for 166 RBPs in two ENCODE (Encyclopedia of DNA Elements) cell lines. ASPRIN identified genetic variants that alter RBP-RNA interactions by modifying RBP binding motifs within RNA. Moreover, through an integrative ASPRIN analysis with population-scale RNA-seq data, we showed that ASPRIN can help reveal potential causal variants that affect alternative splicing via allele-specific protein-RNA interactions.
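The comparison of allelic preference between CLIP-seq and RNA-seq can be pictured as a 2x2 table of read counts (assay by allele) at a heterozygous site. The Python sketch below applies a two-sided Fisher's exact test to such a table. This is an assumption chosen for illustration: the abstract does not say which statistical test ASPRIN uses, and the function names and read counts here are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at most as likely as the observed one (small float slack).
    p_value = sum(prob(x) for x in range(lo, hi + 1)
                  if prob(x) <= p_observed * (1 + 1e-7))
    return min(1.0, p_value)


def allelic_preference_pvalue(clip_ref: int, clip_alt: int,
                              rna_ref: int, rna_alt: int) -> float:
    """Test whether the CLIP-seq allele ratio departs from the RNA-seq ratio."""
    return fisher_exact_two_sided(clip_ref, clip_alt, rna_ref, rna_alt)


if __name__ == "__main__":
    # Hypothetical read counts at one heterozygous SNP.
    print(allelic_preference_pvalue(clip_ref=18, clip_alt=2, rna_ref=10, rna_alt=10))
```

Using RNA-seq counts as the second row controls for allele-specific expression, so a small p-value points to a binding preference rather than to unequal transcript abundance.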


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jordy Homing Lam ◽  
Yu Li ◽  
Lizhe Zhu ◽  
Ramzan Umarov ◽  
Hanlun Jiang ◽  
...  

Abstract Protein-RNA interaction plays important roles in post-transcriptional regulation. However, the task of predicting these interactions given a protein structure is difficult. Here we show that, by leveraging a deep learning model NucleicNet, attributes such as binding preference of RNA backbone constituents and different bases can be predicted from local physicochemical characteristics of protein structure surface. On a diverse set of challenging RNA-binding proteins, including Fem-3-binding-factor 2, Argonaute 2 and Ribonuclease III, NucleicNet can accurately recover interaction modes discovered by structural biology experiments. Furthermore, we show that, without seeing any in vitro or in vivo assay data, NucleicNet can still achieve consistency with experiments, including RNAcompete, Immunoprecipitation Assay, and siRNA Knockdown Benchmark. NucleicNet can thus serve to provide quantitative fitness of RNA sequences for given binding pockets or to predict potential binding pockets and binding RNAs for previously unknown RNA binding proteins.


2019 ◽  
Vol 35 (23) ◽  
pp. 4867-4870
Author(s):  
Chengyu Liu ◽  
Yu-Chen Liu ◽  
Hsien-Da Huang ◽  
Wei Wang

Abstract Motivation: In recent years, multiple circular RNA (circRNA) biogenesis mechanisms have been discovered. Although each reported mechanism has been experimentally verified in different circRNAs, no single biogenesis mechanism has been proposed that can universally explain the biogenesis of all of the tens of thousands of discovered circRNAs. Under the hypothesis that human circRNAs can be categorized according to different biogenesis mechanisms, we designed a contextual regression model trained to predict the formation of circular RNA from a random genomic locus in the human genome, with potential biogenesis factors of circular RNA as the features of the training data. Results: After achieving high prediction accuracy, we found through the feature extraction technique that the examined human circRNAs can be categorized into seven subgroups, according to the presence of the following sequence features: RNA editing sites, simple repeat sequences, self-chains, RNA-binding protein binding sites and CpG islands within the flanking regions of the circular RNA back-spliced junction sites. These results support all of the previously reported biogenesis mechanisms of circRNA and solidify the idea that multiple biogenesis mechanisms co-exist for different subsets of human circRNAs. Furthermore, we uncover potential new links between circRNA biogenesis and flanking CpG islands. We have also identified RNA-binding proteins putatively correlated with circRNA biogenesis. Availability and implementation: Scripts and tutorial are available at http://wanglab.ucsd.edu/star/circRNA. This program is under GNU General Public License v3.0. Supplementary information: Supplementary data are available at Bioinformatics online.


2012 ◽  
Vol 3 (5) ◽  
pp. 403-414 ◽  
Author(s):  
Jochen Imig ◽  
Alexander Kanitz ◽  
André P. Gerber

Abstract The development of genome-wide analysis tools has prompted global investigation of the gene expression program, revealing highly coordinated control mechanisms that ensure proper spatiotemporal activity of a cell’s macromolecular components. With respect to the regulation of RNA transcripts, the concept of RNA regulons, which, by analogy with DNA regulons in bacteria, refers to the coordinated control of functionally related RNA molecules, has emerged as a unifying theory that describes the logic of regulatory RNA-protein interactions in eukaryotes. Hundreds of RNA-binding proteins and small non-coding RNAs, such as microRNAs, bind to distinct elements in target RNAs, thereby exerting specific and concerted control over posttranscriptional events. In this review, we discuss recent reports that systematically explore the RNA-protein interaction network and outline some of the principles and recurring features of RNA regulons: the coordination of functionally related mRNAs through RNA-binding proteins or non-coding RNAs, the modular structure of their components, and the dynamic rewiring of RNA-protein interactions upon exposure to internal or external stimuli. We also summarize evidence for robust combinatorial control of mRNAs, which could determine the ultimate fate of each mRNA molecule in a cell. Finally, the compilation and integration of global protein-RNA interaction data have yielded first insights into network structures and provided the hypothesis that RNA regulons may, in part, constitute noise ‘buffers’ to handle stochasticity in cellular transcription.


2019 ◽  
Author(s):  
Sean R. Kundinger ◽  
Isaac Bishof ◽  
Eric B. Dammer ◽  
Duc M. Duong ◽  
Nicholas T. Seyfried

Abstract Arginine (Arg)-rich RNA-binding proteins play an integral role in RNA metabolism. Post-translational modifications (PTMs) within Arg-rich domains, such as phosphorylation and methylation, regulate multiple steps in RNA metabolism. However, the identification of PTMs within Arg-rich domains with complete trypsin digestion is extremely challenging due to the high density of Arg residues within these proteins. Here, we report a middle-down proteomic approach coupled with electron transfer dissociation (ETD) mass spectrometry to map previously unknown sites of phosphorylation and methylation within the Arg-rich domains of U1-70K and structurally similar RNA-binding proteins from nuclear extracts of HEK293 cells. Remarkably, the Arg-rich domains in RNA-binding proteins are densely modified by methylation and phosphorylation compared with the remainder of the proteome, with di-methylation and phosphorylation favoring RSRS motifs. Although they favor a common motif, analysis of combinatorial PTMs within RSRS motifs indicates that phosphorylation and methylation do not often co-occur, suggesting they may functionally oppose one another. Collectively, these findings suggest that the level of PTMs within Arg-rich domains may be among the highest in the proteome, and a possible unexplored regulator of RNA metabolism. These data also serve as a resource to facilitate future mechanistic studies of the role of PTMs in RNA-binding protein structure and function. Briefs: Middle-down proteomics reveals arginine-rich RNA-binding proteins contain many sites of methylation and phosphorylation.


2021 ◽  
Vol 22 (17) ◽  
pp. 9416
Author(s):  
Rafał Mańka ◽  
Pawel Janas ◽  
Karolina Sapoń ◽  
Teresa Janas ◽  
Tadeusz Janas

RNA motifs may promote interactions with exosomes (EXO-motifs) and lipid rafts (RAFT-motifs) that are enriched in exosomal membranes. These interactions can promote selective RNA loading into exosomes. We quantified the affinity between RNA aptamers containing various EXO- and RAFT-motifs and membrane lipid rafts in a liposome model of exosomes by determining the dissociation constants. Analysis of the secondary structure of RNA molecules provided data about the possible location of EXO- and RAFT-motifs within the RNA structure. The affinity of RNAs containing RAFT-motifs (UUGU, UCCC, CUCC, CCCU) and some EXO-motifs (CCCU, UCCU) to rafted liposomes is higher in comparison to aptamers without these motifs, suggesting direct RNA-exosome interaction. We have confirmed these results through the determination of the dissociation constant values of exosome-RNA aptamer complexes. RNAs containing EXO-motifs GGAG or UGAG have substantially lower affinity to lipid rafts, suggesting indirect RNA-exosome interaction via RNA binding proteins. Bioinformatics analysis revealed RNA aptamers containing both raft- and miRNA-binding motifs and involvement of raft-binding motifs UCCCU and CUCCC. A strategy is proposed for using functional RNA aptamers (fRNAa) containing both RAFT-motif and a therapeutic motif (e.g., miRNA inhibitor) to selectively introduce RNAs into exosomes for fRNAa delivery to target cells for personalized therapy.
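As a small illustration of locating the motifs named above within an RNA sequence, the following Python sketch reports every (possibly overlapping) occurrence of each motif. The motif lists are copied from the abstract; the function name, the 0-based coordinates, and the toy sequence are assumptions for this example. It does not model secondary structure or binding affinity.

```python
from typing import Dict, Iterable, List

# Motifs as listed in the abstract above.
RAFT_MOTIFS = ("UUGU", "UCCC", "CUCC", "CCCU")
EXO_MOTIFS = ("CCCU", "UCCU", "GGAG", "UGAG")


def find_motifs(rna: str, motifs: Iterable[str]) -> Dict[str, List[int]]:
    """Return 0-based start positions of each motif, allowing overlaps."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits: Dict[str, List[int]] = {}
    for motif in motifs:
        positions: List[int] = []
        idx = rna.find(motif)
        while idx != -1:
            positions.append(idx)
            # Restart one base further on so overlapping matches are found.
            idx = rna.find(motif, idx + 1)
        hits[motif] = positions
    return hits


if __name__ == "__main__":
    toy_aptamer = "GGUCCCUUGU"  # invented sequence for the example
    for motif, positions in find_motifs(toy_aptamer, RAFT_MOTIFS).items():
        print(motif, positions)
```

Positions found this way could be mapped onto a predicted secondary structure to ask whether a motif sits in a loop or a stem, which is the kind of location question the abstract raises.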


2020 ◽  
Author(s):  
Sungyul Lee ◽  
Young-suk Lee ◽  
Yeon Choi ◽  
Ahyeon Son ◽  
Youngran Park ◽  
...  

Abstract SARS-CoV-2 is an RNA virus whose success as a pathogen relies on its ability to repurpose host RNA-binding proteins (RBPs) to form its own RNA interactome. Here, we developed and applied a robust ribonucleoprotein capture protocol to uncover the SARS-CoV-2 RNA interactome. We report 109 host factors that directly bind to SARS-CoV-2 RNAs including general antiviral factors such as ZC3HAV1, TRIM25, and PARP12. Applying RNP capture on another coronavirus HCoV-OC43 revealed evolutionarily conserved interactions between viral RNAs and host proteins. Network and transcriptome analyses delineated antiviral RBPs stimulated by JAK-STAT signaling and proviral RBPs responsible for hijacking multiple steps of the mRNA life cycle. By knockdown experiments, we further found that these viral-RNA-interacting RBPs act against or in favor of SARS-CoV-2. Overall, this study provides a comprehensive list of RBPs regulating coronaviral replication and opens new avenues for therapeutic interventions.

