Classification and clustering of RNA crosslink-ligation data reveal complex structures and homodimers

2021
Author(s):
Minjie Zhang
Irena T Fischer-Hwang
Kongpan Li
Jianhui Bai
Jian-Fu Chen
...

The recent development and application of methods based on the general principle of "crosslinking and proximity ligation" (crosslink-ligation) are revolutionizing RNA structure studies in living cells. However, extracting structure information from such data presents unique challenges. Here we introduce a set of computational tools for the systematic analysis of data from a wide variety of crosslink-ligation methods, specifically focusing on read mapping, alignment classification and clustering. We design a new strategy to map short reads with irregular gaps at high sensitivity and specificity. Analysis of previously published data reveals distinct properties and biases caused by the crosslinking reactions. We perform rigorous and exhaustive classification of alignments and discover 8 types of arrangements that provide distinct information on RNA structures and interactions. To deconvolve the dense and intertwined gapped alignments, we develop a network/graph-based tool, CRSSANT (Crosslinked RNA Secondary Structure Analysis using Network Techniques), which enables clustering of gapped alignments and discovery of new alternative and dynamic conformations. We discover that multiple crosslinking and ligation events can occur on the same RNA, generating multi-segment alignments that report complex high-level RNA structures and multi-RNA interactions. We find that alignments with overlapped segments are produced from potential homodimers and develop a new method for their de novo identification. Analysis of overlapping alignments revealed potential new homodimers in cellular noncoding RNAs and RNA virus genomes in the Picornaviridae family. Together, this suite of computational tools enables rapid and efficient analysis of RNA structure and interaction data in living cells.
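As a rough illustration of the two ideas in this abstract (graph-based grouping of gapped alignments, and flagging alignments whose two segments overlap), the following minimal Python sketch may help. It is not the CRSSANT implementation: the interval representation, the `min_overlap` threshold, and the function names `overlap`, `is_overlapping_alignment` and `cluster_alignments` are all assumptions made for illustration, and plain connected components stand in for whatever network technique the tool actually uses.

```python
from typing import List, Tuple

Interval = Tuple[int, int]          # half-open [start, end) on one reference RNA (assumed)
Gapped = Tuple[Interval, Interval]  # (left segment, right segment) of a gapped alignment


def overlap(a: Interval, b: Interval) -> int:
    """Number of reference positions shared by two intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def is_overlapping_alignment(aln: Gapped) -> bool:
    """True when the two segments of one alignment overlap each other,
    the pattern the abstract links to potential homodimers."""
    return overlap(aln[0], aln[1]) > 0


def cluster_alignments(alns: List[Gapped], min_overlap: int = 1) -> List[List[int]]:
    """Group alignments into connected components of an overlap graph.

    Two alignments are joined by an edge when both their left segments and
    their right segments share at least `min_overlap` positions
    (a hypothetical criterion chosen for this sketch).
    """
    parent = list(range(len(alns)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(alns)):
        for j in range(i + 1, len(alns)):
            if (overlap(alns[i][0], alns[j][0]) >= min_overlap
                    and overlap(alns[i][1], alns[j][1]) >= min_overlap):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(alns)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


example = [
    ((0, 10), (50, 60)),       # supports the same duplex as the next alignment
    ((5, 15), (55, 65)),
    ((100, 110), (200, 210)),  # unrelated duplex
    ((20, 40), (30, 50)),      # segments overlap each other
]
clusters = cluster_alignments(example)
flagged = [i for i, aln in enumerate(example) if is_overlapping_alignment(aln)]
```

The pairwise comparison is quadratic in the number of alignments, which is acceptable for a sketch but would need interval indexing for real sequencing data.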

2021
Author(s):  
Tycho Marinus ◽  
Adam B Fessler ◽  
Craig A Ogle ◽  
Danny Incarnato

Abstract Due to the mounting evidence that RNA structure plays a critical role in regulating almost any physiological as well as pathological process, being able to accurately define the folding of RNA molecules within living cells has become a crucial need. We introduce here 2-aminopyridine-3-carboxylic acid imidazolide (2A3), as a general probe for the interrogation of RNA structures in vivo. 2A3 shows moderate improvements with respect to the state-of-the-art selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) reagent NAI on naked RNA under in vitro conditions, but it significantly outperforms NAI when probing RNA structure in vivo, particularly in bacteria, underlining its increased ability to permeate biological membranes. When used as a restraint to drive RNA structure prediction, data derived by SHAPE-MaP with 2A3 yields more accurate predictions than NAI-derived data. Due to its extreme efficiency and accuracy, we can anticipate that 2A3 will rapidly take over conventional SHAPE reagents for probing RNA structures both in vitro and in vivo.


2015
Author(s):  
Katarzyna B Hooks ◽  
Samina Naseeb ◽  
Sam Griffiths-Jones ◽  
Daniela Delneri

The Saccharomyces cerevisiae genome has undergone extensive intron loss during its evolutionary history. It has been suggested that the few remaining introns (in only 5% of protein-coding genes) are retained because of their impact on function under stress conditions. Here, we explore the possibility that novel non-coding RNA structures (ncRNAs) are embedded within intronic sequences and are contributing to phenotype and intron retention in yeast. We employed de novo RNA structure prediction tools to screen intronic sequences in S. cerevisiae and 36 other fungi. We identified and validated 19 new intronic RNAs via RNAseq and RT-PCR. Contrary to the common belief that excised introns are rapidly degraded, we found that, in six cases, the excised introns were maintained intact in the cells. In two other cases we showed that the ncRNAs were further processed from their introns. RNAseq analysis confirmed higher expression of introns in the ribosomal protein genes containing predicted RNA structures. We deleted the novel intronic RNA structure within the GLC7 intron and showed that this predicted ncRNA, rather than the intron itself, is responsible for the cell's ability to respond to salt stress. We also showed a direct association between the presence of the intronic ncRNA and GLC7 expression. Overall, these data support the notion that some introns may have been maintained in the genome because they harbour functional ncRNAs.


2020
Author(s):  
Veronica F. Busa ◽  
Alexander V. Favorov ◽  
Elana J. Fertig ◽  
Anthony K. L. Leung

Abstract The etiology of diseases driven by dysregulated mRNA metabolism can be elucidated by characterizing the responsible RNA-binding proteins (RBPs). Although characterizations of RBPs have been mainly focused on their binding sequences, not much has been investigated about their preferences for RNA structures. We present nearBynding, an R/Bioconductor pipeline that incorporates RBP binding sites and RNA structure information to discern structural binding preferences for an RBP. nearBynding visualizes RNA structure at and proximal to sites of RBP binding transcriptome-wide, analyzes CLIP-seq data without peak-calling, and provides a flexible scaffold to study RBP binding preferences relative to diverse RNA structure data types.


2020
Author(s):  
Tycho Marinus ◽  
Adam B. Fessler ◽  
Craig A. Ogle ◽  
Danny Incarnato

Abstract Due to the mounting evidence that RNA structure plays a critical role in regulating almost any physiological as well as pathological process, being able to accurately define the folding of RNA molecules within living cells has become a crucial need. We introduce here 2-aminopyridine-3-carboxylic acid imidazolide (2A3), as a general probe for the interrogation of RNA structures in vivo. 2A3 shows moderate improvements with respect to the state-of-the-art SHAPE reagent NAI on naked RNA under in vitro conditions, but it significantly outperforms NAI when probing RNA structure in vivo, particularly in bacteria, underlining its increased ability to permeate biological membranes. When used as a restraint to drive RNA structure prediction, data derived by SHAPE-MaP with 2A3 yields more accurate predictions than NAI-derived data. Due to its extreme efficiency and accuracy, we can anticipate that 2A3 will rapidly take over conventional SHAPE reagents for probing RNA structures both in vitro and in vivo.


Author(s):  
Haopeng Yu ◽  
Yi Zhang ◽  
Qing Sun ◽  
Huijie Gao ◽  
Shiheng Tao

Abstract RNA fulfills a crucial regulatory role in cells by folding into a complex RNA structure. To date, a chemical compound, dimethyl sulfate (DMS), has been developed to probe the RNA structure at the transcriptome level effectively. We proposed a database, RSVdb (https://taolab.nwafu.edu.cn/rsvdb/), for the browsing and visualization of transcriptome RNA structures. RSVdb, including 626 225 RNAs with validated DMS reactivity from 178 samples in eight species, supports four main functions: information retrieval, research overview, structure prediction and resource download. Users can search for species, studies, transcripts and genes of interest; browse the quality control of sequencing data and statistical charts of RNA structure information; preview and perform online prediction of RNA structures in silico and under DMS restraint of different experimental treatments; and download RNA structure data for species and studies. Together, RSVdb provides a reference for RNA structure and will support future research on the function of RNA structure at the transcriptome level.


2019
Author(s):  
Haopeng Yu ◽  
Yi Zhang ◽  
Qing Sun ◽  
Huijie Gao ◽  
Shiheng Tao

Abstract RNA fulfills a crucial regulatory role in cells by folding into a complex RNA structure. To date, a chemical compound, dimethyl sulfate (DMS), has been developed to effectively probe the RNA structure at the transcriptome level. We proposed a database, RSVdb (https://taolab.nwafu.edu.cn/rsvdb/), for the browsing and visualization of transcriptome RNA structures. RSVdb, including 626,225 RNAs with validated DMS reactivity from 178 samples in 8 species, supports four main functions: information retrieval, research overview, structure prediction, and resource download. Users can search for species, studies, transcripts and genes of interest; browse the quality control of sequencing data and statistical charts of RNA structure information; preview and perform online prediction of RNA structures in silico and under DMS restraint of different experimental treatments; and download RNA structure data for species and studies. Together, RSVdb provides a reference for RNA structure and will support future research on the function of RNA structure at the transcriptome level.


2017
Author(s):  
Josef Pánek ◽  
Martin Černý

Abstract While understanding the structure of RNA molecules is vital for deciphering their functions, determining RNA structures experimentally is exceptionally hard. At the same time, extant approaches to computational RNA structure prediction have limited applicability and reliability. In this paper we provide a method to solve a simpler yet still biologically relevant problem: prediction of RNA secondary structure using the structure of a different molecule as a template. Our method identifies conserved and unconserved subsequences within an RNA molecule. For conserved subsequences, the template structure is directly transferred into the generated structure and combined with the de novo predicted structure for the unconserved subsequences, which have low evolutionary conservation. The method also determines when the generated structure is unreliable. The method is validated using experimentally identified structures. The accuracy of the method exceeds that of classical prediction algorithms and constrained prediction methods. This is demonstrated by comparison using a large number of heterogeneous RNAs. The presented method is fast and robust, and useful for various applications requiring knowledge of secondary structures of individual RNA sequences.


2021
Vol 12 (1)
Author(s):  
Qing-Jun Luo ◽  
Jinsong Zhang ◽  
Pan Li ◽  
Qing Wang ◽  
Yue Zhang ◽  
...  

Abstract It is known that an RNA’s structure determines its biological function, yet current RNA structure probing methods only capture partial structure information. The ability to measure intact (i.e., full length) RNA structures will facilitate investigations of the functions and regulation mechanisms of small RNAs and identify short fragments of functional sites. Here, we present icSHAPE-MaP, an approach combining in vivo selective 2′-hydroxyl acylation and mutational profiling to probe intact RNA structures. We further showcase the RNA structural landscape of substrates bound by human Dicer based on the combination of RNA immunoprecipitation pull-down and icSHAPE-MaP small RNA structural profiling. We discover distinct structural categories of Dicer substrates in correlation to both their binding affinity and cleavage efficiency. Using tertiary structural modeling constrained by icSHAPE-MaP RNA structural data, we identify spatial distance measurement as an influential parameter for Dicer cleavage-site selection.


2021
Vol 18 (1)
Author(s):  
César Augusto Diniz Xavier ◽  
Margaret Louise Allen ◽  
Anna Elizabeth Whitfield

Abstract Background: Advances in sequencing and analysis tools have facilitated discovery of many new viruses from invertebrates, including ants. Solenopsis invicta is an invasive ant that has quickly spread worldwide, causing significant ecological and economic impacts. Its virome has begun to be characterized with a view to the potential use of viruses as natural enemies. Although the S. invicta virome is the best characterized among ants, most studies have been performed in its native range, with less information from invaded areas. Methods: Using a metatranscriptome approach, we further identified and molecularly characterized virus sequences associated with S. invicta in two introduced areas, the U.S. and Taiwan. The data set used here was obtained from different stages (larvae, pupae, and adults) of the S. invicta life cycle. Publicly available RNA sequences from GenBank’s Sequence Read Archive were downloaded and de novo assembled using CLC Genomics Workbench 20.0.1. Contigs were compared against the non-redundant protein sequences and those showing similarity to viral sequences were further analyzed. Results: We characterized five putative new viruses associated with S. invicta transcriptomes. Sequence comparisons revealed extensive divergence across ORFs and genomic regions, with most of them sharing less than 40% amino acid identity with the closest homologous sequences previously characterized. The first negative-sense single-stranded RNA virus genomic sequences included in the orders Bunyavirales and Mononegavirales are reported. In addition, two positive-sense single-stranded virus genome sequences and one single-stranded DNA virus genome sequence were also identified. While the presence of a putative tenuivirus associated with S. invicta was previously suggested to be a contamination, here we characterize Solenopsis invicta virus 14 (SINV-14) and present strong evidence that it is a tenui-like virus that has a long-term association with the ant. Furthermore, based on virus sequence abundance compared to housekeeping genes, phylogenetic relationships, and completeness of viral coding sequences, our results suggest that four of the five virus sequences reported, namely SINV-14, SINV-15, SINV-16 and SINV-17, may be associated with viruses actively replicating in the ant S. invicta. Conclusions: The present study expands our knowledge about viral diversity associated with S. invicta in introduced areas, with potential to be used as biological control agents, which will require further biological characterization.

