EVALUATING MIXTURE MODELS FOR BUILDING RNA KNOWLEDGE-BASED POTENTIALS

Ribonucleic acid (RNA) molecules play important roles in a variety of biological processes. To function properly, RNA molecules usually have to fold into specific structures, so understanding RNA structure is vital to comprehending how RNA functions. One approach to understanding and predicting biomolecular structure is to use knowledge-based potentials built from experimentally determined structures. These potentials have been shown to be effective for predicting both protein and RNA structures, but their utility is limited by their significantly rugged nature. This ruggedness (and hence the potential's usefulness) depends heavily on the choice of bin width used to sort structural information (e.g. distances), but the appropriate bin width is not known a priori. To circumvent the binning problem, we compared knowledge-based potentials built from inter-atomic distances in RNA structures using different mixture models (Kernel Density Estimation, Expectation Maximization and Dirichlet Process). We show that the smooth knowledge-based potential built from the Dirichlet process is successful in selecting native-like RNA models from different sets of structural decoys, with efficacy comparable to that of a potential developed by spline-fitting binned distance histograms (a commonly taken approach). The less rugged nature of our potential suggests its applicability in diverse types of structural modeling.
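
As a rough illustration of the binning-free idea described in this abstract, the sketch below builds a smooth distance potential by inverse Boltzmann weighting of two Gaussian kernel density estimates. The synthetic distances, the uniform reference state, the bandwidth of 0.3, and all function names are assumptions made for illustration only; they are not taken from the paper, which also evaluates other estimators such as the Dirichlet process mixture.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def kde_potential(observed, reference, bandwidth, floor=1e-12):
    """Energy (in units of kT) from the log-ratio of observed to reference densities."""
    p_obs = gaussian_kde(observed, bandwidth)
    p_ref = gaussian_kde(reference, bandwidth)

    def energy(r):
        # The floor avoids log(0) where neither sample set has support.
        return -math.log(max(p_obs(r), floor) / max(p_ref(r), floor))

    return energy


# Synthetic data (assumption): "observed" distances cluster near 5 Angstrom,
# while the hypothetical reference state is uniform between 2 and 10 Angstrom.
rng = random.Random(0)
observed = [rng.gauss(5.0, 0.5) for _ in range(500)]
reference = [rng.uniform(2.0, 10.0) for _ in range(500)]

energy = kde_potential(observed, reference, bandwidth=0.3)
for r in (4.0, 5.0, 6.0, 8.0):
    print(f"r = {r:.1f}  E = {energy(r):+.2f} kT")
```

Because the density estimate is continuous, the resulting energy curve has no bin edges. The bandwidth plays a role similar to the bin width, however, so it still has to be chosen or learned from the data.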

PTRNAmark: an all-atomic distance-dependent knowledge-based potential for 3D RNA structure evaluation

10.1101/076000 ◽

2016 ◽

Author(s):

Yi Yang ◽

Qi Gu ◽

Ya-Zhou Shi

Keyword(s):

Rna Structure ◽

Data Sets ◽

Rna Structures ◽

Tertiary Structures ◽

Rna Molecules ◽

Knowledge Based ◽

Powerful Approach ◽

Testing Data ◽

Atomic Distance ◽

Structure Evaluation

RNA molecules play vital biological roles, and understanding their structures gives crucial insight into their biological functions. Model evaluation is a necessary step for better prediction and design of 3D RNA structures. Knowledge-based statistical potentials have proved to be a powerful approach for evaluating models of protein tertiary structures. At present, several knowledge-based potentials have also been proposed to assess models of RNA 3D structures. However, further improvement is required to rank near-native structures and to pick out the native structure from among them, which is crucial in the prediction of RNA tertiary structures. In this work, we built a novel RNA knowledge-based potential, PTRNAmark, which not only combines nucleotides’ mutual and self energies but also fully considers the specificity of every RNA. Benchmarks on different testing data sets all show that PTRNAmark is more efficient than existing evaluation methods in recognizing the native state from a pool of near-native states of RNAs, as well as in ranking near-native states of RNA models.

Cotranscriptional kinetic folding of RNA secondary structures

10.1101/2020.07.10.196972 ◽

2020 ◽

Author(s):

Vo Hong Thanh ◽

Pekka Orponen

Keyword(s):

Structure Formation ◽

Rna Structure ◽

Rna Folding ◽

Computational Prediction ◽

Simulation Method ◽

Rna Structures ◽

Rna Molecules ◽

Primitive Operation ◽

Nucleotide Resolution ◽

Kinetics Of

Computational prediction of RNA structures is an important problem in computational structural biology. Studies of RNA structure formation often assume that the process starts from a fully synthesized sequence. Experimental evidence, however, has shown that RNA folds concurrently with its elongation. We investigate RNA structure formation, taking into account also the cotranscriptional effects. We propose a single-nucleotide resolution kinetic model of the folding process of RNA molecules, where the polymerase-driven elongation of an RNA strand by a new nucleotide is included as a primitive operation, together with a stochastic simulation method that implements this folding concurrently with the transcriptional synthesis. Numerical case studies show that our cotranscriptional RNA folding model can predict the formation of metastable conformations that are favored in actual biological systems. Our new computational tool can thus provide quantitative predictions and offer useful insights into the kinetics of RNA folding.
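
The idea of treating elongation as one more primitive event in a stochastic simulation can be sketched with a toy Gillespie-style loop. This is not the authors' model: the constant rates `k_elong`, `k_add` and `k_del`, the simple pairing rules, and the example sequence are all assumptions chosen to keep the sketch short, whereas a realistic kinetic model would derive move rates from free-energy differences.

```python
import math
import random

# Watson-Crick and wobble pairs allowed in this toy model.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_add(i, j, seq, pairs, min_loop=3):
    """True if (i, j) is complementary, closes a loop of at least `min_loop`
    unpaired bases, and neither shares a base with nor crosses an existing pair."""
    if j - i <= min_loop or (seq[i], seq[j]) not in PAIRS:
        return False
    for a, b in pairs:
        if len({i, j, a, b}) < 4:
            return False
        if a < i < b < j or i < a < j < b:
            return False
    return True


def simulate(seq, k_elong=5.0, k_add=1.0, k_del=0.5, seed=0):
    """Fold while transcribing; return (time, pairs) once the last base is added."""
    rng = random.Random(seed)
    length, pairs, t = 1, set(), 0.0
    while length < len(seq):
        # Enumerate every event possible in the current state with its rate.
        events = [("elongate", None, k_elong)]
        for i in range(length):
            for j in range(i + 1, length):
                if can_add(i, j, seq, pairs):
                    events.append(("add", (i, j), k_add))
        events += [("del", p, k_del) for p in sorted(pairs)]

        total = sum(rate for _, _, rate in events)
        # Exponentially distributed waiting time until the next event.
        t += -math.log(1.0 - rng.random()) / total
        # Choose one event with probability proportional to its rate.
        pick = rng.random() * total
        for kind, pair, rate in events:
            pick -= rate
            if pick < 0:
                break

        if kind == "elongate":
            length += 1  # the polymerase appends one nucleotide
        elif kind == "add":
            pairs.add(pair)
        else:
            pairs.discard(pair)
    return t, pairs


SEQ = "GGGAAAUCCCAGGGAAACCC"  # hypothetical example sequence
final_time, final_pairs = simulate(SEQ, seed=1)
print(f"transcription finished at t = {final_time:.2f}")
print("base pairs:", sorted(final_pairs))
```

Only bases that have already been transcribed can pair, so early helices may form before their eventual competitors exist. That ordering effect is what a cotranscriptional model is meant to capture.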

A novel SHAPE reagent enables the analysis of RNA structure in living cells with unprecedented accuracy

Nucleic Acids Research ◽

10.1093/nar/gkaa1255 ◽

2021 ◽

Author(s):

Tycho Marinus ◽

Adam B Fessler ◽

Craig A Ogle ◽

Danny Incarnato

Keyword(s):

Rna Structure ◽

Structure Prediction ◽

Critical Role ◽

Living Cells ◽

Pathological Process ◽

Rna Structures ◽

Rna Molecules ◽

Derived Data

Abstract Due to the mounting evidence that RNA structure plays a critical role in regulating almost any physiological as well as pathological process, being able to accurately define the folding of RNA molecules within living cells has become a crucial need. We introduce here 2-aminopyridine-3-carboxylic acid imidazolide (2A3), as a general probe for the interrogation of RNA structures in vivo. 2A3 shows moderate improvements with respect to the state-of-the-art selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) reagent NAI on naked RNA under in vitro conditions, but it significantly outperforms NAI when probing RNA structure in vivo, particularly in bacteria, underlining its increased ability to permeate biological membranes. When used as a restraint to drive RNA structure prediction, data derived by SHAPE-MaP with 2A3 yields more accurate predictions than NAI-derived data. Due to its extreme efficiency and accuracy, we can anticipate that 2A3 will rapidly take over conventional SHAPE reagents for probing RNA structures both in vitro and in vivo.

Distributed Biotin-Streptavidin Transcription Roadblocks for Mapping Cotranscriptional RNA Folding

10.1101/100073 ◽

2017 ◽

Author(s):

Eric J. Strobel ◽

Kyle E. Watters ◽

Julius B. Lucks

Keyword(s):

Experimental Study ◽

Rna Structure ◽

Rna Folding ◽

Folding Pathway ◽

Decision Making Process ◽

Important Process ◽

Rna Structures ◽

Rna Molecules ◽

Fundamental Properties ◽

Nucleotide Resolution

RNA molecules fold cotranscriptionally as they emerge from RNA polymerase. Cotranscriptional folding is an important process for proper RNA structure formation, as the order of folding can determine an RNA molecule’s structure and thus its functional properties. Despite its fundamental importance, the experimental study of RNA cotranscriptional folding has been limited by the lack of easily approachable methods that can interrogate nascent RNA structures at nucleotide resolution during transcription. We previously developed cotranscriptional selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq) to simultaneously probe all of the intermediate structures an RNA molecule transitions through during transcription elongation. Here, we improve the broad applicability of cotranscriptional SHAPE-Seq by developing a sequence-independent streptavidin roadblocking strategy to simplify the preparation of roadblocking transcription templates. We determine the fundamental properties of streptavidin roadblocks and show that randomly distributed streptavidin roadblocks can be used in cotranscriptional SHAPE-Seq experiments to measure the Bacillus cereus crcB fluoride riboswitch folding pathway. Comparison of EcoRI E111Q and streptavidin roadblocks in cotranscriptional SHAPE-Seq data shows that both strategies identify the same RNA structural transitions related to the riboswitch decision-making process. Finally, we propose guidelines to leverage the complementary strengths of each transcription roadblock for use in studying cotranscriptional folding.

SHAPER: A Web Server for Fast and Accurate SHAPE Reactivity Prediction

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2021.721955 ◽

2021 ◽

Vol 8 ◽

Author(s):

Yuanzhe Zhou ◽

Jun Li ◽

Travis Hurst ◽

Shi-Jie Chen

Keyword(s):

Rna Structure ◽

Structural Information ◽

3D Structure ◽

Web Server ◽

Rna Structures ◽

Structure Selection ◽

Chemical Probing ◽

Local Flexibility ◽

Multiple Conformations ◽

Shape Data

Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) chemical probing serves as a convenient and efficient experiment technique for providing information about RNA local flexibility. The local structural information contained in SHAPE reactivity data can be used as constraints in 2D/3D structure predictions. Here, we present SHAPE predictoR (SHAPER), a web server for fast and accurate SHAPE reactivity prediction. The main purpose of the SHAPER web server is to provide a portal that uses experimental SHAPE data to refine 2D/3D RNA structure selection. Input structures for the SHAPER server can be obtained through experimental or computational modeling. The SHAPER server can accept RNA structures with single or multiple conformations, and the predicted SHAPE profile and correlation with experimental SHAPE data (if provided) for each conformation can be freely downloaded through the web portal. The SHAPER web server is available at http://rna.physics.missouri.edu/shaper/.
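
The use of experimental SHAPE data to choose among candidate conformations can be illustrated by ranking predicted profiles by their correlation with the measured one. The sketch below assumes a Pearson correlation and uses invented reactivity values and conformation names; the abstract does not specify which correlation measure the server reports, so treat this as one plausible reading rather than the SHAPER implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_conformations(predicted, experimental):
    """Sort conformations by how well their predicted profile matches experiment."""
    scores = {name: pearson(profile, experimental) for name, profile in predicted.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented per-nucleotide reactivities (assumption, for illustration only).
experimental = [0.1, 0.9, 1.2, 0.2, 0.05, 0.8]
predicted = {
    "conf_A": [0.2, 0.8, 1.0, 0.3, 0.1, 0.7],
    "conf_B": [0.9, 0.1, 0.2, 1.1, 0.8, 0.3],
}

ranking = rank_conformations(predicted, experimental)
for name, score in ranking:
    print(f"{name}: r = {score:+.3f}")
```

In this toy input, `conf_A` tracks the measured profile and ranks first, while `conf_B` is anticorrelated with it and ranks last.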

Estimating RNA structure chemical probing reactivities from reverse transcriptase stops and mutations

10.1101/292532 ◽

2018 ◽

Cited By ~ 4

Author(s):

Angela M Yu ◽

Molly E. Evans ◽

Julius B. Lucks

Keyword(s):

Statistical Analysis ◽

Reverse Transcriptase ◽

Rna Structure ◽

Rna Structures ◽

Modeling Framework ◽

Single Experiment ◽

Complementary Dna ◽

Rna Molecules ◽

Chemical Probing ◽

Covalent Adducts

Chemical probing experiments interrogate RNA structures by creating covalent adducts on RNA molecules in structure-dependent patterns. Adduct positions are then detected through conversion of the modified RNAs into complementary DNA (cDNA) by reverse transcription (RT) as either stops (RT-stops) or mutations (RT-mutations). Statistical analysis of the frequencies of RT-stops and RT-mutations can then be used to estimate a measure of chemical probing reactivity at each nucleotide of an RNA, which reveals properties of the underlying RNA structure. Inspired by recent work that showed that different reverse transcriptase enzymes show distinct biases for detecting adducts as either RT-stops or RT-mutations, here we use a statistical modeling framework to derive an equation for chemical probing reactivity using experimental signatures from both RT-stops and RT-mutations within a single experiment. The resulting formula intuitively matches the expected result from considering reactivity to be defined as the fraction of adduct observed at each position in an RNA at the end of a chemical probing experiment. We discuss assumptions and implementation of the model, as well as ways in which the model may be experimentally validated.

RNAlign2D: a rapid method for combined RNA structure and sequence-based alignment using a pseudo-amino acid substitution matrix

BMC Bioinformatics ◽

10.1186/s12859-021-04426-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Tomasz Woźniak ◽

Małgorzata Sajek ◽

Jadwiga Jaruzelska ◽

Marcin Piotr Sajek

Keyword(s):

Amino Acid ◽

Amino Acid Substitution ◽

Rna Structure ◽

Structural Information ◽

Amino Acid Sequences ◽

Substitution Matrix ◽

Rna Sequences ◽

Rna Molecules ◽

Bioinformatic Tools ◽

Amino Acid Substitution Matrix

Background: The functions of RNA molecules are mainly determined by their secondary structures. These functions can also be predicted using bioinformatic tools that enable the alignment of multiple RNAs to determine functional domains and/or classify RNA molecules into RNA families. However, the existing multiple RNA alignment tools, which use structural information, are slow in aligning long molecules and/or a large number of molecules. Therefore, a more rapid tool for multiple RNA alignment may improve the classification of known RNAs and help to reveal the functions of newly discovered RNAs. Results: Here, we introduce an extremely fast Python-based tool called RNAlign2D. It converts RNA sequences to pseudo-amino acid sequences, which incorporate structural information, and uses a customizable scoring matrix to align these RNA molecules via the multiple protein sequence alignment tool MUSCLE. Conclusions: RNAlign2D produces accurate RNA alignments in a very short time. The pseudo-amino acid substitution matrix approach utilized in RNAlign2D is applicable for virtually all protein aligners.
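
The conversion step described here, turning a nucleotide plus its structural state into a single pseudo-amino acid letter, can be sketched as a lookup table. The mapping below is hypothetical: it simply assigns the first twelve amino acid letters to the twelve combinations of nucleotide and dot-bracket symbol, and it is not the table RNAlign2D actually uses. Pseudoknot brackets are ignored for brevity.

```python
from itertools import product

# Hypothetical alphabet: 4 nucleotides x 3 dot-bracket symbols = 12 letters.
AMINO = "ACDEFGHIKLMN"
ENCODE = {pair: aa for pair, aa in zip(product("ACGU", "(.)"), AMINO)}
DECODE = {aa: pair for pair, aa in ENCODE.items()}


def to_pseudo_protein(seq, structure):
    """Encode an RNA sequence and its dot-bracket structure as one string."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    return "".join(ENCODE[(base, symbol)] for base, symbol in zip(seq.upper(), structure))


def from_pseudo_protein(encoded):
    """Recover the (sequence, structure) pair from an encoded string."""
    bases, symbols = zip(*(DECODE[aa] for aa in encoded))
    return "".join(bases), "".join(symbols)


seq, structure = "GGGAAAUCCC", "(((....)))"
encoded = to_pseudo_protein(seq, structure)
print(encoded)
print(from_pseudo_protein(encoded))
```

Once every RNA is a string over a protein-like alphabet, any protein aligner that accepts a custom substitution matrix can in principle align them, with the matrix scoring agreement of both nucleotide and structural state.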

Repurposing Ribo-Seq to provide insights into structured RNAs

10.1101/2020.05.18.103374 ◽

2020 ◽

Author(s):

Brayon J. Fremin ◽

Ami S. Bhatt

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Large Scale ◽

Structural Information ◽

Ribosome Profiling ◽

Rna Structures ◽

Size Selection ◽

Bacterial Ribosome ◽

Powerful Approach

Ribosome profiling (Ribo-Seq) is a powerful method to study translation in bacteria. However, this method can enrich RNAs that are not bound by ribosomes, but rather, are protected from degradation in another way. For example, Escherichia coli Ribo-Seq libraries also capture reads from most non-coding RNAs (ncRNAs). These fragments of ncRNAs pass all size selection steps of the Ribo-Seq protocol and survive hours of MNase treatment, presumably without protection from the ribosome or other macromolecules or proteins. Since bacterial ribosome profiling does not directly isolate ribosomes, but instead uses broad size range cutoffs to fractionate actively translated RNAs, it is understandable that some ncRNAs are retained after size selection. However, how these ‘contaminants’ survive MNase treatment is unclear. Through analyzing metaRibo-Seq reads across ssrS, a well-established structured RNA in E. coli, and structured direct repeats from Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) arrays in Ruminococcus lactaris, we observed that these RNAs are protected from MNase treatment by virtue of their secondary structure. Therefore, large volumes of data previously discarded as contaminants in bacterial Ribo-Seq experiments can, in fact, be used to gain information regarding the in vivo secondary structure of ncRNAs, providing unique insight into their native functional structures.

Importance: We observe that ‘contaminant’ signals in bacterial Ribo-Seq experiments that are often disregarded and discarded, in fact, strongly overlap with structured regions of ncRNAs. Structured ncRNAs are pivotal mediators of bioregulation in bacteria and their functions are often reliant on their specific structures. We present an approach to access important RNA structural information through merely repurposing ‘contaminant’ signals in bacterial Ribo-Seq experiments. This powerful approach enables us to partially resolve RNA structures, identify novel structured RNAs, and elucidate RNA structure-function relationships in bacteria at a large scale and in vivo.

A novel SHAPE reagent enables the analysis of RNA structure in living cells with unprecedented accuracy

10.1101/2020.08.31.274761 ◽

2020 ◽

Author(s):

Tycho Marinus ◽

Adam B. Fessler ◽

Craig A. Ogle ◽

Danny Incarnato

Keyword(s):

Rna Structure ◽

Structure Prediction ◽

Critical Role ◽

Living Cells ◽

Pathological Process ◽

Rna Structures ◽

Rna Molecules ◽

Derived Data

Due to the mounting evidence that RNA structure plays a critical role in regulating almost any physiological as well as pathological process, being able to accurately define the folding of RNA molecules within living cells has become a crucial need. We introduce here 2-aminopyridine-3-carboxylic acid imidazolide (2A3), as a general probe for the interrogation of RNA structures in vivo. 2A3 shows moderate improvements with respect to the state-of-the-art SHAPE reagent NAI on naked RNA under in vitro conditions, but it significantly outperforms NAI when probing RNA structure in vivo, particularly in bacteria, underlining its increased ability to permeate biological membranes. When used as a restraint to drive RNA structure prediction, data derived by SHAPE-MaP with 2A3 yields more accurate predictions than NAI-derived data. Due to its extreme efficiency and accuracy, we can anticipate that 2A3 will rapidly take over conventional SHAPE reagents for probing RNA structures both in vitro and in vivo.

Computational modeling of RNA 3D structure based on experimental data

Bioscience Reports ◽

10.1042/bsr20180430 ◽

2019 ◽

Vol 39 (2) ◽

Cited By ~ 13

Author(s):

Almudena Ponce-Salvatierra ◽

Astha ◽

Katarzyna Merdas ◽

Chandran Nithin ◽

Pritha Ghosh ◽

...

Keyword(s):

Experimental Data ◽

Computational Methods ◽

Rna Structure ◽

Structure Prediction ◽

3D Structure ◽

Rna Structures ◽

Data Types ◽

Rna Sequences ◽

Rna Molecules

Abstract RNA molecules are master regulators of cells. They are involved in a variety of molecular processes: they transmit genetic information, sense cellular signals and communicate responses, and even catalyze chemical reactions. As in the case of proteins, RNA function is dictated by its structure and by its ability to adopt different conformations, which in turn is encoded in the sequence. Experimental determination of high-resolution RNA structures is both laborious and difficult, and therefore the majority of known RNAs remain structurally uncharacterized. To address this problem, predictive computational methods were developed based on the accumulated knowledge of RNA structures determined so far, the physical basis of the RNA folding, and taking into account evolutionary considerations, such as conservation of functionally important motifs. However, all theoretical methods suffer from various limitations, and they are generally unable to accurately predict structures for RNA sequences longer than 100-nt residues unless aided by additional experimental data. In this article, we review experimental methods that can generate data usable by computational methods, as well as computational approaches for RNA structure prediction that can utilize data from experimental analyses. We outline methods and data types that can be potentially useful for RNA 3D structure modeling but are not commonly used by the existing software, suggesting directions for future development.
