Structural 3D Domain Reconstruction of the RNA Genome from Viruses with Secondary Structure Models

Three-dimensional RNA domain reconstruction is important for the assembly, disassembly and delivery functionalities of a packed proteinaceous capsid. However, to date, the self-association of RNA molecules is still an open problem. Recent chemical probing reports provide, with high reliability, the secondary structure of diverse RNA ensembles, such as those of viral genomes. Here, we present a method for reconstructing the complete 3D structure of RNA genomes, which combines a coarse-grained model with a subdomain composition scheme to obtain the entire genome inside proteinaceous capsids based on secondary structures from experimental techniques. Despite the amount of sampling involved for both folded and unfolded RNA molecules, advanced microscopy techniques can provide anchoring points, which enhance our model to include interactions between capsid pentamers and RNA subdomains. To test our method, we tackle the satellite tobacco mosaic virus (STMV) genome, which has been widely studied by both experimental and computational communities. We provide not only a methodology to structurally analyze the tertiary conformations of the RNA genome inside capsids, but also a flexible platform that allows the easy implementation of features/descriptors coming from both theoretical and experimental approaches.

RNA 3D Structure Prediction Using Coarse-Grained Models

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2021.720937 ◽

2021 ◽

Vol 8 ◽

Author(s):

Jun Li ◽

Shi-Jie Chen

Keyword(s):

Structure Prediction ◽

3D Structure ◽

Three Dimensional ◽

Coarse Grained ◽

Biological Functions ◽

3D Structures ◽

Rna Molecules ◽

Atomic Structures ◽

Long Time

The three-dimensional (3D) structures of ribonucleic acid (RNA) molecules are essential to understanding their various and important biological functions. However, experimental determination of the atomic structures is laborious and technically difficult. The large gap between the number of sequences and the number of experimentally determined structures has driven the thriving development of computational approaches to modeling RNAs. However, computational methods based on all-atom simulations are intractable for large RNA systems, which demand long-time simulations. Facing such a challenge, many coarse-grained (CG) models have been developed. Here, we provide a review of CG models for modeling RNA 3D structures, compare the performance of the different models, and offer insights into potential future developments.

Structural relation matching: an algorithm to identify structural patterns into RNAs and their interactions

Journal of Integrative Bioinformatics ◽

10.1515/jib-2020-0039 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Michela Quadrini

Keyword(s):

Hydrogen Bonding ◽

Secondary Structure ◽

Nucleotide Sequence ◽

Thermus Thermophilus ◽

Three Dimensional ◽

Secondary Structures ◽

Structural Effect ◽

Biological Processes ◽

Structural Pattern ◽

Rna Molecules

RNA molecules play crucial roles in various biological processes. Their three-dimensional configurations determine their functions and, in turn, influence their interactions with other molecules. RNAs and their interaction structures, the so-called RNA–RNA interactions, can be abstracted in terms of secondary structures, i.e., a list of the nucleotide bases paired by hydrogen bonding within the nucleotide sequence. Each secondary structure, in turn, can be abstracted into cores and shadows. Both are determined by collapsing nucleotides and arcs properly. We formalize all of these abstractions as arc diagrams, whose arcs determine loops. A secondary structure, represented by an arc diagram, is pseudoknot-free if its arc diagram does not present any crossing among arcs; otherwise, it is said to be pseudoknotted. In this study, we face the problem of identifying a given structural pattern in secondary structures, or in the associated cores or shadows, of both RNAs and RNA–RNA interactions characterized by arbitrary pseudoknots. These abstractions are mapped into a matrix, whose elements represent the relations among loops. Therefore, we face the problem by taking advantage of matrices and submatrices. The algorithms, implemented in Python, work in polynomial time. We test our approach on a set of 16S ribosomal RNAs with inhibitors of Thermus thermophilus, and we quantify the structural effect of the inhibitors.
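
The pseudoknot-free condition described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `parse_arcs` and `is_pseudoknot_free`, and the use of extended dot-bracket notation (with `[]` and `{}` for crossing arcs) as input, are assumptions made here for illustration. An arc diagram is treated as a list of arcs (i, j), and two arcs cross when their endpoints interleave.

```python
from itertools import combinations


def parse_arcs(dot_bracket):
    """Return the sorted list of arcs (i, j), i < j, encoded by a dot-bracket string.

    Round, square and curly brackets are tracked on separate stacks so that
    crossing arcs can be written, e.g. "((..[[..))..]]".
    """
    closers = {")": "(", "]": "[", "}": "{"}
    stacks = {opener: [] for opener in closers.values()}
    arcs = []
    for pos, symbol in enumerate(dot_bracket):
        if symbol in stacks:
            stacks[symbol].append(pos)
        elif symbol in closers:
            stack = stacks[closers[symbol]]
            if not stack:
                raise ValueError(f"unmatched {symbol!r} at position {pos}")
            arcs.append((stack.pop(), pos))
    if any(stacks.values()):
        raise ValueError("unmatched opening bracket")
    return sorted(arcs)


def is_pseudoknot_free(arcs):
    """True if no two arcs cross, i.e. no i < k < j < l for arcs (i, j) and (k, l)."""
    for (i, j), (k, l) in combinations(sorted(arcs), 2):
        if i < k < j < l:
            return False
    return True


nested = parse_arcs("((..))")
knotted = parse_arcs("((..[[..))..]]")
print(is_pseudoknot_free(nested), is_pseudoknot_free(knotted))
```

The pairwise check is quadratic in the number of arcs, which is enough for a sketch; the loop-relation matrices mentioned in the abstract are a richer representation that this example does not attempt to reproduce.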

GARN2: coarse-grained prediction of 3D structure of large RNA molecules by regret minimization

Bioinformatics ◽

10.1093/bioinformatics/btx175 ◽

2017 ◽

Vol 33 (16) ◽

pp. 2479-2486 ◽

Cited By ~ 5

Author(s):

Mélanie Boudard ◽

Dominique Barth ◽

Julie Bernauer ◽

Alain Denise ◽

Johanne Cohen

Keyword(s):

3D Structure ◽

Coarse Grained ◽

Regret Minimization ◽

Rna Molecules

RNA Structure Analysis: A Multifaceted Approach

Pattern Discovery in Biomolecular Data ◽

10.1093/oso/9780195119404.003.0018 ◽

1999 ◽

Author(s):

Bruce A. Shapiro ◽

Wojciech Kasprzak

Keyword(s):

Nucleic Acid ◽

Secondary Structure ◽

Tertiary Structure ◽

Hammerhead Ribozyme ◽

3D Structure ◽

Computer Prediction ◽

Rna Molecules ◽

Tertiary Interactions ◽

The One ◽

Secondary And Tertiary Structure

Genomic information (nucleic acid and amino acid sequences) completely determines the characteristics of the nucleic acid and protein molecules that express a living organism’s function. One of the greatest challenges in which computation is playing a role is the prediction of higher order structure from the one-dimensional sequence of genes. Rules for determining macromolecule folding have been continually evolving. Specifically in the case of RNA (ribonucleic acid) there are rules and computer algorithms/systems (see below) that partially predict and can help analyze the secondary and tertiary interactions of distant parts of the polymer chain. These successes are very important for determining the structural and functional characteristics of RNA in disease processes and in the cell life cycle. It has been shown that molecules with the same function have the potential to fold into similar structures though they might differ in their primary sequences. This fact also illustrates the importance of secondary and tertiary structure in relation to function. Examples of such constancy in secondary structure exist in transfer RNAs (tRNAs), 5S RNAs, 16S RNAs, viroid RNAs, and portions of retroviruses such as HIV. The secondary and tertiary structure of tRNA Phe (Kim et al., 1974), of a hammerhead ribozyme (Pley et al., 1994), and of Tetrahymena (Cate et al., 1996a, 1996b) have been shown by their crystal structure. Currently little is known of tertiary interactions, but studies on tRNA indicate these are weaker than secondary structure interactions (Riesner and Romer, 1973; Crothers and Cole, 1978; Jaeger et al., 1989b). It is very difficult to crystallize and/or get nuclear magnetic resonance spectrum data for large RNA molecules. Therefore, a logical place to start in determining the 3D structure of RNA is computer prediction of the secondary structure. The sequence (primary structure) of an RNA molecule is relatively easy to produce.
Because experimental methods for determining RNA secondary and tertiary structure (when the primary sequence folds back on itself and forms base pairs) have not kept pace with the rapid discovery of RNA molecules and their function, use of and methods for computer prediction of secondary and tertiary structures have increasingly been developed.

Theoretical and computational modeling of naturally and artificially modified RNA nucleotides

10.32469/10355/85785 ◽

2021 ◽

Author(s):

Travis Caleb Hurst

Keyword(s):

3D Structure ◽

Experimental Information ◽

Potential Of Mean Force ◽

Free Energy Calculations ◽

Coarse Grained ◽

Structure Stability ◽

Rna Molecules ◽

Parameterization Method ◽

The Impact ◽

Structural Ensemble

Ribonucleic acid (RNA) is a polymeric nucleic acid that is crucial for cellular function, regulating gene expression and encoding/decoding protein/DNA molecules. Recent discoveries of diverse functionality in non-coding RNAs have led to unprecedented demand for RNA 3D structure determination. With current technology, general, accurate prediction of 3D structures for large RNAs from the sequence remains computationally intractable. One of the principal challenges arises from the conformational flexibility of RNA, especially in loop/junction regions, which results in a rugged energy landscape. Several strategies exist to overcome this challenge, including incorporation of efficient experimental information and coarse-grained (CG) modeling to improve computational sampling of the structural ensemble. A second challenge is the inclusion of naturally modified derivatives of canonical RNA nucleotides in structure analysis. Most RNA prediction strategies rely upon the canonical nucleotides (adenine (A), uracil (U), guanine (G), and cytosine (C)), ignoring the effects of modified nucleotides on the structure and system dynamics. In general, RNA molecules contain rigid and flexible structural elements, which can be probed using efficient selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) experiments. SHAPE experiments selectively modify flexible RNA nucleotides and can be processed to produce a characteristic reactivity profile for an RNA molecule that contains structural information. Incorporation of efficient experimental information, such as SHAPE, in predicting RNA 3D structure is highly desirable for overcoming the current knowledge gap between RNA sequence and 3D structure.
In the first project, we introduce a physics-based model, the 3D structure-SHAPE relationship (3DSSR) model, to predict the SHAPE reactivity from the structure and show how this model may be used to sieve SHAPE-compatible structures from a pool of low-energy decoys and refine our predictions. In the second project, we compare 3DSSR performance to that of a convolutional neural network (CNN) trained on the SHAPE data and RNA structures, showing that 3DSSR outperforms the CNN given the limited data available. In the third project, we further improve the 3DSSR model, gaining deeper insights into the SHAPE reaction and biases. In the fourth project, we explore the theory underpinning the iterative simulated CG RNA folding model (IsRNA). In establishing the underlying mechanics driving the success of the model, we were able to clarify and improve the parameterization method while expanding the model interpretation, which should broaden application of the method to other biopolymers, such as protein. We found that the parameterization method follows statistical mechanics principles but also has a Bayesian interpretation. Further, we found that the parameterization process can benefit from application of the principle of maximum entropy, which improves simulation and parameterization efficiency. In the fifth project, we investigate the impact of nucleotide modification on the structure and configurational ensemble of RNA molecules using free energy calculations. By applying modifications to a common RNA hairpin, we estimate the impact on the stability of the structural ensemble, identifying specific interactions that drive changes to the potential of mean force (PMF) and showing the context and modification-dependence of the variable alterations to the structure stability.

RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses

10.1101/2020.03.27.012906 ◽

2020 ◽

Cited By ~ 19

Author(s):

Ramya Rangan ◽

Ivan N. Zheludev ◽

Rhiju Das

Keyword(s):

Secondary Structure ◽

Viral Genome ◽

Rna Secondary Structure ◽

Therapeutic Strategies ◽

Genome Sequences ◽

Sequence Alignments ◽

Secondary Structure Models ◽

Prior Literature ◽

Rna Genome ◽

Genomic Regions

As the COVID-19 outbreak spreads, there is a growing need for a compilation of conserved RNA genome regions in the SARS-CoV-2 virus along with their structural propensities to guide development of antivirals and diagnostics. Using sequence alignments spanning a range of betacoronaviruses, we rank genomic regions by RNA sequence conservation, identifying 79 regions of length at least 15 nucleotides as exactly conserved over SARS-related complete genome sequences available near the beginning of the COVID-19 outbreak. We then confirm the conservation of the majority of these genome regions across 739 SARS-CoV-2 sequences reported to date from the current COVID-19 outbreak, and we present a curated list of 30 ‘SARS-related-conserved’ regions. We find that known RNA structured elements curated as Rfam families and in prior literature are enriched in these conserved genome regions, and we predict additional conserved, stable secondary structures across the viral genome. We provide 106 ‘SARS-CoV-2-conserved-structured’ regions as potential targets for antivirals that bind to structured RNA. We further provide detailed secondary structure models for the 5’ UTR, frame-shifting element, and 3’ UTR. Last, we predict regions of the SARS-CoV-2 viral genome that have low propensity for RNA secondary structure and are conserved within SARS-CoV-2 strains. These 59 ‘SARS-CoV-2-conserved-unstructured’ genomic regions may be most easily targeted in primer-based diagnostic and oligonucleotide-based therapeutic strategies.
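
The idea of "exactly conserved" regions of a minimum length can be sketched in Python as below. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `conserved_regions`, the input format (a list of equal-length aligned strings with `-` for gaps), and the rule that a gap column is never conserved are all choices made here.

```python
def conserved_regions(alignment, min_length=15):
    """Return (start, end) half-open column ranges where every sequence agrees.

    `alignment` is a list of equal-length aligned strings. A column counts as
    conserved only if all sequences share the same non-gap character.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    regions = []
    run_start = None
    # Iterate one column past the end so that a run reaching the last
    # column is closed by the same code path as any other run.
    for col in range(width + 1):
        conserved = (
            col < width
            and alignment[0][col] != "-"
            and all(seq[col] == alignment[0][col] for seq in alignment)
        )
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_length:
                regions.append((run_start, col))
            run_start = None
    return regions


# Toy alignment (hypothetical sequences), with a short threshold for display.
toy = ["ACGUACGU", "ACGUUCGU", "ACGUACGA"]
print(conserved_regions(toy, min_length=3))
```

The default `min_length=15` mirrors the 15-nucleotide threshold quoted in the abstract; ranking the resulting regions or checking them against further sequences would be separate steps.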

RNA-align: quick and accurate alignment of RNA 3D structures based on size-independent TM-scoreRNA

Bioinformatics ◽

10.1093/bioinformatics/btz282 ◽

2019 ◽

Vol 35 (21) ◽

pp. 4459-4461 ◽

Cited By ~ 1

Author(s):

Sha Gong ◽

Chengxin Zhang ◽

Yang Zhang

Keyword(s):

Rna Structure ◽

Large Scale ◽

3D Structure ◽

Structure Alignment ◽

Coarse Grained ◽

Supplementary Information ◽

Art Programs ◽

3D Structures ◽

Rna Molecules ◽

Functional Relations

Motivation: Comparison of RNA 3D structures can be used to infer functional relationships of RNA molecules. Most of the current RNA structure alignment programs are built on size-dependent scales, which complicate the interpretation of structure and functional relations. Meanwhile, their low speed prevents the programs from being applied to large-scale RNA structural database search. Results: We developed an open-source algorithm, RNA-align, for RNA 3D structure alignment which has the structure similarity scaled by a size-independent and statistically interpretable scoring metric. Large-scale benchmark tests show that RNA-align significantly outperforms other state-of-the-art programs in both alignment accuracy and running speed. The major advantage of RNA-align lies in the quick convergence of the heuristic alignment iterations and the coarse-grained secondary structure assignment, both of which are crucial to the speed and accuracy of RNA structure alignments. Availability and implementation: https://zhanglab.ccmb.med.umich.edu/RNA-align/. Supplementary information: Supplementary data are available at Bioinformatics online.
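
To give a feel for what a size-independent score looks like, the following Python sketch evaluates the general TM-score functional form: the mean over the reference length of 1 / (1 + (d / d0)^2), where d is the distance between an aligned residue pair. It is not RNA-align's code. The function name `tm_like_score` is hypothetical, and the RNA-specific, length-dependent definition of the scale d0 is not given in this abstract, so d0 is passed in as a plain parameter rather than guessed.

```python
def tm_like_score(distances, reference_length, d0):
    """Evaluate a TM-score-style similarity for one fixed superposition.

    distances        -- distances (in angstroms) between aligned residue pairs
    reference_length -- number of residues in the structure used to normalize
    d0               -- distance scale; its RNA-specific length dependence is
                        an input here, not reproduced from the paper
    """
    if reference_length <= 0 or d0 <= 0:
        raise ValueError("reference_length and d0 must be positive")
    if len(distances) > reference_length:
        raise ValueError("more aligned pairs than residues in the reference")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / reference_length


# Perfect superposition of every residue gives 1.0; unaligned residues
# contribute nothing, so aligning half of the reference perfectly gives 0.5.
print(tm_like_score([0.0, 0.0, 0.0, 0.0], reference_length=4, d0=3.0))
print(tm_like_score([0.0, 0.0], reference_length=4, d0=3.0))
```

A full alignment program would also search over superpositions and residue correspondences to maximize this value; the sketch only scores a correspondence that is already given.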

RAG-Web: RNA structure prediction/design using RNA-As-Graphs

Bioinformatics ◽

10.1093/bioinformatics/btz611 ◽

2019 ◽

Cited By ~ 1

Author(s):

Grace Meng ◽

Marva Tariq ◽

Swati Jain ◽

Shereef Elmetwaly ◽

Tamar Schlick

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Structure Prediction ◽

Rna Secondary Structure ◽

Three Dimensional ◽

Coarse Grained ◽

Supplementary Information ◽

Tree Graph ◽

Rna Structure Prediction ◽

Tree Graphs

Summary: We launch a webserver for RNA structure prediction and design corresponding to tools developed using our RNA-As-Graphs (RAG) approach. RAG uses coarse-grained tree graphs to represent RNA secondary structure, allowing the application of graph theory to analyze and advance RNA structure discovery. Our webserver consists of three modules: (a) RAG Sampler: samples tree graph topologies from an RNA secondary structure to predict corresponding tertiary topologies, (b) RAG Builder: builds three-dimensional atomic models from candidate graphs generated by RAG Sampler, and (c) RAG Designer: designs sequences that fold onto novel RNA motifs (described by tree graph topologies). Results analyses are performed for further assessment/selection. The Results page provides links to download results and indicates possible errors encountered. RAG-Web offers a user-friendly interface to utilize our RAG software suite to predict and design RNA structures and sequences. Availability and implementation: The webserver is freely available online at: http://www.biomath.nyu.edu/ragtop/. Supplementary information: Supplementary data are available at Bioinformatics online.
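
As a rough illustration of representing a secondary structure as a coarse-grained tree graph, the Python sketch below turns a pseudoknot-free dot-bracket string into a tree whose vertices are loops and whose edges are helices. This convention, the function names `pair_table` and `loop_tree`, and the choice of vertex 0 for the exterior loop are assumptions made here; the actual RAG rules for defining vertices and edges are not stated in this abstract and may differ.

```python
def pair_table(dot_bracket):
    """Map each position to its pairing partner (None if unpaired)."""
    table = [None] * len(dot_bracket)
    stack = []
    for pos, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            opener = stack.pop()
            table[opener], table[pos] = pos, opener
    if stack:
        raise ValueError("unmatched '('")
    return table


def loop_tree(dot_bracket):
    """Return tree edges (parent_loop, child_loop); vertex 0 is the exterior loop.

    Each helix (a run of stacked base pairs) becomes one edge, and the loop
    closed by the innermost pair of that helix becomes the child vertex.
    """
    table = pair_table(dot_bracket)
    edges = []
    next_id = 1

    def visit(loop_id, start, end):
        nonlocal next_id
        k = start
        while k < end:
            partner = table[k]
            if partner is None:
                k += 1
                continue
            # Walk down the helix to its innermost stacked base pair.
            i, j = k, partner
            while i + 1 < j - 1 and table[i + 1] == j - 1:
                i, j = i + 1, j - 1
            child = next_id
            next_id += 1
            edges.append((loop_id, child))
            visit(child, i + 1, j)
            k = partner + 1

    visit(0, 0, len(dot_bracket))
    return edges


# One helix opening into a three-way junction with two hairpin branches.
print(loop_tree("((..((..))..((..))..))"))
```

The recursion depth grows with the nesting depth of the structure, so a very deeply nested input would call for an explicit stack instead.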

Towards 3D structure prediction of large RNA molecules: an integer programming framework to insert local 3D motifs in RNA secondary structure

Bioinformatics ◽

10.1093/bioinformatics/bts226 ◽

2012 ◽

Vol 28 (12) ◽

pp. i207-i214 ◽

Cited By ~ 32

Author(s):

V. Reinharz ◽

F. Major ◽

J. Waldispuhl

Keyword(s):

Integer Programming ◽

Secondary Structure ◽

Structure Prediction ◽

Rna Secondary Structure ◽

3D Structure ◽

Rna Molecules ◽

Programming Framework ◽

3D Structure Prediction

Relative Information Gain: Shannon entropy-based measure of the relative structural conservation in RNA alignments

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab007 ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Marco Pietrosanto ◽

Marta Adinolfi ◽

Andrea Guarracino ◽

Fabrizio Ferrè ◽

Gabriele Ausiello ◽

...

Keyword(s):

Secondary Structure ◽

Information Gain ◽

Structural Information ◽

Coarse Grained ◽

Fine Grained ◽

Additional Information ◽

Levels Of Detail ◽

Relative Information ◽

Substitution Matrices ◽

Secondary Structure Models

Structural characterization of RNAs is a dynamic field, offering many modelling possibilities. RNA secondary structure models are usually characterized by an encoding that depicts structural information of the molecule through string representations or graphs. In this work, we provide a generalization of the BEAR encoding (a context-aware structural encoding we previously developed) by expanding the set of alignments used for the construction of substitution matrices and then applying it to secondary structure encodings ranging from fine-grained to more coarse-grained representations. We also introduce a re-interpretation of the Shannon Information applied on RNA alignments, proposing a new scoring metric, the Relative Information Gain (RIG). The RIG score is available for any position in an alignment, showing how different levels of detail encoded in the RNA representation can contribute differently to convey structural information. The approaches presented in this study can be used alongside state-of-the-art tools to synergistically gain insights into the structural elements that RNAs and RNA families are composed of. This additional information could potentially contribute to their improvement or increase the degree of confidence in the secondary structure of families and any set of aligned RNAs.
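
The abstract does not give the formula for the RIG score, so the Python sketch below shows only the classical per-column Shannon quantities that such a score builds on: the entropy of the symbols in one alignment column and the information content relative to a uniform alphabet. The function names and the toy structural alphabet are hypothetical, and this is not the authors' RIG definition.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols observed in one alignment column."""
    total = len(column)
    if total == 0:
        raise ValueError("column must not be empty")
    counts = Counter(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_content(alignment, alphabet_size):
    """Per-column information: log2(alphabet_size) minus the column entropy."""
    max_entropy = math.log2(alphabet_size)
    return [max_entropy - column_entropy(col) for col in zip(*alignment)]


# Toy alignment over a hypothetical 4-letter structural alphabet:
# the first column is fully conserved, the second is maximally variable.
toy = ["sh", "si", "sb", "sm"]
print(information_content(toy, alphabet_size=4))
```

Running the same calculation on a fine-grained and on a coarse-grained encoding of the same alignment would give different per-position values, which is the kind of comparison across levels of detail the abstract describes.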
