Characteristics and Prediction of RNA Structure

RNA secondary structures with pseudoknots are often predicted by minimizing free energy, a problem that is NP-hard. Most RNAs fold during transcription from DNA into RNA through a hierarchical pathway in which secondary structures form before tertiary structures. For kinetic reasons, real RNA secondary structures are often locally rather than globally optimal. The performance of RNA structure prediction may therefore be improved by considering dynamic and hierarchical folding mechanisms. This study reports a novel finding that RNA folding accords with the golden mean characteristic, based on a statistical analysis of the real RNA secondary structures of all 480 sequences from RNA STRAND that are validated by NMR or X-ray. The length ratios of domains in these sequences are approximately 0.382L, 0.5L, 0.618L, and L, where L is the sequence length. These points are the important golden sections of the sequence. Using this characteristic, an algorithm is designed to predict RNA hierarchical structures and simulate RNA folding by dynamically folding RNA structures according to these golden section points. The sensitivity and number of predicted pseudoknots of our algorithm are better than those of the Mfold, HotKnots, McQfold, ProbKnot, and Lhw-Zhu algorithms. The experimental results reflect the folding rules of RNA from a new angle that is close to natural folding.
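To make the golden section points concrete, the following Python sketch computes the positions 0.382L, 0.5L, 0.618L, and L for a given sequence and yields the growing prefixes that a stepwise folding simulation might visit. The function names and the rounding to the nearest nucleotide are assumptions made for illustration; the abstract does not say how fractional positions are handled, and this is not the authors' folding algorithm.

```python
def golden_section_points(length):
    """Return candidate positions at 0.382L, 0.5L, 0.618L, and L."""
    if length <= 0:
        raise ValueError("sequence length must be positive")
    ratios = (0.382, 0.5, 0.618, 1.0)
    # Rounding to the nearest nucleotide is an assumption of this sketch.
    return [round(ratio * length) for ratio in ratios]


def prefixes_at_golden_points(sequence):
    """Yield growing prefixes of the sequence, one per golden section point."""
    for point in golden_section_points(len(sequence)):
        yield sequence[:point]


if __name__ == "__main__":
    print(golden_section_points(100))
```

A folding routine of the reader's choice could then be applied to each prefix in turn, which mimics the idea of folding the chain as it grows.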

RAG: RNA-As-Graphs database—concepts, analysis, and features

Nutrition and Health ◽

10.1177/026010608700500206 ◽

1987 ◽

Vol 5 (1-2) ◽

pp. 1285-1291 ◽

Cited By ~ 11

Author(s):

Hin Hark Gan ◽

Daniela Fera ◽

Julie Zorn ◽

Nahum Shiffeldrim ◽

Michael Tang ◽

...

Keyword(s):

Structural Diversity ◽

Secondary Structures ◽

Sequence Length ◽

Structural Elements ◽

Rna Structures ◽

Rna Motifs ◽

Rna Secondary Structures ◽

Vertex Number ◽

Dual Graphs ◽

Tree Graphs

Motivation: Understanding RNA's structural diversity is vital for identifying novel RNA structures and pursuing RNA genomics initiatives. By classifying RNA secondary motifs based on correlations between conserved RNA secondary structures and functional properties, we offer an avenue for predicting novel motifs. Although several RNA databases exist, no comprehensive schemes are available for cataloguing the range and diversity of RNA's structural repertoire. Results: Our RNA-As-Graphs (RAG) database describes and ranks all mathematically possible (including existing and candidate) RNA secondary motifs on the basis of graphical enumeration techniques. We represent RNA secondary structures as two-dimensional graphs (networks), specifying the connectivity between RNA secondary structural elements, such as loops, bulges, stems and junctions. We archive RNA tree motifs as ‘tree graphs’ and other RNAs, including pseudoknots, as general ‘dual graphs’. All RNA motifs are catalogued by graph vertex number (a measure of sequence length) and ranked by topological complexity. The RAG inventory immediately suggests candidates for novel RNA motifs, either naturally occurring or synthetic, and thereby might stimulate the prediction and design of novel RNA motifs. Availability: The database is accessible on the web at http://monod.biomath.nyu.edu/rna

New Algorithms in RNA Structure Prediction Based on BHG

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001420500317 ◽

2020 ◽

Vol 34 (13) ◽

pp. 2050031

Author(s):

Zhendong Liu ◽

Gang Li ◽

Jun S. Liu

Keyword(s):

Rna Structure ◽

Structure Prediction ◽

Rna Folding ◽

Rna Structures ◽

Structural Prediction ◽

Extended Structure ◽

Rna Structure Prediction ◽

Computing Algorithm ◽

Hard Problems ◽

Basin Hopping

Several problems in the prediction of RNA structures are NP-hard, and predicting the folding structure of an RNA nucleotide sequence remains an unsolved challenge. We investigate algorithms for RNA folding structure prediction, including pseudoknots, based on the extended structure and the basin hopping graph. This study presents a prediction algorithm based on the extended structure and proposes an improved algorithm based on the barrier tree and the basin hopping graph, both of which are attractive approaches to RNA folding structure prediction. Experiments on the Rfam 14.1 and PseudoBase databases show that our two algorithms are more efficient and accurate than other existing algorithms.

In Vivo and In Vitro Genome-Wide Profiling of RNA Secondary Structures Reveals Key Regulatory Features in Plasmodium falciparum

Frontiers in Cellular and Infection Microbiology ◽

10.3389/fcimb.2021.673966 ◽

2021 ◽

Vol 11 ◽

Author(s):

Yanwei Qi ◽

Yuhong Zhang ◽

Guixing Zheng ◽

Bingxia Chen ◽

Mengxin Zhang ◽

...

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Rna Secondary Structure ◽

Secondary Structures ◽

Rna Structures ◽

Trophozoite Stage ◽

Rna Secondary Structures ◽

Biological Programme

It is widely accepted that the structure of RNA plays important roles in a number of biological processes, such as polyadenylation, splicing, and catalytic functions. Dynamic changes in RNA structure can regulate the gene expression programme and can serve as a highly specific and subtle mechanism for governing cellular processes. However, the nature of most RNA secondary structures in Plasmodium falciparum has not been determined. To investigate the genome-wide RNA secondary structural features at single-nucleotide resolution in P. falciparum, we applied a novel high-throughput method that uses chemical modification of RNA to characterize these structures. Structural data from parasites are in close agreement with the known 18S ribosomal RNA secondary structures of P. falciparum and can help to predict the in vivo RNA secondary structure of a total of 3,396 transcripts in the ring-stage and trophozoite-stage developmental cycles. By parallel analysis of RNA structures in vivo and in vitro during the Plasmodium parasite ring-stage and trophozoite-stage intraerythrocytic developmental cycles, we identified some key regulatory features. Recent studies have established that RNA structure is a ubiquitous and fundamental regulator of gene expression. Our study indicates that there is a critical connection between RNA secondary structure and mRNA abundance during the complex biological programme of P. falciparum. This work presents a useful framework and important results, which may facilitate further research investigating the interactions between RNA secondary structure and the complex biological programme in P. falciparum. The RNA secondary structure characterized in this study has potential applications and important implications regarding the identification of RNA structural elements, which are important for parasite infection and elucidating host-parasite interactions and parasites in the environment.

Comparative analysis of coronavirus genomic RNA structure reveals conservation in SARS-like coronaviruses

10.1101/2020.06.15.153197 ◽

2020 ◽

Cited By ~ 10

Author(s):

Wes Sanders ◽

Ethan J. Fritch ◽

Emily A. Madden ◽

Rachel L. Graham ◽

Heather A. Vincent ◽

...

Keyword(s):

Secondary Structure ◽

Rna Structure ◽

Rna Secondary Structure ◽

Molecular Mechanisms ◽

Selective Pressure ◽

Secondary Structures ◽

Rna Structures ◽

Rna Secondary Structures ◽

The Past ◽

Novel Strategy

Coronaviruses, including SARS-CoV-2, the etiological agent of COVID-19 disease, have caused multiple epidemic and pandemic outbreaks in the past 20 years. With no vaccines, and only recently developed antiviral therapeutics, we are ill equipped to handle coronavirus outbreaks. A better understanding of the molecular mechanisms that regulate coronavirus replication and pathogenesis is needed to guide the development of new antiviral therapeutics and vaccines. RNA secondary structures play critical roles in multiple aspects of coronavirus replication, but the extent and conservation of RNA secondary structure across coronavirus genomes is unknown. Here, we define highly structured RNA regions throughout the MERS-CoV, SARS-CoV, and SARS-CoV-2 genomes. We find that highly stable RNA structures are pervasive throughout coronavirus genomes and are conserved between the SARS-like CoV. Our data suggest that selective pressure helps preserve RNA secondary structure in coronavirus genomes, indicating that these structures may play important roles in virus replication and pathogenesis. Thus, disruption of conserved RNA secondary structures could be a novel strategy for the generation of attenuated SARS-CoV-2 vaccines for use against the current COVID-19 pandemic.

Caveats to deep learning approaches to RNA secondary structure prediction

10.1101/2021.12.14.472648 ◽

2021 ◽

Author(s):

Christoph Flamm ◽

Julia Wielach ◽

Michael T. Wolfinger ◽

Stefan Badelt ◽

Ronny Lorenz ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Secondary Structures ◽

Training Data ◽

Sequence Length ◽

Learning Approaches ◽

Rna Secondary Structures

Machine learning (ML), and in particular deep learning, techniques have gained popularity for predicting structures from biopolymer sequences. An interesting case is the prediction of RNA secondary structures, where well-established biophysics-based methods exist. These methods even yield exact solutions under certain simplifying assumptions. Nevertheless, the accuracy of these classical methods is limited and has seen little improvement over the last decade. This makes it an attractive target for machine learning, and consequently several deep learning models have been proposed in recent years. In this contribution we discuss limitations of current approaches, in particular due to biases in the training data. Furthermore, we propose to study capabilities and limitations of ML models by first applying them to synthetic data that can not only be generated in arbitrary amounts, but are also guaranteed to be free of biases. We apply this idea by testing several ML models of varying complexity. Finally, we show that the best models are capable of capturing many, but not all, properties of RNA secondary structures. Most severely, the number of predicted base pairs scales quadratically with sequence length, even though a secondary structure can only accommodate a linear number of pairs.
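As an illustration of an exact solution under simplifying assumptions, the sketch below implements Nussinov-style base-pair maximization, a classical dynamic program that is simpler than the thermodynamic models used in practice and is not taken from the paper itself. The function name, the allowed pair set (Watson-Crick plus G-U wobble), and the minimum hairpin loop size of 3 are assumptions chosen for this example. It also makes the linear bound visible: a nested structure on n nucleotides can never hold more than n // 2 pairs.

```python
def max_base_pairs(seq, min_loop=3):
    """Nussinov-style DP: maximum number of nested base pairs in seq."""
    # Allowed pairs are an assumption: Watson-Crick plus G-U wobble.
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j].
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j stays unpaired.
            score = best[i][j - 1]
            # Case 2: j pairs with k, leaving at least min_loop bases between.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = best[i][k - 1] if k > i else 0
                    inner = best[k + 1][j - 1]
                    score = max(score, left + 1 + inner)
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    seq = "GGGAAAUCC"
    print(max_base_pairs(seq), "pairs; upper bound", len(seq) // 2)
```

The cubic running time is acceptable for short sequences; the point here is only that the optimum is found exactly and that the pair count stays within the linear bound.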

Using SHAPE-MaP To Model RNA Secondary Structure and Identify 3′UTR Variation in Chikungunya Virus

Journal of Virology ◽

10.1128/jvi.00701-20 ◽

2020 ◽

Vol 94 (24) ◽

Author(s):

Emily A. Madden ◽

Kenneth S. Plante ◽

Clayton R. Morrison ◽

Katrina M. Kutchko ◽

Wes Sanders ◽

...

Keyword(s):

Secondary Structure ◽

Virus Replication ◽

Rna Structure ◽

Rna Secondary Structure ◽

Chikungunya Virus ◽

Secondary Structures ◽

Rna Structures ◽

Functional Importance ◽

Rna Secondary Structures ◽

Mosquito Cells

ABSTRACT Chikungunya virus (CHIKV) is a mosquito-borne alphavirus associated with debilitating arthralgia in humans. RNA secondary structure in the viral genome plays an important role in the lifecycle of alphaviruses; however, the specific role of RNA structure in regulating CHIKV replication is poorly understood. Our previous studies found little conservation in RNA secondary structure between alphaviruses, and this structural divergence creates unique functional structures in specific alphavirus genomes. Therefore, to understand the impact of RNA structure on CHIKV biology, we used SHAPE-MaP to inform the modeling of RNA secondary structure throughout the genome of a CHIKV isolate from the 2013 Caribbean outbreak. We then analyzed regions of the genome with high levels of structural specificity to identify potentially functional RNA secondary structures and identified 23 regions within the CHIKV genome with higher than average structural stability, including four previously identified, functionally important CHIKV RNA structures. We also analyzed the RNA flexibility and secondary structures of multiple 3′UTR variants of CHIKV that are known to affect virus replication in mosquito cells. This analysis found several novel RNA structures within these 3′UTR variants. A duplication in the 3′UTR that enhances viral replication in mosquito cells led to an overall increase in the amount of unstructured RNA in the 3′UTR. This analysis demonstrates that the CHIKV genome contains a number of unique, specific RNA secondary structures and provides a strategy for testing these secondary structures for functional importance in CHIKV replication and pathogenesis. IMPORTANCE Chikungunya virus (CHIKV) is a mosquito-borne RNA virus that causes febrile illness and debilitating arthralgia in humans. CHIKV causes explosive outbreaks but there are no approved therapies to treat or prevent CHIKV infection. 
The CHIKV genome contains functional RNA secondary structures that are essential for proper virus replication. Since RNA secondary structures have only been defined for a small portion of the CHIKV genome, we used a chemical probing method to define the RNA secondary structures of CHIKV genomic RNA. We identified 23 highly specific structured regions of the genome, and confirmed the functional importance of one structure using mutagenesis. Furthermore, we defined the RNA secondary structure of three CHIKV 3′UTR variants that differ in their ability to replicate in mosquito cells. Our study highlights the complexity of the CHIKV genome and describes new systems for designing compensatory mutations to test the functional relevance of viral RNA secondary structures.

An algorithm for template-based prediction of secondary structures of individual RNA sequences

10.1101/171108 ◽

2017 ◽

Author(s):

Josef Pánek ◽

Martin Černý

Keyword(s):

Rna Structure ◽

Structure Prediction ◽

De Novo ◽

Secondary Structures ◽

Rna Structures ◽

Rna Sequences ◽

Biologically Relevant ◽

Rna Molecules ◽

Rna Structure Prediction ◽

Limited Applicability

While understanding the structure of RNA molecules is vital for deciphering their functions, determining RNA structures experimentally is exceptionally hard. At the same time, extant approaches to computational RNA structure prediction have limited applicability and reliability. In this paper we provide a method to solve a simpler yet still biologically relevant problem: prediction of RNA secondary structure using the structure of different molecules as a template. Our method identifies conserved and unconserved subsequences within an RNA molecule. For conserved subsequences, the template structure is directly transferred into the generated structure and combined with de novo predicted structure for the unconserved subsequences. The method also determines when the generated structure is unreliable. The method is validated using experimentally identified structures. The accuracy of the method exceeds that of classical prediction algorithms and constrained prediction methods. This is demonstrated by comparison using a large number of heterogeneous RNAs. The presented method is fast and robust, and useful for various applications requiring knowledge of secondary structures of individual RNA sequences.

RNA secondary structure prediction using deep learning with thermodynamic integration

10.1101/2020.08.10.244442 ◽

2020 ◽

Author(s):

Kengo Sato ◽

Manato Akiyama ◽

Yasubumi Sakakibara

Keyword(s):

Deep Learning ◽

Secondary Structure ◽

Structure Prediction ◽

Rna Secondary Structure ◽

Secondary Structure Prediction ◽

Secondary Structures ◽

Thermodynamic Integration ◽

Rna Secondary Structure Prediction ◽

Rna Secondary Structures ◽

Non Coding Rnas

RNA secondary structure prediction is one of the key technologies for revealing the essential roles of functional non-coding RNAs. Although richly parameterized machine learning models have achieved extremely high prediction accuracy, the risk of overfitting for such models has been reported. In this work, we propose a new algorithm for predicting RNA secondary structures that uses deep learning with thermodynamic integration, thereby enabling robust predictions. Similar to our previous work, the folding scores, which are computed by a deep neural network, are integrated with traditional thermodynamic parameters to enable robust predictions. We also propose thermodynamic regularization for training our model without overfitting it to the training data. Our algorithm (MXfold2) achieved the most robust and accurate predictions in computational experiments designed for newly discovered non-coding RNAs, with significant 2–10% improvements in F-value over our previous algorithm (MXfold) and standard algorithms for predicting RNA secondary structures.
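For readers unfamiliar with the F-value, the sketch below shows one common way to compute it for secondary structures: as the harmonic mean of precision and sensitivity over base pairs. It assumes both structures are given in dot-bracket notation without pseudoknots and that a predicted pair counts only on an exact match; the function names are hypothetical, and the evaluation protocol actually used for MXfold2 is not stated in the abstract.

```python
def parse_dot_bracket(structure):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for idx, char in enumerate(structure):
        if char == "(":
            stack.append(idx)
        elif char == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.add((stack.pop(), idx))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def f_value(reference, predicted):
    """Harmonic mean of precision and sensitivity over exact base pairs."""
    ref_pairs = parse_dot_bracket(reference)
    pred_pairs = parse_dot_bracket(predicted)
    true_pos = len(ref_pairs & pred_pairs)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred_pairs)
    sensitivity = true_pos / len(ref_pairs)
    return 2 * precision * sensitivity / (precision + sensitivity)


if __name__ == "__main__":
    print(f_value("((...))", "(.....)"))
```

Some benchmarks relax the exact-match rule and accept pairs shifted by one position; that variant is not implemented here.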

The global and local distribution of RNA structure throughout the SARS-CoV-2 genome

Journal of Virology ◽

10.1128/jvi.02190-20 ◽

2020 ◽

Cited By ~ 2

Author(s):

Rafael de Cesaris Araujo Tavares ◽

Gandhar Mahadeshwar ◽

Han Wan ◽

Nicholas C. Huston ◽

Anna Marie Pyle

Keyword(s):

In Silico ◽

Rna Structure ◽

Drug Targets ◽

Rna Folding ◽

Rna Viruses ◽

Viral Agent ◽

Rna Structures ◽

Viral Rnas ◽

Viral Genomes ◽

Rna Genome

SARS-CoV-2 is the causative viral agent of COVID-19, the disease at the center of the current global pandemic. While knowledge of highly structured regions is integral for mechanistic insights into the viral infection cycle, very little is known about the location and folding stability of functional elements within the massive, ∼30kb SARS-CoV-2 RNA genome. In this study, we analyze the folding stability of this RNA genome relative to the structural landscape of other well-known viral RNAs. We present an in-silico pipeline to predict regions of high base pair content across long genomes and to pinpoint hotspots of well-defined RNA structures, a method that allows for direct comparisons of RNA structural complexity within the several domains in SARS-CoV-2 genome. We report that the SARS-CoV-2 genomic propensity for stable RNA folding is exceptional among RNA viruses, superseding even that of HCV, one of the most structured viral RNAs in nature. Furthermore, our analysis suggests varying levels of RNA structure across genomic functional regions, with accessory and structural ORFs containing the highest structural density in the viral genome. Finally, we take a step further to examine how individual RNA structures formed by these ORFs are affected by the differences in genomic and subgenomic contexts, which given the technical difficulty of experimentally separating cellular mixtures of sgRNA from gRNA, is a unique advantage of our in-silico pipeline. The resulting findings provide a useful roadmap for planning focused empirical studies of SARS-CoV-2 RNA biology, and a preliminary guide for exploring potential SARS-CoV-2 RNA drug targets. Importance The RNA genome of SARS-CoV-2 is among the largest and most complex viral genomes, and yet its RNA structural features remain relatively unexplored. 
Since RNA elements guide function in most RNA viruses, and they represent potential drug targets, it is essential to chart the architectural features of SARS-CoV-2 and pinpoint regions that merit focused study. Here we show that RNA folding stability of SARS-CoV-2 genome is exceptional among viral genomes and we develop a method to directly compare levels of predicted secondary structure across SARS-CoV-2 domains. Remarkably, we find that coding regions display the highest structural propensity in the genome, forming motifs that differ between the genomic and subgenomic contexts. Our approach provides an attractive strategy to rapidly screen for candidate structured regions based on base pairing potential and provides a readily interpretable roadmap to guide functional studies of RNA viruses and other pharmacologically relevant RNA transcripts.
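The idea of scanning a long genome for regions of high base pair content can be sketched as a sliding-window calculation over a predicted structure. The example below assumes a dot-bracket string is already available from some folding tool and reports the fraction of paired nucleotides per window; the function name, window size, and step are hypothetical, and this is only an illustration, not the pipeline described in the paper.

```python
def base_pair_content(structure, window, step=1):
    """Return (start, paired_fraction) for each window along a structure."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    # A nucleotide is counted as paired if it carries a bracket.
    paired = [1 if char in "()" else 0 for char in structure]
    results = []
    for start in range(0, len(paired) - window + 1, step):
        fraction = sum(paired[start:start + window]) / window
        results.append((start, fraction))
    return results


if __name__ == "__main__":
    for start, fraction in base_pair_content("((..))....", window=5, step=5):
        print(start, fraction)
```

Windows with a high paired fraction would be candidate structured regions; any threshold for calling a hotspot would have to be chosen separately.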
