RNAStat: An Integrated Tool for Statistical Analysis of RNA 3D Structures

2022 ◽  
Vol 1 ◽  
Author(s):  
Zhi-Hao Guo ◽  
Li Yuan ◽  
Ya-Lan Tan ◽  
Ben-Gong Zhang ◽  
Ya-Zhou Shi

The 3D architectures of RNAs are essential for understanding their cellular functions. While an accurate scoring function based on the statistics of known RNA structures is a key component of successful RNA structure prediction or evaluation, few tools or web servers can be used directly to perform comprehensive statistical analysis of RNA 3D structures. In this work, we developed RNAStat, an integrated tool for computing statistics on RNA 3D structures. For given RNA structures, RNAStat automatically calculates structural properties such as size and shape, and shows their distributions. Based on the RNA structure annotation from DSSR, RNAStat provides statistical information on RNA secondary structure motifs, including canonical/non-canonical base pairs, stems, and various loops. In particular, the geometry of base-pairing/stacking can be calculated in RNAStat by constructing a local coordinate system for each base. In addition, RNAStat supplies the distribution of distances between any atoms to help users build distance-based RNA statistical potentials. To test the usability of the tool, we established a non-redundant RNA 3D structure dataset and, based on this dataset, performed a comprehensive statistical analysis of RNA structures, which could help guide RNA structure modeling. The Python code of RNAStat, the dataset used in this work, and the corresponding statistical data files are freely available on GitHub (https://github.com/RNA-folding-lab/RNAStat).
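To make the "size" and "distance distribution" statistics mentioned above concrete, here is a minimal sketch in plain Python. It is not taken from the RNAStat code base: the function names `radius_of_gyration` and `distance_histogram`, the unweighted centroid, the bin width, and the toy coordinates are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their unweighted centroid."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(sq / n)


def distance_histogram(coords, bin_width=1.0):
    """Count all pairwise distances per bin; returns {bin_index: count}.

    Bin k covers distances in [k * bin_width, (k + 1) * bin_width).
    """
    counts = Counter()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])  # Euclidean distance (Python 3.8+)
            counts[int(d // bin_width)] += 1
    return dict(counts)


# Toy input (hypothetical): four points on a square of side 2, not real atoms.
toy = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
print(radius_of_gyration(toy))
print(distance_histogram(toy, bin_width=0.5))
```

A histogram of this kind, accumulated over many structures and a chosen pair of atom types, is the usual raw input for a distance-based statistical potential; how RNAStat itself bins and normalizes the counts is not stated in the abstract.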

2018 ◽  
Author(s):  
Jinfang Zheng ◽  
Juan Xie ◽  
Xu Hong ◽  
Shiyong Liu

Abstract RNA-protein 3D complex structure prediction is still challenging. Recently, our team proposed a template-based approach, PRIME, to build RNA-protein complex 3D structure models with a higher success rate than computational docking software. However, the scoring function of the RNA alignment algorithm SARA used in PRIME is size-dependent, which limits its ability to detect templates in some cases. Herein, we developed a novel RNA 3D structural alignment approach, RMalign, which is based on a size-independent scoring function, RMscore. The parameter in RMscore is optimized on randomly selected RNA pairs, and phase transition points (from dissimilar to similar) are determined on another set of randomly selected RNA pairs. In tRNA benchmarking, the precision of RMscore is higher than that of SARAscore (0.8771 and 0.7766, respectively) with phase transition points. In balance-FSCOR benchmarking, RMalign performed as well as ESA-RNA with a non-normalized score measuring RNA structure similarity. In balance-x-FSCOR benchmarking, RMalign performs much better than the state-of-the-art RNA 3D structural alignment approach SARA owing to its size-independent scoring function. Taking advantage of RMalign, we update our RNA-protein modeling approach PRIME to version 2.0. PRIME2.0 improves the success rate by about 10% over PRIME.

Author summary RNA structures are important for RNA functions. With the increasing number of RNA structures in the PDB, RNA 3D structure alignment approaches have been developed. However, the scoring function used for measuring RNA structural similarity is still length-dependent. This shortcoming limits the ability to detect RNA structure templates when modeling RNA structures or RNA-protein 3D complex structures. Thus, we developed a length-independent scoring function, RMscore, to enhance the ability to detect RNA structure homologs. The benchmarking data show that RMscore can effectively distinguish similar from dissimilar RNA structures. RMscore should be a useful scoring function for modeling RNA structures in the biological community. Based on RMscore, we developed an RNA 3D structure alignment approach, RMalign. In both RNA structure and function classification benchmarking, RMalign performs as well as or better than the state-of-the-art approaches. With the length-independent scoring function RMscore, RMalign should be useful for modeling RNA structures. Based on the above results, we update PRIME to PRIME2.0, a more accurate RNA-protein 3D complex structure modeling tool that should be useful for the biological community.


2019 ◽  
Vol 39 (2) ◽  
Author(s):  
Almudena Ponce-Salvatierra ◽  
Astha ◽  
Katarzyna Merdas ◽  
Chandran Nithin ◽  
Pritha Ghosh ◽  
...  

Abstract RNA molecules are master regulators of cells. They are involved in a variety of molecular processes: they transmit genetic information, sense cellular signals and communicate responses, and even catalyze chemical reactions. As in the case of proteins, RNA function is dictated by its structure and by its ability to adopt different conformations, which in turn is encoded in the sequence. Experimental determination of high-resolution RNA structures is both laborious and difficult, and therefore the majority of known RNAs remain structurally uncharacterized. To address this problem, predictive computational methods were developed based on the accumulated knowledge of RNA structures determined so far, the physical basis of the RNA folding, and taking into account evolutionary considerations, such as conservation of functionally important motifs. However, all theoretical methods suffer from various limitations, and they are generally unable to accurately predict structures for RNA sequences longer than 100-nt residues unless aided by additional experimental data. In this article, we review experimental methods that can generate data usable by computational methods, as well as computational approaches for RNA structure prediction that can utilize data from experimental analyses. We outline methods and data types that can be potentially useful for RNA 3D structure modeling but are not commonly used by the existing software, suggesting directions for future development.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Marcin Magnus ◽  
Kalli Kappel ◽  
Rhiju Das ◽  
Janusz M. Bujnicki

Abstract Background: The understanding of the importance of RNA has dramatically changed over recent years. As in the case of proteins, the function of an RNA molecule is encoded in its tertiary structure, which in turn is determined by the molecule’s sequence. The prediction of tertiary structures of complex RNAs is still a challenging task. Results: Using the observation that RNA sequences from the same RNA family fold into a conserved structure, we test herein whether parallel modeling of RNA homologs can improve ab initio RNA structure prediction. EvoClustRNA is a multi-step modeling process, in which homologous sequences for the target sequence are selected using the Rfam database. Subsequently, independent folding simulations using Rosetta FARFAR and SimRNA are carried out. The model of the target sequence is selected based on the most common structural arrangement of the common helical fragments. As a test, on two blind RNA-Puzzles challenges, EvoClustRNA predictions ranked as the first of all submissions for the L-glutamine riboswitch and as the second for the ZMP riboswitch. Moreover, through a benchmark of known structures, we discovered several cases in which particular homologs were unusually amenable to structure recovery in folding simulations compared to the single original target sequence. Conclusion: This work, for the first time to our knowledge, demonstrates the importance of the selection of the target sequence from an alignment of an RNA family for the success of RNA 3D structure prediction. These observations prompt investigations into a new direction of research for checking 3D structure “foldability” or “predictability” of related RNA sequences to obtain accurate predictions. To support new research in this area, we provide all relevant scripts in a documented and ready-to-use form. By exploring new ideas and identifying limitations of the current RNA 3D structure prediction methods, this work is bringing us closer to the near-native computational RNA 3D models.
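The idea of selecting the model that represents "the most common structural arrangement" can be illustrated with a simple neighbor-count rule. This is a sketch under stated assumptions, not the EvoClustRNA implementation: it assumes a precomputed symmetric matrix of pairwise structural distances between models (for example, RMSD values over shared helical fragments), and the function name `most_populated_center`, the cutoff, and the toy matrix are hypothetical.

```python
def most_populated_center(dist, cutoff):
    """Return (index, neighbor_count) of the model with the most neighbors.

    dist   -- square, symmetric matrix of pairwise distances between models
    cutoff -- two models count as neighbors when their distance is <= cutoff
    Ties are resolved in favor of the lowest index.
    """
    best_idx, best_count = -1, -1
    for i, row in enumerate(dist):
        count = sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        if count > best_count:
            best_idx, best_count = i, count
    return best_idx, best_count


# Toy distance matrix (hypothetical values) for four models.
toy_dist = [
    [0.0, 2.0, 3.0, 9.0],
    [2.0, 0.0, 2.5, 8.0],
    [3.0, 2.5, 0.0, 7.0],
    [9.0, 8.0, 7.0, 0.0],
]
print(most_populated_center(toy_dist, cutoff=2.6))
```

In a multi-homolog setting, the matrix would pool models from all homologs, so the selected center reflects an arrangement that recurs across sequences rather than within a single simulation.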


2019 ◽  
Vol 20 (17) ◽  
pp. 4116 ◽  
Author(s):  
Jun Wang ◽  
Jian Wang ◽  
Yanzhao Huang ◽  
Yi Xiao

3D structures of RNAs are the basis for understanding their biological functions. However, experimentally solved RNA 3D structures are very limited in comparison with known RNA sequences up to now. Therefore, many computational methods have been proposed to solve this problem, including our 3dRNA. In recent years, 3dRNA has been greatly improved by adding several important features, including structure sampling, structure ranking and structure optimization under residue-residue restraints. Particularly, the optimization procedure with restraints enables 3dRNA to treat pseudoknots in a new way. These new features of 3dRNA can greatly promote its performance and have been integrated into the 3dRNA v2.0 web server. Here we introduce these new features in the 3dRNA v2.0 web server for the users.


Life ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1135
Author(s):  
Shunya Kashiwagi ◽  
Kengo Sato ◽  
Yasubumi Sakakibara

Protein–RNA interactions (PRIs) are essential for many biological processes, so understanding aspects of the sequences and structures involved in PRIs is important for unraveling such processes. Because of the expensive and time-consuming techniques required for experimental determination of complex protein–RNA structures, various computational methods have been developed to predict PRIs. However, most of these methods focus on predicting only RNA-binding regions in proteins or only protein-binding motifs in RNA. Methods for predicting entire residue–base contacts in PRIs have not yet achieved sufficient accuracy. Furthermore, some of these methods require the identification of 3D structures or homologous sequences, which are not available for all protein and RNA sequences. Here, we propose a method for predicting residue–base contacts between proteins and RNAs using only sequence information and structural information predicted from sequences. The method can be applied to any protein–RNA pair, even when rich information, such as its 3D structure, is not available. In this method, residue–base contact prediction is formalized as an integer programming problem. We predict a residue–base contact map that maximizes a scoring function based on sequence-based features such as k-mers of sequences and the predicted secondary structure. The scoring function is trained using a max-margin framework from known PRIs with 3D structures. To verify our method, we conducted several computational experiments. The results suggest that our method, which is based on only sequence information, is comparable with RNA-binding residue prediction methods based on known binding data.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Jaswinder Singh ◽  
Jack Hanson ◽  
Kuldip Paliwal ◽  
Yaoqi Zhou

Abstract The majority of our human genome transcribes into noncoding RNAs with unknown structures and functions. Obtaining functional clues for noncoding RNAs requires accurate base-pairing or secondary-structure prediction. However, the performance of such predictions by current folding-based algorithms has stagnated for more than a decade. Here, we propose the use of deep contextual learning for base-pair prediction, including those noncanonical and non-nested (pseudoknot) base pairs stabilized by tertiary interactions. Since only <250 nonredundant, high-resolution RNA structures are available for model training, we utilize transfer learning from a model initially trained with a recent high-quality bpRNA dataset of >10,000 nonredundant RNAs made available through comparative analysis. The resulting method achieves large, statistically significant improvement in predicting all base pairs, noncanonical and non-nested base pairs in particular. The proposed method (SPOT-RNA), with a freely available server and standalone software, should be useful for improving RNA structure modeling, sequence alignment, and functional annotations.


2017 ◽  
Vol 1 (3) ◽  
pp. 275-285 ◽  
Author(s):  
Bernhard C. Thiel ◽  
Christoph Flamm ◽  
Ivo L. Hofacker

We summarize different levels of RNA structure prediction, from classical 2D structure to extended secondary structure and motif-based research toward 3D structure prediction of RNA. We outline the importance of classical secondary structure during all those levels of structure prediction.


2019 ◽  
Vol 35 (21) ◽  
pp. 4459-4461 ◽  
Author(s):  
Sha Gong ◽  
Chengxin Zhang ◽  
Yang Zhang

Abstract Motivation: Comparison of RNA 3D structures can be used to infer functional relationships between RNA molecules. Most of the current RNA structure alignment programs are built on size-dependent scales, which complicate the interpretation of structural and functional relations. Meanwhile, their low speed prevents the programs from being applied to large-scale RNA structural database searches. Results: We developed an open-source algorithm, RNA-align, for RNA 3D structure alignment, which has the structure similarity scaled by a size-independent and statistically interpretable scoring metric. Large-scale benchmark tests show that RNA-align significantly outperforms other state-of-the-art programs in both alignment accuracy and running speed. The major advantage of RNA-align lies in the quick convergence of the heuristic alignment iterations and the coarse-grained secondary structure assignment, both of which are crucial to the speed and accuracy of RNA structure alignments. Availability and implementation: https://zhanglab.ccmb.med.umich.edu/RNA-align/. Supplementary information: Supplementary data are available at Bioinformatics online.
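The abstract does not give the scoring formula, so the following is only a sketch of what a bounded, length-normalized similarity score can look like, written in the familiar TM-score style. The function names, the distance scale `d0`, and the toy distances are assumptions; in scores of this family `d0` is typically chosen as a function of sequence length so that scores for unrelated pairs do not drift with size, and no such calibration is attempted here.

```python
import math


def size_normalized_score(distances, ref_length, d0):
    """TM-score-style similarity in (0, 1].

    distances  -- distances between aligned nucleotide pairs after superposition
    ref_length -- length of the reference structure used for normalization
    d0         -- distance scale (caller-supplied; its value is an assumption)
    """
    if ref_length <= 0 or d0 <= 0:
        raise ValueError("ref_length and d0 must be positive")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / ref_length


def rmsd(distances):
    """Root-mean-square deviation over the same aligned pairs, for comparison."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Toy case (hypothetical): three perfectly matched pairs and one distant outlier.
toy = [0.0, 0.0, 0.0, 30.0]
print(size_normalized_score(toy, ref_length=4, d0=3.0))
print(rmsd(toy))
```

Because each pair contributes at most 1 to the sum, a single badly placed nucleotide lowers the normalized score only modestly, whereas it dominates the RMSD; this bounded behavior is one reason such scores are easier to interpret across structures of different lengths.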

