Graphical Processing Unit-Supported RNA Secondary Structure Comparison

10.29007/bhsr ◽  
2020 ◽  
Author(s):  
Mutlu Mete ◽  
Abdullah Arslan

This study is part of our ongoing effort to develop improved RNA secondary structure analysis tools and databases. We present a new Graphical Processing Unit (GPU)-based RNA structural analysis framework that supports fast comparison of multiple RNA secondary structures against very large databases. A search-based secondary structure comparison algorithm deployed on the RNASSAC website helps bioinformaticians find common RNA substructures in the underlying database. The algorithm performs two levels of binary search on the database, so its running time depends on the database size. Experiments on the RNASSAC website show that the algorithm takes seconds on a database of 4,666 RNAs: comparing 25 RNAs from this database takes about 4.4 s. When many non-overlapping common substructures are requested, a heuristic approach takes as long as 85 s to compare 40 RNAs from the same database. Comparisons by this sequential algorithm take at least 50% more time when the RNAs are drawn from a database of several million RNAs. The most recently curated databases already contain millions of RNA secondary structures, so the run-time performance of comparison algorithms must improve. This study presents a GPU-based RNA substructure comparison algorithm whose running time for multiple RNA secondary structures remains feasible for large databases. Our new parallel algorithm is 12 times faster than the sequential CPU version of the comparison algorithm on the RNASSAC website. This reduction in response time is a step toward a real-time RNA comparison web service for the bioinformatics community.
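To put the reported figures side by side, the arithmetic can be sketched as below. The abstract gives the sequential timings, the "at least 50%" slowdown on multi-million-RNA databases, and the 12-fold speedup, but it does not report per-query GPU times. The GPU column is therefore a hypothetical estimate that assumes the speedup applies uniformly to every query; the function and variable names are illustrative only.

```python
# Back-of-the-envelope estimate only: per-query GPU times are NOT reported
# in the abstract. A uniform 12x speedup across queries is an assumption.
SPEEDUP = 12.0          # reported GPU vs. sequential CPU speedup
LARGE_DB_PENALTY = 1.5  # "at least 50% more time" on a multi-million-RNA database

# Reported sequential times (seconds) on the 4,666-RNA database.
reported_cpu_seconds = {
    "25 RNAs": 4.4,
    "40 RNAs, heuristic": 85.0,
}


def estimate(cpu_seconds, penalty=LARGE_DB_PENALTY, speedup=SPEEDUP):
    """Return (large-database CPU lower bound, hypothetical GPU time) in seconds."""
    large_db_cpu = cpu_seconds * penalty
    return large_db_cpu, large_db_cpu / speedup


for label, seconds in reported_cpu_seconds.items():
    cpu, gpu = estimate(seconds)
    print(f"{label}: CPU >= {cpu:.1f} s, GPU ~ {gpu:.2f} s (assumed uniform speedup)")
```

Under these assumptions the heuristic query is the one that benefits most in absolute terms, which is consistent with the abstract's motivation of moving toward an interactive web service.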

2020 ◽  
Author(s):  
Kengo Sato ◽  
Manato Akiyama ◽  
Yasubumi Sakakibara

RNA secondary structure prediction is one of the key technologies for revealing the essential roles of functional non-coding RNAs. Although richly parameterized machine learning-based models have achieved extremely high prediction accuracy, the risk of overfitting for such models has been reported. In this work, we propose a new algorithm for predicting RNA secondary structures that uses deep learning with thermodynamic integration, thereby enabling robust predictions. Similar to our previous work, the folding scores, which are computed by a deep neural network, are integrated with traditional thermodynamic parameters. We also propose thermodynamic regularization for training our model without overfitting it to the training data. Our algorithm (MXfold2) achieved the most robust and accurate predictions in computational experiments designed for newly discovered non-coding RNAs, with significant 2–10% improvements in F-value over our previous algorithm (MXfold) and standard algorithms for predicting RNA secondary structures.
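The F-value mentioned above can be illustrated with a minimal sketch, assuming the usual base-pair-level definition (the harmonic mean of precision and recall over predicted base pairs) and pseudoknot-free dot-bracket input. The abstract does not spell out the exact evaluation protocol, and the two toy structures are invented for illustration.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) base pairs in a pseudoknot-free dot-bracket string."""
    stack, pairs = [], set()
    for j, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {j}")
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def f_value(reference, predicted):
    """Harmonic mean of precision and recall over base pairs."""
    ref, pred = base_pairs(reference), base_pairs(predicted)
    true_positives = len(ref & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Toy structures, invented for illustration only.
reference = "((((...))))"
predicted = "(((.....)))"
print(f_value(reference, predicted))
```

Here the prediction recovers three of the four reference pairs and predicts no spurious ones, so precision is 1 and recall is 0.75.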


Author(s):  
Lina Yang ◽  
Yang Liu ◽  
Huiwu Luo ◽  
Xichun Li ◽  
Yuan Yan Tang

The role of pseudoknots in RNA secondary structure cannot be ignored, yet existing methods for analyzing RNA secondary structures with pseudoknots exhibit many shortcomings. This paper presents a novel RNA secondary structure visualization method for the joint analysis of RNA primary and secondary structures. The method is based on the page number representation of the RNA secondary structure. It uses five vectors to represent bases, which are connected sequentially to outline the characteristics of the RNA secondary structure. The method covers almost all the constituent elements of the RNA secondary structure and extracts features completely. Experiments are based on the available techniques for large-scale annotation of RNA secondary structures and combine the discrete wavelet transform with fractal dimension. Classification performance is compared with that of previous RNA secondary structure representation methods. Experimental results show that the proposed visualization method has good application prospects in RNA secondary structure classification.


2010 ◽  
Vol 08 (04) ◽  
pp. 727-742 ◽  
Author(s):  
KENGO SATO ◽  
MICHIAKI HAMADA ◽  
TOUTAI MITUYAMA ◽  
KIYOSHI ASAI ◽  
YASUBUMI SAKAKIBARA

Since many functional RNAs form stable secondary structures that are related to their functions, RNA secondary structure prediction is a crucial problem in bioinformatics. We propose a novel model for generating RNA secondary structures based on a non-parametric Bayesian approach, called hierarchical Dirichlet processes for stochastic context-free grammars (HDP-SCFGs). Here, non-parametric means that some meta-parameters, such as the number of non-terminal symbols and production rules, do not have to be fixed. Instead, their distributions are inferred so that they adapt (in the Bayesian sense) to the training sequences provided. The results of our RNA secondary structure predictions show that HDP-SCFGs are more accurate than the MFE-based and other generative models.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Kengo Sato ◽  
Manato Akiyama ◽  
Yasubumi Sakakibara

Accurate predictions of RNA secondary structures can help uncover the roles of functional non-coding RNAs. Although machine learning-based models have achieved high performance in terms of prediction accuracy, overfitting is a common risk for such highly parameterized models. Here we show that overfitting can be minimized when RNA folding scores learnt using a deep neural network are integrated together with Turner’s nearest-neighbor free energy parameters. Training the model with thermodynamic regularization ensures that folding scores and the calculated free energy are as close as possible. In computational experiments designed for newly discovered non-coding RNAs, our algorithm (MXfold2) achieves the most robust and accurate predictions of RNA secondary structures without sacrificing computational efficiency compared to several other algorithms. The results suggest that integrating thermodynamic information could help improve the robustness of deep learning-based predictions of RNA secondary structure.
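The idea of keeping folding scores close to the calculated free energy can be sketched as a penalty term added to a training loss. This is a minimal illustration and not the MXfold2 implementation. The additive integration of the two scores, the squared form of the penalty, the default weight, and all names below are assumptions made for the example. The thermodynamic score is also assumed to be expressed on the same scale and sign convention as the folding score.

```python
def integrated_score(network_score, thermo_score):
    """Combine a learned score with a thermodynamic one (additive form is an assumption)."""
    return network_score + thermo_score


def regularized_loss(prediction_loss, folding_score, thermo_score, weight=0.1):
    """Add a squared penalty that pulls the folding score toward the thermodynamic score.

    `weight` is a hypothetical hyperparameter; larger values keep the model
    closer to the thermodynamic parameters.
    """
    penalty = (folding_score - thermo_score) ** 2
    return prediction_loss + weight * penalty


# Invented numbers, for illustration only.
thermo = 3.0
score = integrated_score(network_score=0.5, thermo_score=thermo)
print(regularized_loss(prediction_loss=1.0, folding_score=score, thermo_score=thermo))
```

With this form, the penalty vanishes when the learned contribution is zero and grows as the network moves the folding score away from the thermodynamic baseline, which is one way to discourage overfitting to the training structures.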

