UFold: Fast and Accurate RNA Secondary Structure Prediction with Deep Learning

2020
Author(s): Yingxin Cao, Laiyi Fu, Jie Wu, Qing Nie, Xiaohui Xie

For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models through free energy minimization. Here we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data without any thermodynamic assumptions. UFold improves substantially upon previous models, with approximately 31% improvement over traditional thermodynamic models and 24.5% improvement over other learning-based methods. It achieves an F1 score of 0.96 on base pair prediction. An online web server running UFold is publicly available at http://ufold.ics.uci.edu.
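The F1 score quoted above is a measure over predicted base pairs. The abstract does not say how pairs are matched, so the following is only a minimal sketch that assumes exact matching of (i, j) index pairs; the function name `base_pair_f1` and the toy pair lists are hypothetical and not taken from UFold.

```python
def base_pair_f1(predicted, reference):
    """F1 score between two collections of base pairs given as (i, j) tuples.

    Assumption: a predicted pair counts as correct only if the exact same
    pair of positions appears in the reference structure.
    """
    # Normalise so that (i, j) and (j, i) count as the same pair.
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    true_pos = len(pred & ref)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)  # fraction of predicted pairs that are correct
    recall = true_pos / len(ref)      # fraction of reference pairs that were found
    return 2 * precision * recall / (precision + recall)


# Toy example: two of three predicted pairs agree with the reference.
reference = [(0, 9), (1, 8), (2, 7)]
predicted = [(0, 9), (1, 8), (3, 6)]
print(round(base_pair_f1(predicted, reference), 3))
```

Some evaluations also accept pairs shifted by one position; that relaxation is deliberately left out here.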

2020
Author(s): Kengo Sato, Manato Akiyama, Yasubumi Sakakibara

RNA secondary structure prediction is one of the key technologies for revealing the essential roles of functional non-coding RNAs. Although machine learning-based, richly parameterized models have achieved extremely high prediction accuracy, the risk of overfitting for such models has been reported. In this work, we propose a new algorithm for predicting RNA secondary structures that uses deep learning with thermodynamic integration, thereby enabling robust predictions. Similar to our previous work, the folding scores, which are computed by a deep neural network, are integrated with traditional thermodynamic parameters. We also propose thermodynamic regularization for training our model without overfitting it to the training data. Our algorithm (MXfold2) achieved the most robust and accurate predictions in computational experiments designed for newly discovered non-coding RNAs, with significant 2–10% improvements in F-value over our previous algorithm (MXfold) and standard algorithms for predicting RNA secondary structures.


2021, Vol 2021, pp. 1-9
Author(s): Bowen Shen, Hao Zhang, Cong Li, Tianheng Zhao, Yuanning Liu

Traditional machine learning methods are widely used in RNA secondary structure prediction and have achieved good results. However, with the emergence of large-scale data, deep learning methods have advantages over traditional machine learning methods. As the number of network layers increases in deep learning, problems such as a growing parameter count and overfitting often arise. We used two deep learning models, GoogLeNet and TCN, to predict RNA secondary structures. From the perspective of network depth and width, we improve the neural network models, which effectively improves computational efficiency while extracting more feature information. We process existing real RNA data, use deep learning models to extract useful features from a large amount of RNA sequence and structure data, and then use the extracted features to predict each base's pairing probability. The characteristics of RNA secondary structure and dynamic programming are used to process the per-base predictions: the structure with the largest sum of base-pairing probabilities is taken as the optimal RNA secondary structure. We evaluated the GoogLeNet and TCN models on 5S rRNA, tRNA, and tmRNA data and compared them with other standard prediction algorithms. The sensitivity and specificity of the GoogLeNet model on the 5S rRNA and tRNA data sets are about 16% higher than the best results of the other algorithms. The sensitivity and specificity of the GoogLeNet model on the tmRNA data set are about 9% higher than the best results of the other algorithms. Because the performance of deep learning algorithms depends on the size of the data set, the prediction accuracy of deep learning methods for RNA secondary structure should continue to improve as the scale of RNA data expands.
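The dynamic programming step described above (choosing the structure with the largest sum of pairing probabilities) can be sketched with a Nussinov-style recursion. This is a hedged illustration, not the authors' implementation: the abstract does not give the recursion, so the nested-only (pseudoknot-free) formulation, the `min_loop` hairpin constraint, the function name, and the toy probability matrix are all assumptions.

```python
def max_probability_structure(prob, min_loop=3):
    """Return (score, pairs) for the nested structure with maximal summed pair probability.

    prob[i][j] is a hypothetical predicted probability that bases i and j pair.
    min_loop is the assumed minimum number of unpaired bases inside a hairpin.
    """
    n = len(prob)
    if n == 0:
        return 0.0, []
    # best[i][j] holds the optimal score for the subsequence i..j (inclusive).
    best = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                left = best[i][k - 1] if k > i else 0.0
                score = max(score, left + best[k + 1][j - 1] + prob[k][j])
            best[i][j] = score

    # Trace back through the table to recover the chosen pairs.
    pairs = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            left = best[i][k - 1] if k > i else 0.0
            if best[i][j] == left + best[k + 1][j - 1] + prob[k][j]:
                pairs.append((k, j))
                if k > i:
                    stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break
    return best[0][n - 1], sorted(pairs)


# Toy 8-base example: pairs (1, 6) and (2, 6) compete for base 6.
prob = [[0.0] * 8 for _ in range(8)]
prob[0][7], prob[1][6], prob[2][6] = 0.9, 0.8, 0.3
score, pairs = max_probability_structure(prob)
print(pairs)
```

The table fill costs cubic time in the sequence length; each base ends up in at most one pair because the recursion only ever pairs the right-most base of a subsequence with one partner.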


1999, Vol 6 (15)
Author(s): Rune B. Lyngsø, Michael Zuker, Christian N. S. Pedersen

Though not as abundant in known biological processes as proteins, RNA molecules serve as more than mere intermediaries between DNA and proteins, e.g. as catalytic molecules. Furthermore, RNA secondary structure prediction based on free energy rules for stacking and loop formation remains one of the few major breakthroughs in the field of structure prediction. We present a new method to evaluate all possible internal loops of size at most k in an RNA sequence, s, in time O(k|s|^2); this is an improvement from the previously used method that uses time O(k^2|s|^2). For unlimited loop size this improves the overall complexity of evaluating RNA secondary structures from O(|s|^4) to O(|s|^3) and the method applies equally well to finding the optimal structure and calculating the equilibrium partition function. We use our method to examine the soundness of setting k = 30, a commonly used heuristic.
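To see where the k^2 factor in the earlier O(k^2|s|^2) bound comes from, one can count the loop shapes a naive method inspects for each closing base pair. The sketch below is only that counting argument, not the paper's method; it assumes an internal loop has at least one unpaired base on each side (bulges excluded), and the function name is hypothetical.

```python
def internal_loop_shapes(k):
    """List (l1, l2) unpaired-base counts on the two sides of an internal loop.

    Assumption: both sides have at least one unpaired base and the total
    loop size l1 + l2 is at most k.
    """
    return [(l1, l2) for l1 in range(1, k) for l2 in range(1, k - l1 + 1)]


# The number of shapes per closing pair grows quadratically: k * (k - 1) / 2.
for k in (5, 10, 30):
    shapes = internal_loop_shapes(k)
    print(k, len(shapes), k * (k - 1) // 2)
```

With the commonly used k = 30 this gives 435 candidate shapes per closing pair, and there are O(|s|^2) closing pairs, which matches the quadratic-in-k cost the abstract attributes to the previous method.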


2012, Vol 532-533, pp. 1796-1799
Author(s): Zhen Dong Liu, Da Ming Zhu

Pseudoknots are complicated and stable RNA structures. Based on the idea of iteratively forming stable stems, and on the observation that stems in RNA molecules are relatively stable, an algorithm is presented to predict RNA secondary structure including pseudoknots. It improves on the previously used algorithm, taking O(n^3) time and O(n^2) space. In prediction accuracy it outperforms other known algorithms for RNA secondary structure prediction; its performance is tested on RNA sub-sequences from PseudoBase. The experimental results indicate that the algorithm has good specificity and sensitivity.
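The idea of iteratively forming stable stems can be illustrated with a simple greedy loop: find the best remaining stem, fix it, and repeat on the bases that are still unpaired. The sketch below is an assumption-heavy illustration rather than the authors' algorithm: it uses stem length as a stand-in for stability, its helper names and thresholds are hypothetical, and it makes no attempt to reach the stated O(n^3) time or O(n^2) space.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_stem(seq, used, min_loop=3):
    """Find the longest run of stacked pairs (i, j), (i+1, j-1), ... among unused bases."""
    n = len(seq)
    best = []
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            stem = []
            a, b = i, j
            # Extend the stem inward while the bases are free and complementary.
            while (b - a > min_loop and not used[a] and not used[b]
                   and (seq[a], seq[b]) in PAIRS):
                stem.append((a, b))
                a += 1
                b -= 1
            if len(stem) > len(best):
                best = stem
    return best


def greedy_stems(seq, min_stem=3):
    """Repeatedly commit the longest available stem until none is long enough."""
    used = [False] * len(seq)
    structure = []
    while True:
        stem = longest_stem(seq, used)
        if not stem or len(stem) < min_stem:
            break
        structure.extend(stem)
        for a, b in stem:
            used[a] = used[b] = True
    return sorted(structure)


print(greedy_stems("GGGGAAAACCCC"))
```

Because a later stem is not required to nest inside or lie outside earlier ones, crossing stems (and therefore pseudoknots) can appear in the result, which is the property the abstract emphasises.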

