Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure

1999, Vol 288 (5), pp. 911-940
Author(s): David H. Mathews, Jeffrey Sabina, Michael Zuker, Douglas H. Turner
2019
Author(s): F A Rezaur Rahman Chowdhury, He Zhang, Liang Huang

Abstract: RNA secondary structure helps in understanding RNA function, so accurate prediction systems are desirable. Both thermodynamics-based and machine learning-based models have been used in prediction systems for this problem. Compared with thermodynamics-based models, machine learning-based models can compensate for thermodynamic parameters that are measured inaccurately because of experimental limitations. However, existing methods for training machine learning-based models remain expensive because of their cubic-time inference cost. To overcome this, we present a linear-time machine learning-based folding system that uses the recently proposed approximate folding tool LinearFold as the inference engine and structured SVM (sSVM) as the training algorithm. Furthermore, to remedy the non-convergence of naive sSVM under inexact search, we introduce a max-violation update strategy. Our system trains 41× faster than CONTRAfold on a diverse dataset for one epoch, and 14× faster than MXfold on a dataset with longer sequences. With the learned parameters, our system improves the accuracy of LinearFold and is the most accurate among the selected folding tools, which include CONTRAfold, Vienna RNAfold, and MXfold.
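The max-violation idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' system or LinearFold: it assumes a perceptron-style update (no margin or cost term, unlike a full sSVM), a plain left-to-right beam search over the labels "(", ")" and "." with no balanced-bracket constraint, and hypothetical emission and transition features. All function names are invented for illustration. The update is applied at the prefix length where the beam's best prefix outscores the gold prefix by the largest amount, rather than on the full (possibly search-damaged) output.

```python
from collections import Counter

LABELS = ("(", ")", ".")

def prefix_features(seq, labels):
    """Hypothetical features: (nucleotide, label) and (previous label, label) counts."""
    feats = Counter()
    prev = "^"  # start-of-sequence marker
    for nuc, lab in zip(seq, labels):
        feats[("emit", nuc, lab)] += 1
        feats[("trans", prev, lab)] += 1
        prev = lab
    return feats

def score(weights, feats):
    """Linear model score: dot product of weights and feature counts."""
    return sum(weights.get(f, 0.0) * v for f, v in feats.items())

def beam_prefixes(weights, seq, beam_size):
    """Inexact search: return the top-scoring prefix in the beam at each step."""
    beam = [""]
    best_per_step = []
    for _ in range(len(seq)):
        candidates = [p + lab for p in beam for lab in LABELS]
        candidates.sort(
            key=lambda p: score(weights, prefix_features(seq, p)),
            reverse=True,
        )
        beam = candidates[:beam_size]
        best_per_step.append(beam[0])
    return best_per_step

def max_violation_update(weights, seq, gold, beam_size=2, lr=1.0):
    """Update at the step with the largest (predicted - gold) prefix score gap.

    Returns True if the weights were updated, False if the beam's top prefix
    matched the gold prefix at every step.
    """
    worst_gap, worst_pred, worst_gold = None, None, None
    for i, pred in enumerate(beam_prefixes(weights, seq, beam_size), start=1):
        gold_prefix = gold[:i]
        if pred == gold_prefix:
            continue
        gap = (score(weights, prefix_features(seq, pred))
               - score(weights, prefix_features(seq, gold_prefix)))
        if worst_gap is None or gap > worst_gap:
            worst_gap, worst_pred, worst_gold = gap, pred, gold_prefix
    if worst_pred is None:
        return False
    # Reward features of the gold prefix, penalize those of the predicted prefix.
    for f, v in prefix_features(seq, worst_gold).items():
        weights[f] = weights.get(f, 0.0) + lr * v
    for f, v in prefix_features(seq, worst_pred).items():
        weights[f] = weights.get(f, 0.0) - lr * v
    return True
```

Calling `max_violation_update(weights, "GAC", "(.)")` repeatedly on an initially empty `weights` dict stops updating once the beam's top prefix agrees with the gold prefix at every step. Updating on prefixes rather than complete outputs is the design choice that keeps each update a genuine violation even when beam search has pruned the gold structure.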

