Parallelization of Dynamic Programming in Nussinov RNA Folding Algorithm on the CUDA GPU

Author(s):  
Marina Zaharieva Stojanovski ◽  
Dejan Gjorgjevikj ◽  
Gjorgji Madjarov
2012 ◽  
Vol 10 (02) ◽  
pp. 1241007 ◽  
Author(s):  
SLAVICA DIMITRIEVA ◽  
PHILIPP BUCHER

Commonly used RNA folding programs compute the minimum free energy structure of a sequence under the pseudoknot exclusion constraint. They are based on Zuker's algorithm, which runs in time O(n^3). Recently, it has been claimed that RNA folding can be achieved in average time O(n^2) using a sparsification technique. A proof of quadratic time complexity was based on the assumption that computational RNA folding obeys the "polymer-zeta property". Several variants of sparse RNA folding algorithms were later developed. Here, we present our own version, which is readily applicable to existing RNA folding programs, as it is extremely simple and does not require any new data structure. We applied it to the widely used Vienna RNAfold program, to create sibRNAfold, the first public sparsified version of a standard RNA folding program. To gain a better understanding of the time complexity of sparsified RNA folding in general, we carried out a thorough run time analysis with synthetic random sequences, both in the context of energy minimization and base pairing maximization. Contrary to previous claims, the asymptotic time complexity of a sparsified RNA folding algorithm using standard energy parameters remains O(n^3) under a wide variety of conditions. Consistent with our run-time analysis, we found that RNA folding does not obey the "polymer-zeta property" as claimed previously. Yet, a basic version of a sparsified RNA folding algorithm provides a 15- to 50-fold speed gain. Surprisingly, the same sparsification technique has a different effect when applied to base pairing optimization. There, its asymptotic running time complexity appears to be either quadratic or cubic depending on the base composition. The code used in this work is available at: http://sibRNAfold.sourceforge.net/ .
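To make the idea of sparsification concrete, the following is a minimal Python sketch for the base pairing maximization case only. It is not the sibRNAfold code: the function names, the unit score per pair, the allowed pair set, and the absence of a minimum hairpin loop length are all assumptions made for illustration. The dense version tries every split point; the sparse version tries only "candidate" positions, where closing a pair strictly beats every alternative decomposition of the same span.

```python
# Assumed pairing rule (Watson-Crick plus G-U wobble); illustrative only.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_pairs_dense(seq):
    """Reference cubic DP. Returns (max nested pairs, inner-loop steps)."""
    n = len(seq)
    # N[i][j] = maximum number of nested pairs in the half-open span seq[i:j]
    N = [[0] * (n + 1) for _ in range(n + 1)]
    steps = 0
    for j in range(1, n + 1):
        for i in range(j - 1, -1, -1):
            score = N[i][j - 1]  # last base seq[j-1] left unpaired
            for k in range(i, j - 1):  # last base paired with seq[k]
                steps += 1
                if (seq[k], seq[j - 1]) in PAIRS:
                    score = max(score, N[i][k] + N[k + 1][j - 1] + 1)
            N[i][j] = score
    return N[0][n], steps


def max_pairs_sparse(seq):
    """Same recurrence, but split points are restricted to a candidate list."""
    n = len(seq)
    N = [[0] * (n + 1) for _ in range(n + 1)]
    steps = 0
    for j in range(1, n + 1):
        # (k, closed) entries: pairing seq[k] with seq[j-1] scored `closed`
        # and strictly beat every other way of folding seq[k:j].
        candidates = []
        for i in range(j - 1, -1, -1):
            score = N[i][j - 1]  # last base unpaired
            for k, closed in candidates:  # every stored k satisfies k > i
                steps += 1
                score = max(score, N[i][k] + closed)
            if i < j - 1 and (seq[i], seq[j - 1]) in PAIRS:
                closed = N[i + 1][j - 1] + 1
                if closed > score:  # strictly better: i becomes a candidate
                    score = closed
                    candidates.append((i, closed))
            N[i][j] = score
    return N[0][n], steps


if __name__ == "__main__":
    for s in ("GGGAAAUCC", "GCGCGCGCGCGC", "AAAAAAAAAAAA"):
        print(s, max_pairs_dense(s), max_pairs_sparse(s))
```

Both functions return the same score; only the step counter differs. How much the counter shrinks depends on the sequence, which is in line with the abstract's observation that the running time of sparsified base pairing optimization depends on base composition.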


1998 ◽  
Vol 24 (11) ◽  
pp. 1617-1634 ◽  
Author(s):  
Jih-H. Chen ◽  
Shu-Yun Le ◽  
Bruce A. Shapiro ◽  
Jacob V. Maizel

1990 ◽  
Vol 185 (1) ◽  
pp. 57-62 ◽  
Author(s):  
Luke Pallansch ◽  
Howard Beswick ◽  
John Talian ◽  
Peggy Zelenka

2006 ◽  
Vol 30 (1) ◽  
pp. 72-76 ◽  
Author(s):  
Haijun Liu ◽  
Dong Xu ◽  
Jianlin Shao ◽  
Yifei Wang

1998 ◽  
Vol 276 (1) ◽  
pp. 43-55 ◽  
Author(s):  
Alexander P Gultyaev ◽  
F.H.D van Batenburg ◽  
Cornelis W.A Pleij

2019 ◽  
Vol 35 (14) ◽  
pp. i295-i304 ◽  
Author(s):  
Liang Huang ◽  
He Zhang ◽  
Dezhong Deng ◽  
Kai Zhao ◽  
Kaibo Liu ◽  
...  

Abstract

Motivation: Predicting the secondary structure of a ribonucleic acid (RNA) sequence is useful in many applications. Existing algorithms (based on dynamic programming) suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications.

Results: We present a novel alternative O(n^3)-time dynamic programming algorithm for RNA folding that is amenable to heuristics that make it run in O(n) time and O(n) space, while producing a high-quality approximation to the optimal solution. Inspired by incremental parsing for context-free grammars in computational linguistics, our alternative dynamic programming algorithm scans the sequence in a left-to-right (5′-to-3′) direction rather than in a bottom-up fashion, which allows us to employ the effective beam pruning heuristic. Our work, though inexact, is the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure. Surprisingly, our approximate search results in even higher overall accuracy on a diverse database of sequences with known structures. More interestingly, it leads to significantly more accurate predictions on the longest sequence families in that database (16S and 23S ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart), both of which are well known to be challenging for the current models.

Availability and implementation: Our source code is available at https://github.com/LinearFold/LinearFold, and our webserver is at http://linearfold.org (sequence limit: 100,000 nt).

Supplementary information: Supplementary data are available at Bioinformatics online.
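The left-to-right scan with beam pruning can be sketched in Python on a toy scoring model. This is not the LinearFold implementation: the sketch assumes one point per base pair instead of an energy model, no minimum hairpin loop length, and a simplified pruning rule that ranks each open span by the best prefix score before it plus its own score. The names `beam_pairs` and `states` are hypothetical.

```python
# Assumed pairing rule (Watson-Crick plus G-U wobble); illustrative only.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def beam_pairs(seq, beam=100):
    """Left-to-right DP over spans; keeps roughly `beam` spans per position."""
    n = len(seq)
    # states[j][i] = best score found so far for the half-open span seq[i:j]
    states = [{j: 0} for j in range(n + 1)]  # each empty span scores 0
    for j in range(n):
        cur = states[j]
        if len(cur) > beam:
            # Rank spans by (best score of prefix seq[0:i]) + (span score).
            ranked = sorted(cur, key=lambda i: states[i].get(0, 0) + cur[i],
                            reverse=True)
            # Always retain the full prefix (i = 0) and the empty span (i = j).
            keep = set(ranked[:beam]) | {0, j}
            cur = states[j] = {i: cur[i] for i in keep}
        nxt = states[j + 1]
        for i, score in list(cur.items()):
            # SKIP: read seq[j] as an unpaired base.
            if nxt.get(i, -1) < score:
                nxt[i] = score
            # POP: pair seq[i-1] with seq[j], then attach each surviving
            # span that ends just before position i-1.
            if i > 0 and (seq[i - 1], seq[j]) in PAIRS:
                for k, left in states[i - 1].items():
                    total = left + score + 1
                    if nxt.get(k, -1) < total:
                        nxt[k] = total
    return states[n].get(0, 0)


if __name__ == "__main__":
    s = "GGGAAAUCC"
    print(beam_pairs(s, beam=1000))  # beam wide enough that nothing is pruned
    print(beam_pairs(s, beam=2))     # narrow beam: approximate, never higher
```

With a beam wider than the sequence length, nothing is pruned and the scan evaluates the same recurrence as the bottom-up cubic algorithm. With a fixed narrow beam, each position keeps a bounded number of spans, which is what makes a linear-time approximation possible in principle.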


Author(s):  
Guillaume Rizk ◽  
Dominique Lavenier ◽  
Sanjay Rajopadhye
