Data dependency reduction in Dynamic Programming matrix

Author(s):  
Guillermo Delgado ◽  
Chatchawit Aporntewan
Author(s):  
Louis J. Cochrane ◽  
Derek Gatherer

The Needleman-Wunsch process is a classic tool in bioinformatics: a dynamic programming algorithm that performs a pairwise alignment of two input biological sequences, either protein or nucleic acid. A distance matrix between the tokens used in the sequences is also required as input. The distance matrix is used to generate a positional pairwise similarity matrix between the input sequences, which is in turn used to generate a dynamic programming matrix. The best path through the dynamic programming matrix is navigated using a traceback procedure that maximises similarity, inserting gaps as necessary. Needleman-Wunsch can align either nucleic acids or proteins, which use alphabets of 4 and 20 tokens respectively. It can also be applied to any other kind of sequence for which a distance matrix can be specified. Here, we apply it to chains of Pousseur’s Scambi electronic music fragments, of which there are 32, and which Pousseur categorised by their sonic properties, thus permitting the consecutive construction of distance, similarity and dynamic programming matrices. Traceback through the dynamic programming matrix produces contrapuntal duet compositions in which two Scambi chains are played in the maximally euphonious manner, also providing an illustration of the principles of biological sequence alignment in sound.
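The matrix fill and traceback described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `needleman_wunsch`, the `score` callback (standing in for the similarity derived from the distance matrix), and the linear gap penalty `gap` are all assumptions made for the example.

```python
def needleman_wunsch(a, b, score, gap=-1):
    """Globally align sequences a and b.

    score(x, y) is an assumed similarity function between two tokens;
    gap is an assumed linear gap penalty. Returns (best_score, aligned_a, aligned_b).
    """
    n, m = len(a), len(b)
    # F[i][j] holds the best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align two tokens
                F[i - 1][j] + gap,                            # gap in b
                F[i][j - 1] + gap,                            # gap in a
            )
    # Traceback from the bottom-right corner, inserting gaps as necessary.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


def simple_score(x, y):
    # Assumed toy scoring: +1 for identical tokens, -1 otherwise.
    return 1 if x == y else -1
```

Calling `needleman_wunsch("ACGT", "AGT", simple_score)` aligns the two strings with a single gap. For non-biological tokens such as the Scambi fragments, only the `score` function would need to change, since the algorithm itself does not depend on the alphabet.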


2005 ◽  
Vol 16 (1) ◽  
pp. 75-81 ◽  
Author(s):  
KETIL MALDE ◽  
ROBERT GIEGERICH

Position-specific scoring matrices are one way to represent approximate string patterns, which are commonly encountered in the field of bioinformatics. An important problem that arises with their application is calculating the statistical significance of matches. We review the currently most efficient algorithm for this task, and show how it can be implemented in Haskell, taking advantage of the built-in non-strictness of the language. The resulting program turns out to be an instance of dynamic programming, using lists rather than the typical dynamic programming matrix.
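One common way to compute the significance of a match is to build the exact distribution of scores column by column, which is itself a dynamic programme. The sketch below illustrates that general idea in Python and is not necessarily the algorithm reviewed in the paper, nor its Haskell formulation. The names `score_distribution` and `p_value`, integer-valued scores, and an independent per-position background model are assumptions for the example.

```python
from collections import defaultdict


def score_distribution(pssm, background):
    """Return {score: probability} for a random sequence under the background.

    pssm: list of columns, each a dict mapping symbol -> integer score (assumed).
    background: dict mapping symbol -> probability, assumed independent per position.
    """
    dist = {0: 1.0}  # before any column, the score is 0 with certainty
    for column in pssm:
        nxt = defaultdict(float)
        for s, p in dist.items():
            for sym, q in background.items():
                # Extend every partial score by every possible symbol.
                nxt[s + column[sym]] += p * q
        dist = dict(nxt)
    return dist


def p_value(dist, threshold):
    """Probability that a random sequence scores at least the threshold."""
    return sum(p for s, p in dist.items() if s >= threshold)
```

Each step keeps only the distribution after the previous column, so no full matrix is stored, which loosely mirrors the list-based formulation the abstract mentions.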


Author(s):  
Yun Sup Lee ◽  
Yu Sin Kim ◽  
Roger Luis Uy

The Needleman-Wunsch dynamic programming algorithm measures the similarity of a pair of sequences and finds the optimal pair among a given number of sequences. The task becomes nontrivial as the number of sequences to compare or the length of the sequences increases. This research aims to parallelize the computation involved in the algorithm using CUDA in order to speed it up. However, the nature of dynamic programming introduces a data dependency issue. As a solution, this research introduces the heterogeneous anti-diagonal approach, which benefits from the interaction between the serial implementation on the CPU and the parallel implementation on the GPU. We then measure and compare the computation time of the proposed approach against a straightforward serial approach that uses the CPU only. Computation times are measured under the same experimental setup, using various sequence pairs of different lengths. The experiment showed that the proposed approach is approximately three times faster than the serial method. Moreover, the computation time of the proposed heterogeneous anti-diagonal approach increases gradually despite large increments in sequence length, whereas the computation time of the serial approach grows rapidly.
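The data dependency mentioned here arises because each cell of the matrix needs its upper, left and upper-left neighbours. Cells on the same anti-diagonal do not depend on one another, so they could be computed in parallel. The sketch below shows only that traversal order in sequential Python. It is not the paper's CUDA code or its CPU/GPU split, and the function name and the match, mismatch and gap scores are assumptions for the example.

```python
def nw_score_antidiagonal(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score computed one anti-diagonal at a time."""
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        F[i][0] = i * gap
    for j in range(m + 1):
        F[0][j] = j * gap
    # Anti-diagonal d contains every interior cell (i, j) with i + j == d.
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)
        i_hi = min(n, d - 1)
        # The cells in this inner loop read only diagonals d-1 and d-2,
        # so a GPU kernel could assign one thread per cell here.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    return F[n][m]
```

The early and late anti-diagonals contain only a few cells, which offers little parallel work. That may be one motivation for mixing serial and parallel execution, although the abstract does not spell out how the work is divided.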


2014 ◽  
Vol 23 (03) ◽  
pp. 1450031
Author(s):  
QIANGHUA ZHU ◽  
FEI XIA ◽  
GUOQING JIN

RNA secondary structure prediction is one of the important research areas in modern bioinformatics and computational biology. PKNOTS is the most famous benchmark program and has been widely used to predict RNA secondary structure including pseudoknots. It adopts the standard 4D dynamic programming method and is the basis of many variants and improved algorithms. Unfortunately, the O(N^6) computing requirements and complicated data dependency greatly limit the usefulness of the PKNOTS package given the explosion in gene database size. In this paper, we present a fine-grained parallel PKNOTS algorithm and prototype system for accelerating RNA folding applications on a field-programmable gate array (FPGA) platform. We improved data locality by converting the loop nesting relationship and reorganizing the computing order of the elements in the source code. We aggressively exploit data reuse, data dependency elimination and memory access scheduling strategies to minimize the need for loading data from external memory. To the best of our knowledge, our design is the first FPGA implementation for accelerating the 4D dynamic programming problem for RNA folding applications including pseudoknots. The experimental results show an average speedup of more than 11× over the PKNOTS-1.05 software running on a PC platform with an AMD Phenom 9650 Quad CPU for input RNA sequences. Moreover, the power consumption of our FPGA accelerator is only about 50% of that of general-purpose microprocessors.

