scholarly journals Accelerating Smith-Waterman Algorithm for Faster Sequence Alignment using Graphical Processing Unit

2022 ◽  
Vol 2161 (1) ◽  
pp. 012028
Author(s):  
Karamjeet Kaur ◽  
Sudeshna Chakraborty ◽  
Manoj Kumar Gupta

Abstract In bioinformatics, sequence alignment is very important task to compare and find similarity between biological sequences. Smith Waterman algorithm is most widely used for alignment process but it has quadratic time complexity. This algorithm is using sequential approach so if the no. of biological sequences is increasing then it takes too much time to align sequences. In this paper, parallel approach of Smith Waterman algorithm is proposed and implemented according to the architecture of graphic processing unit using CUDA in which features of GPU is combined with CPU in such a way that alignment process is three times faster than sequential implementation of Smith Waterman algorithm and helps in accelerating the performance of sequence alignment using GPU. This paper describes the parallel implementation of sequence alignment using GPU and this intra-task parallelization strategy reduces the execution time. The results show significant runtime savings on GPU.

2009 ◽  
Vol 19 (04) ◽  
pp. 513-533 ◽  
Author(s):  
FUMIHIKO INO ◽  
YUKI KOTANI ◽  
YUMA MUNEKAWA ◽  
KENICHI HAGIHARA

This paper presents a parallel system capable of accelerating biological sequence alignment on the graphics processing unit (GPU) grid. The GPU grid in this paper is a desktop grid system that utilizes idle GPUs and CPUs in the office and home. Our parallel implementation employs a master-worker paradigm to accelerate an OpenGL-based algorithm that runs on a single GPU. We integrate this implementation into a screensaver-based grid system that detects idle resources on which the alignment code can run. We also show some experimental results comparing our implementation with three different implementations running on a single GPU, a single CPU, or multiple CPUs. As a result, we find that a single non-dedicated GPU can provide us almost the same throughput as two dedicated CPUs in our laboratory environment, where GPU-equipped machines are ordinarily used to develop GPU applications. In a dedicated environment, the GPU-accelerated code achieves five times higher throughput than the CPU-based code. Furthermore, a linear speedup of 30.7X is observed on a 32-node cluster of dedicated GPUs. We also implement a compute unified device architecture (CUDA) based algorithm to demonstrate further acceleration.


Author(s):  
Abdullah N. Arslan

Sequence alignment is one of the most fundamental problems in computational biology. Ordinarily, the problem aims to align symbols of given sequences in a way to optimize similarity score. This score is computed using a given scoring matrix that assigns a score to every pair of symbols in an alignment. The expectation is that scoring matrices perform well for alignments of all sequences. However, it has been shown that this is not always true although scoring matrices are derived from known similarities. Biological sequences share common sequence structures that are signatures of common functions, or evolutionary relatedness. The alignment process should be guided by constraining the desired alignments to contain these structures even though this does not always yield optimal scores. Changes in biological sequences occur over the course of millions of years, and in ways, and orders we do not completely know. Sequence alignment has become a dynamic area where new knowledge is acquired, new common structures are extracted from sequences, and these yield more sophisticated alignment methods, which in turn yield more knowledge. This feedback loop is essential for this inherently difficult task. The ordinary definition of sequence alignment does not always reveal biologically accurate similarities. To overcome this, there have been attempts that redefined sequence similarity. Huang (1994) proposed an optimization problem in which close matches are rewarded more favorably than the same number of isolated matches. Zhang, Berman & Miller (1998) proposed an algorithm that finds alignments free of low scoring regions. Arslan, Egecioglu, & Pevzner (2001) proposed length-normalized local sequence alignment for which the objective is to find subsequences that yield maximum length-normalized score where the length-normalized score of a given alignment is its score divided by sum of subsequence-lengths involved in the alignment. This can be considered as a contextdependent sequence alignment where a high degree of local similarity defines a context. Arslan, Egecioglu, & Pevzner (2001) presented a fractional programming algorithm for the resulting problem. Although these attempts are important, some biologically meaningful alignments can contain motifs whose inclusions are not guaranteed in the alignments returned by these methods. Our emphasis in this chapter is on methods that guide sequence alignment by requiring desired alignments to contain given common structures identified in sequences (motifs).


Author(s):  
Ahmad Hasif Azman ◽  
Syed Abdul Mutalib Al Junid ◽  
Abdul Hadi Abdul Razak ◽  
Mohd Faizul Md Idros ◽  
Abdul Karimi Halim ◽  
...  

Nowadays, the requirement for high performance and sensitive alignment tools have increased after the advantage of the Deoxyribonucleic Acid (DNA) and molecular biology has been figured out through Bioinformatics study. Therefore, this paper reports the performance evaluation of parallel Smith-Waterman Algorithm implementation on the new NVIDIA GeForce GTX Titan X Graphic Processing Unit (GPU) compared to the Central Processing Unit (CPU) running on Intel® CoreTM i5-4440S CPU 2.80GHz. Both of the design were developed using C-programming language and targeted to the respective platform. The code for GPU was developed and compiled using NVIDIA Compute Unified Device Architecture (CUDA). It clearly recorded that, the performance of GPU based computational is better compared to the CPU based. These results indicate that the GPU based DNA sequence alignment has a better speed in accelerating the computational process of DNA sequence alignment.


Author(s):  
Soumya Ranjan Nayak ◽  
S Sivakumar ◽  
Akash Kumar Bhoi ◽  
Gyoo-Soo Chae ◽  
Pradeep Kumar Mallick

Graphical processing unit (GPU) has gained more popularity among researchers in the field of decision making and knowledge discovery systems. However, most of the earlier studies have GPU memory utilization, computational time, and accuracy limitations. The main contribution of this paper is to present a novel algorithm called the Mixed Mode Database Miner (MMDBM) classifier by implementing multithreading concepts on a large number of attributes. The proposed method use the quick sort algorithm in GPU parallel computing to overcome the state of the art limitations. This method applies the dynamic rule generation approach for constructing the decision tree based on the predicted rules. Moreover, the implementation results are compared with both SLIQ and MMDBM using Java and GPU with the computed acceleration ratio time using the BP dataset. The primary objective of this work is to improve the performance with less processing time. The results are also analyzed using various threads in GPU mining using eight different datasets of UCI Machine learning repository. The proposed MMDBM algorithm have been validated on these chosen eight different dataset with accuracy of 91.3% in diabetes, 89.1% in breast cancer, 96.6% in iris, 89.9% in labor, 95.4% in vote, 89.5% in credit card, 78.7% in supermarket and 78.7% in BP, and simultaneously, it also takes less computational time for given datasets. The outcome of this work will be beneficial for the research community to develop more effective multi thread based GPU solution in GPU mining to handle large set of data in minimal processing time. Therefore, this can be considered a more reliable and precise method for GPU computing.


Sign in / Sign up

Export Citation Format

Share Document