GPU accelerated partial order multiple sequence alignment for long reads self-correction

Mapping Intimacies ◽

10.1101/2020.02.14.946939 ◽

2020 ◽

Author(s):

Francesco Peverelli ◽

Lorenzo Di Tucci ◽

Marco D. Santambrogio ◽

Nan Ding ◽

Steven Hofmeyr ◽

...

Keyword(s):

Error Correction ◽

Partial Order ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Consensus Sequence ◽

Pairwise Alignment ◽

Multiple Sequence ◽

Graph Alignment ◽

Correction Process ◽

Long Reads

Abstract: As third-generation sequencing technologies become more reliable and widely used to solve several genome-related problems, self-correction of long reads is becoming the preferred method to reduce the error rate of Pacific Biosciences and Oxford Nanopore long reads, which is now around 10-12%. Several of these self-correction methods rely on some form of Multiple Sequence Alignment (MSA) to obtain a consensus sequence for the original reads. In particular, error-correction tools such as RACON and CONSENT use Partial Order (PO) graph alignment to accomplish this task. PO graph alignment, which is computationally more expensive than optimal global pairwise alignment between two sequences, needs to be performed several times for each read during the error correction process. GPUs have proven very effective in accelerating several compute-intensive tasks in different scientific fields. We harnessed the power of these architectures to accelerate the error correction process of existing self-correction tools and thereby improve the efficiency of this step of genome analysis. In this paper, we introduce a GPU-accelerated version of the PO alignment presented in the POA v2 software library, implemented on an NVIDIA Tesla V100 GPU. We obtain up to 6.5x speedup compared to 64 CPU threads running on two 2.3 GHz 16-core Intel Xeon E5-2698 v3 processors. In our implementation, we focused on the alignment of smaller sequences, as the CONSENT segmentation strategy based on k-mer chaining provides an optimal opportunity to exploit the parallel-processing power of GPUs. To demonstrate this, we have integrated our kernel into the CONSENT software. This accelerated version of CONSENT provides a speedup for the whole error correction step that ranges from 1.95x to 8.5x depending on the input reads.
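To make the cost comparison in the abstract concrete, the following is a minimal Python sketch of sequence-to-graph alignment on a partial order graph. It is not the POA v2 code or the GPU kernel from the paper. The function name `align_to_po_graph`, the graph encoding (`bases`, `preds`), and the linear scoring values are assumptions chosen for illustration. The inner loop over predecessors is what makes each cell more expensive than in plain pairwise alignment, where every cell has exactly one predecessor row.

```python
def align_to_po_graph(bases, preds, seq, match=1, mismatch=-1, gap=-1):
    """Global alignment score of `seq` against a partial order graph.

    bases[i] is the letter on node i; preds[i] lists the predecessors of
    node i. Nodes must be numbered in topological order (each predecessor
    index is smaller than the node index). Scoring values are illustrative.
    """
    n, m = len(bases), len(seq)
    # Row 0 is a virtual start node; row i + 1 belongs to graph node i.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(n):
        # Nodes without predecessors hang off the virtual start row.
        prev_rows = [p + 1 for p in preds[i]] or [0]
        row = score[i + 1]
        row[0] = max(score[p][0] for p in prev_rows) + gap
        for j in range(1, m + 1):
            sub = match if bases[i] == seq[j - 1] else mismatch
            # Consume seq[j - 1] without advancing in the graph.
            best = row[j - 1] + gap
            # Unlike pairwise alignment, every predecessor row is a candidate.
            for p in prev_rows:
                best = max(best, score[p][j - 1] + sub, score[p][j] + gap)
            row[j] = best
    # End nodes are the ones that are not a predecessor of any other node.
    has_successor = {p for ps in preds for p in ps}
    ends = [i + 1 for i in range(n) if i not in has_successor] or [0]
    return max(score[e][m] for e in ends)


# A graph with one bubble: A -> C -> T and A -> G -> T.
bubble_bases = ["A", "C", "G", "T"]
bubble_preds = [[], [0], [0], [1, 2]]
print(align_to_po_graph(bubble_bases, bubble_preds, "AGT"))
```

Both branches of the bubble align perfectly to their own read, which is the property a consensus step can later exploit. A traceback and affine gap penalties are omitted here to keep the sketch short.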


GPU accelerated partial order multiple sequence alignment for long reads self-correction

2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) ◽

10.1109/ipdpsw50202.2020.00039 ◽

2020 ◽

Author(s):

Francesco Peverelli ◽

Lorenzo Di Tucci ◽

Marco D. Santambrogio ◽

Nan Ding ◽

Steven Hofmeyr ◽

...

Keyword(s):

Partial Order ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Multiple Sequence ◽

Long Reads


Multiple Sequence Alignment Optimization Using Meta-Heuristic Techniques

Data Analytics in Medicine ◽

10.4018/978-1-7998-1204-3.ch031 ◽

2020 ◽

pp. 565-579 ◽

Cited By ~ 1

Author(s):

Mohamed Issa ◽

Aboul Ella Hassanien

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Phylogenetic Trees ◽

Pairwise Alignment ◽

Accurate Method ◽

Alignment Algorithm ◽

Bacterial Foraging Optimization ◽

Multiple Sequence ◽

Speed Up ◽

DNA Fragment Assembly

Sequence alignment is a vital process in many biological applications such as phylogenetic tree construction, DNA fragment assembly, and structure/function prediction. There are two kinds of alignment: pairwise alignment, which aligns two sequences, and multiple sequence alignment (MSA), which aligns more than two sequences. The accurate method of alignment is based on the dynamic programming (DP) approach, whose running time grows exponentially with the length and the number of the aligned sequences. Stochastic or meta-heuristic techniques speed up the alignment algorithm, but yield near-optimal alignment accuracy rather than that of DP. Hence, this chapter aims to review the recent development of MSA using meta-heuristic algorithms. In addition, two recent techniques are examined in more depth: the first is fragmented protein sequence alignment using two-layer particle swarm optimization (FTLPSO); the second is multiple sequence alignment using a multi-objective bacterial foraging optimization algorithm (MO-BFO).
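As a point of reference for the DP approach mentioned above, here is a minimal Python sketch of global pairwise alignment scoring in the Needleman-Wunsch style. The function name and the scoring values are assumptions for illustration and are not taken from the chapter. For two sequences the table has one cell per pair of positions, so the cost is proportional to the product of the two lengths. Generalizing the same recurrence to N sequences needs an N-dimensional table, which is why exact DP becomes impractical for MSA.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of strings a and b (linear gap cost)."""
    # prev[j] holds the best score for aligning a[:i - 1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + sub,  # align a[i - 1] with b[j - 1]
                prev[j] + gap,      # a[i - 1] against a gap
                curr[j - 1] + gap,  # b[j - 1] against a gap
            )
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("ACGT", "AGT"))
```

Only two rows are kept in memory because the score alone is computed; recovering the alignment itself would require the full table or a divide-and-conquer traceback.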


An algorithm of multiple sequence alignment based on consensus sequence searched by simulated annealing and star alignment

2015 International Symposium on Bioelectronics and Bioinformatics (ISBB) ◽

10.1109/isbb.2015.7344909 ◽

2015 ◽

Cited By ~ 3

Author(s):

Dengfeng Yao ◽

Minghu Jiang ◽

Xu You ◽

Abudoukelimu Abulizi ◽

Renkui Hou

Keyword(s):

Simulated Annealing ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Consensus Sequence ◽

Multiple Sequence


A Hybrid Flow for Multiple Sequence Alignment with a BLASTn Based Pairwise Alignment Processor

2018 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas.2018.8351254 ◽

2018 ◽

Author(s):

Mao-Jan Lin ◽

Chih-Yu Chang ◽

Yu-Cheng Li ◽

Nae-Chyun Chen ◽

Yi-Chang Lu

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Pairwise Alignment ◽

Multiple Sequence


Parallelization of Pairwise Alignment and Neighbor-Joining Algorithm in Progressive Multiple Sequence Alignment

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v9.i1.pp234-242 ◽

2018 ◽

Vol 9 (1) ◽

pp. 234-242

Author(s):

Agung Widyo Utomo

Keyword(s):

Shared Memory ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Programming Model ◽

Heuristic Method ◽

Pairwise Alignment ◽

Neighbor Joining ◽

Progressive Alignment ◽

Multiple Sequence ◽

Progressive Multiple Sequence Alignment

The progressive multiple sequence alignment method ClustalW is a widely used heuristic for computing multiple sequence alignment (MSA). It has three stages: distance matrix computation using pairwise alignment, guide tree reconstruction using neighbor-joining, and progressive alignment. To accelerate computation on large data sets, the progressive MSA algorithm needs to be parallelized. This research aims to identify, decompose, and implement the pairwise alignment and neighbor-joining stages of progressive MSA using message passing, shared memory, and hybrid programming models on a computer cluster. The experimental results show that the shared memory programming model is the best implementation scenario, with a speedup of up to 12 times.
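The first stage described above decomposes naturally because every pair of sequences can be compared independently. The Python sketch below illustrates that decomposition only. It is not the paper's implementation: `edit_distance` (plain Levenshtein distance) is a stand-in for ClustalW's pairwise alignment distance, and the thread pool stands in for a shared memory programming model. In standard CPython, threads will not speed up this pure-Python work; they are used here only to show how the pairs are handed out.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance; a stand-in for an alignment-based distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def distance_matrix(seqs, workers=4):
    """Symmetric distance matrix built from n * (n - 1) / 2 independent tasks."""
    n = len(seqs)
    pairs = list(combinations(range(n), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `pairs`.
        dists = list(pool.map(
            lambda ij: edit_distance(seqs[ij[0]], seqs[ij[1]]), pairs))
    matrix = [[0] * n for _ in range(n)]
    for (i, j), d in zip(pairs, dists):
        matrix[i][j] = matrix[j][i] = d
    return matrix


print(distance_matrix(["ACGT", "AGT", "ACGA"]))
```

The resulting matrix would then feed the guide tree stage; neighbor-joining and the progressive alignment itself are outside the scope of this sketch.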


Efficient mapping of genomic sequences to optimize multiple pairwise alignment in hybrid cluster platforms

Journal of Integrative Bioinformatics ◽

10.1515/jib-2014-251 ◽

2014 ◽

Vol 11 (3) ◽

pp. 60-71

Author(s):

Alberto Montañola ◽

Concepció Roig ◽

Porfidio Hernández

Keyword(s):

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Pairwise Alignment ◽

Experimental Results ◽

Genomic Sequences ◽

Multiple Sequence ◽

Optimal Amount ◽

New Challenges ◽

Best Parameters ◽

Available Resources

Summary: Multiple sequence alignment (MSA), used in biocomputing to study similarities between different genomic sequences, is known to require substantial memory and computation resources. Nowadays, researchers are aligning thousands of these sequences, which creates new challenges in solving the problem efficiently with the available resources. Determining the right amount of resources to allocate is important to avoid wasting them, thus reducing the economic cost of running, for example, a specific cloud instance. Pairwise alignment is the initial key step of the MSA problem, in which all the required pair alignments are computed. We present a method to determine the optimal amount of memory and computation resources to allocate to the pairwise alignment, and we validate it through a set of experimental results for different possible inputs. These results allow us to determine the best parameters for configuring the applications so that they use the available resources of a given system effectively.


Generating consensus sequences from partial order multiple sequence alignment graphs

Bioinformatics ◽

10.1093/bioinformatics/btg109 ◽

2003 ◽

Vol 19 (8) ◽

pp. 999-1008 ◽

Cited By ~ 47

Author(s):

C. Lee

Keyword(s):

Partial Order ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Multiple Sequence ◽

Consensus Sequences


POAVIZ: a Partial Order Multiple Sequence Alignment Visualizer

Bioinformatics ◽

10.1093/bioinformatics/btg175 ◽

2003 ◽

Vol 19 (11) ◽

pp. 1446-1448 ◽

Cited By ~ 6

Author(s):

C. Grasso ◽

M. Quist ◽

K. Ke ◽

C. Lee

Keyword(s):

Partial Order ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Multiple Sequence


Multiple sequence alignment using partial order graphs

Bioinformatics ◽

10.1093/bioinformatics/18.3.452 ◽

2002 ◽

Vol 18 (3) ◽

pp. 452-464 ◽

Cited By ~ 495

Author(s):

C. Lee ◽

C. Grasso ◽

M. F. Sharlow

Keyword(s):

Partial Order ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Multiple Sequence


Evolving Consensus Sequence for Multiple Sequence Alignment with a Genetic Algorithm

Genetic and Evolutionary Computation — GECCO 2003 - Lecture Notes in Computer Science ◽

10.1007/3-540-45110-2_124 ◽

2003 ◽

pp. 2313-2324 ◽

Cited By ~ 3

Author(s):

Conrad Shyu ◽

James A. Foster

Keyword(s):

Genetic Algorithm ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Consensus Sequence ◽

Multiple Sequence
