DCA: An efficient implementation of the divide-and-conquer approach to simultaneous multiple sequence alignment

1997 ◽  
Vol 13 (6) ◽  
pp. 625-626 ◽  
Author(s):  
Jens Stoye ◽  
Vincent Moulton ◽  
Andreas W.M. Dress

2010 ◽  
Vol 439-440 ◽  
pp. 35-40
Author(s):  
Zhan Mao Cao ◽  
Wen Jun Xiao ◽  
Li Min Peng

A new performance assessment model is proposed for multiple sequence alignment. The strategy is based on the beam construction of the DC-BTA algorithm, a divide-and-conquer alignment method with beams. Beams form blocks of almost identical columns and contribute the largest similarity weight to the sequences. A formula that computes all beam areas covering a sequence assigns a value, or weight, to that sequence, and the total beam area is a part of the whole alignment. A rate value between 0 and 1 is computed to assess performance. Because beam areas are easy to collect in DC-BTA, the scheme is a simple and effective assessment policy.
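The abstract does not give the beam-area formula, so the following is only a minimal Python sketch of how such a rate might be computed. It assumes a simplified definition: a beam is a maximal run of at least `min_width` consecutive gap-free columns in which every row carries the same residue (the abstract allows "almost" identical columns), and the rate is the total beam area divided by the area of the whole alignment. The names `find_beams`, `beam_rate` and `min_width` are hypothetical and do not come from DC-BTA.

```python
from typing import List, Tuple


def find_beams(alignment: List[str], min_width: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of maximal runs of
    gap-free columns in which every row carries the same residue.

    Assumption: this strict definition stands in for the paper's
    "almost identical" columns, which the abstract does not specify.
    """
    if not alignment:
        return []
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows of an alignment must have equal length")

    beams: List[Tuple[int, int]] = []
    start = None  # first column of the run currently being extended
    for col in range(n_cols):
        residues = {row[col] for row in alignment}
        conserved = len(residues) == 1 and "-" not in residues
        if conserved and start is None:
            start = col
        elif not conserved and start is not None:
            if col - start >= min_width:
                beams.append((start, col))
            start = None
    # Close a run that reaches the last column.
    if start is not None and n_cols - start >= min_width:
        beams.append((start, n_cols))
    return beams


def beam_rate(alignment: List[str], min_width: int = 2) -> float:
    """Fraction of the alignment area covered by beams, between 0 and 1."""
    if not alignment or not alignment[0]:
        return 0.0
    n_rows, n_cols = len(alignment), len(alignment[0])
    beam_area = sum((end - start) * n_rows
                    for start, end in find_beams(alignment, min_width))
    return beam_area / (n_rows * n_cols)


if __name__ == "__main__":
    toy = ["ACGT-A", "ACGTTA", "ACCTTA"]
    print(find_beams(toy), beam_rate(toy))
```

Under these assumptions, a rate close to 1 means that most of the alignment consists of conserved blocks, while a rate close to 0 means that few such blocks were found.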


Author(s):  
John Tsiligaridis

The purpose of this chapter is to present a set of algorithms, and their efficiency, for the consistency-based Multiple Sequence Alignment (MSA) problem. Based on the strength and adaptability of the Genetic Algorithm (GA), two approaches are developed depending on the MSA type. The first approach, for unrelated sequences (no consistency), involves a Hybrid Genetic Algorithm (GA_TS) that also incorporates Tabu Search (TS). The Traveling Salesman Problem (TSP) is also applied to determine MSA orders. The second approach, for sequences with consistency, uses a hybrid GA based on the Divide and Conquer principle (DCP), which can save space. A consistent dot matrices (CDM) algorithm discovers consistency and creates the MSA. The proposed GA (GA_TS_VS) also uses TS, but it works with partitions. In conclusion, GAs are stochastic approaches that prove very beneficial for MSA in terms of performance.


Author(s):  
Vladimir Smirnov ◽  
Tandy Warnow

Motivation: The estimation of large multiple sequence alignments (MSAs) is a basic bioinformatics challenge. Divide-and-conquer is a useful approach that has been shown to improve the scalability and accuracy of MSA estimation in established methods such as SATé and PASTA. In these divide-and-conquer strategies, a sequence dataset is divided into disjoint subsets, alignments are computed on the subsets using base MSA methods (e.g. MAFFT), and then merged together into an alignment on the full dataset. Results: We present MAGUS, Multiple sequence Alignment using Graph clUStering, a new technique for computing large-scale alignments. MAGUS is similar to PASTA in that it uses nearly the same initial steps (starting tree, similar decomposition strategy, and MAFFT to compute subset alignments), but then merges the subset alignments using the Graph Clustering Merger, a new method for combining disjoint alignments that we present in this study. Our study, on a heterogeneous collection of biological and simulated datasets, shows that MAGUS produces improved accuracy and is faster than PASTA on large datasets, and matches it on smaller datasets. Availability and implementation: MAGUS: https://github.com/vlasmirnov/MAGUS Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Rabah Lebsir ◽  
Abdesslem Layeb ◽  
Tahi Fariza

This paper presents a strategy for the Multiple Sequence Alignment (MSA) problem, one of the most important tasks in biological sequence analysis. MSA aligns sequences in their entirety to derive relationships and common characteristics among a set of protein or nucleotide sequences. The MSA problem has been proven NP-hard. The proposed strategy builds on the well-known divide-and-conquer paradigm: a novel method clusters the sequences as a preliminary step to improve the final alignment, and this decomposition can be used as an optimization procedure with any MSA aligner to explore promising alignments in the search space. The authors align the clusters in a parallel and distributed way to benefit from parallel architectures. The strategy was tested on classical benchmarks such as BAliBASE, Sabre, Prefab4 and Oxm, and the experimental results show that it compares well with other aligners.
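The abstract does not say how the sequences are clustered, so the following Python sketch only illustrates the general idea of a clustering pre-step. It swaps in a common, simple technique: greedy clustering on the Jaccard similarity of k-mer sets. The function names, the `threshold` and `k` parameters, and the toy sequences are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int = 3) -> Set[str]:
    """Set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def greedy_cluster(seqs: Dict[str, str],
                   threshold: float = 0.5,
                   k: int = 3) -> List[List[str]]:
    """Assign each sequence to the first cluster whose representative
    (its first member) is at least `threshold` similar, else open a new one.

    Assumption: this greedy scheme is a stand-in, not the paper's method.
    """
    clusters: List[Tuple[Set[str], List[str]]] = []
    for name, seq in seqs.items():
        profile = kmers(seq, k)
        for representative, members in clusters:
            if jaccard(profile, representative) >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was similar enough.
            clusters.append((profile, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    toy = {
        "a": "ACGTACGT",
        "b": "ACGTACGA",
        "c": "TTTTGGGG",
        "d": "TTTTGGGC",
    }
    print(greedy_cluster(toy))
```

Each resulting cluster could then be handed to any MSA aligner independently, which is what makes the parallel and distributed alignment of clusters described in the abstract possible.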

