A multiple sequence comparison method

1993 ◽  
Vol 55 (2) ◽  
pp. 465-486 ◽  
Author(s):  
A. K. C. Wong ◽  
S. C. Chan ◽  
D. K. Y. Chiu


2019 ◽  
Author(s):  
Pu Tian

Abstract: Sequence comparison is the cornerstone of bioinformatics and is traditionally realized by alignment. Unfortunately, exponential computational complexity renders rigorous multiple sequence alignment (MSA) intractable. Approximate algorithms and heuristics provide acceptable performance for a relatively small number of sequences, but incur prohibitive computational cost and unbounded accumulation of error for massive sequence sets. Alignment-free algorithms achieve linear computational cost for pairwise sequence comparison, but the challenge of multiple sequence comparison (MSC) remains. Meanwhile, a varying number of parameters and procedures must be empirically adjusted for different MSC tasks, and their complex interactions and impact are not well understood. Developing an efficient, non-parametric global sequence comparison method is therefore essential for handling the explosive growth of sequencing data. It is shown here that the sorted composition vector (SCV), which is based on a physical perspective on sequence composition constraints, is a feasible non-parametric encoding scheme for global protein sequence comparison and classification with linear computational complexity, and that it provides a global atlas tree for natural protein sequences. This finding renders massive sequence comparison and classification, which is infeasible on supercomputers, routine on a workstation. SCV sets an example of one-way encoding that might revolutionize recognition and classification tasks in general.
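The abstract does not spell out how an SCV is built, so the following is only a minimal sketch of one plausible reading of the name: count the 20 standard amino acids, normalize to frequencies, and sort the values in descending order so that residue identity is discarded (the "one-way" aspect). The function names `sorted_composition_vector` and `scv_distance`, the frequency normalization, and the L1 distance are all assumptions made for illustration, not the author's published procedure.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sorted_composition_vector(seq):
    """Hypothetical SCV: residue frequencies sorted in descending order.

    Assumption: non-standard characters are ignored. Runs in time linear
    in the sequence length (the final sort is over a fixed 20 values).
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    freqs = [counts[a] / total for a in AMINO_ACIDS]
    # Sorting drops the information about which residue has which frequency.
    return sorted(freqs, reverse=True)


def scv_distance(seq_a, seq_b):
    """Illustrative L1 distance between two SCVs (an assumed metric)."""
    va = sorted_composition_vector(seq_a)
    vb = sorted_composition_vector(seq_b)
    return sum(abs(x - y) for x, y in zip(va, vb))
```

Under this reading, two sequences with the same frequency profile but different residues (for example `"AAC"` and `"CCA"`) map to the same vector, which is why the encoding cannot be inverted.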



2010 ◽  
Vol 39 (3) ◽  
pp. 325-335
Author(s):  
Junmei Jing ◽  
Conrad J. Burden ◽  
Sylvain Forêt ◽  
Susan R. Wilson


1992 ◽  
Vol 54 (4) ◽  
pp. 563-598 ◽  
Author(s):  
S. C. Chan ◽  
A. K. C. Wong ◽  
D. K. Y. Chiu


PLoS ONE ◽  
2016 ◽  
Vol 11 (7) ◽  
pp. e0158897 ◽  
Author(s):  
Mitchell J. Brittnacher ◽  
Sonya L. Heltshe ◽  
Hillary S. Hayden ◽  
Matthew C. Radey ◽  
Eli J. Weiss ◽  
...  


1995 ◽  
Vol 16 (1) ◽  
pp. 1-22 ◽  
Author(s):  
M. Vingron ◽  
P.A. Pevzner


2021 ◽  
Author(s):  
Yang Young Lu ◽  
Yiwen Wang ◽  
Fang Zhang ◽  
Jiaxing Bai ◽  
Ying Wang

Abstract

Motivation: Understanding the phylogenetic relationships among organisms is key to contemporary evolutionary study, and sequence analysis is the workhorse towards this goal. Conventional approaches to sequence analysis are based on sequence alignment, which is neither scalable to large datasets, owing to its computational inefficiency, nor adaptable to next-generation sequencing (NGS) data. Alignment-free approaches are typically used as computationally efficient alternatives, yet they still suffer from high memory consumption. A desirable large-scale sequence comparison method requires succinctly organized sequence data management as well as prompt sequence retrieval given a never-before-seen query sequence.

Results: In this paper, we propose a novel approach, referred to as SAINT, for efficient and accurate alignment-free sequence comparison. Compared with existing alignment-free sequence comparison methods, SAINT offers advantages in two respects: (1) SAINT is a weakly supervised learning method in which the embedding function is learned automatically from easily acquired data; (2) SAINT uses a non-linear deep learning-based model, which potentially better captures the complicated relationships among genome sequences. We have applied SAINT to real-world datasets to demonstrate its empirical utility, both qualitatively and quantitatively. Considering the extensive applicability of alignment-free sequence comparison methods, we expect SAINT to motivate a broader set of applications in large-scale sequence comparison.

Availability: The open-source, Apache-licensed, Python-implemented code will be available upon acceptance.

Supplementary information: Supplementary data are available at Bioinformatics online.



1990 ◽  
pp. 438-447 ◽  
Author(s):  
David J. Bacon ◽  
Wayne F. Anderson

