Testing statistical significance scores of sequence comparison methods with structure similarity

2006 ◽  
Vol 7 (1) ◽  
Author(s):  
Tim Hulsen ◽  
Jacob de Vlieg ◽  
Jack AM Leunissen ◽  
Peter MA Groenen
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Andrzej Zielezinski ◽  
Hani Z. Girgis ◽  
Guillaume Bernard ◽  
Chris-Andre Leimeister ◽  
Kujin Tang ◽  
...  

1992 ◽  
Vol 54 (4) ◽  
pp. 563-598 ◽  
Author(s):  
S. C. Chan ◽  
A. K. C. Wong ◽  
D. K. Y. Chiu

Author(s):  
Afshin Fayyaz Movaghar ◽  
Sabine Mercier ◽  
Louis Ferré

We propose an approximate distribution for the gapped local score of a two-sequence comparison. Our method combines an adapted scoring scheme that accounts for gaps with an approximate distribution of the ungapped local score of two independent sequences of i.i.d. random variables. The new scoring scheme is defined on h-tuples of the sequences, using the gapped global score. The influence of h and the accuracy of the p-value are studied numerically and compared with the p-values obtained from BLAST. The numerical experiments indicate that our approximate p-values outperform the BLAST ones, particularly for short sequences, both simulated and real.
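The ungapped local score that this abstract builds on can be illustrated with a small sketch. The code below is not the authors' method: it computes the ungapped local score of two sequences by dynamic programming along diagonals and estimates a p-value by plain Monte Carlo simulation rather than by an analytic approximation. The function names, the match/mismatch scores (+1/-1), the uniform i.i.d. letter model, and the DNA alphabet are all assumptions made for illustration.

```python
import random


def ungapped_local_score(a, b, match=1, mismatch=-1):
    """Best score of any gap-free local alignment of a and b.

    Assumed scoring: +match for identical letters, +mismatch otherwise.
    cur[j] holds the best score of a diagonal segment ending at (x, b[j-1]).
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            s = match if x == y else mismatch
            # Extend the segment on the same diagonal, or restart at zero.
            cur[j] = max(0, prev[j - 1] + s)
            if cur[j] > best:
                best = cur[j]
        prev = cur
    return best


def empirical_p_value(observed, n, m, alphabet="ACGT", trials=200, seed=0):
    """Monte Carlo estimate of P(score >= observed) for two independent
    sequences of lengths n and m with uniform i.i.d. letters (an assumption)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.choice(alphabet) for _ in range(n)]
        b = [rng.choice(alphabet) for _ in range(m)]
        if ungapped_local_score(a, b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)


if __name__ == "__main__":
    score = ungapped_local_score("TTACGTT", "GGACGGG")
    print(score, empirical_p_value(score, 7, 7))
```

A simulation like this becomes expensive for small p-values, which is one reason analytic approximations of the kind described in the abstract are of interest.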


1991 ◽  
Vol 4 (4) ◽  
pp. 375-383 ◽  
Author(s):  
Patrick Argos ◽  
Martin Vingron ◽  
Gerhard Vogt

2021 ◽  
Author(s):  
Yang Young Lu ◽  
Yiwen Wang ◽  
Fang Zhang ◽  
Jiaxing Bai ◽  
Ying Wang

Abstract

Motivation: Understanding the phylogenetic relationships among organisms is key to contemporary evolutionary study, and sequence analysis is the workhorse towards this goal. Conventional approaches to sequence analysis are based on sequence alignment, which is neither scalable to large datasets, owing to its computational inefficiency, nor adaptable to next-generation sequencing (NGS) data. Alignment-free approaches are typically used as computationally efficient alternatives, yet they still suffer from high memory consumption. A desirable large-scale sequence comparison method requires succinctly organized sequence data management as well as prompt sequence retrieval given a never-before-seen query sequence.

Results: In this paper, we propose a novel approach, referred to as SAINT, for efficient and accurate alignment-free sequence comparison. Compared to existing alignment-free sequence comparison methods, SAINT offers advantages in two respects: (1) SAINT is a weakly supervised learning method in which the embedding function is learned automatically from easily acquired data; (2) SAINT uses a non-linear, deep learning-based model that potentially better captures the complicated relationships among genome sequences. We have applied SAINT to real-world datasets to demonstrate its empirical utility, both qualitatively and quantitatively. Considering the extensive applicability of alignment-free sequence comparison methods, we expect SAINT to motivate a more extensive set of applications in large-scale sequence comparison.

Availability: The open-source, Apache-licensed, Python-implemented code will be available upon acceptance.

Supplementary information: Supplementary data are available at Bioinformatics online.

