Efficient Approximate Pattern Matching Algorithm for Biological Sequences

Journal of Xidian University ◽ 10.37896/jxu14.8/128 ◽ 2020 ◽ Vol 14 (8)

Keyword(s): Pattern Matching ◽ Biological Sequences ◽ Matching Algorithm ◽ Approximate Pattern Matching ◽ Pattern Matching Algorithm

IDPM: An Improved Degenerate Pattern Matching Algorithm for Biological Sequences

International Journal of Foundations of Computer Science ◽ 10.1142/s0129054117500307 ◽ 2017 ◽ Vol 28 (07) ◽ pp. 889-914

Author(s): Jie Lin ◽ Yue Jiang ◽ E. James Harner ◽ Bing-Hua Jiang ◽ Don Adjeroh

Keyword(s): Performance Improvement ◽ Pattern Matching ◽ Linear Time ◽ Computational Cost ◽ Biological Sequences ◽ Matching Problem ◽ Practical Utilization ◽ Matching Algorithm ◽ Pattern Matching Algorithm

Let [Formula: see text] be a string with symbols from an alphabet. [Formula: see text] is said to be degenerate if, at some positions, say [Formula: see text], [Formula: see text] can contain a subset of symbols from the alphabet rather than a single symbol. Given a text string [Formula: see text] and a pattern [Formula: see text], both with symbols from an alphabet [Formula: see text], the degenerate string matching problem is to find the positions in [Formula: see text] where [Formula: see text] occurs, where [Formula: see text], [Formula: see text], or both are allowed to be degenerate. Although some algorithms have been proposed, their high computational cost poses a significant challenge to their practical use. In this work, we propose IDPM, an improved degenerate pattern matching algorithm based on an extension of the Boyer–Moore algorithm. In the preprocessing phase, the algorithm defines an alphabet-independent compatibility rule and computes the shift arrays using respective variants of the bad character and good suffix heuristics. In the search phase, IDPM improves the matching speed by using the compatibility rule. On average, the proposed IDPM algorithm has linear time complexity with respect to the text size and the overall size of the pattern. IDPM demonstrates a significant performance improvement over state-of-the-art approaches. It can be used for fast, practical degenerate pattern matching on large data sets, with important applications in flexible and scalable searching of huge biological sequences.

A Fast Exact Pattern Matching Algorithm for Biological Sequences

2008 International Conference on BioMedical Engineering and Informatics ◽ 10.1109/bmei.2008.154 ◽ 2008

Author(s): Yong Huang ◽ Lingdi Ping ◽ Xuezeng Pan ◽ Guoyong Cai

Keyword(s): Pattern Matching ◽ Biological Sequences ◽ Matching Algorithm ◽ Pattern Matching Algorithm

Approximate Pattern Matching Algorithm

Information Processing and Management of Uncertainty in Knowledge-Based Systems - Communications in Computer and Information Science ◽ 10.1007/978-3-319-40596-4_48 ◽ 2016 ◽ pp. 577-587

Author(s): Petr Hurtik ◽ Petra Hodáková ◽ Irina Perfilieva

Keyword(s): Pattern Matching ◽ Matching Algorithm ◽ Approximate Pattern Matching ◽ Pattern Matching Algorithm

A Fast Hybrid Pattern Matching Algorithm for Biological Sequences

2009 2nd International Conference on Biomedical Engineering and Informatics ◽ 10.1109/bmei.2009.5305645 ◽ 2009

Author(s): Guoyong Cai ◽ Xining Nie ◽ Yong Huang

Keyword(s): Pattern Matching ◽ Biological Sequences ◽ Matching Algorithm ◽ Pattern Matching Algorithm

A Fast Improved Pattern Matching Algorithm for Biological Sequences

2008 International Symposium on Computational Intelligence and Design ◽ 10.1109/iscid.2008.117 ◽ 2008

Author(s): Yong Huang ◽ Lingdi Ping ◽ Xuezeng Pan ◽ Li Jiang ◽ Xiaoning Jiang

Keyword(s): Pattern Matching ◽ Biological Sequences ◽ Matching Algorithm ◽ Pattern Matching Algorithm

A Fast Pattern Matching Algorithm for Biological Sequences

2008 2nd International Conference on Bioinformatics and Biomedical Engineering ◽ 10.1109/icbbe.2008.148 ◽ 2008

Author(s): Yong Huang ◽ Xuezeng Pan ◽ Yunjun Gao ◽ Guoyong Cai

Keyword(s): Pattern Matching ◽ Biological Sequences ◽ Matching Algorithm ◽ Pattern Matching Algorithm

An Optimized Aho-Corasick Multi-Pattern Matching Algorithm for Fast Pattern Matching

2020 IEEE 17th India Council International Conference (INDICON) ◽ 10.1109/indicon49873.2020.9342041 ◽ 2020

Author(s): Uday Trivedi

Keyword(s): Pattern Matching ◽ Matching Algorithm ◽ Pattern Matching Algorithm

A Flexible Pattern-Matching Algorithm for Network Intrusion Detection Systems Using Multi-Core Processors

10.3390/a10020058 ◽ 2017 ◽ Vol 10 (2) ◽ pp. 58

Keyword(s): Intrusion Detection ◽ Pattern Matching ◽ Intrusion Detection Systems ◽ Network Intrusion Detection ◽ Matching Algorithm ◽ Detection Systems ◽ Network Intrusion ◽ Network Intrusion Detection Systems ◽ Pattern Matching Algorithm

Improving Wu-Manber: A Multi-pattern Matching Algorithm

2008 IEEE International Conference on Networking, Sensing and Control ◽ 10.1109/icnsc.2008.4525327 ◽ 2008

Author(s): Chen Zhen ◽ Wu Di

Keyword(s): Pattern Matching ◽ Matching Algorithm ◽ Pattern Matching Algorithm

Clustering oriented hashing based multiple string pattern matching algorithm

2015 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2015] ◽ 10.1109/iccpct.2015.7159288 ◽ 2015

Author(s): Punit Kanuga

Keyword(s): Pattern Matching ◽ Matching Algorithm ◽ Pattern Matching Algorithm
