Weak factor automata: the failure of failure factor oracles?

In indexing of, and pattern matching on, DNA and text sequences, it is often important to represent all factors of a sequence. One efficient, compact representation is the factor oracle (FO). At the same time, any classical deterministic finite automaton (DFA) can be transformed into a so-called failure DFA (FDFA), which may use failure transitions to replace multiple symbol transitions, potentially yielding a more compact representation. We combine the two ideas and directly construct a failure factor oracle (FFO) from a given sequence, in contrast to ex post facto transformation to an FDFA. The algorithm is suitable for both short and long sequences. We empirically compared the resulting FFOs and FOs on the number of transitions for many DNA sequences of lengths 4 to 512, showing gains of up to 10% in the total number of transitions, with failure transitions also taking up less space than symbol transitions. The resulting FFOs can be used for indexing, as well as in a variant of the FO-using backward oracle matching algorithm. We discuss and classify this pattern matching algorithm in terms of the keyword pattern matching taxonomies of Watson, Cleophas and Zwaan. We also empirically compared the use of FOs and FFOs in such backward reading pattern matching algorithms, using both DNA and natural language (English) data sets. The results indicate that the decrease in pattern matching performance of an algorithm using an FFO instead of an FO may outweigh the gain in representation space.
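
For readers unfamiliar with the plain factor oracle that the abstract starts from, the following is a minimal sketch of the standard online FO construction. It is not the FFO construction of the paper. The function names `build_factor_oracle`, `accepts`, and `count_transitions` are hypothetical, chosen for illustration only.

```python
def build_factor_oracle(word):
    """Build a plain factor oracle for `word` (illustrative sketch, not an FFO).

    Returns (trans, supply): trans[s] maps a symbol to a target state,
    supply[s] is the supply (suffix-link) state used during construction.
    States are numbered 0..len(word) and every state is accepting.
    """
    n = len(word)
    trans = [dict() for _ in range(n + 1)]
    supply = [-1] * (n + 1)
    for i, c in enumerate(word):
        # Internal transition along the word itself.
        trans[i][c] = i + 1
        # Follow supply links, adding external transitions where c is missing.
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, s):
    """Return True if the oracle can read all of `s` starting from state 0."""
    state = 0
    for c in s:
        if c not in trans[state]:
            return False
        state = trans[state][c]
    return True


def count_transitions(trans):
    """Total number of symbol transitions, the size measure used in the abstract."""
    return sum(len(t) for t in trans)
```

A factor oracle accepts every factor of the word but may also accept some strings that are not factors, which is why algorithms built on it treat a successful read only as a candidate. For the word `abbbaab`, for instance, the oracle built this way also reads `aba`, which is not a factor.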

EPMA: Efficient pattern matching algorithm for DNA sequences

Expert Systems with Applications ◽

10.1016/j.eswa.2017.03.026 ◽

2017 ◽

Vol 80 ◽

pp. 162-170 ◽

Cited By ~ 6

Author(s):

Muhammad Tahir ◽

Muhammad Sardaraz ◽

Ataul Aziz Ikram

Keyword(s):

Pattern Matching ◽

Dna Sequences ◽

Matching Algorithm ◽

Pattern Matching Algorithm

Fragment Finder 2.0: a computing server to identify structurally similar fragments

Journal of Applied Crystallography ◽

10.1107/s0021889812001501 ◽

2012 ◽

Vol 45 (2) ◽

pp. 332-334 ◽

Cited By ~ 5

Author(s):

R. Nagarajan ◽

S. Siva Balan ◽

R. Sabarinathan ◽

M. Kirti Vaishnavi ◽

K. Sekar

Keyword(s):

Search Engine ◽

Pattern Matching ◽

World Wide ◽

Data Sets ◽

Web Based ◽

Matching Algorithm ◽

Protein Fragments ◽

Interactive Computing ◽

Structural Superposition ◽

The World

Fragment Finder 2.0 is a web-based interactive computing server which can be used to retrieve structurally similar protein fragments from 25 and 90% nonredundant data sets. The computing server identifies structurally similar fragments using the protein backbone Cα angles. In addition, the identified fragments can be superimposed using either of the two structural superposition programs, STAMP and PROFIT, provided in the server. The freely available Java plug-in Jmol has been interfaced with the server for the visualization of the query and superposed fragments. The server is the updated version of a previously developed search engine and employs an in-house-developed fast pattern matching algorithm. This server can be accessed freely over the World Wide Web through the URL http://cluster.physics.iisc.ernet.in/ff/.

Fast bitwise pattern-matching algorithm for DNA sequences on modern hardware

TURKISH JOURNAL OF ELECTRICAL ENGINEERING & COMPUTER SCIENCES ◽

10.3906/elk-1304-165 ◽

2015 ◽

Vol 23 ◽

pp. 1405-1417 ◽

Cited By ~ 3

Author(s):

Gıyasettin ÖZCAN ◽

Osman Sabri ÜNSAL

Keyword(s):

Pattern Matching ◽

Dna Sequences ◽

Matching Algorithm ◽

Pattern Matching Algorithm

A Faster Pattern Matching Algorithm for Intrusion Detection

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1414 ◽

2012 ◽

Vol 532-533 ◽

pp. 1414-1418 ◽

Cited By ~ 1

Author(s):

Feng Du

Keyword(s):

Intrusion Detection ◽

Pattern Matching ◽

Time Complexity ◽

Matching Algorithm ◽

Matching Performance ◽

Bm Algorithm ◽

Pattern Matching Algorithm ◽

Improved Algorithm

In this paper, a faster algorithm, BMF, is proposed, which improves on the time complexity of the BM algorithm. The BMF algorithm defines a new pre-calculation function that significantly increases the pattern skips. Experiments indicate that the time complexity is reduced by up to 63%. The improved algorithm could therefore provide a significant improvement in pattern matching performance when used in an IDS.
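
The abstract does not specify BMF's pre-calculation function. As background on what a precomputed skip table does in the BM family, here is a minimal sketch of Horspool's simplification of BM, a swapped-in baseline and not the BMF algorithm. The function name `horspool_search` is hypothetical.

```python
def horspool_search(text, pattern):
    """Return all start positions of `pattern` in `text` (Horspool variant of BM).

    Illustrative baseline only; this is NOT the BMF algorithm from the abstract.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Pre-calculation: for each symbol in pattern[:-1], the distance from its
    # last occurrence to the end of the pattern. Other symbols skip m positions.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Skip according to the text symbol aligned with the pattern's last position.
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

Larger values in the skip table mean fewer window positions are examined, which is the kind of quantity a pre-calculation function in this family aims to increase.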

V15. Ex post facto structural connectome analysis in ALS at multicenter level: Analysis of over 400 data sets from 8 centers

Clinical Neurophysiology ◽

10.1016/j.clinph.2015.04.093 ◽

2015 ◽

Vol 126 (8) ◽

pp. e72-e73

Author(s):

H.-P. Müller ◽

G. Grön ◽

S. Abrahams ◽

P. Bede ◽

M. Filippi ◽

...

Keyword(s):

Data Sets ◽

Structural Connectome ◽

Ex Post ◽

Ex Post Facto ◽

Level Analysis

High speed pattern matching algorithm based on deterministic finite automata with faulty transition table

Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems - ANCS '10 ◽

10.1145/1872007.1872017 ◽

2010 ◽

Author(s):

Jan Kastil ◽

Jan Korenek

Keyword(s):

Pattern Matching ◽

High Speed ◽

Finite Automata ◽

Matching Algorithm ◽

Deterministic Finite Automata ◽

Transition Table ◽

Pattern Matching Algorithm

A Network Intrusion Detection System with Improved SBOM Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.336-338.2419 ◽

2013 ◽

Vol 336-338 ◽

pp. 2419-2422

Author(s):

Li Hua Zhou

Keyword(s):

Intrusion Detection ◽

Pattern Matching ◽

Intrusion Detection System ◽

Detection System ◽

Network Intrusion Detection ◽

Data Sets ◽

Matching Algorithm ◽

Network Intrusion ◽

Network Intrusion Detection System ◽

Detection Evaluation

The efficiency of the pattern matching algorithm used in the detection engine determines the performance of an intrusion detection system. This paper improves the data structure of the SBOM algorithm, a well-known keyword matching algorithm, so that keywords can be added or removed dynamically. Experiments on the 1999 DARPA Intrusion Detection Evaluation Data Sets indicate that the implemented NIDS (Network Intrusion Detection System) performs comparatively well for large keyword sets.

A 3D pattern matching algorithm for DNA sequences

Bioinformatics ◽

10.1093/bioinformatics/btl669 ◽

2007 ◽

Vol 23 (6) ◽

pp. 680-686 ◽

Cited By ~ 10

Author(s):

J. Herisson ◽

G. Payen ◽

R. Gherbi

Keyword(s):

Pattern Matching ◽

Dna Sequences ◽

Matching Algorithm ◽

Pattern Matching Algorithm

An Advanced Algorithm for Finding Tandem Repeats in DNA Sequencing based on Text Mining

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8607.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 4058-4062

Keyword(s):

Pattern Matching ◽

Tandem Repeat ◽

Dna Sequences ◽

Tandem Repeats ◽

Critical Issue ◽

Effective Algorithm ◽

Data Set ◽

Quality Of Results ◽

Matching Algorithm

In this paper, a new searching technique based on pattern matching is proposed. In bioinformatics, finding Tandem Repeats (TR) in DNA sequences is a critical issue. Many pattern matching algorithms exist; KMP (Knuth-Morris-Pratt) is one such algorithm, and it suffers from runtime complexity and cost as the size of the data set increases. The main aim of the paper is to develop an algorithm that detects and identifies Tandem Repeats in a DNA sequence more efficiently. By introducing a two-dimensional matrix to narrow the search scope and optimize the problem, the Tandem Repeat finding algorithm makes the detection and identification process more efficient and effective, which improves the quality of results. The theoretical analysis and experimental results show that the Tandem Repeat finding algorithm obtains equivalent results in less runtime. The algorithm outperforms KMP in determining results, and it also reduces the runtime cost, which is beneficial as DNA data sets grow larger.

Parallel Combining Different Approaches to Multi-pattern Matching for FPGA-based Security Systems

Advances in Cyber-Physical Systems ◽

10.23939/acps2020.01.008 ◽

2017 ◽

Vol 5 (1) ◽

pp. 8-15

Author(s):

Sergii Hilgurt

Keyword(s):

Information Security ◽

Pattern Matching ◽

Intrusion Detection System ◽

Detection System ◽

Finite Automata ◽

Network Intrusion Detection ◽

Security Systems ◽

Analytical Technique ◽

Network Intrusion ◽

Pros And Cons

Multi-pattern matching is a fundamental technique found in applications such as network intrusion detection systems, anti-virus and anti-worm software, and other signature-based information security tools. Due to rising traffic rates, the increasing number and sophistication of attacks, and the collapse of Moore’s law, traditional software solutions can no longer keep up. Therefore, developers frequently use hardware approaches to accelerate pattern matching. Reconfigurable FPGA-based devices, providing the flexibility of software and near-ASIC performance, have become increasingly popular for this purpose. Hence, increasing the efficiency of reconfigurable information security tools is now a scientific issue. Many different approaches to constructing hardware matching circuits on FPGAs are known. The most widely used of them are based on discrete comparators, hash functions and finite automata. Each approach has its own pros and cons, and none has yet become the leading one. In this paper, a method for combining several different approaches to exploit their advantages has been developed. An analytical technique has been proposed for quickly estimating in advance the resource costs of each matching scheme, without the need to compile the FPGA project. It allows optimization procedures to be applied to split the set of patterns near-optimally between the different approaches in acceptable time.
