On a parallel-algorithms method for string matching problems (overview)

String matching is applied across computer science and in science generally. The plethora of string-matching algorithms makes it hard to choose the best one for a particular case. Expressing, measuring, and testing algorithm efficiency is a challenging task with many potential pitfalls. Efficiency can be measured in terms of the usage of different resources: in software engineering, algorithmic productivity is a property of an algorithm's execution identified with the computational resources the algorithm consumes. That resource usage can be determined, and maximum efficiency means minimizing it. Guided by the fact that standard measures of algorithm efficiency, such as execution time, depend directly on the number of executed actions, and without touching on the problems of computer power consumption or memory use, which also depend on the algorithm type and the techniques used in its development, we have developed a methodology that enables researchers to choose an efficient algorithm for a specific domain. The efficiency of string-searching algorithms is usually observed independently of the domain texts being searched. This paper presents the idea that algorithm efficiency depends on the properties of the searched string and of the texts being searched, accompanied by a theoretical analysis of the proposed approach. In the proposed methodology, algorithm efficiency is expressed through the character comparison count metric, a formal quantitative measure independent of implementation subtleties and computer platform differences. The model is developed for a particular problem domain using appropriate domain data (patterns and texts) and, for that domain, ranks algorithms according to the patterns’ entropy.
The proposed approach is limited to on-line exact string-matching problems and is based on the information entropy of the search pattern. Meticulous empirical testing illustrates the implementation of the methodology and supports its soundness.
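
The two quantities named in this abstract, pattern entropy and character comparison count, can be illustrated with a short Python sketch. It is not the authors' methodology: it assumes Shannon entropy over the pattern's character frequencies, and it uses two textbook algorithms (naive search and Horspool) chosen here only as examples. All function names are hypothetical.

```python
import math
from collections import Counter


def pattern_entropy(pattern: str) -> float:
    """Shannon entropy, in bits per character, of the pattern's characters.

    Assumption: entropy is taken over single-character frequencies.
    """
    n = len(pattern)
    counts = Counter(pattern)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def naive_search(text: str, pattern: str):
    """Naive exact matching. Returns (match positions, comparison count)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    n, m = len(text), len(pattern)
    matches, comparisons = [], 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1          # one character comparison
            if text[i + j] != pattern[j]:
                break
            j += 1
        if j == m:
            matches.append(i)
    return matches, comparisons


def horspool_search(text: str, pattern: str):
    """Horspool exact matching. Returns (match positions, comparison count)."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    n, m = len(text), len(pattern)
    # Bad-character shift table built from all but the last pattern character.
    shift = {ch: m - 1 - k for k, ch in enumerate(pattern[:-1])}
    matches, comparisons = [], 0
    i = 0
    while i <= n - m:
        j = m - 1
        while j >= 0:                 # compare the window right to left
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
            j -= 1
        if j < 0:
            matches.append(i)
        i += shift.get(text[i + m - 1], m)
    return matches, comparisons


if __name__ == "__main__":
    text, pattern = "abracadabra", "abra"
    print("entropy:", pattern_entropy(pattern))
    print("naive:", naive_search(text, pattern))
    print("horspool:", horspool_search(text, pattern))
```

Because the comparison counter is incremented at the same logical point in both functions, the counts can be compared directly for the same text and pattern, independent of hardware or timing.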

A New Algorithm for Subset Matching Problem Based on Set-String Transformation

Encyclopedia of Information Communication Technology ◽ 10.4018/978-1-59904-845-1.ch080 ◽ 2009 ◽ pp. 607-615

Author(s): Yangjun Chen

Keyword(s): Programming Languages ◽ Linear Time ◽ String Matching ◽ Special Problem ◽ Computer Engineering ◽ Data Types ◽ Matching Problem ◽ Matching Problems ◽ Abstract Data ◽ Geometric Pattern Matching

In computer engineering, a number of programming tasks involve a special problem, the so-called tree matching problem (Cole & Hariharan, 1997), as a crucial step. Examples include the design of interpreters for nonprocedural programming languages, automatic implementation of abstract data types, code optimization in compilers, symbolic computation, context searching in structure editors, and automatic theorem proving. Recently, it has been shown that this problem can be transformed in linear time into another problem, the so-called subset matching problem (Cole & Hariharan, 2002, 2003), which is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text position is a set of characters drawn from some alphabet S. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i + j - 1] for all j (1 ≤ j ≤ m). This generalizes ordinary string matching and is of interest because an efficient algorithm for it implies an efficient solution to the tree matching problem. In addition, as shown in (Indyk, 1997), this problem can also be used to solve general string matching and counting matching (Muthukrishnan, 1997; Muthukrishnan & Palem, 1994), and it enables the design of efficient algorithms for several geometric pattern matching problems. In this article, we propose a new algorithm for this problem, which needs only O(n + m) time when the size of S is small and O(n + m·n^0.5) time on average in the general case.
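
The definition of subset matching above can be checked directly with a brute-force Python sketch. This is only a naive O(n·m) baseline that follows the definition, not the algorithm proposed in the article; the function name `subset_match` is hypothetical, and positions are 0-based here, whereas the abstract uses 1-based indices.

```python
from typing import List, Set


def subset_match(text: List[Set[str]], pattern: List[Set[str]]) -> List[int]:
    """Return every 0-based position i where pattern[j] is a subset of
    text[i + j] for all j. Brute force: checks each alignment in turn."""
    n, m = len(text), len(pattern)
    return [
        i
        for i in range(n - m + 1)
        if all(pattern[j] <= text[i + j] for j in range(m))  # <= is the subset test
    ]


if __name__ == "__main__":
    text = [{"a", "b"}, {"b"}, {"a", "c"}, {"b", "c"}]
    pattern = [{"a"}, {"b"}]
    print(subset_match(text, pattern))

    # Ordinary exact string matching is the special case of singleton sets.
    as_sets = lambda s: [{ch} for ch in s]
    print(subset_match(as_sets("abab"), as_sets("ab")))
```

When every set holds exactly one character, the subset test reduces to character equality, which is why the problem generalizes ordinary string matching.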

Bit-Parallel Algorithms for Exact Circular String Matching

The Computer Journal ◽ 10.1093/comjnl/bxt023 ◽ 2013 ◽ Vol 57 (5) ◽ pp. 731-743 ◽ Cited By ~ 13

Author(s): K.-H. Chen ◽ G.-S. Huang ◽ R. C.-T. Lee

Keyword(s): Parallel Algorithms ◽ String Matching

Faster algorithms for string matching problems: matching the convolution bound

Proceedings 39th Annual Symposium on Foundations of Computer Science (Cat. No.98CB36280) ◽ 10.1109/sfcs.1998.743440 ◽ 2002 ◽ Cited By ~ 19

Author(s): P. Indyk

Keyword(s): String Matching ◽ Matching Problems

Optimal parallel algorithms for string matching

Proceedings of the sixteenth annual ACM symposium on Theory of computing - STOC '84 ◽ 10.1145/800057.808687 ◽ 1984 ◽ Cited By ~ 19

Author(s): Zvi Galil

Keyword(s): Parallel Algorithms ◽ String Matching

String Matching Artificial Neural Networks

International Journal of Neural Systems ◽ 10.1142/s0129065701000874 ◽ 2001 ◽ Vol 11 (05) ◽ pp. 445-453 ◽ Cited By ~ 2

Author(s): Tatiana Tambouratzis

Keyword(s): Neural Networks ◽ Artificial Neural Networks ◽ Computational Complexity ◽ Building Block ◽ String Matching ◽ Matching Problems ◽ Low Computational Complexity ◽ Artificial Neural ◽ Harmony Theory ◽ Fast Match

Three artificial neural networks (ANNs) are proposed for solving a variety of on- and off-line string matching problems. The ANN structure employed as the building block of these ANNs is derived from the harmony theory (HT) ANN, whereby the resulting string matching ANNs are characterized by fast match-mismatch decisions, low computational complexity, and activation values of the ANN output nodes that can be used as indicators of substitution, insertion (addition) and deletion spelling errors.
