A SPACE EFFICIENT BIT-PARALLEL ALGORITHM FOR THE MULTIPLE STRING MATCHING PROBLEM

Finite (nondeterministic) automata are very useful building blocks in the field of string matching. This is particularly true of multiple pattern matching, where factor-based automata can substantially reduce the number of computational steps when the patterns have large common factors. Direct simulation of nondeterministic automata can be performed very efficiently using the bit-parallelism technique, though this is not necessarily true for factor-based automata. In this paper we present an algorithm for the multiple string matching problem, based on the bit-parallel simulation of nondeterministic factor-based automata which satisfy a particular ordering condition. We also show how to enforce such a condition by suitably modifying a minimal initial automaton through equivalence-preserving transformations. The resulting automaton turns out to be smaller than the corresponding maximal automata used by existing bit-parallel algorithms, since those algorithms do not take any advantage of common factors in the patterns.
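
For orientation, the kind of bit-parallel simulation the abstract refers to can be sketched with the classical Shift-And technique applied to several patterns at once. This is not the paper's factor-based algorithm: it simulates the plain "one chain of states per pattern" automaton, which shares nothing between patterns. The function name `multi_shift_and` and the output format are assumptions made for this illustration, and patterns are assumed to be non-empty.

```python
def multi_shift_and(patterns, text):
    """Return (end_position, pattern_index) for every occurrence.

    Illustrative Shift-And over the concatenation of all patterns;
    each pattern character owns one bit of the state word.
    """
    masks = {}   # character -> bits of the positions holding that character
    init = 0     # bits of the first position of every pattern
    final = {}   # bit index of a pattern's last position -> pattern index
    offset = 0
    for idx, pat in enumerate(patterns):
        init |= 1 << offset
        for j, ch in enumerate(pat):
            masks[ch] = masks.get(ch, 0) | (1 << (offset + j))
        final[offset + len(pat) - 1] = idx
        offset += len(pat)
    final_mask = sum(1 << b for b in final)

    state = 0
    hits = []
    for pos, ch in enumerate(text):
        # Advance every active state by one and restart every pattern.
        state = ((state << 1) | init) & masks.get(ch, 0)
        found = state & final_mask
        while found:
            low = found & -found              # lowest set bit
            hits.append((pos, final[low.bit_length() - 1]))
            found ^= low
    return hits
```

The state word here needs one bit per pattern character in total, which is the space cost that an automaton exploiting common factors aims to reduce.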

PARALLEL PATTERN MATCHING WITH SCALING

Parallel Processing Letters ◽

10.1142/s0129626401000476 ◽

2001 ◽

Vol 11 (01) ◽

pp. 125-138 ◽

Cited By ~ 1

Author(s):

H. MONGELLI ◽

S. W. SONG

Keyword(s):

Parallel Algorithms ◽

Parallel Algorithm ◽

Pattern Matching ◽

Computing Time ◽

Parallel Machine ◽

Experimental Results ◽

Coarse Grained ◽

Matching Problem ◽

Communication Round ◽

Parallel Pattern

Given a text and a pattern, the problem of pattern matching consists of determining all the positions of the text where the pattern occurs. When the text and the pattern are matrices, the matching is termed bidimensional. There are variations of this problem in which the pattern may be modified in some way before matching. The modification we allow is scaling of the pattern. We propose a new parallel algorithm for this problem, under the CGM (Coarse Grained Multicomputer) model. This algorithm requires local computing time linear in the input size and linear memory, and uses only one communication round, during which at most a linear amount of data is exchanged. To the best of our knowledge, there are no known parallel algorithms for the bidimensional pattern matching problem with scaling in the literature. The proposed algorithm was implemented in C, using the PVM interface, and was executed on a Parsytec PowerXplorer parallel machine. The experimental results obtained were very promising and showed significant speedups.

Parallel Algorithms for the Boxed-Mesh Permutation Pattern Matching Problem

Journal of KIISE ◽

10.5626/jok.2019.46.4.299 ◽

2019 ◽

Vol 46 (4) ◽

pp. 299-307

Author(s):

Jihyo Choi ◽

Youngho Kim ◽

Joong Chae Na ◽

Jeong Seop Sim

Keyword(s):

Parallel Algorithms ◽

Pattern Matching ◽

Matching Problem ◽

Permutation Pattern

The complexity of the Multiple Pattern Matching Problem for random strings

2018 Proceedings of the Fifteenth Workshop on Analytic Algorithmics and Combinatorics (ANALCO) ◽

10.1137/1.9781611975062.5 ◽

2018 ◽

pp. 40-53

Author(s):

Frédérique Bassino ◽

Tsinjo Rakotoarimalala ◽

Andrea Sportiello

Keyword(s):

Pattern Matching ◽

Matching Problem ◽

Multiple Pattern Matching

A Space-Efficient Hashing-Based Algorithm for Order-Preserving Multiple Pattern Matching Problem

KIISE Transactions on Computing Practices ◽

10.5626/ktcp.2018.24.8.399 ◽

2018 ◽

Vol 24 (8) ◽

pp. 399-404

Author(s):

Jeonghoon Park ◽

Youngho Kim ◽

Jeong Seop Sim

Keyword(s):

Pattern Matching ◽

Matching Problem ◽

Multiple Pattern Matching

Parallel Algorithms for String Matching Problem on Single and Two Dimensional Reconfigurable Pipelined Bus Systems

Journal of Computer Science ◽

10.3844/jcssp.2007.754.759 ◽

2007 ◽

Vol 3 (9) ◽

pp. 754-759 ◽

Cited By ~ 5

Author(s):

S. Viswanadha Raju ◽

A. Vinaya Babu

Keyword(s):

Parallel Algorithms ◽

String Matching ◽

Two Dimensional ◽

Matching Problem

Two Dimensional Matching

Pattern Matching Algorithms ◽

10.1093/oso/9780195113679.003.0012 ◽

1997 ◽

Author(s):

A. Amir ◽

M. Farach

Keyword(s):

Pattern Matching ◽

String Matching ◽

Higher Dimensions ◽

Natural Generalization ◽

Theoretical Problem ◽

Two Dimensional ◽

Exact Matching ◽

Matching Problem ◽

Deterministic Algorithms ◽

Special Case

String matching is a basic theoretical problem in computer science, but has been useful in implementing various text editing tasks. The explosion of multimedia requires an appropriate generalization of string matching to higher dimensions. The first natural generalization is that of seeking the occurrences of a pattern in a text where both pattern and text are rectangles. The last few years saw tremendous activity in two dimensional pattern matching algorithms. We naturally had to limit the amount of information that entered this chapter. We chose to concentrate on serial deterministic algorithms for some of the basic issues of two dimensional matching. Throughout this chapter we define our problems in terms of squares rather than rectangles; however, all results presented easily generalize to rectangles. The Exact Two Dimensional Matching Problem is defined as follows: . . . INPUT: Text array T[n x n] and pattern array P[m x m]. OUTPUT: All locations [i, j] in T where there is an occurrence of P, i.e. T[i+k, j+l] = P[k+1, l+1] for 0 ≤ k, l ≤ m-1. . . . A natural way of solving any generalized problem is by reducing it to a special case whose solution is known. It is therefore not surprising that most solutions to the two dimensional exact matching problem use exact string matching algorithms in one way or another. In this section, we present an algorithm for two dimensional matching which relies on reducing a matrix of characters into a one dimensional array. Let P'[1 .. m] be a pattern which is derived from P by setting P'[i] = P[i, 1] P[i, 2] ... P[i, m], that is, the ith character of P' is the ith row of P. Let Ti[1 .. n-m+1], for 1 ≤ i ≤ n, be a set of arrays such that Ti[j] = T[i, j] T[i, j+1] ... T[i, j+m-1]. Clearly, P occurs at T[i, j] iff P' occurs at Ti[j].
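
The reduction described above can be sketched as follows. This is a naive illustration under stated assumptions, not the chapter's algorithm: rows are treated as single "meta-characters" and compared directly, whereas an efficient solution would run a one dimensional exact matcher over them. The name `match_2d` is hypothetical, indices are 0-based rather than the 1-based indices of the text, and both arrays are assumed to be square lists of equal-length rows.

```python
def match_2d(text, pattern):
    """Return all 0-based (i, j) where the m x m pattern occurs in the n x n text."""
    n, m = len(text), len(pattern)
    # p_rows[k] plays the role of P'[k+1]: one meta-character per pattern row.
    p_rows = [tuple(row) for row in pattern]
    # t_rows[i][j] plays the role of Ti[j]: the length-m slice of row i starting at j.
    t_rows = [[tuple(row[j:j + m]) for j in range(n - m + 1)] for row in text]

    hits = []
    for i in range(n - m + 1):
        for j in range(n - m + 1):
            # P occurs at (i, j) iff the m meta-characters of P' line up
            # with t_rows[i][j], t_rows[i+1][j], ..., t_rows[i+m-1][j].
            if all(t_rows[i + k][j] == p_rows[k] for k in range(m)):
                hits.append((i, j))
    return hits
```

For example, with text rows "abab", "baba", "abab", "baba" and pattern rows "ab", "ba", the sketch reports the five top-left corners at which the 2 x 2 pattern fits.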

The exact multiple pattern matching problem solved by a reference tree approach

Theoretical Computer Science ◽

10.1016/j.tcs.2021.06.003 ◽

2021 ◽

Author(s):

Yi-Kung Shieh ◽

Shyong Jian Shyu ◽

Chin Lung Lu ◽

Richard Chia-Tung Lee

Keyword(s):

Pattern Matching ◽

Matching Problem ◽

Reference Tree ◽

Tree Approach ◽

Multiple Pattern Matching

A NOVEL ALGORITHM FOR SOLVING THE STRING MATCHING PROBLEM

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026806002040 ◽

2006 ◽

Vol 06 (04) ◽

pp. 499-510 ◽

Cited By ~ 1

Author(s):

IBRAHIEM M. M. EL EMARY ◽

MOHAMMED S. M. JABER

Keyword(s):

Pattern Matching ◽

Word Length ◽

String Matching ◽

Specific Pattern ◽

Experimental Results ◽

Search Process ◽

Matching Problem ◽

Pattern Length ◽

Novel Algorithm ◽

Better Than

The string matching problem consists of finding one or more, generally all, exact occurrences of a pattern P in a text T. This paper presents a new algorithm for solving the string matching problem. The proposed algorithm speeds up the search for a specific pattern in a fixed, unchanging text by decreasing the number of character comparisons. The algorithm first reads the pattern to obtain its length and its first character, and then consults a table of two columns: the first column holds a word length occurring in the text, and the second holds the start positions of the words of that length. The algorithm then searches only the words whose length equals that of the pattern. Our experiments compare the performance of our algorithm with well-known pattern matching algorithms such as Boyer–Moore and Boyer–Moore–Galil. The comparison is made in terms of the number of characters compared for different sizes of text. The results show that our algorithm performs better than the others in terms of this parameter.
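
A minimal sketch of the table-based idea, assuming that words are separated by whitespace and that the pattern is a single whole word; the abstract does not specify these details, and the names `build_length_table` and `find_word` are hypothetical.

```python
from collections import defaultdict


def build_length_table(text):
    """Map each word length to the start positions of the words of that length."""
    table = defaultdict(list)
    start = None
    # A trailing space acts as a sentinel that closes the last word.
    for i, ch in enumerate(text + " "):
        if ch.isspace():
            if start is not None:
                table[i - start].append(start)
                start = None
        elif start is None:
            start = i
    return table


def find_word(text, table, pattern):
    """Compare the pattern only against words of the same length."""
    m = len(pattern)
    return [
        s for s in table.get(m, [])
        # Cheap first-character check before the full comparison.
        if text[s] == pattern[0] and text[s:s + m] == pattern
    ]
```

The table is built once for the fixed text, so every later query skips all words whose length differs from the pattern's.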

APPLYING A Q-GRAM BASED MULTIPLE STRING MATCHING ALGORITHM FOR APPROXIMATE MATCHING

Informatyka Automatyka Pomiary w Gospodarce i Ochronie Środowiska ◽

10.5604/01.3001.0010.5214 ◽

2017 ◽

Vol 7 (3) ◽

pp. 47-50

Author(s):

Robert Susik

Keyword(s):

Pattern Matching ◽

String Matching ◽

Matching Algorithm ◽

Approximate Matching ◽

Fast Search ◽

Approximate Pattern Matching ◽

On Line ◽

Multiple Pattern Matching

We consider the application of a multiple pattern matching algorithm (Multi AOSO on q-Grams) to approximate pattern matching. We propose an on-line approach which translates the approximate pattern matching problem into a multiple pattern matching one (a technique called partitioning into exact search). The presented solution allows a relatively fast search for multiple patterns in a text with a given number k of differences (or mismatches). This paper compares the solution based on the MAG algorithm with [4]. Experiments on DNA, English, Proteins and XML texts with up to k errors show that the newly proposed algorithm achieves relatively good results in practical use.
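
Partitioning into exact search can be sketched for the mismatch case as follows: if the pattern is cut into k+1 pieces, any occurrence with at most k mismatches must contain at least one piece unchanged, so exact hits of the pieces yield candidates that are then verified. This sketch is an assumption-laden illustration, not the paper's method: it uses Python's built-in `str.find` in place of the MAG algorithm, handles mismatches only (not insertions or deletions), and the name `approx_find` is hypothetical.

```python
def approx_find(text, pattern, k):
    """Return sorted start positions where pattern matches with at most k mismatches."""
    m = len(pattern)
    if not 0 <= k < m:
        raise ValueError("k must satisfy 0 <= k < len(pattern)")
    pieces = k + 1
    # Cut points splitting the pattern into k+1 non-empty pieces.
    bounds = [m * i // pieces for i in range(pieces + 1)]

    hits = set()
    for a, b in zip(bounds, bounds[1:]):
        piece = pattern[a:b]
        pos = text.find(piece)
        while pos != -1:
            start = pos - a  # where the whole pattern would have to begin
            if 0 <= start <= len(text) - m:
                window = text[start:start + m]
                mismatches = sum(x != y for x, y in zip(window, pattern))
                if mismatches <= k:
                    hits.add(start)
            pos = text.find(piece, pos + 1)
    return sorted(hits)
```

In a multiple pattern setting, the k+1 pieces would be handed to a single multiple pattern matcher in one pass over the text instead of being searched one by one.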

Off-line Parallel Exact String Searching

Pattern Matching Algorithms ◽

10.1093/oso/9780195113679.003.0005 ◽

1997 ◽

Author(s):

Z. Galil ◽

I. Yudkiewicz

Keyword(s):

Parallel Algorithms ◽

Optimal Algorithm ◽

String Matching ◽

Random Access ◽

Sequential Algorithm ◽

Matching Problem ◽

Minimal Position ◽

Two Factors ◽

The One ◽

Time Required

The string matching problem is defined as follows: given a string P0 ... Pm-1 called the pattern and a string T0 ... Tn-1 called the text, find all occurrences of the pattern in the text. The output of a string matching algorithm is a boolean array MATCH[0..n-1] which contains a true value at each position where an occurrence of the pattern starts. Many sequential algorithms are known that solve this problem optimally, i.e., in a linear O(n) number of operations, the most notable of which are the algorithms by Knuth, Morris and Pratt and by Boyer and Moore. In this chapter we limit ourselves to parallel algorithms. All algorithms considered in this chapter are for the parallel random access machine (PRAM) computation model. In the design of parallel algorithms for the various PRAM models, one tries to optimize two factors simultaneously: the number of processors used and the time required by the algorithm. The total number of operations performed, which is the time-processor product, is the measure of optimality. A parallel algorithm is called optimal if it needs the same number of operations as the fastest sequential algorithm. Hence, in the string matching problem, an algorithm is optimal if its time-processor product is linear in the length of the input strings. Apart from having an optimal algorithm, the designer wishes the algorithm to be the fastest possible, where the only limit on the number of processors is the one caused by the time-processor product. The following fundamental lemma given by Brent is essential for understanding the tradeoff between time and processors: Any PRAM algorithm of time t that consists of x elementary operations can be implemented on p processors in O(x/p + t) time. Using Brent's lemma, any algorithm that uses a large number x of processors to run very fast can be implemented on p < x processors, with the same total work, but with an increase in time as described.
A basic problem in the study of parallel algorithms for strings and arrays is finding the maximal/minimal position in an array that holds a certain value.
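
Two of the notions above can be made concrete with a short sequential sketch. The first function builds the MATCH array by direct comparison (quadratic in the worst case, so not one of the optimal algorithms mentioned). The second counts the steps needed when each parallel round is scheduled on p processors, which is the scheduling idea behind Brent's lemma. The names `match_array` and `brent_rounds`, and the per-round work list used as input, are assumptions made for this illustration.

```python
from math import ceil


def match_array(pattern, text):
    """MATCH[i] is True iff an occurrence of pattern starts at text position i."""
    n, m = len(text), len(pattern)
    return [i + m <= n and text[i:i + m] == pattern for i in range(n)]


def brent_rounds(work_per_round, p):
    """Steps needed on p processors when round r performs work_per_round[r] operations.

    Each round of w operations takes ceil(w / p) steps, so the total is
    at most x / p + t for x total operations and t rounds.
    """
    return sum(ceil(w / p) for w in work_per_round)
```

For instance, rounds of 8, 4, 2 and 1 operations (x = 15, t = 4) need 8 steps on p = 2 processors, within the bound x/p + t = 11.5.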
