Parallel Algorithms for String Matching Problem on Single and Two Dimensional Reconfigurable Pipelined Bus Systems

2007 ◽  
Vol 3 (9) ◽  
pp. 754-759 ◽  
Author(s):  
S.Viswanadha Raju ◽  
A.Vinaya Babu
Author(s):  
A. Amir ◽  
M. Farach

String matching is a basic theoretical problem in computer science, but it has also been useful in implementing various text editing tasks. The explosion of multimedia requires an appropriate generalization of string matching to higher dimensions. The first natural generalization is that of seeking the occurrences of a pattern in a text where both pattern and text are rectangles. The last few years have seen tremendous activity in two dimensional pattern matching algorithms. We naturally had to limit the amount of information that entered this chapter. We chose to concentrate on serial deterministic algorithms for some of the basic issues of two dimensional matching. Throughout this chapter we define our problems in terms of squares rather than rectangles; however, all results presented easily generalize to rectangles. The Exact Two Dimensional Matching Problem is defined as follows: . . . INPUT: Text array T[n x n] and pattern array P[m x m]. OUTPUT: All locations [i, j] in T where there is an occurrence of P, i.e. T[i+k, j+l] = P[k+1, l+1] for 0 ≤ k, l ≤ m-1. . . . A natural way of solving any generalized problem is by reducing it to a special case whose solution is known. It is therefore not surprising that most solutions to the two dimensional exact matching problem use exact string matching algorithms in one way or another. In this section, we present an algorithm for two dimensional matching which relies on reducing a matrix of characters into a one dimensional array. Let P'[1 . . . m] be a pattern which is derived from P by setting P'[i] = P[i,1]P[i,2]…P[i,m], that is, the ith character of P' is the ith row of P. Let Ti[1 . . . n-m+1], for 1 ≤ i ≤ n, be a set of arrays such that Ti[j] = T[i, j]T[i, j+1]…T[i, j+m-1]. Clearly, P occurs at T[i, j] iff P' occurs at Ti[j].
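The row reduction described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the chapter's algorithm: the function names `find_1d` and `match_2d` are hypothetical, indices are 0-based rather than 1-based, the one dimensional matcher is a naive scan standing in for any exact string matching algorithm, and "P' occurs at Ti[j]" is read as "the rows of P appear consecutively, starting at row i, in the sequence of length-m row slices taken at column j".

```python
def find_1d(text, pattern):
    """Return all 0-based start positions of pattern in text (naive scan)."""
    m = len(pattern)
    return [s for s in range(len(text) - m + 1) if text[s:s + m] == pattern]


def match_2d(T, P):
    """Return all (i, j) such that the m x m pattern P occurs in T at row i, column j.

    Each row of P becomes one "character" of P'. For a fixed column j, the
    length-m slice of every text row becomes one "character" of a
    one dimensional text, which is then searched for P'.
    """
    n, m = len(T), len(P)
    p_prime = [tuple(row) for row in P]  # i-th character of P' is the i-th row of P
    hits = []
    for j in range(n - m + 1):
        # One dimensional text for column j: the slices T[i][j..j+m-1], i = 0..n-1
        column = [tuple(T[i][j:j + m]) for i in range(n)]
        hits.extend((i, j) for i in find_1d(column, p_prime))
    return sorted(hits)


T = ["abab",
     "baba",
     "abab",
     "baba"]
P = ["ab",
     "ba"]
print(match_2d(T, P))
```

Comparing whole row slices costs O(m) per "character", so this sketch shows only the reduction itself; it makes no attempt to reach the running times discussed in the literature.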


2006 ◽  
Vol 17 (06) ◽  
pp. 1235-1251 ◽  
Author(s):  
DOMENICO CANTONE ◽  
SIMONE FARO

Finite (nondeterministic) automata are very useful building blocks in the field of string matching. This is particularly true in the case of multiple pattern matching, where the use of factor-based automata can substantially reduce the number of computational steps when the patterns have large common factors. Direct simulation of nondeterministic automata can be performed very efficiently using the bit-parallelism technique, though this is not necessarily true for factor-based automata. In this paper we present an algorithm for the multiple string matching problem, based on the bit-parallel simulation of nondeterministic factor-based automata which satisfy a particular ordering condition. We also show how to enforce such a condition by suitably modifying a minimal initial automaton through equivalence-preserving transformations. The resulting automaton turns out to be smaller than the corresponding maximal automata used by existing bit-parallel algorithms, as the latter do not take any advantage of common factors in patterns.
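To make "bit-parallel simulation of a nondeterministic automaton" concrete, here is a minimal sketch of the classical Shift-And technique for a single pattern. It is not the factor-based, multiple-pattern algorithm of this paper; it only shows the basic idea that each bit of a machine word tracks one automaton state. The function name `shift_and` is an assumption made for this example.

```python
def shift_and(text, pattern):
    """Return all 0-based start positions of pattern in text.

    Bit k of `state` is set when pattern[0..k] matches the text ending at
    the current position, so one shift and one AND advance every state of
    the nondeterministic automaton at once.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # masks[ch] has bit k set exactly when pattern[k] == ch
    masks = {}
    for k, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << k)
    accept = 1 << (m - 1)  # bit of the final state
    state = 0
    hits = []
    for pos, ch in enumerate(text):
        # Shift all active states forward, re-activate the initial state,
        # and keep only states whose expected character equals ch.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - m + 1)
    return hits


print(shift_and("abracadabra", "abra"))
```

Python integers are unbounded, so the sketch works for any pattern length; in a fixed-width setting the pattern would have to fit in one machine word or be split across several.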


Author(s):  
Z. Galil ◽  
I. Yudkiewicz

The string matching problem is defined as follows: given a string P0 ... Pm-1 called the pattern and a string T0 ... Tn-1 called the text, find all occurrences of the pattern in the text. The output of a string matching algorithm is a boolean array MATCH[0..n-1] which contains a true value at each position where an occurrence of the pattern starts. Many sequential algorithms are known that solve this problem optimally, i.e., in a linear O(n) number of operations, the most notable of which are the algorithms by Knuth, Morris and Pratt and by Boyer and Moore. In this chapter we limit ourselves to parallel algorithms. All algorithms considered in this chapter are for the parallel random access machine (PRAM) computation model. In the design of parallel algorithms for the various PRAM models, one tries to optimize two factors simultaneously: the number of processors used and the time required by the algorithm. The total number of operations performed, which is the time-processor product, is the measure of optimality. A parallel algorithm is called optimal if it needs the same number of operations as the fastest sequential algorithm. Hence, in the string matching problem, an algorithm is optimal if its time-processor product is linear in the length of the input strings. Apart from having an optimal algorithm, the designer wishes the algorithm to be the fastest possible, where the only limit on the number of processors is the one caused by the time-processor product. The following fundamental lemma given by Brent is essential for understanding the tradeoff between time and processors: Any PRAM algorithm of time t that consists of x elementary operations can be implemented on p processors in O(x/p + t) time. Using Brent's lemma, any algorithm that uses a large number x of processors to run very fast can be implemented on p < x processors, with the same total work, however with an increase in time as described.
A basic problem in the study of parallel algorithms for strings and arrays is finding the maximal/minimal position in an array that holds a certain value.
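Brent's lemma quoted above can be checked on a small worked example. The sketch below assumes the usual scheduling argument: a parallel step that performs x_i operations is simulated by p processors in ceil(x_i / p) rounds, so the total time is at most x/p + t. The function name `brent_schedule_time` and the sample operation counts are made up for illustration.

```python
def brent_schedule_time(ops_per_step, p):
    """Rounds needed when p processors simulate each parallel step in turn.

    ops_per_step[i] is the number of elementary operations in step i;
    a step with x_i operations takes ceil(x_i / p) rounds on p processors.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    return sum((x_i + p - 1) // p for x_i in ops_per_step)


# A tree-like reduction: t = 4 steps, x = 8 + 4 + 2 + 1 = 15 operations.
ops = [8, 4, 2, 1]
x, t = sum(ops), len(ops)
for p in (1, 2, 4, 8):
    rounds = brent_schedule_time(ops, p)
    print(f"p={p}: {rounds} rounds, bound x/p + t = {x / p + t}")
```

With p = 1 the schedule degenerates to the sequential cost x, and once p reaches the widest step the time settles at t, which is the tradeoff the lemma describes.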


2020 ◽  
pp. 116-121
Author(s):  
Armen Kostanyan

The string matching problem (that is, the problem of finding all occurrences of a pattern in the text) is one of the well-known problems in symbolic computations with applications in many areas of artificial intelligence. The most famous algorithms for solving it are the finite state machine method and the Knuth-Morris-Pratt algorithm (KMP). In this paper, we consider the problem of finding all occurrences of a fuzzy pattern in the text. Such a pattern is defined as a sequence of fuzzy properties of text characters. To construct a solution to this problem, we introduce a two-dimensional prefix table, which is a generalization of the one-dimensional prefix array used in the KMP algorithm.
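For reference, the one-dimensional prefix array that this paper generalizes can be sketched as follows. This is the textbook KMP construction for an ordinary (non-fuzzy) pattern, not the two-dimensional prefix table the paper introduces; the names `prefix_array` and `kmp_search` are assumptions made for this example.

```python
def prefix_array(pattern):
    """pi[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    pi = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = pi[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        pi[i] = k
    return pi


def kmp_search(text, pattern):
    """Return all 0-based start positions of pattern in text."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    pi = prefix_array(pattern)
    hits, k = [], 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = pi[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = pi[k - 1]  # continue, allowing overlapping occurrences
    return hits


print(prefix_array("ababaca"))
print(kmp_search("abababa", "aba"))
```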


Author(s):  
Yangjun Chen

In computer engineering, a number of programming tasks, such as the design of interpreters for nonprocedural programming languages, automatic implementation of abstract data types, code optimization in compilers, symbolic computation, context searching in structure editors and automatic theorem proving, involve as a crucial step a special problem, the so-called tree matching problem (Cole & Hariharan, 1997). Recently, it has been shown that this problem can be transformed in linear time to another problem, the so-called subset matching problem (Cole & Hariharan, 2002, 2003), which is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text position is a set of characters drawn from some alphabet S. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i + j - 1], for all j (1 ≤ j ≤ m). This is a generalization of ordinary string matching and is of interest since an efficient algorithm for this problem implies an efficient solution to the tree matching problem. In addition, as shown in (Indyk, 1997), this problem can also be used to solve general string matching and counting matching (Muthukrishnan, 1997; Muthukrishnan & Palem, 1994), and enables us to design efficient algorithms for several geometric pattern matching problems. In this article, we propose a new algorithm for this problem, which needs only O(n + m·n^0.5) time on average in general cases and only O(n + m) time when the size of S is small.
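The definition of subset matching given here translates directly into a brute-force check. The sketch below only illustrates the problem statement and takes O(n·m) set comparisons; it is not the algorithm proposed in the article. The function name `subset_match` is hypothetical, and positions are 0-based, so the condition p[j] ⊆ t[i + j - 1] becomes p[j] ⊆ t[i + j].

```python
def subset_match(t, p):
    """Return all 0-based positions i where p[j] is a subset of t[i + j]
    for every j. Both t and p are sequences of sets of characters."""
    n, m = len(t), len(p)
    return [
        i
        for i in range(n - m + 1)
        if all(p[j] <= t[i + j] for j in range(m))  # <= is the subset test
    ]


text = [{"a", "b"}, {"b"}, {"a", "c"}, {"a", "b", "c"}]
pattern = [{"a"}, {"c"}]
print(subset_match(text, pattern))

# Ordinary string matching is the special case of singleton sets.
as_sets = lambda s: [{ch} for ch in s]
print(subset_match(as_sets("abab"), as_sets("ab")))
```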

