A Note on Linear Time Simulation of Deterministic Two-Way Pushdown Automata

Neil D. Jones

doi:10.7146/dpb.v6i75.6492

A Note on Linear Time Simulation of Deterministic Two-Way Pushdown Automata

DAIMI Report Series ◽

10.7146/dpb.v6i75.6492 ◽

1977 ◽

Vol 6 (75) ◽

Author(s):

Neil D. Jones

Keyword(s):

Pattern Matching ◽

Data Structures ◽

Linear Time ◽

Random Access ◽

Simulation Algorithm ◽

Pushdown Automaton ◽

Matching Problems ◽

Time Result ◽

Time Simulation ◽

Pushdown Automata

Cook has shown that any deterministic two-way pushdown automaton can be simulated by a uniform-cost random access machine in time O(n) for inputs of length n. The result was of interest because such a machine is a natural model for a variety of backtracking algorithms, particularly as used in pattern matching problems. The linear time result was surprising because such machines may run as many as 2^n steps before halting; similar problems with 'combinatorial explosions' are well known to occur in applications of backtracking. Cook's result inspired the development of a number of efficient pattern matching algorithms.

However, it is impractical to use Cook's algorithm directly to do pattern matching, since it involves a large constant time factor and much storage. The purpose of this note is to present an alternate, simpler simulation algorithm which involves consideration only of the configurations actually reached by the automaton. It can be expected to run faster and use less storage (depending on the data structures used), thus bringing Cook's result a step closer to practical utility.

Indexing ordered trees for (nonlinear) tree pattern matching by pushdown automata

Computer Science and Information Systems ◽

10.2298/csis111220024t ◽

2012 ◽

Vol 9 (3) ◽

pp. 1125-1153

Author(s):

J. Travnícek ◽

J. Janousek ◽

B. Melichar

Keyword(s):

Pattern Matching ◽

Data Structures ◽

Input Pattern ◽

Pushdown Automaton ◽

Ordered Trees ◽

Tree Pattern ◽

Tree Patterns ◽

Tree Pattern Matching ◽

Ordered Tree ◽

Pushdown Automata

Trees are one of the fundamental data structures used in Computer Science. We present a new kind of acyclic pushdown automata, the tree pattern pushdown automaton and the nonlinear tree pattern pushdown automaton, constructed for an ordered tree. These automata accept all tree patterns and nonlinear tree patterns, respectively, which match the tree and represent a full index of the tree for such patterns. Given a tree with n nodes, the numbers of these distinct tree patterns and nonlinear tree patterns can be at most 2^(n-1) + n and at most (2+v)^(n-1) + 2, respectively, where v is the maximal number of nonlinear variables allowed in nonlinear tree patterns. The total sizes of nondeterministic versions of the two pushdown automata are O(n) and O(n^2), respectively. We discuss the time complexities and show timings of our implementations using the bit-parallelism technique. The timings show that for a given tree the running time is linear in the size of the input pattern.

A note on linear time simulation of deterministic two-way pushdown automata

Information Processing Letters ◽

10.1016/0020-0190(77)90022-9 ◽

1977 ◽

Vol 6 (4) ◽

pp. 110-112 ◽

Cited By ~ 11

Author(s):

Neil D. Jones

Keyword(s):

Linear Time ◽

Time Simulation ◽

Pushdown Automata

On Tree Pattern Matching by Pushdown Automata

Acta Polytechnica ◽

10.14311/1113 ◽

2009 ◽

Vol 49 (2) ◽

Author(s):

T. Flouri

Keyword(s):

Programming Languages ◽

Pattern Matching ◽

Systematic Approach ◽

Term Rewriting ◽

Pushdown Automaton ◽

Tree Pattern ◽

Tree Pattern Matching ◽

String Pattern ◽

Pushdown Automata ◽

Mechanical Theorem

Tree pattern matching is an important operation in Computer Science on which a number of tasks, such as mechanical theorem proving, term rewriting, symbolic computation and non-procedural programming languages, are based. Work has begun on a systematic approach to the construction of tree pattern matchers by deterministic pushdown automata which read subject trees in prefix notation. The method is analogous to the construction of string pattern matchers: for given patterns, a non-deterministic pushdown automaton is created and then determinised. In this first paper, we present the proposed non-deterministic pushdown automaton, which will serve as a basis for the determinisation process, and prove its correctness.

Subtree matching by pushdown automata

Computer Science and Information Systems ◽

10.2298/csis1002331f ◽

2010 ◽

Vol 7 (2) ◽

pp. 331-357 ◽

Cited By ~ 8

Author(s):

Tomás Flouri ◽

Jan Janousek ◽

Bořivoj Melichar

Keyword(s):

Programming Languages ◽

Theorem Proving ◽

Systematic Approach ◽

Finite Automata ◽

Term Rewriting ◽

Mechanical Theorem Proving ◽

Pushdown Automaton ◽

String Pattern ◽

Pushdown Automata ◽

Mechanical Theorem

Subtree matching is an important problem in Computer Science on which a number of tasks, such as mechanical theorem proving, term rewriting, symbolic computation and nonprocedural programming languages, are based. A systematic approach to the construction of subtree pattern matchers by deterministic pushdown automata, which read subject trees in prefix and postfix notation, is presented. The method is analogous to the construction of string pattern matchers: for a given pattern, a nondeterministic pushdown automaton is created and is then determinised. In addition, it is shown that the size of the resulting deterministic pushdown automata directly corresponds to the size of the existing string pattern matchers based on finite automata.
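
To make the prefix-notation view concrete, the following is a minimal sketch, not the pushdown automaton construction from the paper. It assumes trees are given as hypothetical `(label, children)` tuples, writes each node as a `(label, arity)` pair in prefix order, and finds subtree occurrences with a naive substring scan. Because every symbol carries its arity, an occurrence of the pattern's prefix notation inside the subject's prefix notation always starts at a node whose whole subtree equals the pattern.

```python
def prefix_notation(tree):
    """Flatten a (label, children) tree into a list of (label, arity) pairs
    in prefix (preorder) order. The tuple representation is an assumption
    made for this sketch."""
    label, children = tree
    out = [(label, len(children))]
    for child in children:
        out.extend(prefix_notation(child))
    return out


def subtree_occurrences(subject, pattern):
    """Return 0-based positions in the subject's prefix notation at which a
    subtree equal to `pattern` is rooted. Naive O(n * m) scan; a string
    matching automaton could replace this loop."""
    s = prefix_notation(subject)
    p = prefix_notation(pattern)
    return [i for i in range(len(s) - len(p) + 1) if s[i:i + len(p)] == p]


# Subject in prefix notation: a2 a2 a0 b1 a0 b1 a0
subject = ("a", [("a", [("a", []), ("b", [("a", [])])]),
                 ("b", [("a", [])])])
# Pattern in prefix notation: b1 a0
pattern = ("b", [("a", [])])

print(subtree_occurrences(subject, pattern))  # prints [3, 5]
```

The scan stands in for the determinised automaton only to show what is being matched; it makes no claim about the automaton sizes discussed in the abstract.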

The 2-Interval Pattern Matching Problems and Its Application to ncRNA Scanning

Bioinformatics and Computational Biology - Lecture Notes in Computer Science ◽

10.1007/978-3-642-00727-9_10 ◽

2009 ◽

pp. 79-89

Author(s):

Thomas K. F. Wong ◽

S. M. Yiu ◽

T. W. Lam ◽

Wing-Kin Sung

Keyword(s):

Pattern Matching ◽

Matching Problems

Mixed Hypergraphs for Linear-Time Construction of Denser Hashing-Based Data Structures

Lecture Notes in Computer Science - SOFSEM 2013: Theory and Practice of Computer Science ◽

10.1007/978-3-642-35843-2_31 ◽

2013 ◽

pp. 356-368 ◽

Cited By ~ 2

Author(s):

Michael Rink

Keyword(s):

Data Structures ◽

Linear Time

Suffix Tree Data Structures for Matrices

Pattern Matching Algorithms ◽

10.1093/oso/9780195113679.003.0013 ◽

1997 ◽

Author(s):

R. Giancarlo ◽

R. Grossi

Keyword(s):

Linear Space ◽

Suffix Tree ◽

Linear Time ◽

Suffix Trees ◽

Construction Time ◽

Matching Problems ◽

Tree Construction ◽

The Matrix ◽

Visual Databases ◽

Efficient Construction

We discuss the suffix tree generalization to matrices in this chapter. We extend the suffix tree notion (described in Chapter 3) from text strings to text matrices whose entries are taken from an ordered alphabet with the aim of solving pattern-matching problems. This suffix tree generalization can be efficiently used to implement low-level routines for Computer Vision, Data Compression, Geographic Information Systems and Visual Databases. We examine the submatrices in the form of the text’s contiguous parts that still have a matrix shape. Representing these text submatrices as “suitably formatted” strings stored in a compacted trie is the rationale behind suffix trees for matrices. The choice of the format inevitably influences suffix tree construction time and space complexity. We first deal with square matrices and show that many suffix tree families can be defined for the same input matrix according to the matrix’s string representations. We can store each suffix tree in linear space and give an efficient construction algorithm whose input is both the matrix and the string representation chosen. We then treat rectangular matrices and define their corresponding suffix trees by means of some general rules which we list formally. We show that there is a super-linear lower bound to the space required (in contrast with the linear space required by suffix trees for square matrices). We give a simple example of one of these suffix trees. The last part of the chapter illustrates some technical results regarding suffix trees for square matrices: we show how to achieve an expected linear-time suffix tree construction for a constant-size alphabet under some mild probabilistic assumptions about the input distribution. We begin by defining a wide class of string representations for square matrices. We let Σ denote an ordered alphabet of characters and introduce another alphabet of five special characters, called shapes. A shape is one of the special characters taken from set {IN,SW,NW,SE,NE}. Shape IN encodes the 1x1 matrix generated from the empty matrix by creating a square.

A New Algorithm for Subset Matching Problem Based on Set-String Transformation

Encyclopedia of Information Communication Technology ◽

10.4018/978-1-59904-845-1.ch080 ◽

2009 ◽

pp. 607-615

Author(s):

Yangjun Chen

Keyword(s):

Programming Languages ◽

Linear Time ◽

String Matching ◽

Special Problem ◽

Computer Engineering ◽

Data Types ◽

Matching Problem ◽

Matching Problems ◽

Abstract Data ◽

Geometric Pattern Matching

In computer engineering, a number of programming tasks involve a special problem, the so-called tree matching problem (Cole & Hariharan, 1997), as a crucial step; examples include the design of interpreters for nonprocedural programming languages, automatic implementation of abstract data types, code optimization in compilers, symbolic computation, context searching in structure editors and automatic theorem proving. Recently, it has been shown that this problem can be transformed in linear time to another problem, the so-called subset matching problem (Cole & Hariharan, 2002, 2003), which is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text position is a set of characters drawn from some alphabet S. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i + j - 1], for all j (1 ≤ j ≤ m). This is a generalization of ordinary string matching and is of interest since an efficient algorithm for this problem implies an efficient solution to the tree matching problem. In addition, as shown in (Indyk, 1997), this problem can also be used to solve general string matching and counting matching (Muthukrishan, 1997; Muthukrishan & Palem, 1994), and enables us to design efficient algorithms for several geometric pattern matching problems. In this article, we propose a new algorithm for this problem, which needs only O(n + m) time in the case that the size of S is small and O(n + m·n^0.5) time on average in general cases.
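
The occurrence condition above can be illustrated with a short sketch. This is a naive O(n·m) check of the definition, not the algorithm proposed in the article; the function name `subset_match` and the use of Python sets for pattern and text positions are assumptions made for illustration.

```python
def subset_match(pattern, text):
    """Return the 1-based text positions i at which the pattern occurs,
    i.e. pattern[j] is a subset of text[i + j - 1] for every 1 <= j <= m.
    Both arguments are sequences of sets of characters."""
    m, n = len(pattern), len(text)
    hits = []
    for i in range(n - m + 1):  # i is the 0-based candidate start
        # `<=` on Python sets is the subset test
        if all(pattern[j] <= text[i + j] for j in range(m)):
            hits.append(i + 1)  # report 1-based positions, as in the definition
    return hits


text = [{"a", "b"}, {"b"}, {"a", "b", "c"}, {"b", "c"}]
pattern = [{"a"}, {"b"}]
print(subset_match(pattern, text))  # prints [1, 3]
```

When every set is a singleton, the check reduces to ordinary string matching, which is the sense in which subset matching generalizes it.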

On linear-time alphabet-independent 2-dimensional pattern matching

LATIN '95: Theoretical Informatics - Lecture Notes in Computer Science ◽

10.1007/3-540-59175-3_91 ◽

1995 ◽

pp. 220-229

Author(s):

Maxime Crochemore ◽

Wojciech Rytter

Keyword(s):

Pattern Matching ◽

Linear Time

IDPM: An Improved Degenerate Pattern Matching Algorithm for Biological Sequences

International Journal of Foundations of Computer Science ◽

10.1142/s0129054117500307 ◽

2017 ◽

Vol 28 (07) ◽

pp. 889-914

Author(s):

Jie Lin ◽

Yue Jiang ◽

E. James Harner ◽

Bing-Hua Jiang ◽

Don Adjeroh

Keyword(s):

Performance Improvement ◽

Pattern Matching ◽

Linear Time ◽

Computational Cost ◽

Large Data ◽

Biological Sequences ◽

Matching Problem ◽

Practical Utilization ◽

Matching Algorithm ◽

Pattern Matching Algorithm

Let [Formula: see text] be a string, with symbols from an alphabet. [Formula: see text] is said to be degenerate if for some positions, say [Formula: see text], [Formula: see text] can contain a subset of symbols from the symbol alphabet, rather than just one symbol. Given a text string [Formula: see text] and a pattern [Formula: see text], both with symbols from an alphabet [Formula: see text], the degenerate string matching problem is to find positions in [Formula: see text] where [Formula: see text] occurred, such that [Formula: see text], [Formula: see text], or both are allowed to be degenerate. Though some algorithms have been proposed, their huge computational cost poses a significant challenge to their practical utilization. In this work, we propose IDPM, an improved degenerate pattern matching algorithm based on an extension of the Boyer–Moore algorithm. At the preprocessing phase, the algorithm defines an alphabet-independent compatibility rule, and computes the shift arrays using respective variants of the bad character and good suffix heuristics. At the search phase, IDPM improves the matching speed by using the compatibility rule. On average, the proposed IDPM algorithm has a linear time complexity with respect to the text size, and to the overall size of the pattern. IDPM demonstrates significant performance improvement over state-of-the-art approaches. It can be used in fast practical degenerate pattern matching with large data sizes, with important applications in flexible and scalable searching of huge biological sequences.
