Finite Automata on Transfinite Sequences and Regular Expressions

1985 ◽  
Vol 8 (3-4) ◽  
pp. 379-396
Author(s):  
Jerzy Wojciechowski

In this paper the notion of regular expression for finite automata on transfinite sequences (TF-automata) is introduced. The characterization theorem for TF-automata is proved. From this theorem we conclude the decidability of the emptiness problem for TF-automata and the characterization theorem for finite automata on transfinite sequences of bounded length.

2009 ◽  
Vol 2009 ◽  
pp. 1-10 ◽  
Author(s):  
Yi-Hua E. Yang ◽  
Viktor K. Prasanna

We present a software toolchain for constructing large-scale regular expression matching (REM) on FPGA. The software automates the conversion of regular expressions into compact and high-performance nondeterministic finite automata (RE-NFA). Each RE-NFA is described as an RTL regular expression matching engine (REME) in VHDL for FPGA implementation. Assuming a fixed number of fan-out transitions per state, an n-state, m-bytes-per-cycle RE-NFA can be constructed in O(n×m) time and O(n×m) memory by our software. A large number of RE-NFAs are placed onto a two-dimensional staged pipeline, allowing scalability to thousands of RE-NFAs with linear area increase and little clock rate penalty due to scaling. On a PC with a 2 GHz Athlon64 processor and 2 GB memory, our prototype software constructs hundreds of RE-NFAs used by Snort in less than 10 seconds. We also designed a benchmark generator which can produce RE-NFAs with configurable pattern complexity parameters, including state count, state fan-in, loop-back and feed-forward distances. Several regular expressions with various complexities are used to test the performance of our RE-NFA construction software.


2011 ◽  
Vol 22 (07) ◽  
pp. 1593-1606 ◽  
Author(s):  
SABINE BRODA ◽  
ANTÓNIO MACHIAVELO ◽  
NELMA MOREIRA ◽  
ROGÉRIO REIS

The partial derivative automaton ([Formula: see text]) is usually smaller than other nondeterministic finite automata constructed from a regular expression, and it can be seen as a quotient of the Glushkov automaton ([Formula: see text]). By estimating the number of regular expressions that have ε as a partial derivative, we compute a lower bound of the average number of mergings of states in [Formula: see text] and describe its asymptotic behaviour. This depends on the alphabet size, k, and as k grows its limit approaches half the number of states in [Formula: see text]. The lower bound corresponds to considering the [Formula: see text] automaton for the marked version of the regular expression, i.e. the version in which all letters are made different. Experimental results suggest that the average number of states of this automaton, and of the [Formula: see text] automaton for the unmarked regular expression, are very close to each other.
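
As a rough illustration of the construction this abstract refers to, the sketch below computes Antimirov-style partial derivatives and collects the reachable ones as the states of an NFA. The tuple encoding of expressions and all function names (`sym`, `alt`, `cat`, `star`, `pderiv`, `pd_automaton`, `accepts`) are assumptions made for this example and are not taken from the paper.

```python
# Regex AST as nested tuples (hypothetical encoding for this sketch):
#   ("eps",), ("sym", a), ("alt", r, s), ("cat", r, s), ("star", r)
EPS = ("eps",)

def sym(a): return ("sym", a)
def alt(r, s): return ("alt", r, s)
def star(r): return ("star", r)

def cat(r, s):
    # Smart constructor: drop epsilon factors so equal derivatives coincide.
    if r == EPS: return s
    if s == EPS: return r
    return ("cat", r, s)

def nullable(r):
    """True if the language of r contains the empty word."""
    tag = r[0]
    if tag in ("eps", "star"): return True
    if tag == "sym": return False
    if tag == "alt": return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def pderiv(r, a):
    """Set of partial derivatives of r with respect to the letter a."""
    tag = r[0]
    if tag == "eps": return frozenset()
    if tag == "sym": return frozenset([EPS]) if r[1] == a else frozenset()
    if tag == "alt": return pderiv(r[1], a) | pderiv(r[2], a)
    if tag == "star": return frozenset(cat(d, r) for d in pderiv(r[1], a))
    left = frozenset(cat(d, r[2]) for d in pderiv(r[1], a))  # cat
    return left | pderiv(r[2], a) if nullable(r[1]) else left

def pd_automaton(r, alphabet):
    """States are the partial derivatives reachable from r; finals are nullable."""
    states, todo, trans = {r}, [r], {}
    while todo:
        q = todo.pop()
        for a in alphabet:
            for d in pderiv(q, a):
                trans.setdefault((q, a), set()).add(d)
                if d not in states:
                    states.add(d)
                    todo.append(d)
    return states, trans, {q for q in states if nullable(q)}

def accepts(r, word):
    """Run the partial derivative NFA on word without building it first."""
    current = {r}
    for a in word:
        current = {d for q in current for d in pderiv(q, a)}
    return any(nullable(q) for q in current)

# (a|b)*a has three letter positions, so its Glushkov automaton has 4 states;
# the partial derivative automaton built here has only 2.
r = cat(star(alt(sym("a"), sym("b"))), sym("a"))
states, trans, finals = pd_automaton(r, "ab")
print(len(states), accepts(r, "aba"), accepts(r, "ab"))
```

The smart constructor `cat` only removes ε factors; it does not identify expressions modulo associativity or other identities, so this sketch may keep apart states that a more careful implementation would merge.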


Author(s):  
Benedek Nagy

Union-free expressions are regular expressions without using the union operation. Consequently, (nondeterministic) union-free languages are described by regular expressions using only concatenation and Kleene star. The language class is also characterised by a special class of finite automata: 1CFPAs have exactly one cycle-free accepting path from each of their states. Obviously such an automaton has exactly one accepting state. The deterministic counterpart of this class of automata defines the deterministic union-free (d-union-free, for short) languages. In this paper [Formula: see text]-free nondeterministic variants of 1CFPAs are used to define n-union-free languages. The defined language class is shown to lie properly between the classes of (nondeterministic) union-free and d-union-free languages (in the case of an at least binary alphabet). In the case of a unary alphabet, the class of n-union-free languages coincides with the class of union-free languages. Some properties of the new subregular class of languages are discussed, e.g., closure properties. On the other hand, a regular expression is in union normal form if it is a finite union of union-free expressions. It is well known that every regular expression can be written in union normal form, i.e., all regular languages can be described as finite unions of (nondeterministic) union-free languages. It is also known that the same fact does not hold for deterministic union-free languages, that is, there are regular languages that cannot be written as finite unions of d-union-free languages. As an important result, here we show that every regular language can be defined by a finite union of n-union-free languages. This fact also allows one to define the n-union-complexity of regular languages.
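
The well-known fact mentioned above, that every regular expression can be written in union normal form, can be sketched with a short rewriting procedure: unions are pulled out through concatenation, and a starred union is replaced using the identity (x1|...|xn)* = (x1* ... xn*)*. The tuple encoding and the names `unf` and `to_re` are assumptions for this example; it covers only the nondeterministic union-free case, not the n-union-free construction of the paper.

```python
import re
from itertools import product

# Regex AST (hypothetical encoding): ("sym", a), ("alt", r, s), ("cat", r, s), ("star", r)

def to_re(r):
    """Render the AST as a Python regex string, for comparison only."""
    tag = r[0]
    if tag == "sym": return re.escape(r[1])
    if tag == "alt": return f"(?:{to_re(r[1])}|{to_re(r[2])})"
    if tag == "cat": return to_re(r[1]) + to_re(r[2])
    return f"(?:{to_re(r[1])})*"  # star

def unf(r):
    """List of union-free regex strings whose union is equivalent to r."""
    tag = r[0]
    if tag == "sym": return [re.escape(r[1])]
    if tag == "alt": return unf(r[1]) + unf(r[2])
    if tag == "cat": return [x + y for x in unf(r[1]) for y in unf(r[2])]
    # star: (x1|...|xn)* == (x1* x2* ... xn*)*
    parts = unf(r[1])
    if len(parts) == 1:
        return [f"(?:{parts[0]})*"]
    return ["(?:" + "".join(f"(?:{x})*" for x in parts) + ")*"]

# (a|b)*(a|bb) becomes a union of two union-free expressions.
a, b = ("sym", "a"), ("sym", "b")
expr = ("cat", ("star", ("alt", a, b)), ("alt", a, ("cat", b, b)))
parts = unf(expr)
print(parts)

# Brute-force check on all words over {a, b} up to length 6.
original = re.compile(to_re(expr))
pieces = [re.compile(p) for p in parts]
agree = all(
    bool(original.fullmatch(w)) == any(p.fullmatch(w) for p in pieces)
    for n in range(7)
    for w in map("".join, product("ab", repeat=n))
)
```

The number of union-free pieces can grow exponentially with the nesting of unions under concatenation, which is one reason a complexity measure based on the number of pieces is of interest.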


2007 ◽  
Vol 17 (01) ◽  
pp. 141-154 ◽  
Author(s):  
J.-M. CHAMPARNAUD ◽  
F. OUARDI ◽  
D. ZIADI

There exist two well-known quotients of the position automaton of a regular expression. The first one, called the equation automaton, was first introduced by Mirkin from the notion of prebase and has been redefined by Antimirov from the notion of partial derivative. The second one, due to Ilie and Yu and called the follow automaton, can be obtained by eliminating ε-transitions in an ε-NFA that is always smaller than the classical ε-NFAs (Thompson, Sippu and Soisalon–Soininen). Ilie and Yu discussed the difficulty of a theoretical comparison between the size of the follow automaton and the size of the equation automaton and concluded that experimental studies would very likely be necessary. In this paper we solve the theoretical question, by first defining a set of regular expressions, called normalized expressions, such that every regular expression can be normalized in linear time, and then proving that the equation automaton of a normalized expression is always smaller than its follow automaton.


Author(s):  
Vinoth Kumar K

The vast majority of network security applications in today's systems depend on deep packet inspection. In recent years, regular expression matching has become an important operator for it: it examines whether or not a packet's payload matches any of a group of predefined regular expressions. Regular expressions are typically matched using deterministic finite automata (DFA) representations. However, representing regular expression sets as DFAs requires a large amount of memory, an excessive amount of time, and an excessive amount of per-flow state, which limits their practical application. This chapter explores network intrusion detection systems.
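
The memory problem mentioned in this abstract can be illustrated with a standard textbook family of patterns, (a|b)*a(a|b)^k, whose NFA has k+2 states while the DFA obtained by subset construction has 2^(k+1) reachable states. The example family and the function names `nfa_kth_from_end` and `determinize` are assumptions chosen for this sketch and do not come from the chapter.

```python
def nfa_kth_from_end(k):
    """NFA for (a|b)*a(a|b)^k: states 0..k+1, start state 0, accepting state k+1."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k + 1):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta, 0, {k + 1}

def determinize(delta, start, alphabet="ab"):
    """Subset construction; returns the reachable DFA states and transitions."""
    start_set = frozenset([start])
    seen, todo, dfa = {start_set}, [start_set], {}
    while todo:
        subset = todo.pop()
        for a in alphabet:
            target = frozenset(t for q in subset for t in delta.get((q, a), ()))
            dfa[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return seen, dfa

# NFA size grows linearly in k, DFA size doubles with each step.
for k in range(6):
    delta, start, _ = nfa_kth_from_end(k)
    dfa_states, _ = determinize(delta, start)
    print(k + 2, len(dfa_states))
```

Real rule sets combine many such patterns, so the blow-up of the individual DFAs can compound when the expressions are merged into one automaton.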


2013 ◽  
Vol 24 (08) ◽  
pp. 1255-1279 ◽  
Author(s):  
HERMANN GRUBER ◽  
MARKUS HOLZER

Based on recent results from extremal graph theory, we prove that every n-state binary deterministic finite automaton can be converted into an equivalent regular expression of size O(1.742^n) using state elimination. Furthermore, we give improved upper bounds on the language operations intersection and interleaving on regular expressions.


2014 ◽  
Vol 25 (08) ◽  
pp. 1141-1159 ◽  
Author(s):  
MARTIN KUTRIB ◽  
ANDREAS MALCHER ◽  
MATTHIAS WENDLANDT

Stateless variants of deterministic one-way multi-head finite automata with pebbles, that is, automata where the heads can drop, sense, and pick up pebbles, are studied. The relation between heads and pebbles is investigated, and a proper double hierarchy concerning these two resources is obtained. Moreover, it is shown that a conversion of an arbitrary automaton to a stateless automaton can always be achieved at the cost of additional heads and/or pebbles. On the other hand, there are languages where one head cannot be traded for any number of additional pebbles and vice versa. Finally, the emptiness problem and related problems are shown to be undecidable even for the ‘simplest’ model, namely, for stateless one-way finite automata with two heads and one pebble.


Author(s):  
Cyril Nicaud ◽  
Pablo Rotondo

In this article, we study some properties of random regular expressions of size [Formula: see text], when the cardinality of the alphabet also depends on [Formula: see text]. For this, we revisit and improve the classical Transfer Theorem from the field of analytic combinatorics. This provides precise estimations for the number of regular expressions, the probability of recognizing the empty word and the expected number of Kleene stars in a random expression. For all these statistics, we show that there is a threshold when the size of the alphabet approaches [Formula: see text], at which point the leading term in the asymptotics starts oscillating.

