ON THE DISAMBIGUATION OF FINITE AUTOMATA AND FUNCTIONAL TRANSDUCERS

This paper introduces a new disambiguation algorithm for finite automata and functional finite-state transducers. It gives a full description of this algorithm, including detailed pseudocode and analysis, and several illustrative examples. The algorithm is often more efficient, and its result dramatically smaller, than determinization for finite automata or the construction of Schützenberger. The unambiguous automaton or transducer created by our algorithm is never larger than the one generated by the construction of Schützenberger. In fact, in a variety of cases, the size of the unambiguous transducer returned by our algorithm is only linear in that of the input transducer, while the transducer created by the construction of Schützenberger is exponentially larger. Our algorithm can be used effectively in many applications to make automata and transducers more efficient to use.

LIMITED AUTOMATA AND REGULAR LANGUAGES

International Journal of Foundations of Computer Science ◽

10.1142/s0129054114400140 ◽

2014 ◽

Vol 25 (07) ◽

pp. 897-916 ◽

Cited By ~ 15

Author(s):

GIOVANNI PIGHIZZINI ◽

ANDREA PISONI

Keyword(s):

Finite Automata ◽

Upper And Lower Bounds ◽

Finite State Automata ◽

Turing Machines ◽

Double Exponential ◽

Finite State ◽

A Cell ◽

Fixed Constant ◽

The One ◽

Context Free

Limited automata are one-tape Turing machines that are allowed to rewrite the content of any tape cell only in the first d visits, for a fixed constant d. In the case d = 1, namely, when rewriting is possible only during the first visit to a cell, these models have the same power as finite state automata. We prove state upper and lower bounds for the conversion of 1-limited automata into finite state automata. In particular, we prove a double exponential state gap between nondeterministic 1-limited automata and one-way deterministic finite automata. The gap reduces to a single exponential in the case of deterministic 1-limited automata. This also implies an exponential state gap between nondeterministic and deterministic 1-limited automata. Another consequence is that 1-limited automata can have fewer states than equivalent two-way nondeterministic finite automata. We show that this is true even if we restrict to the case of a one-letter input alphabet. For each d ≥ 2, d-limited automata are known to characterize the class of context-free languages. Using the Chomsky-Schützenberger representation for context-free languages, we present a new conversion from context-free languages into 2-limited automata.

On deterministic 1-limited 5′ → 3′ sensing Watson–Crick finite-state transducers

RAIRO - Theoretical Informatics and Applications ◽

10.1051/ita/2021007 ◽

2021 ◽

Vol 55 ◽

pp. 5

Author(s):

Benedek Nagy ◽

Zita Kovács

Keyword(s):

Dna Computing ◽

Finite Automata ◽

Theoretical Computer Science ◽

Theoretical Computer ◽

Double Stranded Dna ◽

Finite State Transducers ◽

Special Cases ◽

Finite State ◽

Dna Strands ◽

Processing Order

Finite automata and finite state transducers are among the foundations of (theoretical) computer science, with many applications. On the other hand, DNA computing and related bio-inspired paradigms are relatively new fields of computing. Watson–Crick automata lie in the intersection of these fields. These finite automata have two reading heads, which read the upper and lower strands of the input DNA molecule, respectively. In 5′ → 3′ Watson–Crick automata the two reading heads move in the same biochemical direction, that is, from the 5′ end of the strand toward the 3′ end. However, in double-stranded DNA the two strands run in opposite directions, so 5′ → 3′ Watson–Crick automata read the input from the two extremes. In sensing 5′ → 3′ automata, the automaton senses when the two heads are at the same position, and the computation finishes at that point. Based on this class of automata, we define WK transducers such that, at each transition, exactly one input letter is processed and exactly one output letter is written on a normal output tape. Some special cases are defined and analyzed, e.g., when only one of the reading heads is used and when the transducer has only one state. We also show that the minimal transducer is uniquely defined if the transducer is deterministic and has marked output, i.e., the output letter written in a step identifies the reading head used in that transition. We have also used the functions ‘processing order’ and ‘reading heads’ to analyze these transducers.

GENERAL ALGORITHMS FOR TESTING THE AMBIGUITY OF FINITE AUTOMATA AND THE DOUBLE-TAPE AMBIGUITY OF FINITE-STATE TRANSDUCERS

International Journal of Foundations of Computer Science ◽

10.1142/s0129054111008477 ◽

2011 ◽

Vol 22 (04) ◽

pp. 883-904 ◽

Cited By ~ 6

Author(s):

CYRIL ALLAUZEN ◽

MEHRYAR MOHRI ◽

ASHISH RASTOGI

Keyword(s):

General Problem ◽

Finite Automata ◽

General Algorithm ◽

Approximate Computation ◽

Probabilistic Automaton ◽

Bounded Delay ◽

Finite State Transducers ◽

Specific Analysis ◽

Finite State

We present efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with ε-transitions. We give an algorithm for testing the exponential ambiguity of an automaton A in time [Formula: see text], and finite or polynomial ambiguity in time [Formula: see text], where |A|E denotes the number of transitions of A. These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and based on a general algorithm for the composition or intersection of automata. Additionally, we give an algorithm to determine in time [Formula: see text] the degree of polynomial ambiguity of a polynomially ambiguous automaton A and present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton. We also study the double-tape ambiguity of finite-state transducers. We show that the general problem is undecidable and that it is NP-hard for acyclic transducers. We present a specific analysis of the double-tape ambiguity of transducers with bounded delay. In particular, we give a characterization of double-tape ambiguity for synchronized transducers with zero delay that can be tested in quadratic time and give an algorithm for testing the double-tape ambiguity of transducers with bounded delay.
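The abstract above notes that its tests are built on a general intersection of automata. As a rough illustration of that idea only, the following Python sketch answers a much simpler question than the paper does: whether an ε-free automaton is ambiguous at all (some string has two distinct accepting paths). It does not distinguish finite, polynomial, and exponential ambiguity, and it does not handle ε-transitions. The function name `is_ambiguous` and the triple-based input encoding are assumptions made for this example, not the authors' algorithm or notation.

```python
from collections import deque


def is_ambiguous(initial, final, transitions):
    """Return True if some string has two distinct accepting paths.

    Hypothetical encoding: `initial` and `final` are sets of states and
    `transitions` is a collection of (source, label, target) triples
    without epsilon labels. Duplicate triples are ignored.
    """
    # Outgoing transitions grouped by source state, then by label.
    out = {}
    for src, label, dst in set(transitions):
        out.setdefault(src, {}).setdefault(label, set()).add(dst)

    # Forward search in the self-product A x A from all pairs of initial states.
    start = {(p, q) for p in initial for q in initial}
    reachable = set(start)
    reverse = {}  # product pair -> set of predecessor pairs
    queue = deque(start)
    while queue:
        p, q = queue.popleft()
        for label, p_targets in out.get(p, {}).items():
            for p2 in p_targets:
                for q2 in out.get(q, {}).get(label, ()):
                    nxt = (p2, q2)
                    reverse.setdefault(nxt, set()).add((p, q))
                    if nxt not in reachable:
                        reachable.add(nxt)
                        queue.append(nxt)

    # Backward search from the reachable pairs of final states.
    coreachable = {(p, q) for (p, q) in reachable if p in final and q in final}
    queue = deque(coreachable)
    while queue:
        pair = queue.popleft()
        for prev in reverse.get(pair, ()):
            if prev not in coreachable:
                coreachable.add(prev)
                queue.append(prev)

    # An off-diagonal pair lying on an accepting product path witnesses
    # two different accepting paths labeled by the same string.
    return any(p != q for (p, q) in coreachable)
```

For instance, an automaton with transitions (0, a, 1), (0, a, 2), (1, b, 3), (2, b, 3), initial state 0 and final state 3 accepts "ab" along two paths, so the sketch reports it as ambiguous. Replacing the label on (2, b, 3) with c removes the ambiguity.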

Degrees of Infinite Words, Polynomials and Atoms

International Journal of Foundations of Computer Science ◽

10.1142/s0129054118420066 ◽

2018 ◽

Vol 29 (05) ◽

pp. 825-843 ◽

Cited By ~ 1

Author(s):

Jörg Endrullis ◽

Juhani Karhumäki ◽

Jan Willem Klop ◽

Aleksi Saarela

Keyword(s):

Finite Automata ◽

Classification Problem ◽

Turing Degrees ◽

Turing Machines ◽

Infinite Words ◽

Fine Grained ◽

Finite State Transducers ◽

Pure Mathematics ◽

Wide Range ◽

Finite State

We study finite-state transducers and their power for transforming infinite words. Infinite sequences of symbols are of paramount importance in a wide range of fields, from formal languages to pure mathematics and physics. While finite automata for recognising and transforming languages are well-understood, very little is known about the power of automata to transform infinite words. The word transformation realised by finite-state transducers gives rise to a complexity comparison of words and thereby induces equivalence classes, called (transducer) degrees, and a partial order on these degrees. The ensuing hierarchy of degrees is analogous to the recursion-theoretic degrees of unsolvability, also known as Turing degrees, where the transformational devices are Turing machines. However, as a complexity measure, Turing machines are too strong: they trivialise the classification problem by identifying all computable words. Finite-state transducers give rise to a much more fine-grained, discriminating hierarchy. In contrast to Turing degrees, hardly anything is known about transducer degrees, in spite of their naturality. We use methods from linear algebra and analysis to show that there are infinitely many atoms in the transducer degrees, that is, minimal non-trivial degrees.

Composition of weighted finite transducers in MapReduce

Journal Of Big Data ◽

10.1186/s40537-020-00397-4 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Bilal Elghadyry ◽

Faissal Ouardi ◽

Sébastien Verel

Keyword(s):

Speech Processing ◽

Large Scale ◽

Large Scale Data ◽

Finite State Transducers ◽

Wide Range ◽

Finite State ◽

Common Operation ◽

Efficient Representation ◽

Weighted Finite State Transducers ◽

Np Hardness

Weighted finite-state transducers have been shown to be a general and efficient representation in many applications such as text and speech processing, computational biology, and machine learning. The composition of weighted finite-state transducers constitutes a fundamental operation common to these applications. The NP-hardness of the composition computation problem presents a challenge that leads us to devise efficient algorithms on a large scale when considering more than two transducers. This paper describes a parallel computation of the composition of weighted finite transducers in the MapReduce framework. To the best of our knowledge, this paper is the first to tackle this task using MapReduce methods. First, we analyze the communication cost of this problem using the model of Afrati et al. Then, we propose three MapReduce methods based respectively on input alphabet mapping, state mapping, and hybrid mapping. Finally, intensive experiments on a wide range of weighted finite-state transducers are conducted to compare the proposed methods and show their efficiency for large-scale data.
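To make the composition operation discussed above concrete, here is a minimal sequential Python sketch for two ε-free weighted transducers over the tropical semiring (weights add along a path). It is not one of the MapReduce methods the paper proposes. The dictionary layout (`start`, `finals`, `arcs`) and the function name `compose` are assumptions chosen for this example.

```python
from collections import deque


def compose(t1, t2):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Hypothetical layout: each transducer is a dict with
      'start'  : the initial state,
      'finals' : a dict mapping final states to final weights,
      'arcs'   : a list of (src, input_label, output_label, weight, dst).
    """
    # Index t1 arcs by source state and t2 arcs by (source state, input label).
    by_source = {}
    for src, ilabel, olabel, w, dst in t1["arcs"]:
        by_source.setdefault(src, []).append((ilabel, olabel, w, dst))
    by_input = {}
    for src, ilabel, olabel, w, dst in t2["arcs"]:
        by_input.setdefault((src, ilabel), []).append((olabel, w, dst))

    start = (t1["start"], t2["start"])
    arcs, finals = [], {}
    seen = {start}
    queue = deque([start])
    while queue:
        q1, q2 = queue.popleft()
        # A pair is final when both components are final; weights add.
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[(q1, q2)] = t1["finals"][q1] + t2["finals"][q2]
        # Match the output label of a t1 arc with the input label of a t2 arc.
        for ilabel, mid, w1, d1 in by_source.get(q1, []):
            for olabel, w2, d2 in by_input.get((q2, mid), []):
                nxt = (d1, d2)
                arcs.append(((q1, q2), ilabel, olabel, w1 + w2, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}
```

Only pairs of states reachable from the initial pair are built, which is the usual way to keep the result small. The sketch does not remove pairs that cannot reach a final state.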

BORDERS AND FINITE AUTOMATA

International Journal of Foundations of Computer Science ◽

10.1142/s0129054107005029 ◽

2007 ◽

Vol 18 (04) ◽

pp. 859-871

Author(s):

MARTIN ŠIMŮNEK ◽

BOŘIVOJ MELICHAR

Keyword(s):

Pattern Matching ◽

Hamming Distance ◽

Finite Automata ◽

Music Analysis ◽

Theoretical Description ◽

Specific Form ◽

Distance Measures ◽

Computer Assisted ◽

Finite State ◽

Finite State Transducer

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic stringology keystones, used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis and others. The paper offers an automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm. The algorithm finds all borders of a string with don't care symbols. We show that the ALL_BORDERS algorithm is an implementation of a finite state transducer of a specific form. We describe how such a transducer can be constructed and what the input string should look like. The described transducer finds the set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string under the Hamming distance. Our solution to this problem is again based on transducers. This allows us to use an analogy with automata-based pattern matching methods. Finally, we discuss conditions under which the same principle can be used for other distance measures.
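The definitions in this abstract can be illustrated with a small, deliberately naive Python sketch. It is a quadratic-time check written for this listing, not Iliopoulos's ALL_BORDERS algorithm and not the transducer construction of the paper. The function name `border_lengths`, the `wildcard` and `max_mismatches` parameters, and the restriction to non-empty proper borders are assumptions made for the example.

```python
def border_lengths(s, wildcard=None, max_mismatches=0):
    """Return the lengths k (0 < k < len(s)) at which s has a border.

    A length-k border means the prefix s[:k] matches the suffix s[-k:].
    If `wildcard` is given, that symbol matches any symbol (don't care).
    If `max_mismatches` > 0, up to that many mismatching positions are
    tolerated, i.e. an approximate border under the Hamming distance.
    """
    n = len(s)
    result = []
    for k in range(1, n):
        mismatches = 0
        for i in range(k):
            a, b = s[i], s[n - k + i]
            if a == b or (wildcard is not None and wildcard in (a, b)):
                continue
            mismatches += 1
            if mismatches > max_mismatches:
                break  # this length cannot be a border any more
        if mismatches <= max_mismatches:
            result.append(k)
    return result
```

Usage, with all three variants:

```python
print(border_lengths("abacaba"))                    # [1, 3]
print(border_lengths("a*ab", wildcard="*"))         # [2]
print(border_lengths("abcabd", max_mismatches=1))   # [1, 3]
```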

FINITELY SUBSEQUENTIAL TRANSDUCERS

International Journal of Foundations of Computer Science ◽

10.1142/s0129054103002126 ◽

2003 ◽

Vol 14 (06) ◽

pp. 983-994 ◽

Cited By ~ 9

Author(s):

CYRIL ALLAUZEN ◽

MEHRYAR MOHRI

Keyword(s):

Speech Recognition ◽

Finite Number ◽

Efficient Algorithm ◽

Experimental Results ◽

Theoretical Formulation ◽

Large Vocabulary ◽

Finite State Transducers ◽

Finite State ◽

Large Vocabulary Speech Recognition

Finitely subsequential transducers are efficient finite-state transducers with a finite number of final outputs and are used in a variety of applications. Not all transducers admit equivalent finitely subsequential transducers however. We briefly describe an existing generalized determinization algorithm for finitely subsequential transducers and give the first characterization of finitely subsequentiable transducers, transducers that admit equivalent finitely subsequential transducers. Our characterization shows the existence of an efficient algorithm for testing finite subsequentiability. We have fully implemented the generalized determinization algorithm and the algorithm for testing finite subsequentiability. We report experimental results showing that these algorithms are practical in large-vocabulary speech recognition applications. The theoretical formulation of our results is the equivalence of the following three properties for finite-state transducers: determinizability in the sense of the generalized algorithm, finite subsequentiability, and the twins property.

Finite State Transducers (Javier Baliosian and Dina Wonsever)

Handbook of Finite State Based Models and Applications ◽

10.1201/b13055-8 ◽

2016 ◽

pp. 57-80 ◽

Cited By ~ 1

Keyword(s):

Finite State Transducers ◽

Finite State

European language translation with weighted finite state transducers

Proceedings of the Third Workshop on Statistical Machine Translation - StatMT '08 ◽

10.3115/1626394.1626410 ◽

2008 ◽

Cited By ~ 1

Author(s):

Graeme Blackwood ◽

Adrià de Gispert ◽

Jamie Brunning ◽

William Byrne

Keyword(s):

Language Translation ◽

European Language ◽

Finite State Transducers ◽

Finite State ◽

Weighted Finite State Transducers

POPULARIZATION OF LANGUAGE THROUGH MASS MEDIA IN THE REGIONS OF RUSSIA

Gênero & Direito ◽

10.22478/ufpb.2179-7137.2019v8n5.48640 ◽

2019 ◽

Vol 8 (5) ◽

Author(s):

Murshida Kh. Fatykhova ◽

Regina I. Gazizova

Keyword(s):

Mass Media ◽

Information Technologies ◽

National Language ◽

Full Description ◽

Huge Impact ◽

Russian Media ◽

The Media ◽

The Republic ◽

World Information ◽

The One

The dynamic development of traditional media (print, radio, television), the emergence of new computer information technologies, and the globalization of the world information space have a huge impact on the current state of the language. Mass media are the most important tool in the development and preservation of the language. On the one hand, all the latest language changes are reflected in the media, and on the other hand, the media influence language change and development. This article outlines the results of a study concerning the role of regional media in the distribution and popularization of the national language. A full description is given of modern Tatar-language media within the Russian media space. Nowadays, despite the active spread of network mass media, television remains one of the main communication channels. For a large part of the Russian population, including viewers in the Republic of Tatarstan, it is one of the most accessible ways to obtain information in their native language. In this regard, the study examined the experience of the main Tatar-language television and radio companies in the popularization of the national language.
