ON THE DISAMBIGUATION OF FINITE AUTOMATA AND FUNCTIONAL TRANSDUCERS

2013 ◽  
Vol 24 (06) ◽  
pp. 847-862 ◽  
Author(s):  
MEHRYAR MOHRI

This paper introduces a new disambiguation algorithm for finite automata and functional finite-state transducers. It gives a full description of this algorithm, including detailed pseudocode and analysis, and several illustrative examples. The algorithm is often more efficient, and its result dramatically smaller, than determinization for finite automata or the construction of Schützenberger. The unambiguous automaton or transducer created by our algorithm is never larger than the one generated by the construction of Schützenberger. In fact, in a variety of cases, the size of the unambiguous transducer returned by our algorithm is only linear in that of the input transducer, while the transducer created by the construction of Schützenberger is exponentially larger. Our algorithm can be used effectively in many applications to make automata and transducers more efficient to use.

2014 ◽  
Vol 25 (07) ◽  
pp. 897-916 ◽  
Author(s):  
GIOVANNI PIGHIZZINI ◽  
ANDREA PISONI

Limited automata are one-tape Turing machines that are allowed to rewrite the content of any tape cell only in the first d visits, for a fixed constant d. In the case d = 1, namely, when rewriting is possible only during the first visit to a cell, these models have the same power as finite state automata. We prove upper and lower bounds on the number of states for the conversion of 1-limited automata into finite state automata. In particular, we prove a double exponential state gap between nondeterministic 1-limited automata and one-way deterministic finite automata. The gap reduces to a single exponential in the case of deterministic 1-limited automata. This also implies an exponential state gap between nondeterministic and deterministic 1-limited automata. Another consequence is that 1-limited automata can have fewer states than equivalent two-way nondeterministic finite automata. We show that this is true even if we restrict to a one-letter input alphabet. For each d ≥ 2, d-limited automata are known to characterize the class of context-free languages. Using the Chomsky-Schützenberger representation for context-free languages, we present a new conversion from context-free languages into 2-limited automata.


2021 ◽  
Vol 55 ◽  
pp. 5
Author(s):  
Benedek Nagy ◽  
Zita Kovács

Finite automata and finite state transducers are among the foundations of (theoretical) computer science, with many applications. On the other hand, DNA computing and related bio-inspired paradigms are relatively new fields of computing. Watson–Crick automata lie in the intersection of these fields. These finite automata have two reading heads, which read the upper and lower strands of the input DNA molecule, respectively. In 5′ → 3′ Watson–Crick automata the two reading heads move in the same biochemical direction, that is, from the 5′ end of the strand towards the 3′ end. However, in double-stranded DNA the two strands run in opposite directions, so 5′ → 3′ Watson–Crick automata read the input from its two extremes. In sensing 5′ → 3′ automata, the automaton senses whether the two heads are at the same position, and the computation finishes at that moment. Based on this class of automata, we define WK transducers such that, at each transition, exactly one input letter is processed and exactly one output letter is written on a normal output tape. Some special cases are defined and analyzed, e.g., when only one of the reading heads is used and when the transducer has only one state. We also show that the minimal transducer is uniquely defined if the transducer is deterministic and has marked output, i.e., the output letter written in a step identifies the reading head used in that transition. We also use the functions ‘processing order’ and ‘reading heads’ to analyze these transducers.


2011 ◽  
Vol 22 (04) ◽  
pp. 883-904 ◽  
Author(s):  
CYRIL ALLAUZEN ◽  
MEHRYAR MOHRI ◽  
ASHISH RASTOGI

We present efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with ε-transitions. We give an algorithm for testing the exponential ambiguity of an automaton A in time [Formula: see text], and finite or polynomial ambiguity in time [Formula: see text], where |A|_E denotes the number of transitions of A. These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and based on a general algorithm for the composition or intersection of automata. Additionally, we give an algorithm to determine in time [Formula: see text] the degree of polynomial ambiguity of a polynomially ambiguous automaton A and present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton. We also study the double-tape ambiguity of finite-state transducers. We show that the general problem is undecidable and that it is NP-hard for acyclic transducers. We present a specific analysis of the double-tape ambiguity of transducers with bounded delay. In particular, we give a characterization of double-tape ambiguity for synchronized transducers with zero delay that can be tested in quadratic time and give an algorithm for testing the double-tape ambiguity of transducers with bounded delay.
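
As a rough illustration of how intersection can expose ambiguity, the sketch below checks plain ambiguity (whether some word has two distinct accepting paths) by exploring the product of an automaton with itself. It is not the finite, polynomial, or exponential ambiguity test of the paper: it assumes an automaton without ε-transitions, and the function name `is_ambiguous` and the triple-based encoding of transitions are hypothetical choices made for this example.

```python
from collections import deque


def is_ambiguous(transitions, initial, final):
    """Return True if some word has two distinct accepting paths.

    transitions: set of (source, symbol, destination) triples,
                 assumed to contain no epsilon-transitions.
    initial, final: sets of states.
    """
    # Build forward and backward edges of the self-product A x A:
    # a pair (p, q) moves to (p2, q2) when both components read the same symbol.
    fwd, bwd = {}, {}
    for (p, a, p2) in transitions:
        for (q, b, q2) in transitions:
            if a == b:
                fwd.setdefault((p, q), set()).add((p2, q2))
                bwd.setdefault((p2, q2), set()).add((p, q))

    def reach(starts, edges):
        # Breadth-first search over the product graph.
        seen = set(starts)
        queue = deque(seen)
        while queue:
            u = queue.popleft()
            for v in edges.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen

    accessible = reach({(i, j) for i in initial for j in initial}, fwd)
    coaccessible = reach({(f, g) for f in final for g in final}, bwd)

    # Two distinct accepting paths with the same label must sit in different
    # states at some position, which yields a useful off-diagonal pair.
    return any(p != q for (p, q) in accessible & coaccessible)


# The word "ab" has two accepting paths here (through state 1 and state 2).
ambiguous = {(0, "a", 1), (0, "a", 2), (1, "b", 3), (2, "b", 3)}
# Nondeterministic on "a", but every accepted word has a single accepting path.
unambiguous = {(0, "a", 1), (0, "a", 2), (1, "b", 3), (2, "c", 3)}

print(is_ambiguous(ambiguous, {0}, {3}))
print(is_ambiguous(unambiguous, {0}, {3}))
```

The product is built by brute force over all pairs of transitions, which is quadratic in the number of transitions; that is adequate for a small example but makes no attempt to match the complexities stated in the abstract.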


2018 ◽  
Vol 29 (05) ◽  
pp. 825-843 ◽  
Author(s):  
Jörg Endrullis ◽  
Juhani Karhumäki ◽  
Jan Willem Klop ◽  
Aleksi Saarela

We study finite-state transducers and their power for transforming infinite words. Infinite sequences of symbols are of paramount importance in a wide range of fields, from formal languages to pure mathematics and physics. While finite automata for recognising and transforming languages are well understood, very little is known about the power of automata to transform infinite words. The word transformation realised by finite-state transducers gives rise to a complexity comparison of words and thereby induces equivalence classes, called (transducer) degrees, and a partial order on these degrees. The ensuing hierarchy of degrees is analogous to the recursion-theoretic degrees of unsolvability, also known as Turing degrees, where the transformational devices are Turing machines. However, as a complexity measure, Turing machines are too strong: they trivialise the classification problem by identifying all computable words. Finite-state transducers give rise to a much more fine-grained, discriminating hierarchy. In contrast to Turing degrees, hardly anything is known about transducer degrees, in spite of their naturalness. We use methods from linear algebra and analysis to show that there are infinitely many atoms in the transducer degrees, that is, minimal non-trivial degrees.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Bilal Elghadyry ◽  
Faissal Ouardi ◽  
Sébastien Verel

Weighted finite-state transducers have been shown to be a general and efficient representation in many applications such as text and speech processing, computational biology, and machine learning. The composition of weighted finite-state transducers is a fundamental operation common to these applications. The NP-hardness of the composition computation problem presents a challenge that leads us to devise efficient large-scale algorithms for the case of more than two transducers. This paper describes a parallel computation of weighted finite-state transducer composition in the MapReduce framework. To the best of our knowledge, this paper is the first to tackle this task using MapReduce methods. First, we analyze the communication cost of this problem using the model of Afrati et al. Then, we propose three MapReduce methods based respectively on input alphabet mapping, state mapping, and hybrid mapping. Finally, intensive experiments on a wide range of weighted finite-state transducers are conducted to compare the proposed methods and show their efficiency for large-scale data.
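
For readers unfamiliar with the operation being parallelized, here is a minimal sequential sketch of composing two weighted transducers. It is not one of the MapReduce methods proposed in the paper. It assumes ε-free transducers over the tropical semiring (weights along a path are added), and the dictionary layout with keys `start`, `finals`, and `arcs`, as well as the name `compose`, are hypothetical conventions chosen for this example.

```python
def compose(t1, t2):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Each transducer is a dict with:
      "start":  the initial state,
      "finals": a dict mapping final states to final weights,
      "arcs":   a list of (source, input, output, weight, destination).
    """
    start = (t1["start"], t2["start"])
    arcs, finals = [], {}
    seen, stack = {start}, [start]
    while stack:
        q1, q2 = stack.pop()
        # A pair is final when both components are final; weights are added.
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[(q1, q2)] = t1["finals"][q1] + t2["finals"][q2]
        for s1, i1, o1, w1, d1 in t1["arcs"]:
            if s1 != q1:
                continue
            for s2, i2, o2, w2, d2 in t2["arcs"]:
                # Match the output label of t1 with the input label of t2.
                if s2 == q2 and i2 == o1:
                    dst = (d1, d2)
                    arcs.append(((q1, q2), i1, o2, w1 + w2, dst))
                    if dst not in seen:
                        seen.add(dst)
                        stack.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}


# t1 rewrites "a" to "b" with weight 1.0; t2 rewrites "b" to "c" with weight 2.0.
t1 = {"start": 0, "finals": {1: 0.0}, "arcs": [(0, "a", "b", 1.0, 1)]}
t2 = {"start": 0, "finals": {1: 0.0}, "arcs": [(0, "b", "c", 2.0, 1)]}
result = compose(t1, t2)
print(result["arcs"])
```

Only state pairs reachable from the initial pair are built, which is the usual on-demand strategy; handling ε-transitions correctly requires an additional filter that this sketch omits.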


2007 ◽  
Vol 18 (04) ◽  
pp. 859-871
Author(s):  
MARTIN ŠIMŮNEK ◽  
BOŘIVOJ MELICHAR

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic building blocks of stringology, used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis and other areas. The paper offers an automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm. The algorithm finds all borders of a string with don't care symbols. We show that the ALL_BORDERS algorithm is an implementation of a finite state transducer of a specific form. We describe how such a transducer can be constructed and what the input string should look like. The described transducer finds the set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string under the Hamming distance. Our solution to this problem is again based on transducers. This allows us to use an analogy with automata-based pattern matching methods. Finally, we discuss conditions under which the same principle can be used for other distance measures.
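
To make the definitions concrete, the following naive sketch lists the lengths of all proper, non-empty borders of a string by direct comparison. It is neither the ALL_BORDERS algorithm nor the transducer construction of the paper. The choice of `*` as the don't care symbol, the name `border_lengths`, the `max_mismatches` parameter used for Hamming-distance approximate borders, and the rule that a don't care symbol never counts as a mismatch are all assumptions made for this example.

```python
def border_lengths(s, dont_care="*", max_mismatches=0):
    """Return the lengths k (0 < k < len(s)) such that the prefix of
    length k matches the suffix of length k.

    A position matches if the two symbols are equal or either one is the
    don't care symbol. Up to max_mismatches non-matching positions are
    tolerated, which gives approximate borders under Hamming distance.
    """
    n = len(s)
    lengths = []
    for k in range(1, n):
        mismatches = 0
        for i in range(k):
            a, b = s[i], s[n - k + i]
            if a != b and a != dont_care and b != dont_care:
                mismatches += 1
        if mismatches <= max_mismatches:
            lengths.append(k)
    return lengths


print(border_lengths("abacaba"))                  # exact borders
print(border_lengths("ab*ab"))                    # with a don't care symbol
print(border_lengths("abcde", max_mismatches=1))  # approximate borders
```

The comparison takes quadratic time in the length of the string, which keeps the definition visible at the cost of efficiency.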


2003 ◽  
Vol 14 (06) ◽  
pp. 983-994 ◽  
Author(s):  
CYRIL ALLAUZEN ◽  
MEHRYAR MOHRI

Finitely subsequential transducers are efficient finite-state transducers with a finite number of final outputs and are used in a variety of applications. However, not all transducers admit equivalent finitely subsequential transducers. We briefly describe an existing generalized determinization algorithm for finitely subsequential transducers and give the first characterization of finitely subsequentiable transducers, that is, transducers that admit equivalent finitely subsequential transducers. Our characterization shows the existence of an efficient algorithm for testing finite subsequentiability. We have fully implemented the generalized determinization algorithm and the algorithm for testing finite subsequentiability. We report experimental results showing that these algorithms are practical in large-vocabulary speech recognition applications. The theoretical formulation of our results is the equivalence of the following three properties for finite-state transducers: determinizability in the sense of the generalized algorithm, finite subsequentiability, and the twins property.


2019 ◽  
Vol 8 (5) ◽  
Author(s):  
Murshida Kh. Fatykhova ◽  
Regina I. Gazizova

The dynamic development of traditional media (print, radio, television), the emergence of new computer information technologies, and the globalization of the world information space have a huge impact on the current state of the language. Mass media are the most important tool in the development and preservation of the language. On the one hand, all the latest language changes are reflected in the media, and on the other hand, the media influence language change and development. This article outlines the results of a study concerning the role of regional media in the distribution and popularization of the national language. A full description is given of modern Tatar-language media within the Russian media space. Nowadays, despite the active spread of online mass media, television remains one of the main communication channels. For a large part of the Russian population, including viewers in the Republic of Tatarstan, it is one of the most accessible ways to obtain information in their native language. In this regard, in the course of the study, we examined the experience of the main Tatar-language television and radio companies in the popularization of the national language.

