GENERAL ALGORITHMS FOR TESTING THE AMBIGUITY OF FINITE AUTOMATA AND THE DOUBLE-TAPE AMBIGUITY OF FINITE-STATE TRANSDUCERS

We present efficient algorithms for testing the finite, polynomial, and exponential ambiguity of finite automata with ε-transitions. We give an algorithm for testing the exponential ambiguity of an automaton A in time [Formula: see text], and finite or polynomial ambiguity in time [Formula: see text], where |A|_E denotes the number of transitions of A. These complexities significantly improve over the previous best complexities given for the same problem. Furthermore, the algorithms presented are simple and based on a general algorithm for the composition or intersection of automata. Additionally, we give an algorithm to determine in time [Formula: see text] the degree of polynomial ambiguity of a polynomially ambiguous automaton A and present an application of our algorithms to an approximate computation of the entropy of a probabilistic automaton. We also study the double-tape ambiguity of finite-state transducers. We show that the general problem is undecidable and that it is NP-hard for acyclic transducers. We present a specific analysis of the double-tape ambiguity of transducers with bounded delay. In particular, we give a characterization of double-tape ambiguity for synchronized transducers with zero delay that can be tested in quadratic time and give an algorithm for testing the double-tape ambiguity of transducers with bounded delay.
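As a rough illustration of how intersection can be used to reason about ambiguity, the sketch below checks plain ambiguity (whether some word has two distinct accepting paths) for an automaton without ε-transitions. It is not the algorithm of the paper and does not distinguish finite, polynomial, and exponential ambiguity. It relies on the standard observation that such an automaton is ambiguous exactly when its self-product A × A contains a pair of distinct states that is both reachable from an initial pair and able to reach a final pair. The function name `is_ambiguous` and the triple-based transition format are assumptions made for this example.

```python
from collections import deque


def is_ambiguous(initials, finals, transitions):
    """Return True if some word labels two distinct accepting paths.

    Assumptions of this sketch: no epsilon-transitions, and transitions
    are given as (source, label, destination) triples (duplicates ignored).
    """
    finals = set(finals)
    out = {}          # (state, label) -> set of destination states
    labels_from = {}  # state -> set of labels leaving it
    for src, label, dst in set(transitions):
        out.setdefault((src, label), set()).add(dst)
        labels_from.setdefault(src, set()).add(label)

    # Forward search: pairs of the self-product reachable from initial pairs.
    start = {(p, q) for p in initials for q in initials}
    seen = set(start)
    succ = {}
    queue = deque(start)
    while queue:
        p, q = pair = queue.popleft()
        shared = labels_from.get(p, set()) & labels_from.get(q, set())
        nxt = {(p2, q2)
               for label in shared
               for p2 in out[(p, label)]
               for q2 in out[(q, label)]}
        succ[pair] = nxt
        for n in nxt - seen:
            seen.add(n)
            queue.append(n)

    # Backward search: reachable pairs that can still reach a final pair.
    pred = {}
    for pair, nxt in succ.items():
        for n in nxt:
            pred.setdefault(n, set()).add(pair)
    coacc = {(p, q) for (p, q) in seen if p in finals and q in finals}
    queue = deque(coacc)
    while queue:
        pair = queue.popleft()
        for prev in pred.get(pair, ()):
            if prev not in coacc:
                coacc.add(prev)
                queue.append(prev)

    # A useful off-diagonal pair witnesses two different accepting paths.
    return any(p != q for (p, q) in coacc)
```

For instance, an automaton with transitions 0 -a-> 1 and 0 -a-> 2, where both 1 and 2 are final, accepts the word "a" along two paths, so the check returns True. Nondeterminism alone is not enough: if the two branches can only continue with different labels, no off-diagonal pair reaches a final pair.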

FINITELY SUBSEQUENTIAL TRANSDUCERS

International Journal of Foundations of Computer Science ◽

10.1142/s0129054103002126 ◽

2003 ◽

Vol 14 (06) ◽

pp. 983-994 ◽

Cited By ~ 9

Author(s):

CYRIL ALLAUZEN ◽

MEHRYAR MOHRI

Keyword(s):

Speech Recognition ◽

Finite Number ◽

Efficient Algorithm ◽

Experimental Results ◽

Theoretical Formulation ◽

Large Vocabulary ◽

Finite State Transducers ◽

Finite State ◽

Large Vocabulary Speech Recognition

Finitely subsequential transducers are efficient finite-state transducers with a finite number of final outputs and are used in a variety of applications. Not all transducers admit equivalent finitely subsequential transducers, however. We briefly describe an existing generalized determinization algorithm for finitely subsequential transducers and give the first characterization of finitely subsequentiable transducers, that is, transducers that admit equivalent finitely subsequential transducers. Our characterization shows the existence of an efficient algorithm for testing finite subsequentiability. We have fully implemented the generalized determinization algorithm and the algorithm for testing finite subsequentiability. We report experimental results showing that these algorithms are practical in large-vocabulary speech recognition applications. The theoretical formulation of our results is the equivalence of the following three properties for finite-state transducers: determinizability in the sense of the generalized algorithm, finite subsequentiability, and the twins property.

On deterministic 1-limited 5′ → 3′ sensing Watson–Crick finite-state transducers

RAIRO - Theoretical Informatics and Applications ◽

10.1051/ita/2021007 ◽

2021 ◽

Vol 55 ◽

pp. 5

Author(s):

Benedek Nagy ◽

Zita Kovács

Keyword(s):

Dna Computing ◽

Finite Automata ◽

Theoretical Computer Science ◽

Theoretical Computer ◽

Double Stranded Dna ◽

Finite State Transducers ◽

Special Cases ◽

Finite State ◽

Dna Strands ◽

Processing Order

Finite automata and finite-state transducers are among the foundations of (theoretical) computer science and have many applications. On the other hand, DNA computing and related bio-inspired paradigms are relatively new fields of computing. Watson–Crick automata lie at the intersection of these fields. These finite automata have two reading heads, which read the upper and lower strands of the input DNA molecule, respectively. In 5′ → 3′ Watson–Crick automata the two reading heads move in the same biochemical direction, that is, from the 5′ end of the strand toward the 3′ end. However, in double-stranded DNA the two strands run in opposite directions; therefore, 5′ → 3′ Watson–Crick automata read the input from the two extremes. A sensing 5′ → 3′ automaton senses whether the two heads are at the same position; moreover, the computation finishes at that point. Based on this class of automata, we define WK transducers such that, at each transition, exactly one input letter is processed and exactly one output letter is written on a normal output tape. Some special cases are defined and analyzed, e.g., when only one of the reading heads is used and when the transducer has only one state. We also show that the minimal transducer is uniquely defined if the transducer is deterministic and has marked output, i.e., the output letter written in a step identifies the reading head that is used in that transition. We have also used the functions ‘processing order’ and ‘reading heads’ to analyze these transducers.

ON THE DISAMBIGUATION OF FINITE AUTOMATA AND FUNCTIONAL TRANSDUCERS

International Journal of Foundations of Computer Science ◽

10.1142/s0129054113400224 ◽

2013 ◽

Vol 24 (06) ◽

pp. 847-862 ◽

Cited By ~ 3

Author(s):

MEHRYAR MOHRI

Keyword(s):

Finite Automata ◽

Full Description ◽

Finite State Transducers ◽

Finite State ◽

The One

This paper introduces a new disambiguation algorithm for finite automata and functional finite-state transducers. It gives a full description of this algorithm, including detailed pseudocode and analysis, and several illustrative examples. The algorithm is often more efficient and the result dramatically smaller than the one obtained using determinization for finite automata or the construction of Schützenberger. The unambiguous automaton or transducer created by our algorithm is never larger than the one generated by the construction of Schützenberger. In fact, in a variety of cases, the size of the unambiguous transducer returned by our algorithm is only linear in that of the input transducer while the transducer created by the construction of Schützenberger is exponentially larger. Our algorithm can be used effectively in many applications to make automata and transducers more efficient to use.

Degrees of Infinite Words, Polynomials and Atoms

International Journal of Foundations of Computer Science ◽

10.1142/s0129054118420066 ◽

2018 ◽

Vol 29 (05) ◽

pp. 825-843 ◽

Cited By ~ 1

Author(s):

Jörg Endrullis ◽

Juhani Karhumäki ◽

Jan Willem Klop ◽

Aleksi Saarela

Keyword(s):

Finite Automata ◽

Classification Problem ◽

Turing Degrees ◽

Turing Machines ◽

Infinite Words ◽

Fine Grained ◽

Finite State Transducers ◽

Pure Mathematics ◽

Wide Range ◽

Finite State

We study finite-state transducers and their power for transforming infinite words. Infinite sequences of symbols are of paramount importance in a wide range of fields, from formal languages to pure mathematics and physics. While finite automata for recognising and transforming languages are well-understood, very little is known about the power of automata to transform infinite words. The word transformation realised by finite-state transducers gives rise to a complexity comparison of words and thereby induces equivalence classes, called (transducer) degrees, and a partial order on these degrees. The ensuing hierarchy of degrees is analogous to the recursion-theoretic degrees of unsolvability, also known as Turing degrees, where the transformational devices are Turing machines. However, as a complexity measure, Turing machines are too strong: they trivialise the classification problem by identifying all computable words. Finite-state transducers give rise to a much more fine-grained, discriminating hierarchy. In contrast to Turing degrees, hardly anything is known about transducer degrees, in spite of their naturality. We use methods from linear algebra and analysis to show that there are infinitely many atoms in the transducer degrees, that is, minimal non-trivial degrees.

Composition of weighted finite transducers in MapReduce

Journal Of Big Data ◽

10.1186/s40537-020-00397-4 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Bilal Elghadyry ◽

Faissal Ouardi ◽

Sébastien Verel

Keyword(s):

Speech Processing ◽

Large Scale ◽

Large Scale Data ◽

Finite State Transducers ◽

Wide Range ◽

Finite State ◽

Common Operation ◽

Efficient Representation ◽

Weighted Finite State Transducers ◽

Np Hardness

Weighted finite-state transducers have been shown to be a general and efficient representation in many applications such as text and speech processing, computational biology, and machine learning. The composition of weighted finite-state transducers is a fundamental operation common to these applications. The NP-hardness of the composition computation problem presents a challenge that leads us to devise efficient large-scale algorithms for the case of more than two transducers. This paper describes a parallel computation of the composition of weighted finite transducers in the MapReduce framework. To the best of our knowledge, this paper is the first to tackle this task using MapReduce methods. First, we analyze the communication cost of this problem using the model of Afrati et al. Then, we propose three MapReduce methods based respectively on input alphabet mapping, state mapping, and hybrid mapping. Finally, intensive experiments on a wide range of weighted finite-state transducers are conducted to compare the proposed methods and show their efficiency for large-scale data.
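To make the composition operation concrete, here is a minimal sequential sketch for two weighted transducers without ε-labels over the tropical semiring (weights are added along a path). It is not one of the MapReduce methods proposed in the paper. The dictionary layout (`start`, `finals`, `arcs`) and the function name `compose` are assumptions made for this example.

```python
from collections import deque


def compose(t1, t2):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Assumed layout for each transducer (hypothetical, for this sketch):
      {"start": state,
       "finals": {state: final_weight},
       "arcs": {state: [(in_label, out_label, weight, next_state), ...]}}
    """
    start = (t1["start"], t2["start"])
    arcs, finals = {}, {}
    seen = {start}
    queue = deque([start])
    while queue:
        q1, q2 = state = queue.popleft()
        # A pair is final when both components are final; weights add up.
        if q1 in t1["finals"] and q2 in t2["finals"]:
            finals[state] = t1["finals"][q1] + t2["finals"][q2]
        for a, b, w1, n1 in t1["arcs"].get(q1, []):
            for b2, c, w2, n2 in t2["arcs"].get(q2, []):
                # Match the output label of t1 with the input label of t2.
                if b != b2:
                    continue
                nxt = (n1, n2)
                arcs.setdefault(state, []).append((a, c, w1 + w2, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


# Usage: t1 maps "a" to "b" with weight 1.0, t2 maps "b" to "c" with weight 2.0.
t1 = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [("a", "b", 1.0, 1)]}}
t2 = {"start": 0, "finals": {1: 0.0}, "arcs": {0: [("b", "c", 2.0, 1)]}}
composed = compose(t1, t2)
```

Only pairs of states reachable from the start pair are built, so the result can be much smaller than the full product. Handling ε-labels correctly requires an additional filter, which this sketch omits.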

Detection of somatic mosaicism and classification of Fanconi anemia patients by analysis of the FA/BRCA pathway

Blood ◽

10.1182/blood-2004-05-1852 ◽

2005 ◽

Vol 105 (3) ◽

pp. 1329-1336 ◽

Cited By ~ 78

Author(s):

Jean Soulier ◽

Thierry Leblanc ◽

Jérôme Larghero ◽

Hélène Dastot ◽

Akiko Shimamura ◽

...

Keyword(s):

Fanconi Anemia ◽

Cancer Susceptibility ◽

Bone Marrow Failure ◽

Somatic Mosaicism ◽

Chromosome Breakage ◽

Peripheral Blood Lymphocytes ◽

Specific Analysis ◽

Number Of Patients ◽

Chromosome Fragility

Fanconi anemia (FA) is characterized by congenital abnormalities, bone marrow failure, chromosome fragility, and cancer susceptibility. Eight FA-associated genes have been identified so far, the products of which function in the FA/BRCA pathway. A key event in the pathway is the monoubiquitination of the FANCD2 protein, which depends on a multiprotein FA core complex. In a number of patients, spontaneous genetic reversion can correct FA mutations, leading to somatic mosaicism. We analyzed the FA/BRCA pathway in 53 FA patients by FANCD2 immunoblots and chromosome breakage tests. Strikingly, FANCD2 monoubiquitination was detected in peripheral blood lymphocytes (PBLs) in 8 (15%) patients. FA reversion was further shown in these patients by comparison of primary fibroblasts and PBLs. Reversion was associated with higher blood counts and clinical stability or improvement. Once constitutional FANCD2 patterns were determined, patients could be classified based on the level of FA/BRCA pathway disruption, as “FA core” (upstream inactivation; n = 47, 89%), FA-D2 (n = 4, 8%), and an unidentified downstream group (n = 2, 4%). FA-D2 and unidentified group patients were therefore relatively common, and they had more severe congenital phenotypes. These results show that specific analysis of the FA/BRCA pathway, combined with clinical and chromosome breakage data, allows a comprehensive characterization of FA patients.

BORDERS AND FINITE AUTOMATA

International Journal of Foundations of Computer Science ◽

10.1142/s0129054107005029 ◽

2007 ◽

Vol 18 (04) ◽

pp. 859-871

Author(s):

MARTIN ŠIMŮNEK ◽

BOŘIVOJ MELICHAR

Keyword(s):

Pattern Matching ◽

Hamming Distance ◽

Finite Automata ◽

Music Analysis ◽

Theoretical Description ◽

Specific Form ◽

Distance Measures ◽

Computer Assisted ◽

Finite State ◽

Finite State Transducer

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic stringology keystones, used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis, and other areas. The paper offers an automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm. The algorithm finds all borders of a string with don't care symbols. We show that the ALL_BORDERS algorithm is an implementation of a finite state transducer of a specific form. We describe how such a transducer can be constructed and what the input string should look like. The described transducer finds the set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string under the Hamming distance. Our solution to this problem is again based on transducers. This allows us to use an analogy with automata-based pattern matching methods. Finally, we discuss conditions under which the same principle can be used for other distance measures.
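The definition of a border can be illustrated with a short sketch. The code below is not the ALL_BORDERS algorithm or the transducer construction from the paper, and it does not handle don't care symbols. It computes exact border lengths with the classical prefix (failure) function, and approximate border lengths with a naive quadratic comparison under the Hamming distance. Both function names are hypothetical, and only nonempty proper borders are reported, which is an assumption of this example.

```python
def border_lengths(s):
    """Lengths of all nonempty proper borders of s, in increasing order."""
    n = len(s)
    if n == 0:
        return []
    # fail[i] = length of the longest proper border of s[:i + 1].
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and s[i] != s[k]:
            k = fail[k - 1]
        if s[i] == s[k]:
            k += 1
        fail[i] = k
    # Every border of s is a border of its longest border, so follow the chain.
    lengths = []
    k = fail[-1]
    while k > 0:
        lengths.append(k)
        k = fail[k - 1]
    return sorted(lengths)


def approximate_border_lengths(s, max_mismatches):
    """Lengths l < len(s) where the prefix and suffix of length l
    differ in at most max_mismatches positions (Hamming distance)."""
    n = len(s)
    result = []
    for length in range(1, n):
        prefix, suffix = s[:length], s[n - length:]
        mismatches = sum(a != b for a, b in zip(prefix, suffix))
        if mismatches <= max_mismatches:
            result.append(length)
    return result


print(border_lengths("abacaba"))                # [1, 3]
print(approximate_border_lengths("abcabd", 1))  # [1, 3]
```

In the first call the borders are "a" and "aba". In the second call the string has no exact border, but allowing one mismatch admits lengths 1 ("a" against "d") and 3 ("abc" against "abd").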

Finite State Transducers (Javier Baliosian and Dina Wonsever)

Handbook of Finite State Based Models and Applications ◽

10.1201/b13055-8 ◽

2016 ◽

pp. 57-80 ◽

Cited By ~ 1

Keyword(s):

Finite State Transducers ◽

Finite State

European language translation with weighted finite state transducers

Proceedings of the Third Workshop on Statistical Machine Translation - StatMT '08 ◽

10.3115/1626394.1626410 ◽

2008 ◽

Cited By ~ 1

Author(s):

Graeme Blackwood ◽

Adrià de Gispert ◽

Jamie Brunning ◽

William Byrne

Keyword(s):

Language Translation ◽

European Language ◽

Finite State Transducers ◽

Finite State ◽

Weighted Finite State Transducers

RULE-BASED SYLLABIFICATION OF KOREAN WORDS WRITTEN IN LATIN USING DETERMINISTIC FINITE AUTOMATA MODELS

Jurnal Terapan Teknologi Informasi ◽

10.21460/jutei.2018.21.77 ◽

2018 ◽

Vol 2 (1) ◽

pp. 75-85

Author(s):

Rouly Doharma Sihite ◽

Aditya Wikan Mahastama

Keyword(s):

Success Rate ◽

Statistical Approach ◽

Markov Models ◽

Finite Automata ◽

Writing System ◽

Test Results ◽

Writing Systems ◽

Research Focus ◽

Finite State ◽

Catch Up

Transliteration remains a challenge in helping people read or write across writing systems. Korean transliteration has been a topic of research to automate the conversion between Hangul (the Korean writing system) and Latin characters. Previous work has transliterated Hangul to Latin using a statistical approach (72.2% accuracy) and Extended Markov Models (54.9% accuracy). This research focuses on transliterating Latin (romanised) Korean words into Hangul, as many learners of Korean start out using Latin characters. The selected method models the probable vowel and consonant forms and the probable vowel and consonant sequences using finite-state automata, which avoids training. These models are then coded into rules, which are applied to and tested on 100 random Korean words. The initial test yields only a 40% success rate, because consonants have to be labeled as the initial or final of a syllable and some consonants were missed by the modeled rules. Additional rules are then added to catch these consonants and merge them into existing proper syllables, which increases the success rate to 92%. Further analysis of this result shows that certain consonant sequences cause syllabification problems when they occur in certain positions. Further rules were added, yielding a final success rate of 99%, which is also the accuracy of transliterating Korean words written in Latin into Hangul characters in compound syllables.