finite state transducers
Recently Published Documents


TOTAL DOCUMENTS

214
(FIVE YEARS 22)

H-INDEX

16
(FIVE YEARS 2)

2021 ◽  
Vol 9 ◽  
pp. 261-276
Author(s):  
Hyunji Hayley Park ◽  
Katherine J. Zhang ◽  
Coleman Haley ◽  
Kenneth Steimel ◽  
Han Liu ◽  
...  

Abstract Prior studies in multilingual language modeling (e.g., Cotterell et al., 2018; Mielke et al., 2019) disagree on whether or not inflectional morphology makes languages harder to model. We attempt to resolve the disagreement and extend those studies. We compile a larger corpus of 145 Bible translations in 92 languages and a larger number of typological features.1 We fill in missing typological data for several languages and consider corpus-based measures of morphological complexity in addition to expert-produced typological features. We find that several morphological measures are significantly associated with higher surprisal when LSTM models are trained with BPE-segmented data. We also investigate linguistically motivated subword segmentation strategies like Morfessor and Finite-State Transducers (FSTs) and find that these segmentation strategies yield better performance and reduce the impact of a language’s morphology on language modeling.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Bilal Elghadyry ◽  
Faissal Ouardi ◽  
Sébastien Verel

AbstractWeighted finite-state transducers have been shown to be a general and efficient representation in many applications such as text and speech processing, computational biology, and machine learning. The composition of weighted finite-state transducers constitutes a fundamental and common operation between these applications. The NP-hardness of the composition computation problem presents a challenge that leads us to devise efficient algorithms on a large scale when considering more than two transducers. This paper describes a parallel computation of weighted finite transducers composition in MapReduce framework. To the best of our knowledge, this paper is the first to tackle this task using MapReduce methods. First, we analyze the communication cost of this problem using Afrati et al. model. Then, we propose three MapReduce methods based respectively on input alphabet mapping, state mapping, and hybrid mapping. Finally, intensive experiments on a wide range of weighted finite-state transducers are conducted to compare the proposed methods and show their efficiency for large-scale data.


2021 ◽  
Vol 55 ◽  
pp. 5
Author(s):  
Benedek Nagy ◽  
Zita Kovács

Finite automata and finite state transducers belong to the bases of (theoretical) computer science with many applications. On the other hand, DNA computing and related bio-inspired paradigms are relatively new fields of computing. Watson–Crick automata are in the intersection of the above fields. These finite automata have two reading heads as they read the upper and lower strands of the input DNA molecule, respectively. In 5′ → 3′ Watson–Crick automata the two reading heads move in the same biochemical direction, that is, from the 5′ end of the strand to the direction of the 3′ end. However, in the double-stranded DNA, the DNA strands are directed in opposite way to each other, therefore 5′ → 3′ Watson–Crick automata read the input from the two extremes. In sensing 5′ → 3′ automata the automata sense if the two heads are at the same position, moreover, the computing process is finished at that time. Based on this class of automata, we define WK transducers such that, at each transition, exactly one input letter is being processed, and exactly one output letter is written on a normal output tape. Some special cases are defined and analyzed, e.g., when only one of the reading heads is being used and when the transducer has only one state. We also show that the minimal transducer is uniquely defined if the transducer is deterministic and it has marked output, i.e., the output letter written in a step identifies the reading head that is used in that transition. We have also used the functions ‘processing order’ and ‘reading heads’ to analyze these transducers.


2021 ◽  
pp. 104691
Author(s):  
Martin Kutrib ◽  
Andreas Malcher ◽  
Carlo Mereghetti ◽  
Beatrice Palano

2021 ◽  
Vol 55 ◽  
pp. 6
Author(s):  
Martin Kutrib ◽  
Andreas Malcher

Input-driven pushdown automata (IDPDA) are pushdown automata where the next action on the pushdown store (push, pop, nothing) is solely governed by the input symbol. Nowadays such devices are usually defined such that popping from the empty pushdown does not block the computation but continues it with empty pushdown. Here, we consider IDPDAs that have a more balanced behavior concerning pushing and popping. Digging input-driven pushdown automata (DIDPDA) are basically IDPDAs that, when forced to pop from the empty pushdown, dig a hole of the shape of the popped symbol in the bottom of the pushdown. Popping further symbols from a pushdown having a hole at the bottom deepens the current hole furthermore. The hole can only be filled up by pushing symbols previously popped. We study the impact of the new behavior of DIDPDAs on their power and compare their capacities with the capacities of ordinary IDPDAs and tinput-driven pushdown automata which are basically IDPDAs whose input may be preprocessed by length-preserving finite state transducers. It turns out that the capabilities are incomparable. We address the determinization of DIDPDAs and their descriptional complexity, closure properties, and decidability questions.


2020 ◽  
Author(s):  
Bilal Elghadyry ◽  
Faissal Ouardi ◽  
Sébastien Verel

Abstract Weighted finite-state transducers have been shown to be a general and efficient representation in many applications such as text and speech processing, computational biology, and machine learning. The composition of weighted finite-state transducers constitutes a fundamental and common operation between these applications. The NP-hardness of the composition computation problem presents a challenge that leads us to devise efficient algorithms on a large scale when considering more than two transducers. This paper describes a parallel computation of weighted finite transducers composition in MapReduce framework. To the best of our knowledge, this paper is the first to tackle this task using MapReduce methods. First, we analyze the communication cost of this problem using Afrati et al. model. Then, we propose three MapReduce methods based respectively on input alphabet mapping, state mapping, and hybrid mapping. Finally, intensive experiments on a wide range of weighted finite-state transducers are conducted to compare the proposed methods and show their efficiency for large-scale data.


Sign in / Sign up

Export Citation Format

Share Document