Computing the Expected Edit Distance from a String to a Probabilistic Finite-State Automaton

2017 ◽  
Vol 28 (05) ◽  
pp. 603-621 ◽  
Author(s):  
Jorge Calvo-Zaragoza ◽  
Jose Oncina ◽  
Colin de la Higuera

In a number of fields, it is necessary to compare a witness string with a distribution. One possibility is to compute the probability of the string for that distribution. Another, giving a more global view, is to compute the expected edit distance from a string randomly drawn to the witness string. This number is often used to measure the performance of a prediction, the goal then being to return the median string, or the string with smallest expected distance. To be able to measure this, computing the distance between a hypothesis and that distribution is necessary. This paper proposes two solutions for computing this value, when the distribution is defined with a probabilistic finite state automaton. The first is exact but has a cost which can be exponential in the length of the input string, whereas the second is a fully polynomial-time randomized schema.
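As a rough illustration of the quantity being discussed, the expected edit distance can be estimated by plain Monte Carlo sampling: draw strings from the automaton, measure each one against the witness, and average. The sketch below is not the paper's exact algorithm or its randomized scheme. The dictionary encoding of the automaton, the `TOY_PFA` example, and all function names are assumptions made for illustration, and unit edit costs are assumed.

```python
import random

def edit_distance(a, b):
    """Levenshtein distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # match or substitute
        prev = curr
    return prev[-1]

# Hypothetical toy automaton: each state maps to (probability, symbol, next_state)
# options. A symbol of None means "stop here". Probabilities per state sum to 1.
TOY_PFA = {
    0: [(0.5, "a", 0), (0.3, "b", 1), (0.2, None, None)],
    1: [(0.6, "b", 1), (0.4, None, None)],
}

def sample_string(pfa, rng, start=0):
    """Draw one string by walking the automaton until a stop option is chosen."""
    state, out = start, []
    while True:
        r, acc = rng.random(), 0.0
        for p, sym, nxt in pfa[state]:
            acc += p
            if r < acc:
                break
        # If rounding prevents a break, the last option of the state is used.
        if sym is None:
            return "".join(out)
        out.append(sym)
        state = nxt

def estimate_expected_distance(pfa, witness, n_samples=10000, seed=0):
    """Average edit distance between the witness and n_samples sampled strings."""
    rng = random.Random(seed)
    total = sum(edit_distance(sample_string(pfa, rng), witness)
                for _ in range(n_samples))
    return total / n_samples
```

Calling `estimate_expected_distance(TOY_PFA, "ab")` returns a sample mean whose accuracy depends on `n_samples`. This naive estimator carries no approximation guarantee of its own.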

2006 ◽  
Vol 17 (02) ◽  
pp. 379-393 ◽  
Author(s):  
YO-SUB HAN ◽  
YAJUN WANG ◽  
DERICK WOOD

We study infix-free regular languages. We observe the structural properties of finite-state automata for infix-free languages and develop a polynomial-time algorithm to determine infix-freeness of a regular language using state-pair graphs. We consider two cases: 1) A language is specified by a nondeterministic finite-state automaton and 2) a language is specified by a regular expression. Furthermore, we examine the prime infix-free decomposition of infix-free regular languages and design an algorithm for the infix-free primality test of an infix-free regular language. Moreover, we show that we can compute the prime infix-free decomposition in polynomial time. We also demonstrate that the prime infix-free decomposition is not unique.
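To make the property concrete: a language is infix-free when no word in it occurs as a contiguous substring of a different word in it. The sketch below checks this by brute force on a finite set of words only. It is not the state-pair-graph algorithm from the abstract, which handles arbitrary regular languages, and the function name is an assumption.

```python
def is_infix_free(language):
    """Return True if no word of the finite language is an infix of another word."""
    words = set(language)
    # `x in y` is Python's substring test; x != y excludes a word matching itself.
    return not any(x != y and x in y for x in words for y in words)
```

For instance, `{"ab", "ba"}` passes the check, while `{"ab", "cabd"}` fails it because `ab` occurs inside `cabd`.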


Geophysics ◽  
1995 ◽  
Vol 60 (5) ◽  
pp. 1541-1549
Author(s):  
Kou‐Yuan Huang ◽  
Dar‐Ren Leu

Syntactic pattern recognition techniques are applied to the analysis of 1-D seismic traces to classify Ricker wavelets. Seismic Ricker wavelets have structural information, and each wavelet can be represented by a string of symbols. To recognize the strings, we use a finite‐state automaton to identify each string. The automaton can accept strings having substitution, insertion, and deletion errors of the symbols. There are two attributes, terminal symbol and weight, in each transition of the automaton. A minimum‐cost, error‐correcting, finite‐state automaton is proposed to parse the input string.
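One way to picture minimum-cost error-correcting parsing is as a shortest-path search over pairs (input position, automaton state), where substitution, insertion, and deletion errors each add a penalty to the transition weight. The sketch below uses Dijkstra's algorithm for that search. It is an illustration under stated assumptions rather than the authors' procedure: the transition encoding, the function name, and the unit error costs `sub`, `ins`, and `dele` are all hypothetical, and states are assumed to be mutually comparable (for example, all integers).

```python
import heapq

def min_cost_parse(transitions, start, finals, text, sub=1.0, ins=1.0, dele=1.0):
    """Cheapest way to read `text` and end in a final state, or None if impossible.

    transitions: dict mapping state -> list of (symbol, weight, next_state).
    All weights and error costs are assumed to be non-negative.
    """
    n = len(text)
    best = {(0, start): 0.0}
    heap = [(0.0, 0, start)]
    while heap:
        cost, i, q = heapq.heappop(heap)
        if cost > best.get((i, q), float("inf")):
            continue  # stale heap entry
        if i == n and q in finals:
            return cost
        moves = []
        if i < n:
            # Insertion error: the input has an extra symbol, skip it.
            moves.append((cost + ins, i + 1, q))
        for sym, weight, nxt in transitions.get(q, []):
            # Deletion error: the expected symbol is missing from the input.
            moves.append((cost + weight + dele, i, nxt))
            if i < n:
                # Exact match, or substitution error if the symbols differ.
                penalty = 0.0 if text[i] == sym else sub
                moves.append((cost + weight + penalty, i + 1, nxt))
        for c, j, s in moves:
            if c < best.get((j, s), float("inf")):
                best[(j, s)] = c
                heapq.heappush(heap, (c, j, s))
    return None
```

With a three-transition automaton that spells `abc` using zero weights, the strings `axc`, `ac`, and `abbc` each parse at cost 1.0, one per error type.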


2009 ◽  
Vol 30 (5) ◽  
pp. 1343-1369 ◽  
Author(s):  
DANNY CALEGARI ◽  
KOJI FUJIWARA

A function on a discrete group is weakly combable if its discrete derivative with respect to a combing can be calculated by a finite-state automaton. A weakly combable function is bicombable if it is Lipschitz in both the left- and right-invariant word metrics. Examples of bicombable functions on word-hyperbolic groups include: (1) homomorphisms to ℤ; (2) word length with respect to a finite generating set; (3) most known explicit constructions of quasimorphisms (e.g. the Epstein–Fujiwara counting quasimorphisms). We show that bicombable functions on word-hyperbolic groups satisfy a central limit theorem: if $\overline{\phi}_n$ is the value of $\phi$ on a random element of word length $n$ (in a certain sense), there are $E$ and $\sigma$ for which there is convergence in the sense of distribution $n^{-1/2}(\overline{\phi}_n - nE) \to N(0,\sigma)$, where $N(0,\sigma)$ denotes the normal distribution with standard deviation $\sigma$. As a corollary, we show that if $S_1$ and $S_2$ are any two finite generating sets for $G$, there is an algebraic number $\lambda_{1,2}$ depending on $S_1$ and $S_2$ such that almost every word of length $n$ in the $S_1$ metric has word length $n \cdot \lambda_{1,2}$ in the $S_2$ metric, with error of size $O(\sqrt{n})$.


2021 ◽  
Vol 178 (1-2) ◽  
pp. 59-76
Author(s):  
Emmanuel Filiot ◽  
Pierre-Alain Reynier

Copyless streaming string transducers (copyless SST) were introduced by R. Alur and P. Černý in 2010 as a one-way deterministic automata model to define transductions of finite strings. Copyless SST extend deterministic finite-state automata with a set of variables in which to store intermediate output strings, and those variables can be combined and updated all along the run, in a linear manner, i.e., no variable content can be copied on transitions. It is known that copyless SST capture exactly the class of MSO-definable string-to-string transductions, and are as expressive as deterministic two-way transducers. They enjoy good algorithmic properties. Most notably, they have a decidable equivalence problem (in PSpace). On the other hand, HDT0L systems have been studied for a long time, the most prominent result being the decidability of the equivalence problem. In this paper, we propose a semantics of HDT0L systems in terms of transductions, and use it to study the class of deterministic copyful SST. Our contributions are as follows: (i) HDT0L systems and total deterministic copyful SST have the same expressive power; (ii) the equivalence problem for deterministic copyful SST and the equivalence problem for HDT0L systems are inter-reducible, in quadratic time; as a consequence, equivalence of deterministic SST is decidable; (iii) the functionality of non-deterministic copyful SST is decidable; (iv) determining whether a non-deterministic copyful SST can be transformed into an equivalent non-deterministic copyless SST is decidable in polynomial time.
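The difference between copyless and copyful updates can be seen on a deliberately simplified model. The sketch below assumes a transducer with a single state, so each input letter simply selects one variable update, and it assumes that every update assigns every variable. The token encoding (uppercase names for variables, other strings for output letters) and all function names are hypothetical. An update counts as copyless here when each variable occurs at most once across all of its right-hand sides taken together.

```python
def is_copyless(update, variables):
    """True if every variable is used at most once across all right-hand sides."""
    uses = [tok for rhs in update.values() for tok in rhs if tok in variables]
    return len(uses) == len(set(uses))

def apply_update(valuation, update):
    """Apply a simultaneous update: right-hand sides read the old valuation."""
    return {x: "".join(valuation.get(tok, tok) for tok in rhs)
            for x, rhs in update.items()}

def run_single_state_sst(updates, variables, output_var, text):
    """Run a single-state SST: updates maps each input letter to an update."""
    valuation = {x: "" for x in variables}
    for letter in text:
        valuation = apply_update(valuation, updates[letter])
    return valuation[output_var]

# Copyless example: string reversal, using the update X := letter . X
REVERSE = {c: {"X": [c, "X"]} for c in "abc"}

# Copyful example: X := X . X . a, which uses X twice on the right-hand side
DOUBLING = {"a": {"X": ["X", "X", "a"]}}
```

Here `run_single_state_sst(REVERSE, {"X"}, "X", "abc")` yields the reversed input, and `is_copyless` accepts every `REVERSE` update while rejecting the `DOUBLING` one.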

