The Maximal Number of Runs in Standard Sturmian Words

10.37236/2473 ◽  
2013 ◽  
Vol 20 (1) ◽  
Author(s):  
Paweł Baturo ◽  
Marcin Piątkowski ◽  
Wojciech Rytter

We investigate some repetition problems for a very special class $\mathcal{S}$ of strings called the standard Sturmian words, which have very compact representations in terms of sequences of integers. Usually the size of such a word is exponential with respect to the size of its integer sequence, hence we are dealing with repetition problems in compressed strings. An explicit formula is given for the number $\rho(w)$ of runs in a standard word $w$. We show that $\rho(w)/|w|\le 4/5$ for each $w\in \mathcal{S}$, and that there is an infinite sequence of strictly growing words $w_k\in \mathcal{S}$ such that $\lim_{k\rightarrow \infty} \frac{\rho(w_k)}{|w_k|} = \frac{4}{5}$. Moreover, we show how to compute the number of runs in a standard Sturmian word in linear time with respect to the size of its compressed representation.
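As a rough illustration of the compressed representation, the sketch below expands an integer sequence into a standard word and counts its runs by brute force. The recurrence $x_{-1}=b$, $x_0=a$, $x_{k+1}=x_k^{\gamma_k}x_{k-1}$ is the usual textbook definition and is an assumption here, since the abstract does not state it; the names `standard_word` and `count_runs` are hypothetical. The counter works on the expanded word in quadratic time, so it is not the linear-time method described in the abstract.

```python
def standard_word(gammas):
    """Expand a directive sequence (gamma_0, ..., gamma_{n-1}) into a word.

    Assumed recurrence: x_{-1} = "b", x_0 = "a", x_{k+1} = x_k^gamma_k + x_{k-1}.
    """
    prev, cur = "b", "a"
    for g in gammas:
        prev, cur = cur, cur * g + prev
    return cur


def count_runs(s):
    """Count runs (maximal repetitions of exponent >= 2) by brute force."""
    n = len(s)
    runs = set()
    for p in range(1, n // 2 + 1):
        k = 0
        while k + p < n:
            if s[k] != s[k + p]:
                k += 1
                continue
            start = k
            # Extend the stretch of positions k with s[k] == s[k + p].
            while k + p < n and s[k] == s[k + p]:
                k += 1
            # s[start .. k-1+p] is a maximal segment with period p;
            # it is a repetition only if its length is at least 2p.
            if k - start >= p:
                # Store the interval only: a segment found again with a
                # multiple of its smallest period is the same run.
                runs.add((start, k - 1 + p))
    return len(runs)


w = standard_word([1, 1, 1, 1])   # "abaababa", a Fibonacci word
print(w, count_runs(w), count_runs(w) / len(w))
```

For the directive sequence (1, 1, 1, 1) this gives the word abaababa with three runs (aa, ababa and abaaba), a ratio of 3/8.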

2009 ◽  
Vol 20 (06) ◽  
pp. 1005-1023 ◽  
Author(s):  
Paweł Baturo ◽  
Marcin Piątkowski ◽  
Wojciech Rytter

The class of finite Sturmian words consists of words having a particularly simple compressed representation, which is a generalization of the Fibonacci recurrence for Fibonacci words. The subword graphs of these words (especially their compacted versions) have a very special regular structure. In this paper we investigate this structure in more detail than in previous papers and show how several syntactical properties of Sturmian words follow from their graph properties. Consequently, simple alternative graph-based proofs of several known facts are presented. The very special structure of subword graphs also leads to simple algorithms for computing some parameters of Sturmian words: the number of subwords, the critical factorization point, lexicographically maximal suffixes, occurrences of subwords of a fixed length, and right special factors. These algorithms work in linear time with respect to n, the size of the compressed representation of the standard word, though the words themselves can be of exponential size with respect to n. Some of the computed parameters can also be of exponential size; however, we provide their linear-size compressed representations. We also introduce a new concept related to standard words: Ostrowski automata.


2019 ◽  
Vol 29 (02) ◽  
pp. 161-187
Author(s):  
Joachim Gudmundsson ◽  
Majid Mirzanezhad ◽  
Ali Mohades ◽  
Carola Wenk

Computing the Fréchet distance between two polygonal curves takes roughly quadratic time. In this paper, we show that for a special class of curves the Fréchet distance computations become easier. Let [Formula: see text] and [Formula: see text] be two polygonal curves in [Formula: see text] with [Formula: see text] and [Formula: see text] vertices, respectively. We prove four results for the case when all edges of both curves are long compared to the Fréchet distance between them: (1) a linear-time algorithm for deciding the Fréchet distance between two curves, (2) an algorithm that computes the Fréchet distance in [Formula: see text] time, (3) a linear-time [Formula: see text]-approximation algorithm, and (4) a data structure that supports [Formula: see text]-time decision queries, where [Formula: see text] is the number of vertices of the query curve and [Formula: see text] the number of vertices of the preprocessed curve.
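To make the quadratic baseline concrete, here is a minimal sketch of the discrete Fréchet distance, a simpler vertex-only variant and not the continuous distance the abstract refers to. It fills an m-by-n table by dynamic programming, so it runs in time proportional to the product of the two vertex counts. The function name `discrete_frechet` and the sample curves are assumptions made for illustration only.

```python
from math import dist


def discrete_frechet(P, Q):
    """Discrete Frechet distance between vertex sequences P and Q.

    table[i][j] is the smallest leash length needed to walk both
    sequences from their first vertices to P[i] and Q[j].
    """
    n, m = len(P), len(Q)
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(P[i], Q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                best_prev = min(table[i - 1][j],
                                table[i - 1][j - 1],
                                table[i][j - 1])
                table[i][j] = max(best_prev, d)
    return table[n - 1][m - 1]


# Two parallel horizontal curves one unit apart.
P = [(0, 0), (1, 0), (2, 0)]
Q = [(0, 1), (1, 1), (2, 1)]
print(discrete_frechet(P, Q))  # 1.0
```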


2017 ◽  
Vol 43 (2) ◽  
pp. 311-347 ◽  
Author(s):  
Miguel Ballesteros ◽  
Chris Dyer ◽  
Yoav Goldberg ◽  
Noah A. Smith

We introduce a greedy transition-based parser that learns to represent parser states using recurrent neural networks. Our primary innovation that enables us to do this efficiently is a new control structure for sequential neural networks—the stack long short-term memory unit (LSTM). Like the conventional stack data structures used in transition-based parsers, elements can be pushed to or popped from the top of the stack in constant time, but, in addition, an LSTM maintains a continuous space embedding of the stack contents. Our model captures three facets of the parser's state: (i) unbounded look-ahead into the buffer of incoming words, (ii) the complete history of transition actions taken by the parser, and (iii) the complete contents of the stack of partially built tree fragments, including their internal structures. In addition, we compare two different word representations: (i) standard word vectors based on look-up tables and (ii) character-based models of words. Although standard word embedding models work well in all languages, the character-based models improve the handling of out-of-vocabulary words, particularly in morphologically rich languages. Finally, we discuss the use of dynamic oracles in training the parser. During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time. Training our model with dynamic oracles yields a linear-time greedy parser with very competitive performance.


2018 ◽  
Vol 33 (2) ◽  
pp. 163
Author(s):  
Goubi Mouloud

In this work, we define and study the generalized class of Catalan’s polynomials. Thereafter we connect them to the class of Humbert’s polynomials and recover the Humbert recurrence relation [5]. This idea helps us to define a new class of generalized Humbert’s polynomials different from those given by H. W. Gould [4] and P. N. Shrivastava [9]. Finally, we establish an explicit formula for a special class of generalized Catalan’s polynomials and obtain two useful combinatorial identities.


Author(s):  
Christophe Reutenauer

Christoffel introduced in 1875 a special class of words on a binary alphabet, linked to continued fractions. Some years later Markoff published his famous theory, now called Markoff theory. It characterizes certain quadratic forms and certain real numbers by extremal inequalities. Both classes are constructed by using certain natural numbers, called Markoff numbers; they are characterized by a certain diophantine equality. More basically, they are constructed using certain words, essentially the Christoffel words. The link between Christoffel words and the theory of Markoff was noted by Frobenius. Motivated by this link, the book presents the classical theory of Markoff in its two aspects, based on the theory of Christoffel words. This is done in Part I of the book. Part II gives the more advanced and recent results of the theory of Christoffel words: palindromes (central words), periods, Lyndon words, Stern–Brocot tree, semi-convergents of rational numbers and finite continued fractions, geometric interpretations, conjugation, factors of Christoffel words, finite Sturmian words, free group on two generators, bases, inner automorphisms, Christoffel bases, Nielsen’s criterion, Sturmian morphisms, and positive automorphisms of this free group.


Author(s):  
Mai Alzamel ◽  
Lorraine A.K. Ayad ◽  
Giulia Bernardini ◽  
Roberto Grossi ◽  
Costas S. Iliopoulos ◽  
...  

Uncertain sequences are compact representations of sets of similar strings. They highlight common segments by collapsing them, and explicitly represent varying segments by listing all possible options. A generalized degenerate string (GD string) is a type of uncertain sequence. Formally, a GD string Ŝ is a sequence of $n$ sets of strings of total size $N$, where the $i$th set contains strings of the same length $k_i$ but this length can vary between different sets. We denote by $W$ the sum of these lengths $k_0, k_1, \ldots, k_{n-1}$. Our main result is an $O(N + M)$-time algorithm for deciding whether two GD strings of total sizes $N$ and $M$, respectively, over an integer alphabet, have a non-empty intersection. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in linear space. We then apply our string comparison tool to devise a simple algorithm for computing all palindromes in Ŝ in $O(\min\{W, n^2\}N)$ time. We complement this upper bound by showing a similar conditional lower bound for computing maximal palindromes in Ŝ. We also show that a result, which is essentially the same as our string comparison linear-time algorithm, can be obtained by employing an automata-based approach.
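The definitions can be made concrete with a small sketch. It models a GD string as a list of sets of equal-length strings and computes the intersection of two GD strings by enumerating every string each one represents. This brute-force enumeration is exponential in general and is only meant to show what is being decided; it is not the linear-time algorithm of the abstract. The names `gd_language`, `gd_sizes` and `gd_intersection` and the sample data are hypothetical.

```python
from itertools import product


def gd_sizes(gd):
    """Return (N, W): total size of all strings, and the sum of set lengths k_i."""
    for block in gd:
        if len({len(s) for s in block}) != 1:
            raise ValueError("strings in one set must share the same length")
    total_size = sum(len(s) for block in gd for s in block)
    width = sum(len(next(iter(block))) for block in gd)
    return total_size, width


def gd_language(gd):
    """All strings obtained by picking one option from each set, in order."""
    return {"".join(choice) for choice in product(*gd)}


def gd_intersection(gd1, gd2):
    """Brute-force intersection of the two represented sets of strings."""
    return gd_language(gd1) & gd_language(gd2)


# The set boundaries of the two GD strings need not line up.
S = [{"ab", "ba"}, {"c"}, {"a", "b"}]
T = [{"a", "b"}, {"bc", "ac"}, {"a"}]
print(gd_sizes(S))                   # (7, 4)
print(sorted(gd_intersection(S, T))) # ['abca', 'baca']
```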


10.37236/1614 ◽  
2000 ◽  
Vol 8 (2) ◽  
Author(s):  
Jane Pitman

An investigation of the size of $S+S$ for a finite Beatty sequence $S=(s_i)=(\lfloor i\alpha+\gamma \rfloor)$, where $\lfloor \hphantom{x} \rfloor$ denotes "floor", $\alpha$, $\gamma$ are real with $\alpha\ge 1$, and $0\le i \le k-1$ and $k\ge 3$. For $\alpha>2$, it is shown that $|S+S|$ depends on the number of "centres" of the Sturmian word $\Delta S=(s_i-s_{i-1})$, and hence that $3(k-1)\le |S+S|\le 4k-6$ if $S$ is not an arithmetic progression. A formula is obtained for the number of centres of certain finite periodic Sturmian words, and this leads to further information about $|S+S|$ in terms of finite nearest integer continued fractions.
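A small numerical check of the stated bounds can be sketched as follows. It builds the finite Beatty sequence with exact rational arithmetic and counts the distinct pairwise sums; the helper names `beatty` and `sumset_size` and the chosen parameters are assumptions made for illustration.

```python
from fractions import Fraction
from math import floor


def beatty(alpha, gamma, k):
    """Finite Beatty sequence s_i = floor(i * alpha + gamma), 0 <= i <= k - 1."""
    return [floor(i * alpha + gamma) for i in range(k)]


def sumset_size(S):
    """Number of distinct sums a + b with a and b taken from S."""
    return len({a + b for a in S for b in S})


alpha, gamma, k = Fraction(5, 2), Fraction(0), 4
S = beatty(alpha, gamma, k)          # [0, 2, 5, 7], not an arithmetic progression
size = sumset_size(S)                # 9
print(S, size, 3 * (k - 1) <= size <= 4 * k - 6)
```

Here $\alpha = 5/2 > 2$ and $k = 4$, so the bounds read $9 \le |S+S| \le 10$, and the computed value 9 sits at the lower end.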


10.37236/6074 ◽  
2017 ◽  
Vol 24 (1) ◽  
Author(s):  
Jarkko Peltomäki ◽  
Markus A. Whiteland

We introduce a square root map on Sturmian words and study its properties. Given a Sturmian word of slope $\alpha$, there exist exactly six minimal squares in its language (a minimal square does not have a square as a proper prefix). A Sturmian word $s$ of slope $\alpha$ can be written as a product of these six minimal squares: $s = X_1^2 X_2^2 X_3^2 \cdots$. The square root of $s$ is defined to be the word $\sqrt{s} = X_1 X_2 X_3 \cdots$. The main result of this paper is that $\sqrt{s}$ is also a Sturmian word of slope $\alpha$. Further, we characterize the Sturmian fixed points of the square root map, and we describe how to find the intercept of $\sqrt{s}$ and an occurrence of any prefix of $\sqrt{s}$ in $s$. Related to the square root map, we characterize the solutions of the word equation $X_1^2 X_2^2 \cdots X_n^2 = (X_1 X_2 \cdots X_n)^2$ in the language of Sturmian words of slope $\alpha$ where the words $X_i^2$ are minimal squares of slope $\alpha$. We also study the square root map in a more general setting. We explicitly construct an infinite set of non-Sturmian fixed points of the square root map. We show that the subshifts $\Omega$ generated by these words have a curious property: for all $w \in \Omega$ either $\sqrt{w} \in \Omega$ or $\sqrt{w}$ is periodic. In particular, the square root map can map an aperiodic word to a periodic word.
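The factorization into minimal squares can be tried on a finite prefix. Since a minimal square has no square as a proper prefix, it is the shortest square prefix at the current position, so the sketch below repeatedly strips the shortest square and keeps its root. The function names are hypothetical, the input is a finite prefix of the Fibonacci word chosen as an example, and any tail of the prefix that does not yet contain a full square is left unfactored.

```python
def shortest_square_prefix(s, start):
    """Length of the root X of the shortest square XX starting at `start`, or None."""
    half = 1
    while start + 2 * half <= len(s):
        if s[start:start + half] == s[start + half:start + 2 * half]:
            return half
        half += 1
    return None


def square_root_prefix(s):
    """Roots X_1, X_2, ... of the minimal squares found in the finite word s."""
    roots = []
    pos = 0
    while True:
        half = shortest_square_prefix(s, pos)
        if half is None:
            break
        roots.append(s[pos:pos + half])
        pos += 2 * half
    return roots


fib_prefix = "abaababaabaababaababa"   # first 21 letters of the Fibonacci word
roots = square_root_prefix(fib_prefix)
print(roots)            # ['aba', 'baa', 'ba', 'ab']
print("".join(roots))   # ababaabaab
```

On this prefix the factorization starts $(aba)^2 (baa)^2 (ba)^2 (ab)^2$, so the square root begins ababaabaab.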


Author(s):  
Ante Ćustić ◽  
Stefan Lendl

The Steiner path problem is a common generalization of the Steiner tree and the Hamiltonian path problem, in which we have to decide if for a given graph there exists a path visiting a fixed set of terminals. In the Steiner cycle problem we look for a cycle visiting all terminals instead of a path. The Steiner path cover problem is an optimization variant of the Steiner path problem generalizing the path cover problem, in which one has to cover all terminals with a minimum number of paths. We study those problems for the special class of interval graphs. We present linear time algorithms for both the Steiner path cover problem and the Steiner cycle problem on interval graphs given as endpoint sorted lists. The main contribution is a lemma showing that backward steps to non-Steiner intervals are never necessary. Furthermore, we show how to integrate this modification into the deferred-query technique of Chang et al. to obtain the linear running times.


2020 ◽  
Vol 175 (1-4) ◽  
pp. 41-58
Author(s):  
Mai Alzamel ◽  
Lorraine A.K. Ayad ◽  
Giulia Bernardini ◽  
Roberto Grossi ◽  
Costas S. Iliopoulos ◽  
...  

Uncertain sequences are compact representations of sets of similar strings. They highlight common segments by collapsing them, and explicitly represent varying segments by listing all possible options. A generalized degenerate string (GD string) is a type of uncertain sequence. Formally, a GD string Ŝ is a sequence of $n$ sets of strings of total size $N$, where the $i$th set contains strings of the same length $k_i$ but this length can vary between different sets. We denote by $W$ the sum of these lengths $k_0, k_1, \ldots, k_{n-1}$. Our main result is an $O(N + M)$-time algorithm for deciding whether two GD strings of total sizes $N$ and $M$, respectively, over an integer alphabet, have a non-empty intersection. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in linear space. We then apply our string comparison tool to devise a simple algorithm for computing all palindromes in Ŝ in $O(\min\{W, n^2\}N)$ time. We complement this upper bound by showing a similar conditional lower bound for computing maximal palindromes in Ŝ. We also show that a result, which is essentially the same as our string comparison linear-time algorithm, can be obtained by employing an automata-based approach.

