The Maximal Number of Runs in Standard Sturmian Words

We investigate some repetition problems for a very special class $\mathcal{S}$ of strings called the standard Sturmian words, which have very compact representations in terms of sequences of integers. Usually the size of such a word is exponential with respect to the size of its integer sequence, hence we are dealing with repetition problems in compressed strings. An explicit formula is given for the number $\rho(w)$ of runs in a standard word $w$. We show that $\rho(w)/|w|\le 4/5$ for each $w\in \mathcal{S}$, and there is an infinite sequence of strictly growing words $w_k\in {\mathcal{S}}$ such that $\lim_{k\rightarrow \infty} \frac{\rho(w_k)}{|w_k|} = \frac{4}{5}$. Moreover, we show how to compute the number of runs in a standard Sturmian word in linear time with respect to the size of its compressed representation.
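
To make the compressed representation concrete, the following Python sketch expands a directive sequence into a standard word and counts its runs by brute force. It assumes the usual recurrence $x_{-1}=b$, $x_0=a$, $x_{i+1}=x_i^{\gamma_i}x_{i-1}$, which the abstract does not spell out. The function names are illustrative, and the counter works on the fully expanded word, so it is not the linear-time method on the compressed form that the abstract describes.

```python
def standard_word(gamma):
    """Expand a directive sequence (assumed recurrence, see lead-in).

    x_{-1} = "b", x_0 = "a", x_{i+1} = x_i^{gamma_i} x_{i-1}.
    """
    prev, cur = "b", "a"
    for g in gamma:
        prev, cur = cur, cur * g + prev
    return cur


def count_runs(s):
    """Count runs (maximal repetitions) of s by brute force.

    For every period p, find maximal stretches where s[i] == s[i + p].
    A stretch of at least p matches gives a maximal interval of length
    >= 2p with period p. Intervals are stored in a set so that a run
    rediscovered with a multiple of its smallest period counts once.
    """
    n = len(s)
    runs = set()
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if s[i] != s[i + p]:
                i += 1
                continue
            start = i
            while i + p < n and s[i] == s[i + p]:
                i += 1
            if i - start >= p:
                runs.add((start, i + p - 1))  # inclusive end index
    return len(runs)


if __name__ == "__main__":
    w = standard_word([1, 1, 1, 1])
    print(w, count_runs(w), len(w))
```

For the directive sequence (1, 1, 1, 1) the sketch yields the Fibonacci word abaababa, whose three runs are aa, ababa and abaaba, so the ratio 3/8 is well below the bound 4/5.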

USEFULNESS OF DIRECTED ACYCLIC SUBWORD GRAPHS IN PROBLEMS RELATED TO STANDARD STURMIAN WORDS

International Journal of Foundations of Computer Science ◽

10.1142/s0129054109007017 ◽

2009 ◽

Vol 20 (06) ◽

pp. 1005-1023 ◽

Cited By ~ 5

Author(s):

PAWEŁ BATURO ◽

MARCIN PIATKOWSKI ◽

WOJCIECH RYTTER

Keyword(s):

Linear Time ◽

Regular Structure ◽

Special Structure ◽

Graph Properties ◽

Alternative Graph ◽

Simple Alternative ◽

Sturmian Words ◽

Special Factors ◽

Exponential Size ◽

Standard Word

The class of finite Sturmian words consists of words having particularly simple compressed representation, which is a generalization of the Fibonacci recurrence for Fibonacci words. The subword graphs of these words (especially their compacted versions) have a very special regular structure. In this paper we investigate this structure in more detail than in previous papers and show how several syntactical properties of Sturmian words follow from their graph properties. Consequently simple alternative graph-based proofs of several known facts are presented. The very special structure of subword graphs leads also to special easy algorithms computing some parameters of Sturmian words: the number of subwords, the critical factorization point, lexicographically maximal suffixes, occurrences of subwords of a fixed length, and right special factors. These algorithms work in linear time with respect to n, the size of the compressed representation of the standard word, though the words themselves can be of exponential size with respect to n. Some of the computed parameters can be also of exponential size, however we provide their linear size compressed representations. We introduce also a new concept related to standard words: Ostrowski automata.

Fast Fréchet Distance Between Curves with Long Edges

International Journal of Computational Geometry & Applications ◽

10.1142/s0218195919500043 ◽

2019 ◽

Vol 29 (02) ◽

pp. 161-187

Author(s):

Joachim Gudmundsson ◽

Majid Mirzanezhad ◽

Ali Mohades ◽

Carola Wenk

Keyword(s):

Data Structure ◽

Approximation Algorithm ◽

Special Class ◽

Linear Time ◽

Time Algorithm ◽

Linear Time Algorithm ◽

Fréchet Distance ◽

Polygonal Curves

Computing the Fréchet distance between two polygonal curves takes roughly quadratic time. In this paper, we show that for a special class of curves the Fréchet distance computations become easier. Let [Formula: see text] and [Formula: see text] be two polygonal curves in [Formula: see text] with [Formula: see text] and [Formula: see text] vertices, respectively. We prove four results for the case when all edges of both curves are long compared to the Fréchet distance between them: (1) a linear-time algorithm for deciding the Fréchet distance between two curves, (2) an algorithm that computes the Fréchet distance in [Formula: see text] time, (3) a linear-time [Formula: see text]-approximation algorithm, and (4) a data structure that supports [Formula: see text]-time decision queries, where [Formula: see text] is the number of vertices of the query curve and [Formula: see text] the number of vertices of the preprocessed curve.
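
As a point of reference for the "roughly quadratic" baseline, here is a minimal Python sketch of the discrete Fréchet distance computed by dynamic programming. This is a swapped-in, simpler variant that only couples vertices; it is not the continuous Fréchet distance the abstract studies and not any of the four algorithms it announces. The function name and the point representation (tuples of coordinates) are assumptions.

```python
from math import dist  # Python 3.8+


def discrete_frechet(P, Q):
    """Discrete Frechet distance between vertex sequences P and Q.

    d[i][j] is the smallest leash length that lets both walkers reach
    P[i] and Q[j] when each step advances one or both walkers by one
    vertex. The table has len(P) * len(Q) cells, hence quadratic time.
    """
    n, m = len(P), len(Q)
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            c = dist(P[i], Q[j])
            if i == 0 and j == 0:
                d[i][j] = c
            elif i == 0:
                d[i][j] = max(d[0][j - 1], c)
            elif j == 0:
                d[i][j] = max(d[i - 1][0], c)
            else:
                best_prev = min(d[i - 1][j], d[i - 1][j - 1], d[i][j - 1])
                d[i][j] = max(best_prev, c)
    return d[n - 1][m - 1]


if __name__ == "__main__":
    P = [(0, 0), (1, 0), (2, 0)]
    Q = [(0, 1), (1, 1), (2, 1)]
    print(discrete_frechet(P, Q))
```

For two parallel horizontal polylines at vertical distance 1, walking both in lockstep keeps the leash at length 1, which is the value the table ends with.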

Greedy Transition-Based Dependency Parsing with Stack LSTMs

Computational Linguistics ◽

10.1162/coli_a_00285 ◽

2017 ◽

Vol 43 (2) ◽

pp. 311-347 ◽

Cited By ~ 2

Author(s):

Miguel Ballesteros ◽

Chris Dyer ◽

Yoav Goldberg ◽

Noah A. Smith

Keyword(s):

Neural Networks ◽

Short Term Memory ◽

Linear Time ◽

Training Data ◽

Test Time ◽

Continuous Space ◽

Look Ahead ◽

History Of ◽

Morphologically Rich Languages ◽

Standard Word

We introduce a greedy transition-based parser that learns to represent parser states using recurrent neural networks. Our primary innovation that enables us to do this efficiently is a new control structure for sequential neural networks—the stack long short-term memory unit (LSTM). Like the conventional stack data structures used in transition-based parsers, elements can be pushed to or popped from the top of the stack in constant time, but, in addition, an LSTM maintains a continuous space embedding of the stack contents. Our model captures three facets of the parser's state: (i) unbounded look-ahead into the buffer of incoming words, (ii) the complete history of transition actions taken by the parser, and (iii) the complete contents of the stack of partially built tree fragments, including their internal structures. In addition, we compare two different word representations: (i) standard word vectors based on look-up tables and (ii) character-based models of words. Although standard word embedding models work well in all languages, the character-based models improve the handling of out-of-vocabulary words, particularly in morphologically rich languages. Finally, we discuss the use of dynamic oracles in training the parser. During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time. Training our model with dynamic oracles yields a linear-time greedy parser with very competitive performance.

ON A GENERALIZATION OF CATALAN'S POLYNOMIALS

Facta Universitatis Series Mathematics and Informatics ◽

10.22190/fumi1802163g ◽

2018 ◽

Vol 33 (2) ◽

pp. 163

Author(s):

Goubi Mouloud

Keyword(s):

Recurrence Relation ◽

Explicit Formula ◽

Special Class ◽

Combinatorial Identities ◽

New Class

Abstract. In this work, we define and study the generalized class of Catalan’s polynomials. Thereafter we connect them to the class of Humbert’s polynomials and recover the Humbert recurrence relation [5]. This idea helps us to define a new class of generalized Humbert’s polynomials different from those given by H. W. Gould [4] and P. N. Shrivastava [9]. Finally we establish an explicit formula for a special class of generalized Catalan’s polynomials and get two useful combinatorial identities.

From Christoffel Words to Markoff Numbers

10.1093/oso/9780198827542.001.0001 ◽

2018 ◽

Cited By ~ 3

Author(s):

Christophe Reutenauer

Keyword(s):

Free Group ◽

Continued Fractions ◽

Quadratic Forms ◽

Special Class ◽

Rational Numbers ◽

Real Numbers ◽

Inner Automorphisms ◽

Sturmian Words ◽

Markoff Numbers ◽

Lyndon Words

Christoffel introduced in 1875 a special class of words on a binary alphabet, linked to continued fractions. Some years later Markoff published his famous theory, now called Markoff theory. It characterizes certain quadratic forms, and certain real numbers by extremal inequalities. Both classes are constructed by using certain natural numbers, called Markoff numbers; they are characterized by a certain diophantine equality. More basically, they are constructed using certain words, essentially the Christoffel words. The link between Christoffel words and the theory of Markoff was noted by Frobenius. Motivated by this link, the book presents the classical theory of Markoff in its two aspects, based on the theory of Christoffel words. This is done in Part I of the book. Part II gives the more advanced and recent results of the theory of Christoffel words: palindromes (central words), periods, Lyndon words, Stern–Brocot tree, semi-convergents of rational numbers and finite continued fractions, geometric interpretations, conjugation, factors of Christoffel words, finite Sturmian words, free group on two generators, bases, inner automorphisms, Christoffel bases, Nielsen’s criterion, Sturmian morphisms, and positive automorphisms of this free group.

Comparing Degenerate Strings

A Mosaic of Computational Topics: from Classical to Novel ◽

10.3233/stal200005 ◽

2020 ◽

Author(s):

Mai Alzamel ◽

Lorraine A.K. Ayad ◽

Giulia Bernardini ◽

Roberto Grossi ◽

Costas S. Iliopoulos ◽

...

Keyword(s):

Lower Bound ◽

Linear Time ◽

Simple Algorithm ◽

Time Algorithm ◽

Linear Time Algorithm ◽

Total Size ◽

Empty Intersection ◽

Compact Representations ◽

String Comparison ◽

Combinatorial Result

Uncertain sequences are compact representations of sets of similar strings. They highlight common segments by collapsing them, and explicitly represent varying segments by listing all possible options. A generalized degenerate string (GD string) is a type of uncertain sequence. Formally, a GD string Ŝ is a sequence of $n$ sets of strings of total size $N$, where the $i$th set contains strings of the same length $k_i$ but this length can vary between different sets. We denote by $W$ the sum of these lengths $k_0, k_1, \ldots, k_{n-1}$. Our main result is an $O(N + M)$-time algorithm for deciding whether two GD strings of total sizes $N$ and $M$, respectively, over an integer alphabet, have a non-empty intersection. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in linear space. We then apply our string comparison tool to devise a simple algorithm for computing all palindromes in Ŝ in $O(\min\{W, n^2\}N)$ time. We complement this upper bound by showing a similar conditional lower bound for computing maximal palindromes in Ŝ. We also show that a result, which is essentially the same as our string comparison linear-time algorithm, can be obtained by employing an automata-based approach.
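
The definition of a GD string and of the intersection test can be illustrated with a small Python sketch. It enumerates every string a GD string represents, so it takes exponential time in general and only serves as a specification of the decision problem; it is not the linear-time algorithm of the abstract. Representing a GD string as a list of sets of Python strings, and all function names, are assumptions made for the example.

```python
from itertools import product


def check_gd(gd):
    """Verify that all strings inside each set share one length."""
    for options in gd:
        if len({len(s) for s in options}) > 1:
            raise ValueError("strings in one set must have equal length")


def language(gd):
    """All strings obtained by picking one option from each set."""
    check_gd(gd)
    return {"".join(choice) for choice in product(*gd)}


def have_common_string(gd1, gd2):
    """Brute-force test for a non-empty intersection."""
    return not language(gd1).isdisjoint(language(gd2))


if __name__ == "__main__":
    gd1 = [{"ab", "ba"}, {"c"}]          # represents abc, bac
    gd2 = [{"a", "b"}, {"bc", "cc"}]     # represents abc, acc, bbc, bcc
    print(have_common_string(gd1, gd2))
```

Here the two GD strings split the text at different positions (lengths 2 + 1 versus 1 + 2) yet share the string abc, which is the situation the comparison problem has to handle.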

Sumsets of Finite Beatty Sequences

The Electronic Journal of Combinatorics ◽

10.37236/1614 ◽

2000 ◽

Vol 8 (2) ◽

Cited By ~ 2

Author(s):

Jane Pitman

Keyword(s):

Arithmetic Progression ◽

Continued Fractions ◽

Sturmian Word ◽

Beatty Sequences ◽

Sturmian Words ◽

Beatty Sequence

An investigation of the size of $S+S$ for a finite Beatty sequence $S=(s_i)=(\lfloor i\alpha+\gamma \rfloor)$, where $\lfloor \hphantom{x} \rfloor$ denotes "floor", $\alpha$, $\gamma$ are real with $\alpha\ge 1$, and $0\le i \le k-1$ and $k\ge 3$. For $\alpha>2$, it is shown that $|S+S|$ depends on the number of "centres" of the Sturmian word $\Delta S=(s_i-s_{i-1})$, and hence that $3(k-1)\le |S+S|\le 4k-6$ if $S$ is not an arithmetic progression. A formula is obtained for the number of centres of certain finite periodic Sturmian words, and this leads to further information about $|S+S|$ in terms of finite nearest integer continued fractions.
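
The quantities in this abstract are easy to experiment with. The Python sketch below builds a finite Beatty sequence and measures $|S+S|$ directly; it uses exact rational arithmetic so that the floor is not affected by rounding. The function names and the choice of `Fraction` inputs are assumptions of the example, and irrational $\alpha$ would need a different representation.

```python
from fractions import Fraction
from math import floor


def beatty(alpha, gamma, k):
    """Finite Beatty sequence s_i = floor(i * alpha + gamma), 0 <= i < k."""
    return [floor(i * alpha + gamma) for i in range(k)]


def sumset_size(seq):
    """Size of S + S = {a + b : a, b in S}."""
    return len({a + b for a in seq for b in seq})


if __name__ == "__main__":
    k = 4
    S = beatty(Fraction(5, 2), Fraction(0), k)
    print(S, sumset_size(S), 3 * (k - 1), 4 * k - 6)
```

With $\alpha = 5/2$, $\gamma = 0$ and $k = 4$ the sequence is 0, 2, 5, 7, which is not an arithmetic progression, and $|S+S| = 9$ lies between $3(k-1) = 9$ and $4k-6 = 10$ as the stated bounds require.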

A Square Root Map on Sturmian Words

The Electronic Journal of Combinatorics ◽

10.37236/6074 ◽

2017 ◽

Vol 24 (1) ◽

Author(s):

Jarkko Peltomäki ◽

Markus A. Whiteland

Keyword(s):

Fixed Points ◽

General Setting ◽

Square Root ◽

Sturmian Word ◽

Word Equation ◽

Sturmian Words ◽

Infinite Set ◽

Curious Property

We introduce a square root map on Sturmian words and study its properties. Given a Sturmian word of slope $\alpha$, there exist exactly six minimal squares in its language (a minimal square does not have a square as a proper prefix). A Sturmian word $s$ of slope $\alpha$ can be written as a product of these six minimal squares: $s = X_1^2 X_2^2 X_3^2 \cdots$. The square root of $s$ is defined to be the word $\sqrt{s} = X_1 X_2 X_3 \cdots$. The main result of this paper is that $\sqrt{s}$ is also a Sturmian word of slope $\alpha$. Further, we characterize the Sturmian fixed points of the square root map, and we describe how to find the intercept of $\sqrt{s}$ and an occurrence of any prefix of $\sqrt{s}$ in $s$. Related to the square root map, we characterize the solutions of the word equation $X_1^2 X_2^2 \cdots X_n^2 = (X_1 X_2 \cdots X_n)^2$ in the language of Sturmian words of slope $\alpha$ where the words $X_i^2$ are minimal squares of slope $\alpha$. We also study the square root map in a more general setting. We explicitly construct an infinite set of non-Sturmian fixed points of the square root map. We show that the subshifts $\Omega$ generated by these words have a curious property: for all $w \in \Omega$ either $\sqrt{w} \in \Omega$ or $\sqrt{w}$ is periodic. In particular, the square root map can map an aperiodic word to a periodic word.
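
The factorization into minimal squares can be tried out on a finite prefix. The Python sketch below uses the Fibonacci word as a sample Sturmian word (an assumption of the example; the abstract treats general slopes) and repeatedly strips the shortest square prefix, which is the minimal square at that position because any shorter square would be a proper prefix of it. Function names are illustrative, and the loop stops as soon as the remaining prefix is too short to contain a complete square.

```python
def fibonacci_word(n):
    """Prefix of length n of the infinite Fibonacci word over {a, b}."""
    a, b = "a", "ab"
    while len(b) < n:
        a, b = b, b + a
    return b[:n]


def shortest_square_root(s, start):
    """Length h of the shortest square s[start:start + 2h], or None."""
    for h in range(1, (len(s) - start) // 2 + 1):
        if s[start:start + h] == s[start + h:start + 2 * h]:
            return h
    return None


def square_root_factors(s):
    """Roots X_1, X_2, ... of the minimal squares found in the prefix s."""
    roots, pos = [], 0
    while True:
        h = shortest_square_root(s, pos)
        if h is None:
            break
        roots.append(s[pos:pos + h])
        pos += 2 * h
    return roots


if __name__ == "__main__":
    roots = square_root_factors(fibonacci_word(34))
    print(roots, "".join(roots))
```

On the 34-letter prefix the roots found are aba, baa, ba, ab, a and baaba, six distinct words, and their concatenation gives the first 16 letters of the square root of this prefix.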

The Steiner cycle and path cover problem on interval graphs

Journal of Combinatorial Optimization ◽

10.1007/s10878-021-00757-7 ◽

2021 ◽

Author(s):

Ante Ćustić ◽

Stefan Lendl

Keyword(s):

Steiner Tree ◽

Special Class ◽

Linear Time ◽

Hamiltonian Path ◽

Interval Graphs ◽

Path Cover ◽

Fixed Set ◽

Minimum Number ◽

Cover Problem ◽

Linear Time Algorithms

The Steiner path problem is a common generalization of the Steiner tree and the Hamiltonian path problem, in which we have to decide if for a given graph there exists a path visiting a fixed set of terminals. In the Steiner cycle problem we look for a cycle visiting all terminals instead of a path. The Steiner path cover problem is an optimization variant of the Steiner path problem generalizing the path cover problem, in which one has to cover all terminals with a minimum number of paths. We study those problems for the special class of interval graphs. We present linear time algorithms for both the Steiner path cover problem and the Steiner cycle problem on interval graphs given as endpoint sorted lists. The main contribution is a lemma showing that backward steps to non-Steiner intervals are never necessary. Furthermore, we show how to integrate this modification into the deferred-query technique of Chang et al. to obtain the linear running times.

Comparing Degenerate Strings

Fundamenta Informaticae ◽

10.3233/fi-2020-1947 ◽

2020 ◽

Vol 175 (1-4) ◽

pp. 41-58

Author(s):

Mai Alzamel ◽

Lorraine A.K. Ayad ◽

Giulia Bernardini ◽

Roberto Grossi ◽

Costas S. Iliopoulos ◽

...

Keyword(s):

Lower Bound ◽

Linear Time ◽

Simple Algorithm ◽

Time Algorithm ◽

Linear Time Algorithm ◽

Total Size ◽

Empty Intersection ◽

Compact Representations ◽

String Comparison ◽

Combinatorial Result

Uncertain sequences are compact representations of sets of similar strings. They highlight common segments by collapsing them, and explicitly represent varying segments by listing all possible options. A generalized degenerate string (GD string) is a type of uncertain sequence. Formally, a GD string Ŝ is a sequence of $n$ sets of strings of total size $N$, where the $i$th set contains strings of the same length $k_i$ but this length can vary between different sets. We denote by $W$ the sum of these lengths $k_0, k_1, \ldots, k_{n-1}$. Our main result is an $\mathcal{O}(N + M)$-time algorithm for deciding whether two GD strings of total sizes $N$ and $M$, respectively, over an integer alphabet, have a non-empty intersection. This result is based on a combinatorial result of independent interest: although the intersection of two GD strings can be exponential in the total size of the two strings, it can be represented in linear space. We then apply our string comparison tool to devise a simple algorithm for computing all palindromes in Ŝ in $\mathcal{O}(\min\{W, n^2\}N)$ time. We complement this upper bound by showing a similar conditional lower bound for computing maximal palindromes in Ŝ. We also show that a result, which is essentially the same as our string comparison linear-time algorithm, can be obtained by employing an automata-based approach.
