Analysis of the average depth in a suffix tree under a Markov model

2005
Vol DMTCS Proceedings vol. AD,... (Proceedings)
Author(s):
Julien Fayolle
Mark Daniel Ward

In this report, we prove that under a Markovian model of order one, the average depth of suffix trees of index $n$ is asymptotically similar to the average depth of tries (a.k.a. digital trees) built on $n$ independent strings. This leads to an asymptotic behavior of $(\log{n})/h + C$ for the average depth of the suffix tree, where $h$ is the entropy of the Markov model and $C$ is a constant. Our proof compares the generating functions for the average depth in tries and in suffix trees; the difference between these generating functions is shown to be asymptotically small. We conclude by using the asymptotic behavior of the average depth in a trie under the Markov model found by Jacquet and Szpankowski ([JaSz91]).
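As a rough illustration of the leading term $(\log{n})/h$, the sketch below computes the entropy of an order-one Markov chain from its transition matrix and then evaluates the first-order depth estimate. It is not taken from the paper: the function names are hypothetical, the entropy is assumed to be the usual entropy rate $h = -\sum_i \pi_i \sum_j p_{ij} \log p_{ij}$ with natural logarithms, the chain is assumed irreducible and aperiodic so that power iteration converges, and the constant $C$ is left out because the abstract does not give it.

```python
import math


def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of transition matrix P.

    Uses plain power iteration, which assumes the chain is irreducible
    and aperiodic (an assumption of this sketch, not of the paper).
    """
    size = len(P)
    pi = [1.0 / size] * size
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(size)) for j in range(size)]
    return pi


def markov_entropy(P):
    """Entropy rate h = -sum_i pi_i sum_j p_ij ln p_ij (natural log)."""
    pi = stationary_distribution(P)
    size = len(P)
    return -sum(
        pi[i] * P[i][j] * math.log(P[i][j])
        for i in range(size)
        for j in range(size)
        if P[i][j] > 0  # zero-probability transitions contribute nothing
    )


def leading_depth_term(n, P):
    """First-order estimate ln(n) / h of the average depth; omits C."""
    return math.log(n) / markov_entropy(P)


if __name__ == "__main__":
    # Hypothetical two-state chain, chosen only for illustration.
    P = [[0.9, 0.1], [0.2, 0.8]]
    print(markov_entropy(P), leading_depth_term(10**6, P))
```

For a chain whose rows are all $(1/2, 1/2)$ the source is memoryless and unbiased, so $h = \ln 2$ and the leading term reduces to $\log_2 n$.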

2021
Vol 14 (1)
pp. 234-247
Author(s):  
Victorien Fourtoua Konane

In this work, we modeled the behavior of a battery. After formulating a Markovian model, we evaluated the delivered capacity as well as the gained capacity. We also evaluated the mean number of pulses and studied its asymptotic behavior and variance. Finally, we introduced an extension of the Markov model.


2005
Vol DMTCS Proceedings vol. AD,... (Proceedings)
Author(s):
Mark Daniel Ward
Wojciech Szpankowski

In a suffix tree, the multiplicity matching parameter (MMP) $M_n$ is the number of leaves in the subtree rooted at the branching point of the $(n+1)$st insertion. Equivalently, the MMP is the number of pointers into the database in the Lempel-Ziv '77 data compression algorithm. We prove that the MMP asymptotically follows the logarithmic series distribution plus some fluctuations. In the proof we compare the distribution of the MMP in suffix trees to its distribution in tries built over independent strings. Our results are derived by both probabilistic and analytic techniques of the analysis of algorithms. In particular, we utilize combinatorics on words, bivariate generating functions, pattern matching, recurrence relations, analytical poissonization and depoissonization, the Mellin transform, and complex analysis.


2006
Vol DMTCS Proceedings vol. AG,... (Proceedings)
Author(s):  
Manuel Lladser

Given an integer $m \geq 1$, let $\| \cdot \|$ be a norm in $\mathbb{R}^{m+1}$ and let $\mathbb{S}_+^m$ denote the set of points $\mathbf{d}=(d_0,\ldots,d_m)$ in $\mathbb{R}^{m+1}$ with nonnegative coordinates and such that $\| \mathbf{d} \|=1$. Consider for each $1 \leq j \leq m$ a function $f_j(z)$ that is analytic in an open neighborhood of the point $z=0$ in the complex plane and with possibly negative Taylor coefficients. Given $\mathbf{n}=(n_0,\ldots,n_m)$ in $\mathbb{Z}^{m+1}$ with nonnegative coordinates, we develop a method to systematically associate a parameter-varying integral to study the asymptotic behavior of the coefficient of $z^{n_0}$ of the Taylor series of $\prod_{j=1}^m \{f_j(z)\}^{n_j}$, as $\| \mathbf{n} \| \to \infty$. The associated parameter-varying integral has a phase term with well-specified properties that make the asymptotic analysis of the integral amenable to saddle-point methods: for many $\mathbf{d} \in \mathbb{S}_+^m$, these methods ensure uniform asymptotic expansions for $[z^{n_0}] \prod_{j=1}^m \{f_j(z)\}^{n_j}$ provided that $\mathbf{n}/ \| \mathbf{n} \|$ stays sufficiently close to $\mathbf{d}$ as $\| \mathbf{n} \| \to \infty$. Our method finds applications in studying the asymptotic behavior of the coefficients of certain multivariable generating functions as well as in problems related to the Lagrange inversion formula, for instance in the context of random planar maps.


2015
Vol DMTCS Proceedings, 27th... (Proceedings)
Author(s):
Adrien Boussicault
Simone Rinaldi
Samanta Socci

We present a new method to obtain the generating functions for directed convex polyominoes according to several different statistics including: width, height, size of last column/row and number of corners. This method can be used to study different families of directed convex polyominoes, such as symmetric polyominoes and parallelogram polyominoes. In this paper, we apply our method to determine the generating function for directed $k$-convex polyominoes. We show it is a rational function and we study its asymptotic behavior.


2003
Vol DMTCS Proceedings vol. AC,... (Proceedings)
Author(s):  
Michel Nguyên Thê

This paper gives a survey of the limit distributions of the areas of different types of random walks, namely Dyck paths, bilateral Dyck paths, meanders, and Bernoulli random walks, using the technology of generating functions only.


10.37236/1517
2000
Vol 7 (1)
Author(s):
Charles Knessl
Wojciech Szpankowski

We study the limiting distribution of the height in a generalized trie in which external nodes are capable of storing up to $b$ items (the so-called $b$-tries). We assume that such a tree is built from $n$ random strings (items) generated by an unbiased memoryless source. In this paper, we discuss the case when $b$ and $n$ are both large. We shall identify five regions of the height distribution that should be compared to three regions obtained for fixed $b$. We prove that for most $n$, the limiting distribution is concentrated at the single point $k_1=\lfloor \log_2 (n/b)\rfloor +1$ as $n,b\to \infty$. We observe that this is quite different from the height distribution for fixed $b$, in which case the limiting distribution is of an extreme value type concentrated around $(1+1/b)\log_2 n$. We derive our results by analytic methods, namely generating functions and the saddle point method. We also present some numerical verification of our results.
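To make the quantity $k_1=\lfloor \log_2 (n/b)\rfloor +1$ concrete, the following sketch computes it with integer arithmetic and measures the height of one simulated $b$-trie. It is only an illustration under stated assumptions: the function names are hypothetical, keys are random 32-bit integers read most significant bit first (standing in for strings from an unbiased memoryless source), and the height is taken to be the depth of the deepest external node. The concentration result is asymptotic in both $n$ and $b$, so a single small simulation need not land on $k_1$.

```python
import random


def k1(n, b):
    """Return floor(log2(n / b)) + 1 for integers n >= b >= 1.

    Integer arithmetic avoids floating-point rounding near powers of two.
    """
    k = 0
    while b * 2 ** (k + 1) <= n:
        k += 1
    return k + 1


def btrie_height(keys, b, bits):
    """Height of a b-trie over integer keys read as `bits`-bit strings.

    A node holding at most b keys is an external node; otherwise it is
    split on the next bit. Recursion stops at depth `bits` as a guard.
    """
    def height(group, depth):
        if len(group) <= b or depth == bits:
            return depth
        shift = bits - depth - 1
        zeros = [x for x in group if not (x >> shift) & 1]
        ones = [x for x in group if (x >> shift) & 1]
        return max(height(zeros, depth + 1), height(ones, depth + 1))

    return height(keys, 0)


if __name__ == "__main__":
    n, b, bits = 4096, 64, 32  # illustrative sizes only
    keys = random.Random(0).sample(range(2 ** bits), n)
    print("k1 =", k1(n, b), "simulated height =", btrie_height(keys, b, bits))
```

A trie of height $H$ has at most $2^H$ external nodes, each holding at most $b$ keys, so any simulated height satisfies $b \cdot 2^H \geq n$; this gives a quick sanity check on the simulation.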


2011
Vol DMTCS Proceedings vol. AO,... (Proceedings)
Author(s):  
Hoda Bidkhori

In this paper we study finite Eulerian posets which are binomial or Sheffer. These important classes of posets are related to the theory of generating functions and to geometry. The results of this paper are organized as follows: (1) We completely determine the structure of Eulerian binomial posets and, as a consequence, we are able to classify factorial functions of Eulerian binomial posets; (2) We give an almost complete classification of factorial functions of Eulerian Sheffer posets by dividing the original question into several cases; (3) In most of the cases above, we completely determine the structure of Eulerian Sheffer posets, a result stronger than just classifying factorial functions of these Eulerian Sheffer posets. We also study Eulerian triangular posets. This paper answers questions posed by R. Ehrenborg and M. Readdy. This research is also motivated by the work of R. Stanley about recognizing the Boolean lattice by looking at smaller intervals.


2009
Vol DMTCS Proceedings vol. AK,... (Proceedings)
Author(s):  
Tamás Lengyel

Let $n$ and $k$ be positive integers, and let $d(k)$ and $\nu_2(k)$ denote the number of ones in the binary representation of $k$ and the highest power of two dividing $k$, respectively. De Wannemacker recently proved for the Stirling numbers of the second kind that $\nu_2(S(2^n,k))=d(k)-1, 1\leq k \leq 2^n$. Here we prove that $\nu_2(S(c2^n,k))=d(k)-1, 1\leq k \leq 2^n$, for any positive integer $c$. We improve and extend this statement in some special cases. For the difference, we obtain lower bounds on $\nu_2(S(c2^{n+1}+u,k)-S(c2^n+u,k))$ for any nonnegative integer $u$, make a conjecture on the exact order and, for $u=0$, prove part of it when $k \leq 6$, or $k \geq 5$ and $d(k) \leq 2$. The proofs rely on congruential identities for power series and polynomials related to the Stirling numbers and Bell polynomials, and some divisibility properties.
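The identity $\nu_2(S(c2^n,k))=d(k)-1$ for $1\leq k \leq 2^n$ can be spot-checked numerically for small parameters. The sketch below is only such a check, not part of the proof: the function names are hypothetical, and the Stirling numbers are generated with the standard recurrence $S(m,k)=k\,S(m-1,k)+S(m-1,k-1)$.

```python
def stirling2_row(n):
    """Return [S(n, 0), ..., S(n, n)], Stirling numbers of the second kind.

    Built row by row from S(m, k) = k * S(m - 1, k) + S(m - 1, k - 1).
    """
    row = [1]  # S(0, 0) = 1
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            carry = k * row[k] if k < m else 0  # S(m - 1, m) = 0
            new[k] = carry + row[k - 1]
        row = new
    return row


def nu2(x):
    """Exponent of the highest power of two dividing the positive integer x."""
    return (x & -x).bit_length() - 1


def d(k):
    """Number of ones in the binary representation of k."""
    return bin(k).count("1")


def check_identity(c, n):
    """True if nu2(S(c * 2**n, k)) == d(k) - 1 for every 1 <= k <= 2**n."""
    row = stirling2_row(c * 2 ** n)
    return all(nu2(row[k]) == d(k) - 1 for k in range(1, 2 ** n + 1))


if __name__ == "__main__":
    for c in range(1, 4):
        for n in range(1, 5):
            print(c, n, check_identity(c, n))
```

For example, with $c=1$ and $n=2$ the row is $S(4,k) = 1, 7, 6, 1$ for $k=1,\ldots,4$, and the 2-adic valuations $0, 0, 1, 0$ match $d(k)-1$.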


2012
Vol DMTCS Proceedings vol. AQ,... (Proceedings)
Author(s):
Jeffrey Gaither
Yushi Homma
Mark Sellke
Mark Daniel Ward

We use probabilistic and combinatorial tools on strings to discover the average number of 2-protected nodes in tries and in suffix trees. Our analysis covers both the uniform and non-uniform cases. For instance, in a uniform trie with $n$ leaves, the number of 2-protected nodes is approximately 0.803$n$, plus small first-order fluctuations. The 2-protected nodes are an emerging way to distinguish the interior of a tree from the fringe.


2015
Vol DMTCS Proceedings, 27th... (Proceedings)
Author(s):  
Lenny Tevlin

This paper contains two results. First, I propose a $q$-generalization of a certain sequence of positive integers, related to Catalan numbers, introduced by Zeilberger, see Lassalle (2010). These $q$-integers are palindromic polynomials in $q$ with positive integer coefficients. The positivity depends on the positivity of a certain difference of products of $q$-binomial coefficients. To this end, I introduce new inversion/major statistics on lattice walks. The difference in $q$-binomial coefficients is then seen as a generating function of weighted walks that remain in the upper half-plane.

