Analysis of the multiplicity matching parameter in suffix trees

2005 ◽  
Vol DMTCS Proceedings vol. AD,... (Proceedings) ◽  
Author(s):  
Mark Daniel Ward ◽  
Wojciech Szpankowski

In a suffix tree, the multiplicity matching parameter (MMP) $M_n$ is the number of leaves in the subtree rooted at the branching point of the $(n+1)$st insertion. Equivalently, the MMP is the number of pointers into the database in the Lempel-Ziv '77 data compression algorithm. We prove that the MMP asymptotically follows the logarithmic series distribution plus some fluctuations. In the proof we compare the distribution of the MMP in suffix trees to its distribution in tries built over independent strings. Our results are derived by both probabilistic and analytic techniques of the analysis of algorithms. In particular, we utilize combinatorics on words, bivariate generating functions, pattern matching, recurrence relations, analytical poissonization and depoissonization, the Mellin transform, and complex analysis.
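The definition of $M_n$ can be made concrete with a small brute-force sketch in Python. This is an illustration only, under stated assumptions: a finite text stands in for the infinite string studied in the paper, suffixes are compared directly instead of building a suffix tree, and the names `common_prefix_len` and `mmp` are hypothetical rather than taken from the paper. The branching point of the $(n+1)$st insertion sits at the depth of the longest prefix that the new suffix shares with any earlier suffix, and the leaves below it are the earlier suffixes that share that whole prefix.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    length = 0
    for x, y in zip(a, b):
        if x != y:
            break
        length += 1
    return length


def mmp(text: str, n: int) -> int:
    """Brute-force M_n for a finite text (an assumption of this sketch).

    The first n suffixes text[0:], ..., text[n-1:] are treated as already
    inserted; text[n:] is the (n+1)st suffix being inserted.
    """
    if not 1 <= n < len(text):
        raise ValueError("need 1 <= n < len(text)")
    new_suffix = text[n:]
    # Longest common prefix of the new suffix with each earlier suffix.
    lcps = [common_prefix_len(text[i:], new_suffix) for i in range(n)]
    # Depth of the branching point of the insertion.
    branch_depth = max(lcps)
    # Leaves under the branching point: earlier suffixes matching to that depth.
    return sum(1 for lcp in lcps if lcp == branch_depth)
```

For example, `mmp("abcabdab", 6)` inserts the suffix `"ab"`, which matches two earlier suffixes (`"abcabdab"` and `"abdab"`) to depth 2, so the result is 2. Because the text is finite, the value near the end of the text can differ from what an infinite string would give.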

2006 ◽  
Vol DMTCS Proceedings vol. AG,... (Proceedings) ◽  
Author(s):  
Chris Deugau ◽  
Frank Ruskey

We show that a family of generalized meta-Fibonacci sequences arises when counting the number of leaves at the largest level in certain infinite sequences of k-ary trees and restricted compositions of an integer. For this family of generalized meta-Fibonacci sequences and two families of related sequences we derive ordinary generating functions and recurrence relations.


2011 ◽  
Vol 61 (4) ◽  
Author(s):  
Khurshid Mir

In this paper, a size-biased Consul distribution (SBCOND) is defined. Recurrence relations for the central moments and the moments about the origin are obtained. Different estimation methods for the parameters of the model are discussed. The three estimation methods are compared with one another, and the proposed model is compared with the generalized logarithmic series distribution (GLSD) and the simple Consul distribution.


2005 ◽  
Vol DMTCS Proceedings vol. AD,... (Proceedings) ◽  
Author(s):  
Julien Fayolle ◽  
Mark Daniel Ward

In this report, we prove that under a Markovian model of order one, the average depth of suffix trees of index n is asymptotically similar to the average depth of tries (a.k.a. digital trees) built on n independent strings. This leads to an asymptotic behavior of $(\log{n})/h + C$ for the average depth of the suffix tree, where $h$ is the entropy of the Markov model and $C$ is a constant. Our proof compares the generating functions for the average depth in tries and in suffix trees; the difference between these generating functions is shown to be asymptotically small. We conclude by using the asymptotic behavior of the average depth in a trie under the Markov model found by Jacquet and Szpankowski ([JaSz91]).
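As a rough numerical illustration of the leading term $(\log n)/h$, the following Python sketch computes the entropy rate of a first-order Markov chain, $h = -\sum_i \pi_i \sum_j P_{ij} \ln P_{ij}$, where $\pi$ is the stationary distribution. The function names are hypothetical, the chain is assumed to be ergodic so that power iteration converges, and the constant $C$ is not given in the abstract, so it is left out.

```python
import math


def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of P by power iteration.

    Assumes P is a row-stochastic matrix of an ergodic chain.
    """
    k = len(P)
    pi = [1.0 / k] * k
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]
    return pi


def markov_entropy(P):
    """Entropy rate h = -sum_i pi_i sum_j P_ij ln P_ij, in nats."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0
    )


def leading_depth_term(n, P):
    """Leading term (ln n) / h of the average depth; the constant C is omitted."""
    return math.log(n) / markov_entropy(P)
```

With the memoryless fair-coin chain `P = [[0.5, 0.5], [0.5, 0.5]]`, the entropy is $\ln 2$, so `leading_depth_term(1024, P)` is approximately 10, that is, $\log_2 1024$.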


10.37236/1052 ◽  
2006 ◽  
Vol 13 (1) ◽  
Author(s):  
Brad Jackson ◽  
Frank Ruskey

We consider a family of meta-Fibonacci sequences which arise in studying the number of leaves at the largest level in certain infinite sequences of binary trees, restricted compositions of an integer, and binary compact codes. For this family of meta-Fibonacci sequences and two families of related sequences we derive ordinary generating functions and recurrence relations. Included in these families of sequences are several well-known sequences in the On-Line Encyclopedia of Integer Sequences (OEIS).


2021 ◽  
Vol 9 (3) ◽  
pp. 151-155
Author(s):  
Fehim J Wani

The Generalized Logarithmic Series Distribution (GLSD), introduced by Jain and Gupta (1973), adds an extra parameter to the usual logarithmic series distribution. This distribution has found applications in various fields. The parameters of the GLSD were estimated by the methods of maximum likelihood, moments, minimum chi-square and weighted discrepancies. The GLSD was fitted to counts of red mites on apple leaves, and all the estimation techniques were observed to perform well in estimating the parameters of the GLSD, but with varying degrees of non-significance.
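For reference, the usual (one-parameter) logarithmic series distribution mentioned above has probability mass function $P(X = k) = -\theta^k / (k \ln(1 - \theta))$ for $k = 1, 2, \dots$ and $0 < \theta < 1$. The sketch below implements only this classical case in Python; it does not reproduce the GLSD's extra parameter or any of the estimation methods from the paper, and the function names are hypothetical.

```python
import math


def log_series_pmf(k: int, theta: float) -> float:
    """P(X = k) for the classical logarithmic series distribution."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return -(theta ** k) / (k * math.log(1.0 - theta))


def log_series_mean(theta: float) -> float:
    """Mean of the classical logarithmic series distribution."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    return -theta / ((1.0 - theta) * math.log(1.0 - theta))
```

A quick sanity check is that the probabilities sum to 1: for `theta = 0.5`, summing `log_series_pmf(k, 0.5)` over `k` from 1 to 199 agrees with 1 to within floating-point error.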


2012 ◽  
Vol DMTCS Proceedings vol. AQ,... (Proceedings) ◽  
Author(s):  
Jeffrey Gaither ◽  
Yushi Homma ◽  
Mark Sellke ◽  
Mark Daniel Ward

We use probabilistic and combinatorial tools on strings to discover the average number of 2-protected nodes in tries and in suffix trees. Our analysis covers both the uniform and non-uniform cases. For instance, in a uniform trie with $n$ leaves, the number of 2-protected nodes is approximately 0.803$n$, plus small first-order fluctuations. The 2-protected nodes are an emerging way to distinguish the interior of a tree from the fringe.
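A node is 2-protected when it is neither a leaf nor the parent of a leaf, that is, when its nearest leaf is at distance at least 2. The Python sketch below counts such nodes in a trie built over a small set of strings. It is only an illustration with a hypothetical function name, not the analysis used in the paper, and it assumes a non-empty list of distinct strings in which no string is a prefix of another.

```python
from collections import defaultdict


def count_2_protected(strings, depth=0):
    """Return (number of 2-protected nodes, distance to the nearest leaf).

    The trie is never materialized: the node reached after `depth`
    characters is represented by the list of strings passing through it.
    Assumes distinct strings, none of which is a prefix of another.
    """
    if len(strings) == 1:
        # A single remaining string is stored in a leaf.
        return 0, 0
    # Split the strings among the children by their next character.
    groups = defaultdict(list)
    for s in strings:
        groups[s[depth]].append(s)
    protected = 0
    nearest = None
    for group in groups.values():
        sub_protected, sub_dist = count_2_protected(group, depth + 1)
        protected += sub_protected
        nearest = sub_dist if nearest is None else min(nearest, sub_dist)
    dist = nearest + 1
    if dist >= 2:
        # Neither a leaf nor the parent of a leaf.
        protected += 1
    return protected, dist
```

For the four strings `["00", "01", "10", "11"]` only the root is 2-protected, so `count_2_protected(["00", "01", "10", "11"])[0]` is 1.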


Author(s):  
R. Giancarlo ◽  
R. Grossi

We discuss the suffix tree generalization to matrices in this chapter. We extend the suffix tree notion (described in Chapter 3) from text strings to text matrices whose entries are taken from an ordered alphabet with the aim of solving pattern-matching problems. This suffix tree generalization can be efficiently used to implement low-level routines for Computer Vision, Data Compression, Geographic Information Systems and Visual Databases. We examine the submatrices in the form of the text’s contiguous parts that still have a matrix shape. Representing these text submatrices as “suitably formatted” strings stored in a compacted trie is the rationale behind suffix trees for matrices. The choice of the format inevitably influences suffix tree construction time and space complexity. We first deal with square matrices and show that many suffix tree families can be defined for the same input matrix according to the matrix’s string representations. We can store each suffix tree in linear space and give an efficient construction algorithm whose input is both the matrix and the string representation chosen. We then treat rectangular matrices and define their corresponding suffix trees by means of some general rules which we list formally. We show that there is a super-linear lower bound to the space required (in contrast with the linear space required by suffix trees for square matrices). We give a simple example of one of these suffix trees. The last part of the chapter illustrates some technical results regarding suffix trees for square matrices: we show how to achieve an expected linear-time suffix tree construction for a constant-size alphabet under some mild probabilistic assumptions about the input distribution. We begin by defining a wide class of string representations for square matrices. We let Σ denote an ordered alphabet of characters and introduce another alphabet of five special characters, called shapes. 
A shape is one of the special characters taken from set {IN,SW,NW,SE,NE}. Shape IN encodes the 1x1 matrix generated from the empty matrix by creating a square.

