Counting Markov Types

2010 ◽  
Vol DMTCS Proceedings vol. AM,... (Proceedings) ◽  
Author(s):  
Philippe Jacquet ◽  
Charles Knessl ◽  
Wojciech Szpankowski

The method of types is one of the most popular techniques in information theory and combinatorics. Two sequences of equal length have the same type if they have identical empirical distributions. In this paper, we focus on Markov types, that is, sequences generated by a Markov source (of order one). We note that sequences having the same Markov type share the same so-called $\textit{balanced frequency matrix}$ that counts the number of distinct pairs of symbols. We enumerate the number of Markov types for sequences of length $n$ over an alphabet of size $m$. This turns out to coincide with the number of balanced frequency matrices, as well as with the number of solutions of special $\textit{linear Diophantine equations}$ and the number of balanced directed multigraphs. For fixed $m$ we prove that the number of Markov types is asymptotically equal to $d(m) \frac{n^{m^2-m}}{(m^2-m)!}$, where $d(m)$ is a constant for which we give an integral representation. For $m \to \infty$ we conclude that asymptotically the number of types is equivalent to $\frac{\sqrt{2}m^{3m/2} e^{m^2}}{m^{2m^2} 2^m \pi^{m/2}} n^{m^2-m}$ provided that $m=o(n^{1/4})$ (however, our techniques work for $m=o(\sqrt{n})$). These findings are derived by analytical techniques ranging from multidimensional generating functions to the saddle point method.
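As a rough illustration of the counting problem in this abstract, the sketch below brute-forces the number of balanced frequency matrices for small $n$ and $m$. It assumes a convention in which the $m \times m$ pair counts are non-negative integers summing to $n$ and each row sum equals the matching column sum. The function name `count_balanced_matrices` is hypothetical, and the check ignores any connectivity requirement a realizable sequence would also need. The enumeration is exponential in $m^2$, so it is only meant for tiny inputs.

```python
def compositions(total, parts):
    """Yield all tuples of `parts` non-negative integers that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def count_balanced_matrices(n, m):
    """Count m x m non-negative integer matrices with entry sum n whose
    i-th row sum equals the i-th column sum for every i (assumed definition)."""
    count = 0
    for flat in compositions(n, m * m):
        row_sums = [sum(flat[i * m:(i + 1) * m]) for i in range(m)]
        col_sums = [sum(flat[i::m]) for i in range(m)]
        if row_sums == col_sums:
            count += 1
    return count


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, count_balanced_matrices(n, 2))
```

For $m = 2$ the balance condition reduces to $k_{12} = k_{21}$, so the counts can be checked by hand: there are 9 such matrices for $n = 4$.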

2019 ◽  
Vol 234 (5) ◽  
pp. 291-299
Author(s):  
Anton Shutov ◽  
Andrey Maleev

A new approach to the problem of coordination sequences of periodic structures is proposed. It is based on the concept of layer-by-layer growth and on the study of geodesics in periodic graphs. We represent coordination numbers as sums of so-called sector coordination numbers arising from the growth polygon of the graph. In each sector we obtain a canonical form of the geodesic chains and reduce the calculation of the sector coordination numbers to the solution of linear Diophantine equations. The approach is illustrated by the example of the 2-homogeneous kra graph. We obtain three alternative descriptions of the coordination sequences: explicit formulas, generating functions and recurrence relations.
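To make the notion of a coordination sequence concrete, here is a minimal sketch that computes it by breadth-first search, i.e. by literal layer-by-layer growth. It is not the sector method of the abstract. It assumes a periodic graph with a single vertex per unit cell, described by a list of neighbor offsets, and it uses the square lattice as a stand-in example rather than the kra graph. The names `coordination_sequence` and `SQUARE_LATTICE` are hypothetical.

```python
from collections import deque


def coordination_sequence(neighbor_offsets, max_shell):
    """Return [N_0, ..., N_max_shell], where N_k is the number of vertices
    at graph distance exactly k from the origin of a 2D lattice graph."""
    origin = (0, 0)
    distance = {origin: 0}
    counts = [0] * (max_shell + 1)
    counts[0] = 1
    queue = deque([origin])
    while queue:
        x, y = queue.popleft()
        d = distance[(x, y)]
        if d == max_shell:
            continue  # do not grow beyond the requested shell
        for dx, dy in neighbor_offsets:
            nxt = (x + dx, y + dy)
            if nxt not in distance:
                distance[nxt] = d + 1
                counts[d + 1] += 1
                queue.append(nxt)
    return counts


# Assumed example: the square lattice, each vertex joined to its 4 nearest neighbors.
SQUARE_LATTICE = [(1, 0), (-1, 0), (0, 1), (0, -1)]

if __name__ == "__main__":
    print(coordination_sequence(SQUARE_LATTICE, 5))
```

For the square lattice the shells are the integer points with $|x| + |y| = k$, so the sequence is $1, 4, 8, 12, \dots$, which gives a simple check on the search.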


2010 ◽  
Vol DMTCS Proceedings vol. AN,... (Proceedings) ◽  
Author(s):  
Sheng Chen ◽  
Nan Li ◽  
Steven V Sam

Let $P$ be a polytope with rational vertices. A classical theorem of Ehrhart states that the number of lattice points in the dilations $P(n) = nP$ is a quasi-polynomial in $n$. We generalize this theorem by allowing the vertices of $P(n)$ to be arbitrary rational functions in $n$. In this case we prove that the number of lattice points in $P(n)$ is a quasi-polynomial for $n$ sufficiently large. Our work was motivated by a conjecture of Ehrhart on the number of solutions to parametrized linear Diophantine equations whose coefficients are polynomials in $n$, and we explain how these two problems are related.
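The classical Ehrhart statement quoted in this abstract can be checked numerically on a small case. The sketch below assumes the rational triangle $P$ with vertices $(0,0)$, $(1/2,0)$, $(0,1/2)$, an example chosen here for illustration and not taken from the paper. It counts the lattice points of $nP$ by brute force and compares the result with the quasi-polynomial $\binom{\lfloor n/2 \rfloor + 2}{2}$, whose form depends on the parity of $n$. Both function names are hypothetical.

```python
def lattice_points_in_dilate(n):
    """Count integer points (x, y) with x >= 0, y >= 0 and x + y <= n/2,
    i.e. the lattice points of n * P for the assumed triangle P."""
    count = 0
    for x in range(n + 1):
        for y in range(n + 1):
            if 2 * (x + y) <= n:  # integer form of x + y <= n/2
                count += 1
    return count


def quasi_polynomial(n):
    """Closed form for the same count: with k = floor(n / 2),
    the count is (k + 1)(k + 2) / 2, a quasi-polynomial of period 2 in n."""
    k = n // 2
    return (k + 1) * (k + 2) // 2


if __name__ == "__main__":
    for n in range(8):
        print(n, lattice_points_in_dilate(n), quasi_polynomial(n))
```

The period 2 comes from the denominator of the vertices: even and odd $n$ follow two different polynomials in $n$.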


2008 ◽  
Vol DMTCS Proceedings vol. AI,... (Proceedings) ◽  
Author(s):  
Wojciech Szpankowski

Analytic information theory aims at studying problems of information theory using analytic techniques of computer science and combinatorics. Following Hadamard's precept, these problems are tackled by complex analysis methods such as generating functions, Mellin transform, Fourier series, saddle point method, analytic poissonization and depoissonization, and singularity analysis. This approach lies at the crossroads of computer science and information theory. In this survey we concentrate on one facet of information theory (i.e., source coding, better known as data compression), namely the $\textit{redundancy rate}$ problem. The redundancy rate problem determines by how much the actual code length exceeds the optimal code length. We further restrict our interest to the $\textit{average}$ redundancy for $\textit{known}$ sources, that is, when statistics of information sources are known. We present precise analyses of three types of lossless data compression schemes, namely fixed-to-variable (FV) length codes, variable-to-fixed (VF) length codes, and variable-to-variable (VV) length codes. In particular, we investigate average redundancy of Huffman, Tunstall, and Khodak codes. These codes have succinct representations as $\textit{trees}$, either as coding or parsing trees, and we analyze here some of their parameters (e.g., the average path from the root to a leaf).
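As a small numerical companion to the redundancy definition in this abstract, the sketch below builds a Huffman code for a known memoryless source and reports the average redundancy, taken here as the expected code length minus the source entropy in bits. This reading of "optimal code length" as the entropy, the single-symbol (block length one) setting, and the names `huffman_code_lengths` and `average_redundancy` are assumptions made for the example, not details from the survey.

```python
import heapq
import math


def huffman_code_lengths(probs):
    """Return the Huffman codeword length of each symbol for the given probabilities."""
    if len(probs) == 1:
        return [1]  # a lone symbol still needs one bit
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    next_id = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for symbol in s1 + s2:
            lengths[symbol] += 1  # merged symbols move one level deeper
        heapq.heappush(heap, (p1 + p2, next_id, s1 + s2))
        next_id += 1
    return lengths


def average_redundancy(probs):
    """Expected Huffman code length minus the entropy of the source, in bits."""
    lengths = huffman_code_lengths(probs)
    expected_length = sum(p * l for p, l in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return expected_length - entropy


if __name__ == "__main__":
    print(average_redundancy([0.5, 0.25, 0.25]))
    print(average_redundancy([0.9, 0.1]))
```

A dyadic source such as $(1/2, 1/4, 1/4)$ has zero redundancy, while a skewed binary source pays close to the full one-bit codeword, which is the kind of gap the redundancy analyses quantify.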


2012 ◽  
Vol DMTCS Proceedings vol. AQ,... (Proceedings) ◽  
Author(s):  
Philippe Jacquet ◽  
Wojciech Szpankowski

String complexity is defined as the cardinality of the set of all distinct words (factors) of a given string. For two strings, we define $\textit{joint string complexity}$ as the set of words that are common to both strings. We also relax this definition and introduce $\textit{joint semi-complexity}$, restricted to the common words appearing at least twice in both strings. String complexity finds a number of applications, from capturing the richness of a language to finding similarities between two genome sequences. In this paper we analyze joint complexity and joint semi-complexity when both strings are generated by a Markov source. The problem turns out to be quite challenging, requiring subtle singularity analysis and the saddle point method over infinitely many saddle points, leading to novel oscillatory phenomena with single and double periodicities.


2021 ◽  
Vol 81 (12) ◽  
Author(s):  
A. Andreev ◽  
A. Popolitov ◽  
A. Sleptsov ◽  
A. Zhabin

We investigate the structural constants of the KP hierarchy, which appear as universal coefficients in the paper of Natanzon–Zabrodin arXiv:1509.04472. It turns out that these constants have a combinatorial description in terms of transport coefficients in the theory of flow networks. Considering its properties we want to point out three novel directions of KP combinatorial structure research: connection with topological recursion, eigenvalue model for the structural constants and its deformations, possible deformations of KP hierarchy in terms of the structural constants. Firstly, in this paper we study the internal structure of these coefficients which involves: (1) construction of generating functions that have interesting properties by themselves; (2) restrictions on topological recursion initial data; (3) construction of integral representation or matrix model for these coefficients with non-trivial Ward identities. This shows that these coefficients appear in various problems of mathematical physics, which increases their value and significance. Secondly, we discuss their role in integrability of KP hierarchy considering possible deformation of these coefficients without changing the equations on $\tau$-function. We consider several plausible deformations. While most failed even very basic checks, one deformation (involving Macdonald polynomials) passes all the simple checks and requires more thorough study.


2018 ◽  
Vol 18 (2) ◽  
pp. 185-188
Author(s):  
Satish Kumar ◽  
Deepak Gupta ◽  
Hari Kishan

2003 ◽  
Vol DMTCS Proceedings vol. AC,... (Proceedings) ◽  
Author(s):  
Michel Nguyên Thê

This paper gives a survey of the limit distributions of the areas of different types of random walks, namely Dyck paths, bilateral Dyck paths, meanders, and Bernoulli random walks, using the technology of generating functions only.


10.37236/1517 ◽  
2000 ◽  
Vol 7 (1) ◽  
Author(s):  
Charles Knessl ◽  
Wojciech Szpankowski

We study the limiting distribution of the height in a generalized trie in which external nodes are capable of storing up to $b$ items (the so-called $b$-tries). We assume that such a tree is built from $n$ random strings (items) generated by an unbiased memoryless source. In this paper, we discuss the case when $b$ and $n$ are both large. We shall identify five regions of the height distribution that should be compared to three regions obtained for fixed $b$. We prove that for most $n$, the limiting distribution is concentrated at the single point $k_1=\lfloor \log_2 (n/b)\rfloor +1$ as $n,b\to \infty$. We observe that this is quite different from the height distribution for fixed $b$, in which case the limiting distribution is of an extreme value type concentrated around $(1+1/b)\log_2 n$. We derive our results by analytic methods, namely generating functions and the saddle point method. We also present some numerical verification of our results.

