Poisson Approximation for the Number of Repeats in a Stationary Markov Chain

Detecting repeated sequences within complete genomes is a powerful tool for understanding genome dynamics and the evolutionary history of species. To distinguish significant repeats from those that can arise just by chance, statistical methods have to be developed. In this paper we show that the distribution of the number of long repeats in long sequences generated by stationary Markov chains can be approximated by a Poisson distribution with an explicit parameter. Using the Chen-Stein method, we provide a bound for the approximation error; this bound converges to 0 as soon as the length n of the sequence tends to ∞ and the length t of the repeats satisfies n²ρ^t = O(1) for some 0 < ρ < 1. Using this Poisson approximation, p-values can then be easily calculated to determine whether a given genome is significantly enriched in repeats of length t.
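As a rough illustration of the last step, the sketch below computes an upper-tail Poisson p-value for an observed repeat count. The explicit Poisson parameter from the paper is not reproduced in this abstract, so `lam` is a placeholder supplied by the caller, and the function name `poisson_upper_tail` is hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam).

    lam is an assumed input here; the abstract does not give the
    paper's explicit parameter.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    # Computed as a complement, so extremely small p-values lose precision.
    return max(0.0, 1.0 - cdf)


if __name__ == "__main__":
    # Hypothetical numbers: 12 observed repeats against an expected 4.5.
    print(poisson_upper_tail(12, 4.5))
```

A small returned value would suggest that the observed count is larger than expected under the Markov model, subject to the quality of the Poisson approximation.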


Improved compound Poisson approximation for the number of occurrences of any rare word family in a stationary Markov chain

Advances in Applied Probability ◽

10.1017/s0001867800001634 ◽

2007 ◽

Vol 39 (01) ◽

pp. 128-140 ◽

Cited By ~ 1

Author(s):

Etienne Roquain ◽

Sophie Schbath

Keyword(s):

Markov Chain ◽

Poisson Distribution ◽

Approximation Error ◽

Rare Event ◽

Poisson Approximation ◽

Rare Word ◽

Compound Poisson ◽

Compound Poisson Approximation ◽

Stein Method ◽

Poisson Approximations

We derive a new compound Poisson distribution with explicit parameters to approximate the number of overlapping occurrences of any set of words in a Markovian sequence. Using the Chen-Stein method, we provide a bound for the approximation error. This error converges to 0 under the rare event condition, even for overlapping families, which improves previous results. As a consequence, we also propose Poisson approximations for the declumped count and the number of competing renewals.


Improved compound Poisson approximation for the number of occurrences of any rare word family in a stationary Markov chain

Advances in Applied Probability ◽

10.1239/aap/1175266472 ◽

2007 ◽

Vol 39 (1) ◽

pp. 128-140 ◽

Cited By ~ 16

Author(s):

Etienne Roquain ◽

Sophie Schbath

Keyword(s):

Markov Chain ◽

Poisson Distribution ◽

Approximation Error ◽

Rare Event ◽

Poisson Approximation ◽

Rare Word ◽

Compound Poisson ◽

Compound Poisson Approximation ◽

Stein Method ◽

Poisson Approximations

We derive a new compound Poisson distribution with explicit parameters to approximate the number of overlapping occurrences of any set of words in a Markovian sequence. Using the Chen-Stein method, we provide a bound for the approximation error. This error converges to 0 under the rare event condition, even for overlapping families, which improves previous results. As a consequence, we also propose Poisson approximations for the declumped count and the number of competing renewals.


Compound Poisson approximation for the Johnson-Mehl model

Journal of Applied Probability ◽

10.1017/s002190020001528x ◽

2000 ◽

Vol 37 (01) ◽

pp. 101-117

Author(s):

Torkel Erhardsson

Keyword(s):

Total Variation ◽

Poisson Distribution ◽

Approximation Error ◽

Poisson Approximation ◽

Total Variation Distance ◽

Compound Poisson ◽

Compound Poisson Approximation ◽

Random Intervals ◽

Variation Distance ◽

The One

We consider the uncovered set (i.e. the complement of the union of growing random intervals) in the one-dimensional Johnson-Mehl model. Let S(z,L) be the number of components of this set at time z > 0 which intersect (0, L]. An explicit bound is known for the total variation distance between the distribution of S(z,L) and a Poisson distribution, but due to clumping of the components the bound can be rather large. We here give a bound for the total variation distance between the distribution of S(z,L) and a simple compound Poisson distribution (a Pólya-Aeppli distribution). The bound is derived by interpreting S(z,L) as the number of visits to a ‘rare’ set by a Markov chain, and applying results on compound Poisson approximation for Markov chains by Erhardsson. It is shown that under a mild condition, if z→∞ and L→∞ in a proper fashion, then both the Pólya-Aeppli and the Poisson approximation error bounds converge to 0, but the convergence of the former is much faster.
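To make the Pólya-Aeppli distribution concrete, here is a minimal sketch of its probability mass function, assuming the usual parametrisation: a Poisson(lam) number of clumps, each with an independent geometric size on {1, 2, ...} with parameter theta. The names `polya_aeppli_pmf`, `lam` and `theta` are illustrative and are not the paper's notation; the recursion used is the standard Panjer recursion for compound Poisson sums.

```python
import math


def polya_aeppli_pmf(lam, theta, kmax):
    """Return [P(S = 0), ..., P(S = kmax)] for a Polya-Aeppli variable S.

    S is the sum of N i.i.d. clump sizes, N ~ Poisson(lam), where each
    clump size J satisfies P(J = j) = (1 - theta) * theta**(j - 1), j >= 1.
    """
    # q[j] = P(J = j); index 0 is unused padding.
    q = [0.0] + [(1 - theta) * theta ** (j - 1) for j in range(1, kmax + 1)]
    p = [math.exp(-lam)]  # P(S = 0) = P(N = 0)
    for k in range(1, kmax + 1):
        # Panjer recursion: p_k = (lam / k) * sum_j j * q_j * p_{k - j}
        s = sum(j * q[j] * p[k - j] for j in range(1, k + 1))
        p.append(lam / k * s)
    return p


if __name__ == "__main__":
    pmf = polya_aeppli_pmf(lam=2.0, theta=0.5, kmax=200)
    print(sum(pmf))                               # close to 1
    print(sum(k * pk for k, pk in enumerate(pmf)))  # close to lam / (1 - theta)
```

Setting theta = 0 makes every clump have size 1, so the distribution collapses to an ordinary Poisson(lam); larger theta corresponds to stronger clumping, which is the situation where the plain Poisson bound is described as large.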


Compound Poisson approximation for the Johnson-Mehl model

Journal of Applied Probability ◽

10.1239/jap/1014842271 ◽

2000 ◽

Vol 37 (1) ◽

pp. 101-117 ◽

Cited By ~ 2

Author(s):

Torkel Erhardsson

Keyword(s):

Total Variation ◽

Poisson Distribution ◽

Approximation Error ◽

Poisson Approximation ◽

Total Variation Distance ◽

Compound Poisson ◽

Compound Poisson Approximation ◽

Random Intervals ◽

Variation Distance ◽

The One

We consider the uncovered set (i.e. the complement of the union of growing random intervals) in the one-dimensional Johnson-Mehl model. Let S(z,L) be the number of components of this set at time z > 0 which intersect (0, L]. An explicit bound is known for the total variation distance between the distribution of S(z,L) and a Poisson distribution, but due to clumping of the components the bound can be rather large. We here give a bound for the total variation distance between the distribution of S(z,L) and a simple compound Poisson distribution (a Pólya-Aeppli distribution). The bound is derived by interpreting S(z,L) as the number of visits to a ‘rare’ set by a Markov chain, and applying results on compound Poisson approximation for Markov chains by Erhardsson. It is shown that under a mild condition, if z→∞ and L→∞ in a proper fashion, then both the Pólya-Aeppli and the Poisson approximation error bounds converge to 0, but the convergence of the former is much faster.


Poisson Approximation for the Non-Overlapping Appearances of Several Words in Markov Chains

Combinatorics Probability Computing ◽

10.1017/s096354830100476x ◽

2001 ◽

Vol 10 (4) ◽

pp. 293-308 ◽

Cited By ~ 8

Author(s):

OURANIA CHRYSSAPHINOU ◽

STAVROS PAPASTAVRIDIS ◽

EUTICHIA VAGGELATOU

Keyword(s):

Markov Chain ◽

Markov Chains ◽

State Space ◽

Total Variation ◽

Analogous Result ◽

Poisson Approximation ◽

Total Variation Distance ◽

Numerical Example ◽

Stationary Markov Chain ◽

Variation Distance

Let X1, …, Xn be a sequence of r.v.s produced by a stationary Markov chain with state space an alphabet Ω = {ω1, …, ωq}, q ≥ 2. We consider a set of words {A1, …, Ar}, r ≥ 2, with letters from the alphabet Ω. We allow the words to have self-overlaps as well as overlaps between them. Let ℰ denote the event of the appearance of a word from the set {A1, …, Ar} at a given position. Moreover, define by N the number of non-overlapping (competing renewal) appearances of ℰ in the sequence X1, …, Xn. We derive a bound on the total variation distance between the distribution of N and a Poisson distribution with parameter 𝔼N. The Stein–Chen method and combinatorial arguments concerning the structure of words are employed. As a corollary, we obtain an analogous result for the i.i.d. case. Furthermore, we prove that, under quite general conditions, the r.v. N converges in distribution to a Poisson r.v. A numerical example is presented to illustrate the performance of the bound in the Markov case.
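The total variation distance used above can be illustrated on a much simpler case than the paper's. The sketch below compares a Binomial(n, p) count with a Poisson(np) distribution and checks the result against Le Cam's classical bound n·p². This is only an i.i.d. toy example under assumed parameters; it is not the bound derived in the paper, and the function names are hypothetical.

```python
import math


def binom_pmf(n, p, k):
    """P(B = k) for B ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam), lam > 0, computed in log space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def tv_binom_poisson(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    # Pointwise differences on the binomial support 0..n.
    diff = sum(abs(binom_pmf(n, p, k) - poisson_pmf(lam, k)) for k in range(n + 1))
    # Poisson mass above n has no binomial counterpart, so it counts in full.
    tail = max(0.0, 1.0 - sum(poisson_pmf(lam, k) for k in range(n + 1)))
    return 0.5 * (diff + tail)


if __name__ == "__main__":
    n, p = 50, 0.02
    print(tv_binom_poisson(n, p), "<=", n * p * p)  # Le Cam: TV <= n * p^2
```

The factor 0.5 reflects the convention that total variation distance is half the L1 distance between the two probability mass functions.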


Asymptotic behaviour of sample weighted circuits representing recurrent Markov chains

Journal of Applied Probability ◽

10.1017/s0021900200039103 ◽

1990 ◽

Vol 27 (03) ◽

pp. 545-556 ◽

Cited By ~ 2

Author(s):

S. Kalpazidou

Keyword(s):

Markov Chain ◽

Markov Chains ◽

Asymptotic Behaviour ◽

Probabilistic Interpretation ◽

Stationary Markov Chain

The asymptotic behaviour of the sequence (𝒞_n(ω), w_{c,n}(ω)/n) is studied, where 𝒞_n(ω) is the class of all cycles c occurring along the trajectory ω of a recurrent strictly stationary Markov chain (ξ_n) until time n and w_{c,n}(ω) is the number of occurrences of the cycle c until time n. The previous sequence of sample weighted classes converges almost surely to a class of directed weighted cycles (𝒞_∞, ω_c) which represents uniquely the chain (ξ_n) as a circuit chain, and ω_c is given a probabilistic interpretation.


On the absorption probabilities and mean time for absorption for discrete Markov chains

Monte Carlo Methods and Applications ◽

10.1515/mcma-2021-2084 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Nikolaos Halidias

Keyword(s):

Markov Chain ◽

Random Walk ◽

Markov Chains ◽

Generating Function ◽

Discrete Time ◽

Probability Generating Function ◽

The Mean ◽

Mean Time ◽

Absorption Probabilities

In this note we study the probability and the mean time for absorption for discrete time Markov chains. In particular, we are interested in estimating the mean time for absorption when absorption is not certain and connect it with some other known results. Computing a suitable probability generating function, we are able to estimate the mean time for absorption when absorption is not certain, giving some applications concerning the random walk. Furthermore, we investigate the probability for a Markov chain to reach a set A before reaching B, generalizing this result for a sequence of sets A_1, A_2, …, A_k.
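The probability of reaching a set A before a set B can be illustrated with a small numerical sketch. The code below uses the standard first-step equations h(x) = Σ_y P(x, y) h(y) with h = 1 on A and h = 0 on B, solved by simple iteration. This is not the generating-function approach of the note; the simple random walk example and all names (`hit_a_before_b`, `simple_walk`) are assumptions chosen for illustration.

```python
def simple_walk(n_states, p):
    """Transition matrix of a random walk on 0..n_states-1, absorbing at both ends."""
    P = [[0.0] * n_states for _ in range(n_states)]
    P[0][0] = 1.0
    P[-1][-1] = 1.0
    for i in range(1, n_states - 1):
        P[i][i + 1] = p        # step right
        P[i][i - 1] = 1.0 - p  # step left
    return P


def hit_a_before_b(P, A, B, sweeps=10_000, tol=1e-12):
    """Return h with h[x] = P_x(chain reaches A before B).

    Iterates the first-step equations starting from 0 outside A, which
    converges to the minimal non-negative solution.
    """
    n = len(P)
    h = [1.0 if x in A else 0.0 for x in range(n)]
    for _ in range(sweeps):
        delta = 0.0
        for x in range(n):
            if x in A or x in B:
                continue  # boundary values stay fixed
            new = sum(P[x][y] * h[y] for y in range(n))
            delta = max(delta, abs(new - h[x]))
            h[x] = new
        if delta < tol:
            break
    return h


if __name__ == "__main__":
    P = simple_walk(5, 0.5)
    print(hit_a_before_b(P, A={4}, B={0}))
```

For the symmetric walk on {0, ..., 4} the classical gambler's ruin answer is h(i) = i/4, which gives a quick sanity check on the iteration.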


Proportional lumpability and proportional bisimilarity

Acta Informatica ◽

10.1007/s00236-021-00404-y ◽

2021 ◽

Author(s):

Andrea Marin ◽

Carla Piazza ◽

Sabina Rossi

Keyword(s):

Markov Chain ◽

Markov Chains ◽

Upper And Lower Bounds ◽

General Definition ◽

Performance Indices ◽

State Aggregation ◽

Structural Regularity ◽

Original Definition ◽

Aggregation Technique ◽

Definition Of

In this paper, we deal with the lumpability approach to cope with the state space explosion problem inherent to the computation of the stationary performance indices of large stochastic models. The lumpability method is based on a state aggregation technique and applies to Markov chains exhibiting some structural regularity. Moreover, it allows one to efficiently compute the exact values of the stationary performance indices when the model is actually lumpable. The notion of quasi-lumpability is based on the idea that a Markov chain can be altered by relatively small perturbations of the transition rates in such a way that the new resulting Markov chain is lumpable. In this case, only upper and lower bounds on the performance indices can be derived. Here, we introduce a novel notion of quasi-lumpability, named proportional lumpability, which extends the original definition of lumpability but, differently from the general definition of quasi-lumpability, it allows one to derive exact stationary performance indices for the original process. We then introduce the notion of proportional bisimilarity for the terms of the performance process algebra PEPA. Proportional bisimilarity induces a proportional lumpability on the underlying continuous-time Markov chains. Finally, we prove some compositionality results and show the applicability of our theory through examples.


A Bayesian model for binary Markov chains

International Journal of Mathematics and Mathematical Sciences ◽

10.1155/s0161171204202319 ◽

2004 ◽

Vol 2004 (8) ◽

pp. 421-429 ◽

Cited By ~ 2

Author(s):

Souad Assoudou ◽

Belkheir Essebbar

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Markov Chains ◽

Bayesian Estimation ◽

Bayesian Model ◽

Transition Probabilities ◽

Simulated Data ◽

Bayesian Estimator ◽

Jeffreys Prior ◽

Data Set

This note is concerned with Bayesian estimation of the transition probabilities of a binary Markov chain observed from heterogeneous individuals. The model is based on Jeffreys' prior, which allows the transition probabilities to be correlated. The Bayesian estimator is approximated by means of Markov chain Monte Carlo (MCMC) techniques. The performance of the Bayesian estimates is illustrated by analyzing a small simulated data set.
