A Finite-Sample, Distribution-Free, Probabilistic Lower Bound on Mutual Information

2011 ◽  
Vol 23 (7) ◽  
pp. 1862-1898 ◽  
Author(s):  
Nathan D. VanderKraats ◽  
Arunava Banerjee

For any memoryless communication channel with a binary-valued input and a one-dimensional real-valued output, we introduce a probabilistic lower bound on the mutual information given empirical observations on the channel. The bound is built on the Dvoretzky-Kiefer-Wolfowitz inequality and is distribution free. A quadratic-time algorithm is described for computing the bound and its corresponding class-conditional distribution functions. We compare our approach to existing techniques and show that our bound is superior to a method, inspired by Fano’s inequality, in which the continuous random variable is discretized.
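As a sketch of the Dvoretzky-Kiefer-Wolfowitz ingredient only (not the paper's bound or its quadratic-time algorithm), the snippet below builds distribution-free confidence bands around the two class-conditional empirical CDFs of a hypothetical binary-input channel; the Gaussian channel model, sample sizes and confidence level are illustrative assumptions.

```python
# Sketch: distribution-free DKW confidence bands around the class-conditional
# empirical CDFs of a binary-input, real-output channel. This only illustrates the
# Dvoretzky-Kiefer-Wolfowitz ingredient, not the paper's bound or algorithm.
import numpy as np

def dkw_band(samples, alpha=0.05):
    """Empirical CDF plus a two-sided band containing the true CDF with prob. >= 1 - alpha."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    ecdf = np.arange(1, n + 1) / n                      # F_n at the order statistics
    # DKW inequality: P(sup_x |F_n(x) - F(x)| > eps) <= 2 * exp(-2 * n * eps**2)
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    return x, ecdf, np.clip(ecdf - eps, 0.0, 1.0), np.clip(ecdf + eps, 0.0, 1.0), eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical channel: input bit 0 -> N(0,1) output, input bit 1 -> N(1,1) output.
    for label, mean in (("input 0", 0.0), ("input 1", 1.0)):
        y = rng.normal(mean, 1.0, size=500)
        x, ecdf, lo, hi, eps = dkw_band(y, alpha=0.05)
        print(f"{label}: n={x.size}, DKW half-width eps={eps:.4f}")
```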

1971 ◽  
Vol 8 (3) ◽  
pp. 431-453 ◽  
Author(s):  
J. Durbin

Let w(t), 0 ≦ t ≦ ∞, be a Brownian motion process, i.e., a zero-mean separable normal process with Pr{w(0) = 0} = 1, E{w(t1)w(t2)} = min(t1, t2), and let a, b denote the boundaries defined by y = a(t), y = b(t), where b(0) < 0 < a(0) and b(t) < a(t), 0 ≦ t ≦ T ≦ ∞. A basic problem in many fields such as diffusion theory, gambler's ruin, collective risk, Kolmogorov-Smirnov statistics, cumulative-sum methods, sequential analysis and optional stopping is that of calculating the probability that a sample path of w(t) crosses a or b before t = T. This paper shows how this probability may be computed for sufficiently smooth boundaries by numerical solution of integral equations for the first-passage distribution functions. The technique used is to approximate the integral equations by linear recursions whose coefficients are estimated by linearising the boundaries within subintervals. The results are extended to cover the tied-down process subject to the condition w(1) = 0. Some related results for the Poisson process and the sample distribution function are given. The procedures suggested are exemplified numerically, first by computing the probability that the tied-down Brownian motion process crosses a particular curved boundary for which the true probability is known, and secondly by computing the finite-sample and asymptotic powers of the Kolmogorov-Smirnov test against a shift in mean of the exponential distribution.
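As a rough illustration of the integral-equation idea (not Durbin's linearised-boundary recursion), the sketch below discretises the first-kind Volterra equation for the first-passage density onto a single smooth upper boundary and checks it against the reflection-principle value for a constant boundary; the boundary, horizon and grid size are assumptions.

```python
# Sketch: estimate P(w(t) crosses an upper boundary a(t) before T) for standard
# Brownian motion by discretising the first-kind Volterra equation
#     Pr{w(t) >= a(t)} = integral_0^t f(s) * Pr{w(t) >= a(t) | w(s) = a(s)} ds,
# where f is the first-passage density. A crude rectangle rule is used, with the
# diagonal kernel value 1/2 (the limit as s -> t for a smooth boundary); this is a
# simplified stand-in for, not a reproduction of, Durbin's linear recursions.
import math

def surv(z):
    """P(N(0,1) >= z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def crossing_probability(a, T=1.0, n=400):
    h = T / n
    t = [(i + 1) * h for i in range(n)]
    f = [0.0] * n                                   # discretised first-passage density
    for i in range(n):
        rhs = surv(a(t[i]) / math.sqrt(t[i]))
        acc = sum(h * f[j] * surv((a(t[i]) - a(t[j])) / math.sqrt(t[i] - t[j]))
                  for j in range(i))
        f[i] = (rhs - acc) / (0.5 * h)
    return h * sum(f)

if __name__ == "__main__":
    # Constant boundary a(t) = 1: the reflection principle gives the exact value
    # 2 * (1 - Phi(1)) ~ 0.3173, which the recursion should approach as n grows.
    approx = crossing_probability(lambda t: 1.0, T=1.0, n=400)
    exact = 2.0 * surv(1.0)
    print(f"recursion: {approx:.4f}   exact: {exact:.4f}")
```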


2020 ◽  
Vol 21 (3) ◽  
pp. 201-216
Author(s):  
Ingo Hoffmann ◽  
Christoph J. Börner

Purpose: This paper aims to evaluate the accuracy of a quantile estimate. Especially when high quantiles are estimated from only a few data points, the quantile estimator is itself a random variable with its own distribution. This distribution is first determined, and it is then shown how the accuracy of the quantile estimation can be assessed in practice.
Design/methodology/approach: The paper considers the situation in which the parent distribution of the data is unknown, the tail is modeled with the generalized Pareto distribution and the quantile is finally estimated using the fitted tail model. Based on well-known theoretical preliminary studies, the finite-sample distribution of the quantile estimator is determined and the accuracy of the estimator is quantified.
Findings: The algebraic representation of the finite-sample distribution of the quantile estimator was found in general form. With this distribution, all statistical quantities can be determined. In particular, the expected value, the variance and the bias of the quantile estimator are calculated to evaluate the accuracy of the estimation process. Scaling laws could be derived, and it turns out that with a fat tail and few data, the bias and the variance increase massively.
Research limitations/implications: Currently, the research is limited to the tail form that is of interest to the financial sector. Future research might consider problems where the tail has finite support or is over-fat.
Practical implications: The ability to calculate error bands and the bias of the quantile estimator is equally important for financial institutions, regulators and auditors.
Originality/value: Understanding the quantile estimator as a random variable, and analyzing and evaluating it on the basis of its distribution, gives researchers, regulators, auditors and practitioners new opportunities to assess risk.
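As a hedged sketch of the peaks-over-threshold setting the paper studies (not the authors' finite-sample distribution or scaling laws), the snippet below fits a generalized Pareto tail, plugs it into the standard POT quantile formula, and bootstraps the whole procedure to make the estimator's own randomness visible; the data-generating process, threshold and quantile level are illustrative assumptions.

```python
# Sketch: peaks-over-threshold quantile estimation with a generalized Pareto tail,
# plus a bootstrap showing that the quantile estimator is itself a random variable.
# Threshold, sample size and target level are illustrative assumptions.
import numpy as np
from scipy.stats import genpareto

def gpd_quantile(data, p, threshold):
    """Estimate the p-quantile from a GPD fit to exceedances over `threshold` (assumes shape xi != 0)."""
    data = np.asarray(data, dtype=float)
    exceed = data[data > threshold] - threshold
    n, n_u = data.size, exceed.size
    xi, _, sigma = genpareto.fit(exceed, floc=0.0)       # shape xi, scale sigma
    # Standard POT formula: q_p = u + (sigma/xi) * (((n/n_u) * (1 - p))**(-xi) - 1)
    return threshold + (sigma / xi) * (((n / n_u) * (1.0 - p)) ** (-xi) - 1.0)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sample = rng.standard_t(df=4, size=1000)             # heavy-tailed toy data
    u = np.quantile(sample, 0.90)                        # illustrative threshold
    q99 = gpd_quantile(sample, p=0.99, threshold=u)

    # Bootstrap the whole procedure to see the estimator's own variability.
    boot = [gpd_quantile(rng.choice(sample, size=sample.size, replace=True), 0.99, u)
            for _ in range(200)]
    print(f"point estimate of 99% quantile: {q99:.3f}")
    print(f"bootstrap mean {np.mean(boot):.3f}, std {np.std(boot):.3f}")
```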


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 533
Author(s):  
Milan S. Derpich ◽  
Jan Østergaard

We present novel data-processing inequalities relating the mutual information and the directed information in systems with feedback. The internal deterministic blocks within such systems are restricted only to be causal mappings; they may be non-linear and time-varying and, when randomized by their own external random input, can yield any stochastic mapping. These randomized blocks can, for example, represent source encoders, decoders, or even communication channels. Moreover, the involved signals can be arbitrarily distributed. Our first main result relates mutual and directed information and can be interpreted as a law of conservation of information flow. Our second main result is a pair of data-processing inequalities (one the conditional version of the other) between nested pairs of random sequences entirely within the closed loop. Our third main result introduces and characterizes the notion of in-the-loop (ITL) transmission rate for channel-coding scenarios in which the messages are internal to the loop. Interestingly, in this case the conventional notions of transmission rate, associated with the entropy of the messages, and of channel capacity, based on maximizing the mutual information between the messages and the output, turn out to be inadequate. Instead, as we show, the ITL transmission rate is the unique notion of rate for which a channel code attains zero error probability if and only if the ITL rate does not exceed the corresponding directed information rate from messages to decoded messages. We apply our data-processing inequalities to show that the supremum of achievable (in the usual channel-coding sense) ITL transmission rates is upper-bounded by the supremum of the directed information rate across the communication channel. Moreover, we present an example in which this upper bound is attained. Finally, we further illustrate the applicability of our results by discussing how they make possible the generalization of two fundamental inequalities known in the networked control literature.
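As a toy numerical companion (not the paper's generalized results), the sketch below computes mutual and directed information exactly for a two-step binary symmetric channel with one-step feedback and checks the classical Massey-type conservation identity that the abstract's "law of conservation of information flow" echoes; the channel, feedback rule and crossover probability are assumptions.

```python
# Sketch: exact mutual and directed information for a toy two-step binary
# symmetric channel with one-step feedback, checking the classical conservation
# identity I(X^n; Y^n) = I(X^n -> Y^n) + I(DY^n -> X^n) (Massey & Massey),
# where D denotes a one-sample delay. This is a toy check only, not the paper's results.
from itertools import product
from math import log2

EPS = 0.1      # assumed crossover probability of each channel use

# Joint pmf over (x1, x2, y1, y2): x1 uniform, y_i = x_i XOR z_i, feedback x2 = y1.
pmf = {}
for x1, z1, z2 in product((0, 1), repeat=3):
    y1 = x1 ^ z1
    x2 = y1                      # feedback: second input copies the first output
    y2 = x2 ^ z2
    p = 0.5 * (EPS if z1 else 1 - EPS) * (EPS if z2 else 1 - EPS)
    key = (x1, x2, y1, y2)
    pmf[key] = pmf.get(key, 0.0) + p

def marginal(joint, idx):
    out = {}
    for k, p in joint.items():
        kk = tuple(k[i] for i in idx)
        out[kk] = out.get(kk, 0.0) + p
    return out

def cmi(joint, a_idx, b_idx, c_idx):
    """Conditional mutual information I(A; B | C) in bits from a joint pmf."""
    pabc = marginal(joint, a_idx + b_idx + c_idx)
    pac = marginal(joint, a_idx + c_idx)
    pbc = marginal(joint, b_idx + c_idx)
    pc = marginal(joint, c_idx)
    total = 0.0
    for k, p in joint.items():
        if p == 0.0:
            continue
        a = tuple(k[i] for i in a_idx)
        b = tuple(k[i] for i in b_idx)
        c = tuple(k[i] for i in c_idx)
        total += p * log2(pabc[a + b + c] * pc[c] / (pac[a + c] * pbc[b + c]))
    return total

# Key positions: 0 = x1, 1 = x2, 2 = y1, 3 = y2.
mi = cmi(pmf, (0, 1), (2, 3), ())                                  # I(X^2; Y^2)
di = cmi(pmf, (0,), (2,), ()) + cmi(pmf, (0, 1), (3,), (2,))       # I(X^2 -> Y^2)
rev = cmi(pmf, (2,), (1,), (0,))                                   # I(DY^2 -> X^2) = I(Y1; X2 | X1)
print(f"I(X^2;Y^2)={mi:.4f}  I(X^2->Y^2)={di:.4f}  reverse={rev:.4f}  sum={di + rev:.4f}")
```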

