Information-theoretic inequalities for contoured probability distributions

2002 ◽  
Vol 48 (8) ◽  
pp. 2377-2383 ◽  
Author(s):  
O.G. Guleryuz ◽  
E. Lutwak ◽  
Deane Yang ◽  
Gaoyong Zhang


2021 ◽
Vol 18 (2) ◽  
pp. 172988142199958
Author(s):  
Larkin Folsom ◽  
Masahiro Ono ◽  
Kyohei Otsu ◽  
Hyoshin Park

Mission-critical exploration of uncertain environments requires reliable and robust mechanisms for achieving information gain. Typical measures of information gain, such as Shannon entropy and KL divergence, either fail to distinguish between different bimodal probability distributions or introduce bias toward one mode of a bimodal distribution. Using a standard deviation (SD) metric reduces this bias while retaining the ability to distinguish between higher- and lower-risk distributions. Areas of high SD can be safely explored through observation with an autonomous Mars Helicopter, allowing safer and faster path plans for ground-based rovers. First, this study presents a single-agent, information-theoretic, utility-based path planning method for a highly correlated uncertain environment. Then, an information-theoretic two-stage multiagent rapidly-exploring random tree framework is presented, which guides the Mars Helicopter through regions of high SD to reduce uncertainty for the rover. In a Monte Carlo simulation, we compare our information-theoretic framework with a rover-only approach and with a naive approach in which the helicopter scouts ahead of the rover along its planned path. Finally, the model is demonstrated in a case study on the Jezero region of Mars. Results show that the information-theoretic helicopter shortens the rover's travel time on average compared with the rover operating alone or with the helicopter scouting ahead along the rover's initially planned route.
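
A minimal sketch (ours, not from the paper) of the claim above: two discrete bimodal distributions built from the same probability masses have identical Shannon entropy, since entropy ignores where the mass sits on the support, while an SD metric separates them by their spread.

```python
import numpy as np

def shannon_entropy(p):
    """Entropy of a discrete distribution (nats)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def std_dev(x, p):
    """Standard deviation of a discrete distribution on support x."""
    mean = np.sum(p * x)
    return np.sqrt(np.sum(p * (x - mean) ** 2))

# Two bimodal distributions over hypothetical terrain-cost values:
# identical probability masses, but the second has more widely
# separated modes (higher risk).
x = np.arange(11.0)
p_narrow = np.array([0, .05, .4, .05, 0, 0, 0, .05, .4, .05, 0])
p_wide   = np.array([.05, .4, .05, 0, 0, 0, 0, 0, .05, .4, .05])

# Entropy sees only the masses, so it cannot tell them apart...
print(shannon_entropy(p_narrow), shannon_entropy(p_wide))  # equal
# ...while the SD reflects the spread (risk) of the distribution.
print(std_dev(x, p_narrow), std_dev(x, p_wide))            # differ
```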


1994 ◽  
Vol 72 (3-4) ◽  
pp. 130-133
Author(s):  
Paul B. Slater

Guiasu employed a statistical estimation principle to derive time-independent Schrödinger equations for the position, but, as is usual, not the spin, of a particle. Here, this principle is used instead to obtain Schrödinger-like equations for the spin, but not the position, of a particle. Steady states are described by continuous probability distributions, obtained by information-theoretic arguments, over spin measurements, states, and wave functions. These distributions serve as weight functions for orthogonal polynomials. The associated "wave functions," products of the polynomials and the square root of the weight function, satisfy differential equations that reduce to time-independent Schrödinger form at the point corresponding to the fully mixed spin-1/2 state.
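
A short sketch of the construction described above, in our notation rather than Slater's: if the polynomials are orthonormal for the weight function, then multiplying by the square root of the weight yields functions orthonormal for the flat measure, as eigenfunctions of a Schrödinger-type operator must be.

```latex
% Notation ours, not Slater's: w is the information-theoretic weight
% function and {p_n} its orthonormal polynomials.
\[
\int p_m(x)\,p_n(x)\,w(x)\,dx = \delta_{mn},
\qquad
\psi_n(x) := p_n(x)\sqrt{w(x)},
\]
% so the associated "wave functions" are orthonormal for the flat measure,
\[
\int \psi_m(x)\,\psi_n(x)\,dx = \delta_{mn},
\]
% as eigenfunctions of a Schrödinger-type operator must be.
```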


Author(s):  
Venkateshan Kannan ◽  
Jesper Tegner

We propose a novel systematic procedure of non-linear data transformation for an adaptive algorithm in the context of network reverse-engineering using information-theoretic methods. Our methodology is rooted in elucidating and correcting for the specific biases of estimation techniques for mutual information (MI) given a finite sample of data. These are, in turn, tied to the lack of well-defined bounds for the numerical estimation of MI for continuous probability distributions from finite data. The nature and properties of the inevitable bias are described, complemented by several examples illustrating its form and variation. We propose an adaptive partitioning scheme for MI estimation that effectively transforms the sample data using parameters determined from its local and global distribution, guaranteeing a more robust and reliable reconstruction algorithm. Together with a normalized measure (the Shared Information Metric), we report considerably enhanced performance both for
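
A minimal illustration (ours, not the authors' adaptive scheme) of the finite-sample bias described above: the naive plug-in histogram estimator reports strictly positive MI for independent continuous variables, and the bias grows as the partition is refined.

```python
import numpy as np

def plugin_mi(x, y, bins):
    """Naive plug-in MI estimate (nats) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=500)   # independent of x, so the true MI is 0

# Finite-sample bias: the estimate is strictly positive and grows
# with the number of partitions, even though the true MI is zero.
for bins in (4, 8, 16, 32):
    print(bins, plugin_mi(x, y, bins))
```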


2020 ◽  
Vol 34 (04) ◽  
pp. 5908-5915
Author(s):  
Yuan Sun ◽  
Wei Wang ◽  
Michael Kirley ◽  
Xiaodong Li ◽  
Jeffrey Chan

Feature selection has been shown to be beneficial for many data mining and machine learning tasks, especially for big data analytics. Mutual Information (MI) is a well-known information-theoretic approach used to evaluate the relevance of feature subsets to class labels. However, estimating high-dimensional MI poses significant challenges. Consequently, a great deal of research has focused on using low-order MI approximations or on computing a lower bound on MI called Variational Information (VI). These methods require assumptions about the probability distributions of the features that make the distributions realistic yet tractable to compute. In this paper, we reveal two sets of distribution assumptions underlying many MI- and VI-based methods: the Feature Independence Distribution and the Geometric Mean Distribution. We systematically analyze their strengths and weaknesses and propose a logical extension called the Arithmetic Mean Distribution, which leads to an unbiased and normalised estimation of probability densities. We conduct detailed empirical studies across a suite of 29 real-world classification problems and show that our methods achieve improved prediction accuracy by identifying more informative features, thus providing support for our theoretical findings.
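
As a concrete illustration of the low-order approximations mentioned above (not the paper's Arithmetic Mean Distribution method), the sketch below ranks features by the first-order term MI(X_i; Y) using scikit-learn; the dataset and parameters are our own choices.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

# Rank features by MI(X_i; Y) -- the simplest low-order approximation
# to the intractable high-dimensional MI between a feature subset and
# the class labels.
X, y = load_breast_cancer(return_X_y=True)
mi = mutual_info_classif(X, y, random_state=0)
top = np.argsort(mi)[::-1][:5]
print("top-5 features by univariate MI:", top, mi[top])
```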


2014 ◽  
Vol 14 (11&12) ◽  
pp. 996-1013
Author(s):  
Alexey E. Rastegin

The information-theoretic approach to Bell's theorem is developed using conditional $q$-entropies. The $q$-entropic measures share many properties with the standard Shannon entropy. In general, both the locality and noncontextuality notions are treated in terms of so-called marginal scenarios. These hypotheses lead to the existence of a joint probability distribution that marginalizes to all the particular ones. Assuming the existence of such a joint probability distribution, we derive a family of Bell-type inequalities in terms of conditional $q$-entropies for all $q\geq1$. Quantum violations of the new inequalities are exemplified within the Clauser--Horne--Shimony--Holt (CHSH) and Klyachko--Can--Binicio\v{g}lu--Shumovsky (KCBS) scenarios. An extension to the $n$-cycle scenario is briefly mentioned. The new inequalities with conditional $q$-entropies enlarge the class of probability distributions for which nonlocality or contextuality can be detected within the entropic formulation. The $q$-entropic inequalities can also be useful in analyzing cases with detection inefficiencies. Using two such models, we consider some potential advantages of the $q$-entropic formulation.
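
For orientation, a hedged sketch of the quantities involved, in our notation: the Tsallis $q$-entropy, one common (not necessarily the paper's) conditional form, and the standard entropic CHSH inequality of Braunstein and Caves, which the family above generalizes.

```latex
% Notation ours. Tsallis q-entropy of a discrete variable X, and one
% common (not necessarily the paper's) conditional form:
\[
H_q(X) = \frac{1}{q-1}\Bigl(1 - \sum_x p(x)^q\Bigr),
\qquad
H_q(A \mid B) = H_q(A,B) - H_q(B).
\]
% At q = 1 these reduce to Shannon entropies, for which the standard
% entropic CHSH inequality of Braunstein and Caves reads:
\[
H(A_1 \mid B_2) \le H(A_1 \mid B_1) + H(B_1 \mid A_2) + H(A_2 \mid B_2).
\]
```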


Author(s):  
Bertrand Charpentier ◽  
Thomas Bonald

We introduce the tree sampling divergence (TSD), an information-theoretic metric for assessing the quality of a hierarchical clustering of a graph. Any hierarchical clustering of a graph can be represented as a tree whose nodes correspond to clusters of the graph. The TSD is the Kullback-Leibler divergence between two probability distributions over the nodes of this tree: those induced by sampling edges of the graph at random and by sampling node pairs at random, respectively. A fundamental property of the proposed metric is that it is interpretable in terms of graph reconstruction: it quantifies the ability to reconstruct the graph from the tree in terms of information loss. In particular, the TSD is maximal when perfect reconstruction is feasible, i.e., when the graph has a complete hierarchical structure. Another key property of the TSD is that it applies to any tree, not necessarily a binary one. In particular, the TSD can be used to compress a binary tree while minimizing the information loss in terms of graph reconstruction, so as to obtain a compact representation of the hierarchical structure of a graph. We illustrate the behavior of the TSD compared with existing metrics in experiments on both synthetic and real datasets.
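
A compact sketch under our own reading of the definition above: p is the distribution of the lowest common ancestor (LCA) of the endpoints of a randomly sampled edge, q is the LCA distribution of two nodes drawn independently from the weighted degree distribution, and the score is KL(p‖q). The helper names and the toy example are ours, not the authors'.

```python
import numpy as np

def ancestors(node, parent):
    """Path from a node up to the root of the hierarchy, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def lca(u, v, parent):
    """Lowest common ancestor of u and v in the hierarchy."""
    up = set(ancestors(u, parent))
    for a in ancestors(v, parent):
        if a in up:
            return a

def tsd(edges, parent):
    """Sketch of the tree sampling divergence: KL divergence between the
    LCA distribution of a random edge (p) and of an independent node
    pair (q)."""
    w = sum(weight for _, _, weight in edges)
    deg = {}
    for u, v, weight in edges:
        deg[u] = deg.get(u, 0) + weight
        deg[v] = deg.get(v, 0) + weight
    pi = {u: d / (2 * w) for u, d in deg.items()}  # weighted degrees
    p, q = {}, {}
    for u, v, weight in edges:          # p: LCA of a random edge
        x = lca(u, v, parent)
        p[x] = p.get(x, 0) + weight / w
    for u in pi:                        # q: LCA of independent nodes
        for v in pi:
            x = lca(u, v, parent)
            q[x] = q.get(x, 0) + pi[u] * pi[v]
    return sum(px * np.log(px / q[x]) for x, px in p.items())

# Toy example: two triangles joined by a bridge, with the natural
# two-cluster hierarchy ('c1' and 'c2' under the root 'r').
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
         (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
parent = {0: 'c1', 1: 'c1', 2: 'c1', 3: 'c2', 4: 'c2', 5: 'c2',
          'c1': 'r', 'c2': 'r'}
print(tsd(edges, parent))  # roughly 0.65 nats for this hierarchy
```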

