Information-theoretic inequalities for contoured probability distributions

2002 ◽  
Vol 48 (8) ◽  
pp. 2377-2383 ◽  
Author(s):  
O.G. Guleryuz ◽  
E. Lutwak ◽  
Deane Yang ◽  
Gaoyong Zhang


2021 ◽
Vol 18 (2) ◽  
pp. 172988142199958
Author(s):  
Larkin Folsom ◽  
Masahiro Ono ◽  
Kyohei Otsu ◽  
Hyoshin Park

Mission-critical exploration of uncertain environments requires reliable and robust mechanisms for achieving information gain. Typical measures of information gain, such as Shannon entropy and KL divergence, either fail to distinguish between different bimodal probability distributions or introduce bias toward one mode of a bimodal distribution. Using a standard deviation (SD) metric reduces this bias while retaining the ability to distinguish between higher- and lower-risk distributions. Areas of high SD can be safely explored through observation with an autonomous Mars Helicopter, allowing safer and faster path plans for ground-based rovers. First, this study presents a single-agent, information-theoretic, utility-based path planning method for a highly correlated uncertain environment. Then, an information-theoretic two-stage multiagent rapidly-exploring random tree framework is presented, which guides the Mars Helicopter through regions of high SD to reduce uncertainty for the rover. In a Monte Carlo simulation, we compare our information-theoretic framework with a rover-only approach and with a naive approach in which the helicopter scouts ahead of the rover along its planned path. Finally, the model is demonstrated in a case study on the Jezero region of Mars. Results show that the information-theoretic helicopter shortens the rover's travel time on average compared with the rover operating alone or with the helicopter scouting ahead along the rover's initially planned route.
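
A minimal sketch (ours, not from the paper) of the claim above: two discrete bimodal distributions built from the same probability masses have identical Shannon entropy, since entropy ignores where the mass sits on the support, while an SD metric separates them by their spread.

```python
import numpy as np

def shannon_entropy(p):
    """Entropy of a discrete distribution (nats)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def std_dev(x, p):
    """Standard deviation of a discrete distribution on support x."""
    mean = np.sum(p * x)
    return np.sqrt(np.sum(p * (x - mean) ** 2))

# Two bimodal distributions over hypothetical terrain-cost values:
# identical probability masses, but the second has more widely
# separated modes (higher risk).
x = np.arange(11.0)
p_narrow = np.array([0, .05, .4, .05, 0, 0, 0, .05, .4, .05, 0])
p_wide   = np.array([.05, .4, .05, 0, 0, 0, 0, 0, .05, .4, .05])

# Entropy sees only the masses, so it cannot tell them apart...
print(shannon_entropy(p_narrow), shannon_entropy(p_wide))  # equal
# ...while the SD reflects the spread (risk) of the distribution.
print(std_dev(x, p_narrow), std_dev(x, p_wide))            # differ
```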


1994 ◽  
Vol 72 (3-4) ◽  
pp. 130-133
Author(s):  
Paul B. Slater

Guiasu employed a statistical estimation principle to derive time-independent Schrödinger equations for the position, but, as is usual, not the spin, of a particle. Here, this principle is used instead to obtain Schrödinger-like equations for the spin, but not the position, of a particle. Steady states are described by continuous probability distributions, obtained by information-theoretic arguments, over spin measurements, states, and wave functions. These distributions serve as weight functions for orthogonal polynomials. The associated "wave functions," products of the polynomials and the square root of the weight function, satisfy differential equations that reduce to time-independent Schrödinger form at the point corresponding to the fully mixed spin-1/2 state.
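
A short sketch of the construction described above, in our notation rather than Slater's: if the polynomials are orthonormal for the weight function, then multiplying by the square root of the weight yields functions orthonormal for the flat measure, as eigenfunctions of a Schrödinger-type operator must be.

```latex
% Notation ours, not Slater's: w is the information-theoretic weight
% function and {p_n} its orthonormal polynomials.
\[
\int p_m(x)\,p_n(x)\,w(x)\,dx = \delta_{mn},
\qquad
\psi_n(x) := p_n(x)\sqrt{w(x)},
\]
% so the associated "wave functions" are orthonormal for the flat measure,
\[
\int \psi_m(x)\,\psi_n(x)\,dx = \delta_{mn},
\]
% as eigenfunctions of a Schrödinger-type operator must be.
```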


Author(s):  
Venkateshan Kannan ◽  
Jesper Tegner

We propose a novel systematic procedure of non-linear data transformation for an adaptive algorithm in the context of network reverse-engineering using information-theoretic methods. Our methodology is rooted in elucidating and correcting for the specific biases of estimation techniques for mutual information (MI) given a finite sample of data. These are, in turn, tied to the lack of well-defined bounds for the numerical estimation of MI for continuous probability distributions from finite data. The nature and properties of the inevitable bias are described, complemented by several examples illustrating its form and variation. We propose an adaptive partitioning scheme for MI estimation that effectively transforms the sample data using parameters determined from its local and global distribution, guaranteeing a more robust and reliable reconstruction algorithm. Together with a normalized measure (the Shared Information Metric), we report considerably enhanced performance both for
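
A minimal illustration (ours, not the authors' adaptive scheme) of the finite-sample bias described above: the naive plug-in histogram estimator reports strictly positive MI for independent continuous variables, and the bias grows as the partition is refined.

```python
import numpy as np

def plugin_mi(x, y, bins):
    """Naive plug-in MI estimate (nats) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = rng.normal(size=500)   # independent of x, so the true MI is 0

# Finite-sample bias: the estimate is strictly positive and grows
# with the number of partitions, even though the true MI is zero.
for bins in (4, 8, 16, 32):
    print(bins, plugin_mi(x, y, bins))
```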


2020 ◽  
Vol 34 (04) ◽  
pp. 5908-5915
Author(s):  
Yuan Sun ◽  
Wei Wang ◽  
Michael Kirley ◽  
Xiaodong Li ◽  
Jeffrey Chan

Feature selection has been shown to be beneficial for many data mining and machine learning tasks, especially for big data analytics. Mutual Information (MI) is a well-known information-theoretic approach used to evaluate the relevance of feature subsets to class labels. However, estimating high-dimensional MI poses significant challenges. Consequently, a great deal of research has focused on using low-order MI approximations or on computing a lower bound on MI called Variational Information (VI). These methods require assumptions about the probability distributions of the features that make the distributions realistic yet tractable to compute. In this paper, we reveal two sets of distribution assumptions underlying many MI- and VI-based methods: the Feature Independence Distribution and the Geometric Mean Distribution. We systematically analyze their strengths and weaknesses and propose a logical extension called the Arithmetic Mean Distribution, which leads to an unbiased and normalised estimation of probability densities. We conduct detailed empirical studies across a suite of 29 real-world classification problems and show that our methods achieve improved prediction accuracy by identifying more informative features, thus providing support for our theoretical findings.
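
As a concrete illustration of the low-order approximations mentioned above (not the paper's Arithmetic Mean Distribution method), the sketch below ranks features by the first-order term MI(X_i; Y) using scikit-learn; the dataset and parameters are our own choices.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif

# Rank features by MI(X_i; Y) -- the simplest low-order approximation
# to the intractable high-dimensional MI between a feature subset and
# the class labels.
X, y = load_breast_cancer(return_X_y=True)
mi = mutual_info_classif(X, y, random_state=0)
top = np.argsort(mi)[::-1][:5]
print("top-5 features by univariate MI:", top, mi[top])
```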


2014 ◽  
Vol 14 (11&12) ◽  
pp. 996-1013
Author(s):  
Alexey E. Rastegin

The information-theoretic approach to Bell's theorem is developed using conditional $q$-entropies. The $q$-entropic measures share many properties with the standard Shannon entropy. In general, both the locality and noncontextuality notions are treated in terms of so-called marginal scenarios. These hypotheses lead to the existence of a joint probability distribution that marginalizes to all the particular ones. Assuming the existence of such a joint probability distribution, we derive a family of Bell-type inequalities in terms of conditional $q$-entropies for all $q\geq1$. Quantum violations of the new inequalities are exemplified within the Clauser--Horne--Shimony--Holt (CHSH) and Klyachko--Can--Binicio\v{g}lu--Shumovsky (KCBS) scenarios. An extension to the $n$-cycle scenario is briefly mentioned. The new inequalities with conditional $q$-entropies enlarge the class of probability distributions for which nonlocality or contextuality can be detected within the entropic formulation. The $q$-entropic inequalities can also be useful in analyzing cases with detection inefficiencies. Using two such models, we consider some potential advantages of the $q$-entropic formulation.
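
For orientation, a hedged sketch of the quantities involved, in our notation: the Tsallis $q$-entropy, one common (not necessarily the paper's) conditional form, and the standard entropic CHSH inequality of Braunstein and Caves, which the family above generalizes.

```latex
% Notation ours. Tsallis q-entropy of a discrete variable X, and one
% common (not necessarily the paper's) conditional form:
\[
H_q(X) = \frac{1}{q-1}\Bigl(1 - \sum_x p(x)^q\Bigr),
\qquad
H_q(A \mid B) = H_q(A,B) - H_q(B).
\]
% At q = 1 these reduce to Shannon entropies, for which the standard
% entropic CHSH inequality of Braunstein and Caves reads:
\[
H(A_1 \mid B_2) \le H(A_1 \mid B_1) + H(B_1 \mid A_2) + H(A_2 \mid B_2).
\]
```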


Author(s):  
Bertrand Charpentier ◽  
Thomas Bonald

We introduce the tree sampling divergence (TSD), an information-theoretic metric for assessing the quality of a hierarchical clustering of a graph. Any hierarchical clustering of a graph can be represented as a tree whose nodes correspond to clusters of the graph. The TSD is the Kullback-Leibler divergence between two probability distributions over the nodes of this tree: those induced by sampling edges of the graph at random and by sampling node pairs at random, respectively. A fundamental property of the proposed metric is that it is interpretable in terms of graph reconstruction: it quantifies the ability to reconstruct the graph from the tree in terms of information loss. In particular, the TSD is maximal when perfect reconstruction is feasible, i.e., when the graph has a complete hierarchical structure. Another key property of the TSD is that it applies to any tree, not necessarily a binary one. In particular, the TSD can be used to compress a binary tree while minimizing the information loss in terms of graph reconstruction, so as to obtain a compact representation of the hierarchical structure of a graph. We illustrate the behavior of the TSD compared with existing metrics in experiments on both synthetic and real datasets.
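
A compact sketch under our own reading of the definition above: p is the distribution of the lowest common ancestor (LCA) of the endpoints of a randomly sampled edge, q is the LCA distribution of two nodes drawn independently from the weighted degree distribution, and the score is KL(p‖q). The helper names and the toy example are ours, not the authors'.

```python
import numpy as np

def ancestors(node, parent):
    """Path from a node up to the root of the hierarchy, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def lca(u, v, parent):
    """Lowest common ancestor of u and v in the hierarchy."""
    up = set(ancestors(u, parent))
    for a in ancestors(v, parent):
        if a in up:
            return a

def tsd(edges, parent):
    """Sketch of the tree sampling divergence: KL divergence between the
    LCA distribution of a random edge (p) and of an independent node
    pair (q)."""
    w = sum(weight for _, _, weight in edges)
    deg = {}
    for u, v, weight in edges:
        deg[u] = deg.get(u, 0) + weight
        deg[v] = deg.get(v, 0) + weight
    pi = {u: d / (2 * w) for u, d in deg.items()}  # weighted degrees
    p, q = {}, {}
    for u, v, weight in edges:          # p: LCA of a random edge
        x = lca(u, v, parent)
        p[x] = p.get(x, 0) + weight / w
    for u in pi:                        # q: LCA of independent nodes
        for v in pi:
            x = lca(u, v, parent)
            q[x] = q.get(x, 0) + pi[u] * pi[v]
    return sum(px * np.log(px / q[x]) for x, px in p.items())

# Toy example: two triangles joined by a bridge, with the natural
# two-cluster hierarchy ('c1' and 'c2' under the root 'r').
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
         (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
parent = {0: 'c1', 1: 'c1', 2: 'c1', 3: 'c2', 4: 'c2', 5: 'c2',
          'c1': 'r', 'c2': 'r'}
print(tsd(edges, parent))  # roughly 0.65 nats for this hierarchy
```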

