Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering

Author(s):  
Bertrand Charpentier ◽  
Thomas Bonald

We introduce the tree sampling divergence (TSD), an information-theoretic metric for assessing the quality of a hierarchical clustering of a graph. Any hierarchical clustering of a graph can be represented as a tree whose nodes correspond to clusters of the graph. The TSD is the Kullback-Leibler divergence between two probability distributions over the nodes of this tree: those induced, respectively, by sampling edges and node pairs of the graph at random. A fundamental property of the proposed metric is that it is interpretable in terms of graph reconstruction: it quantifies, as an information loss, the ability to reconstruct the graph from the tree. In particular, the TSD is maximal when perfect reconstruction is feasible, i.e., when the graph has a complete hierarchical structure. Another key property of the TSD is that it applies to any tree, not only binary trees. In particular, the TSD can be used to compress a binary tree while minimizing the information loss in terms of graph reconstruction, so as to obtain a compact representation of the hierarchical structure of a graph. We illustrate the behavior of the TSD, compared with existing metrics, in experiments on both synthetic and real datasets.
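To make the construction concrete, here is a minimal Python sketch of one plausible reading of the TSD computation on a toy graph: p is induced by taking the lowest common ancestor (LCA) of a sampled edge's endpoints, q by sampling two nodes independently in proportion to degree. The toy graph, hierarchy, and function names are illustrative assumptions, not the authors' reference implementation.

```python
import math

# Toy undirected graph as an edge list (unit weights) -- illustrative.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]

# Hierarchy over the 6 leaves as parent pointers: leaves 0-2 merge
# under internal node 6, leaves 3-5 under 7, and the root 8 joins them.
parent = {0: 6, 1: 6, 2: 6, 3: 7, 4: 7, 5: 7, 6: 8, 7: 8, 8: None}

def ancestors(u):
    path = []
    while u is not None:
        path.append(u)
        u = parent[u]
    return path

def lca(u, v):
    anc_u = set(ancestors(u))
    return next(a for a in ancestors(v) if a in anc_u)

# p: distribution over tree nodes from sampling an edge uniformly
# and taking the LCA of its endpoints.
p = {}
for u, v in edges:
    a = lca(u, v)
    p[a] = p.get(a, 0.0) + 1.0 / len(edges)

# q: distribution from sampling two nodes independently, each in
# proportion to its degree (a configuration-model style baseline).
deg = {}
for u, v in edges:
    deg[u] = deg.get(u, 0) + 1
    deg[v] = deg.get(v, 0) + 1
total = float(sum(deg.values()))
pi = {u: d / total for u, d in deg.items()}

q = {}
for u in pi:
    for v in pi:  # self-pairs land on leaves, outside p's support
        a = lca(u, v)
        q[a] = q.get(a, 0.0) + pi[u] * pi[v]

# TSD as the KL divergence between p and q over the tree's nodes.
tsd = sum(w * math.log(w / q[a]) for a, w in p.items())
print(f"TSD = {tsd:.4f}")
```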

Author(s):  
Ryan Ka Yau Lai ◽  
Youngah Do

This article explores a method of creating confidence bounds for information-theoretic measures in linguistics, such as entropy, Kullback-Leibler Divergence (KLD), and mutual information. We show that a useful measure of uncertainty can be derived from simple statistical principles, namely the asymptotic distribution of the maximum likelihood estimator (MLE) and the delta method. Three case studies from phonology and corpus linguistics are used to demonstrate how to apply it and examine its robustness against common violations of its assumptions in linguistics, such as insufficient sample size and non-independence of data points.
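As a concrete illustration of the entropy case, here is a minimal Python sketch assuming counts over a discrete alphabet: the asymptotic variance of the plug-in estimator, Var[-log p(X)]/n, follows from the MLE's asymptotic normality together with the delta method. The function name and example counts are illustrative, not the article's own code.

```python
import numpy as np

def entropy_ci(counts, z=1.96):
    """Plug-in (MLE) entropy in nats, with a delta-method confidence
    interval: Var(H_hat) ~= Var[-log p(X)] / n."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts / n
    logp = np.log(p, where=p > 0, out=np.zeros_like(p))
    h = -np.sum(p * logp)                      # plug-in entropy
    var_surprise = np.sum(p * logp**2) - h**2  # Var[-log p(X)]
    se = np.sqrt(var_surprise / n)
    return h, (h - z * se, h + z * se)

# Example: counts of phoneme categories in a toy corpus.
h, (lo, hi) = entropy_ci([120, 80, 40, 10])
print(f"H = {h:.3f} nats, 95% CI = ({lo:.3f}, {hi:.3f})")
```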


2021 ◽  
Vol 18 (2) ◽  
pp. 172988142199958
Author(s):  
Larkin Folsom ◽  
Masahiro Ono ◽  
Kyohei Otsu ◽  
Hyoshin Park

Mission-critical exploration of uncertain environments requires reliable and robust mechanisms for achieving information gain. Typical measures of information gain, such as Shannon entropy and KL divergence, either cannot distinguish between different bimodal probability distributions or introduce bias toward one mode of the distribution. A standard deviation (SD) metric reduces this bias while retaining the ability to distinguish between higher- and lower-risk distributions. Areas of high SD can be safely explored through observation with an autonomous Mars Helicopter, allowing safer and faster path plans for ground-based rovers. First, this study presents a single-agent, information-theoretic, utility-based path planning method for a highly correlated uncertain environment. Then, an information-theoretic two-stage multiagent rapidly exploring random tree framework is presented, which guides the Mars Helicopter through regions of high SD to reduce uncertainty for the rover. In a Monte Carlo simulation, we compare our information-theoretic framework with a rover-only approach and a naive approach in which the helicopter scouts ahead of the rover along its planned path. Finally, the model is demonstrated in a case study on the Jezero region of Mars. Results show that the information-theoretic helicopter improves the travel time for the rover, on average, compared with the rover alone or with the helicopter scouting ahead along the rover's initially planned route.
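The following minimal Python sketch, with made-up numbers, illustrates the motivating point: two bimodal beliefs can have identical Shannon entropy yet very different standard deviations, so an SD-based utility can separate them where entropy cannot.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, float)
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

def std_dev(values, p):
    mean = np.sum(values * p)
    return np.sqrt(np.sum(p * (values - mean) ** 2))

x = np.arange(11)  # e.g. a discretized terrain-uncertainty variable
near = np.zeros(11); near[[4, 6]] = 0.5    # bimodal, modes close together
far = np.zeros(11); far[[0, 10]] = 0.5     # bimodal, modes far apart

for name, p in [("near", near), ("far", far)]:
    print(name, f"entropy = {entropy(p):.3f}", f"sd = {std_dev(x, p):.2f}")
# Both distributions have entropy log(2) ~ 0.693 nats, but SD = 1 vs 5:
# SD flags the wider (riskier) belief that entropy cannot distinguish.
```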


2021 ◽  
Author(s):  
Jacob Atticus Armstrong Goodall

Abstract A duality theorem is stated and proved for a minimax vector optimisation problem where the vectors are elements of the set of products of compact Polish spaces. A special case of this theorem is derived to show that two metrics on the space of probability distributions on countable products of Polish spaces are identical. The appendix includes a proof that, under the appropriate conditions, the function studied in the optimisation problem is indeed a metric. The optimisation problem is comparable to multi-commodity optimal transport where there is dependence between commodities. This paper builds on the work of R.S. MacKay, who introduced the metrics in the context of complexity science in [4] and [5]. The metrics have the advantage of measuring distance uniformly over the whole network, while other metrics on probability distributions fail to do so (e.g. total variation, Kullback–Leibler divergence; see [5]). This opens up the potential of mathematical optimisation in the setting of complexity science.


2002 ◽  
Vol 11 (1) ◽  
pp. 79-95 ◽  
Author(s):  
DUDLEY STARK ◽  
A. GANESH ◽  
NEIL O’CONNELL

We study the asymptotic behaviour of the relative entropy (to stationarity) for a commonly used model for riffle shuffling a deck of n cards m times. Our results establish, and were motivated by, a prediction in a recent numerical study of Trefethen and Trefethen. Loosely speaking, the relative entropy decays approximately linearly (in m) for m < log₂ n, and approximately exponentially for m > log₂ n. The deck becomes random in this information-theoretic sense after m = (3/2) log₂ n shuffles.
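A Monte Carlo sketch of the setup, not the paper's analysis: simulate Gilbert-Shannon-Reeds (GSR) riffles of a small deck and track the plug-in estimate of the relative entropy to the uniform distribution, D(P_m ‖ U) = log(n!) − H(P_m). The plug-in estimator is biased upward when permutations are undersampled, so treat the numbers as qualitative.

```python
import math
import random
from collections import Counter

def gsr_shuffle(deck):
    """One Gilbert-Shannon-Reeds riffle: cut at a Binomial(n, 1/2)
    position, then interleave, dropping the next card from a packet
    with probability proportional to the packet's remaining size."""
    n = len(deck)
    cut = sum(random.random() < 0.5 for _ in range(n))
    left, right = deck[:cut], deck[cut:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        a, b = len(left) - i, len(right) - j
        if random.random() < a / (a + b):
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out

def kl_to_uniform(n=5, m=2, trials=100_000):
    """Plug-in estimate of D(P_m || U) = log(n!) - H(P_m)."""
    counts = Counter()
    for _ in range(trials):
        deck = list(range(n))
        for _ in range(m):
            deck = gsr_shuffle(deck)
        counts[tuple(deck)] += 1
    h = -sum(c / trials * math.log(c / trials) for c in counts.values())
    return math.log(math.factorial(n)) - h

for m in range(1, 7):  # log2(5) ~ 2.3, so the regime change is near m = 2
    print(f"m = {m}: KL to uniform ~ {kl_to_uniform(m=m):.3f} nats")
```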


Author(s):  
M. Vidyasagar

This chapter provides an introduction to some elementary aspects of information theory, including entropy in its various forms. Entropy refers to the level of uncertainty associated with a random variable (or more precisely, the probability distribution of the random variable). When there are two or more random variables, it is worthwhile to study the conditional entropy of one random variable with respect to another. The last concept is relative entropy, also known as the Kullback–Leibler divergence, which measures the “disparity” between two probability distributions. The chapter first considers convex and concave functions before discussing the properties of the entropy function, conditional entropy, uniqueness of the entropy function, and the Kullback–Leibler divergence.
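A minimal Python sketch of the three quantities the chapter covers, for discrete distributions; the joint table and values are illustrative.

```python
import numpy as np

def entropy(p):
    """H(X) = -sum p(x) log p(x), in bits."""
    p = np.asarray(p, float)
    nz = p[p > 0]
    return -np.sum(nz * np.log2(nz))

def conditional_entropy(joint):
    """H(Y|X) = H(X,Y) - H(X), for a joint table joint[x, y]."""
    joint = np.asarray(joint, float)
    return entropy(joint.ravel()) - entropy(joint.sum(axis=1))

def kl_divergence(p, q):
    """D(p || q) = sum p log(p/q); requires q > 0 wherever p > 0."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])          # joint distribution of (X, Y)
print(conditional_entropy(joint))       # H(Y|X) < H(Y): conditioning helps
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))  # "disparity" between p and q
```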


2020 ◽  
Vol 2020 (9) ◽  
Author(s):  
Steven B. Giddings ◽  
Gustavo J. Turiaci

Abstract We investigate contributions of spacetime wormholes, describing baby universe emission and absorption, to calculations of entropies and correlation functions, for example those based on the replica method. We find that the rules of the “wormhole calculus”, developed in the 1980s, together with standard quantum mechanical prescriptions for computing entropies and correlators, imply definite rules for limited patterns of connection between replica factors in simple calculations. These results stand in contrast with assumptions that all topologies connecting replicas should be summed over, and call into question the explanation for the latter. In a “free” approximation baby universes introduce probability distributions for coupling constants, and we review and extend arguments that successive experiments in a “parent” universe increasingly precisely fix such couplings, resulting in ultimately pure evolution. Once this has happened, the nontrivial question remains of how topology-changing effects can modify the standard description of black hole information loss.


2011 ◽  
Vol 11 (2-3) ◽  
pp. 263-296 ◽  
Author(s):  
SHAY B. COHEN ◽  
ROBERT J. SIMMONS ◽  
NOAH A. SMITH

Abstract Weighted logic programming, a generalization of bottom-up logic programming, is a well-suited framework for specifying dynamic programming algorithms. In this setting, proofs correspond to the algorithm's output space, such as a path through a graph or a grammatical derivation, and are given a real-valued score (often interpreted as a probability) that depends on the real weights of the base axioms used in the proof. The desired output is a function over all possible proofs, such as a sum of scores or an optimal score. We describe the product transformation, which can merge two weighted logic programs into a new one. The resulting program optimizes a product of proof scores from the original programs, constituting a scoring function known in machine learning as a “product of experts.” Through the addition of intuitive constraining side conditions, we show that several important dynamic programming algorithms can be derived by applying product to weighted logic programs corresponding to simpler weighted logic programs. In addition, we show how the computation of Kullback–Leibler divergence, an information-theoretic measure, can be interpreted using product.
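As a loose illustration only: in the Python sketch below, two tiny weighted automata over a shared alphabet stand in for weighted logic programs, their paths play the role of proofs, and the intersection construction plays the role of the product transformation, so the best joint path score is a "product of experts." This is an assumption-laden analogy, not the paper's logic-programming formalism.

```python
# Each "expert" is a tiny weighted automaton: a start state, accepting
# states, and transitions (state, symbol) -> [(next_state, weight)].
# A path's score is the product of its transition weights, standing in
# for the score of a proof in a weighted logic program.
expert1 = {
    "start": 0, "accept": {1},
    "trans": {(0, "a"): [(0, 0.6), (1, 0.4)], (1, "b"): [(1, 0.9)]},
}
expert2 = {
    "start": 0, "accept": {1},
    "trans": {(0, "a"): [(1, 0.5)], (1, "a"): [(1, 0.3)],
              (1, "b"): [(1, 0.8)]},
}

def product_best(e1, e2, length, alphabet=("a", "b")):
    """Viterbi over the product construction: states are pairs, a joint
    transition exists only when both experts read the same symbol, and
    its weight is the product of the two experts' weights."""
    best = {(e1["start"], e2["start"]): 1.0}
    for _ in range(length):
        nxt = {}
        for (s1, s2), w in best.items():
            for sym in alphabet:
                for t1, w1 in e1["trans"].get((s1, sym), []):
                    for t2, w2 in e2["trans"].get((s2, sym), []):
                        cand = w * w1 * w2
                        if cand > nxt.get((t1, t2), 0.0):
                            nxt[(t1, t2)] = cand
        best = nxt
    joint = {s: w for s, w in best.items()
             if s[0] in e1["accept"] and s[1] in e2["accept"]}
    return max(joint.values()) if joint else 0.0

# Best product-of-experts score over strings of length 3.
print(product_best(expert1, expert2, length=3))  # 0.10368
```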


1994 ◽  
Vol 72 (3-4) ◽  
pp. 130-133
Author(s):  
Paul B. Slater

Guiasu employed a statistical estimation principle to derive time-independent Schrödinger equations for the position but, as is usual, not the spin of a particle. Here, on the other hand, this principle is used to obtain Schrödinger-like equations for the spin but not the position of a particle. Steady states are described by continuous probability distributions, obtained by information-theoretic arguments, over spin measurements, states, and wave functions. These distributions serve as weight functions for orthogonal polynomials. Associated "wave functions," products of the polynomials and the square root of the weight function, satisfy differential equations, reducing to time-independent Schrödinger form at the point corresponding to the fully mixed spin-1/2 state.


2011 ◽  
Vol 139 (7) ◽  
pp. 2156-2162 ◽  
Author(s):  
Steven V. Weijs ◽  
Nick van de Giesen

Abstract Recently, an information-theoretical decomposition of Kullback–Leibler divergence into uncertainty, reliability, and resolution was introduced. In this article, this decomposition is generalized to the case where the observation is uncertain. Along with a modified decomposition of the divergence score, a second measure, the cross-entropy score, is presented, which measures the estimated information loss with respect to the truth instead of relative to the uncertain observations. The difference between the two scores is equal to the average observational uncertainty and vanishes when observations are assumed to be perfect. Not accounting for observation uncertainty can lead to both overestimation and underestimation of forecast skill, depending on the nature of the noise process.
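A minimal Python sketch of the relationship the abstract states, with made-up numbers: the cross-entropy score minus the divergence score equals the entropy of the observation distribution, which vanishes when the observation is perfect.

```python
import numpy as np

def divergence_score(obs, forecast):
    """KL(obs || forecast): skill relative to the (possibly uncertain)
    observation distribution."""
    mask = obs > 0
    return np.sum(obs[mask] * np.log(obs[mask] / forecast[mask]))

def cross_entropy_score(obs, forecast):
    """-sum obs * log forecast: estimated information loss w.r.t. truth."""
    mask = obs > 0
    return -np.sum(obs[mask] * np.log(forecast[mask]))

forecast = np.array([0.6, 0.3, 0.1])
uncertain_obs = np.array([0.8, 0.2, 0.0])  # observation with uncertainty
perfect_obs = np.array([1.0, 0.0, 0.0])    # certain observation

for obs in (uncertain_obs, perfect_obs):
    ds = divergence_score(obs, forecast)
    ce = cross_entropy_score(obs, forecast)
    h_obs = -np.sum(obs[obs > 0] * np.log(obs[obs > 0]))
    print(f"CE - DS = {ce - ds:.4f}, H(obs) = {h_obs:.4f}")
# The gap CE - DS equals the observational uncertainty H(obs),
# and it vanishes for the perfect observation.
```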


2013 ◽  
Vol 427-429 ◽  
pp. 1537-1543 ◽  
Author(s):  
Ya Fen Wang ◽  
Feng Zhen Zhang ◽  
Shan Jian Liu ◽  
Meng Huang

In this paper, we study an information-theoretic approach to image similarity measurement for content-based image retrieval. In this scheme, similarities are measured by the amount of information the images contain about one another, that is, their mutual information (MI). The approach is based on the premise that two similar images should have high mutual information, or equivalently, that the querying image should convey much information about those similar to it. The method first generates a set of statistically representative visual patterns and uses the distributions of these patterns as image content descriptors. To measure the similarity of two images, we develop a method to compute the mutual information between their content descriptors; two images with larger descriptor mutual information are regarded as more similar. We present experimental results demonstrating that mutual information is a more effective image similarity measure than those previously used in the literature, such as the Kullback-Leibler divergence and L2 norms.
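The paper's descriptors are distributions over learned visual patterns; since those are not reproduced here, the Python sketch below computes mutual information between two grayscale images from a joint intensity histogram instead, a standard stand-in, using synthetic images.

```python
import numpy as np

def mutual_information(img1, img2, bins=32):
    """MI between two equal-size grayscale images from their joint
    intensity histogram: I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    joint, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)

    def h(p):
        nz = p[p > 0]
        return -np.sum(nz * np.log2(nz))

    return h(px) + h(py) - h(pxy.ravel())

rng = np.random.default_rng(0)
base = rng.random((64, 64))
noisy = np.clip(base + 0.1 * rng.standard_normal((64, 64)), 0, 1)
unrelated = rng.random((64, 64))

print(mutual_information(base, noisy))      # high: images share content
print(mutual_information(base, unrelated))  # near zero: independent images
```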

