Data compression to define information content of hydrological time series

2013 ◽  
Vol 10 (2) ◽  
pp. 2029-2065 ◽  
Author(s):  
S. V. Weijs ◽  
N. van de Giesen ◽  
M. B. Parlange

Abstract. When inferring models from hydrological data or calibrating hydrological models, we might be interested in the information content of those data to quantify how much can potentially be learned from them. In this work we take a perspective from (algorithmic) information theory (AIT) to discuss some underlying issues regarding this question. In the information-theoretical framework, there is a strong link between information content and data compression. We exploit this by using data compression performance as a time series analysis tool and highlight the analogy to information content, prediction, and learning (understanding is compression). The analysis is performed on time series of a set of catchments, searching for the mechanisms behind compressibility. We discuss the deeper foundation from algorithmic information theory, some practical results, and the inherent difficulties in answering the question: "How much information is contained in this data?". The conclusion is that the answer to this question can only be given once the following counter-questions have been answered: (1) Information about which unknown quantities? (2) What is your current state of knowledge/beliefs about those quantities? Quantifying the information content of hydrological data is closely linked to the question of separating aleatoric and epistemic uncertainty and quantifying the maximum possible model performance, as addressed in the current hydrological literature. The AIT perspective teaches us that it is impossible to answer this question objectively without specifying prior beliefs. These beliefs are related to the maximum complexity one is willing to accept as a law and what is considered as random.
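The compression-as-analysis idea in this abstract can be sketched in a few lines: compressing a quantized time series with a general-purpose compressor gives a computable upper bound on its information content in bits per sample. Here zlib stands in for whatever compressors the authors benchmark, and the synthetic series are illustrative assumptions, not their catchment data.

```python
import random
import zlib

def compressed_bits_per_sample(values, n_bins=256):
    """Quantize a series to n_bins levels and return the zlib-compressed
    size in bits per sample, a computable upper bound on information
    content (a practical stand-in for Kolmogorov complexity)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant series
    quantized = bytes(int((v - lo) / span * (n_bins - 1)) for v in values)
    return 8 * len(zlib.compress(quantized, 9)) / len(values)

random.seed(0)
n = 10_000
constant = [1.0] * n                          # fully predictable series
noise = [random.random() for _ in range(n)]   # incompressible at 8-bit precision

print(compressed_bits_per_sample(constant))   # close to 0 bits/sample
print(compressed_bits_per_sample(noise))      # close to 8 bits/sample
```

A structured hydrological series (seasonal cycles, rainfall-driven recessions) would land between these two extremes, and how far below 8 bits/sample it falls is exactly the "mechanism behind compressibility" the abstract refers to.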

2013 ◽  
Vol 17 (8) ◽  
pp. 3171-3187 ◽  
Author(s):  
S. V. Weijs ◽  
N. van de Giesen ◽  
M. B. Parlange

Abstract. When inferring models from hydrological data or calibrating hydrological models, we are interested in the information content of those data to quantify how much can potentially be learned from them. In this work we take a perspective from (algorithmic) information theory, (A)IT, to discuss some underlying issues regarding this question. In the information-theoretical framework, there is a strong link between information content and data compression. We exploit this by using data compression performance as a time series analysis tool and highlight the analogy to information content, prediction, and learning (understanding is compression). The analysis is performed on time series of a set of catchments. We discuss the deeper foundation from algorithmic information theory, some practical results, and the inherent difficulties in answering the following question: "How much information is contained in this data set?". The conclusion is that the answer can only be given once the following counter-questions have been answered: (1) information about which unknown quantities? and (2) what is your current state of knowledge/beliefs about those quantities? Quantifying the information content of hydrological data is closely linked to the question of separating aleatoric and epistemic uncertainty and quantifying the maximum possible model performance, as addressed in the current hydrological literature. The AIT perspective teaches us that it is impossible to answer this question objectively without specifying prior beliefs.
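The second counter-question, about one's current state of knowledge, has a concrete Shannon-theoretic reading: under an i.i.d. belief, the information content per sample is just the entropy of the empirical marginal distribution, and any further compressibility reflects structure that belief ignores. A minimal sketch (the series and binning are illustrative assumptions):

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy (bits/symbol) of the empirical marginal distribution:
    the per-sample information content under an i.i.d. belief."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# A perfectly periodic series: high marginal entropy, yet fully predictable
# once the period is known, so the i.i.d. belief overstates what is left to learn.
periodic = [0, 1, 2, 3] * 1000

print(entropy_bits(periodic))    # 2.0 bits/sample under the i.i.d. belief
print(entropy_bits([0] * 4000))  # 0.0 bits/sample: nothing left to learn
```

A dictionary compressor such as zlib would shrink the periodic series far below 2 bits/sample, demonstrating the abstract's point: the "information content" you measure depends on the prior beliefs (here, i.i.d. versus models that admit periodicity) you start from.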


2020 ◽  
Vol 2 (1) ◽  
pp. 32-35
Author(s):  
Eric Holloway

Leonid Levin developed the first stochastic conservation of information law, describing it as "torturing an uninformed witness cannot give information about the crime." Levin's law unifies both the deterministic and stochastic cases of conservation of information. A proof of Levin's law from algorithmic information theory is given, as well as a discussion of its implications for evolutionary algorithms and fitness functions.
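One consequence for evolutionary algorithms can be illustrated with a toy experiment (my own construction, not taken from the paper): hill climbing on bitstrings finds a hidden target only when the fitness function actually carries information about it; an "uninformed" random fitness, like Levin's uninformed witness, yields nothing no matter how long the search runs.

```python
import random

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def hill_climb(start, fitness, steps=2000, rng=None):
    """Greedy single-bit-flip hill climbing, guided only by the supplied
    fitness function (higher is better)."""
    rng = rng or random.Random(0)
    current = list(start)
    for _ in range(steps):
        i = rng.randrange(len(current))
        candidate = current.copy()
        candidate[i] ^= 1
        if fitness(candidate) > fitness(current):
            current = candidate
    return current

rng = random.Random(42)
n = 32
target = [rng.randrange(2) for _ in range(n)]
start = [rng.randrange(2) for _ in range(n)]

informed = lambda s: -hamming(s, target)  # fitness knows the target
uninformed = lambda s: rng.random()       # fitness independent of the target

print(hamming(hill_climb(start, informed), target))    # 0: target found
print(hamming(hill_climb(start, uninformed), target))  # far from 0: no information gained
```

The uninformed run is just a random walk over bitstrings: accepting or rejecting flips based on coin-flip fitness values extracts no information about the target, which is the evolutionary-algorithm reading of Levin's law.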


2018 ◽  
Vol 15 (149) ◽  
pp. 20180395 ◽  
Author(s):  
Bernat Corominas-Murtra ◽  
Luís F. Seoane ◽  
Ricard Solé

A major problem for evolutionary theory is understanding the so-called open-ended nature of evolutionary change, from its definition to its origins. Open-ended evolution (OEE) refers to the unbounded increase in complexity that seems to characterize evolution on multiple scales. This property seems to be a characteristic feature of biological and technological evolution and is strongly tied to the generative potential associated with combinatorics, which allows systems to grow and expand their available state spaces. Interestingly, many complex systems presumably displaying OEE, from language to proteins, share a common statistical property: the presence of Zipf's Law. Given an inventory of basic items (such as words or protein domains) required to build more complex structures (sentences or proteins), Zipf's Law tells us that most of these elements are rare whereas a few of them are extremely common. Using algorithmic information theory, in this paper we provide a fundamental definition for open-endedness, which can be understood as a set of postulates. Its statistical counterpart, based on standard Shannon information theory, has the structure of a variational problem which is shown to lead to Zipf's Law as the expected consequence of an evolutionary process displaying OEE. We further explore the problem of information conservation through an OEE process and conclude that statistical information (standard Shannon information) is not conserved, resulting in the paradoxical situation in which the increase of information content has the effect of erasing itself. We prove that this paradox is solved if we consider non-statistical forms of information. This last result implies that standard information theory may not be a suitable theoretical framework to explore the persistence and increase of the information content in OEE systems.
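The statistical signature the authors invoke is easy to make concrete: under Zipf's Law, p(r) proportional to 1/r over N ranked items, a handful of top ranks carries most of the probability mass while the long tail of rare items supplies the combinatorial potential. A short numeric check (the inventory size N and the rank cutoffs are arbitrary illustrative choices):

```python
def zipf_distribution(n_ranks):
    """Normalized Zipf's Law: p(r) proportional to 1/r for ranks 1..n_ranks."""
    weights = [1.0 / r for r in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]

p = zipf_distribution(10_000)

top = sum(p[:100])    # mass in the 100 most common items (1% of the inventory)
tail = sum(p[5000:])  # mass in the rarer half of the inventory

# The top 1% of ranks carries over half the mass; the rarer half carries under 10%.
print(round(top, 3), round(tail, 3))
```

This is the "most elements are rare whereas a few of them are extremely common" pattern stated quantitatively, for the pure 1/r exponent the abstract describes.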

