Model Selection Using Information Theory and the MDL Principle

Abstract: Estimating temporal changes in a target population from phylogenetic or count data is an important problem in ecology and epidemiology. Reliable estimates can provide key insights into the climatic and biological drivers influencing the diversity or structure of that population and support hypotheses concerning its future growth or decline. In infectious disease applications, the individuals infected across an epidemic form the target population. The renewal model estimates the effective reproduction number, R, of the epidemic from counts of observed incident cases. The skyline model infers the effective population size, N, underlying a phylogeny of sequences sampled from that epidemic. Practically, R measures ongoing epidemic growth while N informs on historical caseload. Although the two models solve distinct problems, the reliability of their estimates depends on p-dimensional piecewise-constant functions. If p is misspecified, the model might underfit significant changes or overfit noise and promote a spurious understanding of the epidemic, which might misguide intervention policies or misinform forecasts. Surprisingly, no transparent yet principled approach for optimizing p exists. Usually, p is set heuristically or controlled obscurely via complex algorithms. We present a computable and interpretable p-selection method based on the minimum description length (MDL) formalism of information theory. Unlike many standard model selection techniques, MDL accounts for the additional statistical complexity induced by how parameters interact. As a result, our method optimizes p so that R and N estimates properly and meaningfully adapt to available data. It also outperforms comparable Akaike and Bayesian information criteria on several classification problems, given minimal knowledge of the parameter space, and exposes statistical similarities among renewal, skyline, and other models in biology.
Rigorous and interpretable model selection is necessary if trustworthy and justifiable conclusions are to be drawn from piecewise models. [Coalescent processes; epidemiology; information theory; model selection; phylodynamics; renewal models; skyline plots]
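
To make the p-selection problem concrete, the following is a minimal sketch of the Akaike and Bayesian information criteria that the abstract uses as baselines. It is not the MDL method of the paper. It assumes a Poisson count model with p equal-width, piecewise-constant rate segments; the function names and the toy count series are hypothetical and chosen only for illustration.

```python
import math

def poisson_loglik(counts, rate):
    """Log-likelihood of counts under a single constant Poisson rate."""
    if rate == 0:
        return 0.0 if all(c == 0 for c in counts) else float("-inf")
    return sum(c * math.log(rate) - rate - math.lgamma(c + 1) for c in counts)

def fit_piecewise(counts, p):
    """Maximised log-likelihood with p equal-width constant segments.

    Assumption: segment boundaries are fixed on an even grid, so the only
    free parameters are the p segment rates (MLE = segment mean).
    """
    n = len(counts)
    bounds = [round(i * n / p) for i in range(p + 1)]
    loglik = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        segment = counts[lo:hi]
        rate = sum(segment) / len(segment)
        loglik += poisson_loglik(segment, rate)
    return loglik

def select_p(counts, max_p):
    """Score p = 1..max_p by AIC and BIC and return the minimisers."""
    n = len(counts)
    scores = {}
    for p in range(1, max_p + 1):
        ll = fit_piecewise(counts, p)
        scores[p] = {"aic": 2 * p - 2 * ll, "bic": p * math.log(n) - 2 * ll}
    best_aic = min(scores, key=lambda p: scores[p]["aic"])
    best_bic = min(scores, key=lambda p: scores[p]["bic"])
    return best_aic, best_bic, scores

# Hypothetical incidence series with one change in level halfway through.
counts = [5, 4, 6, 5, 5, 4, 6, 5, 20, 22, 19, 21, 20, 18, 21, 19]
best_aic, best_bic, scores = select_p(counts, max_p=4)
```

Both criteria penalise p only through a count of parameters, which is the limitation the abstract contrasts with MDL: they do not account for how the parameters interact.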

Information-theory-based model selection for determining the main vector and period of transmission of Potato virus Y

Annals of Applied Biology ◽ 10.1111/j.1744-7348.2011.00501.x ◽ 2011 ◽ Vol 159 (3) ◽ pp. 414-427 ◽ Cited By ~ 18

Author(s): S.M. Kirchner ◽ T.F. Döring ◽ L.H. Hiltunen ◽ E. Virtanen ◽ J.P.T. Valkonen

Keyword(s): Information Theory ◽ Model Selection ◽ Potato Virus ◽ Potato Virus Y ◽ Selection For

Model Selection and Testing by the MDL Principle

Information Theory and Statistical Learning ◽ 10.1007/978-0-387-84816-7_2 ◽ 2008 ◽ pp. 25-43 ◽ Cited By ~ 4

Author(s): Jorma Rissanen

Keyword(s): Model Selection ◽ MDL Principle

Parsimonious model selection using information theory: a modified selection rule

Ecology ◽ 10.1002/ecy.3475 ◽ 2021

Author(s): Luke A. Yates ◽ Shane A. Richards ◽ Barry W. Brook

Keyword(s): Information Theory ◽ Model Selection ◽ Selection Rule ◽ Parsimonious Model

Information Theory and Log-Likelihood Models: A Basis for Model Selection and Inference

Model Selection and Inference ◽ 10.1007/978-1-4757-2917-7_2 ◽ 1998 ◽ pp. 32-74 ◽ Cited By ~ 1

Author(s): Kenneth P. Burnham ◽ David R. Anderson

Keyword(s): Information Theory ◽ Model Selection ◽ Log Likelihood

Model Selection

10.1093/oso/9780192843470.003.0009 ◽ 2021 ◽ pp. 149-164

Author(s): Timothy E. Essington

Keyword(s): Information Theory ◽ Model Selection ◽ Goodness Of Fit ◽ Danaus Plexippus ◽ Predictive Ability ◽ Information Criterion ◽ Monarch Butterfly ◽ Evaluation Framework ◽ Alternative Hypotheses ◽ Types Of Information

The chapter “Model Selection” provides a brief overview of alternative ways of defining the best model or models and shows the core motivation behind the Akaike information criterion as a measure of the predictive ability of a model fitted via maximum likelihood. It then gives stepwise practical guidance for using information theory as a basis of model selection, including nested versus non-nested models, goodness of fit, and overdispersion. Advanced topics cover some of the philosophy of information theory, other types of information theory criteria, and other ways of evaluating the predictive ability of models. As an example, the chapter examines the case of the Western monarch butterfly (Danaus plexippus plexippus), which, over the past two decades, has experienced a 97% decline from its historical average abundance, declining by 86% from 2017 to 2018 alone. Undoubtedly, there is more than one cause; indeed, overwintering habitat loss and pesticide use are both believed to be important contributors to the decline. Adopting a hypothesis-evaluation framework makes it possible to consider multiple alternative hypotheses simultaneously and to measure the degree of support for each.
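
One common way to express degrees of support among competing hypotheses is through AIC differences and Akaike weights. The sketch below illustrates that calculation; whether the chapter uses exactly this device is an assumption, and the model names and AIC values are hypothetical placeholders, not results from the monarch butterfly example.

```python
import math

def akaike_weights(aic_by_model):
    """Convert AIC values into Akaike weights that sum to one.

    delta_i = AIC_i - min(AIC); weight_i is proportional to exp(-delta_i / 2).
    """
    best = min(aic_by_model.values())
    relative = {
        name: math.exp(-0.5 * (aic - best)) for name, aic in aic_by_model.items()
    }
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}

# Hypothetical AIC values for three candidate hypotheses (illustration only).
aic_by_model = {"habitat_loss": 210.4, "pesticide_use": 212.1, "both": 208.9}
weights = akaike_weights(aic_by_model)

for name, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: {weight:.3f}")
```

Because the weights depend only on AIC differences, adding a constant to every AIC value leaves them unchanged; they describe relative support within the candidate set, not absolute model adequacy.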

The Connection between Bayesian Inference and Information Theory for Model Selection, Information Gain and Experimental Design

Entropy ◽ 10.3390/e21111081 ◽ 2019 ◽ Vol 21 (11) ◽ pp. 1081 ◽ Cited By ~ 3

Author(s): Sergey Oladyshkin ◽ Wolfgang Nowak

Keyword(s): Information Theory ◽ Monte Carlo ◽ Bayesian Inference ◽ Experimental Design ◽ Model Selection ◽ Relative Entropy ◽ Information Entropy ◽ Bayesian Model ◽ Information Gain ◽ Information Criteria

We show a link between Bayesian inference and information theory that is useful for model selection, assessment of information entropy and experimental design. We align Bayesian model evidence (BME) with relative entropy and cross entropy in order to simplify computations using prior-based (Monte Carlo) or posterior-based (Markov chain Monte Carlo) BME estimates. On the one hand, we demonstrate how Bayesian model selection can profit from information theory to estimate BME values via posterior-based techniques. To this end, we use various assumptions, including relations to several information criteria. On the other hand, we demonstrate how relative entropy can profit from BME to assess information entropy during Bayesian updating and to assess utility in Bayesian experimental design. Specifically, we emphasize that relative entropy can be computed from both prior-based and posterior-based sampling techniques while avoiding unnecessary multidimensional integration. Prior-based computation does not require any assumptions; however, posterior-based estimates require at least one assumption. We illustrate the performance of the discussed estimates of BME, information entropy and experiment utility using a transparent, non-linear example. The multivariate Gaussian posterior estimate involves the fewest assumptions and shows the best performance for BME estimation, information entropy and experiment utility from posterior-based sampling.
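
The prior-based (Monte Carlo) BME estimate mentioned above amounts to averaging the likelihood over samples drawn from the prior. The sketch below illustrates this on a toy problem that is not taken from the paper: a single Gaussian observation with a Gaussian prior on its mean, chosen because the evidence is then known in closed form and the estimate can be checked. All names and numerical values are assumptions made for illustration.

```python
import math
import random

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bme_prior_monte_carlo(y, sigma2, tau2, n_samples, rng):
    """Estimate BME = E_prior[ p(y | theta) ] by sampling theta from the prior.

    Assumed toy model: y | theta ~ N(theta, sigma2), theta ~ N(0, tau2).
    """
    total = 0.0
    for _ in range(n_samples):
        theta = rng.gauss(0.0, math.sqrt(tau2))
        total += gaussian_pdf(y, theta, sigma2)
    return total / n_samples

rng = random.Random(0)  # fixed seed so the sketch is reproducible
y, sigma2, tau2 = 1.0, 1.0, 4.0

bme_mc = bme_prior_monte_carlo(y, sigma2, tau2, n_samples=200_000, rng=rng)
# For this conjugate toy model the marginal of y is N(0, sigma2 + tau2).
bme_exact = gaussian_pdf(y, 0.0, sigma2 + tau2)
```

Prior sampling needs no assumption about the shape of the posterior, which matches the abstract's remark. Its cost is that many prior samples may fall where the likelihood is negligible when the data are informative.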

COSMOLOGICAL MODEL SELECTION

International Journal of Modern Physics A ◽ 10.1142/s0217751x08039736 ◽ 2008 ◽ Vol 23 (06) ◽ pp. 787-802 ◽ Cited By ~ 4

Author(s): PIA MUKHERJEE ◽ DAVID PARKINSON

Keyword(s): Information Theory ◽ Model Selection ◽ Cosmological Model ◽ Bayesian Statistics ◽ Recent Progress ◽ Bayesian Framework ◽ Selection Model ◽ Selection Methods ◽ Primary Model ◽ Cosmological Data

We give an overview of recent progress in the field of cosmological model selection. Model selection statistics, such as those based on information theory and on Bayesian statistics, are introduced and discussed. In the Bayesian framework, the marginalised model likelihood, or evidence, is the primary model selection statistic. We describe different methods of computing the evidence, focusing in particular on Nested Sampling. We describe the results of applying model selection methods to new cosmological data such as the CMB measurements by WMAP.

Addressing the Need for a Model Selection Framework in Systems Biology Using Information Theory

Proceedings of the IEEE ◽ 10.1109/jproc.2016.2560121 ◽ 2017 ◽ Vol 105 (2) ◽ pp. 330-339 ◽ Cited By ~ 5

Author(s): Frank DeVilbiss ◽ Doraiswami Ramkrishna

Keyword(s): Information Theory ◽ Systems Biology ◽ Model Selection ◽ Selection Framework
