Selecting Amongst Multinomial Models: An Apologia for Normalized Maximum Likelihood

2019
Author(s): David Kellen, Karl Christoph Klauer

The modeling of multinomial data has seen tremendous progress since Riefer and Batchelder's (1988) seminal paper. One recurring challenge, however, concerns the availability of relative performance measures that strike an ideal balance between goodness of fit and functional flexibility. One approach to the problem of model selection is Normalized Maximum Likelihood (NML), a solution derived from the Minimum Description Length principle. In the present work we provide an R implementation of a Gibbs sampler that can be used to compute NML for models of joint multinomial data. We discuss the application of NML in different examples, compare NML with Bayes Factors, and show how it constitutes an important addition to researchers' toolboxes.
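The Gibbs sampler mentioned above targets joint multinomial data; as a minimal sketch of the quantity NML computes, the Python toy below evaluates the NML codelength exactly for a single binomial model by enumerating all possible outcomes (the function name and the binomial setting are illustrative assumptions, not the authors' R implementation):

```python
import math

def binomial_nml_codelength(k: int, n: int) -> float:
    """Exact NML codelength (nats) of k successes in n Bernoulli trials,
    under a binomial model with unknown success rate (toy illustration)."""

    def log_binom(n_: int, k_: int) -> float:
        # log of the binomial coefficient C(n_, k_)
        return math.lgamma(n_ + 1) - math.lgamma(k_ + 1) - math.lgamma(n_ - k_ + 1)

    def max_loglik(k_: int, n_: int) -> float:
        # log-likelihood at the MLE theta-hat = k_/n_, with 0*log(0) := 0
        ll = 0.0
        if k_ > 0:
            ll += k_ * math.log(k_ / n_)
        if k_ < n_:
            ll += (n_ - k_) * math.log((n_ - k_) / n_)
        return ll

    # Normalizer: maximized likelihood summed over every possible outcome.
    log_terms = [log_binom(n, kp) + max_loglik(kp, n) for kp in range(n + 1)]
    m = max(log_terms)
    log_C = m + math.log(sum(math.exp(t - m) for t in log_terms))  # log-sum-exp

    # Stochastic complexity: lack of fit plus the model complexity log_C.
    return -(log_binom(n, k) + max_loglik(k, n)) + log_C

print(binomial_nml_codelength(k=7, n=10))
```

The log normalizer log_C is the model's complexity term. Exact enumeration is only feasible for small outcome spaces, which is why sampling approaches such as the authors' Gibbs sampler are needed for joint multinomial data.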

2004
Vol 16 (9), pp. 1763-1768
Author(s): Daniel J. Navarro

An applied problem is discussed in which two nested psychological models of retention are compared using minimum description length (MDL). The standard Fisher information approximation to the normalized maximum likelihood is calculated for these two models, with the result that the full model is assigned a smaller complexity, even for moderately large samples. A geometric interpretation for this behavior is considered, along with its practical implications.
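For reference, the standard Fisher information approximation (FIA) to the NML codelength takes the form below (with k free parameters, sample size n, and per-observation Fisher information matrix I(θ)):

```latex
\mathrm{FIA} = -\ln f\!\left(x \mid \hat{\theta}(x)\right)
             + \frac{k}{2}\,\ln\frac{n}{2\pi}
             + \ln \int_{\Theta} \sqrt{\det I(\theta)}\, d\theta
```

The last two terms measure complexity. Because the volume integral does not grow with n, a nested model whose parameter space has a large Riemannian volume can be assigned greater complexity than the full model until n becomes large, which is the behavior examined here.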


Author(s): Yanxue Wang, Jiawei Xiang, Jiang Zhansi, Yang Lianfa, Zhengjia He

Vibration signals are usually contaminated by noise introduced during measurement and data processing. This paper presents a new subband adaptive denoising method for detecting impulsive signatures, based on the minimum description length principle with an improved normalized maximum likelihood density model. The threshold of the proposed method is determined automatically, without the need to estimate the noise variance. The effectiveness of the proposed method over the VisuShrink, BayesShrink, and minimum description length denoising methods is demonstrated through simulations and practical applications.
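The paper's improved NML density model is not reproduced here, but the flavor of MDL-based threshold selection can be sketched with a crude two-part code: per subband, keep the k largest coefficients (whose indices and values must be encoded) and charge the discarded ones to a Gaussian noise code. The pywt usage is real PyWavelets API; the cost function itself is an illustrative assumption, not the authors' method:

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def two_part_codelength(band: np.ndarray, k: int) -> float:
    """Crude two-part MDL cost (bits) of keeping the k largest coefficients
    of one subband and modeling the discarded ones as Gaussian noise."""
    n = band.size
    mags = np.sort(np.abs(band))[::-1]
    residual = mags[k:]
    var = residual.var() + 1e-12            # variance of the discarded part
    data_bits = 0.5 * (n - k) * np.log2(2 * np.pi * np.e * var)
    index_bits = k * np.log2(n)             # which coefficients were kept
    value_bits = 0.5 * k * np.log2(n)       # kept values to ~sqrt(n) precision
    return data_bits + index_bits + value_bits

def mdl_denoise(signal: np.ndarray, wavelet: str = "db4", level: int = 4) -> np.ndarray:
    """Per subband, keep the k that minimizes the codelength, zero the rest."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    out = [coeffs[0]]                       # keep the approximation subband
    for band in coeffs[1:]:
        mags = np.sort(np.abs(band))[::-1]
        # quadratic scan over k; fine for a sketch
        best_k = min(range(band.size), key=lambda k: two_part_codelength(band, k))
        thresh = mags[best_k - 1] if best_k > 0 else np.inf
        out.append(pywt.threshold(band, thresh, mode="hard"))
    return pywt.waverec(out, wavelet)
```

Note how the threshold falls out of the codelength minimization: no noise-variance estimate is supplied by the user, which mirrors the automatic-threshold property claimed in the abstract.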


Entropy, 2021
Vol 23 (8), pp. 997
Author(s): Pham Thuc Hung, Kenji Yamanishi

In this paper, we propose a novel information-criterion-based approach to selecting the dimensionality of the word2vec Skip-gram (SG) model. From the perspective of probability theory, SG can be viewed as implicitly estimating a probability distribution, under the assumption that there exists a true contextual distribution among words. We therefore apply information criteria with the aim of selecting the dimensionality for which the corresponding model is as close as possible to the true distribution. We examine the following information criteria for the dimensionality selection problem: Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Sequential Normalized Maximum Likelihood (SNML) criterion, where SNML is the total codelength required for the sequential encoding of a data sequence on the basis of the minimum description length principle. The proposed approach is applied to both the original SG model and the SG Negative Sampling model to clarify the idea of using information criteria. Additionally, as the original SNML suffers from computational disadvantages, we introduce novel heuristics for its efficient computation. Moreover, we empirically demonstrate that SNML outperforms both BIC and AIC: in comparison with other evaluation methods for word embeddings, the dimensionality selected by SNML is significantly closer to the optimal dimensionality obtained from word analogy and word similarity tasks.
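The SNML criterion itself is easiest to see in a setting far simpler than Skip-gram: each symbol of a sequence is encoded with the NML distribution over its one-symbol extensions, and the per-symbol codelengths are summed. A minimal Bernoulli sketch in Python (an illustration of the criterion only, not the paper's SG computation or its heuristics):

```python
import math

def snml_codelength(bits) -> float:
    """Total SNML codelength (bits) of a binary sequence: each symbol is
    encoded with the NML distribution over the one-symbol extensions."""

    def max_lik(ones: int, total: int) -> float:
        # Maximized Bernoulli likelihood of a sequence with `ones` ones
        # out of `total` symbols (Python evaluates 0**0 as 1).
        p = ones / total
        return p ** ones * (1 - p) ** (total - ones)

    total_bits, ones = 0.0, 0
    for t, x in enumerate(bits, start=1):
        score_one = max_lik(ones + 1, t)    # if the next symbol were 1
        score_zero = max_lik(ones, t)       # if the next symbol were 0
        p = (score_one if x else score_zero) / (score_one + score_zero)
        total_bits += -math.log2(p)
        ones += x
    return total_bits

# A compressible sequence should cost noticeably fewer than 8 bits:
print(snml_codelength([1, 1, 1, 1, 1, 1, 0, 1]))
```

The computational disadvantage addressed by the paper's heuristics is visible even in this toy: every encoding step requires refitting the model once per candidate outcome.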


2019
Author(s): Jay I. Myung, Danielle Navarro, Mark A. Pitt

The Minimum Description Length (MDL) principle is an information-theoretic approach to inductive inference that originated in algorithmic coding theory. In this approach, data are viewed as codes to be compressed by the model. From this perspective, models are compared on their ability to compress a data set by extracting the useful information in the data apart from random noise. The goal of model selection is to identify the model, from a set of candidate models, that permits the shortest description length (code) of the data. Since Rissanen originally formalized the problem using the crude 'two-part code' MDL method in the 1970s, many significant strides have been made, especially in the 1990s, culminating in the development of the refined 'universal code' MDL method, dubbed Normalized Maximum Likelihood (NML), which represents an elegant solution to the model selection problem. The present paper provides a tutorial review of these developments, with a special focus on NML. An application example of NML in cognitive modeling is also provided.
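The NML distribution referenced here has a standard closed form: the maximized likelihood of the observed data, normalized over all data sets the model could have produced (a sum for discrete data, an integral for continuous data):

```latex
p_{\mathrm{NML}}(x) = \frac{f\!\left(x \mid \hat{\theta}(x)\right)}
                           {\sum_{y} f\!\left(y \mid \hat{\theta}(y)\right)},
\qquad
-\ln p_{\mathrm{NML}}(x) =
  \underbrace{-\ln f\!\left(x \mid \hat{\theta}(x)\right)}_{\text{lack of fit}}
  + \underbrace{\ln \sum_{y} f\!\left(y \mid \hat{\theta}(y)\right)}_{\text{complexity}}
```

The model achieving the shortest total codelength is selected; the second term is what makes NML a 'universal code' rather than a crude two-part code.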


2019
Author(s): Michael David Lee, Danielle Navarro

Clustering is one of the most basic and useful methods of data analysis. This chapter describes a number of powerful clustering models, developed in psychology, for representing objects using data that measure the similarities between pairs of objects. These models place few restrictions on how objects are assigned to clusters, and allow for very general measures of the similarities between objects and clusters. Geometric Complexity Criteria (GCC) are derived for these models, and are used to fit the models to similarity data in a way that balances goodness-of-fit with complexity. Complexity analyses, based on the GCC, are presented for the two most widely used psychological clustering models, known as “additive clustering” and “additive trees”.
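As a concrete instance of the model class described here, additive clustering predicts the similarity of two objects as the summed weight of the clusters both belong to, plus a constant. A small Python sketch of the model's predictions and a goodness-of-fit term (the membership matrix, weights, and 'observed' similarities are made-up toy values; the GCC complexity penalty is not computed here):

```python
import numpy as np

def additive_clustering_similarity(F: np.ndarray, w: np.ndarray, c: float) -> np.ndarray:
    """Predicted similarities under additive clustering:
    s_hat[i, j] = sum_k w[k] * F[i, k] * F[j, k] + c,
    where F is a binary (objects x clusters) membership matrix."""
    return F @ np.diag(w) @ F.T + c

# Toy example: 4 objects, 2 overlapping clusters (values are made up).
F = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 0]])
w = np.array([0.6, 0.3])
S_hat = additive_clustering_similarity(F, w, c=0.1)

# Goodness of fit is measured against observed pairwise similarities;
# the GCC would add a complexity penalty on top of this error term.
rng = np.random.default_rng(0)
S_obs = rng.uniform(0.0, 1.0, size=(4, 4))
S_obs = (S_obs + S_obs.T) / 2               # symmetrize the toy data
sse = np.sum((S_obs - S_hat)[np.triu_indices(4, k=1)] ** 2)
print(S_hat)
print(sse)
```

Note that overlapping memberships are allowed, which is exactly the "few restrictions on how objects are assigned to clusters" property the chapter emphasizes.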

