normalized maximum likelihood
Recently Published Documents

TOTAL DOCUMENTS: 52 (five years: 13)
H-INDEX: 10 (five years: 1)

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Zhenghui Hu ◽  
Fei Li ◽  
Minjia Cheng ◽  
Junhui Shui ◽  
Yituo Tang ◽  
...  

Abstract Unified Granger causality analysis (uGCA) recasts conventional two-stage Granger causality analysis as a unified code-length-guided framework. We have presented several forms of uGCA for investigating causal connectivity; each form has its own characteristics and can approach the ground-truth network well in its suitable context. In this paper, we compare these forms of uGCA in detail and recommend the most robust method among them, uGCA-NML, for more general scenarios. We first clarify the distinct advantages of uGCA-NML in a synthetic 6-node network. Moreover, uGCA-NML showed good robustness in mental arithmetic experiments, identifying a stable similarity among causal networks under visual/auditory stimuli. Owing to its stability and accuracy, uGCA-NML should be a preferred choice within this unified causal investigation paradigm.
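The abstract does not give the uGCA-NML code-length criterion itself, so the following is only a hedged sketch of the unified one-step idea: instead of a two-stage hypothesis test, a single code-length comparison between a restricted and a full autoregressive model decides causality. BIC is used here as a stand-in for the NML code length, and the function names, lag order, and toy system are illustrative assumptions, not the authors' method.

```python
import numpy as np

def code_length_bic(y, X):
    """Code length (in nats) of y given regressors X, approximated by BIC:
    (n/2) log(RSS/n) + (k/2) log(n). A stand-in for the NML code length
    that uGCA-NML would assign."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    n, k = len(y), X1.shape[1]
    return 0.5 * n * np.log(resid @ resid / n) + 0.5 * k * np.log(n)

def granger_causes(x, y, p=2):
    """Single code-length comparison replacing the two-stage test:
    x -> y is declared causal if adding p lags of x shortens the
    description of y beyond what p lags of y alone achieve."""
    n = len(y)
    lag_y = np.column_stack([y[p - j:n - j] for j in range(1, p + 1)])
    lag_x = np.column_stack([x[p - j:n - j] for j in range(1, p + 1)])
    target = y[p:]
    restricted = code_length_bic(target, lag_y)
    full = code_length_bic(target, np.hstack([lag_y, lag_x]))
    return full < restricted

# Toy system in which x drives y with one lag.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.4 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()
print(granger_causes(x, y))   # forward (driven) direction
print(granger_causes(y, x))   # reverse direction
```

The penalty term is what makes the decision one-step: the full model wins only when the likelihood gain from the extra lags exceeds their description cost.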


Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 997
Author(s):  
Pham Thuc Hung ◽  
Kenji Yamanishi

In this paper, we propose a novel information-criterion-based approach to selecting the dimensionality of the word2vec Skip-gram (SG) model. From the perspective of probability theory, SG is considered an implicit probability distribution estimator, under the assumption that there exists a true contextual distribution among words. Therefore, we apply information criteria with the aim of selecting the dimensionality for which the corresponding model is as close as possible to the true distribution. We examine the following information criteria for the dimensionality selection problem: Akaike's Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Sequential Normalized Maximum Likelihood (SNML) criterion. SNML is the total code length required for the sequential encoding of a data sequence, on the basis of the minimum description length principle. The proposed approach is applied to both the original SG model and the SG Negative Sampling model to clarify the idea of using information criteria. Additionally, as the original SNML suffers from computational disadvantages, we introduce novel heuristics for its efficient computation. Moreover, we empirically demonstrate that SNML outperforms both BIC and AIC. In comparison with other evaluation methods for word embeddings, the dimensionality selected by SNML is significantly closer to the optimal dimensionality obtained by word analogy or word similarity tasks.
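To make "total code length required for the sequential encoding" concrete, here is a minimal, self-contained SNML coder for a Bernoulli source (the paper applies SNML to Skip-gram, which is far more involved; the function names here are ours). At each step the next symbol is encoded with probability proportional to the maximized likelihood of the extended sequence, normalized over both possible outcomes.

```python
import math

def ml_prob(k, n):
    """Maximized Bernoulli likelihood of any sequence with k ones in n trials."""
    p = k / n
    return p ** k * (1 - p) ** (n - k)

def snml_code_length(bits):
    """Total SNML code length in nats of a binary sequence: at step t,
    outcome b gets probability ml_prob(after appending b) divided by the
    sum of that quantity over both outcomes."""
    total, k = 0.0, 0
    for t, b in enumerate(bits):
        num = ml_prob(k + b, t + 1)
        denom = ml_prob(k + 1, t + 1) + ml_prob(k, t + 1)
        total += -math.log(num / denom)
        k += b
    return total

print(snml_code_length([1, 1, 1, 1, 1, 1, 1, 1]))   # regular sequence: shorter code
print(snml_code_length([1, 0, 1, 0, 1, 0, 1, 0]))   # less compressible: longer code
```

A model (here, a dimensionality) that fits the data well yields a short total code length, which is exactly the quantity the SNML criterion minimizes.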


Author(s):  
So Hirai ◽  
Kenji Yamanishi

This paper addresses the issues of how we can quantify structural information for nonparametric distributions and how we can detect its changes. Structural information refers to an index for a global understanding of a data distribution. When we consider the problem of clustering using a parametric model such as a Gaussian mixture model, the number of mixture components (clusters) can be thought of as structural information in the model. However, there does not exist any notion of structural information for nonparametric modeling of data. In this paper we introduce a novel notion of kernel complexity (KC) as structural information in the nonparametric setting. The key idea of KC is to combine the information bias inspired by the Gini index with the information quantity measured in terms of the normalized maximum likelihood (NML) code length. We empirically show that KC has a property similar to the number of clusters in a parametric model. We further propose a framework for structural change detection with KC in nonparametric distributions. With synthetic and real data sets we empirically demonstrate that our framework enables us to detect structural changes underlying the data and their early warning signals.


2019 ◽  
Author(s):  
David Kellen ◽  
Karl Christoph Klauer

The modeling of multinomial data has seen tremendous progress since Riefer and Batchelder's (1988) seminal paper. One recurring challenge, however, concerns the availability of relative performance measures that strike an ideal balance between goodness of fit and functional flexibility. One approach to the problem of model selection is Normalized Maximum Likelihood (NML), a solution derived from the Minimum Description Length principle. In the present work we provide an R implementation of a Gibbs sampler that can be used to compute NML for models of joint multinomial data. We discuss the application of NML in different examples, compare NML with Bayes factors, and show how it constitutes an important addition to researchers' toolboxes.
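The Gibbs sampler is needed because the NML normalizer is intractable for joint multinomial models; for a single Bernoulli (two-category multinomial), however, it can be computed exactly, which this sketch illustrates (function names are ours, not from the paper's R package).

```python
from math import comb, log

def nml_normalizer(n):
    """Exact parametric complexity for n Bernoulli trials:
    C(n) = sum_k C(n, k) * (k/n)**k * ((n-k)/n)**(n-k)."""
    return sum(comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k)
               for k in range(n + 1))

def nml_code_length(bits):
    """Stochastic complexity in nats: minus the log of the maximized
    likelihood, plus the log-normalizer log C(n)."""
    n, k = len(bits), sum(bits)
    max_ll = (k * log(k / n) if k else 0.0) + \
             ((n - k) * log((n - k) / n) if n - k else 0.0)
    return -max_ll + log(nml_normalizer(n))

print(nml_normalizer(4))             # 3.21875
print(nml_code_length([1, 1, 1, 1]))
```

The normalizer sums the maximized likelihood over every possible data set, which is exactly what becomes infeasible in the joint multinomial case and motivates the paper's sampling approach.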


2019 ◽  
Author(s):  
Danielle Navarro

An applied problem is discussed in which two nested psychological models of retention are compared using minimum description length (MDL). The standard Fisher information approximation to the normalized maximum likelihood is calculated for these two models, with the result that the full model is assigned a smaller complexity, even for moderately large samples. A geometric interpretation for this behavior is considered, along with its practical implications.
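For reference, the Fisher information approximation discussed here expands the NML code length asymptotically (in Rissanen's formulation); writing $k$ for the number of free parameters, $n$ for the sample size, and $I(\theta)$ for the unit Fisher information matrix:

```latex
\mathrm{NML\ code\ length} \;\approx\;
  -\ln p\bigl(x^{n}\mid\hat{\theta}(x^{n})\bigr)
  \;+\;
  \underbrace{\frac{k}{2}\ln\frac{n}{2\pi}
  \;+\; \ln\!\int_{\Theta}\sqrt{\det I(\theta)}\,\mathrm{d}\theta}_{\text{FIA complexity}}
```

The integral term is model-specific and can outweigh the dimension term at moderate $n$, which is how a full model can be assigned a smaller complexity than its nested submodel, the behavior the paper analyzes geometrically.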


2019 ◽  
Author(s):  
Peter Grünwald ◽  
Danielle Navarro

We review the normalized maximum likelihood (NML) criterion for selecting among competing models. NML is generally justified on information-theoretic grounds, via the principle of minimum description length (MDL), in a derivation that “does not assume the existence of a true, data-generating distribution.” Since this “agnostic” claim has been a source of some recent confusion in the psychological literature, we explain in detail what is meant by this statement. In doing so we discuss the work presented by Karabatsos and Walker (2006), who propose an alternative Bayesian decision-theoretic characterization of NML, which leads them to conclude that the claim of agnosticity is meaningless. In the KW derivation, one part of the NML criterion (the likelihood term) arises from placing a Dirichlet process prior over possible data-generating distributions, and the other part (the complexity term) is folded into a loss function. Whereas in the original derivations of NML, the complexity term arises naturally, in the KW derivation its mathematical form is taken for granted and not explained any further. We argue that for this reason, the KW characterization is incomplete; relatedly, we question the relevance of the characterization and we argue that their main conclusion about agnosticity does not follow.

