Hidden Node Detection between Observable Nodes Based on Bayesian Clustering

Entropy, 2019, Vol. 21(1), pp. 32
Author(s): Keisuke Yamazaki, Yoichi Motomura

Structure learning is one of the main concerns in studies of Bayesian networks. In the present paper, we consider networks consisting of both observable and hidden nodes, and propose a method to investigate the existence of a hidden node between observable nodes, where all nodes are discrete. This corresponds to the model selection problem between the networks with and without the middle hidden node. When the network includes a hidden node, it is known that there are singularities in the parameter space and that the Fisher information matrix is not positive definite. Consequently, many conventional criteria for structure learning based on the Laplace approximation do not work. The proposed method is based on Bayesian clustering, and its asymptotic properties justify the result: redundant labels are eliminated and the simplest structure is detected even in the presence of singularities.
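The sketch below is an illustration of the general idea, not the authors' algorithm: discrete data are generated from a chain X -> H -> Y with two true hidden states, and a collapsed Gibbs sampler clusters the observations with a deliberately redundant number of candidate labels (K = 4). All parameter values, hyperparameters, and the label-usage summary are assumptions chosen for the example; the point is that redundant labels end up nearly unused, which is the behaviour the abstract's asymptotic result describes.

```python
# Illustrative sketch (not the paper's implementation): Bayesian clustering
# of a discrete X -> H -> Y model via collapsed Gibbs sampling.
import numpy as np

rng = np.random.default_rng(0)

# --- toy data from X -> H -> Y with 2 true hidden states (values made up) ---
n, n_x, n_y, true_k = 1000, 3, 3, 2
p_x = np.array([0.3, 0.4, 0.3])
p_h_given_x = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
p_y_given_h = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
x = rng.choice(n_x, size=n, p=p_x)
h = np.array([rng.choice(true_k, p=p_h_given_x[xi]) for xi in x])
y = np.array([rng.choice(n_y, p=p_y_given_h[hi]) for hi in h])

# --- collapsed Gibbs sampler with K candidate hidden states ---
K, alpha, beta = 4, 0.5, 0.5          # Dirichlet hyperparameters (assumed)
z = rng.choice(K, size=n)             # random initial labels
n_xk = np.zeros((n_x, K)); n_ky = np.zeros((K, n_y)); n_k = np.zeros(K)
for i in range(n):
    n_xk[x[i], z[i]] += 1; n_ky[z[i], y[i]] += 1; n_k[z[i]] += 1

usage, burn_in, sweeps = np.zeros(K), 50, 150
for sweep in range(sweeps):
    for i in range(n):
        k_old = z[i]
        n_xk[x[i], k_old] -= 1; n_ky[k_old, y[i]] -= 1; n_k[k_old] -= 1
        # p(z_i = k | rest), up to a constant not depending on k
        w = (n_xk[x[i]] + alpha) * (n_ky[:, y[i]] + beta) / (n_k + n_y * beta)
        k_new = rng.choice(K, p=w / w.sum())
        z[i] = k_new
        n_xk[x[i], k_new] += 1; n_ky[k_new, y[i]] += 1; n_k[k_new] += 1
    if sweep >= burn_in:              # accumulate label usage after burn-in
        usage += np.bincount(z, minlength=K) / n

print("posterior mean label usage:", np.round(usage / (sweeps - burn_in), 3))
# Typically only ~2 of the 4 labels carry noticeable mass; the rest are redundant.
```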

2006, Vol. 18(5), pp. 1007-1065
Author(s): Shun-ichi Amari, Hyeyoung Park, Tomoko Ozeki

The parameter spaces of hierarchical systems such as multilayer perceptrons include singularities due to the symmetry and degeneration of hidden units. A parameter space forms a geometrical manifold, called the neuromanifold in the case of neural networks. Such a model is identified with a statistical model, and a Riemannian metric is given by the Fisher information matrix. However, the matrix degenerates at singularities. Such a singular structure is ubiquitous not only in multilayer perceptrons but also in gaussian mixture probability densities, ARMA time-series models, and many other cases. The standard statistical paradigm of the Cramér-Rao theorem does not hold, and the singularity gives rise to strange behaviors in parameter estimation, hypothesis testing, Bayesian inference, model selection, and in particular, the dynamics of learning from examples. Prevailing theories have so far paid little attention to the problems caused by singularity, relying only on ordinary statistical theories developed for regular (nonsingular) models. Only recently have researchers remarked on the effects of singularity, and theories are now being developed. This article gives an overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and gaussian mixtures. We present our recent results on these problems. Simple toy models are also used to show explicit solutions. We explain that the maximum likelihood estimator is no longer subject to the gaussian distribution even asymptotically, because the Fisher information matrix degenerates, that model selection criteria such as AIC, BIC, and MDL fail to hold in these models, that a smooth Bayesian prior becomes singular in such models, and that the trajectories of dynamics of learning are strongly affected by the singularity, causing plateaus or slow manifolds in the parameter space. The natural gradient method is shown to perform well because it takes the singular geometrical structure into account. The generalization error and the training error are studied in some examples.
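As a small numerical illustration of the degeneration the abstract describes (not an example from the article itself), the snippet below estimates the Fisher information matrix of a two-component gaussian mixture by Monte Carlo. The parameter values and sample sizes are assumptions for the demonstration; at the singular point where the two components coincide, the estimated matrix has (numerically) zero eigenvalues, so arguments that assume a positive definite Fisher information matrix, such as the Laplace approximation behind BIC, no longer apply.

```python
# Numerical illustration: the Fisher information matrix of a two-component
# gaussian mixture degenerates where the components coincide (mu1 == mu2).
import numpy as np

def log_p(x, theta):
    """Log density of w*N(mu1, 1) + (1 - w)*N(mu2, 1)."""
    w, mu1, mu2 = theta
    c = 1.0 / np.sqrt(2 * np.pi)
    return np.log(w * c * np.exp(-0.5 * (x - mu1) ** 2)
                  + (1 - w) * c * np.exp(-0.5 * (x - mu2) ** 2))

def fisher_information(theta, n_samples=200_000, eps=1e-5, seed=0):
    """Monte Carlo estimate of E[score score^T], score by central differences."""
    rng = np.random.default_rng(seed)
    w, mu1, mu2 = theta
    comp = rng.random(n_samples) < w
    x = np.where(comp, rng.normal(mu1, 1.0, n_samples),
                       rng.normal(mu2, 1.0, n_samples))
    scores = np.empty((n_samples, 3))
    for j in range(3):
        tp, tm = np.array(theta, float), np.array(theta, float)
        tp[j] += eps; tm[j] -= eps
        scores[:, j] = (log_p(x, tp) - log_p(x, tm)) / (2 * eps)
    return scores.T @ scores / n_samples

for theta in [(0.5, 0.0, 0.0),    # singular point: components coincide
              (0.5, -1.0, 1.0)]:  # regular point: well-separated components
    eig = np.linalg.eigvalsh(fisher_information(theta))
    print(theta, "eigenvalues of FIM ~", np.round(eig, 4))
# At the singular point the smallest eigenvalues are numerically zero.
```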


2016, Vol. 27(11), pp. 1650092
Author(s): Michel Nguiffo Boyom, Robert A. Wolak

A family of probability distributions parametrized by an open domain $\Lambda$ in $\mathbb{R}^n$ defines the Fisher information matrix on this domain, which is positive semi-definite. In information geometry, the standard assumption has been that the Fisher information matrix is positive definite, defining in this way a Riemannian metric on $\Lambda$. This seems to be quite a strong condition; in general, not much can be said about the Fisher information matrix. To develop a more general theory, we weaken the assumption and replace "positive definite" by the existence of a suitable torsion-free connection. This permits us to define naturally a foliation with a transversely Hessian structure. We develop the theory of transversely Hessian foliations along the lines of the classical theory.
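A minimal worked example of the situation the abstract starts from (added here for illustration, not taken from the paper): a family $p(x \mid a, b) = N(x; a + b, 1)$ depends on its two parameters only through $a + b$, so its Fisher information matrix is positive semi-definite but not positive definite. The parameter values below are arbitrary.

```python
# Small worked example: a Fisher information matrix that is positive
# semi-definite but not positive definite, because p(x | a, b) = N(a + b, 1)
# is constant along the direction a - b.
import numpy as np

rng = np.random.default_rng(1)
a, b, n = 0.3, -0.7, 500_000
x = rng.normal(a + b, 1.0, n)

# Score of log N(x; a + b, 1) with respect to (a, b): both entries equal x - a - b.
score = np.stack([x - a - b, x - a - b], axis=1)
fim = score.T @ score / n               # Monte Carlo estimate of E[score score^T]

print("FIM ~\n", np.round(fim, 3))      # close to [[1, 1], [1, 1]]
print("eigenvalues ~", np.round(np.linalg.eigvalsh(fim), 3))  # one eigenvalue is 0
# The Fisher matrix gives a metric only transverse to the lines a + b = const,
# the kind of degenerate situation the weakened assumption is meant to handle.
```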


2012, Vol. 51(1), pp. 115-130
Author(s): Sergei Leonov, Alexander Aliev

We provide some details of the implementation of the optimal design algorithm in the PkStaMp library, which is intended for constructing optimal sampling schemes for pharmacokinetic (PK) and pharmacodynamic (PD) studies. We discuss different types of approximation of the individual Fisher information matrix and describe a user-defined option of the library.
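The sketch below is not PkStaMp code and does not reproduce its approximations; it is a hedged, generic illustration of the underlying idea: compute an individual Fisher information matrix for a simple fixed-effects one-compartment PK model with additive residual error, and compare candidate sampling schedules by the D-criterion (log det FIM). The model parameters, error level, and schedules are made-up values for the example.

```python
# Hedged sketch (not PkStaMp's implementation): comparing candidate sampling
# schedules for a one-compartment oral-absorption PK model by log det FIM.
import numpy as np

def conc(t, theta, dose=100.0):
    """Concentration for a one-compartment model with first-order absorption."""
    ka, ke, v = theta
    return dose * ka / (v * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

def individual_fim(times, theta, sigma=0.5, eps=1e-6):
    """FIM = J^T J / sigma^2 under additive error; Jacobian J by central differences."""
    times = np.asarray(times, float)
    jac = np.empty((len(times), len(theta)))
    for j in range(len(theta)):
        tp, tm = np.array(theta, float), np.array(theta, float)
        tp[j] += eps; tm[j] -= eps
        jac[:, j] = (conc(times, tp) - conc(times, tm)) / (2 * eps)
    return jac.T @ jac / sigma**2

theta = [1.2, 0.15, 20.0]                  # (ka, ke, V), illustrative values only
dense  = [0.5, 1, 2, 4, 8, 12, 24]         # rich schedule (hours)
sparse = [1, 4, 24]                        # sparse schedule with 3 samples

for name, schedule in [("dense", dense), ("sparse", sparse)]:
    sign, logdet = np.linalg.slogdet(individual_fim(schedule, theta))
    print(f"{name:6s} schedule: log det FIM = {logdet:.2f}")
# The schedule with the larger D-criterion is preferred; real PK/PD designs
# must also handle random effects, which is where the different FIM
# approximations mentioned in the abstract come in.
```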

