Marginal Likelihood
Recently Published Documents

TOTAL DOCUMENTS: 248 (last five years: 69)
H-INDEX: 25 (last five years: 4)

Entropy, 2021, Vol. 23 (12), 1703
Author(s): Shouta Sugahara, Maomi Ueno

Earlier studies have shown that the classification accuracy of Bayesian networks (BNs) obtained by maximizing the conditional log likelihood (CLL) of a class variable, given the feature variables, is higher than that obtained by maximizing the marginal likelihood (ML). However, the performance differences between the two scores in those studies may be attributable to their use of approximate learning algorithms rather than exact ones. This paper compares the classification accuracy of BNs learned approximately using the CLL with that of BNs learned exactly using the ML. The results demonstrate that maximizing the ML yields higher classification accuracy than maximizing the CLL for large data sets. However, the results also show that the classification accuracy of exactly learned BNs using the ML is much worse than that of other methods when the sample size is small and the class variable has many parents. To resolve this problem, we propose an exact learning augmented naive Bayes (ANB) classifier, which guarantees that the class variable has no parents. The proposed method is guaranteed to asymptotically estimate the same class posterior as the exactly learned BN. Comparison experiments demonstrate the superior performance of the proposed method.
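A minimal sketch of the two scores being contrasted, on a toy naive Bayes structure with binary variables; the data, the Laplace smoothing, and the BDeu-style Dirichlet prior below are illustrative assumptions, not the authors' implementation:

```python
# Contrast a marginal-likelihood (BDeu-style) score with the conditional
# log likelihood (CLL) on a naive Bayes structure: class -> each feature.
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(0)
n, d = 200, 3
y = rng.integers(0, 2, size=n)                      # binary class variable
X = (rng.random((n, d)) < np.where(y[:, None] == 1, 0.8, 0.3)).astype(int)

alpha = 1.0  # Dirichlet "equivalent sample size" hyperparameter (BDeu-like)

def dirichlet_ml(counts, alpha):
    """Log marginal likelihood of counts under a symmetric Dirichlet prior."""
    a = np.full(counts.shape, alpha / counts.size)
    return (gammaln(a.sum()) - gammaln(a.sum() + counts.sum())
            + np.sum(gammaln(a + counts) - gammaln(a)))

# Marginal likelihood of the structure: class family plus each feature-given-class family.
log_ml = dirichlet_ml(np.bincount(y, minlength=2), alpha)
for j in range(d):
    for c in (0, 1):
        log_ml += dirichlet_ml(np.bincount(X[y == c, j], minlength=2), alpha)

# CLL of the class given the features, at Laplace-smoothed parameter estimates.
pc = (np.bincount(y, minlength=2) + 1) / (n + 2)
pxc = np.array([[(X[y == c, j].sum() + 1) / ((y == c).sum() + 2)
                 for j in range(d)] for c in (0, 1)])
log_joint = np.log(pc) + (X[:, None, :] * np.log(pxc)
                          + (1 - X[:, None, :]) * np.log(1 - pxc)).sum(-1)
cll = (log_joint[np.arange(n), y]
       - np.logaddexp(log_joint[:, 0], log_joint[:, 1])).sum()

print(f"log ML = {log_ml:.1f}, CLL = {cll:.1f}")
```

Structure search would compare these scores across candidate graphs; the abstract's point is that which score wins depends on sample size and on whether learning is exact.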


Entropy, 2021, Vol. 23 (11), 1545
Author(s): Chi-Ken Lu, Patrick Shafto

Deep Gaussian Processes (DGPs) were proposed as an expressive Bayesian model capable of mathematically grounded uncertainty estimation. The expressivity of DGPs results not only from their compositional character but also from the distribution propagation within the hierarchy. Recently, it was pointed out that the hierarchical structure of DGPs is well suited to modeling multi-fidelity regression, in which one is provided with sparse, high-precision observations alongside plentiful low-fidelity observations. We propose the conditional DGP model, in which the latent GPs are directly supported by the fixed lower-fidelity data. The moment matching method is then applied to approximate the marginal prior of the conditional DGP with a GP. The resulting effective kernels are implicit functions of the lower-fidelity data, manifesting the expressivity contributed by distribution propagation within the hierarchy. The hyperparameters are learned by optimizing the approximate marginal likelihood. Experiments with synthetic and high-dimensional data show performance comparable to other multi-fidelity regression methods, variational inference, and multi-output GPs. We conclude that, given the low-fidelity data and the hierarchical DGP structure, the effective kernel encodes the inductive bias for the true function while allowing compositional freedom.
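As a loose, single-layer caricature of the conditioning step (the paper's model is a full DGP with moment matching; this sketch only shows a GP prior conditioned on fixed low-fidelity data, with hyperparameters fit by maximizing the marginal likelihood; the RBF kernel, synthetic data, and sizes are invented for illustration):

```python
# A GP prior whose mean and covariance are conditioned on plentiful
# low-fidelity data, with hyperparameters learned on sparse high-fidelity data.
import numpy as np
from scipy.optimize import minimize

def rbf(a, b, ls, var):
    return var * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(1)
Z = np.linspace(0, 1, 30)                        # plentiful low-fidelity inputs
y_lo = np.sin(6 * Z) + 0.1 * rng.standard_normal(30)
X = rng.random(5)                                # sparse high-fidelity inputs
y_hi = np.sin(6 * X) + 0.3 * (X - 0.5)           # high-fidelity truth differs slightly

def neg_log_marginal(theta):
    ls, var, noise = np.exp(theta)               # log-parameterized for positivity
    Kzz = rbf(Z, Z, ls, var) + 1e-2 * np.eye(len(Z))   # low-fidelity noise/jitter
    A = np.linalg.solve(Kzz, rbf(Z, X, ls, var))
    mean = A.T @ y_lo                            # effective (conditional) mean
    cov = rbf(X, X, ls, var) - rbf(X, Z, ls, var) @ A + noise * np.eye(len(X))
    sign, logdet = np.linalg.slogdet(cov)
    r = y_hi - mean
    return 0.5 * (r @ np.linalg.solve(cov, r) + logdet + len(X) * np.log(2 * np.pi))

res = minimize(neg_log_marginal, np.log([0.3, 1.0, 0.1]))
print("learned lengthscale, variance, noise:", np.round(np.exp(res.x), 3))
```

The conditional covariance here plays the role of the paper's effective kernel: it is an implicit function of the low-fidelity data (Z, y_lo).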


PeerJ, 2021, Vol. 9, e12438
Author(s): Sebastian Höhna, Michael J. Landis, John P. Huelsenbeck

In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling and stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over the available CPUs. Although our primary focus in this study is on molecular substitution models, the parallelization strategy can easily be applied to any statistical model. Using two phylogenetic example datasets, we demonstrate that the runtime of marginal likelihood estimation can be reduced significantly even when only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear in the number of available CPUs, and we record an increase of 13.3x for cluster nodes with 16 CPUs, a substantial reduction in the runtime of marginal likelihood estimation. Hence, our parallelization strategy allows marginal likelihood estimations that previously needed days, weeks, or even months to complete in a feasible amount of time. The methods described here are implemented in our open-source software RevBayes, which is available from http://www.RevBayes.com.
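The general strategy can be sketched on a toy conjugate model where each power posterior can be sampled directly; in real phylogenetic use each beta would run its own MCMC. This is not RevBayes code, and the ladder shape and sample sizes are arbitrary choices:

```python
# Stepping-stone marginal likelihood with the per-beta simulations
# distributed over CPUs, on a normal-likelihood / normal-prior toy model.
import numpy as np
from multiprocessing import Pool
from scipy.special import logsumexp

rng = np.random.default_rng(2)
y = rng.normal(0.5, 1.0, size=50)              # data: y_i ~ N(theta, 1)
betas = np.linspace(0.0, 1.0, 32) ** 3         # power posterior ladder
n_samp = 5000

def power_posterior_loglik(beta):
    # p(theta | y, beta) ∝ L(theta)^beta * N(theta; 0, 1) is normal here,
    # so we sample it directly; a real model would run MCMC at this beta.
    prec = 1.0 + beta * len(y)
    theta = np.random.default_rng(int(beta * 1e9)).normal(
        beta * y.sum() / prec, prec ** -0.5, n_samp)
    return (-0.5 * (y[None, :] - theta[:, None]) ** 2
            - 0.5 * np.log(2 * np.pi)).sum(axis=1)

if __name__ == "__main__":
    with Pool() as pool:                       # one power posterior per worker
        lls = pool.map(power_posterior_loglik, list(betas[:-1]))
    # Stepping-stone estimator: chain the ratios between adjacent powers.
    log_ml = sum(logsumexp((betas[k + 1] - betas[k]) * lls[k]) - np.log(n_samp)
                 for k in range(len(betas) - 1))
    print(f"stepping-stone log marginal likelihood: {log_ml:.3f}")
```

Because the per-beta simulations are independent, the speedup is nearly linear in worker count, which is the effect the abstract reports.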


2021
Author(s): Kazuhiro Yamaguchi

This research reviewed recent developments in parameter estimation methods for item response theory (IRT) models. It introduced various new methods for managing the computational burden of item factor analysis and multidimensional item response models, which involve high-dimensional factors. The estimation methods reviewed employ Monte Carlo integration, approximations of the marginal likelihood, new optimization methods, and techniques from the machine learning field. On the theoretical side, a new type of asymptotic setting, in which both the sample size and the number of items tend to infinity, was considered. Several methods that fall outside the classical maximum likelihood and Bayesian frameworks were also classified. Finally, theoretical developments in interval estimation for individual latent traits were reviewed; these methods provide highly accurate intervals.
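One recurring theme of the review, approximating the marginal likelihood by integrating out the latent trait, can be illustrated on a toy two-parameter logistic (2PL) model with Gauss-Hermite quadrature; the item parameters and sample sizes below are made up:

```python
# Marginal log likelihood of a 2PL IRT model: integrate the latent trait
# out against a standard normal prior via Gauss-Hermite quadrature.
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(3)
n_persons, n_items = 500, 10
a = rng.uniform(0.5, 2.0, n_items)             # item discriminations
b = rng.normal(0.0, 1.0, n_items)              # item difficulties
theta_true = rng.standard_normal(n_persons)    # latent traits
p = 1 / (1 + np.exp(-a * (theta_true[:, None] - b)))
x = (rng.random((n_persons, n_items)) < p).astype(int)   # 0/1 responses

def marginal_loglik(a, b, x, n_quad=41):
    # Nodes/weights for weight function exp(-t^2/2); rescale to the N(0,1) density.
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_quad)
    w = weights / np.sqrt(2 * np.pi)
    P = 1 / (1 + np.exp(-a[None, :] * (nodes[:, None] - b[None, :])))  # (Q, J)
    ll = x @ np.log(P).T + (1 - x) @ np.log(1 - P).T                   # (N, Q)
    return logsumexp(ll + np.log(w)[None, :], axis=1).sum()

# MML estimation would maximize this quantity over (a, b), e.g. with EM.
print(f"marginal log likelihood at the true item parameters: "
      f"{marginal_loglik(a, b, x):.1f}")
```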


Entropy, 2021, Vol. 23 (11), 1387
Author(s): Chi-Ken Lu, Patrick Shafto

It is desirable to combine the expressive power of deep learning with the Gaussian Process (GP) in one expressive Bayesian learning model. Deep kernel learning showed success by using a deep network for feature extraction, with a GP as the function model. Recently, it was suggested that, although trained via the marginal likelihood, the deterministic nature of the feature extractor might lead to overfitting, and that replacing it with a Bayesian network seemed to cure the problem. Here, we propose the conditional deep Gaussian process (DGP), in which the intermediate GPs in the hierarchical composition are supported by hyperdata while the exposed GP remains zero-mean. Motivated by the inducing points in sparse GPs, the hyperdata also play the role of function supports, but they are hyperparameters rather than random variables. Following our previous moment matching approach, we approximate the marginal prior of the conditional DGP with a GP carrying an effective kernel. Thus, as in empirical Bayes, the hyperdata are learned by optimizing the approximate marginal likelihood, which depends on the hyperdata implicitly through the kernel. We show equivalence with deep kernel learning in the limit of dense hyperdata in latent space. However, the conditional DGP and the corresponding approximate inference enjoy the benefit of being more Bayesian than deep kernel learning. Preliminary extrapolation results demonstrate the expressive power gained from the depth of the hierarchy by exploiting the exact covariance and hyperdata learning, in comparison with GP kernel composition, DGP variational inference, and deep kernel learning. We also address the non-Gaussian aspects of our model and a way of upgrading to full Bayesian inference.
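A rough single-layer analogue of the hyperdata idea (the actual model is a DGP with a moment-matched effective kernel): pseudo-points acting as hyperparameters that shape a conditional GP prior, learned jointly with the kernel parameters by maximizing the marginal likelihood. Everything below, including the kernel and data, is an illustrative assumption:

```python
# Hyperdata (Z, u) are optimization variables, not random variables:
# they condition the GP prior, and empirical Bayes tunes them.
import numpy as np
from scipy.optimize import minimize

def rbf(a, b, ls):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

rng = np.random.default_rng(4)
X = np.sort(rng.random(40))
y = np.sin(8 * X) + 0.1 * rng.standard_normal(40)
m = 6                                            # number of hyperdata points

def neg_log_marginal(params):
    ls, noise = np.exp(params[:2])
    Z, u = params[2:2 + m], params[2 + m:]       # hyperdata: locations and values
    Kzz = rbf(Z, Z, ls) + 1e-4 * np.eye(m)
    A = np.linalg.solve(Kzz, rbf(Z, X, ls))
    mean = A.T @ u                               # conditional prior mean
    cov = rbf(X, X, ls) - rbf(X, Z, ls) @ A + noise * np.eye(len(X))
    sign, logdet = np.linalg.slogdet(cov)
    r = y - mean
    return 0.5 * (r @ np.linalg.solve(cov, r) + logdet + len(X) * np.log(2 * np.pi))

x0 = np.concatenate([np.log([0.2, 0.05]), np.linspace(0, 1, m), np.zeros(m)])
res = minimize(neg_log_marginal, x0, method="L-BFGS-B")
print("optimized negative log marginal likelihood:", round(res.fun, 2))
```

Unlike the multi-fidelity setting above, where the supporting data are fixed observations, here the supports themselves are free hyperparameters, which is the empirical Bayes flavor the abstract describes.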


2021, Vol. 31 (6)
Author(s): Kimmo Suotsalo, Yingying Xu, Jukka Corander, Johan Pensar

Learning vector autoregressive models from multivariate time series is conventionally approached through least squares or maximum likelihood estimation. These methods typically assume a fully connected model, which provides no direct insight into the model structure and may lead to highly noisy parameter estimates. Because of these limitations, there has been increasing interest in methods that produce sparse estimates through penalized regression. However, such methods are computationally intensive and may become prohibitively time-consuming as the number of variables in the model increases. In this paper, we adopt an approximate Bayesian approach to the learning problem by combining the fractional marginal likelihood with the pseudo-likelihood. We propose a novel method, PLVAR, that is both faster and produces more accurate estimates than the state-of-the-art methods based on penalized regression. We prove the consistency of the PLVAR estimator and demonstrate the attractive performance of the method on both simulated and real-world data.
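For context, a sketch of the conventional baseline the abstract contrasts with: a VAR(1) model fit by least squares returns a dense coefficient matrix with no structural zeros, which is what sparse estimators like PLVAR improve on. This is the baseline only, not an implementation of PLVAR, and the simulated system is invented:

```python
# Fit a VAR(1) by least squares and observe that the estimated coefficient
# matrix is dense even though the generating dynamics are sparse.
import numpy as np

rng = np.random.default_rng(5)
d, T = 5, 400
A_true = np.diag(np.full(d, 0.6))            # sparse ground truth: diagonal dynamics
A_true[0, 3] = 0.3                            # plus one genuine cross-dependency
x = np.zeros((T, d))
for t in range(1, T):
    x[t] = x[t - 1] @ A_true.T + 0.5 * rng.standard_normal(d)

# Regress x_t on x_{t-1}; lstsq solves for B with x_t ≈ x_{t-1} @ B, so A = B.T.
B, *_ = np.linalg.lstsq(x[:-1], x[1:], rcond=None)
A_ls = B.T
print("nonzero entries in truth:", int((A_true != 0).sum()), "of", d * d)
print("fraction of |entries| > 0.05 in LS estimate:",
      float(np.mean(np.abs(A_ls) > 0.05).round(2)))
```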


2021, Vol. 159, 108255
Author(s): Daniel Siefman, Mathieu Hursin, Georg Schnabel, Henrik Sjöstrand

2021
Author(s): Michael R. May, Carl Rothfels

Time-calibrated phylogenetic trees are fundamental to a wide range of evolutionary studies. Typically, these trees are inferred in a Bayesian framework, with the phylogeny itself treated as a parameter with a prior distribution (a "tree prior"). This prior distribution is often a variant of the stochastic birth-death process, which models speciation events, extinction events, and sampling events (of extinct and/or extant lineages). However, the samples produced by this process are observations, so their probability should be viewed as a likelihood rather than a prior probability. We show that treating the samples as part of the prior results in incorrect marginal likelihood estimates and can result in model-comparison approaches disfavoring the best model within a set of candidate models. The ability to correctly compare the fit of competing tree models is critical to accurate phylogenetic estimates, especially of divergence times, and also to studying the processes that govern lineage diversification. We outline potential remedies, and provide guidance for researchers interested in comparing the fit of competing tree models.
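Since the argument turns on marginal likelihoods driving model comparison, here is a minimal, non-phylogenetic reminder of that mechanism: two candidate models scored by exact log marginal likelihoods on a conjugate toy, where omitting any data-dependent term from the likelihood would shift the comparison. The models and data are illustrative, not a tree model:

```python
# Compare two candidate models of the same data via their exact
# log marginal likelihoods (log Bayes factor).
import numpy as np

rng = np.random.default_rng(6)
y = rng.normal(1.0, 1.0, size=30)

def log_ml_normal(y, prior_mean, prior_var):
    """Exact log marginal likelihood: y_i ~ N(theta, 1), theta ~ N(mu0, v0)."""
    n = len(y)
    cov = np.eye(n) + prior_var * np.ones((n, n))   # marginal covariance of y
    r = y - prior_mean
    sign, logdet = np.linalg.slogdet(cov)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + r @ np.linalg.solve(cov, r))

m0 = log_ml_normal(y, 0.0, 1.0)      # candidate model 0: prior centered at 0
m1 = log_ml_normal(y, 1.0, 1.0)      # candidate model 1: prior centered at 1
print(f"log Bayes factor (M1 vs M0): {m1 - m0:.2f}")
```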

