Multi-view Restricted Boltzmann Machines with Posterior Consistency

Author(s): Ding Shifei, Zhang Nan, Zhang Jian
2021, Vol 11 (1)
Author(s): Guanglei Xu, William S. Oates

Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently and has been investigated experimentally on D-Wave hardware. A D-Wave implementation requires selecting an effective inverse temperature, or hyperparameter ($\beta$), within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find that both methods improve RBM training, based on experimental validation on D-Wave hardware for an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which shows more than an order of magnitude lower image reconstruction error using the maximum likelihood method than with manual optimization of the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
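The idea of fitting an effective inverse temperature by maximum likelihood can be illustrated on a toy problem. The sketch below is not the authors' procedure. It assumes a state space small enough that every energy can be enumerated, and that samples are drawn from $p(s) \propto \exp(-\beta E(s))$. Under those assumptions the log-likelihood is concave in $\beta$ and its gradient vanishes where the model's mean energy equals the mean energy of the samples, so a bisection search is enough. The names `mean_energy` and `fit_beta` are hypothetical.

```python
import math


def mean_energy(energies, beta):
    """Expected energy under p(s) proportional to exp(-beta * E(s))."""
    e_min = min(energies)  # shift energies so the exponentials cannot overflow
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / z


def fit_beta(energies, sample_energies, lo=0.0, hi=50.0, iters=60):
    """Maximum-likelihood beta by bisection on the moment-matching condition.

    Assumption: the optimum lies in [lo, hi]. The model mean energy decreases
    monotonically in beta, so bisection converges to the matching value.
    """
    target = sum(sample_energies) / len(sample_energies)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid  # model is "hotter" than the samples: raise beta
        else:
            hi = mid  # model is "colder" than the samples: lower beta
    return 0.5 * (lo + hi)
```

For example, if the energies are `[0.0, 1.0, 2.0, 3.0]` and the observed mean energy equals `mean_energy(energies, 1.3)`, then `fit_beta` recovers a value close to 1.3. On real annealing hardware the state space cannot be enumerated, so the model mean energy would itself have to be approximated.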


Author(s): Oswin Krause, Asja Fischer, Christian Igel

Estimating the normalization constants (partition functions) of energy-based probabilistic models (Markov random fields) with high accuracy is required for measuring performance, monitoring the training progress of adaptive models, and conducting likelihood ratio tests. We devised a unifying theoretical framework for algorithms that estimate the partition function, including Annealed Importance Sampling (AIS) and Bennett's Acceptance Ratio method (BAR). The unification reveals conceptual similarities and differences between the approaches and suggests new algorithms. The framework is based on a generalized form of Crooks' equality, which links the expectation over a distribution of samples generated by a transition operator to the expectation over the distribution induced by the reversed operator. Different ways of sampling, such as parallel tempering and path sampling, are covered by the framework. We performed experiments in which we estimated the partition function of restricted Boltzmann machines (RBMs) and Ising models. We found that BAR using parallel tempering worked well with a small number of bridging distributions, while path-sampling-based AIS performed best with many bridging distributions. The normalization constant is measured with respect to a reference distribution, and the choice of this distribution turned out to be very important in our experiments. Overall, BAR gave the best empirical results, outperforming AIS.
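As a rough illustration of what estimating an RBM partition function with AIS involves, the following sketch runs plain AIS on a tiny binary RBM and compares it with exact enumeration. It is a minimal textbook-style version, not the unified framework or the BAR estimator described above. The assumptions are a binary RBM with energy $E(v,h) = -b^\top v - c^\top h - v^\top W h$, bridging distributions obtained by scaling all parameters by $\beta_k \in [0, 1]$, and the uniform distribution ($\beta = 0$) as the reference. All function names are hypothetical.

```python
import math
import random
from itertools import product


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def log_unnorm_marginal(v, W, b, c, beta):
    """log p*_beta(v): parameters scaled by beta, hidden units summed out."""
    total = beta * sum(bi * vi for bi, vi in zip(b, v))
    for j in range(len(c)):
        act = beta * (c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
        total += math.log1p(math.exp(act))
    return total


def exact_log_z(W, b, c):
    """Exact log Z by enumerating all visible states (tiny models only)."""
    terms = [log_unnorm_marginal(v, W, b, c, 1.0)
             for v in product([0, 1], repeat=len(b))]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def gibbs_step(v, W, b, c, beta, rng):
    """One Gibbs sweep (h given v, then v given h) at inverse temperature beta."""
    n, m = len(b), len(c)
    h = [1 if rng.random() < sigmoid(
        beta * (c[j] + sum(v[i] * W[i][j] for i in range(n)))) else 0
        for j in range(m)]
    return [1 if rng.random() < sigmoid(
        beta * (b[i] + sum(W[i][j] * h[j] for j in range(m)))) else 0
        for i in range(n)]


def ais_log_z(W, b, c, n_runs=200, n_temps=100, seed=0):
    """AIS estimate of log Z with the uniform distribution as reference."""
    rng = random.Random(seed)
    n, m = len(b), len(c)
    betas = [k / n_temps for k in range(n_temps + 1)]
    log_weights = []
    for _ in range(n_runs):
        v = [rng.randint(0, 1) for _ in range(n)]  # exact sample at beta = 0
        log_w = 0.0
        for k in range(1, len(betas)):
            log_w += (log_unnorm_marginal(v, W, b, c, betas[k])
                      - log_unnorm_marginal(v, W, b, c, betas[k - 1]))
            v = gibbs_step(v, W, b, c, betas[k], rng)
        log_weights.append(log_w)
    mx = max(log_weights)
    log_mean_w = mx + math.log(
        sum(math.exp(w - mx) for w in log_weights) / n_runs)
    # log Z of the reference: 2^(n + m) equally weighted joint states
    return (n + m) * math.log(2.0) + log_mean_w
```

Calling `ais_log_z(W, b, c)` and `exact_log_z(W, b, c)` on a model with three visible and two hidden units gives a quick sanity check of the estimator. The choice of reference distribution here is the simplest one available. The abstract notes that this choice mattered a great deal in the authors' experiments, so the uniform reference should be read as a convenience for the sketch rather than a recommendation.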

