Approximate Learning Algorithm for Restricted Boltzmann Machines

We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the visible nodes representing the states and actions of an optimal policy form the first and last layers of the deep network. In the absence of a transverse field, our simulations show that DBMs are trained more effectively than restricted Boltzmann machines (RBMs) with the same number of nodes. We then develop a framework for training the network as a quantum Boltzmann machine (QBM) in the presence of a significant transverse field for reinforcement learning. This method also outperforms the reinforcement learning method that uses RBMs.

Learning algorithm in restricted Boltzmann machines using Kullback-Leibler importance estimation procedure

Nonlinear Theory and Its Applications IEICE ◽

10.1587/nolta.2.153 ◽

2011 ◽

Vol 2 (2) ◽

pp. 153-164 ◽

Cited By ~ 1

Author(s):

Muneki Yasuda ◽

Tetsuharu Sakurai ◽

Kazuyuki Tanaka

Keyword(s):

Learning Algorithm ◽

Estimation Procedure ◽

Restricted Boltzmann Machines ◽

Boltzmann Machines

Convergence Analysis of Contrastive Divergence Algorithm Based on Gradient Method with Errors

Mathematical Problems in Engineering ◽

10.1155/2015/350102 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Xuesi Ma ◽

Xiaojie Wang

Keyword(s):

Finite Number ◽

Gibbs Sampling ◽

Gradient Method ◽

Convergence Theorem ◽

Learning Algorithm ◽

Restricted Boltzmann Machines ◽

Convergence Conditions ◽

Boltzmann Machines ◽

Contrastive Divergence ◽

Step Number

Contrastive Divergence has become a common way to train Restricted Boltzmann Machines; however, its convergence has not yet been clearly established. This paper studies the convergence of the Contrastive Divergence algorithm. We relate the Contrastive Divergence algorithm to the gradient method with errors and derive convergence conditions for Contrastive Divergence using the convergence theorem of the gradient method with errors. We give specific convergence conditions for the Contrastive Divergence learning algorithm for Restricted Boltzmann Machines in which both visible units and hidden units can only take a finite number of values. Two new convergence conditions are obtained by specifying the learning rate. Finally, we give specific conditions that the number of Gibbs sampling steps must satisfy in order to guarantee convergence of the Contrastive Divergence algorithm.
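
For readers unfamiliar with the algorithm discussed in the abstract above, the following is a minimal sketch of one CD-k update for a small binary RBM, written in plain Python. It is a generic textbook form, not the formulation analysed in the paper. The function names (`cd_k_update`, `sample_hidden`, `sample_visible`), the parameter layout (`W`, `b`, `c`), and the default learning rate are assumptions made for illustration. The parameter `k` is the number of Gibbs sampling steps, and `lr` is the learning rate; these are the two quantities the abstract's convergence conditions concern.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return (p(h_j = 1 | v) for each j, a binary sample of h)."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return (p(v_i = 1 | h) for each i, a binary sample of v)."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd_k_update(v0, W, b, c, k=1, lr=0.1, rng=random):
    """Apply one CD-k update in place for a single training vector v0.

    W is a list of rows (visible x hidden), b the visible biases,
    c the hidden biases. Requires k >= 1. Returns the final visible sample.
    """
    ph0, h = sample_hidden(v0, W, c, rng)      # positive phase
    vk, phk = v0, ph0
    for _ in range(k):                         # k steps of block Gibbs sampling
        _, vk = sample_visible(h, W, b, rng)
        phk, h = sample_hidden(vk, W, c, rng)
    # Approximate gradient: data statistics minus k-step chain statistics.
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - vk[i] * phk[j])
        b[i] += lr * (v0[i] - vk[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - phk[j])
    return vk
```

Because the chain is truncated after `k` steps, the update is a biased estimate of the log-likelihood gradient, which is why it can be viewed as a gradient method with errors.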

Boltzmann machines with clusters of stochastic binary units

International Journal of Modeling Simulation and Scientific Computing ◽

10.1142/s1793962316500185 ◽

2016 ◽

Vol 07 (02) ◽

pp. 1650018

Author(s):

Da Teng ◽

Zhang Li ◽

Guanghong Gong ◽

Liang Han

Keyword(s):

Gaussian Distribution ◽

Learning Algorithm ◽

Hidden Variables ◽

Recognition Task ◽

Boltzmann Machine ◽

Restricted Boltzmann Machines ◽

Boltzmann Machines ◽

New Variant ◽

Deep Boltzmann Machine ◽

New Learning

The original restricted Boltzmann machines (RBMs) are extended by replacing the binary visible and hidden variables with clusters of binary units, and a new learning algorithm for training deep Boltzmann machines of this new variant is proposed. The sum of binary units of each cluster is approximated by a Gaussian distribution. Experiments demonstrate that the proposed Boltzmann machines can achieve good performance in the MNIST handwritten digit recognition task.
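
The Gaussian approximation mentioned in the abstract above can be illustrated with a small numerical check. The sketch below assumes, purely for illustration, that the `n` units in a cluster are independent and share the same activation probability `p`, so that their sum is binomial; the paper's actual model may differ. Under that assumption the sum has mean `n * p` and variance `n * p * (1 - p)`, and the code compares the exact binomial probabilities with the matching Gaussian density. All names here are hypothetical.

```python
import math


def binomial_pmf(n, p, s):
    """Exact probability that s of n independent Bernoulli(p) units are on."""
    return math.comb(n, s) * p ** s * (1 - p) ** (n - s)


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def cluster_sum_gaussian(n, p):
    """Mean and variance of the Gaussian that approximates the cluster sum."""
    return n * p, n * p * (1 - p)


n, p = 100, 0.3  # illustrative cluster size and activation probability
mean, var = cluster_sum_gaussian(n, p)

# Largest pointwise gap between the exact pmf and the Gaussian density.
max_gap = max(abs(binomial_pmf(n, p, s) - gaussian_pdf(s, mean, var))
              for s in range(n + 1))
print(mean, var, max_gap)
```

The gap shrinks as the cluster grows, which is the usual central-limit argument for replacing a sum of many binary units with a single Gaussian variable.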

Representational Power of Restricted Boltzmann Machines and Deep Belief Networks

Neural Computation ◽

10.1162/neco.2008.04-07-510 ◽

2008 ◽

Vol 20 (6) ◽

pp. 1631-1649 ◽

Cited By ~ 357

Author(s):

Nicolas Le Roux ◽

Yoshua Bengio

Keyword(s):

Learning Algorithm ◽

Network Models ◽

Building Blocks ◽

Discrete Distributions ◽

Belief Networks ◽

Deep Belief Networks ◽

Neural Network Models ◽

Restricted Boltzmann Machines ◽

Boltzmann Machines ◽

Representational Power

Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.

Analysis on Noisy Boltzmann Machines and Noisy Restricted Boltzmann Machines

IEEE Access ◽

10.1109/access.2021.3102275 ◽

2021 ◽

pp. 1-1

Author(s):

Wenhao Lu ◽

Chi-Sing Leung ◽

John Sum

Keyword(s):

Restricted Boltzmann Machines ◽

Boltzmann Machines

Adaptive hyperparameter updating for training restricted Boltzmann machines on quantum annealers

Scientific Reports ◽

10.1038/s41598-021-82197-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Guanglei Xu ◽

William S. Oates

Keyword(s):

Neural Network ◽

Maximum Likelihood ◽

Image Reconstruction ◽

Image Recognition ◽

Shannon Entropy ◽

Reconstruction Error ◽

Likelihood Method ◽

Restricted Boltzmann Machines ◽

Boltzmann Machines ◽

D Wave

Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. D-Wave implementation requires selection of an effective inverse temperature or hyperparameter (β) within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find both methods improve training RBMs based upon D-Wave hardware experimental validation on an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which shows more than an order of magnitude lower image reconstruction error using the maximum likelihood method than manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
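
To make the idea of estimating β by maximum likelihood concrete, here is a small sketch in plain Python. It is not the procedure from the paper. It assumes a toy setting in which the full list of state energies can be enumerated and the sampler draws states with probability proportional to exp(-β E). In that setting the log-likelihood of the samples is maximized when the model's mean energy equals the mean energy of the samples, and since the model's mean energy decreases monotonically in β, the matching value can be found by bisection. The function names and the search bracket `[lo, hi]` are assumptions.

```python
import math


def boltzmann_mean_energy(energies, beta):
    """Mean energy under p(s) proportional to exp(-beta * E(s))."""
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / z


def estimate_beta(energies, sample_mean_energy, lo=0.0, hi=50.0, iters=100):
    """Maximum-likelihood beta in [lo, hi], found by bisection.

    Solves <E>_beta = sample_mean_energy, which is the stationarity
    condition of the log-likelihood with respect to beta.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if boltzmann_mean_energy(energies, mid) > sample_mean_energy:
            lo = mid  # model mean energy too high: beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

As a usage example, generating a target mean energy with `boltzmann_mean_energy([0.0, 1.0, 2.0, 3.0], 1.7)` and passing it to `estimate_beta` with the same energy list should recover a value close to 1.7. On real hardware the state space cannot be enumerated, so the model's mean energy would itself have to be approximated.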
