Approximate Learning Algorithm for Restricted Boltzmann Machines

Author(s):  
Muneki Yasuda ◽  
Kazuyuki Tanaka
2018 ◽  
Vol 18 (1&2) ◽  
pp. 51-74 ◽  
Author(s):  
Daniel Crawford ◽  
Anna Levit ◽  
Navid Ghadermarzy ◽  
Jaspreet S. Oberoi ◽  
Pooya Ronagh

We investigate whether quantum annealers with select chip layouts can outperform classical computers in reinforcement learning tasks. We associate a transverse field Ising spin Hamiltonian with a layout of qubits similar to that of a deep Boltzmann machine (DBM) and use simulated quantum annealing (SQA) to numerically simulate quantum sampling from this system. We design a reinforcement learning algorithm in which the visible nodes representing the states and actions of an optimal policy form the first and last layers of the deep network. In the absence of a transverse field, our simulations show that DBMs are trained more effectively than restricted Boltzmann machines (RBMs) with the same number of nodes. We then develop a framework for training the network as a quantum Boltzmann machine (QBM) in the presence of a significant transverse field for reinforcement learning. This method also outperforms the reinforcement learning method that uses RBMs.


2015 ◽  
Vol 2015 ◽  
pp. 1-9 ◽  
Author(s):  
Xuesi Ma ◽  
Xiaojie Wang

Contrastive Divergence has become a common way to train Restricted Boltzmann Machines; however, its convergence has not yet been made clear. This paper studies the convergence of the Contrastive Divergence algorithm. We relate the Contrastive Divergence algorithm to the gradient method with errors and derive its convergence conditions using the convergence theorem of the gradient method with errors. We give specific convergence conditions for Contrastive Divergence learning of Restricted Boltzmann Machines in which both visible units and hidden units can only take a finite number of values. Two new convergence conditions are obtained by specifying the learning rate. Finally, we give specific conditions that the number of Gibbs sampling steps must satisfy in order to guarantee convergence of the Contrastive Divergence algorithm.
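
As a point of reference for the abstract above, a minimal sketch of one Contrastive Divergence (CD-k) update for a small binary RBM is given here. It is a generic textbook-style illustration, not the procedure analyzed in the paper; the function names, the list-of-lists weight layout, and the default learning rate are all assumptions made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return (p(h_j = 1 | v) for each j, a binary sample of h)."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return (p(v_i = 1 | h) for each i, a binary sample of v)."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd_k_update(v0, W, b, c, k=1, lr=0.1, rng=random):
    """Apply one CD-k update in place for a single binary training vector v0.

    W is a visible-by-hidden weight matrix (list of lists), b and c are the
    visible and hidden biases. Returns the k-step reconstruction of v0.
    """
    ph0, h = sample_hidden(v0, W, c, rng)   # positive phase
    vk, phk = v0, ph0
    for _ in range(k):                      # k steps of block Gibbs sampling
        _, vk = sample_visible(h, W, b, rng)
        phk, h = sample_hidden(vk, W, c, rng)
    # Approximate gradient: data statistics minus k-step chain statistics.
    for i in range(len(b)):
        for j in range(len(c)):
            W[i][j] += lr * (v0[i] * ph0[j] - vk[i] * phk[j])
        b[i] += lr * (v0[i] - vk[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - phk[j])
    return vk
```

The learning rate `lr` and the number of Gibbs steps `k` are the two quantities for which the abstract says convergence conditions are derived; the values used here are placeholders only.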


Author(s):  
Da Teng ◽  
Zhang Li ◽  
Guanghong Gong ◽  
Liang Han

The original restricted Boltzmann machines (RBMs) are extended by replacing the binary visible and hidden variables with clusters of binary units, and a new learning algorithm for training deep Boltzmann machines of this new variant is proposed. The sum of the binary units in each cluster is approximated by a Gaussian distribution. Experiments demonstrate that the proposed Boltzmann machines can achieve good performance in the MNIST handwritten digit recognition task.


2008 ◽  
Vol 20 (6) ◽  
pp. 1631-1649 ◽  
Author(s):  
Nicolas Le Roux ◽  
Yoshua Bengio

Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.
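
The remark that inference is easy in an RBM can be made concrete with the standard binary RBM formulation. The symbols below (visible units $v_i$, hidden units $h_j$, biases $b_i$ and $c_j$, weights $W_{ij}$, logistic function $\sigma$) are conventional notation assumed here, not taken from the abstract:

```latex
E(v, h) = -\sum_i b_i v_i - \sum_j c_j h_j - \sum_{i,j} v_i W_{ij} h_j,
\qquad
p(v, h) = \frac{e^{-E(v, h)}}{Z}

p(h_j = 1 \mid v) = \sigma\Big(c_j + \sum_i v_i W_{ij}\Big),
\qquad
p(v_i = 1 \mid h) = \sigma\Big(b_i + \sum_j W_{ij} h_j\Big),
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
```

Because there are no connections within a layer, each conditional factorizes over units, so the posterior over all hidden units given a visible vector is obtained in a single pass.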


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Guanglei Xu ◽  
William S. Oates

Restricted Boltzmann Machines (RBMs) have been proposed for developing neural networks for a variety of unsupervised machine learning applications such as image recognition, drug discovery, and materials design. The Boltzmann probability distribution is used as a model to identify network parameters by optimizing the likelihood of predicting an output given hidden states trained on available data. Training such networks often requires sampling over a large probability space that must be approximated during gradient-based optimization. Quantum annealing has been proposed as a means to search this space more efficiently, and this has been experimentally investigated on D-Wave hardware. D-Wave implementation requires selection of an effective inverse temperature or hyperparameter (β) within the Boltzmann distribution, which can strongly influence optimization. Here, we show how this parameter can be estimated as a hyperparameter applied to D-Wave hardware during neural network training by maximizing the likelihood or minimizing the Shannon entropy. We find both methods improve training RBMs based upon D-Wave hardware experimental validation on an image recognition problem. Neural network image reconstruction errors are evaluated using Bayesian uncertainty analysis, which illustrates more than an order of magnitude lower image reconstruction error using the maximum likelihood over manually optimizing the hyperparameter. The maximum likelihood method is also shown to outperform minimizing the Shannon entropy for image reconstruction.
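
To illustrate what estimating an effective inverse temperature by maximum likelihood can look like, here is a toy sketch. It is not the authors' procedure and involves no D-Wave API: it assumes a system small enough that all state energies can be enumerated, and that samples are drawn from a distribution proportional to exp(-β E). Under that assumption the likelihood is maximized where the model's mean energy equals the mean energy of the samples, and since the mean energy does not increase with β, bisection suffices. The function names and search bounds are hypothetical.

```python
import math


def mean_energy(energies, beta):
    """Expected energy under p(s) proportional to exp(-beta * E(s))."""
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / z


def estimate_beta(energies, sample_energies, lo=0.0, hi=50.0, iters=100):
    """Maximum-likelihood beta for samples from an enumerable Boltzmann model.

    energies: energy of every state of the (small) system.
    sample_energies: energies of the observed samples.
    Assumes the solution lies in [lo, hi].
    """
    target = sum(sample_energies) / len(sample_energies)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(energies, mid) > target:
            lo = mid  # model is too "hot": increase beta
        else:
            hi = mid  # model is too "cold": decrease beta
    return 0.5 * (lo + hi)
```

For a real annealer the full state space cannot be enumerated, so the model-side expectation would itself have to be approximated; this sketch only shows the moment-matching condition behind the maximum-likelihood estimate.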

