STUDYING THE EFFECT OF ADAPTIVE MOMENTUM IN IMPROVING THE ACCURACY OF GRADIENT DESCENT BACK PROPAGATION ALGORITHM ON CLASSIFICATION PROBLEMS

2012, Vol 09, pp. 432-439
Author(s): MUHAMMAD ZUBAIR REHMAN, NAZRI MOHD. NAWI

Despite being widely used in practical problems around the world, the gradient descent back-propagation algorithm suffers from slow convergence and convergence to local minima. Previous researchers have suggested modifications to improve its convergence, such as careful selection of the initial weights and biases, the learning rate, the momentum, the network topology, the activation function, and the value of the 'gain' in the activation function. This research proposes an algorithm for improving the performance of back-propagation, 'Gradient Descent with Adaptive Momentum (GDAM)', which adapts the momentum while keeping the gain value fixed during all network trials. The performance of GDAM is compared with 'Gradient Descent with fixed Momentum (GDM)' and 'Gradient Descent Method with Adaptive Gain (GDM-AG)'. The learning rate is fixed at 0.4, the maximum number of epochs is set to 3000, and the sigmoid activation function is used in the experiments. The results show that GDAM outperforms the previous methods, achieving an accuracy ratio of 1.0 on classification problems such as Wine Quality, Mushroom, and Thyroid disease.
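
As a rough illustration of the idea (the abstract does not give GDAM's exact adaptation rule, so the error-driven heuristic and the parameter values below are assumptions), a momentum-adapted gradient-descent step might look like this in Python:

```python
import numpy as np

def sigmoid(x, gain=1.0):
    """Sigmoid activation; the gain is held fixed across trials, as in GDAM."""
    return 1.0 / (1.0 + np.exp(-gain * x))

def gdam_step(w, grad, velocity, momentum, lr=0.4):
    """One momentum update: v <- momentum*v - lr*grad, w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

def adapt_momentum(momentum, err, prev_err, up=1.05, down=0.7, lo=0.1, hi=0.99):
    """Hypothetical adaptation: grow the momentum while the error keeps falling,
    shrink it when the error rises (not the authors' published rule)."""
    momentum = momentum * (up if err < prev_err else down)
    return float(np.clip(momentum, lo, hi))
```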

2012, Vol 09, pp. 448-455
Author(s): NORHAMREEZA ABDUL HAMID, NAZRI MOHD NAWI, ROZAIDA GHAZALI, MOHD NAJIB MOHD SALLEH

This paper presents a new method that helps the back-propagation algorithm avoid the local minima problem and the slow convergence caused by neuron saturation in the hidden layer. In the proposed algorithm, each training pattern has its own activation-function gains for the neurons in the hidden layer, which are adjusted together with adaptive momentum and learning rate values during the learning process. The efficiency of the proposed algorithm is compared with the conventional back-propagation gradient descent and the current back-propagation gradient descent with adaptive gain through simulations on three benchmark problems, namely iris, glass, and thyroid.
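
A minimal sketch of the per-pattern gain idea follows, assuming sigmoid hidden units whose gain is trained by gradient descent; the paper's actual update rules for the gain, momentum, and learning rate are not given in the abstract, so the formulas below are placeholders:

```python
import numpy as np

def hidden_layer(x, W, gain):
    """Sigmoid hidden layer with an explicit gain: f(net) = 1/(1 + exp(-gain*net))."""
    net = x @ W
    out = 1.0 / (1.0 + np.exp(-gain * net))
    return out, net

def gain_gradient(err_signal, net, out):
    """dE/d(gain) for a sigmoid unit: the backpropagated error times
    df/d(gain) = net * out * (1 - out)."""
    return err_signal * net * out * (1.0 - out)

def update_gain(gain, grad_gain, lr_gain=0.01):
    """Placeholder gradient step on the gain kept for each training pattern."""
    return gain - lr_gain * grad_gain
```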


Author(s): Mohammed Sarhan Al_Duais, Fatma Susilawati Mohamad

The main problems of the batch back-propagation (BBP) algorithm are slow training, the need to adjust several parameters manually, and saturation during training. The learning rate and momentum factor are significant parameters for increasing the efficiency of BBP. In this study, we created a new dynamic function for each of the learning rate and the momentum factor, and we present the DBBPLM algorithm, which trains with these dynamic functions. A sigmoid function is used as the activation function. The XOR problem and the balance, breast cancer, and iris datasets were used as benchmarks for testing the effect of the dynamic DBBPLM algorithm. All experiments were performed in MATLAB 2012a, and training was stopped when the error reached 10^-5. The experimental results show that the DBBPLM algorithm provides superior training performance, with faster training and higher accuracy than the BBP algorithm and existing works.
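
The abstract states that the learning rate and momentum factor are each computed by a dynamic function during training but does not reproduce those functions; the error-dependent forms below are therefore illustrative placeholders, not the DBBPLM formulas:

```python
import numpy as np

def dynamic_lr(batch_error, base=0.5):
    """Hypothetical dynamic learning rate: larger while the batch error is large."""
    return base * (1.0 - np.exp(-batch_error))

def dynamic_momentum(batch_error, base=0.9):
    """Hypothetical dynamic momentum: approaches `base` as the error shrinks."""
    return base * np.exp(-batch_error)

def batch_bp_step(w, velocity, batch_grad, batch_error):
    """One batch update using the dynamically computed coefficients."""
    lr, mu = dynamic_lr(batch_error), dynamic_momentum(batch_error)
    velocity = mu * velocity - lr * batch_grad
    return w + velocity, velocity
```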


1999, Vol 09 (04), pp. 273-284
Author(s): ROELOF K. BROUWER

This paper illustrates the use of a powerful language, called J, that is ideal for simulating neural networks. The use of J is demonstrated by applying it to a gradient descent method for training a multilayer perceptron. It is also shown how the back-propagation algorithm can be easily generalized to multilayer networks without any increase in complexity, and that the algorithm can be expressed entirely in an array notation that is directly executable through J. J is a general-purpose language, which gives its user a flexibility not available in neural network simulators or in software packages such as MATLAB. Yet, because of its numerous operators, J allows very succinct code, leading to a tremendous decrease in development time.
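
The paper's point is that the whole training step can be written as a handful of array operations; as an illustration only (in NumPy rather than J, and not the paper's code), one gradient-descent step for a two-layer sigmoid perceptron in pure matrix form might look like this:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_step(X, T, W1, W2, lr=0.1):
    """One batch gradient-descent update for inputs X and targets T,
    expressed entirely in array notation."""
    H = sigmoid(X @ W1)                  # hidden activations
    Y = sigmoid(H @ W2)                  # network outputs
    dY = (Y - T) * Y * (1.0 - Y)         # output-layer deltas (squared error)
    dH = (dY @ W2.T) * H * (1.0 - H)     # hidden-layer deltas
    return W1 - lr * X.T @ dH, W2 - lr * H.T @ dY
```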


2016, Vol 114, pp. 79-87
Author(s): Alaa Ali Hameed, Bekir Karlik, Mohammad Shukri Salman

2018
Author(s): Kazunori D Yamada

In the deep learning era, stochastic gradient descent is the most common method used for optimizing neural network parameters. Among the various mathematical optimization methods, the gradient descent method is the most naive. Adjusting the learning rate is necessary for quick convergence, and with plain gradient descent this is normally done manually. Many optimizers have been developed to control the learning rate and increase convergence speed. Generally, these optimizers adjust the learning rate automatically in response to learning status, and they have been gradually improved by incorporating the effective aspects of earlier methods. In this study, we developed a new optimizer, YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment estimation system, we incorporated an advantageous part of AdaDelta, namely its unit correction system, into YamAdam. According to benchmark tests on some common datasets, our optimizer showed similar or faster convergence compared to existing methods. YamAdam is thus an alternative optimizer option for deep learning.
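
The abstract describes YamAdam as Adam's first- and second-moment estimation combined with AdaDelta's unit correction, without giving the exact update rule; the step below (Adam-style moments scaled by an AdaDelta-style running RMS of past updates in place of a fixed learning rate) is one plausible reading, not the published formula:

```python
import numpy as np

def yamadam_like_step(w, grad, state, beta1=0.9, beta2=0.999, eps=1e-8):
    """Illustrative update only: Adam-style moments with an AdaDelta-style
    unit correction; `state` is (m, v, u), all initialized to zeros like w."""
    m, v, u = state
    m = beta1 * m + (1.0 - beta1) * grad            # first moment (Adam)
    v = beta2 * v + (1.0 - beta2) * grad**2         # second moment (Adam)
    step = np.sqrt(u + eps) / np.sqrt(v + eps) * m  # AdaDelta-style unit correction
    u = beta2 * u + (1.0 - beta2) * step**2         # running RMS of past updates
    return w - step, (m, v, u)
```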

