Momentum Acceleration of Quasi-Newton Training for Neural Networks

Author(s):  
Shahrzad Mahboubi ◽  
S. Indrapriyadarsini ◽  
Hiroshi Ninomiya ◽  
Hideki Asai
Author(s):  
S. Indrapriyadarsini ◽  
Shahrzad Mahboubi ◽  
Hiroshi Ninomiya ◽  
Takeshi Kamio ◽  
Hideki Asai

Gradient-based methods are widely used in training neural networks and can be broadly categorized into first- and second-order methods. Second-order methods have been shown to converge better than first-order methods, especially on highly nonlinear problems. The BFGS quasi-Newton method is the most commonly studied second-order method for neural network training. Recent methods have been shown to speed up the convergence of the BFGS method using Nesterov's accelerated gradient and momentum terms. The SR1 quasi-Newton method, though less commonly used in training neural networks, is known to have interesting properties and to provide good Hessian approximations when used with a trust-region approach. This paper therefore investigates accelerating the Symmetric Rank-1 (SR1) quasi-Newton method with Nesterov's gradient for training neural networks, and briefly discusses its convergence. The performance of the proposed method is evaluated on a function approximation problem and an image classification problem.
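The kind of accelerated update the abstract describes can be sketched roughly as follows: evaluate the gradient at a Nesterov-style look-ahead point, take a quasi-Newton step along that direction, and maintain the inverse-Hessian approximation with the rank-one SR1 formula. The momentum coefficient, step size, and skip rule below are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def naq_sr1(grad, w, mu=0.3, lr=1.0, iters=20, tol=1e-8):
    """Illustrative sketch: SR1 quasi-Newton with a Nesterov-style
    momentum term. grad: gradient function, w: initial parameters."""
    H = np.eye(w.size)                 # inverse-Hessian approximation
    v = np.zeros_like(w)               # momentum (velocity) term
    for _ in range(iters):
        g = grad(w + mu * v)           # gradient at the look-ahead point
        v_new = mu * v - lr * (H @ g)  # accelerated quasi-Newton step
        w_new = w + v_new
        s = w_new - (w + mu * v)       # displacement from look-ahead
        y = grad(w_new) - g            # corresponding gradient difference
        r = s - H @ y
        denom = r @ y
        # SR1 update with the usual skip rule against tiny denominators
        if abs(denom) > tol * np.linalg.norm(r) * np.linalg.norm(y):
            H = H + np.outer(r, r) / denom
        w, v = w_new, v_new
    return w
```

On a convex quadratic the SR1 updates quickly recover the true inverse Hessian, after which the look-ahead step lands on the minimizer; on nonconvex training losses a trust region or line search (as the abstract suggests) would be needed to keep the indefinite SR1 approximation safe.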


2009 ◽  
Vol 36 (9) ◽  
pp. 1491-1505 ◽  
Author(s):  
Seema Chauhan ◽  
R. K. Shrivastava

The present study aims to apply artificial neural networks (ANNs) to reference evapotranspiration (ETo) prediction. Three different feed-forward ANN models, each using a different input combination of previous months' ETo, were trained and tested. The output of each network was the one-month-ahead ETo. The networks learned to forecast one-month-ahead ETo for the Mahanadi reservoir project area using three learning methods, namely the quasi-Newton algorithm, the Levenberg-Marquardt algorithm, and backpropagation with an adaptive learning rate. The training results were compared with each other, and performance was evaluated on untrained data. The performance measures were the standard error of estimates (SEE), the raw standard error of estimates (RSEE), and model efficiency. The best ANN architecture for prediction of ETo was obtained for the Mahanadi reservoir project area. The monthly reference evapotranspiration data were estimated by the Penman-Monteith method and used for training and testing of the ANN models. Further, the ANN-predicted results were compared with those obtained using the statistical multiple regression technique. Based on the results obtained, the ANN model with a 3-9-1 architecture (three, nine, and one neuron(s) in the input, hidden, and output layers, respectively) trained using the quasi-Newton algorithm was found to be the best among all the models, with minimum SEE and RSEE of 0.45 and 0.45 mm/d, respectively, and a maximum model efficiency of 93%. It is concluded that ANNs can be used to predict ETo.
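As a rough illustration of the setup the study describes, the sketch below trains a 3-9-1 feed-forward network (three lagged-ETo inputs, nine hidden tanh units, one linear output) by backpropagation with a "bold driver" adaptive learning rate, one common form of the adaptive-rate scheme the study names. The synthetic data, scaling, and hyperparameters are invented for the example and are not taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in data: three previous months' ETo (mm/d) as inputs,
# next month's ETo as the target.
X = rng.uniform(1.0, 8.0, (200, 3))
y = 0.5 * X.mean(axis=1, keepdims=True) + 0.05 * rng.standard_normal((200, 1))
Xn = (X - X.mean(0)) / X.std(0)                # standardize inputs

W1 = 0.3 * rng.standard_normal((3, 9)); b1 = np.zeros(9)   # 3 -> 9
W2 = 0.3 * rng.standard_normal((9, 1)); b2 = np.zeros(1)   # 9 -> 1
lr, prev_mse = 0.01, np.inf
for _ in range(2000):
    h = np.tanh(Xn @ W1 + b1)                  # hidden layer, 9 tanh units
    out = h @ W2 + b2                          # linear output unit
    err = out - y
    mse = float((err ** 2).mean())
    # adaptive rule: grow the step while the error falls, shrink otherwise
    lr = lr * 1.05 if mse < prev_mse else lr * 0.7
    prev_mse = mse
    # backpropagation of the mean-squared-error gradient
    d_out = 2.0 * err / len(Xn)
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(0)
    W1 -= lr * Xn.T @ d_h;  b1 -= lr * d_h.sum(0)
```

Swapping this loop for a quasi-Newton or Levenberg-Marquardt optimizer over the same 3-9-1 weights reproduces the comparison the study performs.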


2014 ◽  
Vol 2014 ◽  
pp. 1-8
Author(s):  
Luma N. M. Tawfiq ◽  
Othman M. Salih

The aim of this paper is to present a parallel processor technique for solving eigenvalue problems for ordinary differential equations using artificial neural networks. The proposed network is trained by backpropagation with different training algorithms: quasi-Newton, Levenberg-Marquardt, and Bayesian regularization. A further objective of this paper is to compare the predictive performance of the aforementioned algorithms.

