Mixing neural networks and the Newton method for the kinematics of simple cable-driven parallel robots with sagging cables

Accelerating Symmetric Rank-1 Quasi-Newton Method with Nesterov’s Gradient for Training Neural Networks

10.20944/preprints202112.0097.v1 ◽ 2021

Author(s): S. Indrapriyadarsini ◽ Shahrzad Mahboubi ◽ Hiroshi Ninomiya ◽ Takeshi Kamio ◽ Hideki Asai

Keyword(s): Neural Networks ◽ Newton Method ◽ Trust Region ◽ Nonlinear Problems ◽ Classification Problem ◽ Second Order ◽ Highly Nonlinear ◽ Gradient Based ◽ Quasi Newton ◽ Second Order Methods

Gradient-based methods are popularly used in training neural networks and can be broadly categorized into first- and second-order methods. Second-order methods have been shown to converge better than first-order methods, especially on highly nonlinear problems. The BFGS quasi-Newton method is the most commonly studied second-order method for neural network training. Recent methods have been shown to speed up the convergence of the BFGS method using Nesterov’s accelerated gradient and momentum terms. The SR1 quasi-Newton method, though less commonly used in training neural networks, is known to have interesting properties and to provide good Hessian approximations when used with a trust-region approach. Thus, this paper aims to investigate accelerating the Symmetric Rank-1 (SR1) quasi-Newton method with Nesterov’s gradient for training neural networks and to briefly discuss its convergence. The performance of the proposed method is evaluated on function approximation and image classification problems.

Accelerating Symmetric Rank-1 Quasi-Newton Method with Nesterov’s Gradient for Training Neural Networks

Algorithms ◽ 10.3390/a15010006 ◽ 2021 ◽ Vol 15 (1) ◽ pp. 6

Author(s): S. Indrapriyadarsini ◽ Shahrzad Mahboubi ◽ Hiroshi Ninomiya ◽ Takeshi Kamio ◽ Hideki Asai

Keyword(s): Neural Networks ◽ Newton Method ◽ Trust Region ◽ Nonlinear Problems ◽ Classification Problem ◽ Second Order ◽ Highly Nonlinear ◽ Gradient Based ◽ Quasi Newton ◽ Second Order Methods

Gradient-based methods are popularly used in training neural networks and can be broadly categorized into first- and second-order methods. Second-order methods have been shown to converge better than first-order methods, especially on highly nonlinear problems. The BFGS quasi-Newton method is the most commonly studied second-order method for neural network training. Recent methods have been shown to speed up the convergence of the BFGS method using Nesterov’s accelerated gradient and momentum terms. The SR1 quasi-Newton method, though less commonly used in training neural networks, is known to have interesting properties and to provide good Hessian approximations when used with a trust-region approach. Thus, this paper aims to investigate accelerating the Symmetric Rank-1 (SR1) quasi-Newton method with Nesterov’s gradient for training neural networks, and to briefly discuss its convergence. The performance of the proposed method is evaluated on function approximation and image classification problems.

A learning algorithm for multilayered neural networks: a Newton method using automatic differentiation

IJCNN-91-Seattle International Joint Conference on Neural Networks ◽ 10.1109/ijcnn.1991.155623 ◽ 2002 ◽ Cited By ~ 1

Author(s): T. Yoshida

Keyword(s): Neural Networks ◽ Newton Method ◽ Automatic Differentiation ◽ Learning Algorithm

Solving the Forward Kinematics of Cable-Driven Parallel Robots with Neural Networks and Interval Arithmetic

Computational Kinematics - Mechanisms and Machine Science ◽ 10.1007/978-94-007-7214-4_12 ◽ 2013 ◽ pp. 103-110 ◽ Cited By ~ 8

Author(s): Valentin Schmidt ◽ Bertram Müller ◽ Andreas Pott

Keyword(s): Neural Networks ◽ Interval Arithmetic ◽ Parallel Robots ◽ Forward Kinematics

Finite-Time Stable Versions of the Continuous Newton Method and Applications to Neural Networks

IFAC Proceedings Volumes ◽ 10.3182/20090921-3-tr-3005.00042 ◽ 2009 ◽ Vol 42 (19) ◽ pp. 231-236

Author(s): Navid Noroozi ◽ Paknosh Karimaghaee ◽ Ali Akbar Safavi ◽ Amit Bhaya

Keyword(s): Neural Networks ◽ Newton Method ◽ Finite Time ◽ Continuous Newton Method

Distributed Newton Methods for Deep Neural Networks

Neural Computation ◽ 10.1162/neco_a_01088 ◽ 2018 ◽ Vol 30 (6) ◽ pp. 1673-1724 ◽ Cited By ~ 5

Author(s): Chien-Chih Wang ◽ Kent Loong Tan ◽ Chun-Ting Chen ◽ Yu-Hsiang Lin ◽ S. Sathiya Keerthi ◽ ...

Keyword(s): Neural Networks ◽ Newton Method ◽ Deep Neural Networks ◽ Deep Structure ◽ Gradient Methods ◽ Communication Cost ◽ Test Accuracy ◽ Running Time ◽ Distributed Training ◽ Newton Direction

Deep learning involves a difficult nonconvex optimization problem with a large number of weights between any two adjacent layers of a deep structure. To handle large data sets or complicated networks, distributed training is needed, but the calculation of function, gradient, and Hessian is expensive. In particular, the communication and the synchronization cost may become a bottleneck. In this letter, we focus on situations where the model is distributedly stored and propose a novel distributed Newton method for training deep neural networks. By variable and feature-wise data partitions and some careful designs, we are able to explicitly use the Jacobian matrix for matrix-vector products in the Newton method. Some techniques are incorporated to reduce the running time as well as memory consumption. First, to reduce the communication cost, we propose a diagonalization method such that an approximate Newton direction can be obtained without communication between machines. Second, we consider subsampled Gauss-Newton matrices for reducing the running time as well as the communication cost. Third, to reduce the synchronization cost, we terminate the process of finding an approximate Newton direction even though some nodes have not finished their tasks. Details of some implementation issues in distributed environments are thoroughly investigated. Experiments demonstrate that the proposed method is effective for the distributed training of deep neural networks. Compared with stochastic gradient methods, it is more robust and may give better test accuracy.
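
The idea of using the Jacobian for matrix-vector products with a subsampled Gauss-Newton matrix can be illustrated with a minimal single-machine sketch. It assumes a squared-error loss with one scalar output per sample, so the Gauss-Newton matrix reduces to an average of J^T J over the chosen samples. The function name `gauss_newton_matvec` and the random Jacobian are hypothetical, and none of the paper's distributed partitioning or diagonalization is reproduced here.

```python
import numpy as np

def gauss_newton_matvec(J, v, batch=None):
    """Compute (1/m) J_S^T (J_S v) without forming the Gauss-Newton matrix.

    J: (n_samples, n_params) Jacobian of model outputs w.r.t. the weights
    (one scalar output per sample, an assumption of this sketch).
    batch: optional index array selecting the subsample S of size m.
    """
    Js = J if batch is None else J[batch]
    # Two matrix-vector products instead of one (n_params x n_params) matrix.
    return Js.T @ (Js @ v) / Js.shape[0]


rng = np.random.default_rng(0)
J = rng.standard_normal((200, 5))      # stand-in Jacobian for illustration
v = rng.standard_normal(5)
full = gauss_newton_matvec(J, v)
sub = gauss_newton_matvec(J, v, batch=rng.choice(200, size=50, replace=False))
```

A product of this kind is typically all that an inner iterative solver such as conjugate gradient needs in order to approximate a Newton direction.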

Convergence of Quasi-Newton Method for Fully Complex-Valued Neural Networks

Neural Processing Letters ◽ 10.1007/s11063-017-9621-7 ◽ 2017 ◽ Vol 46 (3) ◽ pp. 961-968 ◽ Cited By ~ 2

Author(s): Dongpo Xu ◽ Jian Dong ◽ Chengdong Zhang

Keyword(s): Neural Networks ◽ Newton Method ◽ Quasi Newton ◽ Complex Valued

Parallel robots pose accuracy compensation using artificial neural networks

On the Sliding Mode Control of Redundant Parallel Robots Using Neural Networks
