Convergence of Batch Gradient Method Based on the Entropy Error Function for Feedforward Neural Networks

2020 ◽ Vol 52 (3) ◽ pp. 2687-2695
Author(s): Yan Xiong ◽ Xin Tong


Author(s): Nikolay Anatolievich Vershkov ◽ Mikhail Grigoryevich Babenko ◽ Viktor Andreevich Kuchukov ◽ Natalia Nikolaevna Kuchukova

The article deals with the problem of recognizing handwritten digits with feedforward neural networks (perceptrons) using a correlation indicator. The proposed method is based on a mathematical model of the neural network as an oscillatory system, analogous to an information transmission system. The article builds on the authors' earlier theoretical work on finding the global extremum of the error function in artificial neural networks. A handwritten digit image is treated as a one-dimensional discrete input signal: a combination of "perfect digit writing" and noise, where the noise describes the deviation of the actual input from the perfect writing. The ideal observer criterion (Kotelnikov criterion), widely used in information transmission systems to describe the probability of correctly recognizing the input signal, is used to form the loss function. The article carries out a comparative analysis, on experimentally obtained sequences, of the convergence of learning under the correlation indicator versus the CrossEntropyLoss function widely used in classification tasks, both with and without an optimizer. Based on these experiments, it is concluded that the proposed correlation indicator offers a 2-3-fold advantage in convergence.
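The abstract does not reproduce the loss formula, so the following numpy sketch only illustrates the general idea of a correlation indicator: each flattened digit is treated as a 1-D discrete signal and scored by normalized correlation against a per-class "perfect writing" template, with the decision made ideal-observer style. The function names, the exact normalization, and the loss formulation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def correlation_scores(x, templates):
    """Normalized correlation between a flattened digit image x (a 1-D
    discrete signal) and one 'perfect writing' template per class."""
    x = (x - x.mean()) / (x.std() + 1e-12)
    t = (templates - templates.mean(axis=1, keepdims=True)) \
        / (templates.std(axis=1, keepdims=True) + 1e-12)
    return (t @ x) / x.size                  # shape: (num_classes,)

def classify(x, templates):
    """Ideal-observer style decision: choose the class whose template
    correlates best with the noisy input signal."""
    return int(np.argmax(correlation_scores(x, templates)))

def correlation_loss(x, templates, true_class):
    """Hypothetical training indicator: penalize low correlation with
    the true class template (illustrative, not the authors' formula)."""
    return 1.0 - correlation_scores(x, templates)[true_class]
```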


Author(s): Arnošt Veselý

This chapter deals with applications of artificial neural networks to classification and regression problems. Based on a theoretical analysis, it demonstrates that in classification problems one should use the cross-entropy error function rather than the usual sum-of-squares error function. Using the gradient descent method to minimize the cross-entropy error function leads to the well-known backpropagation-of-error scheme of gradient calculation, provided the output layer of the neural network uses neurons with logistic or softmax output functions. The author believes that understanding the underlying theory presented in this chapter will help researchers in medical informatics choose more suitable network architectures for medical applications and carry out network training more effectively.
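The simplification the chapter refers to is a standard result: with softmax outputs and the cross-entropy error, the error signal at the output layer reduces to (y - t), exactly the quantity backpropagation starts from. A minimal numpy check, with arbitrary toy values:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(y, t):
    """E = -sum_k t_k * log(y_k) for a one-hot target t."""
    return -np.sum(t * np.log(y + 1e-12))

z = np.array([2.0, 0.5, -1.0])       # output-layer pre-activations (toy values)
t = np.array([1.0, 0.0, 0.0])        # one-hot target
y = softmax(z)
delta = y - t                        # dE/dz_k = y_k - t_k: no Jacobian needed
```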


2007 ◽ Vol 19 (12) ◽ pp. 3356-3368
Author(s): Yan Xiong ◽ Wei Wu ◽ Xidai Kang ◽ Chao Zhang

A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most often used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is applied to pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function, so as to increase the magnitude of the update increment of the weights when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.
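A numpy sketch of the structure involved, assuming a single-output pi-sigma network with a logistic output. The gradient loop shows why plain updates shrink (the product of the other summing units' outputs can be tiny early in training); the penalty term below is schematic, mimicking only the stated purpose of the letter's adaptive penalty, not its actual rule.

```python
import numpy as np

def pi_sigma_forward(x, W, b):
    """Pi-sigma net: K summing units whose outputs are multiplied
    by a product unit, then passed through a logistic output."""
    h = W @ x + b                        # summing-layer outputs, shape (K,)
    net = np.prod(h)                     # product unit
    y = 1.0 / (1.0 + np.exp(-net))       # logistic output
    return y, h, net

def penalized_gradient(x, t, W, b, lam=0.01):
    """Gradient of the squared error plus a schematic penalty gradient
    (illustrative assumption, not the letter's adaptive penalty)."""
    y, h, net = pi_sigma_forward(x, W, b)
    dE_dnet = (y - t) * y * (1.0 - y)
    grads = np.empty_like(W)
    for k in range(W.shape[0]):
        # d(net)/d(w_k) = x * prod_{j != k} h_j: a product of many small
        # h_j values is what makes the plain update increment tiny early on.
        prod_others = np.prod(np.delete(h, k))
        grads[k] = dE_dnet * prod_others * x
    # schematic booster: same sign as the gradient, relatively larger
    # when the plain gradient is small
    grads += lam * np.sign(grads) / (1.0 + np.abs(grads))
    return grads
```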


2010 ◽ Vol 2010 ◽ pp. 1-27
Author(s): Huisheng Zhang ◽ Dongpo Xu ◽ Zhiping Wang

The online gradient method has been widely used in training neural networks. In this paper we consider an online split-complex gradient algorithm for complex-valued neural networks, with an adaptive learning rate chosen during the training procedure. Under certain conditions, by first showing the monotonicity of the error function, it is proved that the gradient of the error function tends to zero and the weight sequence tends to a fixed point. A numerical example is given to support the theoretical findings.
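A minimal numpy sketch of the split-complex convention for a single complex-valued neuron: the real activation acts on the real and imaginary parts of the net input independently, and the weight update follows the gradient of the squared modulus of the error. The adaptive learning-rate rule below is a common schematic heuristic (grow the rate while the error falls, cut it when the error rises), not the paper's conditions.

```python
import numpy as np

def g(v):
    """Real activation applied separately to each split part."""
    return np.tanh(v)

def g_prime(v):
    return 1.0 - np.tanh(v) ** 2

def split_complex_step(w, x, d, eta):
    """One online update of a single complex-valued neuron under the
    split-complex convention."""
    u = w @ x                                    # complex net input
    y = g(u.real) + 1j * g(u.imag)               # split-complex activation
    e = d - y                                    # complex output error
    # gradient of E = |e|^2 / 2 with respect to the complex weights
    delta = e.real * g_prime(u.real) + 1j * e.imag * g_prime(u.imag)
    return w + eta * delta * np.conj(x)

# Schematic adaptive learning rate on a toy online stream.
rng = np.random.default_rng(0)
w = 0.1 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
eta, prev_err = 0.1, np.inf
for _ in range(200):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    d = 0.5 + 0.5j                               # toy target signal
    u = w @ x
    e = d - (g(u.real) + 1j * g(u.imag))
    err = abs(e) ** 2
    eta = min(eta * 1.05, 1.0) if err < prev_err else eta * 0.7
    prev_err = err
    w = split_complex_step(w, x, d, eta)
```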


2015 ◽ Vol 9 (3) ◽ pp. 331-340
Author(s): Huisheng Zhang ◽ Ying Zhang ◽ Dongpo Xu ◽ Xiaodong Liu
