Convergence Analysis of Inverse Iterative Neural Networks with L2 Penalty

2016 ◽  
Vol 8 (2) ◽  
pp. 85-98
Author(s):  
Yanqing Wen ◽  
Jian Wang ◽  
Bingjia Huang ◽  
Jacek M. Zurada

Abstract The iterative inversion of neural networks has been used to solve adaptive control problems owing to its strong information-processing performance. In this paper, an iterative inversion neural network with an L2 penalty term is presented and trained with the classical gradient descent method. We focus on the theoretical analysis of the proposed algorithm: monotonicity of the error function, boundedness of the input sequences, and weak (strong) convergence behavior. For the boundedness property, we rigorously prove that the feasible input solutions are restricted to a measurable region. Weak convergence means that the gradient of the error function with respect to the input tends to zero as the number of iterations goes to infinity, while strong convergence means that the iterative sequence of input vectors converges to a fixed optimal point.
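
As a rough illustration of the scheme described above, the following sketch inverts a small, fixed single-hidden-layer network by gradient descent on the input with an L2 penalty. The network weights, sizes, penalty coefficient and step size are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Minimal sketch of iterative network inversion with an L2 penalty:
# given a "trained" network f and a target output y*, update the input x by
# gradient descent on  E(x) = 0.5*||f(x) - y*||^2 + 0.5*lam*||x||^2 .

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 3)), np.zeros(8)   # assumed trained hidden-layer weights
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)   # assumed trained output-layer weights

def forward(x):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

def inversion_gradient(x, y_star, lam):
    y, h = forward(x)
    r = y - y_star                       # output residual
    dh = (W2.T @ r) * (1.0 - h**2)       # back-propagate through the tanh layer
    return W1.T @ dh + lam * x           # gradient of E with respect to the input

y_star = np.array([0.5, -0.3])           # desired network output
x = rng.normal(size=3)                   # initial guess of the input
lam, eta = 1e-2, 0.1                     # penalty coefficient and step size (assumed)
for _ in range(500):
    x -= eta * inversion_gradient(x, y_star, lam)

print("recovered input:", x, "network output:", forward(x)[0])
```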

Author(s):  
Arnošt Veselý

This chapter deals with applications of artificial neural networks to classification and regression problems. Based on theoretical analysis, it demonstrates that in classification problems one should use the cross-entropy error function rather than the usual sum-of-squares error function. Using the gradient descent method to find the minimum of the cross-entropy error function leads to the well-known error backpropagation scheme of gradient calculation if the output layer of the neural network uses neurons with logistic or softmax output functions. The author believes that understanding the underlying theory presented in this chapter will help researchers in medical informatics choose more suitable network architectures for medical applications and carry out network training more effectively.
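
The following sketch numerically checks the property underlying this recommendation: with softmax output neurons, the cross-entropy error has the simple pre-activation gradient y − t, which is what lets the standard backpropagation scheme go through unchanged. The toy values are assumptions for illustration only, not the chapter's code.

```python
import numpy as np

# Numerical check that dE/dz = softmax(z) - t for the cross-entropy error
# E(z) = -sum_k t_k log softmax(z)_k with a one-hot target t.

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(z, t):
    return -np.sum(t * np.log(softmax(z)))

rng = np.random.default_rng(1)
z = rng.normal(size=4)                     # pre-activations of the output layer
t = np.array([0.0, 1.0, 0.0, 0.0])         # one-hot class target

analytic = softmax(z) - t                  # claimed simple gradient
numeric = np.array([                       # central finite differences, coordinate by coordinate
    (cross_entropy(z + eps, t) - cross_entropy(z - eps, t)) / (2 * 1e-6)
    for eps in (1e-6 * np.eye(4))
])
print(np.allclose(analytic, numeric, atol=1e-5))   # True
```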


2014 ◽  
pp. 99-106
Author(s):  
Leonid Makhnist ◽  
Nikolaj Maniakov

Two new techniques for training multilayer neural networks are proposed. Both are based on the gradient descent method. For each technique, formulas for calculating the adaptive training steps are given. Matrix formulations are presented for both techniques, which greatly simplifies their implementation in software.


Author(s):  
Stefan Balluff ◽  
Jörg Bendfeld ◽  
Stefan Krauter

Gathering knowledge not only of the current but also of the upcoming wind speed is becoming more and more important as experience in operating and maintaining wind turbines grows. This is not only because of operation and maintenance tasks such as gearbox and generator checks, but also because energy providers have to sell the right amount of their converted energy on the European energy markets, so knowledge of the wind, and hence of the electrical power of the next day, is of key importance. Selling more energy than has been offered is penalized, as is delivering less energy than contractually promised. In addition, the price per offered kWh decreases in the case of an energy surplus. Various methods from computer science can be used to obtain such a forecast: fuzzy logic, linear prediction, or neural networks. This paper presents current results of wind speed forecasts using recurrent neural networks (RNN) and the gradient descent method together with a backpropagation learning algorithm. The data used have been extracted from NASA's Modern Era-Retrospective analysis for Research and Applications (MERRA), which is computed by a GEOS-5 Earth System Modeling and Data Assimilation system. The presented results show that wind speed can be forecasted using historical data for training the RNN. Nevertheless, the current set-up lacks robustness and can be further improved with regard to accuracy.
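
A toy sketch of the forecasting set-up described above: a small recurrent network is trained by gradient descent with one-step truncated backpropagation to make one-step-ahead forecasts of a synthetic wind-speed series. The synthetic data, network size and learning rate are assumptions and do not reflect the authors' MERRA-based experiments.

```python
import numpy as np

# One-step-ahead wind speed forecasting with a simple recurrent network,
# trained by (truncated) backpropagation and plain gradient descent.

rng = np.random.default_rng(7)
t = np.arange(400)
wind = 8 + 2 * np.sin(2 * np.pi * t / 48) + 0.3 * rng.standard_normal(t.size)
series = (wind - wind.mean()) / wind.std()        # normalised synthetic wind-speed series

H, eta = 8, 0.05                                  # hidden units and learning rate (assumed)
Wx = 0.1 * rng.normal(size=(H, 1))                # input-to-hidden weights
Wh = 0.1 * rng.normal(size=(H, H))                # recurrent weights
Wo = 0.1 * rng.normal(size=(1, H))                # hidden-to-output weights

for epoch in range(100):
    h = np.zeros((H, 1))
    gWx, gWh, gWo = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(Wo)
    for i in range(series.size - 1):
        x = np.array([[series[i]]])               # current wind speed as input
        target = series[i + 1]                    # next value to be forecast
        h_prev = h
        h = np.tanh(Wx @ x + Wh @ h_prev)
        err = (Wo @ h) - target                   # one-step-ahead forecast error
        gWo += err * h.T
        dh = (Wo.T * err) * (1 - h ** 2)          # backpropagate through the tanh layer
        gWx += dh @ x.T
        gWh += dh @ h_prev.T
    for W, g in ((Wx, gWx), (Wh, gWh), (Wo, gWo)):
        W -= eta * g / series.size                # batch gradient descent step

# Roll the trained network over the series once and report the forecast error.
h, errs = np.zeros((H, 1)), []
for i in range(series.size - 1):
    h = np.tanh(Wx @ np.array([[series[i]]]) + Wh @ h)
    errs.append((Wo @ h).item() - series[i + 1])
print("one-step-ahead RMSE (normalised units):", np.sqrt(np.mean(np.square(errs))))
```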


2015 ◽  
Vol 7 (2) ◽  
pp. 89-103 ◽  
Author(s):  
Jian Wang ◽  
Guoling Yang ◽  
Shan Liu ◽  
Jacek M. Zurada

Abstract The gradient descent method is one of the most popular methods for training feedforward neural networks. Batch and incremental modes are the two most common ways to practically implement gradient-based training for such networks. Furthermore, since generalization is an important property and quality criterion of a trained network, pruning algorithms with added regularization terms have been widely used as an efficient way to achieve good generalization. In this paper, we review the convergence properties and other performance aspects of recently studied training approaches based on different penalty terms. In addition, we present smoothing approximation techniques for the case where the penalty term is non-differentiable at the origin.
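
As an illustration of the smoothing idea mentioned above, the sketch below replaces |w| near the origin with a C1 quadratic surrogate so that an L1/2-type penalty becomes differentiable everywhere. The particular surrogate and threshold are assumptions, since the exact smoothing function differs from paper to paper.

```python
import numpy as np

# Smoothing a penalty that is non-differentiable at the origin:
# |w| is kept for |w| >= eps and replaced by a quadratic inside (-eps, eps),
# so the L1/2-style penalty sum_i |w_i|^(1/2) has a finite gradient everywhere.

def smoothed_abs(w, eps=1e-2):
    return np.where(np.abs(w) >= eps,
                    np.abs(w),
                    w**2 / (2 * eps) + eps / 2)   # C^1 surrogate near zero

def smoothed_l_half_penalty(w, eps=1e-2):
    return np.sum(smoothed_abs(w, eps) ** 0.5)

def smoothed_l_half_grad(w, eps=1e-2):
    a = smoothed_abs(w, eps)
    da = np.where(np.abs(w) >= eps, np.sign(w), w / eps)   # derivative of the surrogate
    return 0.5 * a ** (-0.5) * da

w = np.array([-0.8, -0.001, 0.0, 0.002, 1.5])
print(smoothed_l_half_penalty(w))
print(smoothed_l_half_grad(w))        # finite everywhere, including at w = 0
```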


Author(s):  
WANG XIANGDONG ◽  
WANG SHOUJUE

In this paper, we present a neural-network-based manufacturing process control system for semiconductor factories that improves die yield. A model based on neural networks is proposed to simulate the Very Large-Scale Integration (VLSI) manufacturing process. Learning from historical processing records with Radial Basis Function (RBF) networks, we model the functional relationship between the wafer probing parameters and the die yield. We then use a gradient descent method to search for a set of 'optimal' parameters that leads to the maximum yield of the model. Finally, we adjust the specification of the practical semiconductor manufacturing process accordingly. The average die yield increased from 51.7% to 57.5% after the system was applied at Huajing Corporation.
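
A minimal sketch of the two-stage idea, assuming a toy RBF yield model with made-up centers and weights (not Huajing's production data): the RBF model maps probing parameters to predicted yield, and gradient ascent on the model's input searches for a parameter setting with maximum predicted yield, which is equivalent to the gradient-descent search described above.

```python
import numpy as np

# Stage 1 (assumed already fitted): an RBF model of yield as a function of
# wafer-probing parameters.  Stage 2: gradient ascent on the input to find
# parameters that maximise the predicted yield.

rng = np.random.default_rng(2)
centers = rng.normal(size=(10, 4))        # RBF centers (stand-in for fitted values)
alphas = rng.uniform(0.3, 0.7, size=10)   # RBF weights (stand-in for fitted values)
sigma = 1.0

def predicted_yield(x):
    d2 = np.sum((centers - x) ** 2, axis=1)
    return alphas @ np.exp(-d2 / (2 * sigma**2))

def yield_gradient(x):
    d2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-d2 / (2 * sigma**2))
    return (alphas * phi) @ (centers - x) / sigma**2

x = rng.normal(size=4)                    # initial process parameters
eta = 0.05                                # step size (assumed)
for _ in range(200):
    x += eta * yield_gradient(x)          # ascend the predicted yield surface

print("suggested parameters:", x, "predicted yield:", predicted_yield(x))
```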


Author(s):  
Shilpa Verma ◽  
G. T. Thampi ◽  
Madhuri Rao

Forecasting the prices of financial assets, including gold, is of considerable importance for planning the economy. For centuries, people have held gold for many important reasons, such as smoothing inflation fluctuations, protection from economic crises, and sound investment. Forecasting gold prices is therefore an ever-important exercise undertaken by both individuals and groups. Various local, global, political, psychological and economic factors make such a forecast a complex problem. Data analysts have increasingly applied Artificial Intelligence (AI) techniques to make such forecasts. In the present work, an intercomparison of gold price forecasting in the Indian market is first carried out by employing a few classical Artificial Neural Network (ANN) techniques, namely the Gradient Descent Method (GDM), Resilient Backpropagation method (RP), Scaled Conjugate Gradient method (SCG), Levenberg-Marquardt method (LM), Bayesian Regularization method (BR), One Step Secant method (OSS) and BFGS Quasi-Newton method (BFG). Improvement in forecasting accuracy is achieved by proposing and developing a few modified GDM algorithms that incorporate different optimization functions in place of the standard quadratic error function of classical GDM. The optimization functions investigated in the present work are the Mean median error function (MMD), Cauchy error function (CCY), Minkowski error function (MKW), Log cosh error function (LCH) and Negative logarithmic likelihood function (NLG). The modified algorithms incorporating these optimization functions are referred to here as GDM_MMD, GDM_CCY, GDM_MKW, GDM_LCH and GDM_NLG respectively. Gold price forecasting is then carried out by employing these algorithms, and the results are analysed. The results of our study suggest that the forecasting efficiency improves considerably on applying the modified methods proposed by us.
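
To make the modification concrete, the sketch below trains a toy linear forecaster by plain gradient descent with the quadratic error replaced by the log-cosh error (in the spirit of the GDM_LCH variant). The synthetic data, model and learning rate are assumptions, and the other optimization functions can be swapped in by changing the error and its derivative.

```python
import numpy as np

# Gradient descent with the quadratic error replaced by the log-cosh error
# E(w) = sum_i log(cosh(x_i.w - y_i)); its derivative with respect to the
# residual r is tanh(r), which is what appears in the gradient below.

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))                 # synthetic lagged price features
true_w = np.array([0.5, -1.0, 0.3, 0.0, 0.8])
y = X @ true_w + 0.1 * rng.standard_normal(200)

w = np.zeros(5)
eta = 0.01                                    # learning rate (assumed)
for _ in range(2000):
    r = X @ w - y                             # forecast residuals
    grad = X.T @ np.tanh(r)                   # gradient of the log-cosh error
    w -= eta * grad / len(y)

print("estimated weights:", np.round(w, 3))
```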


2019 ◽  
Vol 9 (21) ◽  
pp. 4568
Author(s):  
Hyeyoung Park ◽  
Kwanyong Lee

The gradient descent method is an essential algorithm for the learning of neural networks. Among the diverse variations of gradient descent that have been developed to accelerate learning, natural gradient learning is based on the theory of information geometry on the stochastic neuromanifold and is known to have ideal convergence properties. Despite its theoretical advantages, the pure natural gradient has some limitations that prevent its practical use. To obtain the explicit value of the natural gradient, one needs to know the true probability distribution of the input variables and to invert a matrix whose dimension equals the number of parameters. Although an adaptive estimation of the natural gradient has been proposed as a solution, it was originally developed for the online learning mode, which is computationally inefficient for learning from large data sets. In this paper, we propose a novel adaptive natural gradient estimation for the mini-batch learning mode, which is commonly adopted for big data analysis. For two representative stochastic neural network models, we present explicit parameter update rules and the learning algorithm. Through experiments on three benchmark problems, we confirm that the proposed method has convergence properties superior to those of the conventional methods.
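
The sketch below conveys the general mini-batch natural gradient idea for a simple Gaussian regression model: a running estimate of the Fisher matrix is adapted from each mini-batch and its damped inverse preconditions the gradient step. It is a simplified illustration with assumed constants, not the paper's update rule for stochastic neural network models.

```python
import numpy as np

# Mini-batch natural gradient for the model p(y|x,w) = N(x.w, 1), whose Fisher
# matrix is E[x x^T]: the Fisher estimate F is adapted from each mini-batch and
# its (damped) inverse preconditions the ordinary gradient step.

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.standard_normal(1000)

w = np.zeros(3)
F = np.eye(3)                                       # running Fisher estimate
eta, eps, damping, batch = 0.5, 0.1, 1e-3, 32       # assumed constants

for step in range(200):
    idx = rng.choice(len(y), size=batch, replace=False)
    Xb, yb = X[idx], y[idx]
    grad = Xb.T @ (Xb @ w - yb) / batch             # mini-batch gradient of the squared error
    F = (1 - eps) * F + eps * (Xb.T @ Xb) / batch   # adaptive Fisher estimate
    w -= eta * np.linalg.solve(F + damping * np.eye(3), grad)

print("estimated weights:", np.round(w, 3))
```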


2007 ◽  
Vol 19 (12) ◽  
pp. 3356-3368 ◽  
Author(s):  
Yan Xiong ◽  
Wei Wu ◽  
Xidai Kang ◽  
Chao Zhang

A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most commonly used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is used for pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function, so as to increase the magnitude of the weight update increment when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.
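
The following sketch illustrates the difficulty the letter addresses (it is not the authors' adaptive penalty, whose exact form is given in the paper): in a pi-sigma network the output is a product of linear summing units, so the online gradient for one unit's weights is scaled by the product of all the other units' outputs, which makes the weight increment tiny when the weights are small.

```python
import numpy as np

# A pi-sigma network with three summing units: the output is the product of the
# unit outputs, and the online gradient for unit j is scaled by the product of
# the *other* units' outputs, which vanishes for small initial weights.

rng = np.random.default_rng(5)

def pi_sigma_forward(W, x):
    s = W @ x                                 # outputs of the summing units
    return np.prod(s), s

def online_gradient(W, x, t):
    y, s = pi_sigma_forward(W, x)
    grads = np.empty_like(W)
    for j in range(W.shape[0]):
        others = np.prod(np.delete(s, j))     # product of the other summing units
        grads[j] = (y - t) * others * x       # d(0.5*(y - t)^2)/dW_j
    return grads

W = 0.01 * rng.normal(size=(3, 4))            # small initial weights
x, t = rng.normal(size=4), 1.0
g = online_gradient(W, x, t)
print("norm of weight increment:", np.linalg.norm(g))   # tiny, hence slow convergence
```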


1998 ◽  
Vol 35 (02) ◽  
pp. 395-406 ◽  
Author(s):  
Jürgen Dippon

A stochastic gradient descent method is combined with a consistent auxiliary estimate to achieve global convergence of the recursion. Using step lengths that converge to zero more slowly than 1/n and averaging the trajectories yields the optimal convergence rate of 1/√n and the optimal variance of the asymptotic distribution. Possible applications can be found in maximum likelihood estimation, regression analysis, training of artificial neural networks, and stochastic optimization.
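
A minimal sketch of the averaging idea on a toy quadratic problem (the consistent auxiliary estimate that gives global convergence is omitted): stochastic gradient descent is run with step lengths a_n = c·n^(−γ), γ ∈ (1/2, 1), i.e. decaying to zero more slowly than 1/n, and the average of the iterates is reported alongside the last iterate. The constants are assumptions for illustration.

```python
import numpy as np

# Averaged stochastic gradient descent on 0.5*||theta - theta_star||^2 with
# noisy gradient observations; the averaged trajectory is typically much closer
# to theta_star than the last iterate.

rng = np.random.default_rng(6)
theta_star = np.array([2.0, -1.0])

def noisy_gradient(theta):
    return (theta - theta_star) + rng.standard_normal(2)

theta = np.zeros(2)
running_sum = np.zeros(2)
c, gamma, n_steps = 1.0, 2 / 3, 20000     # step lengths a_n = c * n**(-gamma)

for n in range(1, n_steps + 1):
    theta -= c * n ** (-gamma) * noisy_gradient(theta)
    running_sum += theta

theta_bar = running_sum / n_steps
print("last iterate :", np.round(theta, 3))
print("averaged     :", np.round(theta_bar, 3))
```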

