Convergence Analysis of Inverse Iterative Neural Networks with L2 Penalty

2016 ◽  
Vol 8 (2) ◽  
pp. 85-98
Author(s):  
Yanqing Wen ◽  
Jian Wang ◽  
Bingjia Huang ◽  
Jacek M. Zurada

Abstract The iterative inversion of neural networks has been used to solve adaptive control problems owing to its strong information-processing performance. In this paper, an iterative inversion neural network with an L2 penalty term is presented and trained with the classical gradient descent method. We focus on the theoretical analysis of the proposed algorithm: monotonicity of the error function, boundedness of the input sequences, and weak (strong) convergence behavior. For the boundedness property, we rigorously prove that the feasible input solutions are restricted to a measurable region. Weak convergence means that the gradient of the error function with respect to the input tends to zero as the number of iterations goes to infinity, while strong convergence means that the iterative sequence of input vectors converges to a fixed optimal point.
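
As a rough illustration of the scheme described above, the following sketch inverts a small, fixed single-hidden-layer network by gradient descent on the input with an L2 penalty. The network weights, sizes, penalty coefficient and step size are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Minimal sketch of iterative network inversion with an L2 penalty:
# given a "trained" network f and a target output y*, update the input x by
# gradient descent on  E(x) = 0.5*||f(x) - y*||^2 + 0.5*lam*||x||^2 .

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 3)), np.zeros(8)   # assumed trained hidden-layer weights
W2, b2 = rng.normal(size=(2, 8)), np.zeros(2)   # assumed trained output-layer weights

def forward(x):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

def inversion_gradient(x, y_star, lam):
    y, h = forward(x)
    r = y - y_star                       # output residual
    dh = (W2.T @ r) * (1.0 - h**2)       # back-propagate through the tanh layer
    return W1.T @ dh + lam * x           # gradient of E with respect to the input

y_star = np.array([0.5, -0.3])           # desired network output
x = rng.normal(size=3)                   # initial guess of the input
lam, eta = 1e-2, 0.1                     # penalty coefficient and step size (assumed)
for _ in range(500):
    x -= eta * inversion_gradient(x, y_star, lam)

print("recovered input:", x, "network output:", forward(x)[0])
```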

Author(s):  
Arnošt Veselý

This chapter deals with applications of artificial neural networks to classification and regression problems. Based on theoretical analysis, it demonstrates that in classification problems one should use the cross-entropy error function rather than the usual sum-of-squares error function. Using the gradient descent method to find the minimum of the cross-entropy error function leads to the well-known error backpropagation scheme of gradient calculation if the output layer of the neural network uses neurons with logistic or softmax output functions. The author believes that understanding the underlying theory presented in this chapter will help researchers in medical informatics choose more suitable network architectures for medical applications and carry out network training more effectively.
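
The following sketch numerically checks the property underlying this recommendation: with softmax output neurons, the cross-entropy error has the simple pre-activation gradient y − t, which is what lets the standard backpropagation scheme go through unchanged. The toy values are assumptions for illustration only, not the chapter's code.

```python
import numpy as np

# Numerical check that dE/dz = softmax(z) - t for the cross-entropy error
# E(z) = -sum_k t_k log softmax(z)_k with a one-hot target t.

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(z, t):
    return -np.sum(t * np.log(softmax(z)))

rng = np.random.default_rng(1)
z = rng.normal(size=4)                     # pre-activations of the output layer
t = np.array([0.0, 1.0, 0.0, 0.0])         # one-hot class target

analytic = softmax(z) - t                  # claimed simple gradient
numeric = np.array([                       # central finite differences, coordinate by coordinate
    (cross_entropy(z + eps, t) - cross_entropy(z - eps, t)) / (2 * 1e-6)
    for eps in (1e-6 * np.eye(4))
])
print(np.allclose(analytic, numeric, atol=1e-5))   # True
```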


2014 ◽  
pp. 99-106
Author(s):  
Leonid Makhnist ◽  
Nikolaj Maniakov

Two new techniques for training multilayer neural networks are proposed. Both are based on the gradient descent method. For each technique, formulas for calculating the adaptive training steps are given. Matrix formulations are presented for both techniques, which greatly simplifies their implementation in software.


Author(s):  
Stefan Balluff ◽  
Jörg Bendfeld ◽  
Stefan Krauter

Gathering knowledge not only of the current but also of the upcoming wind speed is becoming more and more important as experience in operating and maintaining wind turbines grows. This is not only because of operation and maintenance tasks such as gearbox and generator checks, but also because energy providers have to sell the right amount of their converted energy on the European energy markets, so knowledge of the wind, and hence of the electrical power of the next day, is of key importance. Selling more energy than has been offered is penalized, as is delivering less energy than contractually promised. In addition, the price per offered kWh decreases in the case of an energy surplus. Various methods from computer science can be used to obtain such a forecast: fuzzy logic, linear prediction, or neural networks. This paper presents current results of wind speed forecasts using recurrent neural networks (RNN) and the gradient descent method together with a backpropagation learning algorithm. The data used have been extracted from NASA's Modern Era-Retrospective analysis for Research and Applications (MERRA), which is computed by a GEOS-5 Earth System Modeling and Data Assimilation system. The presented results show that wind speed can be forecasted using historical data for training the RNN. Nevertheless, the current set-up lacks robustness and can be further improved with regard to accuracy.
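
A toy sketch of the forecasting set-up described above: a small recurrent network is trained by gradient descent with one-step truncated backpropagation to make one-step-ahead forecasts of a synthetic wind-speed series. The synthetic data, network size and learning rate are assumptions and do not reflect the authors' MERRA-based experiments.

```python
import numpy as np

# One-step-ahead wind speed forecasting with a simple recurrent network,
# trained by (truncated) backpropagation and plain gradient descent.

rng = np.random.default_rng(7)
t = np.arange(400)
wind = 8 + 2 * np.sin(2 * np.pi * t / 48) + 0.3 * rng.standard_normal(t.size)
series = (wind - wind.mean()) / wind.std()        # normalised synthetic wind-speed series

H, eta = 8, 0.05                                  # hidden units and learning rate (assumed)
Wx = 0.1 * rng.normal(size=(H, 1))                # input-to-hidden weights
Wh = 0.1 * rng.normal(size=(H, H))                # recurrent weights
Wo = 0.1 * rng.normal(size=(1, H))                # hidden-to-output weights

for epoch in range(100):
    h = np.zeros((H, 1))
    gWx, gWh, gWo = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(Wo)
    for i in range(series.size - 1):
        x = np.array([[series[i]]])               # current wind speed as input
        target = series[i + 1]                    # next value to be forecast
        h_prev = h
        h = np.tanh(Wx @ x + Wh @ h_prev)
        err = (Wo @ h) - target                   # one-step-ahead forecast error
        gWo += err * h.T
        dh = (Wo.T * err) * (1 - h ** 2)          # backpropagate through the tanh layer
        gWx += dh @ x.T
        gWh += dh @ h_prev.T
    for W, g in ((Wx, gWx), (Wh, gWh), (Wo, gWo)):
        W -= eta * g / series.size                # batch gradient descent step

# Roll the trained network over the series once and report the forecast error.
h, errs = np.zeros((H, 1)), []
for i in range(series.size - 1):
    h = np.tanh(Wx @ np.array([[series[i]]]) + Wh @ h)
    errs.append((Wo @ h).item() - series[i + 1])
print("one-step-ahead RMSE (normalised units):", np.sqrt(np.mean(np.square(errs))))
```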


2015 ◽  
Vol 7 (2) ◽  
pp. 89-103 ◽  
Author(s):  
Jian Wang ◽  
Guoling Yang ◽  
Shan Liu ◽  
Jacek M. Zurada

Abstract The gradient descent method is one of the most popular methods for training feedforward neural networks. Batch and incremental modes are the two most common ways to practically implement gradient-based training for such networks. Furthermore, since generalization is an important property and quality criterion of a trained network, pruning algorithms with added regularization terms have been widely used as an efficient way to achieve good generalization. In this paper, we review the convergence properties and other performance aspects of recently studied training approaches based on different penalty terms. In addition, we present smoothing approximation techniques for the case where the penalty term is non-differentiable at the origin.
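
As an illustration of the smoothing idea mentioned above, the sketch below replaces |w| near the origin with a C1 quadratic surrogate so that an L1/2-type penalty becomes differentiable everywhere. The particular surrogate and threshold are assumptions, since the exact smoothing function differs from paper to paper.

```python
import numpy as np

# Smoothing a penalty that is non-differentiable at the origin:
# |w| is kept for |w| >= eps and replaced by a quadratic inside (-eps, eps),
# so the L1/2-style penalty sum_i |w_i|^(1/2) has a finite gradient everywhere.

def smoothed_abs(w, eps=1e-2):
    return np.where(np.abs(w) >= eps,
                    np.abs(w),
                    w**2 / (2 * eps) + eps / 2)   # C^1 surrogate near zero

def smoothed_l_half_penalty(w, eps=1e-2):
    return np.sum(smoothed_abs(w, eps) ** 0.5)

def smoothed_l_half_grad(w, eps=1e-2):
    a = smoothed_abs(w, eps)
    da = np.where(np.abs(w) >= eps, np.sign(w), w / eps)   # derivative of the surrogate
    return 0.5 * a ** (-0.5) * da

w = np.array([-0.8, -0.001, 0.0, 0.002, 1.5])
print(smoothed_l_half_penalty(w))
print(smoothed_l_half_grad(w))        # finite everywhere, including at w = 0
```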


Author(s):  
WANG XIANGDONG ◽  
WANG SHOUJUE

In this paper, we present a neural-network-based manufacturing process control system for semiconductor factories that improves die yield. A model based on neural networks is proposed to simulate the Very Large-Scale Integration (VLSI) manufacturing process. Learning from historical processing records with Radial Basis Function (RBF) networks, we model the functional relationship between the wafer probing parameters and the die yield. We then use a gradient descent method to search for a set of 'optimal' parameters that leads to the maximum yield of the model. Finally, we adjust the specification of the practical semiconductor manufacturing process accordingly. The average die yield increased from 51.7% to 57.5% after the system was applied at Huajing Corporation.
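
A minimal sketch of the two-stage idea, assuming a toy RBF yield model with made-up centers and weights (not Huajing's production data): the RBF model maps probing parameters to predicted yield, and gradient ascent on the model's input searches for a parameter setting with maximum predicted yield, which is equivalent to the gradient-descent search described above.

```python
import numpy as np

# Stage 1 (assumed already fitted): an RBF model of yield as a function of
# wafer-probing parameters.  Stage 2: gradient ascent on the input to find
# parameters that maximise the predicted yield.

rng = np.random.default_rng(2)
centers = rng.normal(size=(10, 4))        # RBF centers (stand-in for fitted values)
alphas = rng.uniform(0.3, 0.7, size=10)   # RBF weights (stand-in for fitted values)
sigma = 1.0

def predicted_yield(x):
    d2 = np.sum((centers - x) ** 2, axis=1)
    return alphas @ np.exp(-d2 / (2 * sigma**2))

def yield_gradient(x):
    d2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-d2 / (2 * sigma**2))
    return (alphas * phi) @ (centers - x) / sigma**2

x = rng.normal(size=4)                    # initial process parameters
eta = 0.05                                # step size (assumed)
for _ in range(200):
    x += eta * yield_gradient(x)          # ascend the predicted yield surface

print("suggested parameters:", x, "predicted yield:", predicted_yield(x))
```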


Author(s):  
Shilpa Verma ◽  
G. T. Thampi ◽  
Madhuri Rao

Forecasting the prices of financial assets, including gold, is of considerable importance for planning the economy. For centuries, people have held gold for many important reasons, such as smoothing inflation fluctuations, protection from economic crises, and sound investment. Forecasting gold prices is therefore an ever-important exercise undertaken by both individuals and groups. Various local, global, political, psychological and economic factors make such a forecast a complex problem. Data analysts have increasingly applied Artificial Intelligence (AI) techniques to make such forecasts. In the present work, an intercomparison of gold price forecasting in the Indian market is first carried out by employing a few classical Artificial Neural Network (ANN) techniques, namely the Gradient Descent Method (GDM), Resilient Backpropagation method (RP), Scaled Conjugate Gradient method (SCG), Levenberg-Marquardt method (LM), Bayesian Regularization method (BR), One Step Secant method (OSS) and BFGS Quasi-Newton method (BFG). Improvement in forecasting accuracy is achieved by proposing and developing a few modified GDM algorithms that incorporate different optimization functions in place of the standard quadratic error function of classical GDM. The optimization functions investigated in the present work are the Mean median error function (MMD), Cauchy error function (CCY), Minkowski error function (MKW), Log cosh error function (LCH) and Negative logarithmic likelihood function (NLG). The modified algorithms incorporating these optimization functions are referred to here as GDM_MMD, GDM_CCY, GDM_MKW, GDM_LCH and GDM_NLG respectively. Gold price forecasting is then carried out by employing these algorithms, and the results are analysed. The results of our study suggest that the forecasting efficiency improves considerably on applying the modified methods proposed by us.
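
To make the modification concrete, the sketch below trains a toy linear forecaster by plain gradient descent with the quadratic error replaced by the log-cosh error (in the spirit of the GDM_LCH variant). The synthetic data, model and learning rate are assumptions, and the other optimization functions can be swapped in by changing the error and its derivative.

```python
import numpy as np

# Gradient descent with the quadratic error replaced by the log-cosh error
# E(w) = sum_i log(cosh(x_i.w - y_i)); its derivative with respect to the
# residual r is tanh(r), which is what appears in the gradient below.

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))                 # synthetic lagged price features
true_w = np.array([0.5, -1.0, 0.3, 0.0, 0.8])
y = X @ true_w + 0.1 * rng.standard_normal(200)

w = np.zeros(5)
eta = 0.01                                    # learning rate (assumed)
for _ in range(2000):
    r = X @ w - y                             # forecast residuals
    grad = X.T @ np.tanh(r)                   # gradient of the log-cosh error
    w -= eta * grad / len(y)

print("estimated weights:", np.round(w, 3))
```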


2019 ◽  
Vol 9 (21) ◽  
pp. 4568
Author(s):  
Hyeyoung Park ◽  
Kwanyong Lee

The gradient descent method is an essential algorithm for the learning of neural networks. Among the diverse variations of gradient descent that have been developed to accelerate learning, natural gradient learning is based on the theory of information geometry on the stochastic neuromanifold and is known to have ideal convergence properties. Despite its theoretical advantages, the pure natural gradient has some limitations that prevent its practical use. To obtain the explicit value of the natural gradient, one needs to know the true probability distribution of the input variables and to invert a matrix whose dimension equals the number of parameters. Although an adaptive estimation of the natural gradient has been proposed as a solution, it was originally developed for the online learning mode, which is computationally inefficient for learning from large data sets. In this paper, we propose a novel adaptive natural gradient estimation for the mini-batch learning mode, which is commonly adopted for big data analysis. For two representative stochastic neural network models, we present explicit parameter update rules and the learning algorithm. Through experiments on three benchmark problems, we confirm that the proposed method has convergence properties superior to those of the conventional methods.
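
The sketch below conveys the general mini-batch natural gradient idea for a simple Gaussian regression model: a running estimate of the Fisher matrix is adapted from each mini-batch and its damped inverse preconditions the gradient step. It is a simplified illustration with assumed constants, not the paper's update rule for stochastic neural network models.

```python
import numpy as np

# Mini-batch natural gradient for the model p(y|x,w) = N(x.w, 1), whose Fisher
# matrix is E[x x^T]: the Fisher estimate F is adapted from each mini-batch and
# its (damped) inverse preconditions the ordinary gradient step.

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.standard_normal(1000)

w = np.zeros(3)
F = np.eye(3)                                       # running Fisher estimate
eta, eps, damping, batch = 0.5, 0.1, 1e-3, 32       # assumed constants

for step in range(200):
    idx = rng.choice(len(y), size=batch, replace=False)
    Xb, yb = X[idx], y[idx]
    grad = Xb.T @ (Xb @ w - yb) / batch             # mini-batch gradient of the squared error
    F = (1 - eps) * F + eps * (Xb.T @ Xb) / batch   # adaptive Fisher estimate
    w -= eta * np.linalg.solve(F + damping * np.eye(3), grad)

print("estimated weights:", np.round(w, 3))
```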


2007 ◽  
Vol 19 (12) ◽  
pp. 3356-3368 ◽  
Author(s):  
Yan Xiong ◽  
Wei Wu ◽  
Xidai Kang ◽  
Chao Zhang

A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most commonly used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is used for pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function, so as to increase the magnitude of the weight update increment when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.
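
The following sketch illustrates the difficulty the letter addresses (it is not the authors' adaptive penalty, whose exact form is given in the paper): in a pi-sigma network the output is a product of linear summing units, so the online gradient for one unit's weights is scaled by the product of all the other units' outputs, which makes the weight increment tiny when the weights are small.

```python
import numpy as np

# A pi-sigma network with three summing units: the output is the product of the
# unit outputs, and the online gradient for unit j is scaled by the product of
# the *other* units' outputs, which vanishes for small initial weights.

rng = np.random.default_rng(5)

def pi_sigma_forward(W, x):
    s = W @ x                                 # outputs of the summing units
    return np.prod(s), s

def online_gradient(W, x, t):
    y, s = pi_sigma_forward(W, x)
    grads = np.empty_like(W)
    for j in range(W.shape[0]):
        others = np.prod(np.delete(s, j))     # product of the other summing units
        grads[j] = (y - t) * others * x       # d(0.5*(y - t)^2)/dW_j
    return grads

W = 0.01 * rng.normal(size=(3, 4))            # small initial weights
x, t = rng.normal(size=4), 1.0
g = online_gradient(W, x, t)
print("norm of weight increment:", np.linalg.norm(g))   # tiny, hence slow convergence
```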


1998 ◽  
Vol 35 (02) ◽  
pp. 395-406 ◽  
Author(s):  
Jürgen Dippon

A stochastic gradient descent method is combined with a consistent auxiliary estimate to achieve global convergence of the recursion. Using step lengths that converge to zero more slowly than 1/n and averaging the trajectories yields the optimal convergence rate of 1/√n and the optimal variance of the asymptotic distribution. Possible applications can be found in maximum likelihood estimation, regression analysis, training of artificial neural networks, and stochastic optimization.
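
A minimal sketch of the averaging idea on a toy quadratic problem (the consistent auxiliary estimate that gives global convergence is omitted): stochastic gradient descent is run with step lengths a_n = c·n^(−γ), γ ∈ (1/2, 1), i.e. decaying to zero more slowly than 1/n, and the average of the iterates is reported alongside the last iterate. The constants are assumptions for illustration.

```python
import numpy as np

# Averaged stochastic gradient descent on 0.5*||theta - theta_star||^2 with
# noisy gradient observations; the averaged trajectory is typically much closer
# to theta_star than the last iterate.

rng = np.random.default_rng(6)
theta_star = np.array([2.0, -1.0])

def noisy_gradient(theta):
    return (theta - theta_star) + rng.standard_normal(2)

theta = np.zeros(2)
running_sum = np.zeros(2)
c, gamma, n_steps = 1.0, 2 / 3, 20000     # step lengths a_n = c * n**(-gamma)

for n in range(1, n_steps + 1):
    theta -= c * n ** (-gamma) * noisy_gradient(theta)
    running_sum += theta

theta_bar = running_sum / n_steps
print("last iterate :", np.round(theta, 3))
print("averaged     :", np.round(theta_bar, 3))
```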

