Adaptive Natural Gradient Method for Learning of Stochastic Neural Networks in Mini-Batch Mode

2019 ◽  
Vol 9 (21) ◽  
pp. 4568
Author(s):  
Hyeyoung Park ◽  
Kwanyong Lee

The gradient descent method is an essential algorithm for training neural networks. Among the many variants of gradient descent developed to accelerate learning, natural gradient learning is based on the theory of information geometry on the stochastic neuromanifold and is known to have ideal convergence properties. Despite its theoretical advantages, the pure natural gradient has limitations that prevent its practical use: obtaining its explicit value requires knowing the true probability distribution of the input variables and inverting a matrix whose dimension equals the number of parameters. Although an adaptive estimation of the natural gradient has been proposed as a solution, it was originally developed for the online learning mode, which is computationally inefficient for learning from large data sets. In this paper, we propose a novel adaptive natural gradient estimation for the mini-batch learning mode, which is commonly adopted for big data analysis. For two representative stochastic neural network models, we present explicit parameter update rules and the corresponding learning algorithm. Through experiments on three benchmark problems, we confirm that the proposed method has convergence properties superior to those of conventional methods.
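As a rough illustration of the idea, the sketch below implements a mini-batch natural gradient step with an Amari-style recursive estimate of the inverse Fisher matrix. The function name, step sizes, and batch-averaging scheme are illustrative assumptions; the paper derives explicit, model-specific update rules that are not reproduced here.

```python
import numpy as np

def adaptive_natural_gradient_step(theta, grads, G_inv, eta=0.01, eps=0.001):
    """One mini-batch natural gradient step with a recursive estimate of
    the inverse Fisher matrix (illustrative; not the paper's exact rules).

    theta : (d,) current parameter vector
    grads : (B, d) per-example loss gradients for the mini-batch
    G_inv : (d, d) running estimate of the inverse Fisher matrix
    """
    # Recursive inverse-Fisher update applied per example:
    #   G_inv <- (1 + eps) * G_inv - eps * (G_inv g)(G_inv g)^T
    for g in grads:
        v = G_inv @ g
        G_inv = (1.0 + eps) * G_inv - eps * np.outer(v, v)
    g_bar = grads.mean(axis=0)              # mini-batch gradient
    theta = theta - eta * (G_inv @ g_bar)   # preconditioned (natural) step
    return theta, G_inv
```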

2014 ◽  
pp. 99-106
Author(s):  
Leonid Makhnist ◽  
Nikolaj Maniakov

Two new techniques for training multilayer neural networks are proposed. Their basic concept is based on the gradient descent method. For each technique, formulas for calculating the adaptive training steps are given. The matrix algorithmizations presented for these techniques are helpful in their program implementation.
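The abstract does not reproduce the step-length formulas themselves, so the sketch below uses the Barzilai-Borwein rule purely as a stand-in example of an adaptive training step computed from gradient and parameter shifts.

```python
import numpy as np

def gd_adaptive_step(grad_f, x, n_iter=100, alpha0=0.1):
    """Gradient descent with an adaptive step length (Barzilai-Borwein
    rule as an illustrative stand-in, not the paper's formulas)."""
    g_prev = grad_f(x)
    x_prev = x.copy()
    x = x - alpha0 * g_prev                  # first step uses a fixed length
    for _ in range(n_iter):
        g = grad_f(x)
        s, y = x - x_prev, g - g_prev        # parameter and gradient shifts
        alpha = (s @ s) / (s @ y + 1e-12)    # adaptive step length
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Usage on a simple quadratic f(x) = x.x with minimum at the origin:
x_min = gd_adaptive_step(lambda x: 2.0 * x, np.array([3.0, -4.0]))
```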


Author(s):  
Stefan Balluff ◽  
Jörg Bendfeld ◽  
Stefan Krauter

Gathering knowledge not only of the current but also of the upcoming wind speed is becoming more and more important as experience in operating and maintaining wind turbines grows. This matters not only for operation and maintenance tasks such as gearbox and generator checks, but also because energy providers have to sell the right amount of their converted energy on the European energy markets: knowledge of the wind, and hence of the electrical power of the next day, is of key importance. Selling more energy than has been offered is penalized, as is offering less energy than contractually promised. In addition, the price per offered kWh decreases in case of a surplus of energy. Various methods from computer science are available for producing such a forecast: fuzzy logic, linear prediction, and neural networks. This paper presents current results of wind speed forecasts using recurrent neural networks (RNN) trained with the gradient descent method and a backpropagation learning algorithm. The data used have been extracted from NASA's Modern Era-Retrospective analysis for Research and Applications (MERRA), which is produced by a GEOS-5 Earth System Modeling and Data Assimilation system. The presented results show that wind speed can be forecasted by training the RNN on historical data. Nevertheless, the current setup lacks robustness and can be further improved with regard to accuracy.
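As a hedged sketch of this kind of setup, the NumPy script below trains a small Elman-style RNN with truncated backpropagation through time to make one-step-ahead predictions. A synthetic periodic series stands in for the MERRA wind-speed data, and all sizes and rates are illustrative guesses rather than the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a MERRA wind-speed series (illustrative only).
t = np.arange(2000)
wind = 0.5 + 0.3 * np.sin(2 * np.pi * t / 48) + 0.05 * rng.standard_normal(t.size)

H, eta, T = 16, 0.05, 24                    # hidden size, learning rate, window
Wx = rng.normal(0, 0.3, (H, 1))             # input-to-hidden weights
Wh = rng.normal(0, 0.3, (H, H))             # recurrent weights
Wo = rng.normal(0, 0.3, (1, H))             # hidden-to-output weights

for epoch in range(20):
    for start in range(0, wind.size - T - 1, T):
        xs = wind[start:start + T]
        ys = wind[start + 1:start + T + 1]  # one-step-ahead targets
        h = np.zeros((H, 1)); hs = []; zs = []
        for x in xs:                        # forward pass through the window
            h = np.tanh(Wx * x + Wh @ h)
            hs.append(h); zs.append((Wo @ h)[0, 0])
        dWx = np.zeros_like(Wx); dWh = np.zeros_like(Wh); dWo = np.zeros_like(Wo)
        dh_next = np.zeros((H, 1))
        for k in reversed(range(T)):        # backpropagation through time
            dz = zs[k] - ys[k]              # d(squared error)/d(output)
            dWo += dz * hs[k].T
            dh = Wo.T * dz + dh_next
            dpre = (1.0 - hs[k] ** 2) * dh  # tanh derivative
            dWx += dpre * xs[k]
            h_prev = hs[k - 1] if k > 0 else np.zeros((H, 1))
            dWh += dpre @ h_prev.T
            dh_next = Wh.T @ dpre
        for W, dW in ((Wx, dWx), (Wh, dWh), (Wo, dWo)):
            W -= eta * dW / T               # plain gradient descent update
```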


2018 ◽  
Vol 98 (2) ◽  
pp. 331-338 ◽  
Author(s):  
STEFAN PANIĆ ◽  
MILENA J. PETROVIĆ ◽  
MIROSLAVA MIHAJLOV CAREVIĆ

We improve the convergence properties of the iterative scheme for solving unconstrained optimisation problems introduced in Petrovic et al. [‘Hybridization of accelerated gradient descent method’, Numer. Algorithms (2017), doi:10.1007/s11075-017-0460-4] by optimising the value of the initial step length parameter in the backtracking line search procedure. We prove the validity of the algorithm and illustrate its advantages by numerical experiments and comparisons.
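To make the role of the initial step length concrete, the snippet below shows a standard Armijo backtracking line search; the fixed default alpha0 is a placeholder for the optimised initial value derived in the paper, whose formula is not reproduced in the abstract.

```python
import numpy as np

def backtracking(f, grad_f, x, d, alpha0=1.0, beta=0.5, sigma=1e-4):
    """Armijo backtracking line search. The initial step length alpha0 is
    exactly the parameter the paper optimises; a fixed default stands in
    for the optimised value here."""
    alpha, g = alpha0, grad_f(x)
    while f(x + alpha * d) > f(x) + sigma * alpha * (g @ d):
        alpha *= beta                       # shrink until Armijo condition holds
    return alpha

# Usage: one descent step on a simple quadratic.
f = lambda x: 0.5 * x @ x
grad_f = lambda x: x
x = np.array([3.0, -4.0])
d = -grad_f(x)                              # descent direction
x_new = x + backtracking(f, grad_f, x, d) * d
```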


1997 ◽  
Vol 9 (7) ◽  
pp. 1457-1482 ◽  
Author(s):  
Howard Hua Yang ◽  
Shun-ichi Amari

There are two major approaches to blind separation: maximum entropy (ME) and minimum mutual information (MMI). Both can be implemented by the stochastic gradient descent method for obtaining the demixing matrix. The MI is the contrast function for blind separation; the entropy is not. To justify the ME, the relation between ME and MMI is first elucidated by calculating the first derivative of the entropy and proving that mean subtraction is necessary in applying the ME, and that at the solution points determined by the MI, the ME will not update the demixing matrix in directions that increase the cross-talking. Second, the natural gradient, instead of the ordinary gradient, is introduced to obtain efficient algorithms, because the parameter space is a Riemannian space consisting of matrices. The mutual information is calculated by applying the Gram-Charlier expansion to approximate the probability density functions of the outputs. Finally, we propose an efficient learning algorithm that incorporates an adaptive method of estimating the unknown cumulants. It is shown by computer simulation that the convergence of the stochastic descent algorithms is improved by using the natural gradient and the adaptively estimated cumulants.
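A minimal sketch of the natural gradient update for blind separation is shown below, using a fixed cubic nonlinearity in place of the adaptively estimated cumulant-based one; the sources, mixing matrix, and learning rate are synthetic illustrations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two independent sub-Gaussian sources mixed by an unknown matrix.
S = rng.uniform(-1.0, 1.0, (2, 5000))
A = rng.normal(size=(2, 2))
X = A @ S                                   # observed mixtures

W = np.eye(2)                               # demixing matrix estimate
eta = 0.01
phi = lambda y: y ** 3                      # fixed odd nonlinearity; the paper
                                            # adapts it via estimated cumulants

for x in X.T:                               # stochastic natural gradient descent
    y = W @ x
    W += eta * (np.eye(2) - np.outer(phi(y), y)) @ W

print(W @ A)   # ~ scaled permutation matrix if separation has succeeded
```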


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2253
Author(s):  
Xiao Wang ◽  
Peng Shi ◽  
Yushan Zhao ◽  
Yue Sun

In order to help the pursuer find an advantageous control policy in a one-to-one game in space, this paper proposes an innovative pre-trained fuzzy reinforcement learning algorithm, which is conducted in the x, y, and z channels separately. In contrast to previous algorithms applied to ground games, this is the first time reinforcement learning has been introduced to help a pursuer in space optimize its control policy. The known part of the environment is used to pre-train the pursuer's consequent set before learning. An actor-critic framework is built in each moving channel of the pursuer, and the consequent set is updated through the gradient descent method in the fuzzy inference systems. Numerical experiments validate the effectiveness of the proposed algorithm in improving the game ability of the pursuer.
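As a loose sketch of the consequent-set update, the function below applies a gradient-descent step to the consequents of a zero-order Takagi-Sugeno fuzzy system driven by a temporal-difference error; the interface and the zero-order simplification are assumptions, not the paper's exact formulation.

```python
import numpy as np

def update_consequents(theta, firing, td_error, eta=0.05):
    """One gradient-descent step on the consequent set of a zero-order
    TSK fuzzy system (illustrative simplification). The system output is
    the firing-strength-weighted average of the consequents, so the
    gradient of the output w.r.t. each consequent is its normalised
    firing strength.

    theta    : (R,) consequent values, one per fuzzy rule
    firing   : (R,) rule firing strengths for the current state
    td_error : scalar temporal-difference error from the critic
    """
    w = firing / firing.sum()               # normalised firing strengths
    return theta + eta * td_error * w       # move output toward the TD target

# Hypothetical usage in one of the x/y/z channels:
theta = np.zeros(9)
theta = update_consequents(theta, np.full(9, 0.1), td_error=0.4)
```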


1998 ◽  
Vol 35 (02) ◽  
pp. 395-406 ◽  
Author(s):  
Jürgen Dippon

A stochastic gradient descent method is combined with a consistent auxiliary estimate to achieve global convergence of the recursion. Using step lengths converging to zero more slowly than 1/n and averaging the trajectories yields the optimal convergence rate of 1/√n and the optimal variance of the asymptotic distribution. Possible applications can be found in maximum likelihood estimation, regression analysis, training of artificial neural networks, and stochastic optimization.
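A minimal simulation of this recipe, for a one-dimensional stochastic approximation problem, is sketched below: step lengths proportional to n^(-gamma) with 1/2 < gamma < 1 decay more slowly than 1/n, and the averaged trajectory attains the 1/√n rate. The problem and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

theta_star = 2.0                  # unknown minimiser to be estimated
theta, theta_bar = 0.0, 0.0
gamma = 0.75                      # step exponent, 1/2 < gamma < 1

for n in range(1, 100001):
    grad = (theta - theta_star) + rng.standard_normal()  # noisy gradient
    theta -= grad / n ** gamma    # step length n^(-gamma), slower than 1/n
    theta_bar += (theta - theta_bar) / n                 # average the trajectory

print(theta_bar)                  # close to theta_star, error ~ 1/sqrt(n)
```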


Author(s):  
Kseniia Bazilevych ◽  
Ievgen Meniailov ◽  
Dmytro Chumachenko

Subject: the use of the mathematical apparatus of neural networks for the scientific substantiation of anti-epidemic measures, in order to reduce disease incidence when making effective management decisions.
Purpose: to apply cluster analysis, based on a neural network, to the problem of identifying incidence zones.
Tasks: to analyze data analysis methods for solving the clustering problem; to develop a neural network method for clustering the territory of Ukraine according to the nature of the COVID-19 epidemic process; and, on the basis of the developed method, to implement a data analysis software product that identifies disease incidence zones, using the COVID-19 coronavirus as an example.
Methods: models and methods of data analysis, models and methods of systems theory (based on the information approach), machine learning methods, in particular the Adaptive Boosting method (based on the gradient descent method), and methods for training neural networks.
Results: we used data from the Center for Public Health of the Ministry of Health of Ukraine, distributed over the regions of Ukraine, on COVID-19 incidence, the number of persons examined in laboratories, the number of laboratory tests performed by PCR and ELISA methods, and the number of laboratory tests for IgA, IgM and IgG. The model used data from March 2020 to December 2020, and the modeling did not take into account data from the temporarily occupied territories of Ukraine. For the cluster analysis, a neural network with 60 input neurons, 100 hidden neurons with a Fermi activation function, and 4 output neurons was built; the model was implemented in the Python programming language.
Conclusions: methods for constructing neural networks were analyzed, as were methods for training them, including the use of the gradient descent method within the Adaptive Boosting method; the theoretical material described in this work was used to implement a software product for processing COVID-19 test data for Ukraine; the regions of Ukraine were divided into COVID-19 infection zones, and a map of this division was presented.
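As a hedged illustration of the architecture described above, the sketch below builds the 60-100-4 network with a Fermi (logistic) activation and assigns a region's feature vector to one of four zones. The weights are random placeholders rather than parameters trained on the Ministry of Health data.

```python
import numpy as np

rng = np.random.default_rng(3)

fermi = lambda z: 1.0 / (1.0 + np.exp(-z))  # Fermi (logistic) activation

# Architecture from the abstract: 60 inputs, 100 hidden units, 4 output zones.
W1 = rng.normal(0.0, 0.1, (100, 60)); b1 = np.zeros(100)
W2 = rng.normal(0.0, 0.1, (4, 100));  b2 = np.zeros(4)

def assign_zone(features):
    """Map a region's 60 incidence/testing features to one of 4 zones.
    Weights are untrained placeholders; in the paper they are learned
    from the Ukrainian COVID-19 data."""
    h = fermi(W1 @ features + b1)
    out = fermi(W2 @ h + b2)
    return int(np.argmax(out))

zone = assign_zone(rng.random(60))          # hypothetical feature vector
```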

