Globally convergent stochastic optimization with optimal asymptotic distribution

1998 ◽  
Vol 35 (2) ◽  
pp. 395-406 ◽  
Author(s):  
Jürgen Dippon

A stochastic gradient descent method is combined with a consistent auxiliary estimate to achieve global convergence of the recursion. Using step lengths that converge to zero more slowly than 1/n and averaging the trajectories yields the optimal convergence rate of 1/√n and the optimal variance of the asymptotic distribution. Possible applications include maximum likelihood estimation, regression analysis, training of artificial neural networks, and stochastic optimization.
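
As an illustration of the averaging idea, here is a minimal Python sketch of Polyak-Ruppert averaged SGD with step lengths a/n^γ for 1/2 < γ < 1; the `grad_estimate` oracle and parameter names are placeholders, and the paper's consistent auxiliary estimate for global convergence is not modeled here:

```python
import numpy as np

def averaged_sgd(grad_estimate, theta0, n_steps, a=1.0, gamma=0.75):
    """Sketch of Polyak-Ruppert averaged SGD.

    grad_estimate(theta) returns a noisy gradient at theta.
    Step lengths a / n**gamma with 1/2 < gamma < 1 decay more
    slowly than 1/n; the running average of the iterates is the
    estimate that attains the optimal 1/sqrt(n) rate.
    """
    theta = np.asarray(theta0, dtype=float)
    theta_bar = theta.copy()
    for n in range(1, n_steps + 1):
        theta = theta - (a / n**gamma) * grad_estimate(theta)
        theta_bar += (theta - theta_bar) / n   # running average of the trajectory
    return theta_bar
```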


2014 ◽  
pp. 99-106
Author(s):  
Leonid Makhnist ◽  
Nikolaj Maniakov ◽  
Nikolaj Maniakov

Two new techniques for training multilayer neural networks are proposed. Both are based on the gradient descent method. For each technique, formulas for computing the adaptive training steps are given, and matrix formulations are presented that are very helpful for software implementation.
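
Since the abstract does not reproduce the authors' step-size formulas, the following Python sketch shows only the generic pattern of gradient descent with an adaptive training step (grow the step after a successful update, shrink it otherwise); all names and constants are illustrative:

```python
import numpy as np

def adaptive_gd(loss, grad, w0, eta0=0.1, up=1.1, down=0.5, n_steps=100):
    """Gradient descent with a simple adaptive training step.

    A generic scheme, not the specific formulas of the paper,
    which are not given in the abstract.
    """
    w = np.asarray(w0, dtype=float)
    eta = eta0
    prev = loss(w)
    for _ in range(n_steps):
        w_new = w - eta * grad(w)
        cur = loss(w_new)
        if cur < prev:           # accept the update and enlarge the step
            w, prev, eta = w_new, cur, eta * up
        else:                    # reject the update and reduce the step
            eta *= down
    return w
```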


2018 ◽  
Author(s):  
Kazunori D Yamada

In the deep learning era, stochastic gradient descent is the most common method used for optimizing neural network parameters. Among the various mathematical optimization methods, the gradient descent method is the most naive. Adjusting the learning rate is necessary for quick convergence, and with plain gradient descent this is normally done manually. Many optimizers have been developed to control the learning rate and increase convergence speed. Generally, these optimizers adjust the learning rate automatically in response to learning status, and they have gradually improved by incorporating effective aspects of earlier methods. In this study, we developed a new optimizer: YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment estimation system, we incorporated an advantageous component of AdaDelta, namely its unit correction system, into YamAdam. In benchmark tests on several common datasets, our optimizer showed convergence similar to or faster than that of existing methods. YamAdam is thus an option as an alternative optimizer for deep learning.
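
The published YamAdam update rule is not given in the abstract; the sketch below merely illustrates one plausible way to combine Adam's moment estimates with an AdaDelta-style unit correction, with all hyperparameter names assumed:

```python
import numpy as np

def adam_with_unit_correction_step(g, state, beta1=0.9, beta2=0.999, eps=1e-8):
    """One step of an Adam-like update with an AdaDelta-style unit
    correction. A sketch of the idea described in the abstract, not
    the published YamAdam update rule.
    """
    m, v, u = state["m"], state["v"], state["u"]
    m = beta1 * m + (1 - beta1) * g                    # first moment (Adam)
    v = beta2 * v + (1 - beta2) * g * g                # second moment (Adam)
    delta = np.sqrt(u + eps) / np.sqrt(v + eps) * m    # unit correction (AdaDelta)
    u = beta2 * u + (1 - beta2) * delta * delta        # RMS of past updates
    state.update(m=m, v=v, u=u)
    return -delta   # parameter increment to add to the weights
```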


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Jinhuan Duan ◽  
Xianxian Li ◽  
Shiqi Gao ◽  
Zili Zhong ◽  
Jinyan Wang

With the vigorous development of artificial intelligence technology, various engineering applications have been implemented one after another. The gradient descent method plays an important role in solving various optimization problems due to its simple structure, good stability, and easy implementation. However, in multinode machine learning systems, gradients usually need to be shared, which can cause privacy leakage, because attackers can infer training data from the gradient information. In this paper, to prevent gradient leakage while maintaining model accuracy, we propose the super stochastic gradient descent approach, which updates parameters by concealing the modulus length of each gradient vector and converting it into a unit vector. Furthermore, we analyze the security of the super stochastic gradient descent approach and demonstrate that our algorithm can defend against attacks on the gradient. Experimental results show that our approach is clearly superior to prevalent gradient descent approaches in terms of accuracy, robustness, and adaptability to large-scale batches. Interestingly, our algorithm can also resist model poisoning attacks to a certain extent.
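
The core idea of concealing the modulus length can be sketched as follows; this shows only the gradient normalization step, not the authors' full algorithm or its security analysis:

```python
import numpy as np

def normalize_gradient(g, eps=1e-12):
    """Share only the direction of the gradient: dividing by the
    modulus length turns g into a unit vector, so the magnitude
    (which can leak information about the training data) is
    concealed. A sketch of the general idea only.
    """
    return g / (np.linalg.norm(g) + eps)

# Each node would send normalize_gradient(local_grad) instead of
# local_grad; the aggregator then combines the unit vectors.
```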


2019 ◽  
Vol 9 (21) ◽  
pp. 4568
Author(s):  
Hyeyoung Park ◽  
Kwanyong Lee

The gradient descent method is an essential algorithm for neural network learning. Among the diverse variations of gradient descent developed to accelerate learning, natural gradient learning is based on the theory of information geometry on the stochastic neuromanifold and is known to have ideal convergence properties. Despite its theoretical advantages, the pure natural gradient has limitations that prevent its practical use: obtaining its explicit value requires knowing the true probability distribution of the input variables and inverting a matrix whose size is the square of the number of parameters. Although an adaptive estimation of the natural gradient has been proposed as a solution, it was originally developed for the online learning mode, which is computationally inefficient for learning from large data sets. In this paper, we propose a novel adaptive natural gradient estimation for the mini-batch learning mode, which is commonly adopted for big data analysis. For two representative stochastic neural network models, we present explicit parameter update rules and a learning algorithm. Through experiments on three benchmark problems, we confirm that the proposed method has convergence properties superior to those of conventional methods.
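
For reference, a naive mini-batch natural gradient step looks like the Python sketch below, with the Fisher matrix estimated empirically and inverted explicitly; the paper's contribution is precisely an adaptive estimate that avoids this explicit inversion, so treat this only as a baseline illustration:

```python
import numpy as np

def natural_gradient_step(theta, grads, eta=0.01, damping=1e-4):
    """Naive natural gradient update from a mini-batch.

    grads: array of shape (batch, dim) holding per-example gradients.
    The Fisher matrix is estimated empirically and solved directly,
    which costs O(dim^3); the paper's adaptive estimation avoids this.
    """
    g = grads.mean(axis=0)                       # plain mini-batch gradient
    F = grads.T @ grads / len(grads)             # empirical Fisher estimate
    F += damping * np.eye(F.shape[0])            # regularize before inversion
    return theta - eta * np.linalg.solve(F, g)   # theta - eta * F^{-1} g
```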


Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2510
Author(s):  
Nam D. Vo ◽  
Minsung Hong ◽  
Jason J. Jung

Previous recommendation systems applied the matrix factorization collaborative filtering (MFCF) technique only to single domains. Due to data sparsity, this approach is limited in its ability to overcome the cold-start problem. Thus, in this study, we focus on discovering latent features from domains to understand the relationships between domains (called domain coherence). This approach uses potential knowledge of the source domain to improve the quality of the target domain recommendation. In this paper, we consider applying MFCF to multiple domains. Specifically, by adopting the implicit stochastic gradient descent algorithm to optimize the objective function for prediction, multiple matrices from different domains are consolidated inside the cross-domain recommendation system (CDRS). Additionally, we design a conceptual framework for CDRS that applies to different industrial scenarios for recommenders across domains. Moreover, an experiment is devised to validate the proposed method. Using real-world datasets gathered from Amazon Food and MovieLens, experimental results show that the proposed method improves computation time and MSE by 15.2% and 19.7%, respectively, over other methods on a utility matrix. Notably, a much lower convergence value of the loss function was obtained in the experiment. Furthermore, a critical analysis of the results shows that there is a dynamic balance between prediction accuracy and computational complexity.
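
The single-domain MFCF core that the paper extends across domains can be sketched as a standard SGD pass over the observed entries of a utility matrix; variable names and hyperparameters below are illustrative, and the cross-domain consolidation is not shown:

```python
import numpy as np

def mf_sgd_epoch(R, P, Q, eta=0.01, lam=0.1):
    """One SGD pass for matrix factorization collaborative filtering.

    R: dict mapping (user, item) -> observed rating.
    P, Q: latent factor matrices of shape (n_users, k), (n_items, k).
    Standard single-domain MFCF updates; the paper consolidates such
    factorizations across multiple domains.
    """
    for (u, i), r in R.items():
        e = r - P[u] @ Q[i]                   # prediction error on this entry
        P[u] += eta * (e * Q[i] - lam * P[u])  # user factor step, L2-regularized
        Q[i] += eta * (e * P[u] - lam * Q[i])  # item factor step, L2-regularized
    return P, Q
```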


Author(s):  
Kseniia Bazilevych ◽  
Ievgen Meniailov ◽  
Dmytro Chumachenko

Subject: the use of the mathematical apparatus of neural networks for the scientific substantiation of anti-epidemic measures, in order to reduce disease incidence when making effective management decisions.

Purpose: to apply cluster analysis, based on a neural network, to solve the problem of identifying incidence areas.

Tasks: to analyze data analysis methods for solving the clustering problem; to develop a neural network method for clustering the territory of Ukraine according to the nature of the COVID-19 epidemic process; and, on the basis of the developed method, to implement a data analysis software product that identifies disease incidence areas, using the COVID-19 coronavirus as an example.

Methods: models and methods of data analysis; models and methods of systems theory (based on the information approach); machine learning methods, in particular the Adaptive Boosting method (based on the gradient descent method); and methods for training neural networks.

Results: we used data from the Center for Public Health of the Ministry of Health of Ukraine, distributed over the regions of Ukraine, on COVID-19 incidence, the number of persons examined in laboratories, the number of laboratory tests performed by PCR and ELISA methods, and the number of laboratory tests for IgA, IgM, and IgG. The model used data from March 2020 to December 2020; the modeling did not take into account data from the temporarily occupied territories of Ukraine. For cluster analysis, a neural network with 60 input neurons, 100 hidden neurons with a Fermi activation function, and 4 output neurons was built; the model was implemented in the Python programming language.

Conclusions: methods for constructing neural networks were analyzed, as were methods for training them, including the use of the gradient descent method within the Adaptive Boosting method; all the theoretical material described in this work was used to implement a software product for processing COVID-19 test data for Ukraine; the regions of Ukraine were divided into zones of COVID-19 infection, and a map of this division was presented.
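
A minimal sketch of the 60-100-4 network described in the Results, assuming the Fermi function is the logistic sigmoid; the initialization scheme and all other details are assumptions, and the clustering and training procedures are omitted:

```python
import numpy as np

def fermi(x):
    """Fermi (logistic sigmoid) activation, as named in the abstract."""
    return 1.0 / (1.0 + np.exp(-x))

def build_network(rng, n_in=60, n_hidden=100, n_out=4):
    """The 60-100-4 architecture from the Results; the Gaussian
    initialization here is an assumption, not from the paper."""
    W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
    W2 = rng.normal(0.0, 0.1, (n_out, n_hidden))
    return W1, W2

def forward(x, W1, W2):
    h = fermi(W1 @ x)      # hidden layer with Fermi activation
    return fermi(W2 @ h)   # 4 outputs, one per incidence zone

# Example: score one 60-dimensional regional feature vector.
rng = np.random.default_rng(0)
W1, W2 = build_network(rng)
scores = forward(rng.random(60), W1, W2)
```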

