Local Regularizer Improves Generalization

Yikai Zhang; Hui Qu; Dimitris Metaxas; Chao Chen

doi:10.1609/aaai.v34i04.6167

Local Regularizer Improves Generalization

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6167 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6861-6868 ◽

Cited By ~ 1

Author(s):

Yikai Zhang ◽

Hui Qu ◽

Dimitris Metaxas ◽

Chao Chen

Keyword(s):

Deep Learning ◽

Theoretical Analysis ◽

Experimental Evidence ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Training Algorithms ◽

Training Methods ◽

Theoretical Understanding ◽

Better Than

Regularization plays an important role in generalization of deep learning. In this paper, we study the generalization power of an unbiased regularizor for training algorithms in deep learning. We focus on training methods called Locally Regularized Stochastic Gradient Descent (LRSGD). An LRSGD leverages a proximal type penalty in gradient descent steps to regularize SGD in training. We show that by carefully choosing relevant parameters, LRSGD generalizes better than SGD. Our thorough theoretical analysis is supported by experimental evidence. It advances our theoretical understanding of deep learning and provides new perspectives on designing training algorithms. The code is available at https://github.com/huiqu18/LRSGD.

Download Full-text

Optical Recognition of Handwritten Logic Formulas Using Neural Networks

Electronics ◽

10.3390/electronics10222761 ◽

2021 ◽

Vol 10 (22) ◽

pp. 2761

Author(s):

Vaios Ampelakiotis ◽

Isidoros Perikos ◽

Ioannis Hatzilygeroudis ◽

George Tsihrintzis

Keyword(s):

Neural Networks ◽

Character Recognition ◽

Gradient Descent ◽

Feedforward Neural Networks ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Training Algorithms ◽

Gradient Descent Algorithm ◽

Two Stages ◽

And Training

In this paper, we present a handwritten character recognition (HCR) system that aims to recognize first-order logic handwritten formulas and create editable text files of the recognized formulas. Dense feedforward neural networks (NNs) are utilized, and their performance is examined under various training conditions and methods. More specifically, after three training algorithms (backpropagation, resilient propagation and stochastic gradient descent) had been tested, we created and trained an NN with the stochastic gradient descent algorithm, optimized by the Adam update rule, which was proved to be the best, using a trainset of 16,750 handwritten image samples of 28 × 28 each and a testset of 7947 samples. The final accuracy achieved is 90.13%. The general methodology followed consists of two stages: the image processing and the NN design and training. Finally, an application has been created that implements the methodology and automatically recognizes handwritten logic formulas. An interesting feature of the application is that it allows for creating new, user-oriented training sets and parameter settings, and thus new NN models.

Download Full-text

MapReduce and Optimized Deep Network for Rainfall Prediction in Agriculture

The Computer Journal ◽

10.1093/comjnl/bxz164 ◽

2020 ◽

Vol 63 (6) ◽

pp. 900-912

Author(s):

Oswalt Manoj S ◽

Ananth J P

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Prediction Models ◽

Short Term Memory ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Mean Square ◽

Rainfall Prediction ◽

Gradient Descent Algorithm ◽

Major Factors

Abstract Rainfall prediction is the active area of research as it enables the farmers to move with the effective decision-making regarding agriculture in both cultivation and irrigation. The existing prediction models are scary as the prediction of rainfall depended on three major factors including the humidity, rainfall and rainfall recorded in the previous years, which resulted in huge time consumption and leveraged huge computational efforts associated with the analysis. Thus, this paper introduces the rainfall prediction model based on the deep learning network, convolutional long short-term memory (convLSTM) system, which promises a prediction based on the spatial-temporal patterns. The weights of the convLSTM are tuned optimally using the proposed Salp-stochastic gradient descent algorithm (S-SGD), which is the integration of Salp swarm algorithm (SSA) in the stochastic gradient descent (SGD) algorithm in order to facilitate the global optimal tuning of the weights and to assure a better prediction accuracy. On the other hand, the proposed deep learning framework is built in the MapReduce framework that enables the effective handling of the big data. The analysis using the rainfall prediction database reveals that the proposed model acquired the minimal mean square error (MSE) and percentage root mean square difference (PRD) of 0.001 and 0.0021.

Download Full-text

A Novel Stochastic Gradient Descent Algorithm Based on Grouping over Heterogeneous Cluster Systems for Distributed Deep Learning

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) ◽

10.1109/ccgrid.2019.00053 ◽

2019 ◽

Author(s):

Wenbin Jiang ◽

Geyan Ye ◽

Laurence T. Yang ◽

Jian Zhu ◽

Yang Ma ◽

...

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Heterogeneous Cluster ◽

Cluster Systems ◽

Descent Algorithm ◽

Gradient Descent Algorithm

Download Full-text

An Efficient, Distributed Stochastic Gradient Descent Algorithm for Deep-Learning Applications

2017 46th International Conference on Parallel Processing (ICPP) ◽

10.1109/icpp.2017.10 ◽

2017 ◽

Cited By ~ 2

Author(s):

Guojing Cong ◽

Onkar Bhardwaj ◽

Minwei Feng

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Descent Algorithm ◽

Gradient Descent Algorithm

Download Full-text

Hyperparameter-free optimizer of stochastic gradient descent that incorporates unit correction and moment estimation

10.1101/348557 ◽

2018 ◽

Author(s):

Kazunori D Yamada

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Mathematical Optimization ◽

Descent Method ◽

Learning Rate ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Gradient Descent Method ◽

Moment Estimation ◽

Estimation System

ABSTRACTIn the deep learning era, stochastic gradient descent is the most common method used for optimizing neural network parameters. Among the various mathematical optimization methods, the gradient descent method is the most naive. Adjustment of learning rate is necessary for quick convergence, which is normally done manually with gradient descent. Many optimizers have been developed to control the learning rate and increase convergence speed. Generally, these optimizers adjust the learning rate automatically in response to learning status. These optimizers were gradually improved by incorporating the effective aspects of earlier methods. In this study, we developed a new optimizer: YamAdam. Our optimizer is based on Adam, which utilizes the first and second moments of previous gradients. In addition to the moment estimation system, we incorporated an advantageous part of AdaDelta, namely a unit correction system, into YamAdam. According to benchmark tests on some common datasets, our optimizer showed similar or faster convergent performance compared to the existing methods. YamAdam is an option as an alternative optimizer for deep learning.

Download Full-text

A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning

2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS) ◽

10.1109/padsw.2018.8644932 ◽

2018 ◽

Cited By ~ 3

Author(s):

Shaohuai Shi ◽

Qiang Wang ◽

Xiaowen Chu ◽

Bo Li

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Dag Model

Download Full-text

An effective learning rate scheduler for stochastic gradient descent-based deep learning model in healthcare diagnosis system

International Journal of Electronic Healthcare ◽

10.1504/ijeh.2022.119587 ◽

2022 ◽

Vol 12 (1) ◽

pp. 1

Author(s):

K. Sathyabama ◽

K. Saruladha

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Learning Model ◽

Learning Rate ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Diagnosis System ◽

Effective Learning ◽

Deep Learning Model

Download Full-text

An Effective Learning Rate Scheduler for Stochastic Gradient Descent Based Deep Learning Model in Healthcare Diagnosis System

International Journal of Electronic Healthcare ◽

10.1504/ijeh.2022.10041876 ◽

2022 ◽

Vol 12 (1) ◽

pp. 1

Author(s):

Sathyabama K ◽

K. Saruladha

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Learning Model ◽

Learning Rate ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Diagnosis System ◽

Effective Learning ◽

Deep Learning Model

Download Full-text

Communication-Efficient Local Stochastic Gradient Descent for Scalable Deep Learning

2020 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata50022.2020.9378178 ◽

2020 ◽

Author(s):

Sunwoo Lee ◽

Qiao Kang ◽

Ankit Agrawal ◽

Alok Choudhary ◽

Wei-keng Liao

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent

Download Full-text

Adaptive Stochastic Gradient Descent for Deep Learning on Heterogeneous CPU+GPU Architectures

2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) ◽

10.1109/ipdpsw52791.2021.00012 ◽

2021 ◽

Author(s):

Yujing Ma ◽

Florin Rusu ◽

Kesheng Wu ◽

Alexander Sim

Keyword(s):

Deep Learning ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Gpu Architectures

Download Full-text