Improving training time of deep neural network with asynchronous averaged stochastic gradient descent

Author(s): Zhao You, Bo Xu

Mathematics, 2021, Vol. 9 (13), 1533
Author(s): Jingcheng Zhou, Wei Wei, Ruizhi Zhang, Zhiming Zheng

First-order methods such as stochastic gradient descent (SGD) are popular for training deep neural networks (DNNs) and tend to generalize well; however, they require a long training time. Second-order methods, which can reduce training time, are rarely used because of the prohibitive computational cost of obtaining second-order information. Many works therefore approximate the Hessian matrix to reduce this cost, but the approximate Hessian can deviate substantially from the true one. In this paper, we exploit the convexity of the Hessian matrix with respect to a subset of the parameters and propose the damped Newton stochastic gradient descent (DN-SGD) and stochastic gradient descent damped Newton (SGD-DN) methods to train DNNs for regression problems with mean square error (MSE) and classification problems with cross-entropy loss (CEL). In contrast to other second-order methods, which estimate the Hessian matrix of all parameters, our methods compute the Hessian exactly for only a small subset of the parameters, which greatly reduces the computational cost and makes the learning process converge faster and more accurately than SGD and Adagrad. Several numerical experiments on real datasets were performed to verify the effectiveness of our methods for regression and classification problems.
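The general idea described in the abstract can be illustrated with a small, self-contained example: first-order SGD updates for most of the network, plus a damped Newton step on the output-layer parameters, whose Hessian under MSE with a linear output is small, convex, and cheap to compute exactly. The sketch below is an illustrative toy implementation under these assumptions, not the authors' DN-SGD/SGD-DN code; the network shape, step size, damping value, and variable names are all assumed for illustration.

```python
# Minimal sketch (assumed setup, not the paper's implementation):
# SGD on the hidden-layer weights, a damped Newton step on the output-layer
# weights, whose exact Hessian under MSE with a linear output is h^T h / n.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = sin(x) + noise
X = rng.uniform(-3, 3, size=(512, 1))
y = np.sin(X) + 0.1 * rng.standard_normal((512, 1))

hidden_dim = 32
W1 = rng.standard_normal((1, hidden_dim)) * 0.5   # hidden-layer weights
b1 = np.zeros(hidden_dim)
w2 = np.zeros((hidden_dim, 1))                    # output-layer weights

lr = 0.05        # SGD step size for the hidden layer (assumed value)
damping = 1e-2   # damping lambda for the Newton step (assumed value)
batch = 64

for step in range(500):
    idx = rng.choice(len(X), size=batch, replace=False)
    xb, yb = X[idx], y[idx]

    # Forward pass
    h = np.tanh(xb @ W1 + b1)      # hidden activations, shape (batch, hidden_dim)
    pred = h @ w2                  # linear output
    err = pred - yb                # MSE residual

    # First-order SGD update for the hidden layer
    dh = (err @ w2.T) * (1.0 - h ** 2)     # backprop through tanh
    W1 -= lr * (xb.T @ dh) / batch
    b1 -= lr * dh.mean(axis=0)

    # Damped Newton update for the output layer.
    # For MSE with a linear output, grad = h^T err / n and Hessian = h^T h / n,
    # which is positive semi-definite, so (H + damping * I) is safely invertible.
    g = h.T @ err / batch
    H = h.T @ h / batch
    w2 -= np.linalg.solve(H + damping * np.eye(hidden_dim), g)

    if step % 100 == 0:
        print(f"step {step:4d}  batch MSE = {float((err ** 2).mean()):.4f}")
```

Because only the output-layer block of the Hessian is formed, the Newton solve involves a hidden_dim x hidden_dim system rather than one over all network parameters, which is the source of the computational savings the abstract refers to.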

