RALR: Random Amplify Learning Rates for Training Neural Networks

2021 ◽ Vol 12 (1) ◽ pp. 268
Author(s): Jiali Deng ◽ Haigang Gong ◽ Minghui Liu ◽ Tianshu Xie ◽ Xuan Cheng ◽ ...

It has been shown that the learning rate is one of the most critical hyper-parameters for the overall performance of deep neural networks. In this paper, we propose a new method for setting the global learning rate, named random amplify learning rates (RALR), to improve the performance of any optimizer in training deep neural networks. Instead of monotonically decreasing the learning rate, we aim to escape saddle points or local minima by amplifying the learning rate between reasonable boundary values according to a given probability. Training with RALR rather than a conventionally decreasing learning rate further improves network performance without extra computational cost. Remarkably, RALR is complementary to state-of-the-art data augmentation and regularization methods. In addition, we empirically study its performance on image classification, fine-grained classification, object detection, and machine translation tasks. Experiments demonstrate that RALR brings a notable improvement while preventing overfitting when training deep neural networks. For example, the classification accuracy of ResNet-110 trained on the CIFAR-100 dataset with RALR is 1.34% higher than that of a conventionally trained ResNet-110.
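
The abstract describes the mechanism only at a high level: rather than monotonically decaying the global learning rate, RALR occasionally amplifies it within boundary values according to a given probability. A minimal sketch of that idea, applied to a PyTorch-style optimizer, is given below; the function name, the uniform sampling of the amplification factor, and the default values of `p_amplify`, `low`, and `high` are our assumptions, not the paper's specification.

```python
import random

def maybe_amplify_lr(optimizer, base_lr, p_amplify=0.05, low=2.0, high=5.0):
    """With probability p_amplify, amplify the scheduled base learning rate
    by a factor drawn uniformly from [low, high]; otherwise restore base_lr.
    Works with any object exposing torch-style param_groups."""
    factor = random.uniform(low, high) if random.random() < p_amplify else 1.0
    for group in optimizer.param_groups:
        group["lr"] = base_lr * factor
    return base_lr * factor

# Hypothetical usage, once per epoch before training:
#   optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
#   current_lr = maybe_amplify_lr(optimizer, base_lr=cosine_schedule(epoch))
```

Occasional amplification plays a role similar to a warm restart: a temporarily larger step can kick the iterate out of a saddle point or sharp local minimum, after which the scheduled rate resumes.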

2021
Author(s): Tianyi Liu ◽ Zhehui Chen ◽ Enlu Zhou ◽ Tuo Zhao

The momentum stochastic gradient descent (MSGD) algorithm has been widely applied to many nonconvex optimization problems in machine learning (e.g., training deep neural networks, variational Bayesian inference). Despite its empirical success, there is still a lack of theoretical understanding of the convergence properties of MSGD. To fill this gap, we analyze the algorithmic behavior of MSGD via diffusion approximations for nonconvex optimization problems with strict saddle points and isolated local optima. Our study shows that momentum helps escape saddle points but hurts convergence within the neighborhood of optima (unless the step size or the momentum is annealed). Our theoretical discovery partially corroborates the empirical success of MSGD in training deep neural networks.
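
For reference, MSGD is the classic heavy-ball recursion v ← μv − η g(θ), θ ← θ + v, where g is a stochastic gradient. The following is a generic sketch of that update with assumed variable names and hyper-parameters, not the authors' analysis code.

```python
import numpy as np

def msgd(grad, theta0, lr=0.01, momentum=0.9, steps=1000):
    """Momentum SGD: v <- mu*v - lr*grad(theta); theta <- theta + v."""
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    for _ in range(steps):
        v = momentum * v - lr * grad(theta)
        theta = theta + v
    return theta

# Example on a strict saddle f(x, y) = x**2 - y**2 (saddle at the origin):
#   grad_f = lambda th: np.array([2.0 * th[0], -2.0 * th[1]])
#   msgd(grad_f, theta0=[1.0, 1e-6])  # momentum accelerates escape along y
```

The abstract's finding maps directly onto this loop: a large `momentum` builds velocity along the saddle's escape direction, but that same accumulated velocity causes overshooting near an optimum unless `lr` or `momentum` is annealed.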


2021 ◽ Vol 5 (3) ◽ pp. 1-10
Author(s): Melih Öz ◽ Taner Danışman ◽ Melih Günay ◽ Esra Zekiye Şanal ◽ Özgür Duman ◽ ...

The human eye contains valuable information about an individual’s identity and health. Therefore, segmenting the eye into distinct regions is an essential step towards gathering this useful information precisely. The main challenges in segmenting the human eye include low-light conditions, reflections on the eye, and variations in eyelid and head position, all of which make an eye image hard to segment. For this reason, deep neural networks are preferred, owing to their success on segmentation problems. However, deep neural networks need a large amount of manually annotated data to be trained. Manual annotation is a labor-intensive task, and to tackle this problem, we used data augmentation methods to improve synthetic data. In this paper, we explore whether, when real data is limited, performance can be enhanced by combining similar-context data with image augmentation methods. Our training and test sets consist of 3D synthetic eye images generated with the UnityEyes application and manually annotated real-life eye images, respectively. We examined the effect of using synthetic eye images with the Deeplabv3+ network under different conditions, applying image augmentation methods to the synthetic data. According to our experiments, the network trained with processed synthetic images alongside real-life images produced better mIoU results than the network trained only with the real-life images of the Base dataset. We also observed an mIoU increase on the test set we created from MICHE II competition images.
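
The comparison above is reported in mean intersection-over-union (mIoU). For concreteness, here is a standard per-class IoU computation over integer label maps; it is a generic sketch of the metric, not the authors' evaluation code, and skipping classes absent from both masks is our assumption.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes for two integer label maps
    of equal shape (e.g., 0=background, 1=sclera, 2=iris, 3=pupil)."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else 0.0
```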

