Stochastic Gradient Method
Recently Published Documents


TOTAL DOCUMENTS: 47 (five years: 17)

H-INDEX: 6 (five years: 1)

2021 · Vol 12 (5) · pp. 1-26
Author(s): Congliang Chen, Li Shen, Haozhi Huang, Wei Liu

In this article, we present a distributed variant of an adaptive stochastic gradient method for training deep neural networks in the parameter-server model. To reduce the communication cost between the workers and the server, we incorporate two types of quantization schemes, i.e., gradient quantization and weight quantization, into the proposed distributed Adam. In addition, to reduce the bias introduced by the quantization operations, we propose an error-feedback technique to compensate for the quantized gradient. Theoretically, in the stochastic nonconvex setting, we show that the distributed adaptive gradient method with gradient quantization and error feedback converges to a first-order stationary point, and that the distributed adaptive gradient method with weight quantization and error feedback converges to a point determined by the quantization level, in both single-worker and multi-worker modes. Finally, we apply the proposed distributed adaptive gradient methods to train deep neural networks. Experimental results demonstrate the efficacy of our methods.
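As a rough illustration of the ideas named in this abstract (not the authors' actual algorithm), the sketch below combines a stochastic uniform quantizer, a worker that applies error feedback by carrying the quantization residual into the next gradient, and an Adam-style server update. All names (quantize, ErrorFeedbackWorker, adam_step) and the toy quadratic objective are illustrative assumptions.

```python
# Sketch of gradient quantization + error feedback feeding an Adam-style update.
# Illustrative only; not the paper's exact distributed Adam.
import numpy as np

def quantize(v, num_levels=16):
    """Uniform stochastic-rounding quantizer: keeps the vector norm and
    rounds each normalized coordinate to one of `num_levels` levels."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    scaled = np.abs(v) / norm * (num_levels - 1)
    lower = np.floor(scaled)
    rounded = lower + (np.random.rand(*v.shape) < (scaled - lower))
    return np.sign(v) * rounded * norm / (num_levels - 1)

class ErrorFeedbackWorker:
    """Sends quantized gradients and keeps the quantization error locally,
    adding it back to the next gradient (error feedback)."""
    def __init__(self, dim):
        self.residual = np.zeros(dim)

    def compress(self, grad):
        corrected = grad + self.residual   # add back the previous error
        q = quantize(corrected)            # low-precision message to the server
        self.residual = corrected - q      # store the new quantization error
        return q

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on the server using the received gradient g."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Toy usage: minimize 0.5 * ||w - w_star||^2 with one worker and noisy gradients.
rng = np.random.default_rng(0)
dim = 10
w_star = rng.normal(size=dim)
w, m, v = np.zeros(dim), np.zeros(dim), np.zeros(dim)
worker = ErrorFeedbackWorker(dim)
for t in range(1, 501):
    grad = w - w_star + 0.01 * rng.normal(size=dim)
    w, m, v = adam_step(w, worker.compress(grad), m, v, t)
print("final error:", np.linalg.norm(w - w_star))
```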


2021 · pp. 108201
Author(s): Zhijian Luo, Siyu Chen, Yuntao Qian, Yueen Hou

2020 · Vol 34 (04) · pp. 5636-5643
Author(s): Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, ...

Standard adversarial attacks change the predicted class label of a selected image by adding specially tailored small perturbations to its pixels. In contrast, a universal perturbation is an update that can be added to any image in a broad class of images while still changing the predicted class label. We study the efficient generation of universal adversarial perturbations, as well as efficient methods for hardening networks against these attacks. We propose a simple optimization-based universal attack that reduces the top-1 accuracy of various network architectures on ImageNet to less than 20%, while learning the universal perturbation 13× faster than the standard method. To defend against these perturbations, we propose universal adversarial training, which models the problem of robust classifier generation as a two-player min-max game and produces robust models at only 2× the cost of natural training. We also propose a simultaneous stochastic gradient method that is almost free of extra computation, which allows us to perform universal adversarial training on ImageNet.
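The following is a minimal sketch of how such a simultaneous update could look, assuming a toy logistic-regression "classifier" in place of a deep network: on each minibatch, one stochastic gradient evaluation updates the model weights w by descent and the shared universal perturbation delta by projected sign ascent. The loss, step sizes, and the epsilon bound are illustrative assumptions, not the authors' settings.

```python
# Sketch of simultaneous min-max updates for universal adversarial training.
# Illustrative only; uses a toy logistic-regression model, not a deep network.
import numpy as np

def loss_and_grads(w, delta, X, y):
    """Logistic loss on perturbed inputs X + delta, with gradients w.r.t.
    both the weights w and the (shared) universal perturbation delta."""
    Z = X + delta                        # every example gets the same delta
    logits = Z @ w
    p = 1.0 / (1.0 + np.exp(-y * logits))
    loss = -np.mean(np.log(p + 1e-12))
    coef = -y * (1.0 - p) / len(y)       # d loss / d logits
    grad_w = Z.T @ coef
    grad_delta = np.outer(coef, w).sum(axis=0)  # chain rule through X + delta
    return loss, grad_w, grad_delta

rng = np.random.default_rng(0)
n, d = 512, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true + 0.1 * rng.normal(size=n))

w = np.zeros(d)
delta = np.zeros(d)
eps, lr_w, lr_delta = 0.5, 0.1, 0.5      # eps bounds the l_inf norm of delta
for step in range(200):
    idx = rng.choice(n, size=64, replace=False)
    loss, gw, gd = loss_and_grads(w, delta, X[idx], y[idx])
    w -= lr_w * gw                                    # descent on the weights
    delta = np.clip(delta + lr_delta * np.sign(gd),   # ascent on the perturbation,
                    -eps, eps)                        # projected onto the eps-ball
print("robust loss:", loss)
```

Because both players are updated from the same minibatch gradient, the extra cost over ordinary training is essentially the single additional vector update for delta, which is what makes the "almost free" claim plausible.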

