Rise the Momentum: A Method for Reducing the Training Error on Multiple GPUs

Author(s):  
Yu Tang ◽  
Lujia Yin ◽  
Zhaoning Zhang ◽  
Dongsheng Li
Keyword(s):  
Algorithms ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 204
Author(s):  
Wenpeng Ma ◽  
Wu Yuan ◽  
Xiazhen Liu

Incomplete Sparse Approximate Inverses (ISAI) have shown some advantages over sparse triangular solves on GPUs when used for incomplete-LU-based preconditioners. In this paper, we extend the single-GPU Block–ISAI method to a multi-GPU algorithm by coupling it with the Block–Jacobi preconditioner, and describe the detailed implementation in the open-source numerical package PETSc. In the experiments, two representative cases are run, and a comparative study of Block–ISAI on up to four GPUs is conducted on two major generations of NVIDIA GPUs (Tesla K20 and Tesla V100). Block–Jacobi preconditioning with Block–ISAI (BJPB-ISAI) shows an advantage over the level-scheduling-based triangular solves from the cuSPARSE library for these cases, and the overhead of setting up Block–ISAI and the total wall-clock time of GMRES are greatly reduced on Tesla V100 GPUs compared with Tesla K20 GPUs.


Author(s):  
Xiaodong Yi ◽  
Ziyue Luo ◽  
Chen Meng ◽  
Mengdi Wang ◽  
Guoping Long ◽  
...  

Author(s):  
Guoshi Wang ◽  
Ying Liu ◽  
Xiaowen Chen ◽  
Qing Yan ◽  
Haibin Sui ◽  
...  

Abstract: The transformer is the most important piece of equipment in the power system. The research and development of fault diagnosis technology for Internet of Things equipment can effectively detect the operating status of equipment and eliminate hidden faults in time, which helps reduce the incidence of accidents and improve people's safety. Objective: To explore the utility of the Internet of Things in a power transformer fault diagnosis system. Methods: A total of 30 groups of transformer fault samples were selected; 10 groups were randomly chosen for network training, and the remaining samples were used for testing. A matter-element extension mathematical model of power transformer fault diagnosis was established, and the correlation function was improved according to the characteristics of the three-ratio method. Each group of power transformers was diagnosed continuously for four months, and the monitoring data and diagnosis results were recorded and analyzed. A GPRS communication network is used for communication between the data acquisition terminal and the monitoring terminal. The working state of the equipment is set according to the parameters in the database, and the various sensors are controlled by the instrument driver module to complete the transformer fault diagnosis. Results: The detection success rate of the power transformer fault diagnosis system model established in this paper is as high as 95.6%, the training error is less than 0.0001, and the model correctly identifies the fault types of samples not used in training. The technical support of the Internet of Things is therefore helpful for upgrading and maintaining the power transformer fault diagnosis system.


Author(s):  
Ivan Tanasic ◽  
Lluís Vilanova ◽  
Marc Jordà ◽  
Javier Cabezas ◽  
Isaac Gelado ◽  
...  

2018 ◽  
Vol 110 (1) ◽  
pp. 43-70 ◽  
Author(s):  
Martin Popel ◽  
Ondřej Bojar

Abstract: This article describes our experiments in neural machine translation using the recent Tensor2Tensor framework and the Transformer sequence-to-sequence model (Vaswani et al., 2017). We examine some of the critical parameters that affect the final translation quality, memory usage, training stability and training time, concluding each experiment with a set of recommendations for fellow researchers. In addition to confirming the general mantra “more data and larger models”, we address scaling to multiple GPUs and provide practical tips for improved training regarding batch size, learning rate, warmup steps, maximum sentence length and checkpoint averaging. We hope that our observations will allow others to get better results given their particular hardware and data constraints.
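
The abstract mentions checkpoint averaging without giving details. As a rough illustration only, the sketch below assumes the simplest variant: a uniform element-wise mean over the parameters of the last few saved checkpoints. The function name `average_checkpoints` and the dict-of-lists checkpoint layout are hypothetical stand-ins, not the Tensor2Tensor implementation.

```python
def average_checkpoints(checkpoints):
    """Element-wise mean of parameter values across checkpoints.

    Assumption: each checkpoint is a dict mapping a parameter name to a
    flat list of floats (a stand-in for real tensors), and all checkpoints
    share the same names and shapes.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / len(checkpoints) for col in columns]
    return averaged


# Hypothetical example: two tiny "checkpoints" with one weight vector and one bias.
last_checkpoints = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
print(average_checkpoints(last_checkpoints))
```

How many checkpoints to average, and how far apart they are saved, are tuning choices that this sketch leaves open.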


2008 ◽  
Vol 375-376 ◽  
pp. 535-538
Author(s):  
Xiang Feng Li ◽  
Gen Lian Yang ◽  
Dun Wen Zuo

The effects of running state and spindle speed on the sound signals produced by a drill press are investigated. The sound signals, acquired with a sound level meter, are analyzed in both the time domain and the frequency domain. Drilling sound signals evidently contain more high-frequency energy under load than without load, and spindle speed also affects their energy distribution. Using wavelet decomposition and wavelet packet decomposition, the drilling sound signals are decomposed into a number of frequency bands, and the energy percentages of these bands are extracted as the characteristic features for recognizing spindle speed. The training errors of different BP networks are then compared to find an effective network for spindle speed recognition. With the resulting 16-30-5 network structure, the recognition rates for both the training samples and the test samples are above 95%.
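
The band-energy features described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not name the wavelet, so an orthonormal Haar filter is assumed, and the helper names (`haar_split`, `wavelet_packet_bands`, `energy_percentages`) are hypothetical. The signal length is assumed to be divisible by `2 ** depth`, and bands are returned in natural (filter-bank) order, not sorted by frequency.

```python
import math


def haar_split(x):
    """One orthonormal Haar step: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_bands(x, depth):
    """Full wavelet packet tree: split every band at every level.

    Returns 2**depth coefficient lists (the leaf bands).
    """
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            approx, detail = haar_split(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands


def energy_percentages(bands):
    """Share of total signal energy (sum of squares) in each band, in percent."""
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(bands)
    return [100.0 * e / total for e in energies]


# Hypothetical test signal: a rapidly alternating sequence of 16 samples.
signal = [1.0, -1.0] * 8
features = energy_percentages(wavelet_packet_bands(signal, depth=2))
print(features)
```

Because the Haar transform used here is orthonormal, the band energies sum to the energy of the original signal, so the percentages always total 100 for a non-zero input.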


2000 ◽  
Vol 12 (6) ◽  
pp. 1411-1427 ◽  
Author(s):  
Shotaro Akaho ◽  
Hilbert J. Kappen

Theories of learning and generalization hold that the generalization bias, defined as the difference between the training error and the generalization error, increases on average with the number of adaptive parameters. This article, however, shows that this general tendency is violated for a gaussian mixture model. For temperatures just below the first symmetry breaking point, the effective number of adaptive parameters increases and the generalization bias decreases. We compute the dependence of the neural information criterion on temperature around the symmetry breaking. Our results are confirmed by numerical cross-validation experiments.


Author(s):  
Adeesha Wijayasiri ◽  
Tania Banerjee ◽  
Sanjay Ranka ◽  
Sartaj Sahni ◽  
Mark Schmalz
