S-DFP: shifted dynamic fixed point for quantized deep neural network training

Correspondence between neuroevolution and gradient descent

Nature Communications ◽

10.1038/s41467-021-26568-2 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Stephen Whitelam ◽

Viktor Selin ◽

Sang-Won Park ◽

Isaac Tamblyn

Keyword(s):

Neural Network ◽

Numerical Simulation ◽

Neural Networks ◽

Loss Function ◽

Gradient Descent ◽

Deep Neural Networks ◽

Gaussian White Noise ◽

Training Methods ◽

Neural Network Training ◽

Network Training

Abstract: We show analytically that training a neural network by conditioned stochastic mutation or neuroevolution of its weights is equivalent, in the limit of small mutations, to gradient descent on the loss function in the presence of Gaussian white noise. Averaged over independent realizations of the learning process, neuroevolution is equivalent to gradient descent on the loss function. We use numerical simulation to show that this correspondence can be observed for finite mutations, for shallow and deep neural networks. Our results provide a connection between two families of neural-network training methods that are usually considered to be fundamentally different.
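The correspondence described in this abstract can be checked numerically on a toy problem. The sketch below is only an illustration and is not taken from the paper: the quadratic `loss`, the Metropolis-style acceptance rule, and the parameters `sigma`, `beta`, and `trials` are all assumptions, since the abstract does not say how mutations are conditioned. It averages the displacement of many small conditioned mutations from a fixed point and compares its direction with the negative gradient.

```python
import math
import random


def loss(w):
    # Hypothetical quadratic loss, chosen only for illustration.
    return 0.5 * (w[0] ** 2 + 4.0 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the loss above.
    return [w[0], 4.0 * w[1]]


def mean_mutation_step(w, sigma, beta, trials, rng):
    """Average displacement produced by one conditioned mutation of w."""
    base = loss(w)
    total = [0.0] * len(w)
    for _ in range(trials):
        eps = [rng.gauss(0.0, sigma) for _ in w]
        delta = loss([wi + ei for wi, ei in zip(w, eps)]) - base
        # Assumed Metropolis-style condition: always accept improvements,
        # accept a worse loss with probability exp(-beta * delta).
        if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
            total = [t + e for t, e in zip(total, eps)]
        # A rejected mutation contributes zero displacement.
    return [t / trials for t in total]


def cosine(a, b):
    # Cosine of the angle between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


rng = random.Random(0)
w0 = [1.0, 0.5]
step = mean_mutation_step(w0, sigma=0.01, beta=10.0, trials=200_000, rng=rng)
alignment = cosine(step, [-g for g in grad(w0)])
print(f"cosine(mean step, -gradient) = {alignment:.3f}")
```

A cosine close to 1 means the averaged mutation step points along the negative gradient, which is the behaviour the abstract describes for small mutations. Individual mutations remain noisy; only the average is aligned.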


Parallelization of Neural Network Training for NLP with Hogwild!

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0036 ◽

2017 ◽

Vol 109 (1) ◽

pp. 29-38 ◽

Cited By ~ 2

Author(s):

Valentin Deyringer ◽

Alexander Fraser ◽

Helmut Schmid ◽

Tsuyoshi Okita

Keyword(s):

Neural Network ◽

Neural Networks ◽

Suitable Method ◽

Neural Network Training ◽

Training Time ◽

Pos Tagging ◽

Network Training ◽

Speed Up

Abstract: Neural networks are prevalent in today's NLP research. Despite their success on different tasks, training time is relatively long. We use Hogwild! to counteract this and show that it is a suitable method to speed up the training of neural networks of different architectures and complexity. For POS tagging and translation we report considerable training speedups, especially for the latter. We show that Hogwild! can be an important tool for training complex NLP architectures.
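Hogwild! runs stochastic gradient descent in several workers that update shared parameters without locking. The sketch below shows only that update pattern on a hypothetical one-dimensional regression problem; the data, learning rate, and thread count are assumptions and have nothing to do with the NLP models in the paper. In standard CPython the global interpreter lock prevents the threads from running in parallel, so this shows the lock-free structure rather than any speedup.

```python
import random
import threading


def make_data(n, rng):
    # Hypothetical noise-free regression data: y = 3x + 1.
    return [(x, 3.0 * x + 1.0) for x in (rng.uniform(-1.0, 1.0) for _ in range(n))]


def worker(weights, data, lr, epochs, seed):
    rng = random.Random(seed)
    for _ in range(epochs):
        for x, y in rng.sample(data, len(data)):  # one shuffled pass
            err = weights[0] * x + weights[1] - y
            # No lock: another thread may change weights between the
            # read above and the writes below.
            weights[0] -= lr * err * x
            weights[1] -= lr * err


data = make_data(200, random.Random(0))
weights = [0.0, 0.0]  # shared by all threads: [slope, intercept]
threads = [
    threading.Thread(target=worker, args=(weights, data, 0.05, 50, seed))
    for seed in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(weights)
```

Occasional overwritten updates do not stop the shared weights from approaching slope 3 and intercept 1 here, because every sample agrees on the same solution and the step size is small.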


A Study on the Image Classification Techniques Based on Wavelet Artificial Neural Network Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.602-605.3512 ◽

2014 ◽

Vol 602-605 ◽

pp. 3512-3514

Author(s):

Xue Ding ◽

Hong Hong Yang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Artificial Neural Network ◽

Training Methods ◽

Neural Network Training ◽

Target Vector ◽

Network Training ◽

Incremental Training ◽

Artificial Neural ◽

Artificial Neural Network Training

As educational information technology keeps changing, universities and colleges face a significant problem: how to classify thousands of images during the marking of art examinations. This paper explores the application of artificial intelligence techniques to classify a large number of images accurately, within a limited time, with the help of a computer. Results from practical use indicate that the proposed method is feasible. Artificial neural network training: artificial neural network training methods come in two main styles, incremental training and batch training, distinguished by how much training data is processed before each adjustment. In incremental training [1], the network adjusts its connection weights and thresholds once every time it receives an input vector and target vector; it is an online learning method. In batch training [2], the connection weights and thresholds are not adjusted immediately; instead, a bulk adjustment is performed after a given volume of input vectors and target vectors has been received. Both training methods can be applied to static and dynamic neural networks. Different training methods yield different results from an artificial neural network. When using artificial neural networks to solve specific problems, the learning method, training method, and network function should be selected according to the type of problem, the expected results, and the specific requirements [3-4]. The selection of parameters of wavelet neural networks and adaptive learning
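The difference between the two training styles described in this abstract can be sketched on a single linear neuron. This is a minimal illustration under stated assumptions, not the paper's wavelet network: the model, the data, the learning rate, and the function names `train_incremental` and `train_batch` are all hypothetical.

```python
def predict(w, b, x):
    # Single linear neuron: weight w, threshold (bias) b.
    return w * x + b


def train_incremental(samples, lr, epochs):
    """Adjust weight and threshold after every (input, target) pair."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            err = predict(w, b, x) - target
            w -= lr * err * x
            b -= lr * err
    return w, b


def train_batch(samples, lr, epochs):
    """Accumulate over all pairs, then adjust once per pass."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = sum((predict(w, b, x) - t) * x for x, t in samples) / n
        grad_b = sum(predict(w, b, x) - t for x, t in samples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data generated from target = 2 * x - 1.
samples = [(k / 10.0, 2.0 * (k / 10.0) - 1.0) for k in range(-10, 11)]
w_inc, b_inc = train_incremental(samples, lr=0.1, epochs=500)
w_bat, b_bat = train_batch(samples, lr=0.1, epochs=500)
print(w_inc, b_inc)
print(w_bat, b_bat)
```

Both loops recover the same line on this toy data. The incremental version makes one adjustment per sample, while the batch version makes one adjustment per pass over the whole set.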


Machine-learning in astronomy

Proceedings of the International Astronomical Union ◽

10.1017/s1743921314013672 ◽

2014 ◽

Vol 10 (S306) ◽

pp. 279-287 ◽

Cited By ~ 2

Author(s):

Michael Hobson ◽

Philip Graff ◽

Farhan Feroz ◽

Anthony Lasenby

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Gamma Ray ◽

Neural Network Training ◽

Training Algorithm ◽

Data Description ◽

Astronomical Data ◽

Machine Learning Methods ◽

Network Training

Abstract: Machine-learning methods may be used to perform many tasks required in the analysis of astronomical data, including: data description and interpretation, pattern recognition, prediction, classification, compression, inference and many more. An intuitive and well-established approach to machine learning is the use of artificial neural networks (NNs), which consist of a group of interconnected nodes, each of which processes information that it receives and then passes this product on to other nodes via weighted connections. In particular, I discuss the first public release of the generic neural network training algorithm, called SkyNet, and demonstrate its application to astronomical problems focusing on its use in the BAMBI package for accelerated Bayesian inference in cosmology, and the identification of gamma-ray bursters. The SkyNet and BAMBI packages, which are fully parallelised using MPI, are available at http://www.mrao.cam.ac.uk/software/.


Evaluation of Parameter Settings for Training Neural Networks Using Backpropagation Algorithms

10.4018/978-1-6684-2408-7.ch009 ◽

2022 ◽

pp. 202-226

Author(s):

Leema N. ◽

Khanna H. Nehemiah ◽

Elgin Christo V. R. ◽

Kannan A.

Keyword(s):

Neural Network ◽

Neural Networks ◽

Activation Function ◽

Neural Network Training ◽

Network Parameter ◽

Network Parameters ◽

Network Training ◽

Rate Minimum ◽

Hidden Layer ◽

Function Number

Artificial neural networks (ANN) are widely used for classification, and the training algorithm commonly used is the backpropagation (BP) algorithm. The major bottleneck faced in the backpropagation neural network training is in fixing the appropriate values for network parameters. The network parameters are initial weights, biases, activation function, number of hidden layers and the number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term for the classification task. The objective of this work is to investigate the performance of 12 different BP algorithms with the impact of variations in network parameter values for the neural network training. The algorithms were evaluated with different training and testing samples taken from the three benchmark clinical datasets, namely, Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) dataset obtained from the University of California Irvine (UCI) machine learning repository.


A nonlinear training set superposition filter derived by neural network training methods for implementation in a shift-invariant optical correlator

10.1117/12.486334 ◽

2003 ◽

Cited By ~ 2

Author(s):

Ioannis Kypraios ◽

Rupert C. D. Young ◽

Philip M. Birch ◽

Christopher R. Chatwin

Keyword(s):

Neural Network ◽

Training Methods ◽

Neural Network Training ◽

Training Set ◽

Network Training ◽

Optical Correlator


A method for decreasing neural network training time as applied to ECG classification

[1992] Proceedings of the Eighteenth IEEE Annual Northeast Bioengineering Conference ◽

10.1109/nebc.1992.285919 ◽

2003 ◽

Author(s):

E. Sakk ◽

J. Belina ◽

R.J. Thomas

Keyword(s):

Neural Network ◽

Neural Network Training ◽

Training Time ◽

Network Training


TRAINING NEURAL NETWORK FOR TAXI PASSENGER DEMAND FORECASTING USING GRAPHICS PROCESSING UNITS

Ukrainian Journal of Information Technology ◽

10.23939/ujit2020.02.029 ◽

2020 ◽

Vol 2 (1) ◽

pp. 29-36

Author(s):

M. I. Zghoba ◽

Yu. I. Hrytsiuk

Keyword(s):

Neural Network ◽

Graphics Processing Units ◽

Neural Network Training ◽

Training Time ◽

Passenger Demand ◽

Network Training ◽

The Neural Network ◽

Input Dataset ◽

Speed Up ◽

Graphics Processing

This work considers the peculiarities of training a neural network to forecast taxi passenger demand using graphics processing units, which made it possible to speed up the training procedure for different sets of input data, hardware configurations, and hardware power. It has been found that taxi services are becoming more accessible to a wide range of people. The most important task for any transportation company and taxi driver is to minimize the waiting time for new orders and to minimize the distance between drivers and passengers when an order is received. Understanding and assessing geographical passenger demand, which depends on many factors, is crucial to achieving this goal. This paper describes an example of neural network training for predicting taxi passenger demand. It shows the importance of a large input dataset for the accuracy of the neural network. Since training a neural network is a lengthy process, parallel training was used to speed it up. The neural network for forecasting taxi passenger demand was trained using different hardware configurations, such as one CPU, one GPU, and two GPUs. The training times of one epoch were compared across these configurations, and the impact of the different hardware configurations on training time was analyzed. The network was trained using a dataset containing 4.5 million trips within one city. The results of this study show that training with GPU accelerators does not necessarily improve the training time. The training time depends on many factors, such as input dataset size, splitting of the entire dataset into smaller subsets, as well as hardware and power characteristics.


An Advanced Conjugate Gradient Training Algorithm Based on a Modified Secant Equation

ISRN Artificial Intelligence ◽

10.5402/2012/486361 ◽

2012 ◽

Vol 2012 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Ioannis E. Livieris ◽

Panagiotis Pintelas

Keyword(s):

Neural Network ◽

Conjugate Gradient ◽

Gradient Methods ◽

High Order Accuracy ◽

Conjugate Gradient Methods ◽

Training Methods ◽

Neural Network Training ◽

Training Algorithm ◽

Error Surface ◽

Network Training

Conjugate gradient methods constitute excellent neural network training methods characterized by their simplicity, numerical efficiency, and very low memory requirements. In this paper, we propose a conjugate gradient neural network training algorithm which guarantees sufficient descent using any line search, thereby avoiding the usually inefficient restarts. Moreover, it achieves a high-order accuracy in approximating the second-order curvature information of the error surface by utilizing the modified secant condition proposed by Li et al. (2007). Under mild conditions, we establish that the proposed method is globally convergent for general functions under the strong Wolfe conditions. Experimental results provide evidence that our proposed method is preferable and in general superior to the classical conjugate gradient methods and has the potential to significantly enhance the computational efficiency and robustness of the training process.


Evaluation of Parameter Settings for Training Neural Networks Using Backpropagation Algorithms

International Journal of Operations Research and Information Systems ◽

10.4018/ijoris.2020100104 ◽

2020 ◽

Vol 11 (4) ◽

pp. 62-85

Author(s):

Leema N. ◽

Khanna H. Nehemiah ◽

Elgin Christo V. R. ◽

Kannan A.

Keyword(s):

Neural Network ◽

Neural Networks ◽

Activation Function ◽

Neural Network Training ◽

Network Parameter ◽

Network Parameters ◽

Network Training ◽

Hidden Layer ◽

Function Number ◽

The Impact

Artificial neural networks (ANN) are widely used for classification, and the training algorithm commonly used is the backpropagation (BP) algorithm. The major bottleneck faced in the backpropagation neural network training is in fixing the appropriate values for network parameters. The network parameters are initial weights, biases, activation function, number of hidden layers and the number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term for the classification task. The objective of this work is to investigate the performance of 12 different BP algorithms with the impact of variations in network parameter values for the neural network training. The algorithms were evaluated with different training and testing samples taken from the three benchmark clinical datasets, namely, Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) dataset obtained from the University of California Irvine (UCI) machine learning repository.
