Evaluation of Parameter Settings for Training Neural Networks Using Backpropagation Algorithms

2022 ◽  
pp. 202-226
Author(s):  
Leema N. ◽  
Khanna H. Nehemiah ◽  
Elgin Christo V. R. ◽  
Kannan A.

Artificial neural networks (ANNs) are widely used for classification, and the most common training algorithm is backpropagation (BP). The major bottleneck in training a backpropagation neural network is choosing appropriate values for the network parameters: the initial weights, biases, activation function, number of hidden layers, number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term. The objective of this work is to investigate how 12 different BP algorithms perform as these network parameter values are varied. The algorithms were evaluated with different training and testing samples taken from three benchmark clinical datasets, namely the Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) datasets, obtained from the University of California Irvine (UCI) machine learning repository.
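To make the parameters named in this abstract concrete, the following is a minimal sketch of gradient descent with momentum on a single sigmoid neuron. It is not one of the 12 BP algorithms evaluated by the authors; the function names `train_neuron` and `predict`, the zero initial weights, and the default values are illustrative assumptions.

```python
import math


def predict(weights, bias, x):
    """Sigmoid output of a single neuron for input vector x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, learning_rate=0.1, momentum=0.9,
                 max_epochs=1000, min_error=1e-3):
    """Batch gradient descent with momentum (illustrative sketch).

    samples: list of (inputs, target) pairs, target in [0, 1].
    Returns (weights, bias, epochs_run, last_mean_squared_error).
    Assumes max_epochs >= 1.
    """
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs      # initial weights (assumed zero here)
    bias = 0.0                      # initial bias
    vel_w = [0.0] * n_inputs        # momentum terms for the weights
    vel_b = 0.0                     # momentum term for the bias

    for epoch in range(1, max_epochs + 1):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        error = 0.0
        for x, target in samples:
            y = predict(weights, bias, x)
            diff = y - target
            error += 0.5 * diff * diff
            delta = diff * y * (1.0 - y)   # chain rule through the sigmoid
            for i, xi in enumerate(x):
                grad_w[i] += delta * xi
            grad_b += delta
        error /= len(samples)

        # Stop once the minimum error is reached.
        if error < min_error:
            return weights, bias, epoch, error

        # Momentum update: new step = momentum * old step - lr * gradient.
        for i in range(n_inputs):
            vel_w[i] = momentum * vel_w[i] - learning_rate * grad_w[i]
            weights[i] += vel_w[i]
        vel_b = momentum * vel_b - learning_rate * grad_b
        bias += vel_b

    # The returned error was measured before the final update.
    return weights, bias, max_epochs, error
```

Varying `learning_rate`, `momentum`, `max_epochs`, and `min_error` in a sketch like this is a small-scale analogue of the parameter study the abstract describes.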


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 711
Author(s):  
Mina Basirat ◽  
Bernhard C. Geiger ◽  
Peter M. Roth

Information plane analysis, describing the mutual information between the input and a hidden layer and between a hidden layer and the target over time, has recently been proposed to analyze the training of neural networks. Since the activations of a hidden layer are typically continuous-valued, this mutual information cannot be computed analytically and must thus be estimated, resulting in apparently inconsistent or even contradicting results in the literature. The goal of this paper is to demonstrate how information plane analysis can still be a valuable tool for analyzing neural network training. To this end, we complement the prevailing binning estimator for mutual information with a geometric interpretation. With this geometric interpretation in mind, we evaluate the impact of regularization and interpret phenomena such as underfitting and overfitting. In addition, we investigate neural network learning in the presence of noisy data and noisy labels.
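As a rough illustration of the binning estimator mentioned above, the sketch below discretizes a continuous scalar activation into equal-width bins and computes the empirical mutual information with a discrete label. The scalar activation, the equal-width bins, and the names `bin_index` and `binned_mutual_information` are simplifying assumptions, not the authors' code; for a whole hidden layer one would bin each neuron and treat the tuple of bin indices as the discrete variable.

```python
import math
from collections import Counter


def bin_index(value, low, high, n_bins):
    """Map a value in [low, high] to an equal-width bin index."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return min(max(idx, 0), n_bins - 1)   # clamp the right edge into the last bin


def binned_mutual_information(activations, labels, n_bins=10):
    """Empirical mutual information (in bits) between binned activations and labels."""
    low, high = min(activations), max(activations)
    binned = [bin_index(a, low, high, n_bins) for a in activations]
    n = len(activations)

    joint = Counter(zip(binned, labels))   # counts of (bin, label) pairs
    count_bin = Counter(binned)            # marginal counts of bins
    count_label = Counter(labels)          # marginal counts of labels

    mi = 0.0
    for (b, t), c in joint.items():
        # p(b,t) * log2( p(b,t) / (p(b) * p(t)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_bin[b] * count_label[t]))
    return mi
```

The estimate depends on `n_bins`, which is one reason results obtained with different estimators or bin widths can appear inconsistent.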


2021 ◽  
Vol 26 (jai2021.26(1)) ◽  
pp. 32-41
Author(s):  
Bodyanskiy Y ◽  
Antonenko T

Modern deep neural network approaches have a number of issues related to the learning process and computational cost. This article considers an architecture built on an alternative approach to the basic unit of the neural network. This approach reduces the amount of computation and offers an alternative way to address the vanishing and exploding gradient problems. The main subject of the article is the deep stacked neo-fuzzy system, which uses a generalized neo-fuzzy neuron to optimize the learning process. Because this approach is non-standard from a theoretical point of view, the paper presents the necessary mathematical calculations and describes the intricacies of using this architecture in practice. On the theoretical side, the network learning process is fully described, and all calculations needed to train the network with the backpropagation algorithm are derived. A feature of the network is the rapid calculation of the derivative of the neurons' activation functions, which is achieved through the use of fuzzy membership functions. The paper shows that the derivative of such a function is a constant, which is the basis for the claim that optimization is faster than in neural networks whose neurons use more common activation functions (ReLU, sigmoid). The paper highlights the main points that can be improved in further theoretical developments on this topic; in general, these relate to the calculation of the activation function. The proposed methods cope with these points and allow approximation using the network, and the authors already have theoretical justifications for improving the speed and approximation properties of the network. The results of a comparison of the proposed network with standard neural network architectures are shown.
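The claim about a constant derivative can be illustrated with a triangular membership function, a shape commonly used in neo-fuzzy neurons. The abstract does not state which membership function the authors use, so the triangular shape and the function names below are assumptions made for this sketch.

```python
def triangular_membership(x, left, center, right):
    """Triangular membership: 0 outside (left, right), 1 at center."""
    if x <= left or x >= right:
        return 0.0
    if x <= center:
        return (x - left) / (center - left)
    return (right - x) / (right - center)


def triangular_derivative(x, left, center, right):
    """Derivative of the triangular membership function.

    It is constant on each slope, so no exponentials are needed,
    unlike the derivative of a sigmoid.
    """
    if x <= left or x >= right:
        return 0.0
    if x < center:
        return 1.0 / (center - left)
    if x > center:
        return -1.0 / (right - center)
    return 0.0   # at the peak the derivative is undefined; 0 is a convention chosen here
```

For example, with `left=0.0`, `center=0.5`, `right=1.0`, the derivative is 2.0 everywhere on the rising slope and -2.0 everywhere on the falling slope.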


2014 ◽  
Vol 10 (S306) ◽  
pp. 279-287 ◽  
Author(s):  
Michael Hobson ◽  
Philip Graff ◽  
Farhan Feroz ◽  
Anthony Lasenby

Machine-learning methods may be used to perform many tasks required in the analysis of astronomical data, including data description and interpretation, pattern recognition, prediction, classification, compression, inference and many more. An intuitive and well-established approach to machine learning is the use of artificial neural networks (NNs), which consist of a group of interconnected nodes, each of which processes information that it receives and then passes this product on to other nodes via weighted connections. In particular, I discuss the first public release of the generic neural network training algorithm, called SkyNet, and demonstrate its application to astronomical problems focusing on its use in the BAMBI package for accelerated Bayesian inference in cosmology, and the identification of gamma-ray bursters. The SkyNet and BAMBI packages, which are fully parallelised using MPI, are available at http://www.mrao.cam.ac.uk/software/.


2017 ◽  
Vol 109 (1) ◽  
pp. 29-38 ◽  
Author(s):  
Valentin Deyringer ◽  
Alexander Fraser ◽  
Helmut Schmid ◽  
Tsuyoshi Okita

Neural networks are prevalent in today's NLP research. Despite their success on different tasks, training time is relatively long. We use Hogwild! to counteract this and show that it is a suitable method to speed up the training of neural networks of different architectures and complexity. For POS tagging and translation we report considerable training speedups, especially for the latter. We show that Hogwild! can be an important tool for training complex NLP architectures.


2008 ◽  
Vol 20 (11) ◽  
pp. 2757-2791 ◽  
Author(s):  
Yoshifusa Ito

We have constructed one-hidden-layer neural networks capable of approximating polynomials and their derivatives simultaneously. Generally, optimizing neural network parameters to be trained at later steps of the BP training is more difficult than optimizing those to be trained at the first step. Taking this fact into account, we suppressed the number of parameters of the former type. We measure the degree of approximation in both the uniform norm on compact sets and the Lp-norm on the whole space with respect to probability measures.


Author(s):  
M. HARLY ◽  
I. N. SUTANTRA ◽  
H. P. MAURIDHI

Fixed order neural networks (FONN), such as the high order neural network (HONN), whose architecture is built from zero-order activation functions and joint weights, regulate only the number of weights and their values. As a result, such a network produces only a fixed-order model or control level. These limitations, which affect preceding architectures, restrict their ability to adapt to the uncertain character of real-world plants, such as driving dynamics and the desired control performance. This paper introduces a new concept of the neural network neuron, in which a discrete z-function is used to build the neuron activation. Instead of zero-order joint weight matrices, a discrete z-function weight matrix is provided to represent an uncertain or undetermined real-world plant and a desired adaptive control system whose order may be changing. Instead of a bias, an initial condition value is used. A neural network using the new neurons is called a Varied Order Neural Network (VONN). For optimization, the order, coefficients, and initial value of the node activation function are updated using GA, while the joint weights are updated using both back propagation (combined LSE-Gauss Newton) and NPSO. To estimate the number of hidden layers, constructive back propagation (CBP) was also applied. A thorough simulation was conducted to compare the control performance of FONN and MONN. For control, the vehicle stability system was equipped with an electronic stability program (ESP), electronic four wheel steering (4-EWS), and active suspension (AS). Data sets of 2000, 4000, 6000, and 8000 samples from TODS, a hidden layer, 3 input nodes, and 3 output nodes were used to train and test the networks for both the uncertainty model and its adaptive control system. The simulation results show that stability parameters such as yaw rate error, vehicle side slip error, and rolling angle error yield better control performance, in the form of a smaller performance index, using FDNN than using MONN.


Author(s):  
Sheng-Uei Guan ◽  
Ji Hua Ang ◽  
Kay Chen Tan ◽  
Abdullah Al Mamun

This chapter proposes a novel method of incremental interference-free neural network training (IIFNNT) for medical datasets, which takes into consideration the interference each attribute has on the others. A specially designed network is used to determine whether two attributes interfere with each other, after which the attributes are partitioned using partitioning algorithms. These algorithms make sure that attributes beneficial to each other are trained in the same batch, thus sharing the same subnetwork, while interfering attributes are separated to reduce interference. Several incremental neural networks are available in the literature (Guan & Li, 2001; Su, Guan & Yeo, 2001). The architecture of IIFNNT employs two incremental algorithms, ILIA1 and ILIA2 (incremental learning with respect to new incoming attributes) (Guan & Li, 2001).


2012 ◽  
Vol 500 ◽  
pp. 198-203
Author(s):  
Chang Lin Xiao ◽  
Yan Chen ◽  
Lina Liu ◽  
Ling Tong ◽  
Ming Quan Jia

Genetic algorithms can further optimize neural networks, and the resulting optimized algorithm has been used in many fields with improved results, but it has not yet been used for parameter inversion. This paper uses backscattering coefficients from ASAR and data calculated with the AIEM model as neural network training data, and retrieves soil moisture with a genetic algorithm neural network. Finally, a comparison with practical tests shows the validity and superiority of the genetic algorithm neural network.

