Evaluation of Parameter Settings for Training Neural Networks Using Backpropagation Algorithms

Artificial neural networks (ANN) are widely used for classification, and the training algorithm commonly used is the backpropagation (BP) algorithm. The major bottleneck faced in the backpropagation neural network training is in fixing the appropriate values for network parameters. The network parameters are initial weights, biases, activation function, number of hidden layers and the number of neurons per hidden layer, number of training epochs, learning rate, minimum error, and momentum term for the classification task. The objective of this work is to investigate the performance of 12 different BP algorithms with the impact of variations in network parameter values for the neural network training. The algorithms were evaluated with different training and testing samples taken from the three benchmark clinical datasets, namely, Pima Indian Diabetes (PID), Hepatitis, and Wisconsin Breast Cancer (WBC) dataset obtained from the University of California Irvine (UCI) machine learning repository.

Download Full-text

A Geometric Perspective on Information Plane Analysis

Entropy ◽

10.3390/e23060711 ◽

2021 ◽

Vol 23 (6) ◽

pp. 711

Author(s):

Mina Basirat ◽

Bernhard C. Geiger ◽

Peter M. Roth

Keyword(s):

Neural Network ◽

Mutual Information ◽

Geometric Interpretation ◽

Neural Network Training ◽

Neural Network Learning ◽

Network Learning ◽

Plane Analysis ◽

Network Training ◽

Hidden Layer ◽

The Impact

Information plane analysis, describing the mutual information between the input and a hidden layer and between a hidden layer and the target over time, has recently been proposed to analyze the training of neural networks. Since the activations of a hidden layer are typically continuous-valued, this mutual information cannot be computed analytically and must thus be estimated, resulting in apparently inconsistent or even contradicting results in the literature. The goal of this paper is to demonstrate how information plane analysis can still be a valuable tool for analyzing neural network training. To this end, we complement the prevailing binning estimator for mutual information with a geometric interpretation. With this geometric interpretation in mind, we evaluate the impact of regularization and interpret phenomena such as underfitting and overfitting. In addition, we investigate neural network learning in the presence of noisy data and noisy labels.

Download Full-text

Deep neural network based on generalized neo-fuzzy neurons and its learning based on backpropagation

Artificial Intelligence ◽

10.15407/jai2021.01.032 ◽

2021 ◽

Vol 26 (jai2021.26(1)) ◽

pp. 32-41

Author(s):

Bodyanskiy Y ◽

◽

Antonenko T ◽

Keyword(s):

Neural Network ◽

Neural Networks ◽

Learning Process ◽

Activation Function ◽

Point Of View ◽

Basic Unit ◽

Theoretical Point ◽

Activation Functions ◽

Approximation Properties ◽

Network Training

Modern approaches in deep neural networks have a number of issues related to the learning process and computational costs. This article considers the architecture grounded on an alternative approach to the basic unit of the neural network. This approach achieves optimization in the calculations and gives rise to an alternative way to solve the problems of the vanishing and exploding gradient. The main issue of the article is the usage of the deep stacked neo-fuzzy system, which uses a generalized neo-fuzzy neuron to optimize the learning process. This approach is non-standard from a theoretical point of view, so the paper presents the necessary mathematical calculations and describes all the intricacies of using this architecture from a practical point of view. From a theoretical point, the network learning process is fully disclosed. Derived all necessary calculations for the use of the backpropagation algorithm for network training. A feature of the network is the rapid calculation of the derivative for the activation functions of neurons. This is achieved through the use of fuzzy membership functions. The paper shows that the derivative of such function is a constant, and this is a reason for the statement of increasing in the optimization rate in comparison with neural networks which use neurons with more common activation functions (ReLU, sigmoid). The paper highlights the main points that can be improved in further theoretical developments on this topic. In general, these issues are related to the calculation of the activation function. The proposed methods cope with these points and allow approximation using the network, but the authors already have theoretical justifications for improving the speed and approximation properties of the network. The results of the comparison of the proposed network with standard neural network architectures are shown

Download Full-text

Machine-learning in astronomy

Proceedings of the International Astronomical Union ◽

10.1017/s1743921314013672 ◽

2014 ◽

Vol 10 (S306) ◽

pp. 279-287 ◽

Cited By ~ 2

Author(s):

Michael Hobson ◽

Philip Graff ◽

Farhan Feroz ◽

Anthony Lasenby

Keyword(s):

Neural Network ◽

Machine Learning ◽

Neural Networks ◽

Gamma Ray ◽

Neural Network Training ◽

Training Algorithm ◽

Data Description ◽

Astronomical Data ◽

Machine Learning Methods ◽

Network Training

AbstractMachine-learning methods may be used to perform many tasks required in the analysis of astronomical data, including: data description and interpretation, pattern recognition, prediction, classification, compression, inference and many more. An intuitive and well-established approach to machine learning is the use of artificial neural networks (NNs), which consist of a group of interconnected nodes, each of which processes information that it receives and then passes this product on to other nodes via weighted connections. In particular, I discuss the first public release of the generic neural network training algorithm, calledSkyNet, and demonstrate its application to astronomical problems focusing on its use in the BAMBI package for accelerated Bayesian inference in cosmology, and the identification of gamma-ray bursters. TheSkyNetand BAMBI packages, which are fully parallelised using MPI, are available athttp://www.mrao.cam.ac.uk/software/.

Download Full-text

Parallelization of Neural Network Training for NLP with Hogwild!

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0036 ◽

2017 ◽

Vol 109 (1) ◽

pp. 29-38 ◽

Cited By ~ 2

Author(s):

Valentin Deyringer ◽

Alexander Fraser ◽

Helmut Schmid ◽

Tsuyoshi Okita

Keyword(s):

Neural Network ◽

Neural Networks ◽

Suitable Method ◽

Neural Network Training ◽

Training Time ◽

Pos Tagging ◽

Network Training ◽

Speed Up

Abstract Neural Networks are prevalent in todays NLP research. Despite their success for different tasks, training time is relatively long. We use Hogwild! to counteract this phenomenon and show that it is a suitable method to speed up training Neural Networks of different architectures and complexity. For POS tagging and translation we report considerable speedups of training, especially for the latter. We show that Hogwild! can be an important tool for training complex NLP architectures.

Download Full-text

Simultaneous Approximations of Polynomials and Derivatives and Their Applications to Neural Networks

Neural Computation ◽

10.1162/neco.2008.03-07-494 ◽

2008 ◽

Vol 20 (11) ◽

pp. 2757-2791 ◽

Cited By ~ 7

Author(s):

Yoshifusa Ito

Keyword(s):

Neural Network ◽

Neural Networks ◽

Uniform Norm ◽

Degree Of Approximation ◽

Probability Measures ◽

Compact Sets ◽

Network Parameters ◽

Whole Space ◽

Hidden Layer ◽

Approximating Polynomials

We have constructed one-hidden-layer neural networks capable of approximating polynomials and their derivatives simultaneously. Generally, optimizing neural network parameters to be trained at later steps of the BP training is more difficult than optimizing those to be trained at the first step. Taking into account this fact, we suppressed the number of parameters of the former type. We measure degree of approximation in both the uniform norm on compact sets and the Lp-norm on the whole space with respect to probability measures.

Download Full-text

Spectral technique for hidden layer neural network training

Pattern Recognition Letters ◽

10.1016/s0167-8655(97)00079-2 ◽

1997 ◽

Vol 18 (8) ◽

pp. 723-731 ◽

Cited By ~ 12

Author(s):

Terry Windeatt ◽

Robert Tebbs

Keyword(s):

Neural Network ◽

Neural Network Training ◽

Spectral Technique ◽

Network Training ◽

Hidden Layer

Download Full-text

NEW METHOD OF NEURON DESIGN BASED ON DISCRETE Z-FUNCTION TO ADAPT THE CHANGE OF INTEGRATED VEHICLE STABILITY CONTROL ORDER

International Journal of Computational Intelligence and Applications ◽

10.1142/s146902680900259x ◽

2009 ◽

Vol 08 (03) ◽

pp. 253-285 ◽

Cited By ~ 1

Author(s):

M. HARLY ◽

I. N. SUTANTRA ◽

H. P. MAURIDHI

Keyword(s):

Neural Network ◽

Neural Networks ◽

Back Propagation ◽

Activation Function ◽

Zero Order ◽

Vehicle Stability ◽

Adaptive Control System ◽

Control Performance ◽

Fixed Order ◽

Hidden Layer

Fixed order neural networks (FONN), such as high order neural network (HONN), in which its architecture is developed from zero order of activation function and joint weight, regulates only the number of weight and their value. As a result, this network only produces a fixed order model or control level. These obstacles, which affect preceeding architectures, have been performing finite ability to adapt uncertainty character of real world plant, such as driving dynamics and its desired control performance. This paper introduces a new concept of neural network neuron. In this matter, exploiting discrete z-function builds new neuron activation. Instead of zero order joint weight matrices, the discrete z-function weight matrix will be provided to realize uncertainty or undetermined real word plant and desired adaptive control system that their order has probably been changing. Instead of using bias, an initial condition value is developed. Neural networks using new neurons is called Varied Order Neural Network (VONN). For optimization process, updating order, coefficient and initial value of node activation function uses GA; while updating joint weight, it applies both back propagation (combined LSE-gauss Newton) and NPSO. To estimate the number of hidden layer, constructive back propagation (CBP) was also applied. Thorough simulation was conducted to compare the control performance between FONN and MONN. In order to control, vehicle stability was equipped by electronics stability program (ESP), electronics four wheel steering (4-EWS), and active suspension (AS). 2000, 4000, 6000, 8000 data that are from TODS, a hidden layer, 3 input nodes, 3 output nodes were provided to train and test the network of both the uncertainty model and its adaptive control system. The result of simulation, therefore, shows that stability parameter such as yaw rate error, vehicle side slip error, and rolling angle error produces better performance control in the form of smaller performance index using FDNN than those using MONN.

Download Full-text

Incremental Neural Network Training for Medical Diagnosis

Encyclopedia of Healthcare Information Systems ◽

10.4018/978-1-59904-889-5.ch091 ◽

2008 ◽

pp. 720-731

Author(s):

Sheng-Uei Guan ◽

Ji Hua Ang ◽

Kay Chen Tan ◽

Abdullah Al Mamun

Keyword(s):

Neural Network ◽

Neural Networks ◽

Incremental Learning ◽

Medical Diagnosis ◽

Incremental Algorithm ◽

Neural Network Training ◽

Network Training ◽

Novel Method ◽

Partitioning Algorithms

This chapter proposes a novel method of incremental interference-free neural network training (IIFNNT) for medical datasets, which takes into consideration the interference each attribute has on the others. A specially designed network is used to determine if two attributes interfere with each other, after which the attributes are partitioned using some partitioning algorithms. These algorithms make sure that attributes beneficial to each other are trained in the same batch, thus sharing the same subnetwork while interfering attributes are separated to reduce interference. There are several incremental neural networks available in literature (Guan & Li, 2001; Su, Guan & Yeo, 2001). The architecture of IIFNNT employed some incremental algorithm: the ILIA1 and ILIA2 (incremental learning with respect to new incoming attributes) (Guan & Li, 2001).

Download Full-text

Soil Moisture Retrieval Based on ASAR Data and Genetic Neural Networks

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.500.198 ◽

2012 ◽

Vol 500 ◽

pp. 198-203

Author(s):

Chang Lin Xiao ◽

Yan Chen ◽

Lina Liu ◽

Ling Tong ◽

Ming Quan Jia

Keyword(s):

Neural Network ◽

Genetic Algorithm ◽

Neural Networks ◽

Soil Moisture ◽

Training Data ◽

Neural Network Training ◽

Practical Test ◽

Network Training

Genetic Algorithm can further optimize Neural Networks, and this optimized Algorithm has been used in many fields and made better results, but currently, it have not been used in inversion parameters. This paper used backscattering coefficients from ASAR, AIEM model to calculate data as neural network training data and through Genetic Algorithm Neural Networks to retrieve soil moisture. Finally compared with practical test and shows the validity and superiority of the Genetic Algorithm Neural Networks.

Download Full-text