The Interchangeability of Learning Rate and Gain in Backpropagation Neural Networks

1996 ◽  
Vol 8 (2) ◽  
pp. 451-460 ◽  
Author(s):  
Georg Thimm ◽  
Perry Moerland ◽  
Emile Fiesler

The backpropagation algorithm is widely used for training multilayer neural networks. In this publication, the gain of its activation function(s) is investigated. Specifically, it is proven that changing the gain of the activation function is equivalent to changing the learning rate and the weights. This simplifies the backpropagation learning rule by eliminating one of its parameters. The theorem can be extended to hold for some well-known variations on the backpropagation algorithm, such as using a momentum term, flat-spot elimination, or adaptive gain. Furthermore, it is successfully applied to compensate for the nonstandard gain of optical sigmoids in optical neural networks.
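A minimal numerical sketch of this equivalence (a toy Python example, not the paper's own notation or proof): a network trained with activation gain β and learning rate η is compared with a gain-1 network whose initial weights are multiplied by β and whose learning rate is multiplied by β², which is the scaling the theorem suggests; after a backpropagation step the two weight sets still differ by exactly the factor β, so the networks stay functionally identical.

```python
import numpy as np

def sigmoid(x, gain=1.0):
    return 1.0 / (1.0 + np.exp(-gain * x))

def train_step(W1, W2, x, t, eta, gain):
    """One backpropagation step for a 1-hidden-layer net whose sigmoids have slope `gain`."""
    h = sigmoid(W1 @ x, gain)
    o = sigmoid(W2 @ h, gain)
    d2 = (o - t) * gain * o * (1 - o)          # output delta (gain enters via the chain rule)
    d1 = (W2.T @ d2) * gain * h * (1 - h)      # hidden delta
    return W1 - eta * np.outer(d1, x), W2 - eta * np.outer(d2, h)

rng = np.random.default_rng(0)
x, t = rng.random(3), rng.random(2)
W1, W2 = rng.standard_normal((4, 3)), rng.standard_normal((2, 4))
beta, eta = 2.5, 0.1

# Network A: gain beta, learning rate eta, weights W.
A1, A2 = train_step(W1, W2, x, t, eta, gain=beta)
# Network B: gain 1, weights scaled by beta, learning rate scaled by beta**2.
B1, B2 = train_step(beta * W1, beta * W2, x, t, eta * beta**2, gain=1.0)

# B's weights remain exactly beta times A's weights after the update.
print(np.allclose(B1, beta * A1), np.allclose(B2, beta * A2))  # True True
```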

2017 ◽  
Vol 237 ◽  
pp. 193-199 ◽  
Author(s):  
D. Negrov ◽  
I. Karandashev ◽  
V. Shakirov ◽  
Yu. Matveyev ◽  
W. Dunin-Barkowski ◽  
...  

1995 ◽  
Vol 03 (04) ◽  
pp. 1177-1191 ◽  
Author(s):  
HÉLÈNE PAUGAM-MOISY

This article is a survey of recent advances on multilayer neural networks. The first section is a short summary of multilayer neural networks: their history, their architecture, and their learning rule, the well-known back-propagation. In the following section, several theorems are cited which present one-hidden-layer neural networks as universal approximators. The next section points out that two hidden layers are often required for exactly realizing d-dimensional dichotomies. Defining the frontier between one-hidden-layer and two-hidden-layer networks is still an open problem. Several bounds on the size of a multilayer network that learns from examples are presented, and we emphasize that, even if everything can be done with only one hidden layer, things can often be done better with two or more hidden layers. Finally, this assertion is supported by the behaviour of multilayer neural networks in two applications: prediction of pollution and odor recognition modelling.


1999 ◽  
Vol 11 (5) ◽  
pp. 1069-1077 ◽  
Author(s):  
Danilo P. Mandic ◽  
Jonathon A. Chambers

A relationship is provided between the learning rate η in the learning algorithm and the slope β in the nonlinear activation function, for a class of recurrent neural networks (RNNs) trained by the real-time recurrent learning (RTRL) algorithm. It is shown that an arbitrary RNN can be obtained from a referent RNN, with some deterministic rules imposed on its weights and the learning rate. Such relationships reduce the number of degrees of freedom when solving the nonlinear optimization task of finding the optimal RNN parameters.
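To make the flavor of such a relationship concrete, here is the single-neuron feedforward analogue as a worked equation (an illustrative sketch under simplified assumptions, not the paper's RTRL derivation): absorbing the slope β into the weights leaves the forward pass unchanged, and requiring the two parameterizations to stay aligned after a gradient step fixes how the learning rate must be rescaled.

```latex
% Forward pass: the slope can be absorbed into a rescaled weight vector.
\Phi\!\left(\beta\, w^{\top}x\right) = \Phi\!\left(\tilde{w}^{\top}x\right),
\qquad \tilde{w} \equiv \beta w .
% Chain rule: the two gradients differ by a factor of beta.
\frac{\partial E}{\partial \tilde{w}} = \frac{1}{\beta}\,\frac{\partial E}{\partial w}
\quad\Longrightarrow\quad
\tilde{w} - \tilde{\eta}\,\frac{\partial E}{\partial \tilde{w}}
= \beta\!\left(w - \eta\,\frac{\partial E}{\partial w}\right)
\;\;\text{exactly when}\;\; \tilde{\eta} = \beta^{2}\eta .
```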


10.28945/2931 ◽  
2005 ◽  
Author(s):  
Mohammed A. Otair ◽  
Walid A. Salameh

There are many successful applications of backpropagation (BP) for training multilayer neural networks. However, it has many shortcomings: learning often takes a long time to converge, and it may fall into local minima. One possible remedy for escaping local minima is to use a very small learning rate, but this slows down the learning process. The algorithm proposed in this study is used for training multilayer neural networks with a very small learning rate, especially when the training set is large. It can be applied in a generic manner to any network size that uses the backpropagation algorithm. The paper describes the proposed algorithm and how it can improve the performance of back-propagation (BP). The feasibility of the proposed algorithm is shown through a number of experiments on different network architectures.


2021 ◽  
pp. 1-20
Author(s):  
Shao-Qun Zhang ◽  
Zhi-Hua Zhou

Abstract Current neural networks are mostly built on the MP model, which usually formulates the neuron as executing an activation function on the real-valued weighted aggregation of signals received from other neurons. This letter proposes the flexible transmitter (FT) model, a novel bio-plausible neuron model with flexible synaptic plasticity. The FT model employs a pair of parameters to model the neurotransmitters between neurons and introduces a neuron-exclusive variable to record the regulated neurotrophin density. Thus, the FT model can be formulated as a two-variable, two-valued function, taking the commonly used MP neuron model as its particular case. This modeling approach makes the FT model biologically more realistic and capable of handling complicated data, even spatiotemporal data. To exhibit its power and potential, we present the flexible transmitter network (FTNet), which is built on the most common fully connected feedforward architecture, taking the FT model as its basic building block. FTNet allows gradient calculation and can be implemented by an improved backpropagation algorithm in the complex-valued domain. Experiments on a broad range of tasks show that FTNet has power and potential in processing spatiotemporal data. This study provides an alternative basic building block for neural networks and exhibits the feasibility of developing artificial neural networks with neuronal plasticity.
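As an illustration of the "two-variable, two-valued" idea, below is one plausible toy reading of an FT-style neuron in Python (an assumption made here for exposition, not the paper's exact formulation): the pair of parameters weights the incoming signal and the neuron's own memory variable, a complex-valued activation maps that pair to an output/memory pair, and switching the memory pathway off recovers an ordinary MP-style unit.

```python
import numpy as np

def ft_neuron_step(x, m_prev, w, v):
    """One step of a toy flexible-transmitter-style neuron (illustrative only).

    x      : real-valued inputs received from other neurons
    m_prev : this neuron's own memory variable from the previous step
             (a stand-in for the regulated neurotrophin density)
    w, v   : the pair of transmitter parameters

    Returns (s, m): the transmitted signal and the updated memory variable.
    Packing the two quantities into one complex number lets a single
    complex-valued activation act as a two-variable, two-valued function.
    """
    z = np.dot(w, x) + 1j * (v * m_prev)   # aggregate input + own memory
    out = np.tanh(z)                        # complex-valued activation (an assumption)
    return out.real, out.imag               # signal sent onward, memory kept locally

# With v = 0 the memory pathway vanishes and the unit reduces to an ordinary
# MP-style neuron, s = tanh(w . x), matching the "particular case" claim.
s, m = ft_neuron_step(np.array([0.2, -0.5, 1.0]), 0.1, np.array([0.4, 0.1, -0.3]), 0.7)
```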


2021 ◽  
Author(s):  
Ting Yu ◽  
Xiaoxuan Ma ◽  
Ernest Pastor ◽  
Jonathan George ◽  
Simon Wall ◽  
...  

Abstract Deep learning algorithms are revolutionising many aspects of modern life. Typically, they are implemented in CMOS-based hardware that is severely limited by memory access times and inefficient data routing. All-optical neural networks without any electro-optic conversions could alleviate these shortcomings. However, an all-optical nonlinear activation function, a vital building block for optical neural networks, has yet to be implemented efficiently on-chip. Here, we introduce and demonstrate both optical synapse weighting and all-optical nonlinear thresholding using two different effects in a single chalcogenide material. We show how the structural phase transition in a wide-bandgap phase-change material enables storing the neural network weights in non-volatile photonic memory, whilst resonant bond destabilisation is used as a nonlinear activation threshold without changing the material. These two different transitions within chalcogenides enable programmable neural networks with near-zero static power consumption once trained, as well as picosecond-scale inference delays that are not limited by the wire charging that constrains electrical circuits; for instance, we show that nanosecond-order weight programming and near-instantaneous weight updates enable accurate inference within 20 picoseconds in a three-layer all-optical neural network. Optical neural networks that bypass electro-optic conversion altogether hold promise for network-edge machine learning applications where real-time decision-making is critical, such as autonomous vehicles or the signal pre-processing of LIDAR in navigation systems.


Author(s):  
Anjar Wanto ◽  
Agus Perdana Windarto ◽  
Dedy Hartama ◽  
Iin Parlina

Artificial Neural Networks (ANNs) are often used to solve forecasting cases, as in this study. The artificial neural network used here employs the backpropagation algorithm. The study focuses on forecasting population density by district in Simalungun Regency, Indonesia, for 2010-2015. The data source is the Central Bureau of Statistics of Simalungun Regency. Future population density is forecast using the backpropagation algorithm with a binary sigmoid function (logsig) in the hidden layers and a linear identity function (purelin) at the output, using five network architecture models: 3-5-1, 3-10-1, 3-5-10-1, 3-5-15-1, and 3-10-15-1. The results of the five architectural models using backpropagation neural networks with the binary sigmoid and identity functions vary greatly, but the best is the 3-5-1 model, with an accuracy of 94%, an MSE of 0.0025448, and convergence after 6843 epochs. Thus, the use of the binary sigmoid activation function (logsig) and the identity function (purelin) in backpropagation neural networks for forecasting population density works very well, as evidenced by the high accuracy achieved.
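For readers unfamiliar with the "logsig/purelin" and "3-5-1" notation, the sketch below shows such a topology on made-up data (a hypothetical stand-in, not the authors' dataset or software): three inputs, one hidden layer of five logistic units, and a single linear output trained by gradient descent.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy stand-in data: three input features per sample (e.g. the densities of the
# three preceding years) and one target value (the next year's density).
rng = np.random.default_rng(0)
X = rng.random((60, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + 0.05 * rng.standard_normal(60)

# "3-5-1": 3 inputs, one hidden layer of 5 logistic (logsig) units, and a single
# linear (purelin) output, trained with plain stochastic gradient descent.
model = MLPRegressor(hidden_layer_sizes=(5,), activation="logistic",
                     solver="sgd", learning_rate_init=0.01,
                     max_iter=10000, tol=1e-6, random_state=0)
model.fit(X, y)
print("training MSE:", np.mean((model.predict(X) - y) ** 2))
```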


2009 ◽  
Vol 2009 ◽  
pp. 1-16 ◽  
Author(s):  
Huisheng Zhang ◽  
Chao Zhang ◽  
Wei Wu

The batch split-complex backpropagation (BSCBP) algorithm for training complex-valued neural networks is considered. For a constant learning rate, it is proved that the error function of the BSCBP algorithm is monotone during the training iteration process and that the gradient of the error function tends to zero. Under an additional moderate condition, the weight sequence itself is also proved to be convergent. A numerical example is given to support the theoretical analysis.
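A brief sketch of what "split-complex" means in this setting (a toy single-unit batch example with a constant learning rate, assuming a logistic activation; it is not the network analysed in the paper): the real activation is applied separately to the real and imaginary parts of the complex net input, and each batch gradient step updates the complex weight.

```python
import numpy as np

def split_sigmoid(z):
    """Split-complex activation: a real sigmoid applied separately to the
    real and imaginary parts of the complex net input."""
    sig = lambda t: 1.0 / (1.0 + np.exp(-t))
    return sig(z.real) + 1j * sig(z.imag)

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))  # one batch of complex inputs
t = split_sigmoid(X @ np.array([0.3 - 0.2j, -0.1 + 0.4j]))          # toy targets from known weights
w = np.zeros(2, dtype=complex)
eta = 0.2                                                           # constant learning rate

for _ in range(2000):                      # batch-mode gradient descent
    y = split_sigmoid(X @ w)
    e = y - t
    # Gradient of E = 0.5 * sum |y - t|^2 with respect to the real and imaginary
    # parts of w, recombined into a single complex update direction.
    dz = e.real * y.real * (1 - y.real) + 1j * (e.imag * y.imag * (1 - y.imag))
    w -= eta * (X.conj().T @ dz)

print("final batch error:", 0.5 * np.sum(np.abs(split_sigmoid(X @ w) - t) ** 2))
```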

