A Convergence Result for Learning in Recurrent Neural Networks

1994 ◽  
Vol 6 (3) ◽  
pp. 420-440 ◽  
Author(s):  
Chung-Ming Kuan ◽  
Kurt Hornik ◽  
Halbert White

We give a rigorous analysis of the convergence properties of a backpropagation algorithm for recurrent networks containing either output or hidden-layer recurrence. The conditions permit data generated by stochastic processes with considerable dependence. Restrictions are offered that may help ensure convergence of the network parameters to a local optimum, as some simulations illustrate.
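
As a rough illustration of the setting (not the authors' algorithm or conditions), the sketch below runs constant-step-size stochastic gradient updates for a small network with output recurrence, y_t = tanh(W x_t + V y_{t-1}), treating the fed-back output as a constant during each update (a one-step truncation); the data and target are toy placeholders.

    import numpy as np

    # Hedged sketch: one-step-truncated gradient updates for a network with
    # output recurrence, y_t = tanh(W x_t + V y_{t-1}).  Toy data; not the
    # algorithm analyzed in the paper, only the general shape of such updates.
    rng = np.random.default_rng(0)
    n_in, n_out = 3, 2
    W = 0.1 * rng.standard_normal((n_out, n_in))
    V = 0.1 * rng.standard_normal((n_out, n_out))
    y_prev = np.zeros(n_out)
    eta = 0.01  # constant learning rate

    for t in range(500):
        x_t = rng.standard_normal(n_in)
        target = np.tanh(x_t[:n_out])              # placeholder target
        y_t = np.tanh(W @ x_t + V @ y_prev)
        delta = (y_t - target) * (1.0 - y_t ** 2)  # output error times tanh'
        W -= eta * np.outer(delta, x_t)            # gradient w.r.t. W
        V -= eta * np.outer(delta, y_prev)         # y_prev treated as constant
        y_prev = y_t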

2004 ◽  
Vol 213 ◽  
pp. 483-486
Author(s):  
David Brodrick ◽  
Douglas Taylor ◽  
Joachim Diederich

A recurrent neural network was trained to detect the time-frequency domain signature of narrowband radio signals against a background of astronomical noise. The objective was to investigate the use of recurrent networks for signal detection in the Search for Extra-Terrestrial Intelligence, though the problem is closely analogous to the detection of some classes of Radio Frequency Interference in radio astronomy.
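
No implementation details are given in the abstract; as an illustration of the kind of input such a detector might see, the sketch below (with an assumed sample rate and a toy narrowband tone) turns a noisy time series into log-spectrogram frames that a recurrent classifier could consume one frame per time step.

    import numpy as np
    from scipy.signal import spectrogram

    # Illustrative preprocessing only; the paper's actual pipeline is not given.
    fs = 8000.0                                   # assumed sample rate (Hz)
    t = np.arange(0, 2.0, 1.0 / fs)
    background = np.random.randn(t.size)          # broadband noise stand-in
    narrowband = 0.05 * np.sin(2 * np.pi * 1500.0 * t)  # weak narrowband tone
    x = background + narrowband

    f, frame_times, Sxx = spectrogram(x, fs=fs, nperseg=256)
    frames = np.log1p(Sxx).T                      # shape (n_frames, n_freq_bins)
    # frames[k] is the time-frequency feature vector for step k; an RNN would
    # consume these sequentially and output a signal-present / noise-only label.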


2003 ◽  
Vol 15 (8) ◽  
pp. 1897-1929 ◽  
Author(s):  
Barbara Hammer ◽  
Peter Tiňo

Recent experimental studies indicate that recurrent neural networks initialized with “small” weights are inherently biased toward definite memory machines (Tiňo, Čerňanský, & Beňušková, 2002a, 2002b). This article establishes a theoretical counterpart: the transition function of a recurrent network with small weights and a squashing activation function is a contraction. We prove that recurrent networks with a contractive transition function can be approximated arbitrarily well on input sequences of unbounded length by a definite memory machine. Conversely, every definite memory machine can be simulated by a recurrent network with a contractive transition function. Hence, initialization with small weights induces an architectural bias into learning with recurrent neural networks. This bias might have benefits from the point of view of statistical learning theory: it emphasizes one possible region of the weight space where generalization ability can be formally proved. It is well known that standard recurrent neural networks are not distribution independent learnable in the probably approximately correct (PAC) sense if arbitrary precision and inputs are considered. We prove that recurrent networks with a contractive transition function and a fixed contraction parameter fulfill the so-called distribution independent uniform convergence of empirical distances property and hence, unlike general recurrent networks, are distribution independent PAC learnable.
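
A small numerical check of the central observation, assuming a tanh transition f(s, x) = tanh(W s + U x + b): since tanh is 1-Lipschitz, the distance between two states shrinks by at least the spectral norm of W after one step, so "small" weights (norm below one) make the transition a contraction. The dimensions and values below are arbitrary.

    import numpy as np

    # For f(s, x) = tanh(W s + U x + b), tanh is 1-Lipschitz, so
    # ||f(s1, x) - f(s2, x)|| <= ||W||_2 * ||s1 - s2||.
    # With small weights the spectral norm ||W||_2 < 1 and f contracts in s.
    rng = np.random.default_rng(1)
    n_state, n_in = 8, 4
    W = 0.05 * rng.standard_normal((n_state, n_state))   # "small" recurrent weights
    U = rng.standard_normal((n_state, n_in))
    b = rng.standard_normal(n_state)

    contraction = np.linalg.norm(W, 2)                   # Lipschitz constant in s
    s1, s2 = rng.standard_normal(n_state), rng.standard_normal(n_state)
    x = rng.standard_normal(n_in)
    d_before = np.linalg.norm(s1 - s2)
    d_after = np.linalg.norm(np.tanh(W @ s1 + U @ x + b) - np.tanh(W @ s2 + U @ x + b))
    print(contraction, d_after <= contraction * d_before)  # contraction < 1, check holds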


2021 ◽  
Author(s):  
Bojian Yin ◽  
Federico Corradi ◽  
Sander M. Bohté

Inspired by more detailed modeling of biological neurons, spiking neural networks (SNNs) have been investigated both as more biologically plausible and potentially more powerful models of neural computation, and with the aim of capturing the energy efficiency of biological neurons; the performance of such networks has, however, remained lacking compared to classical artificial neural networks (ANNs). Here, we demonstrate how a novel surrogate gradient combined with recurrent networks of tunable and adaptive spiking neurons yields state-of-the-art performance for SNNs on challenging time-domain benchmarks such as speech and gesture recognition. This also exceeds the performance of standard classical recurrent neural networks (RNNs) and approaches that of the best modern ANNs. As these SNNs exhibit sparse spiking, we show that they are theoretically one to three orders of magnitude more computationally efficient than RNNs with comparable performance. Together, this positions SNNs as an attractive solution for AI hardware implementations.
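
The paper's specific surrogate gradient and adaptive neuron model are not reproduced here; the sketch below only illustrates the generic surrogate-gradient idea, a hard threshold in the forward pass paired with a smooth pseudo-derivative (a fast-sigmoid form is assumed) in the backward pass, for a single leaky integrate-and-fire unit.

    import numpy as np

    # Generic surrogate-gradient sketch (not the paper's exact surrogate or
    # adaptive neuron model): the forward pass emits a hard spike whose true
    # derivative is zero almost everywhere; backprop substitutes a smooth
    # pseudo-derivative peaked at the threshold.
    def spike(v, theta=1.0):
        return (v > theta).astype(float)

    def surrogate_grad(v, theta=1.0, beta=10.0):
        # assumed fast-sigmoid pseudo-derivative of the spike nonlinearity
        return 1.0 / (1.0 + beta * np.abs(v - theta)) ** 2

    def lif_step(v, x, w, tau=0.9):
        v = tau * v + w * x        # leaky integration of the input current
        s = spike(v)               # non-differentiable spike emission
        v = v - s                  # soft reset after a spike
        return v, s

    # During backpropagation through time, every occurrence of d s / d v is
    # replaced by surrogate_grad(v) so that errors can flow through spikes.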


2009 ◽  
Vol 19 (02) ◽  
pp. 115-125 ◽  
Author(s):  
GHEORGHE PUSCASU ◽  
BOGDAN CODRES ◽  
ALEXANDRU STANCU ◽  
GABRIEL MURARIU

A novel approach for the identification of complex nonlinear systems based on internal recurrent neural networks (IRNN) is proposed in this paper. The computational complexity of neural identification can be greatly reduced if the whole system is decomposed into several subsystems. This approach employs internal state estimation when no sensor measurements of the system states are available. A modified backpropagation algorithm is introduced in order to train the IRNN for nonlinear system identification. The performance of the proposed design approach is demonstrated on a car simulator case study.
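
The IRNN structure and modified backpropagation rule are not reproduced here; the sketch below only illustrates the decomposition idea, giving each subsystem its own small recurrent model whose hidden state stands in for the unmeasured physical states. Class and subsystem names are hypothetical.

    import numpy as np

    # Sketch of the decomposition idea only (not the authors' IRNN design):
    # each subsystem gets its own small recurrent model; its hidden state
    # acts as an internal estimate of the unmeasured subsystem states.
    class SubsystemModel:
        def __init__(self, n_in, n_state, n_out, seed=0):
            rng = np.random.default_rng(seed)
            self.Wx = 0.1 * rng.standard_normal((n_state, n_in))
            self.Ws = 0.1 * rng.standard_normal((n_state, n_state))
            self.Wo = 0.1 * rng.standard_normal((n_out, n_state))
            self.s = np.zeros(n_state)          # internal state estimate

        def step(self, u):
            self.s = np.tanh(self.Wx @ u + self.Ws @ self.s)
            return self.Wo @ self.s             # predicted subsystem output

    # e.g. a car split into hypothetical drivetrain and steering submodels,
    # each trained on its own input/output data.
    drivetrain = SubsystemModel(n_in=2, n_state=4, n_out=1, seed=1)
    steering = SubsystemModel(n_in=2, n_state=3, n_out=1, seed=2)
    speed = drivetrain.step(np.array([0.3, -0.1]))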


Acta Numerica ◽  
1994 ◽  
Vol 3 ◽  
pp. 145-202 ◽  
Author(s):  
S.W. Ellacott

This article starts with a brief introduction to neural networks for those unfamiliar with the basic concepts, together with a very brief overview of mathematical approaches to the subject. This is followed by a more detailed look at three areas of research which are of particular interest to numerical analysts. The first area is approximation theory. If K is a compact set in ℝ^n, for some n, then it is proved that a semilinear feedforward network with one hidden layer can uniformly approximate any continuous function in C(K) to any required accuracy. A discussion of known results and open questions on the degree of approximation is included. We also consider the relevance of radial basis functions to neural networks. The second area considered is that of learning algorithms. A detailed analysis of one popular algorithm (the delta rule) will be given, indicating why one implementation leads to a stable numerical process, whereas an initially attractive variant (essentially a form of steepest descent) does not. Similar considerations apply to the backpropagation algorithm. The effect of filtering and other preprocessing of the input data will also be discussed systematically. Finally, some applications of neural networks to numerical computation are considered.
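
As a minimal illustration of the learning algorithm whose stability the article analyzes, the sketch below runs the delta rule in its online, pattern-by-pattern form for a single linear unit on synthetic data; the batch steepest-descent variant contrasted in the article is not shown.

    import numpy as np

    # Online delta rule for a single linear unit: after each pattern x with
    # desired output d, update w <- w + eta * (d - w.x) * x.  Synthetic data.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))              # input patterns
    w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
    d = X @ w_true                                 # desired outputs

    w = np.zeros(5)
    eta = 0.05                                     # small step keeps the process stable
    for epoch in range(20):
        for x, target in zip(X, d):
            w += eta * (target - w @ x) * x        # delta rule update

    print(np.round(w, 2))                          # close to w_true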


2002 ◽  
Vol 14 (8) ◽  
pp. 1907-1927 ◽  
Author(s):  
Alex Aussem

This article extends previous analysis of the gradient decay to a class of discrete-time fully recurrent networks, called dynamical recurrent neural networks, obtained by modeling synapses as finite impulse response (FIR) filters instead of multiplicative scalars. Using elementary matrix manipulations, we provide an upper bound on the norm of the weight matrix, ensuring that the gradient vector, when propagated in a reverse manner in time through the error-propagation network, decays exponentially to zero. This bound applies to all recurrent FIR architecture proposals, as well as fixed-point recurrent networks, regardless of delay and connectivity. In addition, we show that the computational overhead of the learning algorithm can be reduced drastically by taking advantage of the exponential decay of the gradient.
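
A quick numeric illustration of the phenomenon the bound formalizes, in the plain fixed-point recurrent case rather than the FIR setting of the article: when the spectral norm of the recurrent weight matrix is below one (and the activation derivative is bounded by one), the backpropagated error shrinks geometrically with the time lag.

    import numpy as np

    # Backpropagating an error vector k steps multiplies it by k Jacobians,
    # each of norm at most ||W||_2 (activation derivative bounded by 1 assumed),
    # so its norm decays roughly like ||W||_2 ** k once ||W||_2 < 1.
    rng = np.random.default_rng(0)
    n = 10
    W = rng.standard_normal((n, n))
    W *= 0.8 / np.linalg.norm(W, 2)        # rescale so the spectral norm is 0.8

    g = rng.standard_normal(n)             # error signal at the final time step
    norms = []
    for k in range(30):
        norms.append(np.linalg.norm(g))
        g = W.T @ g                        # propagate back one time step

    print(norms[0], norms[10], norms[29])  # approximately geometric decay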


1989 ◽  
Vol 1 (4) ◽  
pp. 552-558 ◽  
Author(s):  
David Zipser

An algorithm, called RTRL, for training fully recurrent neural networks has recently been studied by Williams and Zipser (1989a, b). Whereas RTRL has been shown to have great power and generality, it has the disadvantage of requiring a great deal of computation time. A technique is described here for reducing the amount of computation required by RTRL without changing the connectivity of the networks. This is accomplished by dividing the original network into subnets for the purpose of error propagation while leaving them undivided for activity propagation. An example is given of a 12-unit network that learns to be the finite-state part of a Turing machine and runs 10 times faster using the subgrouping strategy than the original algorithm.
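
A back-of-envelope comparison of the per-step operation counts, under one common accounting of RTRL's cost (assumed here for illustration; the paper itself reports the empirical 10x speedup rather than this formula): full RTRL updates n x n^2 sensitivities with an O(n)-term sum each, while subgrouping restricts both the tracked weights and the recurrent sum to each subnet.

    # Back-of-envelope operation counts per time step; one common accounting,
    # assumed for illustration rather than taken from the paper.
    def rtrl_ops(n):
        # n units, n*n weights, n^3 sensitivities, each updated via an O(n) sum
        return n ** 4

    def subgrouped_rtrl_ops(n, g):
        # g subnets of m = n // g units: each tracks its m units' sensitivities
        # w.r.t. its own m*n incoming weights, with the recurrent sum limited
        # to the m units of the subnet.
        m = n // g
        return g * (m * (m * n) * m)

    n_units, n_groups = 12, 3
    print(rtrl_ops(n_units),                       # 20736
          subgrouped_rtrl_ops(n_units, n_groups),  # 2304
          rtrl_ops(n_units) / subgrouped_rtrl_ops(n_units, n_groups))  # 9.0
    # The ratio grows as g**2, broadly consistent with the reported ~10x speedup.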


2016 ◽  
Author(s):  
Thomas Miconi

Neural activity during cognitive tasks exhibits complex dynamics that flexibly encode task-relevant variables. Chaotic recurrent networks, which spontaneously generate rich dynamics, have been proposed as a model of cortical computation during cognitive tasks. However, existing methods for training these networks are either biologically implausible or require a continuous, real-time error signal to guide learning. Here we show that a biologically plausible learning rule can train such recurrent networks, guided solely by delayed, phasic rewards at the end of each trial. Networks endowed with this learning rule can successfully learn nontrivial tasks requiring flexible (context-dependent) associations, memory maintenance, nonlinear mixed selectivities, and coordination among multiple outputs. The resulting networks replicate complex dynamics previously observed in animal cortex, such as dynamic encoding of task features and selective integration of sensory inputs. We conclude that recurrent neural networks offer a plausible model of cortical dynamics during both learning and performance of flexible behavior.
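
The exact plasticity rule is not reproduced here; the sketch below only illustrates a rule of the same family, node-perturbation with an eligibility trace that is gated by a single delayed reward (relative to a running baseline) at the end of each trial. The network, perturbation size, and toy reward are arbitrary choices.

    import numpy as np

    # Delayed-reward, node-perturbation-style sketch (illustrative, not the
    # paper's rule): accumulate an eligibility trace during the trial, then
    # scale it by the end-of-trial reward minus a running baseline.
    rng = np.random.default_rng(0)
    n = 50
    W = rng.standard_normal((n, n)) / np.sqrt(n)
    r_baseline, eta = 0.0, 1e-3

    def run_trial(W):
        x = 0.1 * rng.standard_normal(n)
        trace = np.zeros_like(W)
        for t in range(100):
            noise = 0.1 * rng.standard_normal(n)   # exploratory perturbation
            x_pre = x
            x = np.tanh(W @ x_pre) + noise
            trace += np.outer(noise, x_pre)        # perturbation x presynaptic activity
        reward = -np.mean((x - 0.5) ** 2)          # toy objective: final state near 0.5
        return trace, reward

    for trial in range(200):
        trace, reward = run_trial(W)
        W += eta * (reward - r_baseline) * trace   # reward-gated update at trial end
        r_baseline += 0.05 * (reward - r_baseline) # running reward baseline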


2021 ◽  
Author(s):  
Quan Wan ◽  
Jorge A. Menendez ◽  
Bradley R. Postle

How does the brain prioritize among the contents of working memory to appropriately guide behavior? Using inverted encoding modeling (IEM), previous work (Wan et al., 2020) showed that unprioritized memory items (UMI) are actively represented in the brain, but in a “flipped”, or opposite, format compared to prioritized memory items (PMI). To gain insight into the mechanisms underlying the UMI-to-PMI representational transformation, we trained recurrent neural networks (RNNs) with an LSTM architecture to perform a 2-back working memory task. Visualization of the LSTM hidden layer activity using Principal Component Analysis (PCA) revealed that the UMI representation is rotationally remapped to that of the PMI, and this was quantified and confirmed via demixed PCA. The application of the same analyses to the EEG dataset of Wan et al. (2020) revealed similar rotational remapping between the UMI and PMI representations. These results identify rotational remapping as a candidate neural computation employed in the dynamic prioritization among the contents of working memory.
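
The training setup is not reproduced here; the sketch below only illustrates the analysis step, projecting hidden-state trajectories onto their leading principal components so that UMI- and PMI-labeled trajectories can be compared for a rotation. Shapes and the random stand-in for LSTM activity are illustrative.

    import numpy as np

    # PCA of hidden-state trajectories (illustrative shapes; random data stands
    # in for LSTM hidden activity recorded during the 2-back task).
    rng = np.random.default_rng(0)
    n_trials, n_time, n_hidden = 200, 12, 64
    H = rng.standard_normal((n_trials, n_time, n_hidden))

    mu = H.reshape(-1, n_hidden).mean(axis=0)
    X = H.reshape(-1, n_hidden) - mu             # center across all samples
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    pcs = Vt[:2]                                 # top two principal axes
    proj = (H - mu) @ pcs.T                      # (n_trials, n_time, 2) trajectories
    # Comparing the 2-D trajectories of UMI-labeled and PMI-labeled trials
    # (e.g. fitting a rotation between them) is the kind of test used to
    # quantify rotational remapping.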


2005 ◽  
Vol 15 (06) ◽  
pp. 435-443 ◽  
Author(s):  
XIAOMING CHEN ◽  
ZHENG TANG ◽  
CATHERINE VARIAPPAN ◽  
SONGSONG LI ◽  
TOSHIMI OKADA

The complex-valued backpropagation algorithm has been widely used in fields such as telecommunications, speech recognition, and image processing with Fourier transforms. However, learning frequently becomes trapped in local minima. To solve this problem and to speed up the learning process, we propose a modified error function that adds to the conventional error function a term corresponding to the hidden-layer error. The simulation results show that the proposed algorithm is capable of preventing learning from becoming stuck in local minima and of speeding up learning.
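
The exact hidden-layer term is not reproduced here; the sketch below only shows the general shape of the idea, augmenting the usual output error of a small complex-valued network with a hidden-layer penalty, E = E_out + lam * E_hidden, where the hidden-layer term used below is a placeholder.

    import numpy as np

    # Shape of the idea only: a complex-valued forward pass with a split
    # tanh activation, and a modified error E = E_out + lam * E_hidden.
    # The hidden-layer term here is a placeholder, not the paper's definition.
    rng = np.random.default_rng(0)
    W1 = 0.1 * (rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3)))
    W2 = 0.1 * (rng.standard_normal((2, 4)) + 1j * rng.standard_normal((2, 4)))

    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    t = np.array([1.0 + 0.0j, 0.0 + 1.0j])

    h_lin = W1 @ x
    h = np.tanh(h_lin.real) + 1j * np.tanh(h_lin.imag)   # split complex activation
    y = W2 @ h

    lam = 0.1
    E_out = 0.5 * np.sum(np.abs(y - t) ** 2)       # conventional output error
    E_hidden = 0.5 * np.sum(np.abs(h) ** 2)        # placeholder hidden-layer term
    E = E_out + lam * E_hidden                     # modified error function
    # Backpropagation then differentiates E, so the hidden layer receives an
    # additional error signal from the lam * E_hidden term.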

