Backpropagation Through Time
Recently Published Documents

Total documents: 50 (last five years: 12)
H-index: 12 (last five years: 3)

2021, Vol. 15
Author(s): Axel Laborieux, Maxence Ernoult, Benjamin Scellier, Yoshua Bengio, Julie Grollier, et al.

Equilibrium Propagation is a biologically inspired algorithm that trains convergent recurrent neural networks with a local learning rule. This approach constitutes a promising lead toward learning-capable neuromorphic systems and comes with strong theoretical guarantees. Equilibrium Propagation operates in two phases: the network is first left to evolve freely and is then “nudged” toward a target; the weights of the network are then updated based solely on the states of the neurons that they connect. The weight updates of Equilibrium Propagation have been shown mathematically to approach those provided by Backpropagation Through Time (BPTT), the mainstream approach to training recurrent neural networks, when nudging is performed with infinitesimally small strength. In practice, however, the standard implementation of Equilibrium Propagation does not scale to visual tasks harder than MNIST. In this work, we show that a bias in the gradient estimate of Equilibrium Propagation, inherent in the use of finite nudging, is responsible for this phenomenon, and that cancelling it allows training deep convolutional neural networks. We show that this bias can be greatly reduced by using symmetric nudging (a positive nudging and a negative one). We also generalize Equilibrium Propagation to the case of cross-entropy loss (as opposed to squared error). As a result of these advances, we achieve a test error of 11.7% on CIFAR-10, which approaches the one achieved by BPTT and is a major improvement over standard Equilibrium Propagation, which gives 86% test error. We also apply these techniques to train an architecture with unidirectional forward and backward connections, yielding a 13.2% test error. These results highlight Equilibrium Propagation as a compelling biologically plausible approach to computing error gradients in deep neuromorphic systems.
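The symmetric-nudging estimator can be illustrated with a short sketch. The code below is a minimal, illustrative example assuming a tiny Hopfield-style network with symmetric lateral weights and a squared-error cost on the output units; the dynamics, names, and constants are assumptions for illustration, not the authors' implementation.

# Minimal sketch of Equilibrium Propagation with symmetric nudging (+beta / -beta),
# assuming a small Hopfield-style network with symmetric lateral weights W and a
# squared-error cost on the last n_out units.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 5, 20, 3
U = rng.normal(0, 0.1, (n_hid + n_out, n_in))          # input weights (kept fixed here)
W = rng.normal(0, 0.1, (n_hid + n_out, n_hid + n_out))
W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)            # symmetric, no self-connections

def relax(x, y=None, beta=0.0, steps=50, dt=0.5):
    """Let the state settle to a fixed point; if beta != 0, nudge outputs toward y."""
    s = np.zeros(n_hid + n_out)
    for _ in range(steps):
        ds = -s + np.tanh(W @ s + U @ x)
        if beta != 0.0:
            ds[-n_out:] += beta * (y - s[-n_out:])       # nudging force on output units
        s = s + dt * ds
    return s

def symmetric_ep_update(x, y, beta=0.1, lr=0.05):
    """Weight update from two nudged phases (+beta and -beta); the symmetric
    difference cancels the first-order bias of the one-sided estimator."""
    s_plus  = relax(x, y, +beta)
    s_minus = relax(x, y, -beta)
    dW = (np.outer(s_plus, s_plus) - np.outer(s_minus, s_minus)) / (2 * beta)
    np.fill_diagonal(dW, 0.0)
    return lr * dW

x = rng.normal(size=n_in); y = np.array([1.0, 0.0, 0.0])
W += symmetric_ep_update(x, y)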


Author(s): Luca Manneschi, Matthew O. A. Ellis, Guido Gigante, Andrew C. Lin, Paolo Del Giudice, et al.

Echo state networks (ESNs) are a powerful form of reservoir computing that only require training of linear output weights, while the internal reservoir is formed of fixed, randomly connected neurons. With a correctly scaled connectivity matrix, the neurons’ activity exhibits the echo-state property and responds to the input dynamics with certain timescales. Tuning the timescales of the network can be necessary for certain tasks, and some environments require multiple timescales for an efficient representation. Here we explore the timescales in hierarchical ESNs, where the reservoir is partitioned into two smaller linked reservoirs with distinct properties. Over three different tasks (NARMA10, a reconstruction task in a volatile environment, and psMNIST), we show that by selecting the hyper-parameters of each partition so that they focus on different timescales, we achieve a significant performance improvement over a single ESN. Through a linear analysis, and under the assumption that the timescales of the first partition are much shorter than the second’s (typically corresponding to optimal operating conditions), we interpret the feedforward coupling of the partitions as an effective representation of the input signal, provided by the first partition to the second, whereby the instantaneous input signal is expanded into a weighted combination of its time derivatives. Furthermore, we propose a data-driven approach to optimising the hyper-parameters through a gradient descent method that is an online approximation of backpropagation through time. We demonstrate the application of this online learning rule across all the tasks considered.
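As a concrete illustration of the two-partition idea, the sketch below builds a small hierarchical ESN with leaky-integrator neurons whose leak rates give the first partition fast dynamics and the second slow ones, and trains the linear readout with ordinary ridge regression. The sizes, leak rates, and toy task are assumptions; the online hyper-parameter tuning described in the abstract is not reproduced.

# Minimal sketch of a two-partition (hierarchical) echo state network with distinct
# timescales, assuming leaky-integrator neurons and a ridge-regression readout.
import numpy as np

rng = np.random.default_rng(1)
n_in, n1, n2 = 1, 100, 100
a1, a2 = 0.9, 0.1                        # leak rates: fast first partition, slow second

def reservoir_matrix(n, rho=0.9):
    W = rng.normal(0, 1, (n, n))
    return W * (rho / max(abs(np.linalg.eigvals(W))))    # rescale spectral radius

W_in1 = rng.uniform(-0.5, 0.5, (n1, n_in))
W1    = reservoir_matrix(n1)
W_12  = rng.uniform(-0.5, 0.5, (n2, n1))                 # feedforward coupling
W2    = reservoir_matrix(n2)

def run(u_seq):
    x1 = np.zeros(n1); x2 = np.zeros(n2); states = []
    for u in u_seq:
        x1 = (1 - a1) * x1 + a1 * np.tanh(W_in1 @ u + W1 @ x1)
        x2 = (1 - a2) * x2 + a2 * np.tanh(W_12 @ x1 + W2 @ x2)
        states.append(np.concatenate([x1, x2]))
    return np.array(states)

# Train the linear readout by ridge regression on a toy memory target.
u_seq = rng.normal(size=(500, n_in))
target = np.roll(u_seq[:, 0], 5)                          # recall the input 5 steps back
X = run(u_seq)
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ target)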


Author(s): Daniele Di Sarli, Claudio Gallicchio, Alessio Micheli

In the context of recurrent neural networks, gated architectures such as the GRU have contributed to the development of highly accurate machine learning models that can tackle long-term dependencies in the data. However, such networks are trained with the expensive algorithm of gradient descent with backpropagation through time. On the other hand, reservoir computing approaches such as Echo State Networks (ESNs) can produce models that are trained efficiently thanks to the use of fixed random parameters, but they are not ideal for dealing with data presenting long-term dependencies. We explore the problem of employing gated architectures in ESNs from both theoretical and empirical perspectives. We do so by deriving and evaluating a necessary condition for the non-contractivity of the state transition function, which is important for overcoming the fading-memory characterization of conventional ESNs. We find that pure reservoir computing methodologies are not sufficient for effective gating mechanisms, whereas training only the gates is highly effective in terms of predictive accuracy.
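One way to picture "training only the gates" is the sketch below: a GRU-style cell whose candidate-state weights are fixed random reservoir matrices (registered as buffers) while only the gate parameters receive gradients. This is an illustrative PyTorch construction under assumed sizes, not the authors' exact architecture.

# Minimal sketch of a gated ESN-like cell: fixed random candidate-state weights,
# trainable gate parameters only.
import torch
import torch.nn as nn

class GatedESNCell(nn.Module):
    def __init__(self, n_in, n_res, rho=0.9):
        super().__init__()
        W = torch.randn(n_res, n_res)
        W = W * (rho / torch.linalg.eigvals(W).abs().max())   # echo-state-style rescaling
        # Fixed (untrained) reservoir weights for the candidate state.
        self.register_buffer("W_res", W)
        self.register_buffer("W_in", torch.empty(n_res, n_in).uniform_(-0.5, 0.5))
        # Trainable update and reset gates only.
        self.gate = nn.Linear(n_in + n_res, 2 * n_res)

    def forward(self, u, h):
        z, r = torch.chunk(torch.sigmoid(self.gate(torch.cat([u, h], -1))), 2, dim=-1)
        h_cand = torch.tanh(u @ self.W_in.T + (r * h) @ self.W_res.T)
        return (1 - z) * h + z * h_cand

cell = GatedESNCell(n_in=3, n_res=64)
h = torch.zeros(1, 64)
for t in range(10):
    h = cell(torch.randn(1, 3), h)   # gradients would flow only into cell.gate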


2020, pp. 95-115
Author(s): Jarosław Skaruz

In this paper we present a new approach based on the application of neural networks to detect SQL attacks. SQL attacks are attacks carried out by exploiting SQL statements. The problem of detecting this class of attacks is transformed into a time series prediction problem. SQL queries are used as a source of events in a protected environment. To differentiate between normal SQL queries and those sent by an attacker, we divide SQL statements into tokens and pass them to our detection system, which predicts the next token, taking into account previously seen tokens. In the learning phase, tokens are passed to a recurrent neural network (RNN) trained by the backpropagation through time (BPTT) algorithm. Then, two coefficients of a rule used to interpret the RNN output are evaluated. In the testing phase, the RNN with this rule is examined against attacks and legitimate data to find out how the evaluated rule affects the efficiency of detecting attacks. All experiments were conducted on a Jordan network. Experimental results show the relationship between the rule and the length of SQL queries.
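The token-prediction idea can be sketched as follows: split SQL statements into tokens, train a recurrent model to predict the next token on legitimate traffic, and flag queries whose tokens are poorly predicted. In the sketch a plain Elman-style RNN in PyTorch stands in for the Jordan network and the interpretation rule used in the paper; the toy vocabulary and scoring are illustrative assumptions.

# Minimal sketch: next-token prediction over SQL tokens with an RNN trained by BPTT
# (via autograd), with a simple negative-log-likelihood anomaly score.
import torch
import torch.nn as nn

queries = ["SELECT name FROM users WHERE id = ?",
           "SELECT name FROM users WHERE id = ? OR 1 = 1"]
tokens = sorted({t for q in queries for t in q.split()})
vocab = {t: i for i, t in enumerate(tokens)}

def encode(q):
    return torch.tensor([vocab[t] for t in q.split()])

class NextTokenRNN(nn.Module):
    def __init__(self, n_vocab, d=32):
        super().__init__()
        self.emb = nn.Embedding(n_vocab, d)
        self.rnn = nn.RNN(d, d, batch_first=True)
        self.out = nn.Linear(d, n_vocab)
    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)

model = NextTokenRNN(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
seq = encode(queries[0]).unsqueeze(0)                 # train on legitimate traffic only
for _ in range(100):
    logits = model(seq[:, :-1])
    loss = nn.functional.cross_entropy(logits.transpose(1, 2), seq[:, 1:])  # BPTT via autograd
    opt.zero_grad(); loss.backward(); opt.step()

def anomaly_score(q):
    """Mean next-token negative log-likelihood; high values suggest an attack."""
    s = encode(q).unsqueeze(0)
    with torch.no_grad():
        logits = model(s[:, :-1])
        return nn.functional.cross_entropy(logits.transpose(1, 2), s[:, 1:]).item()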


Author(s): Varadharajan Veeramanikandan, Mohan Jeyakarthic

Background: Financial credit scoring (CS) is currently one of the hottest research topics in the financial sector, as it helps determine the creditworthiness of individuals as well as organizations. Data mining approaches are useful in the banking sector, helping banks offer the right products or services to customers with minimal risk. Credit risks, linked to the risk of loss and loan defaults, are the main source of risk in the banking sector.

Aim: This paper presents an effective credit score prediction model for the banking sector, which helps banks foresee which loan applicants are creditworthy.

Methods: An optimal deep neural network (DNN) based framework is employed for credit score data classification using stacked autoencoders (SA). The SA is applied to extract features from the dataset, which are then classified by a softmax layer. In addition, the network is tuned in a supervised manner on the training dataset using truncated backpropagation through time (TBPTT).

Results: The proposed model is tested on a benchmark German credit dataset, which includes the variables needed to determine the credit score of a loan applicant. The presented SA-DNN model offers the best classification performance, with an accuracy of 96.10%, an F-score of 97.25%, and 90.52% on a further reported metric.

Conclusion: The experimental results point out that the proposed model attains the best classification performance in all respects. The proposed method helps determine a borrower's capability to repay the loan and to compute credit risks properly.
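The stacked-autoencoder pipeline can be pictured with the short sketch below: greedy layer-wise pretraining of encoder layers, followed by a softmax classifier fine-tuned on the stacked features. It assumes PyTorch and a numeric feature matrix standing in for the German credit data; the layer sizes and the plain backprop fine-tuning are illustrative (the paper's TBPTT-based tuning is not reproduced).

# Minimal sketch of a stacked-autoencoder feature extractor with a softmax classifier.
import torch
import torch.nn as nn

def pretrain_autoencoder(X, n_hidden, epochs=50):
    """Greedy layer-wise pretraining of one encoder layer."""
    enc = nn.Linear(X.shape[1], n_hidden)
    dec = nn.Linear(n_hidden, X.shape[1])
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(epochs):
        H = torch.sigmoid(enc(X))
        loss = nn.functional.mse_loss(dec(H), X)
        opt.zero_grad(); loss.backward(); opt.step()
    return enc, torch.sigmoid(enc(X)).detach()

# Toy stand-in for the German credit features (here 24 numeric columns, 1000 rows).
X = torch.rand(1000, 24); y = torch.randint(0, 2, (1000,))
enc1, H1 = pretrain_autoencoder(X, 16)
enc2, H2 = pretrain_autoencoder(H1, 8)

# Softmax classifier on the stacked features, fine-tuned end to end.
clf = nn.Sequential(enc1, nn.Sigmoid(), enc2, nn.Sigmoid(), nn.Linear(8, 2))
opt = torch.optim.Adam(clf.parameters(), lr=1e-3)
for _ in range(100):
    loss = nn.functional.cross_entropy(clf(X), y)   # cross-entropy = softmax + NLL
    opt.zero_grad(); loss.backward(); opt.step()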


2020, Vol. 2 (3), pp. 155-156
Author(s): Luca Manneschi, Eleni Vasilaki

2019
Author(s): Guillaume Bellec, Franz Scherr, Anand Subramoney, Elias Hajek, Darjan Salaj, et al.

Recurrently connected networks of spiking neurons underlie the astounding information processing capabilities of the brain. But in spite of extensive research, it has remained open how they can learn through synaptic plasticity to carry out complex network computations. We argue that two pieces of this puzzle were provided by experimental data from neuroscience. A new mathematical insight tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This new learning method, called e-prop, approaches the performance of BPTT (backpropagation through time), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in novel energy-efficient spike-based hardware for AI.
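The core mechanism of e-prop, forward-propagated eligibility traces combined with a broadcast learning signal instead of gradients propagated backward in time, can be illustrated with a small sketch. The example below uses a leaky non-spiking recurrent unit to keep the code short; the leak, readout, and fixed feedback matrix B are illustrative assumptions, not the paper's spiking formulation.

# Minimal sketch of an e-prop-style online update for a leaky recurrent network:
# eligibility traces are accumulated forward in time and combined with a broadcast
# learning signal, so no backpropagation through time is needed.
import numpy as np

rng = np.random.default_rng(2)
n_in, n_rec, n_out = 4, 30, 2
alpha = 0.8                                      # leak of the hidden state
W_in  = rng.normal(0, 0.3, (n_rec, n_in))
W_rec = rng.normal(0, 0.3, (n_rec, n_rec))
W_out = rng.normal(0, 0.3, (n_out, n_rec))
B     = rng.normal(0, 0.3, (n_rec, n_out))       # fixed feedback for the learning signal

def run_episode(x_seq, y_seq, lr=1e-3):
    global W_rec
    v = np.zeros(n_rec); z = np.zeros(n_rec)
    zbar = np.zeros(n_rec)                        # low-pass filtered presynaptic activity
    dW = np.zeros_like(W_rec)
    for x, y_star in zip(x_seq, y_seq):
        zbar = alpha * zbar + z                   # "vector" part of the eligibility trace
        v = alpha * v + W_in @ x + W_rec @ z
        z = np.tanh(v)
        psi = 1.0 - z**2                          # (pseudo-)derivative of the activation
        e = np.outer(psi, zbar)                   # eligibility traces e_{ji}
        y = W_out @ z
        L = B @ (y - y_star)                      # broadcast learning signal per neuron
        dW += L[:, None] * e                      # accumulate the online gradient estimate
    W_rec -= lr * dW

x_seq = rng.normal(size=(20, n_in))
y_seq = rng.normal(size=(20, n_out))
run_episode(x_seq, y_seq)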

