scholarly journals The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks

2021 ◽  
pp. 1-27
Author(s):  
Friedemann Zenke ◽  
Tim P. Vogels

Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. Yet how network connectivity relates to function is poorly understood, and the functional capabilities of models of spiking networks are still rudimentary. The lack of both theoretical insight and practical algorithms to find the necessary connectivity poses a major impediment to both studying information processing in the brain and building efficient neuromorphic hardware systems. The training algorithms that solve this problem for artificial neural networks typically rely on gradient descent. But doing so in spiking networks has remained challenging due to the nondifferentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients affect learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative's scale can substantially affect learning performance. When we combine surrogate gradients with suitable activity regularization techniques, spiking networks perform robust information processing at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.

2020 ◽  
Author(s):  
Friedemann Zenke ◽  
Tim P. Vogels

AbstractBrains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative’s scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3240
Author(s):  
Tehreem Syed ◽  
Vijay Kakani ◽  
Xuenan Cui ◽  
Hakil Kim

In recent times, the usage of modern neuromorphic hardware for brain-inspired SNNs has grown exponentially. In the context of sparse input data, they are undertaking low power consumption for event-based neuromorphic hardware, specifically in the deeper layers. However, using deep ANNs for training spiking models is still considered as a tedious task. Until recently, various ANN to SNN conversion methods in the literature have been proposed to train deep SNN models. Nevertheless, these methods require hundreds to thousands of time-steps for training and still cannot attain good SNN performance. This work proposes a customized model (VGG, ResNet) architecture to train deep convolutional spiking neural networks. In this current study, the training is carried out using deep convolutional spiking neural networks with surrogate gradient descent backpropagation in a customized layer architecture similar to deep artificial neural networks. Moreover, this work also proposes fewer time-steps for training SNNs with surrogate gradient descent. During the training with surrogate gradient descent backpropagation, overfitting problems have been encountered. To overcome these problems, this work refines the SNN based dropout technique with surrogate gradient descent. The proposed customized SNN models achieve good classification results on both private and public datasets. In this work, several experiments have been carried out on an embedded platform (NVIDIA JETSON TX2 board), where the deployment of customized SNN models has been extensively conducted. Performance validations have been carried out in terms of processing time and inference accuracy between PC and embedded platforms, showing that the proposed customized models and training techniques are feasible for achieving a better performance on various datasets such as CIFAR-10, MNIST, SVHN, and private KITTI and Korean License plate dataset.


2021 ◽  
Author(s):  
Ceca Kraišniković ◽  
Wolfgang Maass ◽  
Robert Legenstein

The brain uses recurrent spiking neural networks for higher cognitive functions such as symbolic computations, in particular, mathematical computations. We review the current state of research on spike-based symbolic computations of this type. In addition, we present new results which show that surprisingly small spiking neural networks can perform symbolic computations on bit sequences and numbers and even learn such computations using a biologically plausible learning rule. The resulting networks operate in a rather low firing rate regime, where they could not simply emulate artificial neural networks by encoding continuous values through firing rates. Thus, we propose here a new paradigm for symbolic computation in neural networks that provides concrete hypotheses about the organization of symbolic computations in the brain. The employed spike-based network models are the basis for drastically more energy-efficient computer hardware – neuromorphic hardware. Hence, our results can be seen as creating a bridge from symbolic artificial intelligence to energy-efficient implementation in spike-based neuromorphic hardware.


2015 ◽  
Vol 59 (2) ◽  
pp. 1-5 ◽  
Author(s):  
Juncheng Shen ◽  
De Ma ◽  
Zonghua Gu ◽  
Ming Zhang ◽  
Xiaolei Zhu ◽  
...  

2021 ◽  
Vol 15 ◽  
Author(s):  
Youngeun Kim ◽  
Priyadarshini Panda

Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to sparse, asynchronous and binary event (or spike) driven processing, that can yield huge energy efficiency benefits on neuromorphic hardware. However, SNNs convey temporally-varying spike activation through time that is likely to induce a large variation of forward activation and backward gradients, resulting in unstable training. To address this training issue in SNNs, we revisit Batch Normalization (BN) and propose a temporal Batch Normalization Through Time (BNTT) technique. Different from previous BN techniques with SNNs, we find that varying the BN parameters at every time-step allows the model to learn the time-varying input distribution better. Specifically, our proposed BNTT decouples the parameters in a BNTT layer along the time axis to capture the temporal dynamics of spikes. We demonstrate BNTT on CIFAR-10, CIFAR-100, Tiny-ImageNet, event-driven DVS-CIFAR10 datasets, and Sequential MNIST and show near state-of-the-art performance. We conduct comprehensive analysis on the temporal characteristic of BNTT and showcase interesting benefits toward robustness against random and adversarial noise. Further, by monitoring the learnt parameters of BNTT, we find that we can do temporal early exit. That is, we can reduce the inference latency by ~5 − 20 time-steps from the original training latency. The code has been released at https://github.com/Intelligent-Computing-Lab-Yale/BNTT-Batch-Normalization-Through-Time.


2020 ◽  
Vol 92 (11) ◽  
pp. 1293-1302
Author(s):  
Adarsha Balaji ◽  
Thibaut Marty ◽  
Anup Das ◽  
Francky Catthoor

Sign in / Sign up

Export Citation Format

Share Document