Surrogate Gradient Learning in Spiking Neural Networks: Bringing the Power of Gradient-Based Optimization to Spiking Neural Networks

AbstractBrains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative’s scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.

Download Full-text

Event-based backpropagation can compute exact gradients for spiking neural networks

Scientific Reports ◽

10.1038/s41598-021-91786-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Timo C. Wunderlich ◽

Christian Pehle

Keyword(s):

Neural Networks ◽

Loss Function ◽

Spiking Neural Networks ◽

Backpropagation Algorithm ◽

General Loss Function ◽

Gradient Based ◽

Rigorous Study ◽

Spiking Networks ◽

Event Based ◽

First Time

AbstractSpiking neural networks combine analog computation with event-based communication using discrete spikes. While the impressive advances of deep learning are enabled by training non-spiking artificial neural networks using the backpropagation algorithm, applying this algorithm to spiking networks was previously hindered by the existence of discrete spike events and discontinuities. For the first time, this work derives the backpropagation algorithm for a continuous-time spiking neural network and a general loss function by applying the adjoint method together with the proper partial derivative jumps, allowing for backpropagation through discrete spike events without approximations. This algorithm, EventProp, backpropagates errors at spike times in order to compute the exact gradient in an event-based, temporally and spatially sparse fashion. We use gradients computed via EventProp to train networks on the Yin-Yang and MNIST datasets using either a spike time or voltage based loss function and report competitive performance. Our work supports the rigorous study of gradient-based learning algorithms in spiking neural networks and provides insights toward their implementation in novel brain-inspired hardware.

Download Full-text

The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks

Neural Computation ◽

10.1162/neco_a_01367 ◽

2021 ◽

pp. 1-27

Author(s):

Friedemann Zenke ◽

Tim P. Vogels

Keyword(s):

Neural Networks ◽

Information Processing ◽

Complex Function ◽

Spiking Neural Networks ◽

Learning Performance ◽

Design Parameters ◽

Practical Algorithms ◽

Neuromorphic Hardware ◽

Spiking Networks ◽

Gradient Learning

Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. Yet how network connectivity relates to function is poorly understood, and the functional capabilities of models of spiking networks are still rudimentary. The lack of both theoretical insight and practical algorithms to find the necessary connectivity poses a major impediment to both studying information processing in the brain and building efficient neuromorphic hardware systems. The training algorithms that solve this problem for artificial neural networks typically rely on gradient descent. But doing so in spiking networks has remained challenging due to the nondifferentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients affect learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative's scale can substantially affect learning performance. When we combine surrogate gradients with suitable activity regularization techniques, spiking networks perform robust information processing at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.

Download Full-text