Synaptic plasticity model of a spiking neural network for reinforcement learning

2008 ◽  
Vol 71 (13-15) ◽  
pp. 3037-3043 ◽  
Author(s):  
Kyoobin Lee ◽  
Dong-Soo Kwon
2021 ◽  
Vol 23 (6) ◽  
pp. 317-326 ◽  
Author(s):  
E.A. Ryndin ◽  
N.V. Andreeva ◽  
V.V. Luchinin ◽  
K.S. Goncharov ◽  
et al.

In the current era, the design and development of artificial neural networks that exploit the architecture of the human brain have evolved rapidly. Artificial neural networks effectively solve a wide range of tasks common to artificial intelligence, including data classification and recognition, prediction, forecasting, and adaptive control of object behavior. The biologically inspired principles underlying ANN operation have certain advantages over the conventional von Neumann architecture, including unsupervised learning, architectural flexibility, adaptability to environmental change, and high performance at significantly reduced power consumption due to highly parallel and asynchronous data processing. In this paper, we present the circuit design of the main functional blocks (neurons and synapses) intended for a hardware implementation of a perceptron-based feedforward spiking neural network. As the third generation of artificial neural networks, spiking neural networks process data using spikes, which are discrete events (or functions) that occur at points in time. Neurons in spiking neural networks emit precisely timed spikes and communicate with each other via spikes transmitted through synaptic connections, or synapses, with adaptable, scalable weights. One prospective approach to emulating synaptic behavior in hardware-implemented spiking neural networks is to use non-volatile memory devices with analog conductance modulation (memristive structures). Here we propose circuit designs for functional analogues of a memristive structure that mimic synaptic plasticity, as well as for pre- and postsynaptic neurons, which could be used to develop circuit designs of spiking neural network architectures with different training algorithms, including the spike-timing-dependent plasticity (STDP) learning rule. Two different electronic synapse circuits were developed. The first is an analog synapse in which a photoresistive optocoupler provides the tunable conductance needed to emulate synaptic plasticity. The second is a digital synapse, in which the synaptic weight is stored as a digital code and converted directly into conductance (without a digital-to-analog converter or photoresistive optocoupler). We present prototyping results for the developed circuits of electronic synapse analogues and pre- and postsynaptic neurons, together with a study of their transient processes. The developed approach could provide a basis for ASIC design of spiking neural networks based on CMOS (complementary metal-oxide-semiconductor) technology.
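The pair-based STDP rule and the digital weight-code-to-conductance idea mentioned in this abstract can be sketched in software. This is a minimal illustration only: the amplitudes, time constant, bit width, and unit conductance below are assumed values, not figures from the paper's circuits.

```python
import numpy as np

# Hedged sketch of two ideas from the abstract: (1) pair-based STDP,
# (2) a "digital synapse" whose N-bit weight code maps directly to a
# conductance (no DAC). All constants are illustrative assumptions.

A_PLUS, A_MINUS = 0.1, 0.12   # LTP / LTD amplitudes (assumed)
TAU = 20.0                    # STDP time constant in ms (assumed)

def stdp_dw(dt_ms):
    """Weight change for dt = t_post - t_pre (ms): LTP if pre leads post."""
    if dt_ms > 0:
        return A_PLUS * np.exp(-dt_ms / TAU)    # causal pair: potentiation
    return -A_MINUS * np.exp(dt_ms / TAU)       # anti-causal: depression

def code_to_conductance(code, n_bits=4, g_unit=1e-6):
    """Map an N-bit weight code to conductance via binary-weighted branches,
    emulating direct digital-code-to-conductance conversion."""
    return g_unit * (code & ((1 << n_bits) - 1))

# Potentiate a digitally stored weight after a causal pre->post spike pair
code = 7                                    # 4-bit weight code
w = code / 15.0                             # normalized weight
w = min(1.0, max(0.0, w + stdp_dw(5.0)))    # pre precedes post by 5 ms: LTP
code = round(w * 15)                        # re-quantize to the 4-bit code
g = code_to_conductance(code)               # synaptic conductance, siemens
```

The exponential window is the textbook form of pair-based STDP; the paper's analog and digital synapse circuits realize the equivalent conductance changes in hardware rather than in software.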


2017 ◽  
Author(s):  
Camilo J. Mininni ◽  
B. Silvano Zanutto

Abstract
Animals are thought to learn the latent rules governing their environment in order to maximize their chances of survival. However, rules may change without notice, forcing animals to keep a memory of which one is currently at work. Rule switching can lead to situations in which the same stimulus/response pairing is positively and negatively rewarded in the long run, depending on variables that are not accessible to the animal. This raises the question of how neural systems are capable of reinforcement learning in environments where the reinforcement is inconsistent. Here we address this issue by asking which aspects of connectivity, neural excitability and synaptic plasticity are key for a very general, stochastic spiking neural network model to solve a task in which rules change without being cued, taking the serial reversal task (SRT) as a paradigm. Contrary to what might be expected, we found strong limitations on the ability of biologically plausible networks to solve the SRT. In particular, we proved that no network of neurons can learn an SRT if a single neural population both integrates stimulus information and is responsible for choosing the behavioural response. This limitation is independent of the number of neurons, the neuronal dynamics, or the plasticity rules, and arises from the fact that plasticity is computed locally at each synapse and that synaptic changes and neuronal activity are mutually dependent processes. We propose and characterize a spiking neural network model that solves the SRT by separating the functions of stimulus integration and response selection. The model suggests that experimental efforts to understand neural function should focus on characterizing neural circuits according to their connectivity, their neural dynamics, and the degree to which synaptic plasticity is modulated by reward.
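The serial reversal task named in this abstract can be captured in a few lines: the rewarded response flips without any cue once a performance criterion is met, so reward feedback is the only signal available for tracking the hidden rule. The criterion value and the simple win-stay/lose-shift agent below are illustrative assumptions, not the paper's spiking network model.

```python
# Minimal sketch of a serial reversal task (SRT) environment: the
# stimulus-response contingency reverses, uncued, after a streak of
# correct responses. Parameters are illustrative assumptions.
class SerialReversalTask:
    def __init__(self, criterion=8):
        self.rule = 0          # hidden variable: which response is rewarded
        self.streak = 0        # consecutive correct responses
        self.criterion = criterion

    def step(self, response):
        reward = 1 if response == self.rule else 0
        self.streak = self.streak + 1 if reward else 0
        if self.streak >= self.criterion:  # uncued reversal
            self.rule = 1 - self.rule
            self.streak = 0
        return reward

# A win-stay/lose-shift policy tracks the hidden rule from reward alone
env = SerialReversalTask()
choice, total = 0, 0
for _ in range(1000):
    r = env.step(choice)
    total += r
    if r == 0:
        choice = 1 - choice    # lose-shift; otherwise win-stay
```

With a criterion of 8, this agent loses exactly one trial per reversal, which makes concrete why the task demands a memory of the current rule that is decoupled from the momentary stimulus/response pairing.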


2018 ◽  
Vol 145 ◽  
pp. 458-463 ◽  
Author(s):  
Alexander Sboev ◽  
Danila Vlasov ◽  
Roman Rybka ◽  
Alexey Serenko
