An Event-based Categorization Model Using Spatio-temporal Features in a Spiking Neural Network

Author(s):  
Junwei Lu ◽  
Junfei Dong ◽  
Rui Yan ◽  
Huajin Tang
2020 ◽  
Vol 10 (15) ◽  
pp. 5326
Author(s):  
Xiaolei Diao ◽  
Xiaoqiang Li ◽  
Chen Huang

The duration of the same action varies from instance to instance, and this variability degrades action-recognition accuracy to a certain extent. We propose an end-to-end deep neural network called “Multi-Term Attention Networks” (MTANs), which addresses this problem by extracting temporal features at different time scales. The network consists of a Multi-Term Attention Recurrent Neural Network (MTA-RNN) and a Spatio-Temporal Convolutional Neural Network (ST-CNN). In MTA-RNN, a method for fusing multi-term temporal features is proposed to capture temporal dependencies at different time scales, and the weighted fused temporal feature is recalibrated by an attention mechanism. Ablation studies show that the network has strong spatio-temporal dynamic modeling capability for actions of different durations. We perform extensive experiments on four challenging benchmark datasets: the NTU RGB+D, UT-Kinect, Northwestern-UCLA, and UWA3DII datasets. Our method outperforms state-of-the-art results, which demonstrates the effectiveness of MTANs.
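The core fusion step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature vectors, the number of time scales, and the raw attention scores are all hypothetical stand-ins; it only shows the general pattern of recalibrating per-scale temporal features with softmax attention and summing them.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_term_fusion(features, attn_scores):
    """Fuse temporal features extracted at different time scales.

    features: list of (D,) vectors, one per time scale (e.g. short/medium/long term).
    attn_scores: raw attention scores, one per scale, recalibrated via softmax.
    Returns the attention-weighted sum of the per-scale features.
    """
    attn = softmax(np.asarray(attn_scores, dtype=float))
    stacked = np.stack(features)          # (num_scales, D)
    return attn @ stacked                 # (D,) weighted fusion

# Toy example: three time scales, 4-dimensional features.
feats = [np.ones(4), 2 * np.ones(4), 3 * np.ones(4)]
fused = multi_term_fusion(feats, [0.0, 0.0, 0.0])  # equal scores -> plain mean
print(fused)  # [2. 2. 2. 2.]
```

With equal scores the softmax weights are uniform, so the fusion reduces to the mean of the per-scale features; learned scores would instead emphasize the time scale most informative for the current action.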


Author(s):  
Shaznoor Shakira Saharuddin ◽  
Norhanifah Murli ◽  
Muhammad Azani Hasibuan

2017 ◽  
Vol 7 (1) ◽  
Author(s):  
Marc Osswald ◽  
Sio-Hoi Ieng ◽  
Ryad Benosman ◽  
Giacomo Indiveri

Abstract Stereo vision is an important feature that enables machine vision systems to perceive their environment in 3D. While machine vision has spawned a variety of software algorithms to solve the stereo-correspondence problem, their implementation and integration in small, fast, and efficient hardware vision systems remains a difficult challenge. Recent advances made in neuromorphic engineering offer a possible solution to this problem, with the use of a new class of event-based vision sensors and neural processing devices inspired by the organizing principles of the brain. Here we propose a radically novel model that solves the stereo-correspondence problem with a spiking neural network that can be directly implemented with massively parallel, compact, low-latency and low-power neuromorphic engineering devices. We validate the model with experimental results, highlighting features that are in agreement with both computational neuroscience stereo vision theories and experimental findings. We demonstrate its features with a prototype neuromorphic hardware system and provide testable predictions on the role of spike-based representations and temporal dynamics in biological stereo vision processing systems.
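The intuition behind event-based stereo correspondence can be sketched in a few lines: events from the two sensors that occur nearly simultaneously on the same epipolar line are likely to stem from the same scene point. The sketch below is a deliberately simplified coincidence detector, not the spiking cooperative network the abstract describes; the event format, the time window `dt`, and the disparity bound `max_disp` are illustrative assumptions.

```python
def coincidence_matches(left_events, right_events, dt=1e-3, max_disp=10):
    """Match left/right camera events by temporal coincidence.

    Each event is (t, x, y) in seconds and pixels. Two events are candidate
    correspondences when they occur within dt of each other, lie on the same
    row (epipolar constraint), and their horizontal disparity is plausible.
    Returns a list of (left_idx, right_idx, disparity) tuples.
    """
    matches = []
    for i, (tl, xl, yl) in enumerate(left_events):
        for j, (tr, xr, yr) in enumerate(right_events):
            if yl == yr and abs(tl - tr) <= dt and 0 <= xl - xr <= max_disp:
                matches.append((i, j, xl - xr))
    return matches

left = [(0.000, 12, 5), (0.010, 30, 7)]
right = [(0.0004, 9, 5), (0.020, 30, 7)]
print(coincidence_matches(left, right))  # [(0, 0, 3)]
```

In a neuromorphic realization, each candidate (pixel pair, disparity) would be a coincidence-detecting neuron that spikes when its two inputs arrive within its integration window, so the matching runs in parallel with no explicit loops.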


2016 ◽  
Author(s):  
Irina Higgins ◽  
Simon Stringer ◽  
Jan Schnupp

Abstract
The nature of the code used in the auditory cortex to represent complex auditory stimuli, such as naturally spoken words, remains a matter of debate. Here we argue that such representations are encoded by stable spatio-temporal patterns of firing within cell assemblies known as polychronous groups, or PGs. We develop a physiologically grounded, unsupervised spiking neural network model of the auditory brain with local, biologically realistic, spike-timing-dependent plasticity (STDP) learning, and show that the plastic cortical layers of the network develop PGs which convey substantially more information about the speaker-independent identity of two naturally spoken word stimuli than does rate encoding that ignores the precise spike timings. We furthermore demonstrate that such informative PGs can only develop if the input spatio-temporal spike patterns to the plastic cortical areas of the model are relatively stable.

Author Summary
Currently we still do not know how the auditory cortex encodes the identity of complex auditory objects, such as words, given the great variability in the raw auditory waves that correspond to the different pronunciations of the same word by different speakers. Here we argue for temporal information encoding within neural cell assemblies for representing auditory objects. Unlike the more traditionally accepted rate encoding, temporal encoding takes into account the precise relative timing of spikes across a population of neurons. We provide support for our hypothesis by building a neurophysiologically grounded spiking neural network model of the auditory brain with a biologically plausible learning mechanism. We show that the model learns to differentiate between the naturally spoken digits “one” and “two” pronounced by numerous speakers in a speaker-independent manner through simple unsupervised exposure to the words. Our simulations demonstrate that temporal encoding contains significantly more information about the two words than rate encoding. We also show that such learning depends on the presence of stable patterns of firing in the input to the cortical areas of the model that are performing the learning.
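The STDP learning rule mentioned above can be sketched with the standard additive pairwise form. This is a generic textbook formulation, not the exact rule or parameter values of the model in the abstract; the amplitudes `a_plus`/`a_minus` and the time constant `tau` are illustrative.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0,
                w_min=0.0, w_max=1.0):
    """Additive pairwise STDP for one pre/post spike pairing (times in ms).

    Potentiates when the presynaptic spike precedes the postsynaptic spike
    (causal pairing, t_pre < t_post) and depresses otherwise, with an
    exponential dependence on the spike-time difference.
    """
    dt = t_post - t_pre
    if dt > 0:
        w += a_plus * np.exp(-dt / tau)    # causal pairing -> LTP
    else:
        w -= a_minus * np.exp(dt / tau)    # anti-causal pairing -> LTD
    return float(np.clip(w, w_min, w_max))

w = 0.5
w = stdp_update(w, t_pre=10.0, t_post=15.0)   # pre 5 ms before post: LTP
print(round(w, 4))  # 0.5078
```

Because the update depends on precise relative spike times rather than firing rates, repeated exposure to a stable spatio-temporal input pattern strengthens exactly the causal chains of synapses that give rise to polychronous groups.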


2022 ◽  
Author(s):  
Antoine Grimaldi ◽  
Victor Boutin ◽  
Sio-Hoi Ieng ◽  
Ryad Benosman ◽  
Laurent Perrinet

We propose a neuromimetic architecture able to perform always-on pattern recognition. To achieve this, we extended an existing event-based algorithm [1], which introduced novel spatio-temporal features as a Hierarchy Of Time-Surfaces (HOTS). Built from asynchronous events acquired by a neuromorphic camera, these time surfaces encode the local dynamics of a visual scene and enable an efficient event-based pattern recognition architecture. Inspired by neuroscience, we extended this method to increase its performance. Our first contribution was to add a homeostatic gain control on the activity of neurons to improve the learning of spatio-temporal patterns [2]. A second contribution is to draw an analogy between the HOTS algorithm and Spiking Neural Networks (SNNs). Following that analogy, our last contribution is to modify the classification layer and remodel the previously used offline pattern categorization method into an online, event-driven one. This classifier uses the spiking output of the network to define novel time surfaces, on which we perform online classification with a neuromimetic implementation of multinomial logistic regression. Not only do these improvements consistently increase the performance of the network, they also make this event-driven pattern recognition algorithm online and bio-realistic. Results were validated on different datasets: DVS barrel [3], Poker-DVS [4], and N-MNIST [5]. We plan to develop the SNN version of the method and to extend this fully event-driven approach to more naturalistic tasks, notably for always-on, ultra-fast object categorization.
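The time-surface idea at the core of HOTS can be illustrated with a small sketch: around each incoming event, the most recent event time at every neighboring pixel is turned into an exponentially decayed value, giving a local snapshot of scene dynamics. This is a simplified, hypothetical rendering (single polarity, toy parameters), not the implementation from the paper.

```python
import numpy as np

def time_surface(events, t_now, center, radius=2, tau=50e-3):
    """Compute a local time surface around `center` at time `t_now`.

    events: iterable of (t, x, y) with times in seconds; only the most recent
    event per pixel matters. Each cell holds exp(-(t_now - t_last)/tau), i.e.
    a value near 1 for fresh activity, decaying toward 0 with constant tau.
    Pixels that never fired are set to 0.
    """
    cx, cy = center
    size = 2 * radius + 1
    t_last = np.full((size, size), -np.inf)
    for t, x, y in events:
        dx, dy = x - cx, y - cy
        if abs(dx) <= radius and abs(dy) <= radius and t <= t_now:
            row, col = dy + radius, dx + radius
            t_last[row, col] = max(t_last[row, col], t)
    return np.where(np.isfinite(t_last), np.exp(-(t_now - t_last) / tau), 0.0)

events = [(0.00, 5, 5), (0.04, 5, 5), (0.05, 6, 5)]
ts = time_surface(events, t_now=0.05, center=(5, 5), radius=1)
print(ts[1, 1], ts[1, 2])  # decayed center pixel, fresh right neighbour
```

In the full architecture, such surfaces are clustered into prototype features layer by layer, with coarser spatial scales and longer time constants at each level of the hierarchy.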

