A Multi-feature Fusion Temporal Neural Network for Multi-hand Gesture Recognition using Millimeter-wave Radar Sensor

Author(s):  
Dengke Yao ◽  
Yong Wang ◽  
Wei Nie ◽  
Liangbo Xie ◽  
Mu Zhou ◽  
...  
Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 259

Author(s):  
Kang Zhang ◽  
Shengchang Lan ◽  
Guiyuan Zhang

The purpose of this paper was to investigate the effect of training state-of-the-art convolutional neural networks (CNNs) for millimeter-wave radar-based hand gesture recognition (MR-HGR). Focusing on the small-training-dataset problem in MR-HGR, this paper first proposed transferring knowledge from CNN models in computer vision to MR-HGR by fine-tuning the models with radar data samples. To accommodate the different data modality of MR-HGR, a parameterized representation, the temporal space-velocity (TSV) spectrogram, was proposed as an integrated data modality for the time-evolving hand gesture features in the radar echo signals. TSV spectrograms representing six common gestures in human–computer interaction (HCI), recorded from nine volunteers, were used as the data samples in the experiment. The evaluated models included ResNet with 50, 101, and 152 layers, DenseNet with 121, 161, and 169 layers, and the lightweight MobileNet V2 and ShuffleNet V2, most of them proposed in recent publications. In the experiment, both self-testing (ST) and the more persuasive cross-testing (CT) were implemented to evaluate whether the fine-tuned models generalize to the radar data samples. The CT results show that the best fine-tuned models reach an average accuracy higher than 93%, with a comparable ST average accuracy of almost 100%. Moreover, to alleviate the problem caused by personal gesture habits, an auxiliary test was performed by augmenting the training set with four shots of the most heavily misclassified gestures. This enrichment test resembles the scenario in which a tablet adapts to a new user. The results for two different volunteers in the enrichment test show that the average accuracy on the enriched gesture improved from 55.59% and 65.58% to 90.66% and 95.95%, respectively. Compared with baseline work in MR-HGR, the investigation in this paper can help promote MR-HGR in future industrial applications and consumer electronics design.
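The transfer-learning step described above follows a standard recipe: take an ImageNet-pretrained backbone, replace its classification head with one sized for the gesture classes, and fine-tune on the spectrogram images. Below is a minimal PyTorch/torchvision sketch of that recipe. The dataset path, ImageFolder layout, and all hyperparameters are illustrative assumptions; the paper's actual TSV rendering and training schedule are not reproduced here.

```python
# Minimal transfer-learning sketch: fine-tune an ImageNet-pretrained ResNet-50
# on TSV spectrogram images. Dataset layout, paths, and hyperparameters are
# illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_GESTURES = 6  # six common HCI gestures, as in the abstract

# Spectrograms rendered as RGB images, resized to the ImageNet input size.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder("tsv_spectrograms/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_GESTURES)  # replace the head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Fine-tuning all layers (rather than freezing the backbone) matches the abstract's description of adapting computer-vision models to the radar modality; a lower learning rate is the usual safeguard against destroying the pretrained features.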


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 79147-79158 ◽  
Author(s):  
Changjiang Liu ◽  
Yuanhao Li ◽  
Dongyang Ao ◽  
Haiyan Tian

Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1405

Author(s):  
Ing Jyh Tsang ◽  
Federico Corradi ◽  
Manolis Sifalakis ◽  
Werner Van Leekwijck ◽  
Steven Latré

We propose a spiking neural network (SNN) approach for radar-based hand gesture recognition (HGR) using frequency-modulated continuous-wave (FMCW) millimeter-wave radar. After pre-processing the range-Doppler or micro-Doppler radar signal, we use a signal-to-spike conversion scheme that encodes radar Doppler maps into spike trains. The spike trains are fed into a spiking recurrent neural network, a liquid state machine (LSM). The readout spike signal from the SNN is then used as input to different classifiers for comparison, including logistic regression, random forest, and support vector machine (SVM). Using liquid state machines of fewer than 1000 neurons, we achieve better-than-state-of-the-art results on two publicly available reference datasets, reaching over 98% accuracy under 10-fold cross-validation on both datasets.
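The two ends of this pipeline (encoding Doppler maps as spikes, and training a conventional classifier on the spike readout) can be sketched compactly. The snippet below is a hedged illustration only: it uses a crude multi-threshold spike encoding and spike-count features standing in for the LSM readout, since the paper's actual conversion scheme and LSM internals are not specified here. All shapes, thresholds, and the synthetic data are assumptions.

```python
# Hedged sketch of the pipeline around the liquid state machine (LSM):
# a simple threshold-crossing signal-to-spike encoder for Doppler maps, and
# an SVM readout trained on spike-count features. The encoding scheme and
# the LSM itself in the paper may differ; shapes and data are illustrative.
import numpy as np
from sklearn.svm import SVC

def doppler_to_spikes(doppler_frames, threshold=0.5, n_levels=4):
    """Encode a (time, range, doppler) sequence into binary spike trains by
    thresholding normalized intensity at several levels (a crude rate code)."""
    x = doppler_frames / (doppler_frames.max() + 1e-8)
    levels = np.linspace(threshold, 1.0, n_levels)
    # One spike channel per (cell, level): fires when intensity exceeds level.
    return np.stack([(x > lv) for lv in levels], axis=-1).astype(np.uint8)

def readout_features(spike_trains):
    """Summarize spike trains as per-channel spike counts over time.
    This stands in for the LSM readout, which this sketch does not implement."""
    t = spike_trains.shape[0]
    return spike_trains.reshape(t, -1).sum(axis=0)

# Illustrative training on synthetic recordings (real radar data assumed).
rng = np.random.default_rng(0)
recordings = rng.random((40, 30, 16, 32))   # 40 clips, 30 frames, 16x32 maps
labels = rng.integers(0, 5, size=40)        # 5 gesture classes (assumed)

features = np.array([readout_features(doppler_to_spikes(r))
                     for r in recordings])
clf = SVC(kernel="rbf").fit(features, labels)
print("train accuracy:", clf.score(features, labels))
```

Comparing several readout classifiers on the same spike features, as the abstract describes, amounts to swapping `SVC` for `LogisticRegression` or `RandomForestClassifier` from scikit-learn.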


2020 ◽  
Vol 17 (4) ◽  
pp. 497-506
Author(s):  
Sunil Patel ◽  
Ramji Makwana

Automatic classification of dynamic hand gestures is challenging due to the large diversity within and across gesture classes, low resolution, and the fact that gestures are performed with the fingers. These challenges have drawn many researchers to this area. Recently, deep neural networks have been used for implicit feature extraction, with a softmax layer for classification. In this paper, we propose a method based on a two-dimensional convolutional neural network that performs detection and classification of hand gestures simultaneously from multimodal red, green, blue, depth (RGBD) and optical-flow data, and passes these features to a long short-term memory (LSTM) recurrent network for frame-to-frame probability generation, with a connectionist temporal classification (CTC) network for loss calculation. We compute optical flow from the RGB data to capture the motion information present in the video. The CTC model efficiently evaluates all possible alignments of a hand gesture via dynamic programming and checks frame-to-frame consistency of the visual appearance of the gesture in the unsegmented input stream. The CTC network finds the most probable frame sequence for a gesture class, and the frame with the highest probability value is selected by max decoding. The entire network is trained end-to-end with the CTC loss for gesture recognition. We evaluate on the challenging Vision for Intelligent Vehicles and Applications (VIVA) dataset for dynamic hand gesture recognition, captured with RGB and depth data. On this dataset, our proposed hand gesture recognition technique outperforms competing state-of-the-art algorithms, achieving an accuracy of 86%.
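The described pipeline (per-frame CNN features, an LSTM producing frame-to-frame class probabilities, and a CTC loss over the unsegmented stream) maps directly onto standard PyTorch modules. The sketch below is a minimal illustration under stated assumptions: the tiny CNN, the input channel count (stacked RGB, depth, and flow), and all sizes are hypothetical, not the paper's architecture.

```python
# Hedged PyTorch sketch: 2D CNN per-frame features -> LSTM frame probabilities
# -> CTC loss on the unsegmented stream. Channel counts, the tiny CNN, and all
# sizes are illustrative assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

NUM_CLASSES = 19          # 19 gesture classes in the VIVA dataset
BLANK = NUM_CLASSES       # extra CTC blank label index

class GestureCTC(nn.Module):
    def __init__(self, in_channels=8):   # e.g. RGB(3)+depth(1)+flow(2)+... assumed
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.lstm = nn.LSTM(64, 128, batch_first=True)
        self.head = nn.Linear(128, NUM_CLASSES + 1)  # +1 for the CTC blank

    def forward(self, clips):            # clips: (B, T, C, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).flatten(1)   # (B*T, 64)
        seq, _ = self.lstm(feats.view(b, t, -1))           # (B, T, 128)
        return self.head(seq).log_softmax(-1)              # (B, T, classes+1)

model = GestureCTC()
ctc = nn.CTCLoss(blank=BLANK)

clips = torch.randn(2, 16, 8, 64, 64)          # 2 clips, 16 frames each (dummy)
targets = torch.tensor([3, 7])                 # one gesture label per clip
log_probs = model(clips).permute(1, 0, 2)      # CTCLoss expects (T, B, classes)
loss = ctc(log_probs, targets.unsqueeze(1),
           input_lengths=torch.full((2,), 16),
           target_lengths=torch.full((2,), 1))
loss.backward()
```

Because CTC marginalizes over all alignments via dynamic programming, no per-frame labels are needed: one gesture label per unsegmented clip suffices, which is exactly the property the abstract relies on. Max decoding then corresponds to taking the argmax class per frame and collapsing repeats and blanks.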


2020 ◽  
Vol 29 (6) ◽  
pp. 1153-1164
Author(s):  
Qianyi Xu ◽  
Guihe Qin ◽  
Minghui Sun ◽  
Jie Yan ◽  
Huiming Jiang ◽  
...  

2021 ◽  
Vol 5 (3) ◽  
pp. 1-4
Author(s):  
Dominik Meier ◽  
Christian Zech ◽  
Benjamin Baumann ◽  
Bersant Gashi ◽  
Matthias Malzacher ◽  
...  

Author(s):  
Christian Schoffmann ◽  
Barnaba Ubezio ◽  
Christoph Boehm ◽  
Stephan Muhlbacher-Karrer ◽  
Hubert Zangl
