A Multi-feature Fusion Temporal Neural Network for Multi-hand Gesture Recognition using Millimeter-wave Radar Sensor

Author(s):  
Dengke Yao ◽  
Yong Wang ◽  
Wei Nie ◽  
Liangbo Xie ◽  
Mu Zhou ◽  
...  
Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 259

Author(s):  
Kang Zhang ◽  
Shengchang Lan ◽  
Guiyuan Zhang

The purpose of this paper was to investigate the effect of training state-of-the-art convolutional neural networks (CNNs) for millimeter-wave radar-based hand gesture recognition (MR-HGR). Focusing on the small-training-dataset problem in MR-HGR, this paper first proposed transferring knowledge from CNN models in computer vision to MR-HGR by fine-tuning the models with radar data samples. To accommodate the different data modality of MR-HGR, a parameterized representation, the temporal space-velocity (TSV) spectrogram, was proposed as an integrated data modality for the time-evolving hand gesture features in the radar echo signals. TSV spectrograms representing six common gestures in human–computer interaction (HCI), recorded from nine volunteers, were used as the data samples in the experiment. The evaluated models included ResNet with 50, 101, and 152 layers, DenseNet with 121, 161, and 169 layers, and the lightweight MobileNet V2 and ShuffleNet V2, most of them proposed in recent publications. In the experiment, both self-testing (ST) and the more persuasive cross-testing (CT) were implemented to evaluate whether the fine-tuned models generalize to the radar data samples. The CT results show that the best fine-tuned models reach an average accuracy higher than 93%, with a comparable ST average accuracy of almost 100%. Moreover, to alleviate the problem caused by personal gesture habits, an auxiliary test was performed by augmenting the training set with four shots of the most heavily misclassified gestures. This enrichment test resembles the scenario in which a tablet adapts to a new user. The results for two different volunteers in the enrichment test show that the average accuracy on the enriched gesture improved from 55.59% and 65.58% to 90.66% and 95.95%, respectively. Compared with baseline work in MR-HGR, the investigation in this paper can help promote MR-HGR in future industrial applications and consumer electronics design.
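The transfer-learning step described above follows a standard recipe: take an ImageNet-pretrained backbone, replace its classification head with one sized for the gesture classes, and fine-tune on the spectrogram images. Below is a minimal PyTorch/torchvision sketch of that recipe. The dataset path, ImageFolder layout, and all hyperparameters are illustrative assumptions; the paper's actual TSV rendering and training schedule are not reproduced here.

```python
# Minimal transfer-learning sketch: fine-tune an ImageNet-pretrained ResNet-50
# on TSV spectrogram images. Dataset layout, paths, and hyperparameters are
# illustrative assumptions, not taken from the paper.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_GESTURES = 6  # six common HCI gestures, as in the abstract

# Spectrograms rendered as RGB images, resized to the ImageNet input size.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

train_set = datasets.ImageFolder("tsv_spectrograms/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_GESTURES)  # replace the head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Fine-tuning all layers (rather than freezing the backbone) matches the abstract's description of adapting computer-vision models to the radar modality; a lower learning rate is the usual safeguard against destroying the pretrained features.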


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 79147-79158 ◽  
Author(s):  
Changjiang Liu ◽  
Yuanhao Li ◽  
Dongyang Ao ◽  
Haiyan Tian

Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1405

Author(s):  
Ing Jyh Tsang ◽  
Federico Corradi ◽  
Manolis Sifalakis ◽  
Werner Van Leekwijck ◽  
Steven Latré

We propose a spiking neural network (SNN) approach for radar-based hand gesture recognition (HGR) using frequency-modulated continuous-wave (FMCW) millimeter-wave radar. After pre-processing the range-Doppler or micro-Doppler radar signal, we use a signal-to-spike conversion scheme that encodes radar Doppler maps into spike trains. The spike trains are fed into a spiking recurrent neural network, a liquid state machine (LSM). The readout spike signal from the SNN is then used as input to different classifiers for comparison, including logistic regression, random forest, and support vector machine (SVM). Using liquid state machines of fewer than 1000 neurons, we achieve better-than-state-of-the-art results on two publicly available reference datasets, reaching over 98% accuracy under 10-fold cross-validation on both datasets.
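The two ends of this pipeline (encoding Doppler maps as spikes, and training a conventional classifier on the spike readout) can be sketched compactly. The snippet below is a hedged illustration only: it uses a crude multi-threshold spike encoding and spike-count features standing in for the LSM readout, since the paper's actual conversion scheme and LSM internals are not specified here. All shapes, thresholds, and the synthetic data are assumptions.

```python
# Hedged sketch of the pipeline around the liquid state machine (LSM):
# a simple threshold-crossing signal-to-spike encoder for Doppler maps, and
# an SVM readout trained on spike-count features. The encoding scheme and
# the LSM itself in the paper may differ; shapes and data are illustrative.
import numpy as np
from sklearn.svm import SVC

def doppler_to_spikes(doppler_frames, threshold=0.5, n_levels=4):
    """Encode a (time, range, doppler) sequence into binary spike trains by
    thresholding normalized intensity at several levels (a crude rate code)."""
    x = doppler_frames / (doppler_frames.max() + 1e-8)
    levels = np.linspace(threshold, 1.0, n_levels)
    # One spike channel per (cell, level): fires when intensity exceeds level.
    return np.stack([(x > lv) for lv in levels], axis=-1).astype(np.uint8)

def readout_features(spike_trains):
    """Summarize spike trains as per-channel spike counts over time.
    This stands in for the LSM readout, which this sketch does not implement."""
    t = spike_trains.shape[0]
    return spike_trains.reshape(t, -1).sum(axis=0)

# Illustrative training on synthetic recordings (real radar data assumed).
rng = np.random.default_rng(0)
recordings = rng.random((40, 30, 16, 32))   # 40 clips, 30 frames, 16x32 maps
labels = rng.integers(0, 5, size=40)        # 5 gesture classes (assumed)

features = np.array([readout_features(doppler_to_spikes(r))
                     for r in recordings])
clf = SVC(kernel="rbf").fit(features, labels)
print("train accuracy:", clf.score(features, labels))
```

Comparing several readout classifiers on the same spike features, as the abstract describes, amounts to swapping `SVC` for `LogisticRegression` or `RandomForestClassifier` from scikit-learn.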


2020 ◽  
Vol 17 (4) ◽  
pp. 497-506
Author(s):  
Sunil Patel ◽  
Ramji Makwana

Automatic classification of dynamic hand gestures is challenging due to the large diversity within and across gesture classes, low resolution, and the fact that gestures are performed with the fingers. These challenges have drawn many researchers to this area. Recently, deep neural networks have been used for implicit feature extraction, with a softmax layer for classification. In this paper, we propose a method based on a two-dimensional convolutional neural network that performs detection and classification of hand gestures simultaneously from multimodal red, green, blue, depth (RGBD) and optical-flow data, and passes these features to a long short-term memory (LSTM) recurrent network for frame-to-frame probability generation, with a connectionist temporal classification (CTC) network for loss calculation. We compute optical flow from the RGB data to capture the motion information present in the video. The CTC model efficiently evaluates all possible alignments of a hand gesture via dynamic programming and checks frame-to-frame consistency of the visual appearance of the gesture in the unsegmented input stream. The CTC network finds the most probable frame sequence for a gesture class, and the frame with the highest probability value is selected by max decoding. The entire network is trained end-to-end with the CTC loss for gesture recognition. We evaluate on the challenging Vision for Intelligent Vehicles and Applications (VIVA) dataset for dynamic hand gesture recognition, captured with RGB and depth data. On this dataset, our proposed hand gesture recognition technique outperforms competing state-of-the-art algorithms, achieving an accuracy of 86%.
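The described pipeline (per-frame CNN features, an LSTM producing frame-to-frame class probabilities, and a CTC loss over the unsegmented stream) maps directly onto standard PyTorch modules. The sketch below is a minimal illustration under stated assumptions: the tiny CNN, the input channel count (stacked RGB, depth, and flow), and all sizes are hypothetical, not the paper's architecture.

```python
# Hedged PyTorch sketch: 2D CNN per-frame features -> LSTM frame probabilities
# -> CTC loss on the unsegmented stream. Channel counts, the tiny CNN, and all
# sizes are illustrative assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

NUM_CLASSES = 19          # 19 gesture classes in the VIVA dataset
BLANK = NUM_CLASSES       # extra CTC blank label index

class GestureCTC(nn.Module):
    def __init__(self, in_channels=8):   # e.g. RGB(3)+depth(1)+flow(2)+... assumed
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.lstm = nn.LSTM(64, 128, batch_first=True)
        self.head = nn.Linear(128, NUM_CLASSES + 1)  # +1 for the CTC blank

    def forward(self, clips):            # clips: (B, T, C, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).flatten(1)   # (B*T, 64)
        seq, _ = self.lstm(feats.view(b, t, -1))           # (B, T, 128)
        return self.head(seq).log_softmax(-1)              # (B, T, classes+1)

model = GestureCTC()
ctc = nn.CTCLoss(blank=BLANK)

clips = torch.randn(2, 16, 8, 64, 64)          # 2 clips, 16 frames each (dummy)
targets = torch.tensor([3, 7])                 # one gesture label per clip
log_probs = model(clips).permute(1, 0, 2)      # CTCLoss expects (T, B, classes)
loss = ctc(log_probs, targets.unsqueeze(1),
           input_lengths=torch.full((2,), 16),
           target_lengths=torch.full((2,), 1))
loss.backward()
```

Because CTC marginalizes over all alignments via dynamic programming, no per-frame labels are needed: one gesture label per unsegmented clip suffices, which is exactly the property the abstract relies on. Max decoding then corresponds to taking the argmax class per frame and collapsing repeats and blanks.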


2020 ◽  
Vol 29 (6) ◽  
pp. 1153-1164
Author(s):  
Qianyi Xu ◽  
Guihe Qin ◽  
Minghui Sun ◽  
Jie Yan ◽  
Huiming Jiang ◽  
...  

2021 ◽  
Vol 5 (3) ◽  
pp. 1-4
Author(s):  
Dominik Meier ◽  
Christian Zech ◽  
Benjamin Baumann ◽  
Bersant Gashi ◽  
Matthias Malzacher ◽  
...  

Author(s):  
Christian Schoffmann ◽  
Barnaba Ubezio ◽  
Christoph Boehm ◽  
Stephan Muhlbacher-Karrer ◽  
Hubert Zangl
