Fusing shape and spatio-temporal features for depth-based dynamic hand gesture recognition

2016 ◽ Vol 76 (20) ◽ pp. 20525-20544
Author(s): Jinqing Zheng, Zhiyong Feng, Chao Xu, Jing Hu, Weimin Ge

2020 ◽ Vol 10 (11) ◽ pp. 3680
Author(s): Chunyong Ma, Shengsheng Zhang, Anni Wang, Yongyang Qi, Ge Chen

Dynamic hand gesture recognition based on one-shot learning requires fully assimilating motion features from only a few annotated samples. However, how to effectively extract the spatio-temporal features of hand gestures remains a challenging issue. This paper proposes a skeleton-based dynamic hand gesture recognition method using an enhanced network (GREN) based on one-shot learning, which improves the memory-augmented neural network so that it can rapidly assimilate the motion features of dynamic hand gestures. In addition, the network effectively combines and stores features shared between dissimilar classes, which lowers the prediction error caused by unnecessary hyper-parameter updates and improves recognition accuracy as the number of categories increases. The public dynamic hand gesture database (DHGD) is used for an experimental comparison against the state of the art: although only 30% of the dataset was used for training, the accuracy of skeleton-based dynamic hand gesture recognition based on one-shot learning reached 82.29%. Experiments on the Microsoft Research Asia (MSRA) hand gesture dataset verified the robustness of the GREN network. The experimental results demonstrate that the GREN network is feasible for skeleton-based dynamic hand gesture recognition based on one-shot learning.
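To make the memory-augmented idea concrete, the following is a minimal PyTorch sketch of a content-addressed memory read on top of a skeleton-sequence encoder. The class names, memory size, joint count, and class count are illustrative assumptions, not the authors' GREN implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkeletonEncoder(nn.Module):
    """Encode a (T, joints*3) skeleton sequence into a fixed-length key vector."""
    def __init__(self, in_dim, hidden_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden_dim, batch_first=True)

    def forward(self, seq):                 # seq: (B, T, in_dim)
        _, (h, _) = self.lstm(seq)          # h: (1, B, hidden_dim)
        return h.squeeze(0)                 # (B, hidden_dim)

class MemoryReader(nn.Module):
    """Content-based read: cosine attention over external memory slots."""
    def __init__(self, n_slots, slot_dim, n_classes):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(n_slots, slot_dim) * 0.1)
        self.classifier = nn.Linear(slot_dim, n_classes)

    def forward(self, key):                 # key: (B, slot_dim)
        sim = F.cosine_similarity(key.unsqueeze(1), self.memory.unsqueeze(0), dim=-1)
        attn = F.softmax(sim, dim=-1)       # (B, n_slots) attention over slots
        read = attn @ self.memory           # (B, slot_dim) weighted memory read
        return self.classifier(key + read)  # combine the query with the memory read

# Usage on dummy data: 22 joints * 3 coordinates per frame, 14 gesture classes (assumed).
encoder = SkeletonEncoder(in_dim=66)
reader = MemoryReader(n_slots=64, slot_dim=128, n_classes=14)
x = torch.randn(8, 50, 66)                  # batch of 8 sequences, 50 frames each
logits = reader(encoder(x))                 # (8, 14) class scores
```

In a one-shot episodic setup, the memory would additionally be written with encodings of the few labeled support samples per episode, so that shared features are reused across classes rather than relearned through parameter updates.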


2020 ◽  
Vol 17 (4) ◽  
pp. 497-506
Author(s):  
Sunil Patel ◽  
Ramji Makwana

Automatic classification of dynamic hand gestures is challenging because of the large diversity within each gesture class, the low resolution of the input, and the fact that gestures are performed with the fingers. These challenges have drawn many researchers to the area. Recently, deep neural networks have been used for implicit feature extraction, with a softmax layer for classification. In this paper, we propose a method based on a two-dimensional convolutional neural network that performs detection and classification of hand gestures simultaneously from multimodal Red, Green, Blue, Depth (RGBD) and optical-flow data, and passes these features to a Long Short-Term Memory (LSTM) recurrent network for frame-to-frame probability generation, with a Connectionist Temporal Classification (CTC) network for loss calculation. We compute optical flow from the Red, Green, Blue (RGB) data to capture the motion information present in the video. The CTC model efficiently evaluates all possible alignments of a hand gesture via dynamic programming and checks the frame-to-frame consistency of the visual similarity of the gesture in the unsegmented input stream. The CTC network finds the most probable sequence of frames for a gesture class, and the frame with the highest probability value is selected by max decoding. The entire network is trained end-to-end with the CTC loss for gesture recognition. We evaluate on the challenging Vision for Intelligent Vehicles and Applications (VIVA) dataset for dynamic hand gesture recognition, captured with RGB and depth data. On this VIVA dataset, our proposed hand gesture recognition technique outperforms competing state-of-the-art algorithms, achieving an accuracy of 86%.
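The frame-level LSTM + CTC stage described above can be sketched in PyTorch as follows. The per-frame feature extractor is stubbed with a linear layer standing in for the 2D-CNN over RGB-D and optical-flow inputs; the feature dimensions, clip length, and the 19-class label set plus a blank are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

n_classes = 19 + 1            # 19 gesture classes plus index 0 reserved for the CTC blank (assumed)
feat_dim, hidden = 512, 256

frame_features = nn.Linear(feat_dim, hidden)   # stand-in for per-frame 2D-CNN features
lstm = nn.LSTM(hidden, hidden, batch_first=True)
head = nn.Linear(hidden, n_classes)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

x = torch.randn(4, 80, feat_dim)               # 4 clips, 80 frames of CNN features each (dummy)
out, _ = lstm(frame_features(x))               # (4, 80, hidden) frame-wise hidden states
log_probs = head(out).log_softmax(dim=-1).permute(1, 0, 2)   # CTC expects (T, N, C)

targets = torch.randint(1, n_classes, (4, 3))  # up to 3 gesture labels per clip (dummy)
input_lengths = torch.full((4,), 80, dtype=torch.long)
target_lengths = torch.full((4,), 3, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)  # dynamic-programming alignment loss

# Frame-wise arg-max labels; greedy (max) decoding then collapses repeats and drops blanks
# to recover the predicted gesture sequence from the unsegmented stream.
pred = log_probs.argmax(dim=-1).permute(1, 0)  # (4, 80)
```

The CTC loss is what allows training without frame-level annotations: it sums over all frame-to-label alignments consistent with the target gesture sequence, so only clip-level labels are needed.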

