3D Hand Pose Estimation from Monocular RGB with Feature Interaction Module

Hand gesture recognition and hand pose estimation are two closely correlated tasks. In this paper, we propose a deep-learning-based approach that jointly learns an intermediate-level shared feature for these two tasks, so that hand gesture recognition benefits from hand pose estimation. A semi-supervised training scheme is designed to address the lack of proper annotation. Our approach detects the foreground hand, recognizes the hand gesture, and estimates the corresponding 3D hand pose simultaneously. To evaluate the hand gesture recognition performance of state-of-the-art methods, we propose a challenging hand gesture recognition dataset collected in unconstrained environments. Experimental results show that the gesture recognition accuracy of our approach is significantly boosted by leveraging the knowledge learned from the hand pose estimation task.


Generative Model-Based Loss to the Rescue: A Method to Overcome Annotation Errors for Depth-Based Hand Pose Estimation

2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020) ◽ 10.1109/fg47880.2020.00013 ◽ 2020

Author(s): Jiayi Wang ◽ Franziska Mueller ◽ Florian Bernard ◽ Christian Theobalt

Keyword(s): Pose Estimation ◽ Generative Model ◽ Hand Pose Estimation ◽ Model Based ◽ Hand Pose ◽ Annotation Errors


Robust hand gesture recognition using multiple shape-oriented visual cues

EURASIP Journal on Image and Video Processing ◽ 10.1186/s13640-021-00567-1 ◽ 2021 ◽ Vol 2021 (1)

Author(s): Samy Bakheet ◽ Ayoub Al-Hamadi

Keyword(s): Real Time ◽ Gesture Recognition ◽ Pose Estimation ◽ Depth Map ◽ Hand Gesture Recognition ◽ Support Vector ◽ Hand Gesture ◽ Hand Pose Estimation ◽ Time Operation ◽ Hand Pose

Abstract: Robust vision-based hand pose estimation is highly sought after but remains a challenging task, partly because of self-occlusion among the fingers. In this paper, an innovative framework for real-time static hand gesture recognition is introduced, based on an optimized shape representation built from multiple shape cues. The framework incorporates a specific module for hand pose estimation based on depth map data, where the hand silhouette is first extracted from the extremely detailed and accurate depth map captured by a time-of-flight (ToF) depth sensor. A hybrid multi-modal descriptor that integrates multiple affine-invariant boundary-based and region-based features is created from the hand silhouette to obtain a reliable and representative description of individual gestures. Finally, an ensemble of one-vs.-all support vector machines (SVMs) is independently trained on each of these learned feature representations to perform gesture classification. When evaluated on a publicly available dataset incorporating a relatively large and diverse collection of egocentric hand gestures, the approach yields encouraging results that compare very favorably with those reported in the literature, while maintaining real-time operation.
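To make the idea of a region-based shape feature computed from a hand silhouette more concrete, the following is a minimal sketch in pure Python. It is not the descriptor used in the paper, which the abstract does not specify. As a stand-in it computes Hu's first moment invariant, which is invariant to translation, scale and rotation but not to general affine transforms. The function names and the tiny L-shaped mask are hypothetical and serve only as an illustration.

```python
def first_hu_moment(mask):
    """Return Hu's first moment invariant (eta20 + eta02) of a binary mask.

    `mask` is a list of rows; any truthy cell counts as a silhouette pixel.
    """
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no foreground pixels")
    # Centroid of the silhouette; subtracting it gives translation invariance.
    cx = sum(x for x, _ in pixels) / area
    cy = sum(y for _, y in pixels) / area
    # Second-order central moments.
    mu20 = sum((x - cx) ** 2 for x, _ in pixels)
    mu02 = sum((y - cy) ** 2 for _, y in pixels)
    # Dividing by area**2 normalizes second-order moments for scale.
    return (mu20 + mu02) / area ** 2


def translate(mask, dx, dy, width, height):
    """Copy the mask into a larger empty canvas, shifted by (dx, dy)."""
    out = [[0] * width for _ in range(height)]
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            if v:
                out[y + dy][x + dx] = 1
    return out


def rotate90(mask):
    """Rotate the mask by 90 degrees clockwise."""
    return [list(row) for row in zip(*mask[::-1])]


# Hypothetical toy silhouette: an L-shaped blob standing in for a hand mask.
blob = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]

feature = first_hu_moment(blob)
shifted_feature = first_hu_moment(translate(blob, 2, 1, 6, 6))
rotated_feature = first_hu_moment(rotate90(blob))
```

In a pipeline like the one the abstract describes, several such values (boundary-based and region-based) would be concatenated into a feature vector per silhouette and passed to the classifiers. The single scalar here only illustrates why an invariant feature yields the same value for a shifted or rotated copy of the same shape.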
