3D Interacting Hand Pose and Shape Estimation from a Single RGB Image

Precise 3D hand pose estimation can improve human–computer interaction (HCI), and computer-vision-based estimation in particular makes the interaction more natural. Most traditional computer-vision-based methods take depth images as input, which requires complicated and expensive acquisition equipment; estimation from a single RGB image is more convenient and less expensive. Previous RGB-based methods use only 2D keypoint score maps to recover 3D hand poses and ignore the hand texture features and the underlying spatial information in the RGB image, which leads to relatively low accuracy. To address this issue, we propose a channel fusion attention mechanism that combines 2D keypoint features and RGB image features at the channel level. In particular, the proposed method reassigns weights to the cascaded RGB image and 2D keypoint features, so that the different features are weighted and used appropriately. Moreover, our method improves the fusion of different types of feature maps. Multiple comparative experiments on public datasets demonstrate that the accuracy of the proposed method is comparable to that of state-of-the-art methods.
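The abstract does not give the layer design, so the following is only a minimal sketch of one common way to fuse two feature sources at the channel level. It assumes that "cascading" means concatenation along the channel axis and borrows a squeeze-and-excitation style gate (global average pooling, a small two-layer network, a sigmoid) to produce one weight per channel. The function names, layer sizes, and random weights are all hypothetical and are not taken from the paper.

```python
import math
import random


def global_average_pool(feature_maps):
    """Reduce each channel (a list of rows) to its mean value."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_maps]


def dense(vector, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_fusion_attention(rgb_features, keypoint_features, w1, b1, w2, b2):
    """Concatenate two channel lists and rescale every channel by a learned gate."""
    # Assumption: fusion starts by stacking both sources along the channel axis.
    stacked = rgb_features + keypoint_features
    squeezed = global_average_pool(stacked)            # one scalar per channel
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]  # ReLU bottleneck
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]      # weights in (0, 1)
    reweighted = [[[g * v for v in row] for row in ch]
                  for g, ch in zip(gates, stacked)]
    return reweighted, gates


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for trained layer weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
# Toy inputs: 3 RGB feature channels and 2 keypoint channels, each 4x4.
rgb = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(3)]
kps = [[[rng.random() for _ in range(4)] for _ in range(4)] for _ in range(2)]

channels, hidden_units = 5, 2
w1, b1 = random_matrix(hidden_units, channels, rng), [0.0] * hidden_units
w2, b2 = random_matrix(channels, hidden_units, rng), [0.0] * channels

fused, gates = channel_fusion_attention(rgb, kps, w1, b1, w2, b2)
print(len(fused), [round(g, 3) for g in gates])
```

In a trained network the gate weights would be learned jointly with the rest of the model, so channels that help 3D pose recovery receive larger gates; here they are random and serve only to show the data flow.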

Consistent-Resolution Network for 3D Hand Shape Estimation from a Single RGB Image

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1631/1/012014 ◽ 2020 ◽ Vol 1631 ◽ pp. 012014

Author(s): Qi Wu ◽ Joya Chen ◽ Zhiming Yao ◽ Xu Zhou ◽ Jianguo Wang ◽ ...

Keyword(s): Shape Estimation ◽ Hand Shape ◽ RGB Image

SeqHAND: RGB-Sequence-Based 3D Hand Pose and Shape Estimation

Computer Vision – ECCV 2020 - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-58610-2_8 ◽ 2020 ◽ pp. 122-139 ◽ Cited By ~ 1

Author(s): John Yang ◽ Hyung Jin Chang ◽ Seungeui Lee ◽ Nojun Kwak

Keyword(s): Shape Estimation ◽ Hand Pose

Cascaded Hierarchical CNN for RGB-Based 3D Hand Pose Estimation

Mathematical Problems in Engineering ◽ 10.1155/2020/8432840 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-13

Author(s): Shiming Dai ◽ Wei Liu ◽ Wenji Yang ◽ Lili Fan ◽ Jihao Zhang

Keyword(s): Pose Estimation ◽ Depth Image ◽ Estimation Methods ◽ Hierarchical Network ◽ Human Machine Interaction ◽ Depth Cameras ◽ Hand Pose Estimation ◽ Public Datasets ◽ RGB Image ◽ Hand Pose

3D hand pose estimation provides basic information about gestures and is therefore important in Human-Machine Interaction (HMI) and Virtual Reality (VR). In recent years, 3D hand pose estimation from a single depth image has advanced considerably thanks to the development of depth cameras. However, 3D hand pose estimation from a single RGB image remains highly challenging. In this work, we propose a novel four-stage cascaded hierarchical CNN (4CHNet), which uses a hierarchical network to decompose hand pose estimation into finger pose estimation and palm pose estimation, extracts finger features and palm features separately, and finally fuses them to estimate the 3D hand pose. Compared with direct estimation methods, the hand features extracted by the hierarchical network are more representative. Furthermore, concatenating the stages of the network for end-to-end training lets the stages benefit from one another. The experimental results on two public datasets demonstrate that 4CHNet significantly improves the accuracy of 3D hand pose estimation from a single RGB image.
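The finger/palm decomposition can be pictured as a partition of the hand joints into two subsets that separate branches handle before the results are merged. The sketch below is illustrative only: the 21-joint layout, the choice of which joints count as "palm", and the helper names are assumptions, and it merges branch outputs, whereas the abstract describes fusing features inside the network.

```python
# Hypothetical 21-joint layout: index 0 is the wrist, followed by 4 joints
# per finger ordered from base to tip. This ordering is assumed for
# illustration and is not specified in the abstract.
NUM_JOINTS = 21
FINGERS = ["thumb", "index", "middle", "ring", "little"]


def split_indices():
    """Partition joint indices into a palm subset and a finger subset."""
    palm = [0]                       # wrist
    fingers = []
    for f in range(len(FINGERS)):
        base = 1 + 4 * f
        palm.append(base)            # assumption: finger base joints belong to the palm
        fingers.extend(range(base + 1, base + 4))
    return palm, fingers


def fuse(palm_pose, finger_pose, palm_idx, finger_idx):
    """Merge per-branch 3D joints back into a single 21-joint pose."""
    full = [None] * NUM_JOINTS
    for i, joint in zip(palm_idx, palm_pose):
        full[i] = joint
    for i, joint in zip(finger_idx, finger_pose):
        full[i] = joint
    return full


palm_idx, finger_idx = split_indices()

# Stand-ins for the outputs of a palm branch and a finger branch.
palm_pose = [(float(i), 0.0, 0.0) for i in palm_idx]
finger_pose = [(float(i), 1.0, 0.0) for i in finger_idx]

full_pose = fuse(palm_pose, finger_pose, palm_idx, finger_idx)
print(len(palm_idx), len(finger_idx), len(full_pose))
```

Keeping the index partition explicit makes it easy to check that every joint is assigned to exactly one branch before the results are merged.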
