InterNet+: A Light Network for Hand Pose Estimation

Despite recent successes in hand pose estimation from RGB images or depth maps, inherent challenges remain. RGB-based methods suffer from heavy self-occlusions and depth ambiguity. Depth sensors rely heavily on distance and can only be used indoors, thus there are many limitations to the practical application of depth-based methods. The aforementioned challenges have inspired us to combine the two modalities to offset the shortcomings of the other. In this paper, we propose a novel RGB and depth information fusion network to improve the accuracy of 3D hand pose estimation, which is called CrossFuNet. Specifically, the RGB image and the paired depth map are input into two different subnetworks, respectively. The feature maps are fused in the fusion module in which we propose a completely new approach to combine the information from the two modalities. Then, the common method is used to regress the 3D key-points by heatmaps. We validate our model on two public datasets and the results reveal that our model outperforms the state-of-the-art methods.

Download Full-text

Hand Pose Estimation from RGB Images Based on Deep Learning: A Survey

2021 IEEE 7th International Conference on Virtual Reality (ICVR) ◽

10.1109/icvr51878.2021.9483815 ◽

2021 ◽

Author(s):

Yang Liu ◽

Jie Jiang ◽

Jiahao Sun

Keyword(s):

Deep Learning ◽

Pose Estimation ◽

Hand Pose Estimation ◽

Rgb Images ◽

Hand Pose

Download Full-text

Weakly-Supervised 3D Hand Pose Estimation from Monocular RGB Images

Computer Vision – ECCV 2018 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-01231-1_41 ◽

2018 ◽

pp. 678-694 ◽

Cited By ~ 47

Author(s):

Yujun Cai ◽

Liuhao Ge ◽

Jianfei Cai ◽

Junsong Yuan

Keyword(s):

Pose Estimation ◽

Hand Pose Estimation ◽

Rgb Images ◽

Weakly Supervised ◽

Hand Pose

Download Full-text

AN ACTION-TUNED NEURAL NETWORK ARCHITECTURE FOR HAND POSE ESTIMATION

Proceedings of the International Conference on Fuzzy Computation and 2nd International Conference on Neural Computation ◽

10.5220/0003086403580363 ◽

2010 ◽

Keyword(s):

Neural Network ◽

Pose Estimation ◽

Network Architecture ◽

Neural Network Architecture ◽

Hand Pose Estimation ◽

Hand Pose

Download Full-text

Grasping Hand Pose Estimation from RGB Images Using Digital Human Model by Convolutional Neural Network

Proceedings of 3DBODY.TECH 2018 - 9th International Conference and Exhibition on 3D Body Scanning and Processing Technologies, Lugano, Switzerland, 16-17 Oct. 2018 ◽

10.15221/18.154 ◽

2018 ◽

Cited By ~ 1

Author(s):

Kentaro INO ◽

Naoto IENAGA ◽

Yuta SUGIURA ◽

Hideo SAITO ◽

Natsuki MIYATA ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Pose Estimation ◽

Human Model ◽

Hand Pose Estimation ◽

Digital Human Model ◽

Rgb Images ◽

Digital Human ◽

Hand Pose

Download Full-text

CFAM: Estimating 3D Hand Poses from a Single RGB Image with Attention

Applied Sciences ◽

10.3390/app10020618 ◽

2020 ◽

Vol 10 (2) ◽

pp. 618

Author(s):

Xianghan Wang ◽

Jie Jiang ◽

Yanming Guo ◽

Lai Kang ◽

Yingmei Wei ◽

...

Keyword(s):

Computer Vision ◽

Pose Estimation ◽

Spatial Information ◽

Image Features ◽

Estimation Methods ◽

Feature Maps ◽

Hand Pose Estimation ◽

Rgb Images ◽

Rgb Image ◽

Hand Pose

Precise 3D hand pose estimation can be used to improve the performance of human–computer interaction (HCI). Specifically, computer-vision-based hand pose estimation can make this process more natural. Most traditional computer-vision-based hand pose estimation methods use depth images as the input, which requires complicated and expensive acquisition equipment. Estimation through a single RGB image is more convenient and less expensive. Previous methods based on RGB images utilize only 2D keypoint score maps to recover 3D hand poses but ignore the hand texture features and the underlying spatial information in the RGB image, which leads to a relatively low accuracy. To address this issue, we propose a channel fusion attention mechanism that combines 2D keypoint features and RGB image features at the channel level. In particular, the proposed method replans weights by using cascading RGB images and 2D keypoint features, which enables rational planning and the utilization of various features. Moreover, our method improves the fusion performance of different types of feature maps. Multiple contrast experiments on public datasets demonstrate that the accuracy of our proposed method is comparable to the state-of-the-art accuracy.

Download Full-text

An Online Robot Teaching Method using Static Hand Gestures and Poses

10.36227/techrxiv.14572251 ◽

2021 ◽

Author(s):

Digang Sun ◽

Ping Zhang ◽

Mingxuan Chen ◽

Jiaxin Chen

Keyword(s):

Gesture Recognition ◽

Pose Estimation ◽

Hand Gesture Recognition ◽

Human Robot Interaction ◽

Hand Gesture ◽

Depth Information ◽

Hand Gestures ◽

Teaching Mode ◽

Hand Pose Estimation ◽

Hand Pose

With an increasing number of robots are employed in manufacturing, a human-robot interaction method that can teach robots in a natural, accurate, and rapid manner is needed. In this paper, we propose a novel human-robot interface based on the combination of static hand gestures and hand poses. In our proposed interface, the pointing direction of the index finger and the orientation of the whole hand are extracted to indicate the moving direction and orientation of the robot in a fast-teaching mode. A set of hand gestures are designed according to their usage in humans' daily life and recognized to control the position and orientation of the robot in a fine-teaching mode. We employ the feature extraction ability of the hand pose estimation network via transfer learning and utilize attention mechanisms to improve the performance of the hand gesture recognition network. The inputs of hand pose estimation and hand gesture recognition networks are monocular RGB images, making our method independent of depth information input and applicable to more scenarios. In the regular shape reconstruction experiments on the UR3 robot, the mean error of the reconstructed shape is less than 1 mm, which demonstrates the effectiveness and efficiency of our method.

Download Full-text

An Online Robot Teaching Method using Static Hand Gestures and Poses

10.36227/techrxiv.14572251.v1 ◽

2021 ◽

Author(s):

Digang Sun ◽

Ping Zhang ◽

Mingxuan Chen ◽

Jiaxin Chen

Keyword(s):

Gesture Recognition ◽

Pose Estimation ◽

Hand Gesture Recognition ◽

Human Robot Interaction ◽

Hand Gesture ◽

Depth Information ◽

Hand Gestures ◽

Teaching Mode ◽

Hand Pose Estimation ◽

Hand Pose

With an increasing number of robots are employed in manufacturing, a human-robot interaction method that can teach robots in a natural, accurate, and rapid manner is needed. In this paper, we propose a novel human-robot interface based on the combination of static hand gestures and hand poses. In our proposed interface, the pointing direction of the index finger and the orientation of the whole hand are extracted to indicate the moving direction and orientation of the robot in a fast-teaching mode. A set of hand gestures are designed according to their usage in humans' daily life and recognized to control the position and orientation of the robot in a fine-teaching mode. We employ the feature extraction ability of the hand pose estimation network via transfer learning and utilize attention mechanisms to improve the performance of the hand gesture recognition network. The inputs of hand pose estimation and hand gesture recognition networks are monocular RGB images, making our method independent of depth information input and applicable to more scenarios. In the regular shape reconstruction experiments on the UR3 robot, the mean error of the reconstructed shape is less than 1 mm, which demonstrates the effectiveness and efficiency of our method.

Download Full-text