Adversarial PoseNet: A Structure-Aware Convolutional Network for Human Pose Estimation

Human pose estimation is still a challenging task in computer vision, especially in the case of camera view transformation, joints occlusions and overlapping, the task will be of ever-increasing difficulty to achieve success. Most existing methods pass the input through a network, which typically consists of high-to-low resolution sub-networks that are connected in series. Still, during the up-sampling process, the spatial relationships and details might be lost. This paper designs a parallel atrous convolutional network with body structure constraints (PAC-BCNet) to address the problem. Among the mentioned techniques, the parallel atrous convolution (PAC) is constructed to deal with scale changes by connecting multiple different atrous convolution sub-networks in parallel. And it is used to extract features from different scales without reducing the resolution. Besides, the body structure constraints (BC), which enhance the correlation between each keypoint, are constructed to obtain better spatial relationships of the body by designing keypoints constraints sets and improving the loss function. In this work, a comparative experiment of the serial atrous convolution, the parallel atrous convolution, the ablation study with and without body structure constraints are conducted, which reasonably proves the effectiveness of the approach. The model is evaluated on two widely used human pose estimation benchmarks (MPII and LSP). The method achieves better performance on both datasets.

Download Full-text

A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video

10.1109/icra48506.2021.9561605 ◽

2021 ◽

Author(s):

Junfa Liu ◽

Juan Rojas ◽

Yihui Li ◽

Zhijun Liang ◽

Yisheng Guan ◽

...

Keyword(s):

Pose Estimation ◽

Human Pose Estimation ◽

Convolutional Network ◽

Spatio Temporal ◽

Human Pose ◽

3D Human Pose Estimation

Download Full-text

Multi-View Pose Generator Based on Deep Learning for Monocular 3D Human Pose Estimation

Symmetry ◽

10.3390/sym12071116 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1116 ◽

Cited By ~ 2

Author(s):

Jun Sun ◽

Mantao Wang ◽

Xin Zhao ◽

Dejun Zhang

Keyword(s):

Deep Learning ◽

Pose Estimation ◽

Data Augmentation ◽

Estimation Method ◽

Human Pose Estimation ◽

Convolutional Network ◽

3D Pose Estimation ◽

Single View ◽

Human Pose ◽

3D Human Pose Estimation

In this paper, we study the problem of monocular 3D human pose estimation based on deep learning. Due to single view limitations, the monocular human pose estimation cannot avoid the inherent occlusion problem. The common methods use the multi-view based 3D pose estimation method to solve this problem. However, single-view images cannot be used directly in multi-view methods, which greatly limits practical applications. To address the above-mentioned issues, we propose a novel end-to-end 3D pose estimation network for monocular 3D human pose estimation. First, we propose a multi-view pose generator to predict multi-view 2D poses from the 2D poses in a single view. Secondly, we propose a simple but effective data augmentation method for generating multi-view 2D pose annotations, on account of the existing datasets (e.g., Human3.6M, etc.) not containing a large number of 2D pose annotations in different views. Thirdly, we employ graph convolutional network to infer a 3D pose from multi-view 2D poses. From experiments conducted on public datasets, the results have verified the effectiveness of our method. Furthermore, the ablation studies show that our method improved the performance of existing 3D pose estimation networks.

Download Full-text