Human pose estimation based on parallel atrous convolution and body structure constraints

Human pose estimation remains a challenging task in computer vision, and it becomes harder still under camera view changes, joint occlusion, and overlapping bodies. Most existing methods pass the input through a network that typically consists of high-to-low resolution sub-networks connected in series. During the up-sampling process, however, spatial relationships and details may be lost. This paper designs a parallel atrous convolutional network with body structure constraints (PAC-BCNet) to address the problem. The parallel atrous convolution (PAC) handles scale changes by connecting multiple different atrous convolution sub-networks in parallel, and it extracts features at different scales without reducing the resolution. The body structure constraints (BC), which strengthen the correlation between keypoints, capture better spatial relationships of the body by designing keypoint constraint sets and improving the loss function. Comparative experiments on serial versus parallel atrous convolution, together with an ablation study with and without body structure constraints, support the effectiveness of the approach. The model is evaluated on two widely used human pose estimation benchmarks, MPII and LSP, and achieves better performance on both datasets.
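
The idea of running atrous (dilated) convolutions in parallel without losing resolution can be illustrated with a minimal 1-D sketch. This is not the PAC-BCNet implementation: the function names, the odd-length kernel, the zero padding, and the fusion by element-wise sum are all assumptions made for illustration, and the operation is the cross-correlation form commonly used in deep learning.

```python
def atrous_conv1d(signal, kernel, dilation):
    """1-D dilated ("atrous") convolution with zero padding.

    Assumes an odd-length kernel, so the output has the same length as the
    input, i.e. the resolution is not reduced.
    """
    half = (len(kernel) // 2) * dilation
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += w * padded[i + j * dilation]
        out.append(acc)
    return out


def parallel_atrous(signal, kernel, dilations=(1, 2, 4)):
    """Apply several dilation rates side by side and fuse the branches.

    Fusion by element-wise sum is an illustrative assumption; the abstract
    does not say how the parallel branches are combined.
    """
    branches = [atrous_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d in (1, 2):
        print(d, atrous_conv1d(impulse, [1, 1, 1], d))
    print("fused", parallel_atrous(impulse, [1, 1, 1], (1, 2)))
```

Larger dilation rates widen the receptive field of the same three-tap kernel, while every branch keeps the input length, which is the property the abstract attributes to PAC.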


Learning human poses in natural scenes

10.32469/10355/66196 ◽

2018 ◽

Author(s):


Guanghan Ning

Keyword(s):

Computer Vision ◽

Pose Estimation ◽

The Body ◽

Human Pose Estimation ◽

Natural Scenes ◽

Top Down ◽

University Of Missouri ◽

Single Person ◽

Human Pose ◽

High Level

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] The task of human pose estimation in natural scenes is to determine the precise pixel locations of body keypoints. It is very important for many high-level computer vision tasks, including action and activity recognition, human-computer interaction, motion capture, and animation. We cover two different approaches for this task: a top-down approach and a bottom-up approach. In the top-down approach, we propose a human tracking method called ROLO that localizes each person. We then propose a state-of-the-art single-person human pose estimator that predicts the body keypoints of each individual. In the bottom-up approach, we propose an efficient multi-person pose estimator with which we participated in a PoseTrack challenge [11]. On top of these, we propose to employ adversarial training to further boost the performance of the single-person human pose estimator while generating synthetic images. We also propose a novel PoSeg network that jointly estimates multi-person human poses and semantically segments the portraits of these persons at the pixel level. Lastly, we extend some of the proposed methods on human pose estimation and portrait segmentation to the task of human parsing, a more fine-grained computer vision perception of humans.


A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video

10.1109/icra48506.2021.9561605 ◽

2021 ◽

Author(s):

Junfa Liu ◽

Juan Rojas ◽

Yihui Li ◽

Zhijun Liang ◽

Yisheng Guan ◽

...

Keyword(s):

Pose Estimation ◽

Human Pose Estimation ◽

Convolutional Network ◽

Spatio Temporal ◽

Human Pose ◽

3D Human Pose Estimation


Adversarial PoseNet: A Structure-Aware Convolutional Network for Human Pose Estimation

2017 IEEE International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2017.137 ◽

2017 ◽

Cited By ~ 86

Author(s):

Yu Chen ◽

Chunhua Shen ◽

Xiu-Shen Wei ◽

Lingqiao Liu ◽

Jian Yang

Keyword(s):

Pose Estimation ◽

Human Pose Estimation ◽

Convolutional Network ◽

Human Pose


A survey of human pose estimation: The body parts parsing based methods

Journal of Visual Communication and Image Representation ◽

10.1016/j.jvcir.2015.06.013 ◽

2015 ◽

Vol 32 ◽

pp. 10-19 ◽

Cited By ~ 32

Author(s):

Zhao Liu ◽

Jianke Zhu ◽

Jiajun Bu ◽

Chun Chen

Keyword(s):

Pose Estimation ◽

The Body ◽

Human Pose Estimation ◽

Body Parts ◽

Human Pose


Semi-Dynamic Hypergraph Neural Network for 3D Pose Estimation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/109 ◽

2020 ◽

Author(s):

Shengyuan Liu ◽

Pei Lv ◽

Yuzhen Zhang ◽

Jie Fu ◽

Junjin Cheng ◽

...

Keyword(s):

Neural Network ◽

Pose Estimation ◽

State Of The Art ◽

Human Pose Estimation ◽

Body Structure ◽

Single Image ◽

3D Pose Estimation ◽

Convolutional Networks ◽

Human Pose ◽

Fixed Tree

This paper proposes a novel Semi-Dynamic Hypergraph Neural Network (SD-HNN) to estimate 3D human pose from a single image. SD-HNN adopts a hypergraph to represent the human body in order to effectively exploit the kinematic constraints among adjacent and non-adjacent joints. Specifically, a pose hypergraph in SD-HNN has two components. One is a static hypergraph constructed according to the conventional tree body structure. The other is the semi-dynamic hypergraph representing the dynamic kinematic constraints among different joints. These two hypergraphs are combined and trained in an end-to-end fashion. Unlike traditional Graph Convolutional Networks (GCNs) that are based on a fixed tree structure, the SD-HNN can deal with ambiguity in human pose estimation. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on both the Human3.6M and MPI-INF-3DHP datasets.
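
To make the hypergraph representation concrete, the sketch below builds an incidence matrix in which one hyperedge can join more than two joints. It is only an illustration and not the SD-HNN code: the joint list, the hyperedge groupings, and the helper names are hypothetical, and the extra hyperedge is hard-coded here whereas the abstract describes the semi-dynamic part as capturing dynamic constraints.

```python
# Hypothetical, heavily reduced skeleton used only for illustration.
JOINTS = ["hip", "knee", "ankle", "shoulder", "elbow", "wrist"]

# Static hyperedges following a tree-like body structure (assumed grouping).
STATIC_HYPEREDGES = {
    "leg": ["hip", "knee", "ankle"],
    "arm": ["shoulder", "elbow", "wrist"],
}

# Assumed extra hyperedge linking non-adjacent joints.
EXTRA_HYPEREDGES = {
    "wrist_ankle": ["wrist", "ankle"],
}


def incidence_matrix(joints, hyperedges):
    """Return H where H[v][e] == 1 if joint v belongs to hyperedge e."""
    index = {name: i for i, name in enumerate(joints)}
    edge_names = list(hyperedges)
    matrix = [[0] * len(edge_names) for _ in joints]
    for e, name in enumerate(edge_names):
        for joint in hyperedges[name]:
            matrix[index[joint]][e] = 1
    return matrix


def vertex_degrees(matrix):
    """Number of hyperedges each joint participates in."""
    return [sum(row) for row in matrix]


if __name__ == "__main__":
    combined = {**STATIC_HYPEREDGES, **EXTRA_HYPEREDGES}
    H = incidence_matrix(JOINTS, combined)
    for joint, row in zip(JOINTS, H):
        print(f"{joint:9s}", row)
    print("degrees", vertex_degrees(H))
```

Unlike an ordinary graph edge, a column of this matrix may contain more than two ones, which is what lets a single hyperedge relate a whole limb or a pair of non-adjacent joints.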


3D Human Pose Estimation in Vietnamese Traditional Martial Art Videos

Journal of Advanced Engineering and Computation ◽

10.25073/jaec.201933.252 ◽

2019 ◽

Vol 3 (3) ◽

pp. 471

Author(s):

Tuong Thanh Nguyen ◽

Van-Hung Le ◽

Duy-Long Duong ◽

Thanh-Cong Pham ◽

Dung Le

Keyword(s):

Pose Estimation ◽

Martial Arts ◽

Social Life ◽

The Body ◽

Human Pose Estimation ◽

Body Parts ◽

Creative Commons ◽

Martial Art ◽

Benchmark Datasets ◽

Human Pose

Preserving, maintaining and teaching traditional martial arts are very important activities in social life. They help preserve national culture and provide exercise and self-defense for practitioners. However, traditional martial arts involve many different postures, and the movements of the body and body parts are diverse. Estimating the actions of the human body still poses many challenges, such as accuracy, obscurity, etc. In this paper, we survey several strong studies from recent years on 3-D human pose estimation. Statistical tables are compiled by year, and typical results of these studies on the Human3.6M dataset are summarized. We also present a comparative study of 3-D human pose estimation methods that use a single image. This study is based on methods that use a Convolutional Neural Network (CNN) for 2-D pose estimation and then use a 3-D pose library to map the 2-D results into 3-D space. The CNN models are trained on benchmark datasets such as the MSCOCO Keypoints Challenge dataset [1], Human3.6M [2], the MPII dataset [3], LSP [4], [5], etc. Finally, we publish a dataset of Vietnamese traditional martial arts from Binh Dinh province for evaluating 3-D human pose estimation. Quantitative results are presented and evaluated. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium provided the original work is properly cited.
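
One simple way to map a 2-D pose into 3-D with a pose library is a nearest-neighbour lookup, sketched below. The abstract does not describe how the surveyed methods perform this mapping, so everything here is an assumption: the orthographic projection, the squared-distance matching, the function names, and the toy two-joint library.

```python
def project_orthographic(pose3d):
    """Drop the depth coordinate of every joint (assumed camera model)."""
    return [(x, y) for x, y, _z in pose3d]


def pose_distance(pose_a, pose_b):
    """Sum of squared 2-D distances between corresponding joints."""
    return sum(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(pose_a, pose_b)
    )


def lift_with_library(pose2d, library):
    """Return the library 3-D pose whose projection best matches pose2d."""
    return min(
        library,
        key=lambda pose3d: pose_distance(pose2d, project_orthographic(pose3d)),
    )


if __name__ == "__main__":
    # Hypothetical library of two 3-D poses with two joints each.
    library = [
        [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
        [(0.0, 0.0, 2.0), (0.0, 1.0, 2.0)],
    ]
    detected_2d = [(0.0, 0.0), (0.1, 0.9)]  # e.g. output of a 2-D CNN
    print(lift_with_library(detected_2d, library))
```

A real system would also need to normalize scale and translation before matching; that step is omitted here to keep the sketch short.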


Multi-View Pose Generator Based on Deep Learning for Monocular 3D Human Pose Estimation

Symmetry ◽

10.3390/sym12071116 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1116 ◽

Cited By ~ 2

Author(s):

Jun Sun ◽

Mantao Wang ◽

Xin Zhao ◽

Dejun Zhang

Keyword(s):

Deep Learning ◽

Pose Estimation ◽

Data Augmentation ◽

Estimation Method ◽

Human Pose Estimation ◽

Convolutional Network ◽

3D Pose Estimation ◽

Single View ◽

Human Pose ◽

3D Human Pose Estimation

In this paper, we study the problem of monocular 3D human pose estimation based on deep learning. Because of single-view limitations, monocular human pose estimation cannot avoid the inherent occlusion problem. Common approaches use multi-view 3D pose estimation to solve this problem. However, single-view images cannot be used directly in multi-view methods, which greatly limits practical applications. To address these issues, we propose a novel end-to-end network for monocular 3D human pose estimation. First, we propose a multi-view pose generator to predict multi-view 2D poses from the 2D poses in a single view. Second, we propose a simple but effective data augmentation method for generating multi-view 2D pose annotations, because existing datasets (e.g., Human3.6M) do not contain a large number of 2D pose annotations in different views. Third, we employ a graph convolutional network to infer a 3D pose from multi-view 2D poses. Experiments on public datasets verify the effectiveness of our method. Furthermore, the ablation studies show that our method improves the performance of existing 3D pose estimation networks.
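
The notion of generating multi-view 2D pose annotations from existing data can be sketched by rotating a 3D pose about the vertical axis and projecting each rotated copy. The abstract does not specify the authors' augmentation procedure, so this is only one plausible reading: the rotation axis, the set of angles, the orthographic projection, and the function names are all assumptions.

```python
import math


def rotate_y(pose3d, angle_deg):
    """Rotate every joint about the vertical (y) axis by angle_deg degrees."""
    angle = math.radians(angle_deg)
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in pose3d]


def multi_view_2d(pose3d, angles=(0, 90, 180, 270)):
    """Map each virtual camera angle to the 2-D projection of the rotated pose.

    Orthographic projection (dropping depth) is an illustrative assumption.
    """
    return {
        angle: [(x, y) for x, y, _z in rotate_y(pose3d, angle)]
        for angle in angles
    }


if __name__ == "__main__":
    # Hypothetical two-joint pose.
    pose = [(1.0, 2.0, 0.0), (0.0, 1.0, 1.0)]
    for angle, pose2d in multi_view_2d(pose).items():
        print(angle, [(round(x, 3), round(y, 3)) for x, y in pose2d])
```

Each rotated projection plays the role of a 2D annotation seen from a different viewpoint, which is the kind of data a multi-view pose generator would need for training.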
