A Review: Point Cloud-Based 3D Human Joints Estimation

Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1684
Author(s):  
Tianxu Xu ◽  
Dong An ◽  
Yuetong Jia ◽  
Yang Yue

Joint estimation of the human body is applicable to many fields such as human–computer interaction, autonomous driving, video analysis and virtual reality. Although much depth-based research has been classified and summarized in previous review or survey papers, point cloud-based human pose estimation remains difficult due to the unordered nature and rotation invariance of point clouds. In this review, we summarize recent developments in point cloud-based pose estimation of the human body. The existing works are divided into three categories according to their working principles: template-based, feature-based and machine learning-based methods. In particular, significant works are highlighted with a detailed introduction analyzing their characteristics and limitations. The widely used datasets in the field are summarized, and quantitative comparisons are provided for the representative methods. Moreover, this review helps further understanding of the pertinent applications in many frontier research directions. Finally, we conclude with the challenges involved and the problems to be solved in future research.

Author(s):  
J. Jeong ◽  
I. Lee

The demand for highly precise maps grows with the development of autonomous driving vehicles. A highly precise map offers centimetre-level precision, unlike existing commercial maps with metre-level precision. Such maps are important for understanding road environments and making driving decisions, since robust localization is one of the critical challenges for autonomous vehicles. One key data source is lidar, which provides dense point clouds containing three-dimensional positions, intensities, and ranges from the sensor to the target. In this paper, we focus on how to segment point cloud data from a vehicle-mounted lidar and classify objects on the road for the highly precise map. In particular, we propose combining a feature descriptor with a machine learning classification algorithm. Objects are distinguished by geometric features based on the surface normal of each point. To achieve correct classification with limited point cloud datasets, a Support Vector Machine algorithm is used. The final step is to evaluate the accuracy of the obtained results by comparing them with reference data. The results show sufficient accuracy to be used for generating a highly precise road map.
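As an illustration of the described pipeline, the sketch below estimates a surface normal per point via local PCA and feeds simple geometric features to an SVM classifier. The specific features (normal verticality, planarity, height), the helper names, and the SciPy/scikit-learn tooling are assumptions for illustration, not the authors' exact descriptor or implementation.

```python
# Sketch: per-point geometric features derived from surface normals, classified with an SVM.
# Feature choices below are illustrative assumptions, not the paper's exact descriptor.
import numpy as np
from scipy.spatial import cKDTree
from sklearn.svm import SVC

def point_features(points, k=20):
    """Estimate a surface normal per point via local PCA and derive simple features."""
    tree = cKDTree(points)
    feats = []
    for p in points:
        _, idx = tree.query(p, k=k)
        nbrs = points[idx] - points[idx].mean(axis=0)
        # Eigen-decomposition of the local covariance: smallest eigenvector ~ surface normal.
        eigval, eigvec = np.linalg.eigh(nbrs.T @ nbrs)
        normal = eigvec[:, 0]
        verticality = abs(normal[2])                    # ~1 for road surface, ~0 for poles/walls
        planarity = (eigval[1] - eigval[0]) / (eigval[2] + 1e-9)
        feats.append([verticality, planarity, p[2]])    # include point height as a feature
    return np.asarray(feats)

# Hypothetical training data: segmented point clusters with known labels (road, curb, pole, ...).
# X_train = np.vstack([point_features(seg) for seg in segments])
# y_train = np.concatenate([np.full(len(seg), lbl) for seg, lbl in zip(segments, labels)])
# clf = SVC(kernel="rbf").fit(X_train, y_train)
# y_pred = clf.predict(point_features(new_segment))
```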


Sensors ◽  
2018 ◽  
Vol 18 (8) ◽  
pp. 2719 ◽  
Author(s):  
Diyi Liu ◽  
Shogo Arai ◽  
Jiaqi Miao ◽  
Jun Kinugawa ◽  
Zhao Wang ◽  
...  

Automation of the bin picking task with robots entails the key step of pose estimation, which identifies and locates objects so that the robot can pick and manipulate them in an accurate and reliable way. This paper proposes a novel point pair feature-based descriptor named Boundary-to-Boundary-using-Tangent-Line (B2B-TL) to estimate the pose of industrial parts, including parts whose point clouds lack key details, for example, the point cloud of the ridges of a part. The proposed descriptor utilizes the 3D point cloud data and 2D image data of the scene simultaneously, and the 2D image data can compensate for the missing key details of the point cloud. Based on the B2B-TL descriptor, Multiple Edge Appearance Models (MEAM), a method using multiple models to describe the target object, is proposed to increase the recognition rate and reduce the computation time. A novel pipeline for the online computation process is presented to take advantage of B2B-TL and MEAM. Our algorithm is evaluated on synthetic and real scenes and implemented in a bin picking system. The experimental results show that our method is sufficiently accurate for a robot to grasp industrial parts and is fast enough to be used in a real factory environment.
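For context, the sketch below computes the classic four-dimensional point pair feature (one distance and three angles) that descriptors such as B2B-TL build upon. It is a generic PPF illustration, not the authors' B2B-TL formulation, which additionally uses boundary points and tangent lines together with 2D image data.

```python
# Classic 4D point pair feature: F(p1, p2) = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2)).
import numpy as np

def point_pair_feature(p1, n1, p2, n2):
    """p1, p2: 3D points; n1, n2: unit normals at those points."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    d_unit = d / (dist + 1e-12)

    def angle(a, b):
        return np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))

    return np.array([dist, angle(n1, d_unit), angle(n2, d_unit), angle(n1, n2)])

# In PPF-style pipelines, these features are quantized into a hash table offline so that
# model-scene correspondences can be retrieved by voting during the online matching stage.
```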


2021 ◽  
Vol 20 (5s) ◽  
pp. 1-22
Author(s):  
Sizhe An ◽  
Umit Y. Ogras

Rehabilitation is a crucial process for patients suffering from motor disorders. The current practice is to perform rehabilitation exercises under clinical expert supervision. New approaches are needed to allow patients to perform prescribed exercises at home and to alleviate commuting requirements, expert shortages, and healthcare costs. Human joint estimation is a substantial component of these programs since it offers valuable visualization and feedback based on body movements. Camera-based systems have been popular for capturing joint motion. However, they are costly, raise serious privacy concerns, and require strict lighting and placement settings. We propose a millimeter-wave (mmWave)-based assistive rehabilitation system (MARS) for motor disorders to address these challenges. MARS provides a low-cost solution with competitive localization and detection accuracy. It first maps the 5D time-series point cloud from mmWave to a lower dimension. Then, it uses a convolutional neural network (CNN) to estimate the accurate locations of human joints. MARS can reconstruct 19 human joints and their skeleton from the point cloud generated by mmWave radar. We evaluate MARS using ten specific rehabilitation movements performed by four human subjects, involving all body parts, and obtain an average mean absolute error of 5.87 cm over all joint positions. To the best of our knowledge, this is the first rehabilitation-movement dataset based on mmWave point clouds. MARS is evaluated on the Nvidia Jetson Xavier-NX board. Model inference takes only 64 µs and consumes 442 µJ of energy. These results demonstrate the practicality of MARS on low-power edge devices.
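A minimal sketch of such a pipeline is given below: each mmWave frame's 5D points (x, y, z, Doppler, intensity) are assumed to be rasterized into a fixed-size grid, and a small CNN regresses the 19 joint positions, evaluated with a mean absolute error. The grid size, layer widths, and PyTorch tooling are illustrative assumptions rather than the paper's exact architecture.

```python
# Sketch: rasterized mmWave frames (5 channels) -> small CNN -> 19 joint coordinates.
import torch
import torch.nn as nn

class JointRegressor(nn.Module):
    def __init__(self, n_joints=19):
        super().__init__()
        self.n_joints = n_joints
        self.features = nn.Sequential(
            nn.Conv2d(5, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(32 * 4 * 4, n_joints * 3)

    def forward(self, x):                         # x: (batch, 5, H, W) rasterized frame
        z = self.features(x).flatten(1)
        return self.head(z).view(-1, self.n_joints, 3)   # (batch, joints, xyz)

model = JointRegressor()
frames = torch.randn(8, 5, 32, 32)                # dummy rasterized mmWave frames
pred = model(frames)
gt = torch.randn(8, 19, 3)                        # dummy ground-truth joints (metres)
mae_cm = (pred - gt).abs().mean() * 100           # mean absolute error, metres -> cm
```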


2020 ◽  
Vol 34 (07) ◽  
pp. 13033-13040 ◽  
Author(s):  
Lu Zhou ◽  
Yingying Chen ◽  
Jinqiao Wang ◽  
Hanqing Lu

In this paper, we propose a progressive pose grammar network learned with Bi-C3D (Bidirectional Convolutional 3D) for human pose estimation. Exploiting the dependencies among human body parts proves effective in handling problems such as complex articulation and occlusion. Therefore, we propose two articulated grammars learned with Bi-C3D to build the relationships between human joints and exploit the contextual information of the human body structure. First, a local multi-scale Bi-C3D kinematics grammar is proposed to promote the message passing process among locally related joints. The multi-scale kinematics grammar exploits different levels of human context learned by the network. Moreover, a global sequential grammar is put forward to capture the long-range dependencies among human body joints. The whole procedure can be regarded as a local-to-global progressive refinement process. Without bells and whistles, our method achieves competitive performance on both the MPII and LSP benchmarks compared with previous methods, which confirms the feasibility and effectiveness of C3D for information interaction.
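The sketch below illustrates the general idea of bidirectional message passing along a simplified kinematic chain, with a 3D convolution mixing pairs of adjacent joint feature maps. The chain, channel sizes, and kernel shape are assumptions and only loosely follow the Bi-C3D design described above.

```python
# Illustrative bidirectional message passing along a simplified kinematic chain.
import torch
import torch.nn as nn

chain = ["head", "neck", "shoulder", "elbow", "wrist"]   # assumed, simplified chain

class MessagePass(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        # A 3D conv mixes two adjacent joint maps stacked along a depth axis of size 2.
        self.conv = nn.Conv3d(ch, ch, kernel_size=(2, 3, 3), padding=(0, 1, 1))

    def forward(self, sender, receiver):                 # each: (B, C, H, W)
        pair = torch.stack([sender, receiver], dim=2)    # (B, C, 2, H, W)
        return receiver + self.conv(pair).squeeze(2)     # refined receiver map

fwd, bwd = MessagePass(), MessagePass()
feats = {j: torch.randn(1, 32, 64, 64) for j in chain}   # dummy per-joint feature maps

for a, b in zip(chain, chain[1:]):                       # forward pass: root -> leaf
    feats[b] = fwd(feats[a], feats[b])
rev = list(reversed(chain))
for a, b in zip(rev, rev[1:]):                           # backward pass: leaf -> root
    feats[b] = bwd(feats[a], feats[b])
```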


Author(s):  
Chenchen Liu ◽  
Yongzhi Li ◽  
Kangqi Ma ◽  
Duo Zhang ◽  
Peijun Bao ◽  
...  

3-D human pose estimation is a crucial step toward understanding human actions. However, reliably capturing the precise 3-D positions of human joints is non-trivial and tedious. Current models often suffer from the scarcity of high-quality 3-D annotated training data. In this work, we explore a novel way of obtaining large amounts of 3-D human pose data without manual annotation. In catadioptric videos (e.g., people dancing before a mirror), the camera records both the original and mirrored human poses, which provides cues for estimating the 3-D positions of human joints. Following this idea, we crawl a large-scale Dance-before-Mirror (DBM) video dataset, which is about 24 times larger than the existing Human3.6M benchmark. Our technical insight is that, by jointly harnessing epipolar geometry and human skeleton priors, 3-D joint estimation boils down to an optimization problem over two sets of 2-D estimations. To the best of our knowledge, this represents the first work that collects high-quality 3-D human data via catadioptric systems. We have conducted comprehensive experiments on cross-scenario pose estimation and visualization analysis. The results strongly demonstrate the usefulness of our proposed DBM human poses.
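The optimization view can be sketched as follows: the real camera and its mirror image act as a two-view system, and 3-D joints are found by least squares over the reprojection errors in both views plus a bone-length prior. The projection matrices, bone list, weighting, and SciPy solver are hypothetical inputs for illustration, not the authors' exact formulation.

```python
# Sketch: solve for 3D joints consistent with real-view and mirrored-view 2D detections,
# regularized by a skeleton (bone-length) prior.
import numpy as np
from scipy.optimize import least_squares

def project(P, X):                        # P: (3, 4) projection matrix, X: (J, 3) -> (J, 2)
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:3]

def residuals(flat, P_real, P_mirror, kp_real, kp_mirror, bones, ref_len, w=1.0):
    X = flat.reshape(-1, 3)
    r = [project(P_real, X) - kp_real,           # reprojection error, real view
         project(P_mirror, X) - kp_mirror]       # reprojection error, mirrored view
    bone_len = np.array([np.linalg.norm(X[i] - X[j]) for i, j in bones])
    r.append(w * (bone_len - ref_len)[:, None])  # skeleton prior: bone lengths near reference
    return np.concatenate([e.ravel() for e in r])

# X0: initial 3D guess, e.g. the real-view 2D pose lifted to a nominal depth.
# sol = least_squares(residuals, X0.ravel(),
#                     args=(P_real, P_mirror, kp_real, kp_mirror, bones, ref_len))
# joints_3d = sol.x.reshape(-1, 3)
```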


2016 ◽  
Vol 136 (8) ◽  
pp. 1078-1084
Author(s):  
Shoichi Takei ◽  
Shuichi Akizuki ◽  
Manabu Hashimoto

Author(s):  
Yu Shao ◽  
Xinyue Wang ◽  
Wenjie Song ◽  
Sobia Ilyas ◽  
Haibo Guo ◽  
...  

With the increasing aging population in modern society, falls and fall-induced injuries in elderly people have become one of the major public health problems. This study proposes a classification framework that uses floor vibrations to detect fall events and to distinguish different fall postures. A scaled 3D-printed model with twelve fully adjustable joints that can simulate human body movement was built to generate human fall data. The mass distribution of the human body was carefully studied and reflected in the model. Object drop and human fall tests were carried out, and the vibration signatures generated in the floor were recorded for analysis. Machine learning algorithms, including the K-means and K-nearest-neighbor algorithms, were used in the classification process. Three classifiers (human walking versus human fall, human fall versus object drop, and human falls from different postures) were developed in this study. Results showed that the three proposed classifiers achieve accuracies of 100%, 85%, and 91%, respectively. This paper develops a framework that uses floor vibration and a machine learning approach to build a pattern recognition system for detecting human falls.
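A minimal sketch of such a classifier is shown below, assuming simple hand-crafted features (peak amplitude, signal energy, event duration) extracted from a floor-vibration trace and fed to scikit-learn's K-nearest-neighbor classifier. The feature set, thresholds, and labels are illustrative assumptions, not the paper's exact signal processing.

```python
# Sketch: vibration-trace features -> K-nearest-neighbor classification of fall events.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def vibration_features(signal, fs=1000, rel_thresh=0.05):
    """signal: 1-D floor-vibration trace; fs: sampling rate in Hz."""
    peak = np.max(np.abs(signal))
    energy = np.sum(signal ** 2)
    active = np.abs(signal) > rel_thresh * peak
    duration = active.sum() / fs                  # seconds above a relative amplitude threshold
    return [peak, energy, duration]

# Hypothetical labelled events: 0 = walking, 1 = human fall, 2 = object drop.
# X = np.array([vibration_features(s) for s in training_signals])
# clf = KNeighborsClassifier(n_neighbors=3).fit(X, training_labels)
# print(clf.predict([vibration_features(new_event)]))
```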


2021 ◽  
Vol 11 (9) ◽  
pp. 4241
Author(s):  
Jiahua Wu ◽  
Hyo Jong Lee

In bottom-up multi-person pose estimation, grouping joint candidates into the correct person instances is challenging. In this paper, a new bottom-up method, the Partitioned CenterPose (PCP) Network, is proposed to better cluster the detected joints. To achieve this goal, we propose a novel approach called Partition Pose Representation (PPR), which integrates a person instance and its body joints based on joint offsets. PPR leverages the center of the human body and the offsets between that center point and the positions of the body's joints to encode human poses accurately. To enhance the relationships between body joints, we divide the human body into five parts and generate a sub-PPR for each part. Based on this PPR, the PCP Network can detect people and their body joints simultaneously, then group all body joints according to joint offset. Moreover, an improved L1 loss is designed to measure joint offsets more accurately. Using the COCO keypoints and CrowdPose datasets for testing, the performance of the proposed method is found to be on par with existing state-of-the-art bottom-up methods in terms of both accuracy and speed.
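As a rough illustration of the center-plus-offset encoding, the sketch below decodes absolute joint positions as center + offset and assigns free joint candidates to the nearest decoded person. The array shapes and the grouping rule are assumptions for illustration, not the exact PPR/PCP implementation.

```python
# Sketch: decode joints from person centers plus per-joint offsets, then group candidates.
import numpy as np

def decode_joints(centers, offsets):
    """centers: (P, 2) person centers; offsets: (P, J, 2) predicted joint offsets."""
    return centers[:, None, :] + offsets                 # (P, J, 2) absolute joint positions

def group_candidates(candidates, decoded):
    """Assign each candidate (x, y, joint_id) to the person whose decoded joint is nearest."""
    assignments = []
    for x, y, j in candidates:
        dists = np.linalg.norm(decoded[:, int(j), :] - np.array([x, y]), axis=1)
        assignments.append(int(np.argmin(dists)))        # index of the closest person instance
    return assignments

centers = np.array([[100.0, 120.0], [300.0, 110.0]])     # two dummy person centers
offsets = np.random.randn(2, 17, 2) * 20                 # dummy offsets for 17 joints
people_joints = decode_joints(centers, offsets)
```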

