Improved 3D Human Motion Capture Using Kinect Skeleton and Depth Sensor

Alireza Bilesan; Shunsuke Komizunai; Teppei Tsujita; Atsushi Konno;  ;

doi:10.20965/jrm.2021.p1408

Improved 3D Human Motion Capture Using Kinect Skeleton and Depth Sensor

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2021.p1408 ◽

2021 ◽

Vol 33 (6) ◽

pp. 1408-1422

Author(s):

Alireza Bilesan ◽

Shunsuke Komizunai ◽

Teppei Tsujita ◽

Atsushi Konno ◽

◽

...

Keyword(s):

Motion Capture ◽

Correlation Coefficients ◽

Ground Truth ◽

Cost Effective ◽

Human Motion ◽

Human Gait ◽

Joint Angles ◽

Kinematic Parameters ◽

Depth Sensor ◽

Skeleton Model

Kinect has been utilized as a cost-effective, easy-to-use motion capture sensor using the Kinect skeleton algorithm. However, a limited number of landmarks and inaccuracies in tracking the landmarks’ positions restrict Kinect’s capability. In order to increase the accuracy of motion capturing using Kinect, joint use of the Kinect skeleton algorithm and Kinect-based marker tracking was applied to track the 3D coordinates of multiple landmarks on human. The motion’s kinematic parameters were calculated using the landmarks’ positions by applying the joint constraints and inverse kinematics techniques. The accuracy of the proposed method and OptiTrack (NaturalPoint, Inc., USA) was evaluated in capturing the joint angles of a humanoid (as ground truth) in a walking test. In order to evaluate the accuracy of the proposed method in capturing the kinematic parameters of a human, lower body joint angles of five healthy subjects were extracted using a Kinect, and the results were compared to Perception Neuron (Noitom Ltd., China) and OptiTrack data during ten gait trials. The absolute agreement and consistency between each optical system and the robot data in the robot test and between each motion capture system and OptiTrack data in the human gait test were determined using intraclass correlations coefficients (ICC3). The reproducibility between systems was evaluated using Lin’s concordance correlation coefficient (CCC). The correlation coefficients with 95% confidence intervals (95%CI) were interpreted substantial for both OptiTrack and proposed method (ICC > 0.75 and CCC > 0.95) in humanoid test. The results of the human gait experiments demonstrated the advantage of the proposed method (ICC > 0.75 and RMSE = 1.1460°) over the Kinect skeleton model (ICC < 0.4 and RMSE = 6.5843°).

Download Full-text

A Virtual Human's Driving Method Based on Motion Capture Data

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.711.500 ◽

2013 ◽

Vol 711 ◽

pp. 500-505 ◽

Cited By ~ 2

Author(s):

Song Shan Wang ◽

Yan Qing Qi

Keyword(s):

Motion Capture ◽

Euler Angle ◽

Human Motion ◽

Virtual Human ◽

Motion Capture Data ◽

Joint Angles ◽

Skeleton Model ◽

Angle Rotation ◽

Set Up

This article gives a method about driving a virtual human by motion capture data in software Jack. Firstly simplify Jack's skeleton model according to the skeleton of capturing data BVH. Secondly set up an Euler angle rotation equation to mapping joint angles between BVH and Jack. Finally, program the method and give an example to show that it is available to improve Jacks human motion simulating by the human capturing data.

Download Full-text

Human Gait-labeling Uncertainty and a Hybrid Model for Gait Segmentation

10.36227/techrxiv.16924102.v1 ◽

2021 ◽

Author(s):

Jiaen Wu ◽

Henrik Maurenbrecher ◽

Alessandro Schaer ◽

Barna Becsek ◽

Chris Awai Easthope ◽

...

Keyword(s):

Gait Analysis ◽

Motion Capture ◽

Ground Truth ◽

Absolute Error ◽

Human Gait ◽

Heel Strike ◽

Analysis Model ◽

Flat Foot ◽

Dynamic Time ◽

Human Gait Analysis

<div><div><div><p>Motion capture systems are widely accepted as ground-truth for gait analysis and are used for the validation of other gait analysis systems.To date, their reliability and limitations in manual labeling of gait events have not been studied.</p><p><b>Objectives</b>: Evaluate human manual labeling uncertainty and introduce a new hybrid gait analysis model for long-term monitoring.</p><p><b>Methods</b>: Evaluate and estimate inter-labeler inconsistencies by computing the limits-of-agreement; develop a model based on dynamic time warping and convolutional neural network to identify a valid stride and eliminate non-stride data in walking inertial data collected by a wearable device; Gait events are detected within a valid stride region afterwards; This method makes the subsequent data computation more efficient and robust.</p><p><b>Results</b>: The limits of inter-labeler agreement for key</p><p>gait events of heel off, toe off, heel strike, and flat foot are 72 ms, 16 ms, 22 ms, and 80 ms, respectively; The hybrid model's classification accuracy for a stride and a non-stride are 95.16% and 84.48%, respectively; The mean absolute error for detected heel off, toe off, heel strike, and flat foot are 24 ms, 5 ms, 9 ms, and 13 ms, respectively.</p><p><b>Conclusions</b>: The results show the inherent label uncertainty and the limits of human gait labeling of motion capture data; The proposed hybrid-model's performance is comparable to that of human labelers and it is a valid model to reliably detect strides in human gait data.</p><p><b>Significance</b>: This work establishes the foundation for fully automated human gait analysis systems with performances comparable to human-labelers.</p></div></div></div>

Download Full-text

Movement Estimation Using Soft Sensors Based on Bi-LSTM and Two-Layer LSTM for Human Motion Capture

Sensors ◽

10.3390/s20061801 ◽

2020 ◽

Vol 20 (6) ◽

pp. 1801 ◽

Cited By ~ 3

Author(s):

Haitao Guo ◽

Yunsick Sung

Keyword(s):

Motion Capture ◽

Short Term Memory ◽

Arm Movements ◽

Ground Truth ◽

Human Movement ◽

Human Motion ◽

Soft Sensors ◽

Upper Arm ◽

Human Motion Capture ◽

Gesture Control

The importance of estimating human movement has increased in the field of human motion capture. HTC VIVE is a popular device that provides a convenient way of capturing human motions using several sensors. Recently, the motion of only users’ hands has been captured, thereby greatly reducing the range of motion captured. This paper proposes a framework to estimate single-arm orientations using soft sensors mainly by combining a Bi-long short-term memory (Bi-LSTM) and two-layer LSTM. Positions of the two hands are measured using an HTC VIVE set, and the orientations of a single arm, including its corresponding upper arm and forearm, are estimated using the proposed framework based on the estimated positions of the two hands. Given that the proposed framework is meant for a single arm, if orientations of two arms are required to be estimated, the estimations are performed twice. To obtain the ground truth of the orientations of single-arm movements, two Myo gesture-control sensory armbands are employed on the single arm: one for the upper arm and the other for the forearm. The proposed framework analyzed the contextual features of consecutive sensory arm movements, which provides an efficient way to improve the accuracy of arm movement estimation. In comparison with the ground truth, the proposed method estimated the arm movements using a dynamic time warping distance, which was the average of 73.90% less than that of a conventional Bayesian framework. The distinct feature of our proposed framework is that the number of sensors attached to end-users is reduced. Additionally, with the use of our framework, the arm orientations can be estimated with any soft sensor, and good accuracy of the estimations can be ensured. Another contribution is the suggestion of the combination of the Bi-LSTM and two-layer LSTM.

Download Full-text

The Use of IMU-based Human Motion Capture to Assess Kinematic Parameters of Specific Exercises Performed by 400 M Hurdlers

Proceedings of the 7th International Conference on Sport Sciences Research and Technology Support ◽

10.5220/0008363602090216 ◽

2019 ◽

Author(s):

Janusz Iskra ◽

Michał Pietrzak ◽

Krzysztof Przednowek

Keyword(s):

Motion Capture ◽

Human Motion ◽

Kinematic Parameters ◽

Human Motion Capture

Download Full-text

DeepBBWAE-Net: A CNN-RNN Based Deep SuperLearner For Estimating Lower Extremity Sagittal Plane Joint Kinematics Using Shoe-Mounted IMU Sensors In Daily Living

10.36227/techrxiv.15040653.v1 ◽

2021 ◽

Author(s):

Md Sanzid Bin Hossain ◽

Joseph Drantez ◽

Hwan Choi ◽

Zhishan Guo

Keyword(s):

Neural Networks ◽

Motion Capture ◽

Daily Living ◽

Sagittal Plane ◽

Human Motion ◽

Joint Kinematics ◽

Weighted Averaging ◽

Measurement Unit ◽

Accurate Estimation ◽

Joint Angles

<div>Measurement of human body movement is an essential step in biomechanical analysis. The current standard for human motion capture systems uses infrared cameras to track reflective markers placed on the subject. While these systems can accurately track joint kinematics, the analyses are spatially limited to the lab environment. Though Inertial Measurement Unit (IMU) can eliminate the spatial limitations of the motion capture system, those systems are impractical for use in daily living due to the need for many sensors, typically one per body segment. Due to the need for practical and accurate estimation of joint kinematics, this study implements a reduced number of IMU sensors and employs machine learning algorithm to map sensor data to joint angles. Our developed algorithm estimates hip, knee, and ankle angles in the sagittal plane using two shoe-mounted IMU sensors in different practical walking conditions: treadmill, level overground, stair, and slope conditions. Specifically, we proposed five deep learning networks that use combinations of Convolutional Neural Networks (CNN) and Gated Recurrent Unit (GRU) based Recurrent Neural Networks (RNN) as base learners for our framework. Using those five baseline models, we proposed a novel framework, DeepBBWAE-Net, that implements ensemble techniques such as bagging, boosting, and weighted averaging to improve kinematic predictions. DeepBBWAE-Net predicts joint kinematics for the three joint angles under all the walking conditions with a Root Mean Square Error (RMSE) 6.93-29.0% lower than base models individually. This is the first study that uses a reduced number of IMU sensors to estimate kinematics in multiple walking environments.</div>

Download Full-text

Research on Wearable Human Motion Capture and Virtual Control

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.686.121 ◽

2014 ◽

Vol 686 ◽

pp. 121-125

Author(s):

Fei Jiang ◽

Ying Jie Yu ◽

Da Wei Yan

Keyword(s):

Real Time ◽

Motion Capture ◽

Inertial Sensor ◽

Calibration Method ◽

Human Motion ◽

Skeleton Model ◽

Body Attitude ◽

Human Limb ◽

Coordinate Conversion ◽

Skeletal Model

This paper designed the posture initialization calibration method by the inertial sensor in human limb movement with any attitude toward. By initializing the target specific actions can be implemented to identify timing corresponding sensors and joint, and calculate the coordinate transformation relation of human skeletal coordinates corresponding to each inertial sensor's coordinate system and the 3D human skeleton model. Then through the coordinate conversion of inertial sensor attitude coordinates and depth first traversal calculation on human skeletal tree, real-time update of human motion body attitude data, driven simulation of human skeletal model by human motion, realize the real-time tracking of motion capture.

Download Full-text

Real-Time Human Motion Capture Driven by a Wireless Sensor Network

International Journal of Computer Games Technology ◽

10.1155/2015/695874 ◽

2015 ◽

Vol 2015 ◽

pp. 1-14 ◽

Cited By ~ 13

Author(s):

Peng-zhan Chen ◽

Jie Li ◽

Man Luo ◽

Nian-hua Zhu

Keyword(s):

Real Time ◽

Sensor Network ◽

Motion Capture ◽

Human Motion ◽

Human Model ◽

Human Skeleton ◽

Inertial Sensing ◽

Skeleton Model ◽

Human Motion Capture ◽

Motion Capture Systems

The motion of a real object model is reconstructed through measurements of the position, direction, and angle of moving objects in 3D space in a process called “motion capture.” With the development of inertial sensing technology, motion capture systems that are based on inertial sensing have become a research hot spot. However, the solution of motion attitude remains a challenge that restricts the rapid development of motion capture systems. In this study, a human motion capture system based on inertial sensors is developed, and the real-time movement of a human model controlled by real people’s movement is achieved. According to the features of the system of human motion capture and reappearance, a hierarchical modeling approach based on a 3D human body model is proposed. The method collects articular movement data on the basis of rigid body dynamics through a miniature sensor network, controls the human skeleton model, and reproduces human posture according to the features of human articular movement. Finally, the feasibility of the system is validated by testing of system properties via capture of continuous dynamic movement. Experiment results show that the scheme utilizes a real-time sensor network-driven human skeleton model to achieve the accurate reproduction of human motion state. The system also has good application value.

Download Full-text

EVALUATIONS OF PARTICLE FILTER BASED HUMAN MOTION VISUAL TRACKERS FOR HOME ENVIRONMENT SURVEILLANCE

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001409007661 ◽

2009 ◽

Vol 23 (07) ◽

pp. 1333-1368 ◽

Cited By ~ 1

Author(s):

MATHIAS FONTMARTY ◽

PATRICK DANÈS ◽

FRÉDÉRIC LERASLE

Keyword(s):

Particle Filter ◽

Motion Capture ◽

Home Environment ◽

Ground Truth ◽

Monte Carlo Sampling ◽

Human Motion ◽

Video Database ◽

Experimental Procedure ◽

Human Motion Capture ◽

Quasi Monte Carlo

This paper presents a thorough study of some particle filter (PF) strategies dedicated to human motion capture from a trinocular vision surveillance setup. An experimental procedure is used, based on a commercial motion capture ring to provide ground truth. Metrics are proposed to assess performances in terms of accuracy, robustness, but also estimator dispersion which is often neglected elsewhere. Relative performances are discussed through some quantitative and qualitative evaluations on a video database. PF strategies based on Quasi Monte Carlo sampling, a scheme which is surprisingly seldom exploited in the Vision community, provide an interesting way to explore. Future works are finally discussed.

Download Full-text

View Independent Human Gait Recognition Using Markerless 3D Human Motion Capture

Computer Vision and Graphics - Lecture Notes in Computer Science ◽

10.1007/978-3-642-33564-8_59 ◽

2012 ◽

pp. 491-500 ◽

Cited By ~ 9

Author(s):

Tomasz Krzeszowski ◽

Bogdan Kwolek ◽

Agnieszka Michalczuk ◽

Adam Świtoński ◽

Henryk Josiński

Keyword(s):

Motion Capture ◽

Gait Recognition ◽

Human Motion ◽

Human Gait ◽

Human Motion Capture

Download Full-text

UAV-Based RGB Imagery for Hokkaido Pumpkin (Cucurbita max.) Detection and Yield Estimation

Sensors ◽

10.3390/s21010118 ◽

2020 ◽

Vol 21 (1) ◽

pp. 118

Author(s):

Lucas Wittstruck ◽

Insa Kühling ◽

Dieter Trautz ◽

Maik Kohlbrecher ◽

Thomas Jarmer

Keyword(s):

Fruit Size ◽

Correlation Coefficients ◽

Image Data ◽

Ground Truth ◽

Cost Effective ◽

Fruit Weight ◽

Yield Estimation ◽

Food Retailers ◽

Agronomic Management ◽

Aerial Vehicle

Pumpkins are economically and nutritionally valuable vegetables with increasing popularity and acreage across Europe. Successful commercialization, however, require detailed pre-harvest information about number and weight of the fruits. To get a non-destructive and cost-effective yield estimation, we developed an image processing methodology for high-resolution RGB data from Unmanned aerial vehicle (UAV) and applied this on a Hokkaido pumpkin farmer’s field in North-western Germany. The methodology was implemented in the programming language Python and comprised several steps, including image pre-processing, pixel-based image classification, classification post-processing for single fruit detection, and fruit size and weight quantification. To derive the weight from two-dimensional imagery, we calculated elliptical spheroids from lengths of diameters and heights. The performance of this processes was evaluated by comparison with manually harvested ground-truth samples and cross-checked for misclassification from randomly selected test objects. Errors in classification and fruit geometry could be successfully reduced based on the described processing steps. Additionally, different lighting conditions, as well as shadows, in the image data could be compensated by the proposed methodology. The results revealed a satisfactory detection of 95% (error rate of 5%) from the field sample, as well as a reliable volume and weight estimation with Pearson’s correlation coefficients of 0.83 and 0.84, respectively, from the described ellipsoid approach. The yield was estimated with 1.51 kg m−2 corresponding to an average individual fruit weight of 1100 g and an average number of 1.37 pumpkins per m2. Moreover, spatial distribution of aggregated fruit densities and weights were calculated to assess in-field optimization potential for agronomic management as demonstrated between a shaded edge compared to the rest of the field. The proposed approach provides the Hokkaido producer useful information for more targeted pre-harvest marketing strategies, since most food retailers request homogeneous lots within prescribed size or weight classes.

Download Full-text