Skeleton‐based attention‐aware spatial–temporal model for action detection and recognition

2020, Vol 14 (5), pp. 177-184
Author(s): Ran Cui, Aichun Zhu, Jingran Wu, Gang Hua
Author(s): Dianting Liu, Yilin Yan, Mei-Ling Shyu, Guiru Zhao, Min Chen

Understanding the semantic meaning of human actions captured in unconstrained environments has broad applications in fields ranging from patient monitoring and human-computer interaction to surveillance systems. However, while great progress has been made on automatic human action detection and recognition in videos captured in controlled or constrained environments, most existing approaches perform unsatisfactorily on videos captured under uncontrolled or unconstrained conditions (e.g., significant camera motion, background clutter, scale changes, and lighting conditions). To address this issue, the authors propose a robust human action detection and recognition framework that works effectively on videos taken in both controlled and uncontrolled environments. Specifically, the authors integrate the optical flow field and the Harris3D corner detector to generate a new spatial-temporal information representation for each video sequence, from which a general Gaussian mixture model (GMM) is learned. The mean vectors of all Gaussian components in the resulting GMM are concatenated to create the GMM supervector for video action recognition. They then build a boosting classifier based on a set of sparse representation classifiers and Hamming distance classifiers to improve the accuracy of action recognition. Experimental results on two widely used public data sets, KTH and UCF YouTube Action, show that the proposed framework outperforms other state-of-the-art approaches on both action detection and recognition.
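
The supervector step described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a diagonal-covariance GMM fitted by plain EM, a simple evenly spaced initialisation, and a lexicographic sort of the component means to fix their order. The helper names `fit_diag_gmm` and `gmm_supervector` are hypothetical, and the 2-D synthetic points only stand in for the optical-flow and Harris3D features named in the abstract.

```python
import math
import random


def fit_diag_gmm(points, k, iters=50):
    """Fit a diagonal-covariance GMM with plain EM (illustrative assumption)."""
    n, dim = len(points), len(points[0])
    # Assumed initialisation: take k evenly spaced samples as starting means.
    means = [list(points[i * n // k]) for i in range(k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in points:
            logp = []
            for j in range(k):
                lp = math.log(weights[j])
                for d in range(dim):
                    v = variances[j][d]
                    lp -= 0.5 * (math.log(2 * math.pi * v)
                                 + (x[d] - means[j][d]) ** 2 / v)
                logp.append(lp)
            top = max(logp)  # subtract the max for numerical stability
            probs = [math.exp(l - top) for l in logp]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            for d in range(dim):
                mu = sum(r[j] * x[d] for r, x in zip(resp, points)) / nj
                var = sum(r[j] * (x[d] - mu) ** 2
                          for r, x in zip(resp, points)) / nj
                means[j][d] = mu
                variances[j][d] = max(var, 1e-6)  # variance floor
    return weights, means, variances


def gmm_supervector(means):
    """Concatenate the component means into one vector of length k * dim."""
    # Sorting the means is an assumption made here to get a stable order.
    return [value for mean in sorted(means) for value in mean]


# Synthetic stand-in features: two well-separated 2-D clusters.
rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
features += [[rng.gauss(10.0, 1.0), rng.gauss(10.0, 1.0)] for _ in range(50)]

_, component_means, _ = fit_diag_gmm(features, k=2)
supervector = gmm_supervector(component_means)
```

With two components over 2-D features, `supervector` has four entries: the two coordinates of the first mean followed by those of the second. One such vector per video could then be passed to a classifier. The boosting ensemble of sparse representation and Hamming distance classifiers is not sketched here.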

