Spatio-Temporal Analysis for Human Action Detection and Recognition in Uncontrolled Environments

Understanding the semantic meaning of human actions captured in unconstrained environments has broad applications in fields ranging from patient monitoring and human-computer interaction to surveillance systems. However, while great progress has been made in automatic human action detection and recognition for videos captured in controlled environments, most existing approaches perform unsatisfactorily on videos captured under uncontrolled conditions (e.g., significant camera motion, background clutter, scale changes, and varying lighting). To address this issue, the authors propose a robust human action detection and recognition framework that works effectively on videos taken in controlled or uncontrolled environments. Specifically, the authors integrate the optical flow field and the Harris3D corner detector to generate a new spatio-temporal representation for each video sequence, from which a general Gaussian mixture model (GMM) is learned. The mean vectors of all Gaussian components in the learned GMM are concatenated to create the GMM supervector for video action recognition. They build a boosting classifier from a set of sparse representation classifiers and Hamming distance classifiers to improve the accuracy of action recognition. Experimental results on two widely used public data sets, KTH and UCF YouTube Action, show that the proposed framework outperforms other state-of-the-art approaches on both action detection and recognition.
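The supervector step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariances, plain EM fitted from a random initialization, and random vectors standing in for the optical-flow/Harris3D features. The names `fit_diag_gmm` and `gmm_supervector` are hypothetical. Only the last step, concatenating the component means, comes from the abstract.

```python
import numpy as np

def fit_diag_gmm(X, k, n_iter=50, seed=0):
    """Fit a k-component diagonal-covariance GMM to the rows of X with plain EM.

    Assumption: diagonal covariances and random initialization are chosen
    here for brevity; the abstract does not specify either.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    means = X[rng.choice(n, size=k, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        diff = X[:, None, :] - means[None, :, :]                 # shape (n, k, d)
        log_prob = -0.5 * (
            np.sum(diff ** 2 / variances[None, :, :], axis=2)
            + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
        )
        log_resp = np.log(weights)[None, :] + log_prob
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6
    return weights, means, variances

def gmm_supervector(means):
    """Concatenate the k mean vectors (each of length d) into one k*d vector."""
    return means.reshape(-1)

# Usage with stand-in data: 200 feature vectors of dimension 4 for one video.
features = np.random.default_rng(42).normal(size=(200, 4))
_, means, _ = fit_diag_gmm(features, k=3)
supervector = gmm_supervector(means)   # length 3 * 4 = 12
```

Supervectors from different videos are only comparable if the components appear in a consistent order across videos. This sketch does not address that.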

Human Action Detection and Recognition Using SIFT and SVM

Communications in Computer and Information Science - Cognitive Computing and Information Processing, 2018, pp. 475-491
DOI: 10.1007/978-981-10-9059-2_42
Cited by: ~2
Author(s): Praveen M. Dhulavvagol, Niranjan C. Kundur
Keyword(s): Human Action, Action Detection, Human Action Detection, Detection And Recognition

Human Action Recognition Based on Bag-of-Words

Iraqi Journal of Science, 2020, pp. 1202-1214
DOI: 10.24996/ijs.2020.61.5.27
Author(s): Riyadh Sahib Abdul Ameer, Mohammed Al-Taei
Keyword(s): Action Recognition, False Positive Rate, Human Action Recognition, True Positive Rate, Human Action, Misclassification Rate, Surveillance Systems, Bag Of Words, Camera Motion, Positive Rate

Human action recognition has gained popularity because of its wide applicability, for example in patient monitoring systems, surveillance systems, and a wide range of systems that involve interaction between people and electronic devices, including human-computer interfaces. The proposed method consists of sequential stages of object segmentation, feature extraction, action detection, and action recognition. Recognizing human actions effectively from different features of unconstrained videos is a challenging task because of camera motion, cluttered backgrounds, occlusions, the complexity of human movements, and variation in how distinct subjects perform the same action. The proposed method addresses these problems by fusing features to build a powerful human action descriptor. This descriptor is modified to create a visual word vocabulary (or codebook), which yields a Bag-of-Words representation. The True Positive Rate (TPR) and False Positive Rate (FPR) measures give a reliable indication of the performance of the proposed human action recognition (HAR) system. The computed Accuracy (Ar) and Error (misclassification) Rate (Er) show the effectiveness of the system on the dataset used.
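As a rough illustration of the two ideas named in this abstract, the sketch below encodes a set of descriptors as a Bag-of-Words histogram over a given codebook and computes TPR, FPR, accuracy, and error rate from confusion counts. The abstract does not say how the codebook is built or how descriptors are assigned to words, so hard nearest-codeword assignment is an assumption. The names `bag_of_words` and `rates` are hypothetical, and the metric formulas are the standard definitions.

```python
from collections import Counter

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))

def bag_of_words(descriptors, codebook):
    """Normalized histogram of visual-word occurrences (hard assignment)."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    total = len(descriptors)
    return [counts[i] / total if total else 0.0 for i in range(len(codebook))]

def rates(tp, fp, tn, fn):
    """Standard TPR, FPR, accuracy, and error rate from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "TPR": tp / (tp + fn),       # fraction of actual positives detected
        "FPR": fp / (fp + tn),       # fraction of actual negatives flagged
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
    }

# Usage with toy values: a 2-word codebook and four 2-D descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0], [11.0, 9.0]]
histogram = bag_of_words(descriptors, codebook)   # [0.5, 0.5]
```

The histogram has one bin per codeword regardless of how many descriptors a video produces, which is what lets videos of different lengths be compared with a fixed-length vector.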

TS-ICNN: Time Sequence-Based Interval Convolutional Neural Networks for Human Action Detection and Recognition

IEICE Transactions on Information and Systems, 2018, Vol E101.D (10), pp. 2534-2538
DOI: 10.1587/transinf.2018edl8046
Cited by: ~2
Author(s): Zhendong ZHUANG, Yang XUE
Keyword(s): Neural Networks, Convolutional Neural Networks, Human Action, Time Sequence, Action Detection, Human Action Detection, Detection And Recognition

Visual Feature Learning on Video Object and Human Action Detection: A Systematic Review

Micromachines, 2021, Vol 13 (1), pp. 72
DOI: 10.3390/mi13010072
Author(s): Dengshan Li, Rujing Wang, Peng Chen, Chengjun Xie, Qiong Zhou, ...
Keyword(s): Object Detection, Action Recognition, Human Action Recognition, Human Action, Detection Methods, Video Object, Action Detection, Video Frames, Video Detection, Human Action Detection

Video object and human action detection are applied in many fields, such as video surveillance and face recognition. Video object detection includes object classification and object localization within the frame. Human action recognition is the detection of human actions. Video detection is usually more challenging than image detection, since video frames are often blurrier than still images. Video detection also faces other difficulties, such as video defocus, motion blur, and partial occlusion. Current video detection technology can achieve real-time detection or highly accurate detection on blurry video frames. In this paper, various video object and human action detection approaches are reviewed and discussed, many of which have achieved state-of-the-art results. We mainly review and discuss the classic video detection methods based on supervised learning. In addition, the frequently used video object detection and human action recognition datasets are reviewed. Finally, a summary of video detection is presented: the video object and human action detection methods can be classified into frame-by-frame (frame-based) detection, key-frame-extraction detection, and detection using temporal information. The methods that use temporal information from adjacent video frames are mainly the optical flow method, Long Short-Term Memory, and convolution among adjacent frames.
