A Multi-Scale Hierarchical Codebook Method for Human Action Recognition in Videos Using a Single Example

Author(s): Mehrsan Javan Roshtkhari, Martin D. Levine
2013, Vol 2013, pp. 1-11
Author(s): Bin Wang, Yu Liu, Wei Wang, Wei Xu, Maojun Zhang

We propose a Multiscale Locality-Constrained Spatiotemporal Coding (MLSC) method to improve the traditional bag of features (BoF) algorithm, which ignores the spatiotemporal relationship of local features in human action recognition from video. To model this relationship, MLSC incorporates the spatiotemporal position of each local feature into the feature coding process. It projects local features into a sub space-time-volume (sub-STV) and encodes them with locality-constrained linear coding. A group of sub-STV features, obtained from one video with MLSC and max-pooling, is used to classify that video. In the classification stage, Locality-Constrained Group Sparse Representation (LGSR) is adopted to exploit the intrinsic group information of these sub-STV features. Experimental results on the KTH, Weizmann, and UCF Sports datasets show that our method achieves better performance than competing human action recognition methods based on local spatiotemporal features.
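The coding and pooling steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the commonly cited approximated form of locality-constrained linear coding (k nearest codewords, sum-to-one weights), and it models the use of spatiotemporal position simply by appending a weighted (x, y, t) vector to each descriptor. The function names `llc_encode` and `encode_sub_stv` and the parameters `k`, `reg`, and `pos_weight` are hypothetical, and the codebook is assumed to have been learned over the same position-augmented features.

```python
import numpy as np


def llc_encode(x, codebook, k=5, reg=1e-4):
    """Approximate locality-constrained linear code for one feature x.

    Assumption: only the k nearest codewords get nonzero weights, and the
    weights are constrained to sum to one.
    """
    # Locality: keep the k codewords closest to x.
    dists = np.linalg.norm(codebook - x, axis=1)
    idx = np.argsort(dists)[:k]

    # Solve the small constrained least-squares problem on those neighbors.
    z = codebook[idx] - x                      # neighbors centered on x
    cov = z @ z.T                              # k x k local covariance
    cov += reg * max(np.trace(cov), 1e-12) * np.eye(k)  # keep it well conditioned
    w = np.linalg.solve(cov, np.ones(k))
    w /= w.sum()                               # enforce sum-to-one

    code = np.zeros(len(codebook))
    code[idx] = w
    return code


def encode_sub_stv(descriptors, positions, codebook, pos_weight=1.0, k=5):
    """Encode all local features of one sub-STV and max-pool their codes.

    descriptors: (n, d) appearance/motion descriptors (hypothetical input)
    positions:   (n, 3) normalized (x, y, t) locations inside the sub-STV
    codebook:    (m, d + 3) codewords over position-augmented features
    """
    # Assumed way of injecting position: concatenate it to the descriptor.
    feats = np.hstack([descriptors, pos_weight * positions])
    codes = np.array([llc_encode(f, codebook, k=k) for f in feats])
    return codes.max(axis=0)                   # max-pooling over features


rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 13))           # 10-D descriptor + 3-D position
descriptors = rng.normal(size=(20, 10))
positions = rng.uniform(size=(20, 3))
pooled = encode_sub_stv(descriptors, positions, codebook, pos_weight=0.5, k=5)
```

Here `pooled` is one vector per sub-STV; a video would yield a group of such vectors, which is the input the abstract describes for the group sparse classifier. Larger values of `pos_weight` make codeword selection depend more strongly on where and when a feature occurs.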


Author(s): M. N. Al-Berry, Mohammed A.-M. Salem, H. M. Ebeid, A. S. Hussein, Mohamed F. Tolba

Human action recognition is a very active field in computer vision. Many important applications depend on accurate human action recognition, which in turn relies on an accurate representation of the actions. These applications include surveillance, athletic performance analysis, driver assistance, robotics, and human-centered computing. This chapter presents a thorough review of the field, concentrating on recent action representation methods that use spatio-temporal information. In addition, the authors propose a stationary wavelet-based representation of natural human actions in realistic videos. The proposed representation uses the 3D Stationary Wavelet Transform to encode the directional, multi-scale spatio-temporal characteristics of the motion available in a frame sequence. It was tested on the Weizmann and KTH datasets and produced good preliminary results with reasonable computational complexity compared to existing state-of-the-art methods.
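To make the idea of a 3D stationary (undecimated) wavelet decomposition of a frame sequence concrete, the sketch below computes a single level with Haar filters along time, height, and width. It is an illustration under stated assumptions, not the chapter's method: the function name `haar_swt3d_level1` is hypothetical, boundaries are treated as periodic, and the filters are normalized by 1/2 (rather than 1/sqrt(2)) so that the eight subbands sum back to the input.

```python
import numpy as np


def haar_swt3d_level1(volume):
    """Single-level undecimated Haar decomposition of a (t, y, x) volume.

    Returns a dict of 8 subbands keyed by three letters, one per axis in
    (t, y, x) order: 'a' = low-pass (average), 'd' = high-pass (difference).
    No downsampling is applied, so every subband keeps the input shape.
    """
    bands = {"": np.asarray(volume, dtype=float)}
    for axis in range(3):
        split = {}
        for name, data in bands.items():
            # Periodic neighbor one step ahead along the current axis.
            shifted = np.roll(data, -1, axis=axis)
            split[name + "a"] = (data + shifted) / 2.0
            split[name + "d"] = (data - shifted) / 2.0
        bands = split
    return bands


rng = np.random.default_rng(0)
frame = rng.uniform(size=(16, 16))
static_clip = np.repeat(frame[None, :, :], 8, axis=0)   # 8 identical frames
subbands = haar_swt3d_level1(static_clip)
```

In this example the clip contains no motion, so every subband whose first letter is `d` (temporal detail) is zero, while the spatial detail bands such as `add` still respond to edges within the frame. Subbands with a temporal `d` component are the ones that carry motion information in a real sequence.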

