Multi-Scale Locality-Constrained Spatiotemporal Coding for Local Feature Based Human Action Recognition

2013 ◽  
Vol 2013 ◽  
pp. 1-11 ◽  
Author(s):  
Bin Wang ◽  
Yu Liu ◽  
Wei Wang ◽  
Wei Xu ◽  
Maojun Zhang

We propose a Multi-Scale Locality-Constrained Spatiotemporal Coding (MLSC) method to improve the traditional bag-of-features (BoF) algorithm, which ignores the spatiotemporal relationships among local features in video-based human action recognition. To model these relationships, MLSC incorporates the spatiotemporal position of each local feature into the feature coding process. It projects local features into a sub space-time volume (sub-STV) and encodes them with locality-constrained linear coding. A group of sub-STV features obtained from one video via MLSC and max-pooling is then used to classify that video. In the classification stage, Locality-Constrained Group Sparse Representation (LGSR) is adopted to exploit the intrinsic group information of these sub-STV features. Experimental results on the KTH, Weizmann, and UCF Sports datasets show that our method outperforms competing local spatiotemporal feature-based human action recognition methods.
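A minimal sketch of the coding and pooling steps the abstract refers to, assuming the standard analytic solution of locality-constrained linear coding; the function names, the k-nearest-atom basis size, and the beta regularizer are illustrative choices, not parameters taken from the paper:

```python
import numpy as np

def llc_encode(x, codebook, k=5, beta=1e-4):
    """Locality-constrained linear coding of one local descriptor.

    x        : (d,) local spatiotemporal feature descriptor
    codebook : (M, d) visual codebook (e.g. learned with k-means)
    k        : number of nearest codebook atoms used as the local basis
    beta     : small regularizer for numerical stability
    Returns a length-M code that is non-zero only on the k nearest atoms.
    """
    # select the k codebook atoms closest to the descriptor
    dists = np.linalg.norm(codebook - x, axis=1)
    idx = np.argsort(dists)[:k]
    B = codebook[idx]                      # (k, d) local basis

    # analytic LLC solution of min ||x - B^T c||^2  s.t.  sum(c) = 1
    z = B - x                              # shift basis to the descriptor
    C = z @ z.T + beta * np.eye(k)         # local covariance
    w = np.linalg.solve(C, np.ones(k))
    w /= w.sum()                           # enforce the sum-to-one constraint

    code = np.zeros(len(codebook))
    code[idx] = w
    return code

def sub_stv_feature(descriptors, codebook, k=5):
    """Max-pool the LLC codes of all local features falling in one sub-STV."""
    codes = np.stack([llc_encode(d, codebook, k) for d in descriptors])
    return codes.max(axis=0)
```

The resulting per-sub-STV vectors would then be grouped per video and passed to the LGSR classifier described in the abstract.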

2020 ◽  
Vol 2020 ◽  
pp. 1-18
Author(s):  
Chao Tang ◽  
Huosheng Hu ◽  
Wenjian Wang ◽  
Wei Li ◽  
Hua Peng ◽  
...  

The representation and selection of action features directly affect the performance of human action recognition methods. A single feature is often affected by human appearance, environment, camera settings, and other factors. To address the problem that existing multimodal feature fusion methods cannot effectively measure the contribution of different features, this paper proposes a human action recognition method based on RGB-D image features, which makes full use of the multimodal information provided by RGB-D sensors to extract effective human action features. Three kinds of human action features carrying different modal information are proposed: an RGB-HOG feature based on RGB image information, which has good geometric scale invariance; a D-STIP feature based on the depth image, which preserves the dynamic characteristics of human motion and has local invariance; and an S-JRPF feature based on skeleton information, which describes the spatial structure of motion well. In addition, multiple K-nearest neighbor classifiers with good generalization ability are combined for decision-level classification. The experimental results show that the algorithm achieves promising recognition results on the public G3D and CAD60 datasets.
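A minimal sketch of decision-level fusion across per-modality k-nearest-neighbor classifiers, as the abstract describes; the fused_knn_predict name, the equal default weights, and the use of scikit-learn are assumptions made for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fused_knn_predict(test_feats, train_sets, train_labels, k=5, weights=None):
    """Decision-level fusion of one k-NN classifier per modality.

    test_feats   : list of test feature vectors, one per modality
                   (e.g. RGB-HOG, D-STIP, S-JRPF)
    train_sets   : list of (N, d_i) training matrices, one per modality
    train_labels : (N,) action labels shared by all modalities
    weights      : optional per-modality contribution weights
    """
    if weights is None:
        weights = [1.0 / len(test_feats)] * len(test_feats)

    fused = 0.0
    for feats, X_train, w in zip(test_feats, train_sets, weights):
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, train_labels)
        proba = clf.predict_proba(np.atleast_2d(feats))[0]  # class posteriors
        fused = fused + w * proba                            # weighted-sum fusion

    return clf.classes_[np.argmax(fused)]
```

Weighted summation of the class posteriors is only one possible combination rule; majority voting over the per-modality predictions would be a comparable alternative.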


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1589
Author(s):  
Zeyuan Hu ◽  
Eung-Joo Lee

Traditional convolutional neural networks have achieved great success in human action recognition. However, it is challenging to establish effective associations between different human bone nodes to capture detailed information. In this paper, we propose a dual attention-guided multiscale dynamic aggregate graph convolutional neural network (DAG-GCN) for skeleton-based human action recognition. Our goal is to explore the best correlations and determine high-level semantic features. First, a multiscale dynamic aggregate GCN module is used to capture important semantic information and to establish dependence relationships between different bone nodes. Second, the higher-level semantic features are further refined, and their semantic relevance is emphasized, through a dual attention guidance module. In addition, the two modules exploit the hierarchical relationships among joints and the spatiotemporal correlations. Experiments show that DAG-GCN achieves good performance on the NTU-60-RGB+D and NTU-120-RGB+D datasets, with accuracies of 95.76% and 90.01% under the cross-view (X-View) and cross-subject (X-Sub) protocols, respectively, on the NTU-60 dataset.
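A minimal sketch of the kind of spatial graph convolution over skeleton joints that such a network builds on; this is a generic adjacency-normalized GCN layer in PyTorch, not the paper's DAG-GCN module, and the class name, learnable adjacency offset, and tensor layout are illustrative assumptions:

```python
import torch
import torch.nn as nn

class JointGraphConv(nn.Module):
    """One spatial graph-convolution step over skeleton joints.

    Expects x of shape (batch, channels, frames, joints) and a fixed
    (joints, joints) adjacency matrix A describing the bone graph.
    """
    def __init__(self, in_channels, out_channels, num_joints):
        super().__init__()
        self.theta = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        # learnable offset lets the layer adapt the fixed bone graph
        self.A_learn = nn.Parameter(torch.zeros(num_joints, num_joints))

    def forward(self, x, A):
        # add self-loops and the learned offset, then row-normalize
        A_hat = A + torch.eye(A.size(0), device=A.device) + self.A_learn
        deg = A_hat.sum(dim=1, keepdim=True).clamp(min=1e-6)
        A_norm = A_hat / deg

        x = self.theta(x)                        # per-joint feature transform
        # aggregate features from connected joints: (b,c,t,v) x (v,v)
        return torch.einsum('bctv,vw->bctw', x, A_norm)
```

A full model would stack several such layers interleaved with temporal convolutions and, as the abstract describes, add multiscale aggregation and dual attention over the refined semantic features.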


2013 ◽  
Vol 99 ◽  
pp. 144-153 ◽  
Author(s):  
Xiaoyu Deng ◽  
Xiao Liu ◽  
Mingli Song ◽  
Jun Cheng ◽  
Jiajun Bu ◽  
...  

Author(s):  
Paruchuri Yogesh ◽  
R. Dillibabu ◽  
Palaniappan P

Automated human action recognition has the potential to play an important role in public security. This project compares three practical, reliable, and generic systems for multiview video-based human action recognition: the nearest neighbor classifier, the Gaussian mixture model classifier, and the nearest mean classifier.
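A minimal sketch of such a comparison, assuming pre-extracted feature vectors and scikit-learn; the one-GMM-per-class formulation, the component count, and the compare_classifiers name are illustrative assumptions rather than details from the project:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, NearestCentroid
from sklearn.mixture import GaussianMixture

def compare_classifiers(X_train, y_train, X_test, y_test):
    """Compare nearest-neighbor, GMM, and nearest-mean classifiers
    on pre-extracted multiview action features."""
    results = {}

    # 1-nearest-neighbor classifier
    nn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
    results["nearest neighbor"] = nn.score(X_test, y_test)

    # nearest-mean (nearest class centroid) classifier
    nm = NearestCentroid().fit(X_train, y_train)
    results["nearest mean"] = nm.score(X_test, y_test)

    # one Gaussian mixture per action class, classify by maximum likelihood
    classes = np.unique(y_train)
    gmms = {c: GaussianMixture(n_components=3, covariance_type="diag",
                               random_state=0).fit(X_train[y_train == c])
            for c in classes}
    log_liks = np.stack([gmms[c].score_samples(X_test) for c in classes], axis=1)
    pred = classes[np.argmax(log_liks, axis=1)]
    results["GMM"] = float(np.mean(pred == y_test))

    return results
```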

