An improved interest point detector for human action recognition

Author(s):  
Songtao Ding ◽  
Shiru Qu
2014 ◽  
Vol 11 (01) ◽  
pp. 1450005
Author(s):  
Yangyang Wang ◽  
Yibo Li ◽  
Xiaofei Ji

Vision-based human action recognition is currently one of the most active research topics in computer vision. Feature representation has a direct and crucial impact on recognition performance. Bag-of-words feature representations are popular in current research, but they usually discard the spatial and temporal relationships among features. To address this issue, a novel feature representation based on normalized interest points, called the super-interest point, is proposed and used to recognize human actions. Its novelty is that, by clustering normalized points, the spatial-temporal correlation between the interest points and the human body can be added directly to the representation without having to account for variance in the scale and location of the points. The approach involves three tasks. First, to handle the diversity of human location and scale, interest points are normalized based on the normalization of the human region. Second, to capture the spatial-temporal correlation among the interest points, normalized points that are close in space and time are grouped into a super-interest point using a three-dimensional clustering algorithm. Finally, a new feature representation is obtained by describing the appearance of the super-interest points and the location relationships among them. The proposed representation thus relates local features to the human figure. Experiments on the Weizmann, KTH, and UCF sports datasets demonstrate that the proposed feature is effective for human action recognition.
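
The first two tasks in the abstract above (normalizing interest points against the human region, then grouping nearby points in space and time) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, so a simple greedy distance-threshold clustering stands in for it, and the function names, the per-frame bounding-box format `(x0, y0, width, height)`, and the `radius` parameter are all assumptions made for illustration.

```python
from math import dist


def normalize_points(points, boxes, num_frames):
    """Map (x, y, t) interest points into the unit cube.

    Assumption: boxes[t] = (x0, y0, width, height) is the human
    bounding box in frame t; t is a frame index in [0, num_frames).
    """
    normalized = []
    for x, y, t in points:
        x0, y0, w, h = boxes[t]
        # Position relative to the human region removes location and scale.
        normalized.append(((x - x0) / w, (y - y0) / h, t / num_frames))
    return normalized


def cluster_points(points, radius):
    """Greedy 3D clustering (a stand-in, not the paper's algorithm).

    A point joins the first cluster whose centroid lies within
    `radius`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster is a list of (x, y, t) points
    for p in points:
        for members in clusters:
            centroid = tuple(sum(c) / len(members) for c in zip(*members))
            if dist(p, centroid) <= radius:
                members.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def super_interest_points(clusters):
    """Summarize each cluster by its centroid in normalized space-time."""
    return [tuple(sum(c) / len(m) for c in zip(*m)) for m in clusters]
```

With this normalization, a point at the center of a large bounding box and a point at the center of a small one map to the same normalized coordinates, which is the property that lets points from differently sized or positioned people be clustered together. Describing the appearance of each cluster and the relationships among clusters (the third task) is not covered by this sketch.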


2013 ◽  
Vol 18 (2-3) ◽  
pp. 49-60
Author(s):  
Damian Dudziński ◽  
Tomasz Kryjak ◽  
Zbigniew Mikrut

This paper describes a human action recognition algorithm that uses background generation with shadow elimination, silhouette description based on simple geometrical features, and a finite state machine for recognizing particular actions. The tests performed indicate that this approach achieves an 81% correct recognition rate while allowing real-time processing of a 360 × 288 video stream.

