Action Recognition by Joint Spatial-Temporal Motion Feature

2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Weihua Zhang ◽  
Yi Zhang ◽  
Chaobang Gao ◽  
Jiliu Zhou

This paper introduces a method for human action recognition based on optical flow motion feature extraction. Automatic spatial and temporal alignments are combined to enforce temporal consistency within each action by an enhanced dynamic time warping (DTW) algorithm. In addition, a fast method based on a coarse-to-fine DTW constraint is introduced to improve computational performance without reducing accuracy. The main contributions of this study are (1) a joint spatial-temporal multiresolution optical flow computation method that encodes more informative motion information than recently proposed methods, (2) an enhanced DTW method that improves the temporal consistency of motion in action recognition, and (3) a coarse-to-fine DTW constraint on motion feature pyramids that speeds up recognition. Using this method, high recognition accuracy is achieved on different action databases such as the Weizmann and KTH databases.
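
The paper itself gives no pseudocode; as a rough illustration of the banded-DTW idea, the sketch below compares two sequences of per-frame motion descriptors under a Sakoe-Chiba-style band, a simplification standing in for the paper's coarse-to-fine constraint. All names and the Euclidean frame distance are assumptions, not taken from the paper.

```python
import numpy as np

def dtw_banded(x, y, band=8):
    """DTW distance between two descriptor sequences x: (n, d), y: (m, d).

    `band` limits how far the warping path may stray from the diagonal
    (a Sakoe-Chiba-style constraint, standing in for the paper's
    coarse-to-fine DTW constraint).
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        c = int(round(i * m / n))  # band centre, scaled to y's length
        for j in range(max(1, c - band), min(m, c + band) + 1):
            cost = np.linalg.norm(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy usage: two "motion energy" sequences of different lengths.
a = np.sin(np.linspace(0, 3, 40))[:, None]
b = np.sin(np.linspace(0, 3, 55))[:, None]
print(dtw_banded(a, b))
```

Restricting the search to a band around the diagonal is what buys the speedup: the cost matrix shrinks from O(nm) cells to O(n·band) without changing the result when the true alignment stays near the diagonal.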

Drones ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 87
Author(s):  
Ketan Kotecha ◽  
Deepak Garg ◽  
Balmukund Mishra ◽  
Pratik Narang ◽  
Vipul Kumar Mishra

Visual data collected from drones has opened a new direction for surveillance applications and has recently attracted considerable attention among computer vision researchers. Given the availability and increasing use of drones in both the public and private sectors, they are a critical emerging technology for solving surveillance problems in remote areas. One of the fundamental challenges in recognizing human actions in crowd-monitoring videos is the precise modeling of an individual's motion features. Most state-of-the-art methods rely heavily on optical flow for motion modeling and representation, and motion modeling through optical flow is a time-consuming process. This article addresses this issue and provides a novel architecture that eliminates the dependency on optical flow. The proposed architecture uses two sub-modules, FMFM (faster motion feature modeling) and AAR (accurate action recognition), to accurately classify aerial surveillance actions. Another critical issue in aerial surveillance is the scarcity of datasets. Of the few datasets proposed recently, most contain multiple humans performing different actions in the same scene, such as a crowd-monitoring video, and hence are not directly suitable for training action recognition models. Given this, we propose a novel dataset captured from a top-view aerial perspective with good variety in terms of actors, time of day, and environment. The proposed architecture can be applied in different terrains, as it removes the background before applying the action recognition model. It is validated through experiments at varying levels of investigation and achieves a remarkable 0.90 validation accuracy in aerial action recognition.
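
The abstract does not specify how FMFM avoids optical flow; one cheap, flow-free stand-in is stacked frame differencing, sketched below under that assumption. The function and parameter names are illustrative, not the paper's.

```python
import cv2
import numpy as np

def motion_stack(frames, size=(112, 112)):
    """Cheap motion representation without optical flow: stacked
    absolute differences of consecutive grayscale frames (a stand-in
    for the paper's FMFM module, whose exact design is not given).

    frames: list of BGR images; returns a (T-1, H, W) float tensor
    that a downstream classifier (the AAR role) could consume.
    """
    gray = [cv2.cvtColor(cv2.resize(f, size), cv2.COLOR_BGR2GRAY)
            for f in frames]
    diffs = [cv2.absdiff(gray[i + 1], gray[i]) for i in range(len(gray) - 1)]
    return np.stack(diffs, axis=0).astype(np.float32) / 255.0
```

Frame differencing costs one subtraction per pixel, versus the iterative per-pixel optimization of dense optical flow, which is the kind of trade-off the abstract's speed argument rests on.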


2013 ◽  
Vol 373-375 ◽  
pp. 1188-1191
Author(s):  
Ju Zhong ◽  
Hua Wen Liu ◽  
Chun Li Lin

The extraction methods of both the shape feature based on Fourier descriptors and the motion feature in the time domain were introduced. These features were fused to obtain a hybrid feature with higher discriminative ability. This combined representation was used for human action recognition. The experimental results show that the proposed hybrid feature achieves effective recognition performance on the Weizmann action database.
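
As a concrete illustration of the shape half of this hybrid, the sketch below computes classical Fourier descriptors from a closed silhouette contour (e.g., one returned by OpenCV's findContours). The normalization choices are the standard ones and are assumed here rather than taken from the paper.

```python
import numpy as np

def fourier_descriptors(contour, k=16):
    """Translation/scale/rotation-invariant shape descriptor from a
    closed contour given as an (n, 2) array of (x, y) boundary points."""
    z = contour[:, 0] + 1j * contour[:, 1]  # boundary as a complex signal
    F = np.fft.fft(z)
    F[0] = 0.0                   # drop the DC term -> translation invariance
    mag = np.abs(F)              # magnitudes discard phase -> rotation and
                                 # start-point invariance
    mag /= mag[1]                # normalise by |F[1]| -> scale invariance
    return mag[1:k + 1]          # low-order coefficients capture coarse shape
```

Keeping only the first k coefficients acts as a low-pass filter on the boundary, which is what gives Fourier descriptors their robustness to small contour noise.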


2010 ◽  
Vol 22 (3) ◽  
pp. 413-426 ◽  
Author(s):  
Andrea Serino ◽  
Laura De Filippo ◽  
Chiara Casavecchia ◽  
Michela Coccia ◽  
Maggie Shiffrar ◽  
...  

Several studies have shown that the motor system is involved in action perception, suggesting that action concepts are represented through sensory–motor processes. Such conclusions imply that motor system impairments should diminish action perception. To test this hypothesis, a group of 10 brain-damaged patients with hemiplegia (specifically, a lesion to the motor system that affected the contralesional arm) viewed point-light displays of arm gestures and attempted to name each gesture. To create the dynamic stimuli, patients individually performed simple gestures with their unaffected arm while being videotaped. The videotapes were converted into point-light animations. Each action was presented as it had been performed, that is, as having been produced by the observer's unaffected arm, and in its mirror-reversed orientation, that is, as having been produced by the observer's hemiplegic arm. Action recognition accuracy by patients with hemiplegia was compared with that by 8 brain-damaged patients without any motor deficit and by 10 healthy controls. Overall, performance was better in control observers than in patients. Most importantly, performance by hemiplegic patients, but not by nonhemiplegic patients and controls, varied systematically as a function of the observed limb. Action recognition was best when hemiplegic patients viewed actions that appeared to have been performed by their unaffected arm. Action recognition performance dropped significantly when hemiplegic patients viewed actions that appeared to have been produced with their hemiplegic arm or the corresponding arm of another person. The results of a control study involving the recognition of point-light-defined animals in motion indicate that a generic deficit to visual and cognitive functions cannot account for this laterality-specific deficit in action recognition. Taken together, these results suggest that motor cortex impairment decreases visual sensitivity to human action. Specifically, when a cortical lesion renders an observer incapable of performing an observed action, action perception is compromised, possibly by a failure to map the observed action onto the observer's contralesional hemisoma.


Author(s):  
Mohammad Farhad Bulbul ◽  
Yunsheng Jiang ◽  
Jinwen Ma

Emerging cost-effective depth sensors have facilitated the action recognition task significantly. In this paper, the authors address the action recognition problem using depth video sequences, combining three discriminative features. More specifically, the authors generate three Depth Motion Maps (DMMs) over the entire video sequence corresponding to the front, side, and top projection views. Contourlet-based Histogram of Oriented Gradients (CT-HOG), Local Binary Patterns (LBP), and Edge Oriented Histograms (EOH) are then computed from the DMMs. To merge these features, the authors consider decision-level fusion, where a soft decision-fusion rule, the Logarithmic Opinion Pool (LOGP), combines the classification outcomes from multiple classifiers, each trained on an individual set of features. Experimental results on two datasets reveal that the fusion scheme achieves superior action recognition performance compared with using each feature individually.
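
A minimal sketch of two of these ingredients, assuming depth frames are available as arrays: a front-view Depth Motion Map accumulated from frame differences, and LOGP fusion as a weighted geometric mean of per-classifier posteriors. The uniform weighting is an illustrative choice, not necessarily the paper's.

```python
import numpy as np

def depth_motion_map(depth_frames):
    """DMM for one projection view: accumulated absolute differences of
    consecutive projected depth frames. The front view is shown; side
    and top DMMs come from projecting depth onto the other two axes.

    depth_frames: (T, H, W) array of depth images.
    """
    d = np.asarray(depth_frames, dtype=np.float32)
    return np.abs(np.diff(d, axis=0)).sum(axis=0)

def logp_fusion(posteriors, weights=None):
    """Logarithmic Opinion Pool: fuse per-classifier class posteriors
    (a list of (C,) probability vectors) by a weighted geometric mean,
    i.e. a weighted sum in log space, then renormalise."""
    P = np.stack(posteriors)                                   # (K, C)
    w = np.ones(len(P)) / len(P) if weights is None else np.asarray(weights)
    logp = (w[:, None] * np.log(P + 1e-12)).sum(axis=0)
    fused = np.exp(logp)
    return fused / fused.sum()
```

Because LOGP multiplies posteriors rather than averaging them, a class must be supported by all classifiers to score highly, which tends to sharpen the fused decision relative to a linear opinion pool.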


2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Shaoping Zhu ◽  
Limin Xia

A novel method based on a hybrid feature is proposed for human action recognition in video image sequences, comprising two stages: feature extraction and action recognition. First, an adaptive background subtraction algorithm extracts a global silhouette feature and an optical flow model extracts a local optical flow feature. The global silhouette feature vector and the local optical flow feature vector are then combined into a hybrid feature vector. Second, to improve recognition accuracy, an optimized Multiple Instance Learning algorithm is used to recognize human actions, in which an Iterative Querying Heuristic (IQH) optimization algorithm trains the Multiple Instance Learning model. We demonstrate that our hybrid-feature-based action representation can effectively classify novel actions on two different data sets. Experiments show that our results are comparable to, and in some cases significantly better than, those of two state-of-the-art approaches on these data sets, meeting requirements for stability, reliability, high precision, and robustness to interference.
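
A rough sketch of the feature-extraction stage using standard OpenCV building blocks (MOG2 background subtraction and Farnebäck optical flow). The specific descriptors, a coarse silhouette grid plus a flow-orientation histogram, are assumptions for illustration rather than the paper's exact design.

```python
import cv2
import numpy as np

bg = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

def hybrid_feature(prev_gray, gray, frame):
    """Concatenate a global silhouette feature with a local optical
    flow feature for one frame (illustrative descriptor choices).

    prev_gray, gray: consecutive grayscale frames; frame: current BGR frame.
    """
    mask = bg.apply(frame)                              # silhouette mask
    sil = cv2.resize(mask, (16, 16)).flatten() / 255.0  # coarse global grid
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hist, _ = np.histogram(ang, bins=8, range=(0, 2 * np.pi), weights=mag)
    return np.concatenate([sil, hist / (hist.sum() + 1e-6)])
```

The silhouette grid captures the global body configuration while the magnitude-weighted orientation histogram captures local motion, which is exactly the complementarity the hybrid feature is meant to exploit.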


2020 ◽  
Vol 34 (07) ◽  
pp. 12886-12893
Author(s):  
Xiao-Yu Zhang ◽  
Haichao Shi ◽  
Changsheng Li ◽  
Peng Li

Weakly supervised action recognition and localization for untrimmed videos is a challenging problem with extensive applications. The overwhelming irrelevant background content in untrimmed videos severely hampers effective identification of the actions of interest. In this paper, we propose a novel multi-instance multi-label modeling network based on spatio-temporal pre-trimming to recognize actions and locate the corresponding frames in untrimmed videos. Motivated by the fact that the person is the key factor in a human action, we spatially and temporally segment each untrimmed video into person-centric clips using pose estimation and tracking techniques. Given the bag-of-instances structure associated with video-level labels, action recognition is naturally formulated as a multi-instance multi-label learning problem. The network is optimized iteratively with selective coarse-to-fine pre-trimming based on instance-label activation. After convergence, temporal localization is further achieved with a local-global temporal class activation map. Extensive experiments are conducted on two benchmark datasets, i.e., THUMOS14 and ActivityNet1.3, and the experimental results clearly corroborate the efficacy of our method compared with state-of-the-art approaches.
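
The core multi-instance multi-label idea can be sketched in a few lines: per-clip class scores are pooled into a video-level prediction and trained against video-level labels only. The max-pooling and binary cross-entropy below are common choices assumed for illustration; the paper's network adds person-centric pre-trimming and iterative refinement on top of this formulation.

```python
import numpy as np

def video_level_loss(instance_logits, video_labels):
    """Multi-instance multi-label aggregation.

    instance_logits: (N, C) raw scores for N person-centric clips and
    C action classes; video_labels: (C,) binary video-level labels.
    A class is treated as present if any clip exhibits it (max pooling),
    and the pooled prediction is scored with binary cross-entropy.
    """
    probs = 1.0 / (1.0 + np.exp(-instance_logits))  # per-clip sigmoid
    video_probs = probs.max(axis=0)                 # pool instances per class
    y = np.asarray(video_labels, dtype=np.float32)
    eps = 1e-7
    return -(y * np.log(video_probs + eps)
             + (1 - y) * np.log(1 - video_probs + eps)).mean()
```

Because gradients flow only through the maximally activated clips, training implicitly selects which instances explain each video-level label, which is what later enables frame-level localization from video-level supervision.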


2014 ◽  
Vol 989-994 ◽  
pp. 2731-2734
Author(s):  
Hai Long Jia ◽  
Kun Cao

The choice of motion features directly affects the result of a human action recognition method. A single feature is often influenced differently by many factors, such as the appearance of the human body, the environment, and the video camera, so the accuracy of action recognition is limited. Based on a study of the representation and recognition of human actions, and giving full consideration to the advantages and disadvantages of different features, this paper proposes a mixed feature that combines a global silhouette feature and a local optical flow feature. This combined representation is used for human action recognition.
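
As a sketch of what such a mixed feature might look like, the snippet below concatenates a global shape cue (Hu moments of the binary silhouette) with a local motion cue (a flow-orientation histogram restricted to the silhouette region). Both descriptor choices are assumptions for illustration, not the paper's specification.

```python
import cv2
import numpy as np

def mixed_feature(silhouette, flow):
    """Mix a global shape cue with a local motion cue.

    silhouette: (H, W) binary mask; flow: (H, W, 2) dense optical flow
    for the same frame.
    """
    hu = cv2.HuMoments(cv2.moments(silhouette)).flatten()
    hu = np.sign(hu) * np.log10(np.abs(hu) + 1e-30)  # compress dynamic range
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    m = silhouette > 0                               # only motion on the body
    hist, _ = np.histogram(ang[m], bins=8, range=(0, 2 * np.pi),
                           weights=mag[m])
    return np.concatenate([hu, hist / (hist.sum() + 1e-6)])
```

Masking the flow histogram with the silhouette is one simple way to reduce the camera and background influence the abstract identifies as the weakness of any single feature.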

