A Co-Training Based Semi-Supervised Human Action Recognition Algorithm

2011 ◽  
Vol 267 ◽  
pp. 1065-1070 ◽  
Author(s):  
He Jin Yuan ◽  
Cui Ru Wang ◽  
Jun Liu

A novel semi-supervised algorithm based on co-training is proposed in this paper. In the method, the motion energy image and the motion history image are used as two different feature representations of human action; the co-training based semi-supervised learning algorithm is then utilized to predict the categories of unlabeled training examples, and the average motion energy and motion history images are calculated as the recognition templates for each action category. During recognition, an observed action is first classified through its correlation coefficients to the previously established templates; its final category is then determined according to the consistency between the classification results of the motion energy and motion history images. Experiments on the Weizmann dataset demonstrate that our method is effective for human action recognition.
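As a rough sketch of the template-matching step, the motion energy image (MEI, the binary union of where motion occurred) and the motion history image (MHI, a recency-weighted version) can be computed from thresholded frame differences and compared to templates by correlation. The threshold and decay constant below are illustrative defaults, not values from the paper:

```python
import numpy as np

def motion_energy_and_history(frames, tau=10, diff_thresh=0.1):
    """Compute an MEI and an MHI from a list of grayscale frames in [0, 1]."""
    h, w = frames[0].shape
    mei = np.zeros((h, w))            # binary union of all motion
    mhi = np.zeros((h, w))            # recency-weighted motion
    for prev, curr in zip(frames[:-1], frames[1:]):
        moving = np.abs(curr - prev) > diff_thresh
        mei[moving] = 1.0
        mhi[moving] = tau             # reset timestamp where motion occurs
        mhi[~moving] = np.maximum(mhi[~moving] - 1.0, 0.0)  # decay elsewhere
    return mei, mhi / tau             # normalise the MHI to [0, 1]

def correlation_score(template, observed):
    """Pearson correlation coefficient between two flattened images."""
    return float(np.corrcoef(template.ravel(), observed.ravel())[0, 1])
```

An observed action would be scored against each category's average MEI/MHI template with `correlation_score`, and the two per-modality decisions checked for consistency.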

2013 ◽  
Vol 18 (2-3) ◽  
pp. 49-60 ◽  
Author(s):  
Damian Dudziński ◽  
Tomasz Kryjak ◽  
Zbigniew Mikrut

Abstract In this paper a human action recognition algorithm is described which uses background generation with shadow elimination, silhouette description based on simple geometrical features, and a finite state machine for recognizing particular actions. The performed tests indicate that this approach obtains an 81% correct recognition rate while allowing real-time processing of a 360 × 288 video stream.
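The finite-state-machine idea can be illustrated with a toy example. The posture states, aspect-ratio thresholds, and the "sit down" transition table below are hypothetical stand-ins for the paper's geometrical silhouette features:

```python
def aspect_state(ratio):
    """Map a silhouette height/width ratio to a coarse posture state.
    Thresholds are illustrative, not taken from the paper."""
    if ratio > 1.8:
        return "standing"
    if ratio > 1.0:
        return "bending"
    return "sitting"

# Allowed transitions for a hypothetical "sit down" action.
TRANSITIONS = {
    ("standing", "bending"): "bending",
    ("bending", "sitting"): "sitting",
}

def recognise_sit_down(ratios):
    """True if the ratio sequence walks standing -> bending -> sitting."""
    state = aspect_state(ratios[0])
    if state != "standing":
        return False
    for r in ratios[1:]:
        nxt = aspect_state(r)
        if nxt == state:
            continue                      # stay in the current state
        state = TRANSITIONS.get((state, nxt))
        if state is None:                 # illegal transition: reject
            return False
    return state == "sitting"
```

A real system would run one such machine per action, driven by the per-frame silhouette features.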


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Alexandros Andre Chaaraoui ◽  
Francisco Flórez-Revuelta

This paper presents a novel silhouette-based feature for vision-based human action recognition, which relies on the contour of the silhouette and a radial scheme. Its low dimensionality and ease of extraction make it highly suitable for real-time scenarios. The feature is used in a learning algorithm that, by means of model fusion of multiple camera streams, builds a bag of key poses, which serves as a dictionary of known poses and allows the training sequences to be converted into sequences of key poses. These are then used to perform action recognition by means of a sequence matching algorithm. Experimentation on three different datasets returns high and stable recognition rates; to the best of our knowledge, this paper presents the highest results so far on the MuHAVi-MAS dataset. The method is suitable for real-time use, since it easily performs above video frame rate, and therefore successfully fulfills the requirements imposed by applications such as ambient-assisted living services.
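A minimal sketch of the bag-of-key-poses pipeline, assuming plain k-means for the dictionary and edit distance as a stand-in for the paper's sequence matching algorithm (the actual feature and matcher differ):

```python
import numpy as np

def build_key_poses(pose_features, k, iters=20, seed=0):
    """Cluster pose descriptors into k key poses with plain k-means."""
    rng = np.random.default_rng(seed)
    X = np.asarray(pose_features, dtype=np.float64)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):       # keep old centroid if cluster empties
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def to_key_pose_sequence(frame_features, centroids):
    """Replace each frame descriptor by the index of its nearest key pose."""
    X = np.asarray(frame_features, dtype=np.float64)
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1).tolist()

def edit_distance(a, b):
    """Levenshtein distance as a simple key-pose sequence matching score."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]
```

Recognition would assign a test sequence to the training class whose key-pose sequence has the smallest matching cost.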


Author(s):  
L. Nirmala Devi ◽  
A. Nageswar Rao

Human action recognition (HAR) is one of the most significant research topics, and it has attracted the attention of many researchers. Automatic HAR systems are applied in several fields like visual surveillance, data retrieval, healthcare, etc. Based on this inspiration, in this chapter, the authors propose a new HAR model that takes an image as input, analyses it, and exposes the action present in it. In the analysis phase, they implement two different feature extraction methods with the help of a rotation-invariant Gabor filter and an edge-adaptive wavelet filter. For every action image, a new vector called the composite feature vector is formulated and then subjected to dimensionality reduction through principal component analysis (PCA). Finally, the authors employ the most popular supervised machine learning algorithm (i.e., the support vector machine [SVM]) for classification. Simulation is done over two standard datasets, KTH and Weizmann, and the performance is measured through an accuracy metric.
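Two of the building blocks above are easy to sketch with numpy alone: a real 2-D Gabor kernel (a bank of these at several orientations approximates rotation invariance) and PCA via SVD. Kernel parameters are illustrative defaults, not the chapter's settings:

```python
import numpy as np

def gabor_kernel(size, theta, sigma=2.0, lam=4.0):
    """Real part of a 2-D Gabor kernel at orientation theta (radians)."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    gauss = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    return gauss * np.cos(2 * np.pi * xr / lam)     # Gaussian-windowed cosine

def pca_reduce(features, n_components):
    """Project feature vectors onto the top principal components (SVD)."""
    X = np.asarray(features, dtype=np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt = directions
    comps = Vt[:n_components]
    return Xc @ comps.T, comps, mean
```

The reduced composite vectors would then be fed to an SVM for classification.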


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 301
Author(s):  
Guocheng Liu ◽  
Caixia Zhang ◽  
Qingyang Xu ◽  
Ruoshi Cheng ◽  
Yong Song ◽  
...  

In view of the difficulty of applying optical flow based human action recognition due to its large amount of calculation, a human action recognition algorithm, the I3D-shufflenet model, is proposed, combining the advantages of the I3D neural network and the lightweight ShuffleNet model. The 5 × 5 convolution kernel of I3D is replaced by two 3 × 3 convolution kernels, which reduces the amount of calculation. A shuffle layer is adopted to achieve feature exchange between channel groups. The recognition and classification of human actions is performed with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the utilization of useful information. The Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which significantly improves the expressiveness of human actions and reduces the calculation of feature extraction. The I3D-shufflenet is tested on the UCF101 dataset and compared with other models. The final results show that I3D-shufflenet achieves higher accuracy than the original I3D, reaching 96.4%.
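The shuffle layer itself is a cheap tensor permutation. A numpy sketch of the standard ShuffleNet channel shuffle on an (N, C, H, W) tensor (the model here applies the same idea to its grouped feature maps):

```python
import numpy as np

def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle on an (N, C, H, W) array.
    Channels are split into `groups`, then interleaved across groups so
    that subsequent grouped convolutions see mixed information."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    # (N, G, C//G, H, W) -> swap the two channel axes -> flatten back
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))
```

With 6 channels and 2 groups, channel order 0..5 becomes 0, 3, 1, 4, 2, 5 — each group now carries channels from every original group. (Replacing a 5 × 5 kernel with two stacked 3 × 3 kernels keeps the same receptive field with fewer multiply-adds, which is the other saving the abstract mentions.)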


Author(s):  
Jiajia Luo ◽  
Wei Wang ◽  
Hairong Qi

Multi-view human action recognition has gained a lot of attention in recent years for its superior performance as compared to single-view recognition. In this paper, we propose a new framework for the real-time realization of human action recognition in distributed camera networks (DCNs). We first present a new feature descriptor (Mltp-hist) that is tolerant to illumination change, robust in homogeneous regions, and computationally efficient. Taking advantage of the proposed Mltp-hist, the noninformative 3-D patches generated from the background can be removed automatically, which effectively highlights the foreground patches. Next, a new feature representation method based on sparse coding is presented to generate the histogram representation of local videos to be transmitted to the base station for classification. Due to the sparse representation of extracted features, the approximation error is reduced. Finally, at the base station, a probability model is produced to fuse the information from various views and a class label is assigned accordingly. Compared to existing algorithms, the proposed framework has three advantages while placing fewer requirements on memory and bandwidth: 1) no preprocessing is required; 2) communication among cameras is unnecessary; and 3) positions and orientations of cameras do not need to be fixed. We further evaluate the proposed framework on the most popular multi-view action dataset, IXMAS. Experimental results indicate that our proposed framework consistently achieves state-of-the-art results when various numbers of views are tested. In addition, our approach is tolerant to various combinations of views and benefits from introducing more views at the testing stage. In particular, our results remain satisfactory even when large misalignment exists between the training and testing samples.
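The abstract does not define Mltp-hist in detail; as an assumed illustration of the family it belongs to, here is a plain local ternary pattern (LTP) histogram over the 4-neighbourhood, whose threshold `t` is what gives tolerance to small illumination changes (the paper's modified descriptor and its 3-D patch version differ):

```python
import numpy as np

def ltp_histogram(img, t=0.05):
    """Normalised histogram of local ternary patterns (4-neighbourhood).
    Each neighbour yields a trit: below, within, or above the centre by
    threshold t; four trits form one of 3**4 = 81 codes. Illustrative
    stand-in for the paper's Mltp-hist, not its actual definition."""
    c = img[1:-1, 1:-1]
    neighbours = [img[:-2, 1:-1], img[2:, 1:-1],
                  img[1:-1, :-2], img[1:-1, 2:]]
    code = np.zeros_like(c, dtype=np.int64)
    for nb in neighbours:
        trit = np.where(nb > c + t, 2, np.where(nb < c - t, 0, 1))
        code = code * 3 + trit                 # pack trits base-3
    hist = np.bincount(code.ravel(), minlength=81).astype(np.float64)
    return hist / hist.sum()
```

A homogeneous (background) patch concentrates all mass in the "all within threshold" bin, which is how such patches can be detected and discarded before transmission.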


2014 ◽  
Vol 989-994 ◽  
pp. 2731-2734
Author(s):  
Hai Long Jia ◽  
Kun Cao

The choice of motion features directly affects the result of a human action recognition method. A single feature is often influenced differently by many factors, such as the appearance of the human body, the environment, and the video camera, so the accuracy of action recognition is limited. On the basis of studying the representation and recognition of human actions, and giving full consideration to the advantages and disadvantages of different features, this paper proposes a mixed feature which combines a global silhouette feature and a local optical flow feature. This combined representation is used for human action recognition.
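The combination step can be sketched as a weighted concatenation of the two normalised descriptors; the weight `w` and the L2 normalisation are assumptions for illustration, since the abstract does not specify how the features are fused:

```python
import numpy as np

def mixed_feature(silhouette_feat, flow_feat, w=0.5):
    """Concatenate an L2-normalised global silhouette descriptor with an
    L2-normalised local optical-flow descriptor, weighted by w and 1 - w."""
    s = np.asarray(silhouette_feat, dtype=np.float64)
    f = np.asarray(flow_feat, dtype=np.float64)
    s = s / (np.linalg.norm(s) + 1e-12)   # guard against zero vectors
    f = f / (np.linalg.norm(f) + 1e-12)
    return np.concatenate([w * s, (1.0 - w) * f])
```

Normalising each part first prevents whichever feature has the larger raw magnitude from dominating the combined representation.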


2014 ◽  
Vol 11 (01) ◽  
pp. 1450005
Author(s):  
Yangyang Wang ◽  
Yibo Li ◽  
Xiaofei Ji

Visual-based human action recognition is currently one of the most active research topics in computer vision, and the feature representation has a direct, crucial impact on recognition performance. Feature representation based on bag-of-words is popular in current research, but the spatial and temporal relationships among the features are usually discarded. To solve this issue, a novel feature representation based on normalized interest points, called the super-interest point, is proposed and utilized to recognize human actions. The novelty of the proposed feature is that the spatial-temporal correlation between the interest points and the human body can be added directly to the representation, without considering scale and location variance of the points, by introducing normalized point clustering. The approach involves three tasks. First, to handle the diversity of human location and scale, interest points are normalized based on the normalization of the human region. Second, to capture the spatial-temporal correlation among the interest points, normalized points with similar spatial and temporal distances are grouped into a super-interest point using a three-dimensional clustering algorithm. Finally, by describing the appearance characteristics of the super-interest points and the location relationships among them, a new feature representation is obtained. The proposed representation establishes the relationship between local features and the human figure. Experiments on the Weizmann, KTH, and UCF Sports datasets demonstrate that the proposed feature is effective for human action recognition.
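The first two tasks can be sketched simply: normalise (x, y, t) interest points by the person's bounding box, then group nearby points in the normalised 3-D space. The greedy single-linkage grouping below is an assumed stand-in for the paper's three-dimensional clustering algorithm, and the radius is illustrative:

```python
import numpy as np

def normalise_points(points, bbox):
    """Map (x, y, t) interest points into the unit square of the person
    bounding box bbox = (x0, y0, width, height); t is left unchanged."""
    x0, y0, w, h = bbox
    pts = np.asarray(points, dtype=np.float64)
    out = pts.copy()
    out[:, 0] = (pts[:, 0] - x0) / w
    out[:, 1] = (pts[:, 1] - y0) / h
    return out

def super_interest_points(points, radius=0.2):
    """Greedy single-linkage grouping: points within `radius` of each other
    (in normalised x, y, t space) share a super-interest point label."""
    pts = np.asarray(points, dtype=np.float64)
    labels = -np.ones(len(pts), dtype=int)
    nxt = 0
    for i in range(len(pts)):
        if labels[i] != -1:
            continue
        stack, labels[i] = [i], nxt
        while stack:                      # flood-fill the connected group
            j = stack.pop()
            near = np.linalg.norm(pts - pts[j], axis=1) < radius
            for k in np.nonzero(near)[0]:
                if labels[k] == -1:
                    labels[k] = nxt
                    stack.append(k)
        nxt += 1
    return labels
```

Each resulting group would then be described by its appearance and by its location relative to the other super-interest points.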

