scholarly journals Driving Posture Recognition by Joint Application of Motion History Image and Pyramid Histogram of Oriented Gradients

2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Chao Yan ◽  
Frans Coenen ◽  
Bailing Zhang

In the field of intelligent transportation system (ITS), automatic interpretation of a driver’s behavior is an urgent and challenging topic. This paper studies vision-based driving posture recognition in the human action recognition framework. A driving action dataset was prepared by a side-mounted camera looking at a driver’s left profile. The driving actions, including operating the shift lever, talking on a cell phone, eating, and smoking, are first decomposed into a number of predefined action primitives, that is, interaction with shift lever, operating the shift lever, interaction with head, and interaction with dashboard. A global grid-based representation for the action primitives was emphasized, which first generate the silhouette shape from motion history image, followed by application of the pyramid histogram of oriented gradients (PHOG) for more discriminating characterization. The random forest (RF) classifier was then exploited to classify the action primitives together with comparisons to some other commonly applied classifiers such as kNN, multiple layer perceptron, and support vector machine. Classification accuracy is over 94% for the RF classifier in holdout and cross-validation experiments on the four manually decomposed driving actions.

2013 ◽  
Vol 846-847 ◽  
pp. 1102-1105 ◽  
Author(s):  
Chao Yan ◽  
Frans Coenen ◽  
Bai Ling Zhang

This paper presents a novel approach to vision-based driving posture recognition. A driving action dataset was prepared by a side-mounted camera looking at a driver's left profile. The driving actions, including operating the shift lever, talking on a cell phone, eating and smoking, are decomposed into a number of predefined action primitives, which include operation of the shift lever, interaction with the drivers head and interaction with the dashboard. A global grid-based representation for the action primitives was emphasized, which first generate the silhouette shape from the motion history image, followed by application of the Pyramid Histogram of Oriented Gradients (PHOG) for more discriminating characterization. The random forest (RF) classifier was then exploited to classify the action primitives. Comparisons with some other commonly applied classifiers, such as kNN, multiple layer perceptron (MLP) and support vector machine (SVM), were provided. Classification accuracy is over 95% for the RF classifier in holdout experiment on the four manually decomposed driving actions.


Author(s):  
L. Nirmala Devi ◽  
A.Nageswar Rao

Human action recognition (HAR) is one of most significant research topics, and it has attracted the concentration of many researchers. Automatic HAR system is applied in several fields like visual surveillance, data retrieval, healthcare, etc. Based on this inspiration, in this chapter, the authors propose a new HAR model that considers an image as input and analyses and exposes the action present in it. Under the analysis phase, they implement two different feature extraction methods with the help of rotation invariant Gabor filter and edge adaptive wavelet filter. For every action image, a new vector called as composite feature vector is formulated and then subjected to dimensionality reduction through principal component analysis (PCA). Finally, the authors employ the most popular supervised machine learning algorithm (i.e., support vector machine [SVM]) for classification. Simulation is done over two standard datasets; they are KTH and Weizmann, and the performance is measured through an accuracy metric.


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 301
Author(s):  
Guocheng Liu ◽  
Caixia Zhang ◽  
Qingyang Xu ◽  
Ruoshi Cheng ◽  
Yong Song ◽  
...  

In view of difficulty in application of optical flow based human action recognition due to large amount of calculation, a human action recognition algorithm I3D-shufflenet model is proposed combining the advantages of I3D neural network and lightweight model shufflenet. The 5 × 5 convolution kernel of I3D is replaced by a double 3 × 3 convolution kernels, which reduces the amount of calculations. The shuffle layer is adopted to achieve feature exchange. The recognition and classification of human action is performed based on trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel which can promote the utilization of useful information. The Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which can significantly improve the ability of human action expression and reduce the calculation of feature extraction. The I3D-shufflenet is testified on the UCF101 dataset, and compared with other models. The final result shows that the I3D-shufflenet has higher accuracy than the original I3D with an accuracy of 96.4%.


Sensors ◽  
2019 ◽  
Vol 19 (7) ◽  
pp. 1599 ◽  
Author(s):  
Md Uddin ◽  
Young-Koo Lee

Human action recognition plays a significant part in the research community due to its emerging applications. A variety of approaches have been proposed to resolve this problem, however, several issues still need to be addressed. In action recognition, effectively extracting and aggregating the spatial-temporal information plays a vital role to describe a video. In this research, we propose a novel approach to recognize human actions by considering both deep spatial features and handcrafted spatiotemporal features. Firstly, we extract the deep spatial features by employing a state-of-the-art deep convolutional network, namely Inception-Resnet-v2. Secondly, we introduce a novel handcrafted feature descriptor, namely Weber’s law based Volume Local Gradient Ternary Pattern (WVLGTP), which brings out the spatiotemporal features. It also considers the shape information by using gradient operation. Furthermore, Weber’s law based threshold value and the ternary pattern based on an adaptive local threshold is presented to effectively handle the noisy center pixel value. Besides, a multi-resolution approach for WVLGTP based on an averaging scheme is also presented. Afterward, both these extracted features are concatenated and feed to the Support Vector Machine to perform the classification. Lastly, the extensive experimental analysis shows that our proposed method outperforms state-of-the-art approaches in terms of accuracy.


Author(s):  
Xueping Liu ◽  
Xingzuo Yue

The kernel function has been successfully utilized in the extreme learning machine (ELM) that provides a stabilized and generalized performance and greatly reduces the computational complexity. However, the selection and optimization of the parameters constituting the most common kernel functions are tedious and time-consuming. In this study, a set of new Hermit kernel functions derived from the generalized Hermit polynomials has been proposed. The significant contributions of the proposed kernel include only one parameter selected from a small set of natural numbers; thus, the parameter optimization is greatly facilitated and excessive structural information of the sample data is retained. Consequently, the new kernel functions can be used as optimal alternatives to other common kernel functions for ELM at a rapid learning speed. The experimental results showed that the proposed kernel ELM method tends to have similar or better robustness and generalized performance at a faster learning speed than the other common kernel ELM and support vector machine methods. Consequently, when applied to human action recognition by depth video sequence, the method also achieves excellent performance, demonstrating its time-based advantage on the video image data.


2019 ◽  
Vol 9 (10) ◽  
pp. 2126 ◽  
Author(s):  
Suge Dong ◽  
Daidi Hu ◽  
Ruijun Li ◽  
Mingtao Ge

Aimed at the problems of high redundancy of trajectory and susceptibility to background interference in traditional dense trajectory behavior recognition methods, a human action recognition method based on foreground trajectory and motion difference descriptors is proposed. First, the motion magnitude of each frame is estimated by optical flow, and the foreground region is determined according to each motion magnitude of the pixels; the trajectories are only extracted from behavior-related foreground regions. Second, in order to better describe the relative temporal information between different actions, a motion difference descriptor is introduced to describe the foreground trajectory, and the direction histogram of the motion difference is constructed by calculating the direction information of the motion difference per unit time of the trajectory point. Finally, a Fisher vector (FV) is used to encode histogram features to obtain video-level action features, and a support vector machine (SVM) is utilized to classify the action category. Experimental results show that this method can better extract the action-related trajectory, and it can improve the recognition accuracy by 7% compared to the traditional dense trajectory method.


Sign in / Sign up

Export Citation Format

Share Document