Human Action Recognition Using CNN-SVM Model

2021 ◽  
Vol 105 ◽  
pp. 282-290
Author(s):  
Vijay Anant Athavale ◽  
Suresh Chand Gupta ◽  
Deepak Kumar ◽  
Savita

In this paper, a pre-trained CNN model, VGG16, combined with an SVM classifier is presented for the HAR task. Deep features are learned via the VGG16 pre-trained CNN model, a network previously used for image classification. Here, VGG16 is applied to classifying human-activity signals recorded by a smartphone's accelerometer sensor. The UniMiB dataset contains 11,771 samples of daily human activities, recorded by a smartphone through its accelerometer. Features are taken from the fifth max-pooling layer of the VGG16 model and fed to the SVM classifier, which replaces VGG16's fully connected layers. The proposed VGG16-SVM model achieves effective and efficient results and is compared with previously used schemes. With classification accuracy and F-score as evaluation parameters, the proposed method achieves 79.55% accuracy and a 71.63% F-score.
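The pipeline described above (deep features from a frozen CNN, with an SVM replacing the fully connected head) can be sketched as follows. This is a minimal illustration, not the authors' code: the VGG16 pool5 activations are simulated with synthetic two-class feature vectors, since running the real network requires the pre-trained weights and the UniMiB recordings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for deep features from VGG16's fifth max-pooling layer
# (512 x 7 x 7 = 25088-dim when flattened); here we simulate two
# separable activity classes instead of running the real network.
n_per_class, dim = 100, 256          # reduced dim for the sketch
X0 = rng.normal(0.0, 1.0, (n_per_class, dim))
X1 = rng.normal(1.0, 1.0, (n_per_class, dim))
X = np.vstack([X0, X1])
y = np.array([0] * n_per_class + [1] * n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)

# The SVM acts as the classifier head in place of the FC layers.
clf = SVC(kernel="linear").fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

On real pool5 features the same two lines at the end are all that changes: fit the SVC on extracted training features and score it on held-out ones.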

Action recognition (AR) plays a fundamental role in computer vision and video analysis. We are witnessing an astronomical increase in video data on the web, and recognizing actions in video is difficult due to differing camera viewpoints. AR in video sequences depends on per-frame appearance and on optical flow across frames; the spatial and temporal components of video frames play an integral role in accurate action classification. In the proposed system, RGB frames and optical-flow frames are used for AR, with features extracted from the fc7 layer of the pre-trained Convolutional Neural Network (CNN) model AlexNet. A support vector machine (SVM) classifier is then used for classification. Experiments use the HMDB51 dataset, which comprises 51 human-action classes. Using the SVM classifier on the extracted features, the method achieves 95.6% accuracy, the best result compared with other state-of-the-art techniques.
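The two-stream idea above (appearance from RGB frames plus motion from optical-flow frames, both described by fc7 activations) reduces to building one video-level descriptor before the SVM. A minimal sketch, with hypothetical random activations standing in for real AlexNet fc7 outputs:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-frame fc7 activations (AlexNet fc7 is 4096-dim);
# one video = T frames for each stream.
T, fc7_dim = 30, 4096
rgb_feats = rng.normal(size=(T, fc7_dim))    # appearance stream
flow_feats = rng.normal(size=(T, fc7_dim))   # optical-flow stream

# Average-pool each stream over time, then concatenate the two
# streams into a single fixed-length descriptor for the SVM.
video_descriptor = np.concatenate([rgb_feats.mean(axis=0),
                                   flow_feats.mean(axis=0)])
```

Whether the streams are averaged, max-pooled, or fused later is a design choice the abstract leaves open; concatenation after temporal pooling is one common variant.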


Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 301
Author(s):  
Guocheng Liu ◽  
Caixia Zhang ◽  
Qingyang Xu ◽  
Ruoshi Cheng ◽  
Yong Song ◽  
...  

Optical-flow-based human action recognition is difficult to apply in practice because of its large computational cost. To address this, a human action recognition algorithm, the I3D-shufflenet model, is proposed, combining the advantages of the I3D neural network and the lightweight ShuffleNet model. The 5 × 5 convolution kernels of I3D are replaced by two stacked 3 × 3 convolution kernels, which reduces the amount of computation, and a shuffle layer is adopted to exchange features across channels. Recognition and classification of human actions is performed with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, promoting the utilization of useful information. Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which significantly improves the expressiveness of human actions and reduces the cost of feature extraction. I3D-shufflenet is evaluated on the UCF101 dataset and compared with other models; the final result shows that I3D-shufflenet achieves higher accuracy than the original I3D, reaching 96.4%.
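The two design moves in the abstract are easy to quantify and sketch: replacing a 5 × 5 kernel with two stacked 3 × 3 kernels keeps the same receptive field while cutting parameters, and the ShuffleNet-style shuffle layer interleaves channels across groups so grouped convolutions can still mix information. A small illustration (the shapes and group count are illustrative, not taken from the paper):

```python
import numpy as np

# Parameter count for a conv layer with C input and C output channels:
# one 5x5 kernel vs two stacked 3x3 kernels (same 5x5 receptive field).
C = 64
params_5x5 = 5 * 5 * C * C            # 102400
params_two_3x3 = 2 * (3 * 3 * C * C)  # 73728, ~28% fewer

# ShuffleNet-style channel shuffle: reshape channels into groups,
# transpose, and flatten back, so channels interleave across groups.
def channel_shuffle(x, groups):
    n, c, h, w = x.shape
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

x = np.arange(8).reshape(1, 8, 1, 1)       # channels labelled 0..7
shuffled = channel_shuffle(x, groups=2)
# channel order becomes [0, 4, 1, 5, 2, 6, 3, 7]
```

The same reshape-transpose-reshape trick is how the shuffle layer is implemented in most frameworks; it has no learnable parameters.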


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
P. V. V. Kishore ◽  
K. V. V. Kumar ◽  
E. Kiran Kumar ◽  
A. S. C. S. Sastry ◽  
M. Teja Kiran ◽  
...  

Extracting and recognizing complex human movements from unconstrained online/offline video sequences is a challenging task in computer vision. This paper proposes the classification of Indian classical dance actions using a powerful artificial intelligence tool: convolutional neural networks (CNNs). In this work, human action recognition is performed on Indian classical dance videos from both offline (controlled recording) and online (live performances, YouTube) sources. The offline data is created with ten different subjects performing 200 familiar dance mudras/poses from different Indian classical dance forms under various background environments. The online dance data is collected from YouTube for ten different subjects. Each dance pose occupies 60 frames of a video in both cases. CNN training is performed with 8 different sample sets, each consisting of multiple subjects, and the remaining 2 sample sets are used for testing the trained CNN. Different CNN architectures were designed and tested on our data to obtain better recognition accuracy. We achieved a 93.33% recognition rate, which compares favorably with other classifier models reported on the same dataset.


Author(s):  
L V Shiripova ◽  
E V Myasnikov

The paper is devoted to the problem of recognizing human actions in videos recorded in the optical wavelength range. The approach proposed in this paper consists of detecting a moving person in a video sequence, followed by size normalization, generation of subsequences, and dimensionality reduction using principal component analysis. The classification of human actions is carried out using a support vector machine classifier. Experimental studies performed on the Weizmann dataset allowed us to determine the best values of the method's parameters. The results showed that, with a small number of action classes, high classification accuracy can be achieved.
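The PCA-then-SVM stage of this pipeline can be sketched directly with scikit-learn. The feature vectors below are synthetic stand-ins for the flattened, size-normalized person subsequences (the Weizmann frames themselves are not bundled here), so only the pipeline shape is shown, not the reported accuracy:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Stand-in for flattened, size-normalized subsequences: two
# synthetic action classes separated by a mean shift.
n, dim = 120, 400
X = rng.normal(size=(n, dim))
X[:60] += 1.5
y = np.array([0] * 60 + [1] * 60)

# Reduce dimensionality with PCA, then classify with an SVM,
# mirroring the pipeline described in the abstract.
Z = PCA(n_components=20, random_state=0).fit_transform(X)
clf = SVC(kernel="rbf").fit(Z, y)
train_accuracy = clf.score(Z, y)
```

The number of principal components (20 here) is exactly the kind of parameter the authors tuned experimentally on Weizmann.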


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Li Li

Following the development trend of routine composition in competitive aerobics, this paper studies a multimedia-based method for the online arrangement of difficult actions in competitive aerobics, aiming to improve the arrangement effect. RGB images, optical-flow images, and corrected optical-flow images are taken as the input modes of a difficult-action recognition network for competitive aerobics video based on top-down feature fusion. Key frames of the input modes are extracted using a key-frame extraction method based on subshot segmentation with a double-threshold sliding window and a fully connected graph. Through forward propagation, a score vector of the video over all categories is obtained, and normalization yields a probability distribution over those categories. Human action recognition in competitive aerobics video is thus completed, and the online arrangement of difficult actions is realized on this basis. The experimental results show that this method identifies difficult actions in competitive aerobics video with high accuracy; the online arrangement of difficult actions has obvious advantages, meets users' needs, and is highly practical.
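The normalization step above, turning a raw class-score vector from the forward pass into a probability distribution, is typically a softmax. A minimal sketch with hypothetical scores (the three categories are illustrative):

```python
import numpy as np

def softmax(scores):
    """Normalize a raw class-score vector into a probability
    distribution; shifting by the max keeps exp() stable."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

# Hypothetical forward-pass scores for three action categories.
scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)   # sums to 1; highest score -> highest prob
```

The predicted category is then simply the argmax of the normalized vector.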


Human Action Recognition (HAR) is an interesting and useful topic in various real-life applications such as surveillance-based security systems, computer vision, and robotics. The selected features, feature representation methods, and classification algorithms determine the accuracy of a HAR system. A new feature, the Skeletonized STIP (Spatio-Temporal Interest Points), is introduced and used in this work. Skeletonization is performed on the action video's foreground frames, and the new feature is generated as the STIP values of the skeleton frame sequence. The feature set is then used for initial dictionary construction in sparse coding. Because the data for action recognition is huge, the feature set is represented using sparse representation. The sparse representation is refined with max pooling, and action recognition is performed using an SVM classifier. The proposed approach outperforms existing methods on the benchmark datasets.
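The max-pooling refinement described above collapses a variable number of sparse codes per video into one fixed-length vector for the SVM. A minimal sketch with hypothetical sparse codes (the dictionary size K and descriptor count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical sparse codes: one K-dim code per skeleton-STIP
# descriptor extracted from a video (K = dictionary size).
K, n_descriptors = 50, 200
codes = rng.random((n_descriptors, K))
codes[codes < 0.9] = 0.0           # enforce sparsity in the codes

# Max pooling over all descriptors yields a single fixed-length
# video descriptor, the input to the SVM classifier.
pooled = codes.max(axis=0)
```

Max pooling keeps, for each dictionary atom, the strongest response anywhere in the video, which is what makes the pooled vector robust to the number and order of interest points.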


2018 ◽  
Vol 7 (2.20) ◽  
pp. 207 ◽  
Author(s):  
K Rajendra Prasad ◽  
P Srinivasa Rao

Human action recognition from 2D videos is a demanding area due to its broad applications. Many methods have been proposed by researchers for recognizing human actions, and improved accuracy in identifying them is desirable. This paper presents an improved method of human action recognition using a support vector machine (SVM) classifier, built on a novel feature descriptor constructed by fusing several investigated features. Handcrafted features such as scale-invariant feature transform (SIFT) features, speeded-up robust features (SURF), histogram of oriented gradients (HOG) features, and local binary pattern (LBP) features are extracted from online 2D action videos. The proposed method is tested on action datasets with both static and dynamically varying backgrounds and achieves the best recognition rates in both settings. The datasets considered for the experimentation are KTH, Weizmann, UCF101, UCF Sports Actions, MSR Action, and HMDB51. The performance of the proposed feature-fusion model with an SVM classifier is compared with that of the individual features with an SVM; the fusion method gives the best results. The efficiency of the classifier is also tested by comparison with other state-of-the-art classifiers such as k-nearest neighbors (KNN), artificial neural networks (ANN), and AdaBoost. The method achieves an average recognition rate of 94.41%.
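Fusion of heterogeneous handcrafted descriptors, as described above, usually means concatenating them into one joint vector before the SVM. A minimal sketch with hypothetical per-video descriptors; the dimensionalities below are typical defaults (128-dim SIFT, 64-dim SURF, 3780-dim HOG for a 64×128 window, 59-bin uniform LBP histogram) and may differ from the authors' extraction settings:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical per-video descriptors with typical dimensionalities.
sift = rng.random(128)   # SIFT descriptor
surf = rng.random(64)    # SURF descriptor
hog = rng.random(3780)   # HOG descriptor (64x128 window default)
lbp = rng.random(59)     # uniform LBP histogram

# Fusion by concatenation yields one joint descriptor per video,
# which is then fed to the SVM classifier.
fused = np.concatenate([sift, surf, hog, lbp])
```

Because the component descriptors live on different scales, per-block normalization before concatenation is a common refinement, though the abstract does not specify one.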

