Skeleton based Human Action Recognition using a Structured-Tree Neural Network

The ability of automated technologies to correctly identify human actions provides considerable scope for systems that make use of human-machine interaction. Consequently, automatic 3D human action recognition has seen significant research effort. In the work described here, everyday human actions recorded in 3D in the NTU RGB+D dataset are identified using a novel structured-tree neural network. The nodes of the tree represent the skeleton joints, with the spine joint represented by the root. The connection between a child node and its parent is known as the incoming edge, while the reciprocal connection is known as the outgoing edge. The use of a tree structure leads to a system that maps intuitively to human movements. The classifier uses the change in displacement of joints and the change in the angles between incoming and outgoing edges as features for classifying the actions performed.
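As a rough illustration of the two feature types named in this abstract, the sketch below computes a per-joint displacement and the change in angle between edges for two consecutive frames. The joint names, the toy `PARENT` table, and the reading of "angle between incoming and outgoing edges" as the angle at a joint between the edge from its parent and an edge to one of its children are all assumptions made for this example, not details taken from the paper.

```python
import math

# Hypothetical toy skeleton: child joint -> parent joint; "spine" is the root.
PARENT = {"neck": "spine", "head": "neck", "shoulder": "neck", "elbow": "shoulder"}

def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def displacement(prev_frame, next_frame, joint):
    """Distance a joint moved between two frames."""
    return norm(sub(next_frame[joint], prev_frame[joint]))

def edge_angle(frame, joint, child):
    """Angle (radians) at `joint` between the edge from its parent
    and the edge to `child` (assumed interpretation)."""
    incoming = sub(frame[joint], frame[PARENT[joint]])
    outgoing = sub(frame[child], frame[joint])
    denom = norm(incoming) * norm(outgoing)
    if denom == 0.0:
        return 0.0  # degenerate edge: treat the angle as zero
    cos = sum(a * b for a, b in zip(incoming, outgoing)) / denom
    return math.acos(max(-1.0, min(1.0, cos)))

def frame_features(prev_frame, next_frame):
    """Displacement per non-root joint, plus angle change per (joint, child) pair."""
    feats = {}
    for joint in PARENT:
        feats["disp:" + joint] = displacement(prev_frame, next_frame, joint)
    for child, joint in PARENT.items():
        if joint in PARENT:  # the root has no incoming edge, so skip it
            key = "dangle:%s->%s" % (joint, child)
            feats[key] = (edge_angle(next_frame, joint, child)
                          - edge_angle(prev_frame, joint, child))
    return feats
```

Each frame is assumed to be a dictionary mapping a joint name to an `(x, y, z)` tuple. A classifier would consume the resulting feature values over a whole sequence of frames.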

Comparison between Recurrent Networks and Temporal Convolutional Networks Approaches for Skeleton-Based Action Recognition

Sensors ◽

10.3390/s21062051 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2051

Author(s):

Mihai Nan ◽

Mihai Trăscău ◽

Adina Magda Florea ◽

Cezar Cătălin Iacob

Keyword(s):

Action Recognition ◽

State Of The Art ◽

Human Action Recognition ◽

Human Action ◽

Assistive Robotics ◽

Recurrent Networks ◽

Human Machine Interaction ◽

Convolutional Networks ◽

Machine Interaction ◽

Spatial And Temporal Characteristics

Action recognition plays an important role in various applications such as video monitoring, automatic video indexing, crowd analysis, human-machine interaction, smart homes and personal assistive robotics. In this paper, we propose improvements to some methods for human action recognition from videos that work with data represented in the form of skeleton poses. These methods are based on the most widely used techniques for this problem: Graph Convolutional Networks (GCNs), Temporal Convolutional Networks (TCNs) and Recurrent Neural Networks (RNNs). The paper first explores and compares different ways to extract the most relevant spatial and temporal characteristics from a sequence of frames describing an action. Based on this comparative analysis, we show how a TCN-type unit can be extended to work on the characteristics extracted from the spatial domain as well. To validate our approach, we test it against a benchmark often used for human action recognition problems and show that our solution obtains results comparable to the state of the art, with a significant increase in inference speed.
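For readers unfamiliar with temporal convolutions, the following is a minimal sketch of the generic operation such networks build on: a causal, optionally dilated 1D convolution along the time axis of one feature channel (for instance, one coordinate of one joint across frames). It is not the unit proposed in the paper; the function name and the zero left-padding are assumptions for this example.

```python
def temporal_conv(sequence, kernel, dilation=1):
    """Causal 1D convolution along time for a single feature channel.

    sequence: list of floats, one value per frame.
    kernel:   list of weights; kernel[-1] applies to the current frame,
              earlier entries to frames further in the past.
    Frames before the start of the sequence are treated as zeros.
    """
    k = len(kernel)
    out = []
    for t in range(len(sequence)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - (k - 1 - i) * dilation  # frame this weight looks at
            if idx >= 0:
                acc += w * sequence[idx]
        out.append(acc)
    return out
```

With the kernel `[-1.0, 1.0]` the output is the frame-to-frame difference of the input, and increasing `dilation` widens the temporal window without adding weights.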

End-to-end learning of deep convolutional neural network for 3D human action recognition

2017 IEEE International Conference on Multimedia & Expo Workshops (ICMEW) ◽

10.1109/icmew.2017.8026281 ◽

2017 ◽

Author(s):

Chao Li ◽

Shouqian Sun ◽

Xin Min ◽

Wenqian Lin ◽

Binling Nie ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Deep Convolutional Neural Network ◽

End To End

Human action recognition based on quaternion spatial-temporal convolutional neural network and LSTM in RGB videos

Multimedia Tools and Applications ◽

10.1007/s11042-018-5893-9 ◽

2018 ◽

Vol 77 (20) ◽

pp. 26901-26918 ◽

Cited By ~ 8

Author(s):

Bo Meng ◽

XueJun Liu ◽

Xiaolin Wang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action

Hybrid Feature Vector-Assisted Action Representation for Human Action Recognition Using Support Vector Machines

Methodologies and Applications of Computational Statistics for Machine Intelligence - Advances in Systems Analysis, Software Engineering, and High Performance Computing ◽

10.4018/978-1-7998-7701-1.ch001 ◽

2021 ◽

pp. 1-22

Author(s):

L. Nirmala Devi ◽

A. Nageswar Rao

Keyword(s):

Action Recognition ◽

Feature Vector ◽

Learning Algorithm ◽

Gabor Filter ◽

Principal Component ◽

Human Action Recognition ◽

Human Action ◽

Visual Surveillance ◽

Support Vector ◽

Significant Research

Human action recognition (HAR) is one of the most significant research topics and has attracted the attention of many researchers. Automatic HAR systems are applied in several fields, such as visual surveillance, data retrieval, and healthcare. Motivated by this, the authors of this chapter propose a new HAR model that takes an image as input and analyses it to identify the action it depicts. In the analysis phase, they implement two different feature extraction methods using a rotation-invariant Gabor filter and an edge-adaptive wavelet filter. For every action image, a new vector called the composite feature vector is formulated and then subjected to dimensionality reduction through principal component analysis (PCA). Finally, the authors employ a widely used supervised machine learning algorithm, the support vector machine (SVM), for classification. Simulation is carried out on two standard datasets, KTH and Weizmann, and performance is measured using accuracy.
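The dimensionality-reduction step can be illustrated with a small, dependency-free sketch that finds only the first principal component by power iteration and projects a feature vector onto it. This is an assumption-laden toy, not the chapter's implementation: the function names are hypothetical, real systems would keep several components, and the SVM classification stage is not shown.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the top principal component.

    rows: list of equal-length feature vectors (lists of floats), at least two.
    Uses power iteration on the sample covariance matrix. If the fixed start
    vector happens to be orthogonal to the top component, the result is not
    meaningful; a random start would avoid this.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            break  # no variance along the current direction
        v = [x / length for x in w]
    return mean, v

def project(row, mean, direction):
    """Coordinate of `row` along `direction` after centering on `mean`."""
    return sum((x - m) * u for x, m, u in zip(row, mean, direction))
```

In a pipeline like the one described, the projected values (one per retained component) would replace the full composite feature vector as the classifier input.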

I3D-Shufflenet Based Human Action Recognition

Algorithms ◽

10.3390/a13110301 ◽

2020 ◽

Vol 13 (11) ◽

pp. 301

Author(s):

Guocheng Liu ◽

Caixia Zhang ◽

Qingyang Xu ◽

Ruoshi Cheng ◽

Yong Song ◽

...

Keyword(s):

Neural Network ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Recognition Algorithm ◽

Convolution Kernel ◽

Histogram Of Oriented Gradients ◽

Temporal Features ◽

Convolution Kernels

Because optical-flow-based human action recognition is difficult to apply owing to its large computational cost, a human action recognition algorithm, the I3D-shufflenet model, is proposed that combines the advantages of the I3D neural network and the lightweight shufflenet model. The 5 × 5 convolution kernel of I3D is replaced by two 3 × 3 convolution kernels, which reduces the amount of computation. A shuffle layer is adopted to achieve feature exchange. Recognition and classification of human actions are performed with the trained I3D-shufflenet model. The experimental results show that the shuffle layer improves the composition of features in each channel, which promotes the use of useful information. Histogram of Oriented Gradients (HOG) spatial-temporal features of the object are extracted for training, which significantly improves the ability to express human actions and reduces the computation needed for feature extraction. I3D-shufflenet is tested on the UCF101 dataset and compared with other models. The final result shows that I3D-shufflenet achieves an accuracy of 96.4%, higher than that of the original I3D.
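Two ideas in this abstract lend themselves to a short worked sketch: why two stacked 3 × 3 kernels are cheaper than one 5 × 5 kernel while covering the same window, and what a channel shuffle does. The helper names below are hypothetical, and the weight count considers only the 2D spatial kernel for one input/output channel pair, ignoring biases and the temporal dimension of I3D.

```python
def kernel_weights(size, count=1):
    """Weights in `count` stacked size x size kernels (one channel pair, no bias)."""
    return count * size * size

def stacked_receptive_field(size, count):
    """Receptive field of `count` stacked stride-1 convolutions of equal size."""
    return count * (size - 1) + 1

def channel_shuffle(channels, groups):
    """ShuffleNet-style shuffle: view channels as (groups, n), transpose, flatten.

    After the shuffle, each run of `groups` consecutive channels contains one
    channel from every original group, so grouped layers can exchange features.
    """
    n = len(channels) // groups
    assert n * groups == len(channels), "channel count must be divisible by groups"
    return [channels[g * n + i] for i in range(n) for g in range(groups)]
```

Under these assumptions, `kernel_weights(5)` gives 25 and `kernel_weights(3, count=2)` gives 18, while `stacked_receptive_field(3, 2)` is 5, matching the window of a single 5 × 5 kernel.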
