Deep Residual Temporal Convolutional Networks for Skeleton-Based Human Action Recognition

Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling

Data ◽

10.3390/data5040104 ◽

2020 ◽

Vol 5 (4) ◽

pp. 104

Author(s):

Ashok Sarabu ◽

Ajit Kumar Santra

Keyword(s):

Action Recognition ◽

Data Augmentation ◽

Main Idea ◽

Human Action Recognition ◽

Human Action ◽

Great Success ◽

Temporal Modeling ◽

Convolutional Networks ◽

Temporal Features ◽

Augmentation Techniques

The two-stream convolutional neural network (CNN) has proven highly successful for action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately and then combine the two streams' scores to obtain the final scores. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied to retrieve long-range temporal features and to differentiate similar types of sub-actions in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced into the proposed architecture to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The results show that the proposed architecture achieves a significant performance increase and outperforms existing methods.
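
The segment-based sampling and score fusion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the centre-frame sampling rule, the averaging consensus, and the fusion weights are all assumptions chosen for illustration, and the per-snippet class scores would in practice come from the two trained CNNs.

```python
import math
from typing import List, Sequence


def segment_indices(num_frames: int, num_segments: int) -> List[int]:
    """Pick the centre frame of each of `num_segments` equal-length segments.

    Assumption: deterministic centre sampling; training setups often sample
    a random frame per segment instead.
    """
    seg_len = num_frames / num_segments
    return [int(seg_len * k + seg_len / 2) for k in range(num_segments)]


def consensus(snippet_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-snippet class scores into one video-level score vector."""
    n = len(snippet_scores)
    return [sum(column) / n for column in zip(*snippet_scores)]


def fuse(spatial: Sequence[float], temporal: Sequence[float],
         w_spatial: float = 1.0, w_temporal: float = 1.5) -> List[float]:
    """Weighted average of the two streams' scores (weights are placeholders)."""
    total = w_spatial + w_temporal
    return [(w_spatial * s + w_temporal * t) / total
            for s, t in zip(spatial, temporal)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for 3 snippets and 2 classes from each stream.
rgb_scores = [[2.0, 0.5], [1.5, 1.0], [2.5, 0.0]]
flow_scores = [[0.5, 1.0], [1.0, 1.0], [1.5, 0.5]]
video_probs = softmax(fuse(consensus(rgb_scores), consensus(flow_scores)))
```

Averaging before fusion keeps each stream's video-level prediction independent of the other, so the two networks can use different architectures, as the abstract proposes.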

Pixel Convolutional Networks for Skeleton-Based Human Action Recognition

Communications in Computer and Information Science - Methods and Applications for Modeling and Simulation of Complex Systems ◽

10.1007/978-981-13-2853-4_40 ◽

2018 ◽

pp. 513-523

Author(s):

Zhichao Chang ◽

Jiangyun Wang ◽

Liang Han

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Convolutional Networks

Dual Attention-Guided Multiscale Dynamic Aggregate Graph Convolutional Networks for Skeleton-Based Human Action Recognition

Symmetry ◽

10.3390/sym12101589 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1589

Author(s):

Zeyuan Hu ◽

Eung-Joo Lee

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Great Success ◽

Semantic Features ◽

Convolutional Networks ◽

Temporal Correlations ◽

Semantic Relevance ◽

High Level ◽

Relationship Of

Traditional convolutional neural networks have achieved great success in human action recognition. However, it is challenging to establish effective associations between different human bone nodes to capture detailed information. In this paper, we propose a dual attention-guided multiscale dynamic aggregate graph convolutional neural network (DAG-GCN) for skeleton-based human action recognition. Our goal is to explore the best correlation and determine high-level semantic features. First, a multiscale dynamic aggregate GCN module is used to capture important semantic information and to establish dependence relationships between different bone nodes. Second, the higher-level semantic features are further refined, and their semantic relevance is emphasized, through a dual attention guidance module. In addition, we exploit the hierarchical relationships of joints and the spatial-temporal correlations through two modules. Experiments show that the DAG-GCN method performs well on the NTU-60-RGB+D and NTU-120-RGB+D datasets. On the NTU-60 dataset, the accuracy is 95.76% for the cross-view (X-View) setting and 90.01% for the cross-subject (X-Sub) setting.
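
As a rough illustration of how a graph convolution establishes dependencies between bone nodes, the sketch below performs one neighbourhood-aggregation step over a skeleton graph. It is a generic graph convolution step, not DAG-GCN: the row normalization, the self-loops, and the function names are assumptions, and the learnable weights and attention modules described in the abstract are omitted.

```python
from typing import List, Tuple


def normalized_adjacency(num_joints: int,
                         bones: List[Tuple[int, int]]) -> List[List[float]]:
    """Build a row-normalized adjacency matrix with self-loops.

    `bones` lists undirected joint pairs (hypothetical skeleton layout).
    """
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)]
           for i in range(num_joints)]
    for i, j in bones:
        adj[i][j] = 1.0
        adj[j][i] = 1.0
    return [[value / sum(row) for value in row] for row in adj]


def aggregate(adj: List[List[float]],
              features: List[List[float]]) -> List[List[float]]:
    """Replace each joint's features with the mean over itself and its neighbours."""
    channels = len(features[0])
    return [
        [sum(adj[i][j] * features[j][c] for j in range(len(features)))
         for c in range(channels)]
        for i in range(len(adj))
    ]


# Three joints in a chain (0-1-2), one feature channel per joint.
adj = normalized_adjacency(3, [(0, 1), (1, 2)])
smoothed = aggregate(adj, [[0.0], [3.0], [6.0]])
```

Stacking several such steps lets information travel between joints that are not directly connected by a bone.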

Exploring hybrid spatio-temporal convolutional networks for human action recognition

Multimedia Tools and Applications ◽

10.1007/s11042-017-4514-3 ◽

2017 ◽

Vol 76 (13) ◽

pp. 15065-15081 ◽

Cited By ~ 11

Author(s):

Hao Wang ◽

Yanhua Yang ◽

Erkun Yang ◽

Cheng Deng

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Convolutional Networks ◽

Spatio Temporal

An Attention Enhanced Spatial–Temporal Graph Convolutional LSTM Network for Action Recognition in Karate

Applied Sciences ◽

10.3390/app11188641 ◽

2021 ◽

Vol 11 (18) ◽

pp. 8641

Author(s):

Jianping Guo ◽

Hong Liu ◽

Xi Li ◽

Dahong Xu ◽

Yihan Zhang

Keyword(s):

Artificial Intelligence ◽

Action Recognition ◽

Structural Information ◽

Human Action Recognition ◽

Human Action ◽

Competitive Sports ◽

Convolutional Networks ◽

Convolution Model ◽

Artificial Intelligence Technology ◽

Temporal Graph

With the increasing popularity of artificial intelligence applications, artificial intelligence technology has begun to be applied in competitive sports. These applications have improved athletes’ competitive ability as well as the fitness of the general public. Human action recognition technology based on deep learning has gradually been applied to the analysis of the technical actions and tactics of competitive athletes. In this paper, a new graph convolution model is proposed. Delaunay’s partitioning algorithm was used to construct a new spatiotemporal topology that can effectively capture the structural information and spatiotemporal features of athletes’ technical actions. At the same time, an attention mechanism was integrated into the model, and different weight coefficients were assigned to the joints, which significantly improved the accuracy of technical action recognition. First, the model was compared with current state-of-the-art methods on the general datasets Kinect and NTU-RGB+D, where its performance improved slightly. Then, the performance of our algorithm was compared with spatial temporal graph convolutional networks (ST-GCN) on the karate technique action dataset. We found that the accuracy of our algorithm was significantly improved.

Feature fusion for human action recognition based on classical descriptors and 3D convolutional networks

2017 Eleventh International Conference on Sensing Technology (ICST) ◽

10.1109/icsenst.2017.8304460 ◽

2017 ◽

Author(s):

Yang Qin ◽

Lingfei Mo ◽

Benyi Xie

Keyword(s):

Action Recognition ◽

Feature Fusion ◽

Human Action Recognition ◽

Human Action ◽

Convolutional Networks

Comparison between Recurrent Networks and Temporal Convolutional Networks Approaches for Skeleton-Based Action Recognition

Sensors ◽

10.3390/s21062051 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2051

Author(s):

Mihai Nan ◽

Mihai Trăscău ◽

Adina Magda Florea ◽

Cezar Cătălin Iacob

Keyword(s):

Action Recognition ◽

State Of The Art ◽

Human Action Recognition ◽

Human Action ◽

Assistive Robotics ◽

Recurrent Networks ◽

Human Machine Interaction ◽

Convolutional Networks ◽

Machine Interaction ◽

Spatial And Temporal Characteristics

Action recognition plays an important role in various applications such as video monitoring, automatic video indexing, crowd analysis, human-machine interaction, smart homes and personal assistive robotics. In this paper, we propose improvements to some methods for human action recognition from videos that work with data represented as skeleton poses. These methods are based on the most widely used techniques for this problem: Graph Convolutional Networks (GCNs), Temporal Convolutional Networks (TCNs) and Recurrent Neural Networks (RNNs). Initially, the paper explores and compares different ways to extract the most relevant spatial and temporal characteristics from a sequence of frames describing an action. Based on this comparative analysis, we show how a TCN-type unit can be extended to also work on the characteristics extracted from the spatial domain. To validate our approach, we test it on a benchmark often used for human action recognition problems and show that our solution obtains results comparable to the state of the art, but with a significant increase in inference speed.
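
For readers unfamiliar with TCN-type units, the following is a minimal sketch of a residual temporal convolution over a skeleton sequence. It is not the unit proposed in the paper: the shared odd-length kernel, the zero padding, the ReLU placement, and the function names are assumptions made to keep the example short.

```python
from typing import List


def temporal_conv(seq: List[List[float]], kernel: List[float],
                  dilation: int = 1) -> List[List[float]]:
    """Convolve each channel along time with zero 'same' padding.

    `seq` has shape [T][C]; the same odd-length kernel is applied to every
    channel (an assumption; real TCNs learn per-channel filters).
    """
    half = (len(kernel) - 1) // 2
    steps = len(seq)
    out = []
    for t in range(steps):
        row = []
        for c in range(len(seq[0])):
            acc = 0.0
            for i, weight in enumerate(kernel):
                src = t + (i - half) * dilation
                if 0 <= src < steps:  # frames outside the clip count as zero
                    acc += weight * seq[src][c]
            row.append(acc)
        out.append(row)
    return out


def residual_tcn_unit(seq: List[List[float]], kernel: List[float],
                      dilation: int = 1) -> List[List[float]]:
    """Add the ReLU-activated convolution output back onto the input."""
    conv = temporal_conv(seq, kernel, dilation)
    return [[x + max(0.0, y) for x, y in zip(in_row, conv_row)]
            for in_row, conv_row in zip(seq, conv)]


# Three frames, one channel (for example, one joint coordinate over time).
frames = [[1.0], [2.0], [3.0]]
summed = temporal_conv(frames, [1.0, 1.0, 1.0])
residual = residual_tcn_unit(frames, [0.0, 1.0, 0.0])
```

Increasing `dilation` widens the temporal receptive field without adding kernel weights, which is one reason such units tend to be cheap at inference time.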

Human Action Recognition Using Factorized Spatio-Temporal Convolutional Networks

2015 IEEE International Conference on Computer Vision (ICCV) ◽

10.1109/iccv.2015.522 ◽

2015 ◽

Cited By ~ 202

Author(s):

Lin Sun ◽

Kui Jia ◽

Dit-Yan Yeung ◽

Bertram E. Shi

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Convolutional Networks ◽

Spatio Temporal

Temporal-scale Convolutional Networks for Human Action Recognition Based on Key-Frame Extraction

DEStech Transactions on Computer Science and Engineering ◽

10.12783/dtcse/ccnt2018/24746 ◽

2018 ◽

Author(s):

Zhao-qiang WEI ◽

Yong-qiang KONG ◽

Zhen-gang WEI ◽

Xiao-long ZHANG

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Temporal Scale ◽

Key Frame Extraction ◽

Key Frame ◽

Convolutional Networks

REGINA - Reasoning Graph Convolutional Networks in Human Action Recognition

IEEE Transactions on Information Forensics and Security ◽

10.1109/tifs.2021.3130437 ◽

2021 ◽

pp. 1-1

Author(s):

Bruno Degardin ◽

Vasco Lopes ◽

Hugo Proenca

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Convolutional Networks
