Global Spatio-Temporal Attention for Action Recognition Based on 3D Human Skeleton Data

IEEE Access, 2020, Vol 8, pp. 88604-88616
Author(s): Yun Han, Sheng-Luen Chung, Qiang Xiao, Wei You Lin, Shun-Feng Su

2020, Vol 22 (11), pp. 2990-3001
Author(s): Jun Li, Xianglong Liu, Wenxuan Zhang, Mingyuan Zhang, Jingkuan Song, ...

Sensors, 2021, Vol 21 (3), pp. 1005
Author(s): Pau Climent-Pérez, Francisco Florez-Revuelta

The potential benefits of recognising activities of daily living from video for active and assisted living have yet to be fully tapped. These technologies can be used for behaviour understanding and for lifelogging, by caregivers and end users alike. The recent publication of realistic datasets for this purpose, such as the Toyota Smarthomes dataset, calls for pushing forward the efforts to improve action recognition. Using the separable spatio-temporal attention network proposed in the literature, this paper introduces a view-invariant normalisation of skeletal pose data and full activity crops for RGB data, which improve the baseline results by 9.5% (on the cross-subject experiments), outperforming state-of-the-art techniques in this field when using the original unmodified skeletal data in the dataset. Our code and data are available online.
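
The abstract above does not say how the view-invariant normalisation is computed, so the following is only a minimal sketch of one common approach, not the authors' method. It assumes that the vertical axis is y and that the skeleton can be centred on the hip midpoint and rotated about the vertical axis until the hip line lies along the x axis. The function name `normalise_view` and the joint indices are hypothetical.

```python
import math

def normalise_view(joints, left_hip, right_hip):
    """Translate and rotate one skeleton frame into a body-centred frame.

    joints: list of (x, y, z) tuples; y is ASSUMED to be the vertical axis.
    left_hip, right_hip: indices into joints (hypothetical joint layout).
    """
    lx, ly, lz = joints[left_hip]
    rx, ry, rz = joints[right_hip]
    # Centre the skeleton on the midpoint between the two hips.
    cx, cy, cz = (lx + rx) / 2, (ly + ry) / 2, (lz + rz) / 2
    # Angle of the hip line in the horizontal (x, z) plane.
    theta = math.atan2(rz - lz, rx - lx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = []
    for x, y, z in joints:
        x, y, z = x - cx, y - cy, z - cz
        # Rotate by -theta about the vertical axis so the hip line lies along +x.
        out.append((cos_t * x + sin_t * z, y, -sin_t * x + cos_t * z))
    return out

# Toy frame: two hips on a diagonal and one joint directly above their midpoint.
frame = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.5, 1.0, 0.5)]
normalised = normalise_view(frame, left_hip=0, right_hip=1)
```

After this transform, two recordings of the same pose taken from different horizontal camera angles map to the same coordinates, which is the property the term "view-invariant" usually refers to.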


2018, Vol 27 (7), pp. 3459-3471
Author(s): Sijie Song, Cuiling Lan, Junliang Xing, Wenjun Zeng, Jiaying Liu

2019, Vol 21 (2), pp. 416-428
Author(s): Dong Li, Ting Yao, Ling-Yu Duan, Tao Mei, Yong Rui

Author(s): Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, ...

The skeleton-based action recognition task is entangled with complex spatio-temporal variations of skeleton joints, and remains challenging for Recurrent Neural Networks (RNNs). In this work, we propose a temporal-then-spatial recalibration scheme to alleviate such complex variations, resulting in end-to-end Memory Attention Networks (MANs), which consist of a Temporal Attention Recalibration Module (TARM) and a Spatio-Temporal Convolution Module (STCM). Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence. The STCM treats the attention-calibrated skeleton joint sequences as images and leverages Convolutional Neural Networks (CNNs) to further model the spatial and temporal information of skeleton data. These two modules (TARM and STCM) seamlessly form a single network architecture that can be trained in an end-to-end fashion. MANs significantly boost the performance of skeleton-based action recognition and achieve the best results on four challenging benchmark datasets: NTU RGB+D, HDM05, SYSU-3D and UT-Kinect.
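
The idea of recalibrating frames with temporal attention inside a residual module can be illustrated with a small sketch. This is not the TARM itself: the abstract does not describe the attention learning network, so the per-frame scores are simply assumed to be given, and the residual form `frame + weight * frame` is an illustrative assumption. The names `softmax` and `recalibrate_frames` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of per-frame scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def recalibrate_frames(frames, scores):
    """Residual temporal recalibration (ASSUMED form): frame_t + a_t * frame_t.

    frames: list of per-frame feature vectors (lists of floats).
    scores: one unnormalised attention score per frame; in a real network
            these would come from a learned attention branch.
    """
    weights = softmax(scores)
    return [[v * (1.0 + w) for v in frame] for frame, w in zip(frames, weights)]

# Two frames with equal scores: each receives an attention weight of 0.5.
recalibrated = recalibrate_frames([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
```

The residual term keeps the original frame features intact, so frames with low attention are attenuated relative to the others rather than zeroed out.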

