Improved Action Recognition with Separable Spatio-Temporal Attention Using Alternative Skeletal and Video Pre-Processing

Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 1005
Author(s):  
Pau Climent-Pérez ◽  
Francisco Florez-Revuelta

The potential benefits of recognising activities of daily living from video for active and assisted living have yet to be fully tapped. These technologies can be used for behaviour understanding and lifelogging, for caregivers and end users alike. The recent publication of realistic datasets for this purpose, such as the Toyota Smarthomes dataset, calls for pushing forward the efforts to improve action recognition. Using the separable spatio-temporal attention network proposed in the literature, this paper introduces a view-invariant normalisation of skeletal pose data and full activity crops for RGB data, which improve the baseline results by 9.5% (on the cross-subject experiments), outperforming state-of-the-art techniques in this field when using the original unmodified skeletal data in the dataset. Our code and data are available online.
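The abstract does not spell out how the view-invariant normalisation is computed. As a rough illustration of the general idea only, and not the authors' method, the sketch below centres a single 3D skeleton on the midpoint of the hips and rotates it about the vertical axis so that the hip line points along +x. The function name `normalise_view`, the joint indices, and the assumption that y is the vertical axis are all hypothetical.

```python
import numpy as np

def normalise_view(joints, left_hip=0, right_hip=1):
    """Centre a skeleton on the hips and remove its rotation about the vertical axis.

    joints: (J, 3) array of x, y, z coordinates; y is ASSUMED to be vertical.
    left_hip / right_hip: joint indices (hypothetical, dataset dependent).
    """
    joints = np.asarray(joints, dtype=float)

    # Translate so that the hip midpoint sits at the origin.
    centre = (joints[left_hip] + joints[right_hip]) / 2.0
    centred = joints - centre

    # Angle of the hip line within the horizontal (x, z) plane.
    hip_vec = centred[right_hip] - centred[left_hip]
    angle = np.arctan2(hip_vec[2], hip_vec[0])
    c, s = np.cos(angle), np.sin(angle)

    # Rotation about y that maps the horizontal part of hip_vec onto +x.
    rot = np.array([[c, 0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s, 0.0, c]])
    return centred @ rot.T
```

Under these assumptions, two recordings of the same pose that differ only by a translation and a rotation about the vertical axis map to the same normalised coordinates.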

Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 423 ◽  
Author(s):  
Mihai Trăscău ◽  
Mihai Nan ◽  
Adina Florea

Robust action recognition methods are a cornerstone of Ambient Assisted Living (AAL) systems that employ optical devices. Using 3D skeleton joints extracted from depth images taken with time-of-flight (ToF) cameras has been a popular solution for accomplishing these tasks. Though it seemingly carries less information than its RGB or depth image counterparts, the skeletal representation has proven to be effective in the task of action recognition. This paper explores different interpretations of both the spatial and the temporal dimensions of a sequence of frames describing an action. We show that rather intuitive approaches, often borrowed from other computer vision tasks, can improve accuracy. We report results based on these modifications and propose an architecture that uses temporal convolutions with results comparable to the state of the art.


2021 ◽  
pp. 620-631
Author(s):  
Xiang Li ◽  
Shenglan Liu ◽  
Yunheng Li ◽  
Hao Liu ◽  
Jinjing Zhao ◽  
...  

2020 ◽  
Vol 10 (15) ◽  
pp. 5326
Author(s):  
Xiaolei Diao ◽  
Xiaoqiang Li ◽  
Chen Huang

The same action can take a different amount of time in different cases. This difference affects the accuracy of action recognition to a certain extent. We propose an end-to-end deep neural network called “Multi-Term Attention Networks” (MTANs), which addresses this problem by extracting temporal features at different time scales. The network consists of a Multi-Term Attention Recurrent Neural Network (MTA-RNN) and a Spatio-Temporal Convolutional Neural Network (ST-CNN). In MTA-RNN, a method for fusing multi-term temporal features is proposed to extract the temporal dependence at different time scales, and the weighted fused temporal feature is recalibrated by the attention mechanism. An ablation study shows that this network has powerful spatio-temporal dynamic modeling capabilities for actions with different time scales. We perform extensive experiments on four challenging benchmark datasets: the NTU RGB+D dataset, the UT-Kinect dataset, the Northwestern-UCLA dataset, and the UWA3DII dataset. Our method achieves better results than the state-of-the-art benchmarks, which demonstrates the effectiveness of MTANs.
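The idea of weighting temporal features taken at several time scales can be illustrated with a much simpler stand-in than the MTA-RNN itself. The sketch below is not the network from the paper: it replaces the recurrent layers with plain averages over the most recent frames, and the function name `multi_scale_fusion`, the window lengths, and the scoring vector are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def multi_scale_fusion(seq, windows=(2, 4, 8), score_vec=None):
    """Fuse short-, medium- and long-term summaries of a feature sequence.

    seq: (T, D) array of per-frame features.
    windows: number of most recent frames averaged per time scale (assumed values).
    score_vec: (D,) vector used to score each scale; a learned parameter in a
               real model, here a fixed placeholder.
    """
    seq = np.asarray(seq, dtype=float)
    n_frames, n_dims = seq.shape

    # One summary feature per time scale: mean of the last w frames.
    feats = np.stack([seq[-min(w, n_frames):].mean(axis=0) for w in windows])

    if score_vec is None:
        score_vec = np.ones(n_dims) / n_dims

    # Attention weights over the time scales, then a weighted sum.
    weights = softmax(feats @ score_vec)
    return weights @ feats, weights
```

The returned weights sum to one, so the fused feature is a convex combination of the per-scale summaries; in a trained model the scores would come from learned parameters rather than a fixed vector.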


Sensors ◽  
2016 ◽  
Vol 16 (2) ◽  
pp. 184 ◽  
Author(s):  
Ivan Pires ◽  
Nuno Garcia ◽  
Nuno Pombo ◽  
Francisco Flórez-Revuelta

This paper reviews the state of the art in sensor fusion techniques applied to the sensors embedded in mobile devices, as a means of identifying the mobile device user’s daily activities. Sensor data fusion techniques are used to consolidate the data collected from several sensors, increasing the reliability of the algorithms that identify the different activities. However, mobile devices have several constraints, e.g., low memory, low battery life and low processing power, and some data fusion techniques are not suited to this scenario. The main purpose of this paper is to present an overview of the state of the art and to identify examples of sensor data fusion techniques that can be applied to the sensors available in mobile devices in order to identify activities of daily living (ADLs).


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 88604-88616 ◽  
Author(s):  
Yun Han ◽  
Sheng-Luen Chung ◽  
Qiang Xiao ◽  
Wei You Lin ◽  
Shun-Feng Su
