Modality Distillation with Multiple Stream Networks for Action Recognition

Author(s):  
Nuno C. Garcia ◽  
Pietro Morerio ◽  
Vittorio Murino
2019 ◽  
Vol 28 (4) ◽  
pp. 1773-1782 ◽  
Author(s):  
Wenbing Huang ◽  
Lijie Fan ◽  
Mehrtash Harandi ◽  
Lin Ma ◽  
Huaping Liu ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8309
Author(s):  
Inwoong Lee ◽  
Doyoung Kim ◽  
Dongyoon Wee ◽  
Sanghoon Lee

In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have used two-stream networks that combine appearance and motion features, but most of these approaches focus on clip-level video action recognition. In contrast to traditional methods, which generally use entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent instance-level features using human boxes and keypoints, and our action region features serve as inputs to the temporal action head network, which makes our framework more discriminative. We also propose novel temporal action head networks consisting of various modules that capture various temporal dynamics. In the experiments, the proposed models achieve performance comparable to state-of-the-art approaches on two challenging datasets. Furthermore, we evaluate the proposed features and networks to verify their effectiveness. Finally, we analyze the confusion matrix and visualize the recognized actions at the human instance level when several people are present.


Author(s):  
Zhenbing Liu ◽  
Zeya Li ◽  
Ming Zong ◽  
Wanting Ji ◽  
Ruili Wang ◽  
...  

2020 ◽  
Vol 32 (18) ◽  
pp. 14593-14602 ◽  
Author(s):  
Zhenbing Liu ◽  
Zeya Li ◽  
Ruili Wang ◽  
Ming Zong ◽  
Wanting Ji

Author(s):  
Yaqing Hou ◽  
Hua Yu ◽  
Dongsheng Zhou ◽  
Pengfei Wang ◽  
Hongwei Ge ◽  
...  

Abstract In the study of human action recognition, two-stream networks have made excellent progress recently. However, there remain challenges in distinguishing similar human actions in videos. This paper proposes a novel local-aware spatio-temporal attention network with multi-stage feature fusion based on compact bilinear pooling for human action recognition. Taking two-stream networks as the backbone, the spatial network first employs multiple spatial transformer networks in parallel to locate the discriminative regions related to human actions. Then, we fuse the local and global features to enhance the human action representation. Furthermore, the output of the spatial network and the temporal information are fused at a particular layer to learn the pixel-wise correspondences. After that, we bring together the three outputs to generate the global descriptors of human actions. To verify the efficacy of the proposed approach, comparison experiments are conducted with the traditional hand-engineered IDT algorithms, classical machine learning methods (e.g., SVM), and state-of-the-art deep learning methods (e.g., spatio-temporal multiplier networks). According to the results, our approach obtains the best performance among existing works, with accuracies of 95.3% and 72.9% on UCF101 and HMDB51, respectively. The experimental results thus demonstrate the superiority and significance of the proposed architecture for the task of human action recognition.


2018 ◽  
Vol 30 (11) ◽  
pp. 2074
Author(s):  
Lifei Song ◽  
Liguo Weng ◽  
Lingfeng Wang ◽  
Min Xia

1990 ◽  
Vol 26 (9) ◽  
pp. 2243-2244 ◽  
Author(s):  
David G. Tarboton

2013 ◽  
Vol 18 (2-3) ◽  
pp. 49-60 ◽  
Author(s):  
Damian Dudziński ◽  
Tomasz Kryjak ◽  
Zbigniew Mikrut

Abstract In this paper, a human action recognition algorithm is described that uses background generation with shadow elimination, silhouette description based on simple geometrical features, and a finite state machine for recognizing particular actions. The performed tests indicate that this approach obtains an 81% correct recognition rate while allowing real-time image processing of a 360 × 288 video stream.


2018 ◽  
Vol 6 (10) ◽  
pp. 323-328
Author(s):  
K. Kiruba ◽  
D. Shiloah Elizabeth ◽  
C Sunil Retmin Raj
