Pipelining Localized Semantic Features for Fine-Grained Action Recognition

Author(s):  
Yang Zhou ◽  
Bingbing Ni ◽  
Shuicheng Yan ◽  
Pierre Moulin ◽  
Qi Tian
2021 ◽  
pp. 620-631
Author(s):  
Xiang Li ◽  
Shenglan Liu ◽  
Yunheng Li ◽  
Hao Liu ◽  
Jinjing Zhao ◽  
...  

Author(s):  
Weichun Liu ◽  
Xiaoan Tang ◽  
Chenglin Zhao

Recently, deep trackers based on Siamese networks have been enjoying increasing popularity in the tracking community. Generally, these trackers learn a high-level semantic embedding space for feature representation but lose low-level fine-grained details. Meanwhile, the learned high-level semantic features are not updated during online tracking, which results in tracking drift in the presence of target appearance variation and similar distractors. In this paper, we present a novel end-to-end trainable Convolutional Neural Network (CNN) based on the Siamese network for distractor-aware tracking. It enhances the target appearance representation in both the offline training stage and the online tracking stage. In the offline training stage, the network learns both low-level fine-grained details and high-level coarse-grained semantics simultaneously in a multi-task learning framework. The low-level features, with their better resolution, complement the semantic features and can distinguish the foreground target from background distractors. In the online stage, the learned low-level features are fed into a correlation filter layer and updated in an interpolated manner to adaptively encode target appearance variation. The learned high-level features are fed into a cross-correlation layer without online update. The proposed tracker therefore benefits from both the adaptability of the fine-grained correlation filter and the generalization capability of the semantic embedding. Extensive experiments are conducted on the public OTB100 and UAV123 benchmark datasets. Our tracker achieves state-of-the-art performance while running at a real-time frame rate.
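
The two-branch online stage can be sketched in a few lines. The following PyTorch snippet is a minimal illustration, not the authors' implementation: the semantic branch scores the search region by cross-correlation with a fixed template, while a toy correlation-filter branch is updated by linear interpolation. The update rate lr and the spatial-domain filter are assumptions; practical correlation filters are usually solved in the Fourier domain.

import torch.nn.functional as F

def semantic_response(template_feat, search_feat):
    # Cross-correlation: use the fixed high-level template embedding as a
    # convolution kernel over the search-region features.
    # template_feat: (1, C, h, w), search_feat: (1, C, H, W)
    # -> response map of shape (1, 1, H-h+1, W-w+1)
    return F.conv2d(search_feat, template_feat)

class InterpolatedCorrelationFilter:
    # Toy fine-grained branch: a filter re-estimated each frame is blended
    # with the previous one, w_t = (1 - lr) * w_{t-1} + lr * w_new, so the
    # model adapts to appearance change without forgetting past frames.
    def __init__(self, lr=0.01):
        self.lr = lr          # hypothetical interpolation rate
        self.filter = None

    def update(self, new_filter):
        if self.filter is None:
            self.filter = new_filter
        else:
            self.filter = (1 - self.lr) * self.filter + self.lr * new_filter

    def respond(self, search_feat):
        # Score the search region with the adapted fine-grained filter.
        return F.conv2d(search_feat, self.filter)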


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5279
Author(s):  
Yang Li ◽  
Huahu Xu ◽  
Junsheng Xiao

Language-based person search retrieves images of a target person from a natural language description and is a challenging fine-grained cross-modal retrieval task. A novel hybrid attention network is proposed for this task. The network comprises three components. First, a cubic attention mechanism for the person image combines cross-layer spatial attention with channel attention; it fully exploits both important mid-level details and key high-level semantics to obtain a more discriminative fine-grained feature representation of the person image. Second, a text attention network for the language description is based on a bidirectional LSTM (BiLSTM) and a self-attention mechanism; it learns bidirectional semantic dependencies and captures the key words of sentences, extracting the contextual information and key semantic features of the description more effectively and accurately. Third, a cross-modal attention mechanism and a joint loss function for cross-modal learning focus attention on the mutually relevant parts of the text and image features; they better exploit both cross-modal and intra-modal correlations and mitigate the problem of cross-modal heterogeneity. Extensive experiments have been conducted on the CUHK-PEDES dataset. Our approach outperforms state-of-the-art methods, demonstrating its advantage.
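
As a rough illustration of the text branch, the sketch below combines a BiLSTM with single-head additive self-attention to weight key words. The embedding and hidden sizes are assumptions, and the paper's exact attention formulation may differ.

import torch
import torch.nn as nn

class TextAttentionEncoder(nn.Module):
    # BiLSTM + self-attention text branch (illustrative sketch only).
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # One additive attention score per token position.
        self.attn = nn.Linear(2 * hidden_dim, 1)

    def forward(self, tokens):                        # tokens: (B, T) int64
        h, _ = self.bilstm(self.embed(tokens))        # (B, T, 2*hidden_dim)
        weights = torch.softmax(self.attn(h), dim=1)  # (B, T, 1) key-word weights
        return (weights * h).sum(dim=1)               # (B, 2*hidden_dim) sentence feature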


Author(s):  
Dima Damen ◽  
Hazel Doughty ◽  
Giovanni Maria Farinella ◽  
Antonino Furnari ◽  
Evangelos Kazakos ◽  
...  

This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, and 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments, using head-mounted cameras. Compared to its previous version (Damen et al., Scaling Egocentric Vision, ECCV 2018), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotations of fine-grained actions (+128% more action segments). This collection enables new challenges such as action detection and evaluating the "test of time", i.e. whether models trained on data collected in 2018 can generalise to new footage collected two years later. The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), and unsupervised domain adaptation for action recognition. For each challenge, we define the task and provide baselines and evaluation metrics.
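
As an illustration of the "actions per minute" density statistic, the sketch below computes it from a simplified annotation table. The column names (video_id, start_sec, stop_sec) are hypothetical; the released EPIC-KITCHENS-100 annotations use a richer schema with verb/noun classes and timestamp strings.

import csv
from collections import defaultdict

def actions_per_minute(annotation_csv):
    # Group action segments by video, assuming one row per segment with
    # hypothetical columns video_id, start_sec, stop_sec.
    segments = defaultdict(list)
    with open(annotation_csv, newline="") as f:
        for row in csv.DictReader(f):
            segments[row["video_id"]].append(
                (float(row["start_sec"]), float(row["stop_sec"])))
    density = {}
    for vid, segs in segments.items():
        # Approximate video length by the last segment's stop time.
        duration_min = max(stop for _, stop in segs) / 60.0
        density[vid] = len(segs) / duration_min
    return density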


Author(s):  
Xu Duan ◽  
Jingzheng Wu ◽  
Shouling Ji ◽  
Zhiqing Rui ◽  
Tianyue Luo ◽  
...  

With the explosive development of information technology, vulnerabilities have become one of the major threats to computer security. Most vulnerabilities with similar patterns can be detected effectively by static analysis methods. However, some vulnerable and non-vulnerable code is hardly distinguishable, resulting in low detection accuracy. In this paper, we define the accurate identification of vulnerabilities in similar code as a fine-grained vulnerability detection problem, and we propose VulSniper, which is designed to detect fine-grained vulnerabilities more effectively. In VulSniper, an attention mechanism is used to capture the critical features of vulnerabilities. In particular, we use bottom-up and top-down structures to learn the attention weights of different areas of the program. Moreover, in order to fully extract the semantic features of the program, we generate its code property graph, design a 144-dimensional vector to describe the relations between nodes, and finally encode the program as a feature tensor. VulSniper achieves F1-scores of 80.6% and 73.3% on two benchmark datasets, the SARD Buffer Error dataset and the SARD Resource Management Error dataset respectively, which are significantly higher than those of state-of-the-art methods.
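
The attention idea can be illustrated with a generic attention-pooling layer over node features of a code property graph. This is a sketch under assumptions, not VulSniper's bottom-up/top-down architecture; the 144-dimensional relation encoding is only echoed in the feature size.

import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    # Attention-weighted pooling: learn a scalar score per graph node so
    # that vulnerability-critical nodes dominate the program-level feature.
    def __init__(self, feat_dim=144):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, node_feats):                        # (N, feat_dim) for one program
        w = torch.softmax(self.score(node_feats), dim=0)  # (N, 1) attention over nodes
        return (w * node_feats).sum(dim=0)                # (feat_dim,) program feature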


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 103629-103638 ◽  
Author(s):  
Jian Xiong ◽  
Liguo Lu ◽  
Hengbing Wang ◽  
Jie Yang ◽  
Guan Gui

Author(s):  
Yaparla Ganesh ◽  
Allaparthi Sri Teja ◽  
Sai Krishna Munnangi ◽  
Garimella Rama Murthy
