Fine-grained Iterative Attention Network for Temporal Language Localization in Videos

Author(s):  
Xiaoye Qu ◽  
Pengwei Tang ◽  
Zhikang Zou ◽  
Yu Cheng ◽  
Jianfeng Dong ◽  
...  
2021 ◽  
pp. 620-631
Author(s):  
Xiang Li ◽  
Shenglan Liu ◽  
Yunheng Li ◽  
Hao Liu ◽  
Jinjing Zhao ◽  
...  

Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5279
Author(s):  
Yang Li ◽  
Huahu Xu ◽  
Junsheng Xiao

Language-based person search retrieves images of a target person from a natural language description and is a challenging fine-grained cross-modal retrieval task. A novel hybrid attention network is proposed for the task. The network has three components. First, a cubic attention mechanism for the person image combines cross-layer spatial attention and channel attention. It mines both important mid-level details and key high-level semantics to obtain a more discriminative fine-grained feature representation of the person image. Second, a text attention network for the language description is based on a bidirectional LSTM (BiLSTM) and a self-attention mechanism. It learns bidirectional semantic dependencies and captures the key words of a sentence, extracting the context information and key semantic features of the description more effectively and accurately. Third, a cross-modal attention mechanism and a joint loss function for cross-modal learning focus on the relevant parts of the text and image features. Together they exploit both cross-modal and intra-modal correlation and better address cross-modal heterogeneity. Extensive experiments have been conducted on the CUHK-PEDES dataset. Our approach obtains higher performance than state-of-the-art approaches, demonstrating the advantage of the approach we propose.
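
The attention step in the text branch can be illustrated with a minimal sketch of softmax attention pooling. This is not the authors' network: the per-word feature vectors stand in for hypothetical BiLSTM outputs, and the query vector, the function names, and all numbers are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(token_features, query):
    """Score each token against the query, then return (weights, weighted sum)."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in token_features]
    weights = softmax(scores)
    dim = len(token_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, token_features))
        for d in range(dim)
    ]
    return weights, pooled

# Hypothetical 2-dimensional features, one per word (assumed, not from the paper).
tokens = ["a", "woman", "in", "a", "red", "coat"]
features = [[0.1, 0.0], [0.9, 0.2], [0.0, 0.1], [0.1, 0.0], [0.3, 1.0], [0.4, 0.8]]
query = [1.0, 1.0]  # hypothetical learned query vector

weights, sentence_vector = attention_pool(features, query)
key_word = tokens[weights.index(max(weights))]
```

Words whose features align with the query receive larger weights, so the pooled sentence vector is dominated by the key words rather than by function words.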


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 196013-196023
Author(s):  
Chunlong Hu ◽  
Junbin Gao ◽  
Jianjun Chen ◽  
Dengbiao Jiang ◽  
Yucheng Shu

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Shaoqi Hou ◽  
Chunhui Liu ◽  
Kangning Yin ◽  
Yiyin Ding ◽  
Zhiguo Wang ◽  
...  

Person Re-identification (Re-ID) aims to match the same pedestrian across different times and places. Under cross-device conditions, different pedestrians may look highly similar, and matching on global pedestrian features alone often fails to give good results. To address this problem, we designed a Spatial Attention Network Guided by Attribute Label (SAN-GAL), a dual-trace network that performs both attribute classification and Re-ID. Unlike the previous approach of simply adding a binary attribute classification branch, our SAN-GAL consists of two connected steps. First, with attribute labels as guidance, we generate an Attribute Attention Heat map (AAH) through the Grad-CAM algorithm to accurately locate fine-grained attribute areas of pedestrians. Then, the Attribute Spatial Attention Module (ASAM) is constructed from the AAH, which is taken as prior knowledge and introduced into the Re-ID network to assist discrimination in the Re-ID task. In particular, our SAN-GAL network can integrate the local attribute information and global ID information of pedestrians without introducing additional attribute region annotation, which gives it good flexibility and adaptability. Test results on Market1501 and DukeMTMC-reID show that SAN-GAL achieves good results, reaching 85.8% Rank-1 accuracy on the DukeMTMC-reID dataset, which is competitive with most Re-ID algorithms.
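
The heat-map step can be sketched with the standard Grad-CAM formula: each channel weight is the spatial mean of the class-score gradient for that channel, and the heat map is the ReLU of the weighted sum of the activation maps. The abstract does not say how ASAM uses the heat map, so multiplying the features by the normalized map is an assumption here, as are the function names and the toy 2x2 values.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heat map from [K][H][W] activations and matching gradients."""
    K, H, W = len(activations), len(activations[0]), len(activations[0][0])
    # Channel weight: gradient averaged over all spatial positions.
    alphas = [sum(sum(row) for row in gradients[k]) / (H * W) for k in range(K)]
    # ReLU of the weighted sum of activation maps.
    return [
        [max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(K)))
         for j in range(W)]
        for i in range(H)
    ]

def normalize(heat):
    """Scale the heat map so its peak is 1 (left unchanged if all zero)."""
    peak = max(max(row) for row in heat)
    if peak == 0:
        return [row[:] for row in heat]
    return [[v / peak for v in row] for row in heat]

def apply_spatial_attention(features, attention):
    """Assumed use of the map: reweight every channel position by position."""
    return [
        [[v * a for v, a in zip(f_row, a_row)] for f_row, a_row in zip(chan, attention)]
        for chan in features
    ]

# Toy values: 2 channels on a 2x2 grid (assumed, not from the paper).
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]

heat = normalize(grad_cam(activations, gradients))
attended = apply_spatial_attention(activations, heat)
```

In this toy case the second channel has a negative weight, so positions where only that channel fires are suppressed to zero and the attended features keep only the regions that support the attribute score.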


2019 ◽  
Vol 11 (21) ◽  
pp. 2584 ◽  
Author(s):  
Yuan He ◽  
Xinyu Li ◽  
Xiaojun Jing

Short-range radar has become one of the latest sensor technologies for the Internet of Things (IoT), and it plays an increasingly vital role in IoT applications. As an essential task for various smart-sensing applications, radar-based human activity recognition and person identification have received growing attention because of radar's robustness to the environment and low power consumption. Activity recognition and person identification are generally treated as separate problems. However, designing a separate network for each task incurs high computational complexity and wastes resources to some extent. Furthermore, the two tasks are correlated. In this work, we propose a multiscale residual attention network (MRA-Net) for joint activity recognition and person identification with radar micro-Doppler signatures. A fine-grained loss weight learning (FLWL) mechanism is presented for constructing the multitask loss used to optimize MRA-Net. In addition, we construct a new radar micro-Doppler dataset with dual labels of activity and identity. With the proposed model trained on this dataset, we demonstrate that our method achieves state-of-the-art performance in both radar-based activity recognition and person identification. The impact of the FLWL mechanism was further investigated, and ablation studies of the efficacy of each component in MRA-Net were also conducted.
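
A multitask loss of the kind mentioned above is commonly written as a weighted sum of per-task losses. The sketch below shows only that generic form for two classification heads. How FLWL actually selects the weights is not stated in the abstract, so the weights, probabilities, and function names here are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under predicted probabilities."""
    return -math.log(probs[label])

def multitask_loss(activity_probs, activity_label,
                   identity_probs, identity_label,
                   w_activity, w_identity):
    """Weighted sum of the two task losses (weights are assumed inputs)."""
    loss_activity = cross_entropy(activity_probs, activity_label)
    loss_identity = cross_entropy(identity_probs, identity_label)
    return w_activity * loss_activity + w_identity * loss_identity

# Hypothetical predictions for one sample from the two heads.
total = multitask_loss(
    activity_probs=[0.5, 0.5], activity_label=0,
    identity_probs=[0.5, 0.5], identity_label=1,
    w_activity=1.0, w_identity=0.5,
)
```

Changing the two weights shifts how strongly each task drives the shared network during training, which is the quantity a weight-learning scheme would tune.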

