Context-Aware Attention Network for Image-Text Retrieval

Author(s): Qi Zhang, Zhen Lei, Zhaoxiang Zhang, Stan Z. Li ◽ 2019 ◽ Vol 31 (12) ◽ pp. 9295-9305

Author(s): Jiaxu Leng, Ying Liu, Shang Chen ◽ 2021

Author(s): Jie Cao, Shengsheng Qian, Huaiwen Zhang, Quan Fang, Changsheng Xu ◽ 2021 ◽ Vol 10 (5) ◽ pp. 336

Author(s): Jian Yu, Meng Zhou, Xin Wang, Guoliang Pu, Chengqi Cheng, ...

Forecasting the motion of surrounding vehicles is necessary for an autonomous driving system operating in complex traffic: trajectory prediction gives the vehicle foresight and enables more sensible decisions. However, traditional models treat trajectory prediction as a simple sequence prediction task; ignoring inter-vehicle interaction and environmental influence degrades their performance on real-world datasets. To address this issue, we propose a novel Dynamic and Static Context-aware Attention Network (DSCAN). DSCAN uses an attention mechanism to dynamically decide which surrounding vehicles are more important at each moment. We also equip DSCAN with a constraint network that incorporates static environment information. We conducted a series of experiments on a real-world dataset, and the results demonstrate the effectiveness of our model. Moreover, the study shows that both the attention mechanism and the static constraints improve the prediction results.
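The abstract does not specify DSCAN's exact attention form, but the idea of "dynamically deciding which surrounding vehicles are more important" can be illustrated with standard scaled dot-product attention over neighbor encodings. The sketch below is a minimal, hypothetical illustration (the function name and shapes are assumptions, not the authors' implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def neighbor_attention(target_state, neighbor_states):
    """Score each surrounding vehicle's encoded state against the target
    vehicle's state and return an attention-weighted context vector.

    target_state:    (d,)   encoding of the ego/target vehicle
    neighbor_states: (n, d) encodings of n surrounding vehicles
    """
    d = target_state.shape[0]
    # Similarity of each neighbor to the target, scaled as in dot-product attention.
    scores = neighbor_states @ target_state / np.sqrt(d)   # (n,)
    weights = softmax(scores)                              # importance of each neighbor
    context = weights @ neighbor_states                    # (d,) weighted summary
    return context, weights
```

A neighbor whose encoded state is most similar to the target's receives the largest weight, so the context vector is dominated by the vehicles that matter most at the current timestep.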


2021 ◽  
Author(s): Xiaoshuai Hao, Yucan Zhou, Dayan Wu, Wanqian Zhang, Bo Li, ... ◽ 2020 ◽ pp. 1-1

Author(s): Yaxiong Wang, Hao Yang, Xiuxiu Bai, Xueming Qian, Lin Ma, ... ◽ 2020 ◽ Vol 2020 ◽ pp. 1-10

Author(s): Xiaodong Liu, Miao Wang

Recognition of human emotion from facial expressions is affected by distortions in picture quality and facial pose, which traditional video emotion recognition methods often ignore. Context information, meanwhile, can provide extra clues of varying strength that further improve recognition accuracy. In this paper, we first build a video dataset with seven categories of human emotion, named Human Emotion In the Video (HEIV). With the HEIV dataset, we train a context-aware attention network (CAAN) to recognize human emotion. The network consists of two subnetworks that process face and context information, respectively. Features from facial expression and context clues are fused to represent the emotion of each video frame and then passed through an attention network that generates emotion scores. Finally, the features of all frames are aggregated according to their emotion scores. Experimental results show that our proposed method is effective on the HEIV dataset.
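The frame-aggregation step described above (score each fused frame feature, then pool the frames by those scores) can be sketched with a simple softmax-weighted sum. This is an assumption-laden illustration, not CAAN's actual architecture; the function name and the linear scoring vector are hypothetical:

```python
import numpy as np

def aggregate_frames(frame_features, score_vector):
    """Pool per-frame fused (face + context) features into one clip-level
    emotion representation, weighting each frame by its attention score.

    frame_features: (T, d) fused feature per video frame
    score_vector:   (d,)   hypothetical learned scoring weights
    """
    scores = frame_features @ score_vector        # (T,) raw emotion score per frame
    e = np.exp(scores - scores.max())             # stable softmax over frames
    weights = e / e.sum()
    return weights @ frame_features               # (d,) weighted clip representation
```

Frames whose fused features score highly dominate the clip-level representation, so transient low-quality or ambiguous frames contribute less to the final emotion prediction.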

