Temporal Attention and Consistency Measuring for Video Question Answering

Author(s): Lingyu Zhang, Richard J. Radke

Author(s): Zhou Zhao, Qifan Yang, Deng Cai, Xiaofei He, Yueting Zhuang

Open-ended video question answering is a challenging problem in visual information retrieval, in which a natural-language answer is automatically generated from the referenced video content according to the question. However, existing visual question answering work focuses on static images and may not transfer effectively to video question answering because of the temporal dynamics of video content. In this paper, we consider the problem of open-ended video question answering from the viewpoint of a spatio-temporal attentional encoder-decoder learning framework. We propose a hierarchical spatio-temporal attention network that learns a joint representation of the dynamic video content conditioned on the given question. We then develop an encoder-decoder learning method with reasoning recurrent neural networks for open-ended video question answering. We construct a large-scale video question answering dataset, and extensive experiments demonstrate the effectiveness of our method.
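The abstract describes question-conditioned spatio-temporal attention over video frames. The following is a minimal PyTorch sketch of that general idea, not the authors' released implementation: all names and dimensions (e.g., `SpatioTemporalAttention`, `frame_dim`, `question_dim`, `num_regions`) are illustrative assumptions, and the reasoning recurrent decoder is omitted.

```python
# Hypothetical sketch of question-conditioned spatio-temporal attention.
# Spatial attention pools regions within each frame; temporal attention
# then pools frames into a single question-aware video representation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatioTemporalAttention(nn.Module):
    def __init__(self, frame_dim=2048, question_dim=512, hidden_dim=512):
        super().__init__()
        # Spatial attention: score each region of each frame against the question.
        self.spatial_proj = nn.Linear(frame_dim + question_dim, hidden_dim)
        self.spatial_score = nn.Linear(hidden_dim, 1)
        # Temporal attention: score each spatially pooled frame against the question.
        self.temporal_proj = nn.Linear(frame_dim + question_dim, hidden_dim)
        self.temporal_score = nn.Linear(hidden_dim, 1)

    def forward(self, video_feats, question_vec):
        # video_feats: (batch, num_frames, num_regions, frame_dim)
        # question_vec: (batch, question_dim)
        B, T, R, D = video_feats.shape

        # Spatial attention over regions within each frame.
        q_spatial = question_vec[:, None, None, :].expand(B, T, R, -1)
        spatial_logits = self.spatial_score(
            torch.tanh(self.spatial_proj(torch.cat([video_feats, q_spatial], dim=-1)))
        )                                                    # (B, T, R, 1)
        spatial_attn = F.softmax(spatial_logits, dim=2)
        frame_feats = (spatial_attn * video_feats).sum(dim=2)  # (B, T, D)

        # Temporal attention over the pooled frame features.
        q_temporal = question_vec[:, None, :].expand(B, T, -1)
        temporal_logits = self.temporal_score(
            torch.tanh(self.temporal_proj(torch.cat([frame_feats, q_temporal], dim=-1)))
        )                                                    # (B, T, 1)
        temporal_attn = F.softmax(temporal_logits, dim=1)
        video_vec = (temporal_attn * frame_feats).sum(dim=1)  # (B, D)
        return video_vec  # joint question-conditioned video representation
```

In a full encoder-decoder setup of the kind described, this joint representation would be fed to a recurrent decoder that generates the open-ended answer token by token.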


2019
Author(s): Hongyin Luo, Mitra Mohtarami, James Glass, Karthik Krishnamurthy, Brigitte Richardson

2018, Vol 314, pp. 386-393
Author(s): Wenqing Chu, Hongyang Xue, Zhou Zhao, Deng Cai, Chengwei Yao
