Overview of BioASQ 2020: The Eighth BioASQ Challenge on Large-Scale Biomedical Semantic Indexing and Question Answering

Author(s): Anastasios Nentidis, Anastasia Krithara, Konstantinos Bougiatiotis, Martin Krallinger, Carlos Rodriguez-Penagos, ...
Author(s): Anastasia Krithara, Anastasios Nentidis, Georgios Paliouras, Martin Krallinger, Antonio Miranda

2015, Vol 16 (1)
Author(s): George Tsatsaronis, Georgios Balikas, Prodromos Malakasiotis, Ioannis Partalas, Matthias Zschunke, ...

2021, pp. 239-263
Author(s): Anastasios Nentidis, Georgios Katsimpras, Eirini Vandorou, Anastasia Krithara, Luis Gasco, ...

Author(s): Martin Krallinger, Anastasia Krithara, Anastasios Nentidis, Georgios Paliouras, Marta Villegas

2016
Author(s): Eirini Papagiannopoulou, Yiannis Papanikolaou, Dimitris Dimitriadis, Sakis Lagopoulos, Grigorios Tsoumakas, ...

Author(s): Lianli Gao, Pengpeng Zeng, Jingkuan Song, Yuan-Fang Li, Wu Liu, ...

To date, visual question answering (VQA) (i.e., image QA and video QA) remains a holy grail of vision-and-language understanding, especially for video QA. Whereas image QA focuses primarily on understanding the associations between image region-level details and the corresponding questions, video QA requires a model to reason jointly over both the spatial and the long-range temporal structure of a video, as well as the text, to provide an accurate answer. In this paper, we specifically tackle video QA by proposing a Structured Two-stream Attention network (STA) that answers a free-form or open-ended natural-language question about the content of a given video. First, we infer rich long-range temporal structure in videos using our structured segment component and encode the text features. Then, our structured two-stream attention component simultaneously localizes the important visual instances, reduces the influence of background video, and focuses on the relevant text. Finally, the structured two-stream fusion component incorporates the different segments of the query- and video-aware context representations and infers the answer. Experiments on the large-scale video QA dataset TGIF-QA show that our proposed method significantly surpasses the best counterpart (i.e., with one representation for the video input) by 13.0%, 13.5%, 11.0% and 0.3 on the Action, Trans., FrameQA and Count tasks, respectively. It also outperforms the best competitor (i.e., with two representations) on the Action, Trans., and FrameQA tasks by 4.1%, 4.7%, and 5.1%.
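The abstract only sketches the two-stream attention idea at a high level. Below is a minimal, illustrative PyTorch sketch of a generic two-stream (video/text) attention block along those lines: question-guided attention over video segment features and video-guided attention over question words, fused into a joint representation. All names, dimensions, and the mean-pooling and fusion choices are assumptions for illustration, not the authors' released STA implementation.

```python
# Illustrative two-stream attention block for video QA, assuming pre-extracted
# per-segment video features and an encoded question. Hypothetical sketch,
# not the STA paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoStreamAttention(nn.Module):
    def __init__(self, video_dim=2048, text_dim=512, hidden_dim=512):
        super().__init__()
        self.video_proj = nn.Linear(video_dim, hidden_dim)
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        # One attention stream over video segments, one over question words.
        self.video_att = nn.Linear(hidden_dim, 1)
        self.text_att = nn.Linear(hidden_dim, 1)
        self.fusion = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, video_segments, question_words):
        # video_segments: (batch, num_segments, video_dim)
        # question_words: (batch, num_words, text_dim)
        v = torch.tanh(self.video_proj(video_segments))        # (B, S, H)
        q = torch.tanh(self.text_proj(question_words))         # (B, W, H)

        # Question-guided attention over video segments.
        q_global = q.mean(dim=1, keepdim=True)                 # (B, 1, H)
        v_scores = self.video_att(v * q_global).squeeze(-1)    # (B, S)
        v_weights = F.softmax(v_scores, dim=-1)
        v_context = (v_weights.unsqueeze(-1) * v).sum(dim=1)   # (B, H)

        # Video-guided attention over question words.
        v_global = v.mean(dim=1, keepdim=True)                 # (B, 1, H)
        q_scores = self.text_att(q * v_global).squeeze(-1)     # (B, W)
        q_weights = F.softmax(q_scores, dim=-1)
        q_context = (q_weights.unsqueeze(-1) * q).sum(dim=1)   # (B, H)

        # Fuse the two attended streams into a joint representation
        # that a downstream answer head would score.
        return torch.tanh(self.fusion(torch.cat([v_context, q_context], dim=-1)))


if __name__ == "__main__":
    model = TwoStreamAttention()
    video = torch.randn(2, 8, 2048)     # 8 segments of pooled frame features
    question = torch.randn(2, 12, 512)  # 12 encoded question tokens
    print(model(video, question).shape)  # torch.Size([2, 512])
```

In the full model described by the abstract, such a block would sit on top of the structured segment features and feed an answer-scoring head; the sketch is only meant to make the two-stream attention mechanism concrete.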


2008, Vol 7 (1), pp. 182-191
Author(s): Sebastian Klie, Lennart Martens, Juan Antonio Vizcaíno, Richard Côté, Phil Jones, ...
