Unified Adaptive Relevance Distinguishable Attention Network for Image-Text Matching

Image-text matching has recently attracted considerable attention in the computer vision field. The key to this cross-domain problem is accurately measuring the similarity between visual and textual content, which demands a fine-grained understanding of both modalities. In this paper, we propose a novel position focused attention network (PFAN) to investigate the relation between the visual and the textual views. We integrate object position clues to enhance visual-text joint-embedding learning. We first split each image into blocks, from which we infer the relative position of each region in the image. An attention mechanism then models the relations between image regions and blocks and generates a position feature, which is used to enhance the region representation and to model a more reliable relationship between the image and the sentence. Experiments on the popular Flickr30K and MS-COCO datasets show the effectiveness of the proposed method. Beyond these public datasets, we also conduct experiments on a practical news dataset that we collected (Tencent-News) to validate the practical value of the proposed method. As far as we know, this is the first attempt to test performance in a practical application. Our method achieves state-of-the-art performance on all three datasets.
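The block-based position idea in this abstract can be illustrated with a minimal sketch. It is not the authors' PFAN implementation: the function names, the `top_l` parameter, the use of overlap area as the attention score, and the toy block embeddings are all assumptions made for illustration. The sketch splits an image into a k x k grid, measures how much a region's bounding box overlaps each block, and forms a position feature as an attention-weighted sum of block embeddings.

```python
import math


def block_overlaps(box, width, height, k):
    """Overlap area between `box` and each cell of a k x k grid (row-major).

    `box` is (x1, y1, x2, y2) in pixels. Hypothetical helper for illustration.
    """
    x1, y1, x2, y2 = box
    bw, bh = width / k, height / k
    areas = []
    for row in range(k):
        for col in range(k):
            cx1, cy1 = col * bw, row * bh
            cx2, cy2 = cx1 + bw, cy1 + bh
            ow = max(0.0, min(x2, cx2) - max(x1, cx1))
            oh = max(0.0, min(y2, cy2) - max(y1, cy1))
            areas.append(ow * oh)
    return areas


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_feature(box, width, height, k, block_embeddings, top_l=3):
    """Attention-weighted sum of the embeddings of the top_l overlapping blocks.

    Assumption: the overlap fraction stands in for a learned attention score.
    """
    areas = block_overlaps(box, width, height, k)
    box_area = (box[2] - box[0]) * (box[3] - box[1])
    if box_area <= 0:
        raise ValueError("box must have positive area")
    # Indices of the blocks that overlap the region most.
    top = sorted(range(len(areas)), key=lambda i: areas[i], reverse=True)[:top_l]
    weights = softmax([areas[i] / box_area for i in top])
    dim = len(block_embeddings[0])
    feature = [0.0] * dim
    for w, i in zip(weights, top):
        for d in range(dim):
            feature[d] += w * block_embeddings[i][d]
    return top, weights, feature


if __name__ == "__main__":
    # Toy 2-D embeddings for the four blocks of a 2 x 2 grid (made-up values).
    embeddings = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
    print(position_feature((25, 0, 75, 50), 100, 100, 2, embeddings, top_l=2))
```

In a trained model the block embeddings and attention scores would be learned jointly with the region features; here they are fixed so that the data flow is easy to follow.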


Dual Semantic Relationship Attention Network for Image-Text Matching

2020 International Joint Conference on Neural Networks (IJCNN) ◽ 10.1109/ijcnn48605.2020.9206782 ◽ 2020

Author(s): Keyu Wen ◽ Xiaodong Gu

Keyword(s): Semantic Relationship ◽ Attention Network ◽ Text Matching


Multi-Level Visual-Semantic Alignments with Relation-Wise Dual Attention Network for Image and Text Matching

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2019/111 ◽ 2019 ◽ Cited By ~ 3

Author(s): Zhibin Hu ◽ Yongsheng Luo ◽ Jiong Lin ◽ Yan Yan ◽ Jian Chen

Keyword(s): Semantic Network ◽ Attention Network ◽ Main Challenge ◽ Concrete Objects ◽ Benchmark Datasets ◽ Dual Pathway ◽ Local Correlations ◽ Complete Set ◽ Multi Level ◽ Text Matching

Image-text matching is central to visual-semantic cross-modal retrieval and has attracted extensive attention recently. Previous studies have been devoted to finding the latent correspondence between image regions and words, e.g., connecting key words to specific regions of salient objects. However, existing methods usually handle only concrete objects rather than abstract ones, e.g., a description of some action, even though abstract objects are ubiquitous in real-world description texts. The main challenge in dealing with abstract objects is that, unlike their concrete counterparts, there are no explicit connections between them. One therefore has to find the implicit and intrinsic connections between them instead. In this paper, we propose a relation-wise dual attention network (RDAN) for image-text matching. Specifically, we maintain an over-complete set that contains pairs of regions and words. Building on this set, we encode the local correlations and the global dependencies between regions and words by training a visual-semantic network. A dual pathway attention network is then presented to infer the visual-semantic alignments and the image-text similarity. Extensive experiments validate the efficacy of our method, which achieves state-of-the-art performance on several public benchmark datasets.
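As a rough illustration of the two ideas named in this abstract (an over-complete set of region-word pairs, and attention running along two pathways), the sketch below scores every region-word pair and then attends in both directions. It is not the RDAN architecture: plain cosine similarity stands in for the trained visual-semantic network, and the function names, the `temperature` parameter, and the final averaging are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def softmax(scores, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pair_scores(regions, words):
    """Score every (region, word) pair: the 'over-complete set' as a matrix."""
    return [[cosine(r, w) for w in words] for r in regions]


def dual_attention_similarity(regions, words, temperature=0.1):
    """Average of two attention pathways over the pair-score matrix.

    Pathway 1: each region attends over all words.
    Pathway 2: each word attends over all regions.
    """
    scores = pair_scores(regions, words)
    region_side = []
    for row in scores:
        att = softmax(row, temperature)
        region_side.append(sum(a * s for a, s in zip(att, row)))
    word_side = []
    for j in range(len(words)):
        col = [scores[i][j] for i in range(len(regions))]
        att = softmax(col, temperature)
        word_side.append(sum(a * s for a, s in zip(att, col)))
    mean_region = sum(region_side) / len(region_side)
    mean_word = sum(word_side) / len(word_side)
    return 0.5 * (mean_region + mean_word)


if __name__ == "__main__":
    # Toy 2-D features (made-up values) for two regions and two words.
    regions = [[1.0, 0.0], [0.0, 1.0]]
    matching_words = [[1.0, 0.0], [0.0, 1.0]]
    unrelated_words = [[-1.0, 0.0], [0.0, -1.0]]
    print(dual_attention_similarity(regions, matching_words))
    print(dual_attention_similarity(regions, unrelated_words))
```

Scoring all pairs first and attending afterwards keeps both pathways reading from the same matrix, which is the part of the design this sketch is meant to show; a learned model would replace `cosine` with trained pair representations.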


Neurofeedback Training in Children with Attention-Deficit/Hyperactivity Disorder (ADHD)

Zeitschrift für Kinder- und Jugendpsychiatrie und Psychotherapie ◽ 10.1024/1422-4917/a000070 ◽ 2010 ◽ Vol 38 (6) ◽ pp. 409-420 ◽ Cited By ~ 10

Author(s): Holger Gevensleben ◽ Gunther H. Moll ◽ Hartmut Heinrich

Keyword(s): Attention Network Test ◽ Sich Eine ◽ Neurofeedback Training ◽ Attention Network ◽ Clinical Efficacy

In a multicenter, randomized, controlled trial, we evaluated the clinical efficacy of neurofeedback (NF) training in children with attention-deficit/hyperactivity disorder (ADHD) and investigated the neurophysiological mechanisms underlying successful training. A computer-based attention training, matched to the neurofeedback setting in its essential demands and conditions, served as the control training. At the behavioral level (parent and teacher ratings), NF training was superior to the control training at the end of training, both for core ADHD symptoms and in associated domains. For the primary outcome measure (improvement in the FBB-HKS total score), a medium effect size (0.6) was found. Six months after the end of training (follow-up), the same pattern of results was found. The results thus suggest that NF is a clinically effective treatment module for children with ADHD. At the neurophysiological level (EEG; event-related potentials, ERPs), specific effects were demonstrated for the two neurofeedback protocols, theta/beta training and training of slow cortical potentials (SCP). For theta/beta training, for example, the decrease in theta activity was associated with a reduction in ADHD symptoms. For SCP training, an increase in the contingent negative variation, which reflects the resources mobilized during preparatory processes, was observed in the Attention Network Test, among other findings. EEG- and ERP-based predictor variables were identified. This article provides an overview of the study results described in various publications by our research group and outlines questions for future research.
