GraghVQA: Language-Guided Graph Neural Networks for Graph-based Visual Question Answering

Data-driven approaches based on deep learning have been successfully applied in various image understanding applications, ranging from object recognition and semantic segmentation to visual question answering. However, the lack of knowledge integration and of higher-level reasoning capabilities in these methods still poses a hindrance. In this work, we present a brief survey of a few representative reasoning mechanisms, knowledge integration methods, and their corresponding image understanding applications, developed by various groups of researchers approaching the problem from a variety of angles. Furthermore, we discuss key efforts on integrating external knowledge with neural networks. Taking cues from these efforts, we conclude by discussing potential pathways to improve reasoning capabilities.

CFGNN: Cross Flow Graph Neural Networks for Question Answering on Complex Tables

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i05.6506 ◽ 2020 ◽ Vol 34 (05) ◽ pp. 9596-9603

Author(s): Xuanyu Zhang

Keyword(s): Neural Networks ◽ Large Scale ◽ Question Answering ◽ Cross Flow ◽ Context Information ◽ Flow Graph ◽ Graph Neural Networks ◽ Relationship Of ◽ The Relationship ◽ Parent Child

Question answering on complex tables is a challenging task for machines. In Spider, a large-scale complex table dataset, relationships between tables and columns can easily be modeled as a graph. However, most graph neural networks (GNNs) ignore the relationship between sibling nodes and use summation as the aggregation function to model the relationship between parent and child nodes. As a result, nodes with low degree, such as column nodes in the schema graph, may receive little information. Context information is also important for natural language. To make fuller use of context information flow, we propose novel cross flow graph neural networks in this paper. The information flows of parent-child and sibling nodes cross with history states between different layers. In addition, we use a hierarchical encoding layer to obtain contextualized representations in tables. Experiments on Spider show that our approach achieves substantial performance improvements compared with previous GNN models and their variants.
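The degree problem described in this abstract can be illustrated with a toy example. The sketch below is not the CFGNN model; it only assumes a hypothetical schema graph with one table node and three column nodes, scalar features, and plain summation, to show that a column node sees a single parent-child neighbor while its siblings carry additional context.

```python
# Toy schema graph (hypothetical names): one table node with three column children.
parent_of = {"col_id": "table", "col_name": "table", "col_age": "table"}

# Scalar stand-ins for node embeddings; real models would use vectors.
features = {"table": 1.0, "col_id": 2.0, "col_name": 3.0, "col_age": 4.0}


def parent_child_sum(node):
    """Sum features over parent-child neighbors only; also return the neighbor count."""
    neighbors = [child for child, parent in parent_of.items() if parent == node]
    if node in parent_of:
        neighbors.append(parent_of[node])
    return sum(features[n] for n in neighbors), len(neighbors)


def sibling_sum(node):
    """Sum features over siblings (nodes sharing the same parent); also return the count."""
    if node not in parent_of:
        return 0.0, 0
    siblings = [
        child
        for child, parent in parent_of.items()
        if parent == parent_of[node] and child != node
    ]
    return sum(features[s] for s in siblings), len(siblings)


if __name__ == "__main__":
    for name in features:
        print(name, "parent-child:", parent_child_sum(name), "sibling:", sibling_sum(name))
```

In this toy graph the table node aggregates over three neighbors, while each column node aggregates over one; adding a sibling flow gives each column two more sources of information. How CFGNN actually combines the two flows with history states is not specified in the abstract and is not modeled here.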

Deep Generative Probabilistic Graph Neural Networks for Scene Graph Generation

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i07.6783 ◽ 2020 ◽ Vol 34 (07) ◽ pp. 11237-11245

Author(s): Mahmoud Khademi ◽ Oliver Schulte

Keyword(s): Neural Networks ◽ Message Passing ◽ Question Answering ◽ Ground Truth ◽ Mass Function ◽ Probability Mass Function ◽ Scene Graph ◽ Probabilistic Graph ◽ Graph Neural Networks ◽ Graph Generation

We propose a new algorithm, called Deep Generative Probabilistic Graph Neural Networks (DG-PGNN), to generate a scene graph for an image. The input to DG-PGNN is an image, together with a set of region-grounded captions and object bounding-box proposals for the image. To generate the scene graph, DG-PGNN constructs and updates a new model, called a Probabilistic Graph Network (PGN). A PGN can be thought of as a scene graph with uncertainty: it represents each node and each edge by a CNN feature vector and defines a probability mass function (PMF) for the node type (object category) of each node and the edge type (predicate class) of each edge. DG-PGNN sequentially adds a new node to the current PGN by learning the optimal ordering in a Deep Q-learning framework, where states are partial PGNs, actions choose a new node, and rewards are defined based on the ground truth. After adding a node, DG-PGNN uses message passing to update the feature vectors of the current PGN by leveraging contextual relationship information, object co-occurrences, and language priors from captions. The updated features are then used to fine-tune the PMFs. Our experiments show that the proposed algorithm significantly outperforms the state-of-the-art results on the Visual Genome dataset for scene graph generation. We also show that the scene graphs constructed by DG-PGNN improve performance on the visual question answering task for questions that need reasoning about objects and their interactions in the scene context.
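The idea of a scene graph with uncertainty can be sketched in a few lines. The code below is only an illustration under strong assumptions, not the DG-PGNN implementation: the node names, the three object categories, and the use of the feature vector itself as category scores are all hypothetical, the message-passing rule is a simple weighted neighbor mean, and the Deep Q-learning node ordering is left out entirely.

```python
import math

# Hypothetical object categories for the node-type PMF.
NODE_TYPES = ["person", "horse", "hat"]

# Node name -> feature vector. These stand in for CNN features; as a simplifying
# assumption the features are used directly as per-category scores.
nodes = {"n0": [2.0, 0.5, 0.1], "n1": [0.2, 1.5, 0.3]}
edges = [("n0", "n1")]


def softmax(scores):
    """Turn raw scores into a probability mass function."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def node_pmf(feature):
    """PMF over NODE_TYPES for one node."""
    return softmax(feature)


def message_pass(nodes, edges, weight=0.5):
    """One round: each node adds a weighted mean of its neighbors' features."""
    neighbors = {name: [] for name in nodes}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = {}
    for name, feat in nodes.items():
        if not neighbors[name]:
            updated[name] = list(feat)
            continue
        k = len(neighbors[name])
        mean = [sum(nodes[n][i] for n in neighbors[name]) / k for i in range(len(feat))]
        updated[name] = [f + weight * m for f, m in zip(feat, mean)]
    return updated


if __name__ == "__main__":
    before = {name: node_pmf(feat) for name, feat in nodes.items()}
    after = {name: node_pmf(feat) for name, feat in message_pass(nodes, edges).items()}
    print("PMFs before message passing:", before)
    print("PMFs after message passing: ", after)
```

The point of the sketch is the data flow: features are updated from neighboring nodes first, and the PMFs are then recomputed from the updated features, which mirrors the "update, then fine-tune the PMFs" order described in the abstract.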

Visual Question Answering using Convolutional Neural Networks

Turkish Journal of Computer and Mathematics Education (TURCOMAT) ◽ 10.17762/turcomat.v12i1s.1602 ◽ 2021 ◽ Vol 12 (1S) ◽ pp. 170-175

Author(s): K. P. Moholkar et al.

Keyword(s): Neural Network ◽ Artificial Intelligence ◽ Neural Networks ◽ Computer Vision ◽ Recurrent Neural Network ◽ Question Answering ◽ Question Answering System ◽ Visual Question Answering ◽ Generalized System ◽ Multiple Scenarios

The ability of a computer system to understand its surroundings and to process information the way a human being does has always been a major focus of computer science. One way to approach this kind of artificial intelligence is Visual Question Answering (VQA). A VQA system is a trained system that can answer natural-language questions about a given image. VQA is a generalized system that can be used in any image-based scenario, given adequate training on the relevant data. This is achieved with the help of neural networks, particularly the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN). In this study, we compare different approaches to VQA and explore a CNN-based model. With continued progress in computer vision and question answering systems, VQA is becoming an essential system that can handle multiple scenarios with their respective data.

Graph Neural Networks for Prediction of Fuel Ignition Quality

10.26434/chemrxiv.12280325.v1 ◽ 2020

Author(s): Artur Schweidtmann ◽ Jan Rittig ◽ Andrea König ◽ Martin Grohe ◽ Alexander Mitsos ◽ ...

Keyword(s): Neural Networks ◽ Octane Number ◽ Molecular Graph ◽ Chemical Properties ◽ Graph Representation ◽ Structure Property ◽ Oxygenated Hydrocarbons ◽ Physico Chemical ◽ Ignition Quality ◽ Graph Neural Networks

Prediction of combustion-related properties of (oxygenated) hydrocarbons is an important and challenging task for which quantitative structure-property relationship (QSPR) models are frequently employed. Recently, a machine learning method, graph neural networks (GNNs), has shown promising results for the prediction of structure-property relationships. GNNs use a graph representation of molecules in which atoms correspond to nodes and bonds to edges, so that the graph carries information about the molecular structure. More specifically, GNNs learn physico-chemical properties as a function of the molecular graph in a supervised learning setup using a backpropagation algorithm. This end-to-end learning approach eliminates the need to select molecular descriptors or structural groups, as it learns optimal fingerprints through graph convolutions and maps the fingerprints to the physico-chemical properties by deep learning. We develop GNN models for predicting three fuel ignition quality indicators, i.e., the derived cetane number (DCN), the research octane number (RON), and the motor octane number (MON), of oxygenated and non-oxygenated hydrocarbons. Because experimental data are limited, on the order of hundreds of data points, we propose a combination of multi-task learning, transfer learning, and ensemble learning. The results show competitive performance of the proposed GNN approach compared to state-of-the-art QSPR models, making it a promising field for future research. The prediction tool is available via a web front-end at www.avt.rwth-aachen.de/gnn.
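The graph representation described in this abstract can be made concrete with a small example. The sketch below is not the authors' model: it assumes ethanol with heavy atoms only, one-hot atom features over a hypothetical two-element vocabulary, a single round of unweighted sum message passing, and a sum readout. A trained GNN would add learned weights, nonlinearities, and a regression head on top of the fingerprint.

```python
# Hypothetical atom vocabulary for one-hot node features.
ATOM_TYPES = ["C", "O"]

# Ethanol, heavy atoms only: C-C-O. Atoms are nodes, bonds are undirected edges.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]


def one_hot(symbol):
    """One-hot encode an atom symbol over ATOM_TYPES."""
    return [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]


def message_passing_round(features, bonds):
    """Each atom adds the features of its bonded neighbors to its own (sum aggregation)."""
    updated = [list(f) for f in features]
    for a, b in bonds:
        for i in range(len(features[a])):
            updated[a][i] += features[b][i]
            updated[b][i] += features[a][i]
    return updated


def readout(features):
    """Sum node features into a fixed-length molecular fingerprint."""
    return [sum(f[i] for f in features) for i in range(len(features[0]))]


if __name__ == "__main__":
    initial = [one_hot(symbol) for symbol in atoms]
    hidden = message_passing_round(initial, bonds)
    print("node features after one round:", hidden)
    print("molecular fingerprint:", readout(hidden))
```

After one round, each atom's vector counts the atom types in its immediate neighborhood, so the fingerprint reflects local structure rather than only the atom counts of the molecule.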
