Message Passing Attention Networks for Document Understanding

Giannis Nikolentzos; Antoine Tixier; Michalis Vazirgiannis

doi:10.1609/aaai.v34i05.6376

Message Passing Attention Networks for Document Understanding

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6376 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8544-8551 ◽

Cited By ~ 2

Author(s):

Giannis Nikolentzos ◽

Antoine Tixier ◽

Michalis Vazirgiannis

Keyword(s):

Neural Networks ◽

Text Classification ◽

Message Passing ◽

State Of The Art ◽

Structured Data ◽

Attention Networks ◽

Document Understanding ◽

Standard Text ◽

Graph Neural Networks ◽

The Impact

Graph neural networks have recently emerged as a very effective framework for processing graph-structured data. These models have achieved state-of-the-art performance in many tasks. Most graph neural networks can be described in terms of message passing, vertex update, and readout functions. In this paper, we represent documents as word co-occurrence networks and propose an application of the message passing framework to NLP, the Message Passing Attention network for Document understanding (MPAD). We also propose several hierarchical variants of MPAD. Experiments conducted on 10 standard text classification datasets show that our architectures are competitive with the state-of-the-art. Ablation studies reveal further insights about the impact of the different components on performance. Code is publicly available at: https://github.com/giannisnik/mpad.
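The pipeline the abstract describes (a word co-occurrence network, then message passing, vertex update, and readout) can be illustrated with a minimal sketch. This is not the MPAD implementation: the window size, the sum aggregation, the fixed 0.5/0.5 update, and the sum readout are all assumptions chosen for brevity, and every function name is hypothetical. The authors' actual code is in the repository linked above.

```python
from collections import defaultdict


def cooccurrence_graph(tokens, window=2):
    """Undirected word co-occurrence graph: unique words are nodes, and two
    words are linked if they occur within `window` positions of each other."""
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        neighbors[word]  # touch the key so isolated words still become nodes
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:
                neighbors[word].add(tokens[j])
                neighbors[tokens[j]].add(word)
    return dict(neighbors)


def message_passing_round(neighbors, features):
    """One round: the message is the sum of neighbor features (assumed),
    and the update averages it with the node's own features (assumed)."""
    updated = {}
    for node, own in features.items():
        dim = len(own)
        message = [0.0] * dim
        for nb in neighbors[node]:
            for k in range(dim):
                message[k] += features[nb][k]
        updated[node] = [0.5 * own[k] + 0.5 * message[k] for k in range(dim)]
    return updated


def readout(features):
    """Collapse all node features into one document vector by summation."""
    dim = len(next(iter(features.values())))
    return [sum(f[k] for f in features.values()) for k in range(dim)]


tokens = "graph neural networks process graph data".split()
graph = cooccurrence_graph(tokens, window=1)
features = {word: [1.0] for word in graph}
document_vector = readout(message_passing_round(graph, features))
```

In a trained model the fixed averaging would be replaced by learned functions, and the document vector would feed a classifier.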


Multi-View Attribute Graph Convolution Networks for Clustering

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/411 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jiafeng Cheng ◽

Qianqian Wang ◽

Zhiqiang Tao ◽

Deyan Xie ◽

Quanxue Gao

Keyword(s):

Neural Networks ◽

State Of The Art ◽

Graph Embedding ◽

Structured Data ◽

Attention Networks ◽

Graph Data ◽

Graph Reconstruction ◽

Node Attributes ◽

Graph Neural Networks ◽

Geometric Relationship

Graph neural networks (GNNs) have made considerable achievements in processing graph-structured data. However, existing methods cannot allocate learnable weights to different nodes in the neighborhood, and they lack robustness because they neglect both node attributes and graph reconstruction. Moreover, most multi-view GNNs focus on the case of multiple graphs, while designing GNNs for graph-structured data with multi-view attributes is still under-explored. In this paper, we propose a novel Multi-View Attribute Graph Convolution Networks (MAGCN) model for the clustering task. MAGCN is designed with two-pathway encoders that map graph embedding features and learn the view-consistency information. Specifically, the first pathway develops multi-view attribute graph attention networks to reduce noise and redundancy and to learn the graph embedding features for multi-view graph data. The second pathway develops consistent embedding encoders to capture the geometric relationship and probability distribution consistency among different views, which adaptively finds a consistent clustering embedding space for multi-view attributes. Experiments on three benchmark graph datasets show the superiority of our method compared with several state-of-the-art algorithms.


Coloring Graph Neural Networks for Node Disambiguation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/294 ◽

2020 ◽

Author(s):

George Dasoulas ◽

Ludovic Dos Santos ◽

Kevin Scaman ◽

Aladin Virmaux

Keyword(s):

Neural Networks ◽

Message Passing ◽

State Of The Art ◽

Structural Characteristics ◽

Expressive Power ◽

Continuous Functions ◽

Graph Classification ◽

Node Attributes ◽

Graph Neural Networks ◽

Coloring Graph

In this paper, we show that a simple coloring scheme can improve, both theoretically and empirically, the expressive power of Message Passing Neural Networks (MPNNs). More specifically, we introduce a graph neural network called Colored Local Iterative Procedure (CLIP) that uses colors to disambiguate identical node attributes, and show that this representation is a universal approximator of continuous functions on graphs with node attributes. Our method relies on separability, a key topological characteristic that allows well-chosen neural networks to be extended into universal representations. Finally, we show experimentally that CLIP is capable of capturing structural characteristics that traditional MPNNs fail to distinguish, while being state-of-the-art on benchmark graph classification datasets.
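The idea of using colors to disambiguate identical node attributes can be shown with a small sketch. It only illustrates the coloring step under an assumed, arbitrary ordering (sorted node ids); how CLIP chooses or combines colorings is not stated in the abstract, and the function name `color_nodes` is hypothetical.

```python
from collections import defaultdict


def color_nodes(attributes):
    """Map each node to an (attribute, color) pair so that nodes sharing an
    attribute receive distinct colors 0, 1, 2, ... (order is an assumption)."""
    next_color = defaultdict(int)
    colored = {}
    for node in sorted(attributes):
        attr = attributes[node]
        colored[node] = (attr, next_color[attr])
        next_color[attr] += 1
    return colored


# Two carbon atoms that a plain MPNN would see as identical inputs.
colored = color_nodes({0: "C", 1: "C", 2: "O"})
```

After coloring, no two nodes carry the same input pair, so a message passing layer can tell them apart even when their neighborhoods look alike.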


UniGNN: a Unified Framework for Graph and Hypergraph Neural Networks

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/353 ◽

2021 ◽

Author(s):

Jing Huang ◽

Jie Yang

Keyword(s):

Neural Networks ◽

Message Passing ◽

State Of The Art ◽

Representation Learning ◽

Graph Representation ◽

Challenging Problem ◽

Unified Framework ◽

Real World Datasets ◽

Graph Neural Networks ◽

Research Domains

The hypergraph, an expressive structure with the flexibility to model higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neural Networks (GNNs) for graph representation learning, how to adapt powerful GNN variants directly to hypergraphs remains a challenging problem. In this paper, we propose UniGNN, a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize GNN models to hypergraphs. In this framework, meticulously designed architectures that aim to deepen GNNs can also be carried over to hypergraphs with minimal effort. Extensive experiments on multiple real-world datasets demonstrate the effectiveness of UniGNN, which outperforms the state-of-the-art approaches by a large margin. In particular, on the DBLP dataset, we increase the accuracy from 77.4% to 88.8% in the semi-supervised hypernode classification task. We further prove that the proposed message-passing based UniGNN models are at most as powerful as the 1-dimensional Generalized Weisfeiler-Leman (1-GWL) algorithm in terms of distinguishing non-isomorphic hypergraphs. Our code is available at https://github.com/OneForward/UniGNN.
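One common way to read message passing on a hypergraph is as two stages: vertices send features to their hyperedges, then hyperedges send features back to their vertices. The sketch below illustrates that reading only. The mean and sum aggregators are assumptions, the function name is hypothetical, and it is not taken from the UniGNN repository.

```python
def hypergraph_round(hyperedges, features):
    """One two-stage round on a hypergraph.

    hyperedges: list of vertex-id lists; features: dict vertex -> list of floats.
    """
    dim = len(next(iter(features.values())))

    # Stage 1: each hyperedge averages the features of its member vertices.
    edge_features = [
        [sum(features[v][k] for v in members) / len(members) for k in range(dim)]
        for members in hyperedges
    ]

    # Stage 2: each vertex sums the features of the hyperedges containing it.
    updated = {v: [0.0] * dim for v in features}
    for members, edge_feat in zip(hyperedges, edge_features):
        for v in members:
            for k in range(dim):
                updated[v][k] += edge_feat[k]
    return updated


features = {0: [1.0], 1: [2.0], 2: [3.0], 3: [5.0]}
updated = hypergraph_round([[0, 1, 2], [2, 3]], features)
```

An ordinary graph is the special case in which every hyperedge contains exactly two vertices, which is why a single formulation can cover both settings.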


Learning Multi-Task Communication with Message Passing for Sequence Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014360 ◽

2019 ◽

Vol 33 ◽

pp. 4360-4367 ◽

Cited By ~ 1

Author(s):

Pengfei Liu ◽

Jie Fu ◽

Yue Dong ◽

Xipeng Qiu ◽

Jackie Chi Kit Cheung

Keyword(s):

Neural Networks ◽

Transfer Learning ◽

Text Classification ◽

Sequence Learning ◽

Message Passing ◽

Ad Hoc ◽

General Graph ◽

Learning Framework ◽

Task Learning ◽

Graph Neural Networks

We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea from message-passing graph neural networks, and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labelling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines, but also learn interpretable and transferable patterns across tasks.


Improving Attention Mechanism in Graph Neural Networks via Cardinality Preservation

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/194 ◽

2020 ◽

Author(s):

Shuo Zhang ◽

Lei Xie

Keyword(s):

Neural Networks ◽

Theoretical Analysis ◽

Message Passing ◽

Representation Learning ◽

Attention Mechanism ◽

Structured Data ◽

Clear Understanding ◽

Graph Classification ◽

Competitive Performance ◽

Graph Neural Networks

Graph Neural Networks (GNNs) are powerful for the representation learning of graph-structured data. Most GNNs use a message-passing scheme, where the embedding of a node is iteratively updated by aggregating the information from its neighbors. To better express node influences, attention mechanisms have become a popular way to assign trainable weights to the nodes in aggregation. Though attention-based GNNs have achieved remarkable results in various tasks, a clear understanding of their discriminative capacities is missing. In this work, we present a theoretical analysis of the representational properties of the GNN that adopts the attention mechanism as an aggregator. Our analysis determines all cases in which those attention-based GNNs can always fail to distinguish certain distinct structures. Those cases appear because attention-based aggregation ignores cardinality information. To improve the performance of attention-based GNNs, we propose cardinality preserved attention (CPA) models that can be applied to any kind of attention mechanism. Our experiments on node and graph classification confirm our theoretical analysis and show the competitive performance of our CPA models. The code is available online: https://github.com/zetayue/CPA.
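The loss of cardinality information can be seen in a toy case. Because softmax attention weights sum to one, a neighborhood containing one copy of a feature vector and a neighborhood containing two copies aggregate to the same result, whereas a plain sum tells them apart. The sketch below is only an illustration of that effect: the scoring function is a hypothetical stand-in for a learned one, and the code is not the CPA model from the paper.

```python
import math


def attention_aggregate(neighbor_feats, score):
    """Softmax-attention-weighted sum of neighbor feature vectors."""
    scores = [score(f) for f in neighbor_feats]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    dim = len(neighbor_feats[0])
    return [
        sum(e / total * f[k] for e, f in zip(exps, neighbor_feats))
        for k in range(dim)
    ]


def sum_aggregate(neighbor_feats):
    """Unweighted sum, which grows with the number of neighbors."""
    dim = len(neighbor_feats[0])
    return [sum(f[k] for f in neighbor_feats) for k in range(dim)]


def toy_score(feat):
    """Hypothetical scoring function standing in for a learned one."""
    return sum(feat)


one_neighbor = [[1.0, 2.0]]
two_neighbors = [[1.0, 2.0], [1.0, 2.0]]

attn_one = attention_aggregate(one_neighbor, toy_score)
attn_two = attention_aggregate(two_neighbors, toy_score)
sum_one = sum_aggregate(one_neighbor)
sum_two = sum_aggregate(two_neighbors)
```

Here `attn_one` and `attn_two` coincide while `sum_one` and `sum_two` differ, so any attention variant that is to separate these neighborhoods needs some extra signal about neighborhood size.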


SketchGNN: Semantic Sketch Segmentation with Graph Neural Networks

ACM Transactions on Graphics ◽

10.1145/3450284 ◽

2021 ◽

Vol 40 (3) ◽

pp. 1-13

Author(s):

Lumin Yang ◽

Jiajie Zhuang ◽

Hongbo Fu ◽

Xiangzhi Wei ◽

Kun Zhou ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Network Architecture ◽

Large Scale ◽

State Of The Art ◽

Semantic Segmentation ◽

Structure Information ◽

Graph Neural Networks ◽

Node Labels ◽

Point Level

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, our SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract the features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric over a large-scale challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
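The graph construction described in the abstract (sampled points as nodes, stroke structure as edges) might look like the sketch below. It assumes that edges simply join consecutive points within the same stroke; the abstract does not spell out the exact edge rule, and the function name `sketch_to_graph` is hypothetical.

```python
def sketch_to_graph(strokes):
    """Turn a vector sketch into (nodes, edges).

    strokes: list of strokes, each a list of (x, y) sampled points.
    nodes:   flat list of all points, in stroke order.
    edges:   index pairs joining consecutive points of the same stroke.
    """
    nodes, edges = [], []
    for stroke in strokes:
        start = len(nodes)  # index of this stroke's first point
        nodes.extend(stroke)
        edges.extend((start + i, start + i + 1) for i in range(len(stroke) - 1))
    return nodes, edges


strokes = [[(0, 0), (1, 0), (2, 0)], [(5, 5), (6, 6)]]
nodes, edges = sketch_to_graph(strokes)
```

No edge crosses from one stroke to another, so the stroke boundaries remain recoverable from the graph structure alone.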


Hybrid Graph Neural Networks for Crowd Counting

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6839 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11693-11700 ◽

Cited By ~ 2

Author(s):

Ao Luo ◽

Fan Yang ◽

Xin Li ◽

Dong Nie ◽

Zhicheng Jiao ◽

...

Keyword(s):

Network Architecture ◽

Message Passing ◽

Large Scale ◽

State Of The Art ◽

Density Variation ◽

Feature Maps ◽

Crowd Counting ◽

Multi Scale ◽

Crowd Density ◽

Graph Neural Networks

Crowd counting is an important yet challenging task due to the large scale and density variation. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to relieve the problem by interweaving the multi-scale features for crowd density with those of its auxiliary task (localization) and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations capturing the feature dependencies across scales and (ii) mutually beneficial relations building bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. Our HyGnn performs well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art algorithms by a large margin.


Graph Neural Networks Meet Neural-Symbolic Computing: A Survey and Perspective

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/679 ◽

2020 ◽

Cited By ~ 2

Author(s):

Luís C. Lamb ◽

Artur d’Avila Garcez ◽

Marco Gori ◽

Marcelo O.R. Prates ◽

Pedro H.C. Avelar ◽

...

Keyword(s):

Neural Networks ◽

Constraint Satisfaction ◽

State Of The Art ◽

Relational Reasoning ◽

Widespread Application ◽

Symbolic Computing ◽

Industry Research ◽

Optimization Constraint ◽

The Subject ◽

Graph Neural Networks

Neural-symbolic computing has now become the subject of interest of both academic and industry research laboratories. Graph Neural Networks (GNNs) have been widely used in relational and symbolic domains, with widespread application of GNNs in combinatorial optimization, constraint satisfaction, relational reasoning and other scientific domains. The need for improved explainability, interpretability and trust of AI systems in general demands principled methodologies, as suggested by neural-symbolic computing. In this paper, we review the state-of-the-art on the use of GNNs as a model of neural-symbolic computing. This includes the application of GNNs in several domains as well as their relationship to current developments in neural-symbolic computing.


Gaussian Transformer: A Lightweight Approach for Natural Language Inference

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016489 ◽

2019 ◽

Vol 33 ◽

pp. 6489-6496 ◽

Cited By ~ 2

Author(s):

Maosheng Guo ◽

Yu Zhang ◽

Ting Liu

Keyword(s):

Neural Networks ◽

Natural Language ◽

State Of The Art ◽

Research Area ◽

High Order Interaction ◽

Training Time ◽

Attention Networks ◽

Local Dependency ◽

Active Research ◽

Active Research Area

Natural Language Inference (NLI) is an active research area, where numerous approaches based on recurrent neural networks (RNNs), convolutional neural networks (CNNs), and self-attention networks (SANs) have been proposed. Although they obtain impressive performance, previous recurrent approaches are hard to train in parallel, convolutional models tend to cost more parameters, and self-attention networks are not good at capturing local dependency in texts. To address this problem, we introduce a Gaussian prior to the self-attention mechanism, for better modeling the local structure of sentences. Then we propose an efficient RNN/CNN-free architecture named Gaussian Transformer for NLI, which consists of encoding blocks modeling both local and global dependency, high-order interaction blocks collecting the evidence of multi-step inference, and a lightweight comparison block saving lots of parameters. Experiments show that our model achieves new state-of-the-art performance on both SNLI and MultiNLI benchmarks with significantly fewer parameters and considerably less training time. Besides, evaluation using the Hard NLI datasets demonstrates that our approach is less affected by the undesirable annotation artifacts.
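One way to impose a Gaussian prior on self-attention is to add a distance-dependent bias to the raw scores before the softmax. Adding the log of a Gaussian, -(i - j)^2 / (2 sigma^2), is equivalent to multiplying the unnormalized attention weights by that Gaussian. The sketch below shows this generic construction only: the fixed `sigma` and the function name are assumptions, and the paper's exact parameterization is not given in the abstract.

```python
import math


def gaussian_prior_attention(scores, sigma=1.0):
    """Row-wise softmax over scores[i][j] - (i - j)^2 / (2 * sigma^2).

    scores: n x n list of raw attention scores between positions i and j.
    Returns an n x n list of attention weights; each row sums to 1.
    """
    n = len(scores)
    weights = []
    for i in range(n):
        biased = [scores[i][j] - (i - j) ** 2 / (2 * sigma ** 2) for j in range(n)]
        top = max(biased)  # subtract the max for numerical stability
        exps = [math.exp(b - top) for b in biased]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# With uniform raw scores, the prior alone decides where attention goes.
uniform = [[0.0] * 3 for _ in range(3)]
weights = gaussian_prior_attention(uniform, sigma=1.0)
```

With uniform raw scores, each position attends most strongly to itself and progressively less to more distant positions, which is the locality bias the prior is meant to supply.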


Active Learning for Node Classification: An Evaluation

Entropy ◽

10.3390/e22101164 ◽

2020 ◽

Vol 22 (10) ◽

pp. 1164

Author(s):

Kaushalya Madhawa ◽

Tsuyoshi Murata

Keyword(s):

Neural Networks ◽

Active Learning ◽

State Of The Art ◽

Learning Algorithms ◽

Classification Performance ◽

Data Types ◽

Neural Network Models ◽

Attributed Graph ◽

Attributed Graphs ◽

Graph Neural Networks

Current breakthroughs in the field of machine learning are fueled by the deployment of deep neural network models. Deep neural network models are notorious for their dependence on large amounts of labeled training data. Active learning is used to train classification models with fewer labeled instances by selecting only the most informative instances for labeling. This is especially important when labeled data are scarce or the labeling process is expensive. In this paper, we study the application of active learning on attributed graphs. In this setting, the data instances are represented as nodes of an attributed graph. Graph neural networks achieve the current state-of-the-art classification performance on attributed graphs. The performance of graph neural networks relies on the careful tuning of their hyperparameters, usually performed using a validation set, an additional set of labeled instances. In label-scarce problems, it is realistic to use all labeled instances for training the model. In this setting, we perform a fair comparison of the existing active learning algorithms proposed for graph neural networks as well as other data types such as images and text. With empirical results, we demonstrate that state-of-the-art active learning algorithms designed for other data types do not perform well on graph-structured data. We study the problem within the framework of the exploration-vs.-exploitation trade-off and propose a new count-based exploration term. With empirical evidence on multiple benchmark graphs, we highlight the importance of complementing uncertainty-based active learning models with an exploration term.
