Contextualized Non-Local Neural Networks for Sequence Learning

Author(s):  
Pengfei Liu ◽  
Shuaichen Chang ◽  
Xuanjing Huang ◽  
Jian Tang ◽  
Jackie Chi Kit Cheung

Recently, a large number of neural mechanisms and models have been proposed for sequence learning, of which self-attention, as exemplified by the Transformer model, and graph neural networks (GNNs) have attracted much attention. In this paper, we propose an approach that combines these two methods and draws on their complementary strengths. Specifically, we propose contextualized non-local neural networks (CN3), which can both dynamically construct a task-specific structure of a sentence and leverage rich local dependencies within a particular neighbourhood. Experimental results on ten NLP tasks in text classification, semantic matching, and sequence labelling show that our proposed model outperforms competitive baselines and discovers task-specific dependency structures, thus providing better interpretability to users.
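The following is a minimal sketch, not the authors' implementation, of how a single block might combine the two branches the abstract describes: a non-local self-attention branch that induces a task-specific structure over all token pairs, and a local convolutional branch over a fixed neighbourhood. Layer names, dimensions, and the way the branches are merged are illustrative assumptions.

```python
# Minimal sketch of combining non-local self-attention with a local
# convolutional neighbourhood, as the CN3 abstract describes.
import torch
import torch.nn as nn

class ContextualizedNonLocalBlock(nn.Module):
    def __init__(self, dim: int, window: int = 3):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        # Local branch: 1D convolution over a small neighbourhood.
        self.local = nn.Conv1d(dim, dim, kernel_size=window, padding=window // 2)
        self.out = nn.Linear(2 * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        q, k, v = self.query(x), self.key(x), self.value(x)
        # Non-local branch: attention weights act as a dynamically induced
        # structure over all token pairs.
        attn = torch.softmax(q @ k.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)
        non_local = attn @ v
        # Local branch: dependencies within a fixed neighbourhood.
        local = self.local(x.transpose(1, 2)).transpose(1, 2)
        return self.out(torch.cat([non_local, local], dim=-1))

# Usage: a toy batch of 2 sentences, 5 tokens, 16-dimensional embeddings.
block = ContextualizedNonLocalBlock(dim=16)
print(block(torch.randn(2, 5, 16)).shape)  # torch.Size([2, 5, 16])
```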

Author(s):  
Pengfei Liu ◽  
Jie Fu ◽  
Yue Dong ◽  
Xipeng Qiu ◽  
Jackie Chi Kit Cheung

We present two architectures for multi-task learning with neural sequence models. Our approach allows the relationships between different tasks to be learned dynamically, rather than using an ad-hoc pre-defined structure as in previous work. We adopt the idea of message passing from graph neural networks and propose a general graph multi-task learning framework in which different tasks can communicate with each other in an effective and interpretable way. We conduct extensive experiments in text classification and sequence labelling to evaluate our approach on multi-task learning and transfer learning. The empirical results show that our models not only outperform competitive baselines, but also learn interpretable and transferable patterns across tasks.
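A minimal sketch of the message-passing idea, under illustrative assumptions: each task keeps a state vector, and tasks exchange messages through a learned, softmax-normalised task-task relation matrix rather than a fixed structure. The layer names and the GRU-style update are placeholders, not the paper's architecture.

```python
# Sketch: tasks as graph nodes that communicate via learned relations.
import torch
import torch.nn as nn

class TaskMessagePassing(nn.Module):
    def __init__(self, num_tasks: int, dim: int):
        super().__init__()
        # Learnable logits for pairwise task relations (the dynamic structure).
        self.relation_logits = nn.Parameter(torch.zeros(num_tasks, num_tasks))
        self.message = nn.Linear(dim, dim)
        self.update = nn.GRUCell(dim, dim)

    def forward(self, task_states: torch.Tensor) -> torch.Tensor:
        # task_states: (num_tasks, dim)
        relations = torch.softmax(self.relation_logits, dim=-1)
        messages = relations @ self.message(task_states)  # aggregate messages
        return self.update(messages, task_states)         # GRU-style node update

tasks = TaskMessagePassing(num_tasks=4, dim=32)
print(tasks(torch.randn(4, 32)).shape)  # torch.Size([4, 32])
```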


2021 ◽  
Author(s):  
Yonghao Liu ◽  
Renchu Guan ◽  
Fausto Giunchiglia ◽  
Yanchun Liang ◽  
Xiaoyue Feng

2020 ◽  
Vol 34 (04) ◽  
pp. 6656-6663 ◽  
Author(s):  
Huaxiu Yao ◽  
Chuxu Zhang ◽  
Ying Wei ◽  
Meng Jiang ◽  
Suhang Wang ◽  
...  

Semi-supervised node classification is a challenging problem that has been studied extensively. At the frontier, Graph Neural Networks (GNNs), which update the representation of each node by aggregating information from its neighbors, have recently attracted great interest. However, most GNNs have shallow layers with a limited receptive field and may not achieve satisfactory performance, especially when the number of labeled nodes is quite small. To address this challenge, we propose a graph few-shot learning (GFL) algorithm that incorporates prior knowledge learned from auxiliary graphs to improve classification accuracy on the target graph. Specifically, a transferable metric space, characterized by a node embedding function and a graph-specific prototype embedding function, is shared between the auxiliary graphs and the target graph, facilitating the transfer of structural knowledge. Extensive experiments and ablation studies on four real-world graph datasets demonstrate the effectiveness of our proposed model and the contribution of each component.
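A minimal sketch of prototype-based few-shot node classification in a shared metric space, loosely following the abstract; the embedding function below is a placeholder linear map standing in for the paper's GNN encoder, and all names are illustrative.

```python
# Sketch: classify query nodes by distance to per-class prototypes
# computed from a few labeled support nodes.
import torch
import torch.nn as nn

class PrototypeClassifier(nn.Module):
    def __init__(self, in_dim: int, emb_dim: int):
        super().__init__()
        self.embed = nn.Linear(in_dim, emb_dim)  # stand-in for a GNN node encoder

    def forward(self, support_x, support_y, query_x, num_classes):
        s = self.embed(support_x)  # (n_support, emb_dim)
        q = self.embed(query_x)    # (n_query, emb_dim)
        # One prototype per class: mean of its support embeddings.
        protos = torch.stack([s[support_y == c].mean(dim=0) for c in range(num_classes)])
        # Negative squared Euclidean distance as class logits.
        return -torch.cdist(q, protos) ** 2

clf = PrototypeClassifier(in_dim=8, emb_dim=16)
logits = clf(torch.randn(6, 8), torch.tensor([0, 0, 1, 1, 2, 2]),
             torch.randn(3, 8), num_classes=3)
print(logits.shape)  # torch.Size([3, 3])
```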


Author(s):  
Jingyun Xu ◽  
Yi Cai

Some text classification methods do not work well on short texts due to data sparsity, and they do not fully exploit context-relevant knowledge. To tackle these problems, we propose a neural network that incorporates context-relevant knowledge into a convolutional neural network for short text classification. Our model consists of two modules. The first module uses two layers to extract concept and context features respectively, and then employs an attention layer to select the context-relevant concepts. The second module uses a convolutional neural network to extract high-level features from the word and context-relevant concept features. Experimental results on three datasets show that our proposed model outperforms the state-of-the-art models.
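A minimal sketch, with illustrative dimensions, of the two-module idea in the abstract: word positions attend over candidate concept vectors to pick out context-relevant concepts, and a 1D convolution then reads the concatenated word and concept features. None of the layer names come from the paper.

```python
# Sketch: concept attention followed by a convolutional feature extractor.
import torch
import torch.nn as nn

class ConceptAttentionCNN(nn.Module):
    def __init__(self, dim: int, num_filters: int = 32, kernel: int = 3):
        super().__init__()
        self.attn_proj = nn.Linear(dim, dim)
        self.conv = nn.Conv1d(2 * dim, num_filters, kernel_size=kernel, padding=kernel // 2)

    def forward(self, words: torch.Tensor, concepts: torch.Tensor) -> torch.Tensor:
        # words: (batch, seq_len, dim); concepts: (batch, n_concepts, dim)
        # Attention: each word position weights the context-relevant concepts.
        scores = self.attn_proj(words) @ concepts.transpose(1, 2)
        weights = torch.softmax(scores, dim=-1)
        relevant = weights @ concepts                     # (batch, seq_len, dim)
        feats = torch.cat([words, relevant], dim=-1)      # word + concept features
        conv_out = torch.relu(self.conv(feats.transpose(1, 2)))
        return conv_out.max(dim=-1).values                # max-pooled text vector

model = ConceptAttentionCNN(dim=16)
print(model(torch.randn(2, 7, 16), torch.randn(2, 5, 16)).shape)  # torch.Size([2, 32])
```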


2021 ◽  
Author(s):  
Ge Lan ◽  
Ye Li ◽  
Mengting Hu ◽  
Yufei Sun ◽  
Yuzhi Zhang

2020 ◽  
Vol 34 (05) ◽  
pp. 7464-7471
Author(s):  
Deng Cai ◽  
Wai Lam

The dominant graph-to-sequence transduction models employ graph neural networks for graph representation learning, where the structural information is reflected by the receptive field of neurons. Unlike graph neural networks, which restrict information exchange to the immediate neighborhood, we propose a new model, known as Graph Transformer, that uses explicit relation encoding and allows direct communication between two distant nodes. This provides a more efficient way to model global graph structure. Experiments on text generation from Abstract Meaning Representation (AMR) and syntax-based neural machine translation show the superiority of our proposed model. Specifically, our model achieves 27.4 BLEU on LDC2015E86 and 29.7 BLEU on LDC2017T10 for AMR-to-text generation, outperforming the state-of-the-art results by up to 2.2 points. On the syntax-based translation tasks, our model establishes new single-model state-of-the-art BLEU scores, 21.3 for English-to-German and 14.1 for English-to-Czech, improving over the existing best results, including ensembles, by over 1 BLEU.
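A minimal sketch of relation-aware global attention in the spirit of the abstract: every node attends to every other node, and an explicit relation embedding for each node pair biases the attention score. The shapes and the way the relation term enters the score are simplifying assumptions, not the paper's exact formulation.

```python
# Sketch: global attention over graph nodes with explicit relation encoding.
import torch
import torch.nn as nn

class RelationAwareAttention(nn.Module):
    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.rel = nn.Embedding(num_relations, dim)  # one vector per relation type

    def forward(self, nodes: torch.Tensor, relations: torch.Tensor) -> torch.Tensor:
        # nodes: (n, dim); relations: (n, n) integer relation ids between node pairs
        q, k, v = self.q(nodes), self.k(nodes), self.v(nodes)
        rel = self.rel(relations)                       # (n, n, dim)
        # Content score plus a pairwise relation bias.
        scores = q @ k.t() + (q.unsqueeze(1) * rel).sum(-1)
        attn = torch.softmax(scores / nodes.size(-1) ** 0.5, dim=-1)
        return attn @ v

attn = RelationAwareAttention(dim=16, num_relations=5)
print(attn(torch.randn(6, 16), torch.randint(0, 5, (6, 6))).shape)  # torch.Size([6, 16])
```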


2020 ◽  
Vol 34 (05) ◽  
pp. 8544-8551 ◽  
Author(s):  
Giannis Nikolentzos ◽  
Antoine Tixier ◽  
Michalis Vazirgiannis

Graph neural networks have recently emerged as a very effective framework for processing graph-structured data. These models have achieved state-of-the-art performance in many tasks. Most graph neural networks can be described in terms of message passing, vertex update, and readout functions. In this paper, we represent documents as word co-occurrence networks and propose an application of the message passing framework to NLP, the Message Passing Attention network for Document understanding (MPAD). We also propose several hierarchical variants of MPAD. Experiments conducted on 10 standard text classification datasets show that our architectures are competitive with the state-of-the-art. Ablation studies reveal further insights about the impact of the different components on performance. Code is publicly available at: https://github.com/giannisnik/mpad.
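A minimal sketch of the general recipe the abstract names (message passing, vertex update, and readout) applied to a word co-occurrence graph; the specific layers and the attention readout below are illustrative stand-ins, not the released MPAD implementation linked above.

```python
# Sketch: one message-passing step over a word co-occurrence adjacency matrix,
# followed by an attention readout into a single document vector.
import torch
import torch.nn as nn

class MessagePassingReadout(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)      # message function
        self.upd = nn.Linear(2 * dim, dim)  # vertex update function
        self.readout = nn.Linear(dim, 1)    # attention scores for the readout

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (n_words, dim) word features; adj: (n_words, n_words) co-occurrence weights
        messages = adj @ self.msg(x)                                # message passing
        h = torch.relu(self.upd(torch.cat([x, messages], dim=-1)))  # vertex update
        alpha = torch.softmax(self.readout(h), dim=0)               # attention readout
        return (alpha * h).sum(dim=0)                               # document vector

mpad_like = MessagePassingReadout(dim=16)
print(mpad_like(torch.randn(10, 16), torch.rand(10, 10)).shape)  # torch.Size([16])
```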


2018 ◽  
Author(s):  
Daniel Beck ◽  
Gholamreza Haffari ◽  
Trevor Cohn
