scholarly journals Sentence Compression Using BERT and Graph Convolutional Networks

2021 ◽  
Vol 11 (21) ◽  
pp. 9910
Author(s):  
Yo-Han Park ◽  
Gyong-Ho Lee ◽  
Yong-Seok Choi ◽  
Kong-Joo Lee

Sentence compression is a natural language-processing task that produces a short paraphrase of an input sentence by deleting words from the input sentence while ensuring grammatical correctness and preserving meaningful core information. This study introduces a graph convolutional network (GCN) into a sentence compression task to encode syntactic information, such as dependency trees. As we upgrade the GCN to activate a directed edge, the compression model with the GCN layers can distinguish between parent and child nodes in a dependency tree when aggregating adjacent nodes. Furthermore, by increasing the number of GCN layers, the model can gradually collect high-order information of a dependency tree when propagating node information through the layers. We implement a sentence compression model for Korean and English, respectively. This model consists of three components: pre-trained BERT model, GCN layers, and a scoring layer. The scoring layer can determine whether a word should remain in a compressed sentence by relying on the word vector containing contextual and syntactic information encoded by BERT and GCN layers. To train and evaluate the proposed model, we used the Google sentence compression dataset for English and a Korean sentence compression corpus containing about 140,000 sentence pairs for Korean. The experimental results demonstrate that the proposed model achieves state-of-the-art performance for English. To the best of our knowledge, this sentence compression model based on the deep learning model trained with a large-scale corpus is the first attempt for Korean.

2020 ◽  
Vol 10 (3) ◽  
pp. 957 ◽  
Author(s):  
Luwei Xiao ◽  
Xiaohui Hu ◽  
Yinong Chen ◽  
Yun Xue ◽  
Donghong Gu ◽  
...  

Targeted sentiment classification aims to predict the emotional trend of a specific goal. Currently, most methods (e.g., recurrent neural networks and convolutional neural networks combined with an attention mechanism) are not able to fully capture the semantic information of the context and they also lack a mechanism to explain the relevant syntactical constraints and long-range word dependencies. Therefore, syntactically irrelevant context words may mistakenly be recognized as clues to predict the target sentiment. To tackle these problems, this paper considers that the semantic information, syntactic information, and their interaction information are very crucial to targeted sentiment analysis, and propose an attentional-encoding-based graph convolutional network (AEGCN) model. Our proposed model is mainly composed of multi-head attention and an improved graph convolutional network built over the dependency tree of a sentence. Pre-trained BERT is applied to this task, and new state-of-art performance is achieved. Experiments on five datasets show the effectiveness of the model proposed in this paper compared with a series of the latest models.


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Fangyuan Lei ◽  
Xun Liu ◽  
Qingyun Dai ◽  
Bingo Wing-Kuen Ling ◽  
Huimin Zhao ◽  
...  

With the higher-order neighborhood information of a graph network, the accuracy of graph representation learning classification can be significantly improved. However, the current higher-order graph convolutional networks have a large number of parameters and high computational complexity. Therefore, we propose a hybrid lower-order and higher-order graph convolutional network (HLHG) learning model, which uses a weight sharing mechanism to reduce the number of network parameters. To reduce the computational complexity, we propose a novel information fusion pooling layer to combine the high-order and low-order neighborhood matrix information. We theoretically compare the computational complexity and the number of parameters of the proposed model with those of the other state-of-the-art models. Experimentally, we verify the proposed model on large-scale text network datasets using supervised learning and on citation network datasets using semisupervised learning. The experimental results show that the proposed model achieves higher classification accuracy with a small set of trainable weight parameters.


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 566
Author(s):  
Xiaoqiang Chi ◽  
Yang Xiang

Paraphrase generation is an important yet challenging task in natural language processing. Neural network-based approaches have achieved remarkable success in sequence-to-sequence learning. Previous paraphrase generation work generally ignores syntactic information regardless of its availability, with the assumption that neural nets could learn such linguistic knowledge implicitly. In this work, we make an endeavor to probe into the efficacy of explicit syntactic information for the task of paraphrase generation. Syntactic information can appear in the form of dependency trees, which could be easily acquired from off-the-shelf syntactic parsers. Such tree structures could be conveniently encoded via graph convolutional networks to obtain more meaningful sentence representations, which could improve generated paraphrases. Through extensive experiments on four paraphrase datasets with different sizes and genres, we demonstrate the utility of syntactic information in neural paraphrase generation under the framework of sequence-to-sequence modeling. Specifically, our graph convolutional network-enhanced models consistently outperform their syntax-agnostic counterparts using multiple evaluation metrics.


2021 ◽  
Vol 11 (8) ◽  
pp. 3640
Author(s):  
Guangtao Xu ◽  
Peiyu Liu ◽  
Zhenfang Zhu ◽  
Jie Liu ◽  
Fuyong Xu

The purpose of aspect-based sentiment classification is to identify the sentiment polarity of each aspect in a sentence. Recently, due to the introduction of Graph Convolutional Networks (GCN), more and more studies have used sentence structure information to establish the connection between aspects and opinion words. However, the accuracy of these methods is limited by noise information and dependency tree parsing performance. To solve this problem, we proposed an attention-enhanced graph convolutional network (AEGCN) for aspect-based sentiment classification with multi-head attention (MHA). Our proposed method can better combine semantic and syntactic information by introducing MHA and GCN. We also added an attention mechanism to GCN to enhance its performance. In order to verify the effectiveness of our proposed method, we conducted a lot of experiments on five benchmark datasets. The experimental results show that our proposed method can make more reasonable use of semantic and syntactic information, and further improve the performance of GCN.


Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 79 ◽  
Author(s):  
Xiaoyu Han ◽  
Yue Zhang ◽  
Wenkai Zhang ◽  
Tinglei Huang

Relation extraction is a vital task in natural language processing. It aims to identify the relationship between two specified entities in a sentence. Besides information contained in the sentence, additional information about the entities is verified to be helpful in relation extraction. Additional information such as entity type getting by NER (Named Entity Recognition) and description provided by knowledge base both have their limitations. Nevertheless, there exists another way to provide additional information which can overcome these limitations in Chinese relation extraction. As Chinese characters usually have explicit meanings and can carry more information than English letters. We suggest that characters that constitute the entities can provide additional information which is helpful for the relation extraction task, especially in large scale datasets. This assumption has never been verified before. The main obstacle is the lack of large-scale Chinese relation datasets. In this paper, first, we generate a large scale Chinese relation extraction dataset based on a Chinese encyclopedia. Second, we propose an attention-based model using the characters that compose the entities. The result on the generated dataset shows that these characters can provide useful information for the Chinese relation extraction task. By using this information, the attention mechanism we used can recognize the crucial part of the sentence that can express the relation. The proposed model outperforms other baseline models on our Chinese relation extraction dataset.


2020 ◽  
Vol 34 (01) ◽  
pp. 27-34 ◽  
Author(s):  
Lei Chen ◽  
Le Wu ◽  
Richang Hong ◽  
Kun Zhang ◽  
Meng Wang

Graph Convolutional Networks~(GCNs) are state-of-the-art graph based representation learning models by iteratively stacking multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering~(CF) based Recommender Systems~(RS), by treating the user-item interaction behavior as a bipartite graph, some researchers model higher-layer collaborative signals with GCNs. These GCN based recommender models show superior performance compared to traditional works. However, these models suffer from training difficulty with non-linear activations for large user-item graphs. Besides, most GCN based models could not model deeper layers due to the over smoothing effect with the graph convolution operation. In this paper, we revisit GCN based CF models from two aspects. First, we empirically show that removing non-linearities would enhance recommendation performance, which is consistent with the theories in simple graph convolutional networks. Second, we propose a residual network structure that is specifically designed for CF with user-item interaction modeling, which alleviates the over smoothing problem in graph convolution aggregation operation with sparse user-item interaction data. The proposed model is a linear model and it is easy to train, scale to large datasets, and yield better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.


Author(s):  
Hao Zhou ◽  
Tom Young ◽  
Minlie Huang ◽  
Haizhou Zhao ◽  
Jingfang Xu ◽  
...  

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines. 


Author(s):  
Muhammad Raza Khan ◽  
Joshua E. Blumenstock

With the rapid expansion of mobile phone networks in developing countries, large-scale graph machine learning has gained sudden relevance in the study of global poverty. Recent applications range from humanitarian response and poverty estimation to urban planning and epidemic containment. Yet the vast majority of computational tools and algorithms used in these applications do not account for the multi-view nature of social networks: people are related in myriad ways, but most graph learning models treat relations as binary. In this paper, we develop a graph-based convolutional network for learning on multi-view networks. We show that this method outperforms state-of-the-art semi-supervised learning algorithms on three different prediction tasks using mobile phone datasets from three different developing countries. We also show that, while designed specifically for use in poverty research, the algorithm also outperforms existing benchmarks on a broader set of learning tasks on multi-view networks, including node labelling in citation networks.


2021 ◽  
Vol 11 (4) ◽  
pp. 1377
Author(s):  
Jun Long ◽  
Ye Wang ◽  
Xiangxiang Wei ◽  
Zhen Ding ◽  
Qianqian Qi ◽  
...  

Relation classification is an important task in the field of natural language processing, and it is one of the important steps in constructing a knowledge graph, which can greatly reduce the cost of constructing a knowledge graph. The Graph Convolutional Network (GCN) is an effective model for accurate relation classification, which models the dependency tree of textual instances to extract the semantic features of relation mentions. Previous GCN based methods treat each node equally. However, the contribution of different words to express a certain relation is different, especially the entity mentions in the sentence. In this paper, a novel GCN based relation classifier is propose, which treats the entity nodes as two global nodes in the dependency tree. These two global nodes directly connect with other nodes, which can aggregate information from the whole tree with only one convolutional layer. In this way, the method can not only simplify the complexity of the model, but also generate expressive relation representation. Experimental results on two widely used data sets, SemEval-2010 Task 8 and TACRED, show that our model outperforms all the compared baselines in this paper, which illustrates that the model can effectively utilize the dependencies between nodes and improve the performance of relation classification.


2021 ◽  
Vol 12 (1) ◽  
pp. 4
Author(s):  
Chengming Liu ◽  
Ronghua Fu ◽  
Yinghao Li ◽  
Yufei Gao ◽  
Lei Shi ◽  
...  

In this paper, we propose a new method for detecting abnormal human behavior based on skeleton features using self-attention augment graph convolution. The skeleton data have been proved to be robust to the complex background, illumination changes, and dynamic camera scenes and are naturally constructed as a graph in non-Euclidean space. Particularly, the establishment of spatial temporal graph convolutional networks (ST-GCN) can effectively learn the spatio-temporal relationships of Non-Euclidean Structure Data. However, it only operates on local neighborhood nodes and thereby lacks global information. We propose a novel spatial temporal self-attention augmented graph convolutional networks (SAA-Graph) by combining improved spatial graph convolution operator with a modified transformer self-attention operator to capture both local and global information of the joints. The spatial self-attention augmented module is used to understand the intra-frame relationships between human body parts. As far as we know, we are the first group to utilize self-attention for video anomaly detection tasks by enhancing spatial temporal graph convolution. Moreover, to validate the proposed model, we performed extensive experiments on two large-scale publicly standard datasets (i.e., ShanghaiTech Campus and CUHK Avenue datasets) which reveal the state-of-art performance for our proposed approach when compared to existing skeleton-based methods and graph convolution methods.


Sign in / Sign up

Export Citation Format

Share Document