Zero-Shot Text Classification with Semantically Extended Graph Convolutional Network

Author(s):  
Tengfei Liu ◽  
Yongli Hu ◽  
Junbin Gao ◽  
Yanfeng Sun ◽  
Baocai Yin
2020 ◽  
Vol 2020 ◽  
pp. 1-7 ◽  
Author(s):  
Aboubakar Nasser Samatin Njikam ◽  
Huan Zhao

This paper introduces an extremely lightweight (roughly two hundred thousand parameters) and computationally efficient CNN architecture, named CharTeC-Net (Character-based Text Classification Network), for character-based text classification problems. The new architecture is composed of four building blocks for feature extraction. Each building block, except the last one, uses 1 × 1 pointwise convolutional layers to add nonlinearity to the network and to increase the dimensions within the block. In addition, shortcut connections are used in each building block to facilitate the flow of gradients through the network and, more importantly, to ensure that the original signal present in the training data is shared across the building blocks. Experiments on eight standard large-scale text classification and sentiment analysis datasets demonstrate that CharTeC-Net outperforms baseline methods and achieves competitive accuracy compared with state-of-the-art methods, although it has only between 181,427 and 225,323 parameters and weighs less than 1 megabyte.
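The building-block idea above can be sketched in a few lines. On a (sequence length × channels) array, a 1 × 1 pointwise convolution is simply a per-position linear map over the channel axis, so it reduces to a matrix product. The block below is a minimal NumPy sketch, not the paper's implementation: the expansion factor of 4 and the exact layer arrangement are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def building_block(x, w_expand, w_reduce):
    """Illustrative CharTeC-Net-style block (hypothetical layout):
    a 1x1 pointwise conv widens the channel dimension and adds
    nonlinearity, a second 1x1 conv maps back, and a shortcut
    connection re-injects the original signal."""
    h = relu(x @ w_expand)   # 1x1 conv: channels -> 4*channels
    h = relu(h @ w_reduce)   # 1x1 conv: 4*channels -> channels
    return h + x             # shortcut: original signal flows onward

seq_len, channels = 10, 16
x = rng.standard_normal((seq_len, channels))
w_expand = rng.standard_normal((channels, 4 * channels)) * 0.1
w_reduce = rng.standard_normal((4 * channels, channels)) * 0.1
y = building_block(x, w_expand, w_reduce)
```

Note how the shortcut makes the block an identity map when the convolution weights are zero, which is what lets the original signal reach every block unchanged.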


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 148865-148876
Author(s):  
Hengliang Tang ◽  
Yuan Mi ◽  
Fei Xue ◽  
Yang Cao

2021 ◽  
Vol 2137 (1) ◽  
pp. 012052
Author(s):  
Bingxin Xue ◽  
Cui Zhu ◽  
Xuan Wang ◽  
Wenjun Zhu

Abstract: Recently, the Graph Convolutional Network (GCN) has been widely used in text classification tasks and has proved effective on tasks with rich relational structure. However, because the adjacency matrix GCN constructs is sparse, GCN cannot make full use of context-dependent information in text classification and cannot capture local information. The Bidirectional Encoder Representations from Transformers (BERT) model has been shown to capture the contextual information within a sentence or document, but its ability to capture global information about the vocabulary of a language is relatively limited, which is precisely the strength of GCN. Therefore, this paper proposes Mutual Graph Convolution Networks (MGCN) to solve the above problems. MGCN introduces a semantic dictionary (WordNet), dependency parsing, and BERT: it uses dependency parsing to address context dependence and WordNet to obtain more semantic information. The local information generated by BERT and the global information generated by GCN then interact through an attention mechanism, so that each can influence the other and improve the classification performance of the model. Experimental results show that the model is more effective than previously reported approaches on three text classification datasets.
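The interaction step described above, where BERT's local features and GCN's global features attend to each other, can be sketched as a simple cross-attention. This is a hedged reading of the abstract, not MGCN's actual code; the score scaling and softmax form are standard attention conventions assumed here.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mutual_attention(local_feats, global_feats):
    """Hypothetical sketch of the MGCN interaction: local (BERT-style)
    and global (GCN-style) features attend to each other, so each
    representation is refined by the other before classification."""
    d = local_feats.shape[-1]
    scores = local_feats @ global_feats.T / np.sqrt(d)
    local_out = softmax(scores, axis=-1) @ global_feats    # local attends to global
    global_out = softmax(scores.T, axis=-1) @ local_feats  # global attends to local
    return local_out, global_out

rng = np.random.default_rng(0)
local = rng.standard_normal((4, 8))    # e.g. 4 BERT token features
global_ = rng.standard_normal((6, 8))  # e.g. 6 GCN node features
local_out, global_out = mutual_attention(local, global_)
```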


2021 ◽  
pp. 127-139
Author(s):  
Ting Pu ◽  
Shiqun Yin ◽  
Wenwen Li ◽  
Wenqiang Xu

2021 ◽  
pp. 664-675
Author(s):  
Bingquan Wang ◽  
Jie Liu ◽  
Shaowei Chen ◽  
Xiao Ling ◽  
Shanpeng Wang ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Fangyuan Lei ◽  
Xun Liu ◽  
Zhengming Li ◽  
Qingyun Dai ◽  
Senhong Wang

The graph convolutional network (GCN) is an efficient network for learning graph representations, but learning the high-order interaction relationships of a node's neighbors is computationally expensive. In this paper, we propose a novel graph convolutional model to learn and fuse multihop neighbor information. We adopt a weight-sharing mechanism to design graph convolutions of different orders, which avoids the potential risk of overfitting. Moreover, we design a new multihop neighbor information fusion (MIF) operator that mixes neighbor features from 1 hop to k hops. We theoretically analyse the computational complexity and the number of trainable parameters of our models. Experiments on text networks show that the proposed models outperform the Text GCN and achieve state-of-the-art performance.
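A minimal sketch of the idea, under stated assumptions: the standard symmetric GCN normalization is used, one shared weight matrix serves all hop orders, and the MIF operator is read here as an elementwise maximum over the 1-hop to k-hop feature maps (the abstract does not specify the mixing function, so that choice is illustrative).

```python
import numpy as np

def normalize_adj(a):
    """Symmetric GCN normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def multihop_fusion(a, x, w, k=3):
    """Sketch of weight-shared multihop convolution with a hypothetical
    MIF operator: one weight matrix w is applied to neighbor aggregates
    from 1 to k hops, and the results are fused elementwise."""
    a_norm = normalize_adj(a)
    hop, fused = x, None
    for _ in range(k):
        hop = a_norm @ hop               # move one hop further out
        h = np.maximum(hop @ w, 0.0)     # shared-weight graph convolution
        fused = h if fused is None else np.maximum(fused, h)
    return fused

a = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)  # 4-node path graph
x = np.eye(4)                              # one-hot node features
w = np.full((4, 2), 0.5)                   # single shared weight matrix
out = multihop_fusion(a, x, w, k=3)
```

Because the weight matrix is shared across hop orders, the parameter count stays constant in k, which is the source of the overfitting-avoidance claim.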


Author(s):  
L E Sapozhnikova ◽  
O A Gordeeva

In this article, a method of text classification using a convolutional neural network is presented. The text classification problem is formulated; the architecture and parameters of a convolutional neural network for solving it are described; and the steps of the solution and the classification results are given. The convolutional network was trained to classify the texts of news messages from Internet information portals. Semantic preprocessing of the text and the translation of words into attribute vectors are performed using the open word2vec model. An analysis of the dependence of classification quality on the parameters of the neural network is presented. Using the network yielded a classification accuracy of about 84%. When estimating classification accuracy, the texts were checked for membership in a group of semantically similar classes. This approach made it possible to analyze news messages in cases where the text topics and the number of classification classes in the training and control samples do not match.
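The preprocessing pipeline above, words mapped to attribute vectors and then scanned by a convolutional filter, can be sketched as follows. The vocabulary and random embeddings stand in for the pretrained word2vec model the paper uses; the filter width and pooling are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for a word2vec lookup: in the paper, a pretrained
# word2vec model supplies these attribute vectors.
vocab = {"stocks": 0, "rise": 1, "team": 2, "wins": 3}
embeddings = rng.standard_normal((len(vocab), 8))

def text_to_matrix(tokens):
    """Translate words into attribute (embedding) vectors."""
    return np.stack([embeddings[vocab[t]] for t in tokens])

def conv1d_features(x, filt):
    """Slide a window of filt.shape[0] words over the text and take
    the maximum response (minimal convolution + max-pooling sketch)."""
    win = filt.shape[0]
    responses = [np.sum(x[i:i + win] * filt)
                 for i in range(len(x) - win + 1)]
    return max(responses)

x = text_to_matrix(["stocks", "rise"])
filt = rng.standard_normal((2, 8))   # one hypothetical 2-word filter
score = conv1d_features(x, filt)
```

A classifier would stack many such filters and feed the pooled responses to a dense output layer over the news categories.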


2021 ◽  
pp. 1-13
Author(s):  
Weiqi Gao ◽  
Hao Huang

Graph convolutional networks (GCNs), which can effectively process graph-structured data, have been successfully applied to the text classification task. Existing GCN-based text classification models largely rely on word co-occurrence and Term Frequency-Inverse Document Frequency (TF–IDF) information for graph construction, which to some extent ignores the context information of the texts. To solve this problem, we propose a gating context-aware text classification model that combines Bidirectional Encoder Representations from Transformers (BERT) with a graph convolutional network, named Gating Context GCN (GC-GCN). More specifically, we integrate the graph embedding with the BERT embedding through a GCN with a gating mechanism, which enables context-aware encoding. We carry out text classification experiments to show the effectiveness of the proposed model. Experimental results show that our model obtains improvements of 0.19%, 0.57%, 1.05%, and 1.17% over the Text-GCN baseline on the 20NG, R8, R52, and Ohsumed benchmark datasets, respectively. Furthermore, because word co-occurrence and TF–IDF alone are not suitable for graph construction on short texts, Euclidean distance is combined with word co-occurrence and TF–IDF information, yielding a 1.38% improvement over the Text-GCN baseline on the MR dataset.
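The gating mechanism described above can be sketched as a learned, per-dimension mixture of the two embeddings. This is one plausible reading of the abstract, not GC-GCN's actual formulation; the gate parameterization below is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(graph_emb, bert_emb, w_g, w_b, b):
    """Hypothetical gating sketch: a learned gate decides, per
    dimension, how much of the graph (global) embedding versus the
    BERT (context) embedding to pass through."""
    gate = sigmoid(graph_emb @ w_g + bert_emb @ w_b + b)
    return gate * graph_emb + (1.0 - gate) * bert_emb

rng = np.random.default_rng(2)
d = 6
graph_emb = rng.standard_normal(d)      # GCN output for one document
bert_emb = rng.standard_normal(d)       # BERT output for the same document
w_g = rng.standard_normal((d, d)) * 0.1
w_b = rng.standard_normal((d, d)) * 0.1
b = np.zeros(d)
fused = gated_fusion(graph_emb, bert_emb, w_g, w_b, b)
```

Since the gate lies in (0, 1), each output dimension is a convex combination of the two embeddings, so neither source can be entirely discarded unless the gate saturates.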

