An Integration Model for Text Classification using Graph Convolutional Network and BERT

2021 ◽  
Vol 2137 (1) ◽  
pp. 012052
Author(s):  
Bingxin Xue ◽  
Cui Zhu ◽  
Xuan Wang ◽  
Wenjun Zhu

Abstract Recently, the Graph Convolutional Network (GCN) has been widely used in text classification and has performed well on tasks considered to have a rich relational structure. However, because the adjacency matrix that GCN constructs is sparse, GCN cannot make full use of context-dependent information in text classification and cannot capture local information. Bidirectional Encoder Representations from Transformers (BERT) has been shown to capture the contextual information within a sentence or document, but its ability to capture global information about the vocabulary of a language is relatively limited, and the latter is precisely the strength of GCN. Therefore, this paper proposes Mutual Graph Convolution Networks (MGCN) to solve these problems. MGCN introduces a semantic dictionary (WordNet), dependency parsing, and BERT: dependency parsing addresses the problem of context dependence, while WordNet supplies additional semantic information. The local information generated by BERT and the global information generated by GCN then interact through an attention mechanism, so that the two representations influence each other and improve the model's classification performance. Experimental results show that our model outperforms previously reported results on three text classification datasets.
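The local/global interaction described in this abstract might be sketched as a mutual attention step between BERT-style token vectors and GCN-style node vectors. This is a minimal NumPy illustration under assumed shapes; the function names, the dot-product scoring, and the additive fusion are assumptions for exposition, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_attention(local_feats, global_feats):
    """Let local (BERT-style) and global (GCN-style) token features
    attend to each other, then fuse the two enriched views.

    local_feats, global_feats: (num_tokens, dim) arrays.
    Returns a (num_tokens, 2 * dim) fused representation.
    """
    # Local features attend over global features ...
    scores_lg = local_feats @ global_feats.T          # (n, n)
    local_enriched = softmax(scores_lg) @ global_feats
    # ... and vice versa, so the two views influence each other.
    scores_gl = global_feats @ local_feats.T
    global_enriched = softmax(scores_gl) @ local_feats
    return np.concatenate([local_feats + local_enriched,
                           global_feats + global_enriched], axis=-1)

rng = np.random.default_rng(0)
local = rng.normal(size=(5, 8))   # stand-in for BERT token vectors
glob = rng.normal(size=(5, 8))    # stand-in for GCN node vectors
fused = mutual_attention(local, glob)
print(fused.shape)  # (5, 16)
```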

Author(s):  
Qingyu Yin ◽  
Weinan Zhang ◽  
Yu Zhang ◽  
Ting Liu

Existing approaches for Chinese zero pronoun resolution overlook semantic information. Because zero pronouns carry no descriptive information of their own, it is difficult to explicitly capture their semantic similarity with antecedents. Moreover, when dealing with candidate antecedents, traditional systems exploit only the local information of a single candidate antecedent, failing to consider, from a global perspective, the underlying information provided by the other candidates. To address these weaknesses, we propose a novel zero-pronoun-specific neural network that represents zero pronouns by utilizing contextual information at the semantic level. In addition, when dealing with candidate antecedents, a two-level candidate encoder is employed to explicitly capture both the local and global information of candidate antecedents. We conduct experiments on the Chinese portion of the OntoNotes 5.0 corpus. Experimental results show that our approach substantially outperforms the state-of-the-art method in various experimental settings.
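The two-level candidate encoding idea above can be sketched very simply: keep each candidate's own (local) vector and append a pooled (global) summary of the whole candidate set, so every candidate is scored with knowledge of the others. Mean pooling and concatenation here are illustrative assumptions, not the paper's exact encoder:

```python
import numpy as np

def encode_candidates(candidate_vecs):
    """Two-level encoding of candidate antecedents (a sketch).

    Level 1 (local): each candidate keeps its own vector.
    Level 2 (global): a mean-pooled summary of all candidates is
    appended to every local vector.
    """
    candidate_vecs = np.asarray(candidate_vecs, dtype=float)
    global_summary = candidate_vecs.mean(axis=0)               # (dim,)
    global_tiled = np.tile(global_summary, (len(candidate_vecs), 1))
    return np.concatenate([candidate_vecs, global_tiled], axis=1)

cands = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enc = encode_candidates(cands)
print(enc.shape)  # (3, 4)
```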


2021 ◽  
Vol 2132 (1) ◽  
pp. 012032
Author(s):  
Bing Ai ◽  
Yibing Wang ◽  
Liang Ji ◽  
Jia Yi ◽  
Ting Wang ◽  
...  

Abstract The graph neural network (GNN) handles intricate structure and fuses global information well, so research has explored GNN techniques for text classification. However, earlier models that fix the entire corpus as one graph face problems such as high memory consumption and the inability to modify the graph's construction. We propose an improved GNN-based model to solve these problems. Instead of fixing the entire corpus as a single graph, the model constructs a separate graph for each text. This method reduces memory consumption while still retaining global information. We conduct experiments on the R8, R52, and 20newsgroups datasets, using accuracy as the evaluation metric. The experiments show that, even with lower memory consumption, our model achieves higher accuracy than existing models on multiple text classification datasets.
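Building one small graph per text, rather than one corpus-level graph, is commonly done with a sliding co-occurrence window over the document's tokens. The sketch below is an assumed construction of that kind (window size and count-based edge weights are illustrative choices, not taken from this paper):

```python
from collections import defaultdict

def build_text_graph(tokens, window=3):
    """Build an adjacency dict for a single text: nodes are the
    text's unique tokens, and two tokens are linked when they
    co-occur within a sliding window. Each document gets its own
    small graph, so no corpus-level graph is held in memory.
    """
    edges = defaultdict(int)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            u, v = sorted((w, tokens[j]))
            if u != v:
                edges[(u, v)] += 1   # edge weight = co-occurrence count
    return dict(edges)

g = build_text_graph("the cat sat on the mat".split(), window=3)
print(g[("cat", "the")], g[("sat", "the")])  # 1 2
```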


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 148865-148876
Author(s):  
Hengliang Tang ◽  
Yuan Mi ◽  
Fei Xue ◽  
Yang Cao

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Nanxin Wang ◽  
Libin Yang ◽  
Yu Zheng ◽  
Xiaoyan Cai ◽  
Xin Mei ◽  
...  

Heterogeneous information networks (HINs), which contain various types of nodes and links, have been applied in recommender systems. Although HIN-based recommendation approaches perform better than traditional recommendation approaches, they still have several shortcomings: meta-paths are selected manually rather than automatically; meta-path representations are rarely learned explicitly; and the global and local information of each node in the HIN has not been explored simultaneously. To address these deficiencies, we propose a tri-attention neural network (TANN) model for the recommendation task. The proposed TANN model first applies the stud genetic algorithm to automatically select meta-paths. It then learns global and local representations of each node, as well as representations of the meta-paths existing in the HIN. After that, a tri-attention mechanism is proposed to enhance the mutual influence among users, items, and their related meta-paths. Finally, the encoded interaction information among the user, the item, and their related meta-paths, which contains richer semantic information, is used for recommendation. Extensive experiments on the Douban Movie, MovieLens, and Yelp datasets demonstrate the outstanding performance of the proposed approach.
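One leg of a tri-attention scheme like the one described can be sketched as attention over meta-path representations conditioned on a user-item pair. Everything here (the additive query, dot-product scoring, and toy vectors) is an assumed simplification for illustration, not the TANN formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_meta_paths(user_vec, item_vec, path_vecs):
    """Weight meta-path representations by relevance to a user-item
    pair, then return the attended meta-path summary (a sketch)."""
    query = user_vec + item_vec        # simple fused query (assumed)
    scores = path_vecs @ query         # (num_paths,)
    weights = softmax(scores)
    return weights @ path_vecs         # weighted sum of path vectors

user = np.array([1.0, 0.0])
item = np.array([0.0, 1.0])
paths = np.array([[1.0, 1.0], [0.5, 0.0], [0.0, 0.5]])
summary = attend_meta_paths(user, item, paths)
print(summary.shape)  # (2,)
```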


Author(s):  
Muhammad Zulqarnain ◽  
Rozaida Ghazali ◽  
Yana Mazwin Mohmad Hassim ◽  
Muhammad Rehan

Text classification is a fundamental task in several areas of natural language processing (NLP), including word semantic classification, sentiment analysis, question answering, and dialog management. This paper investigates three basic deep learning architectures for text classification: the Deep Belief Network (DBN), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN); these three main types of deep learning architectures have been widely explored for various classification tasks. DBNs have excellent learning capabilities for extracting highly distinguishable features and are good general-purpose models. CNNs are considered better at extracting the positions of related features, while RNNs model long-term sequential dependencies. This paper presents a systematic comparison of DBN, CNN, and RNN on text classification tasks and reports experimental results for the deep models. The aim of this paper is to provide basic guidance on which deep learning models are best suited to the task of text classification.


2021 ◽  
Vol 16 (1) ◽  
pp. 1-23
Author(s):  
Keyu Yang ◽  
Yunjun Gao ◽  
Lei Liang ◽  
Song Bian ◽  
Lu Chen ◽  
...  

Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost none of the existing models take advantage of human wisdom to help text classification, even though human beings are more capable than machine learning models at understanding and capturing the implicit semantic information in text. In this article, we try to take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design questions and post them on a crowdsourcing platform to extract keywords from text, using sampling and clustering techniques to reduce the cost of crowdsourcing. We then present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.
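Incorporating crowd-extracted keywords as guidance can be sketched as an attention-style pooling that up-weights tokens the annotators marked. The boost factor and pooling scheme below are assumptions for illustration, not the CrowdTC networks themselves:

```python
import numpy as np

def keyword_guided_pool(token_vecs, tokens, keywords, boost=2.0):
    """Pool token vectors into one text vector, up-weighting tokens
    that human annotators marked as keywords (a sketch of injecting
    crowd guidance into a neural text representation)."""
    weights = np.array([boost if t in keywords else 1.0 for t in tokens])
    weights = weights / weights.sum()        # normalize to attention weights
    return weights @ np.asarray(token_vecs)  # weighted average of vectors

vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toks = ["stock", "rose", "market"]
doc_vec = keyword_guided_pool(vecs, toks, keywords={"stock", "market"})
print(doc_vec)  # [0.8 0.6]
```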


2020 ◽  
Vol 21 (S1) ◽  
Author(s):  
Dina Abdelhafiz ◽  
Jinbo Bi ◽  
Reda Ammar ◽  
Clifford Yang ◽  
Sheida Nabavi

Abstract Background Automatic segmentation and localization of lesions in mammogram (MG) images are challenging even with advanced methods such as deep learning (DL). We developed a new model, based on the semantic segmentation U-Net architecture, to precisely segment mass lesions in MG images. The proposed end-to-end convolutional neural network (CNN) based model extracts contextual information by combining low-level and high-level features. We trained the proposed model on large publicly available databases (CBIS-DDSM, BCDR-01, and INbreast) and a private database from the University of Connecticut Health Center (UCHC). Results We compared the performance of the proposed model with those of state-of-the-art DL models, including the fully convolutional network (FCN), SegNet, Dilated-Net, the original U-Net, and Faster R-CNN, as well as the conventional region growing (RG) method. The proposed Vanilla U-Net model significantly outperforms the Faster R-CNN model in terms of runtime and the Intersection over Union (IOU) metric. Trained on digitized film-based and fully digitized MG images, the proposed Vanilla U-Net model achieves a mean test accuracy of 92.6%. The proposed model achieves a mean Dice coefficient index (DI) of 0.951 and a mean IOU of 0.909, which show how close the output segments are to the corresponding lesions in the ground-truth maps. Data augmentation was very effective in our experiments, raising the mean DI from 0.922 to 0.951 and the mean IOU from 0.856 to 0.909. Conclusions The proposed Vanilla U-Net based model can be used for precise segmentation of masses in MG images, because the segmentation process incorporates multi-scale spatial context and captures both local and global context to predict a precise pixel-wise segmentation map of an input full MG image.
These detected maps can help radiologists differentiate benign and malignant lesions depending on lesion shape. We show that using transfer learning, introducing augmentation, and modifying the architecture of the original model yield better performance in terms of mean accuracy, mean DI, and mean IOU in detecting mass lesions compared to the other DL and conventional models.
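The two metrics reported above, the Dice coefficient and Intersection over Union, have standard definitions on binary masks, sketched here (the toy masks are illustrative):

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice coefficient and Intersection-over-Union for binary
    segmentation masks: Dice = 2|A∩B| / (|A| + |B|), IOU = |A∩B| / |A∪B|."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum())
    iou = inter / union
    return dice, iou

pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 3), round(iou, 3))  # 0.667 0.5
```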


2020 ◽  
Author(s):  
Harshvardhan Sikka

One of the popular directions in Deep Learning (DL) research has been to build larger and more complex deep networks that perform well on several different learning tasks, commonly known as multitask learning. This work is usually done within specific domains, e.g. multitask models that perform captioning, translation, and text classification tasks. Some work has been done on building multimodal/crossmodal networks that combine different neural network primitives (convolutional layers, recurrent layers, mixture-of-experts layers, etc.). This paper explores topics and ideas relevant to large, sparse, multitask networks (LSMNs) and explores the potential for a general approach to building and managing them. A framework to automatically build, update, and interpret modular LSMNs is presented in the context of current tooling and theory.


2020 ◽  
Vol 34 (08) ◽  
pp. 13332-13337
Author(s):  
Neil Mallinar ◽  
Abhishek Shah ◽  
Tin Kam Ho ◽  
Rajendra Ugrani ◽  
Ayush Gupta

Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advances in machine teaching, specifically the data programming paradigm, enable training data sets to be created quickly via a general framework for building weak models, also known as labeling functions, and denoising them through ensemble learning techniques. We present a fast, simple data programming method for augmenting text data sets by generating neighborhood-based weak models with minimal supervision. Furthermore, our method employs an iterative procedure to identify sparsely distributed examples in large volumes of unlabeled data. These iterative data programming techniques improve the newer weak models as more labeled data is confirmed with a human in the loop. We show empirical results on sentence classification tasks, including a task of improving intent recognition in conversational agents.
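A neighborhood-based weak model of the kind described can be sketched as a nearest-neighbor labeling function that copies the label of the most similar seed example, or abstains when nothing is close enough, as labeling functions in data programming are allowed to do. The cosine similarity and threshold here are assumed choices, not the paper's exact method:

```python
import numpy as np

def neighborhood_label(example_vec, labeled_vecs, labels, threshold=0.8):
    """Neighborhood-based weak labeling function (sketch): return the
    label of the nearest labeled example if its cosine similarity
    exceeds the threshold, otherwise abstain (return None)."""
    labeled_vecs = np.asarray(labeled_vecs, dtype=float)
    v = np.asarray(example_vec, dtype=float)
    sims = labeled_vecs @ v / (
        np.linalg.norm(labeled_vecs, axis=1) * np.linalg.norm(v))
    best = int(np.argmax(sims))
    return labels[best] if sims[best] >= threshold else None

seeds = [[1.0, 0.0], [0.0, 1.0]]
seed_labels = ["greeting", "farewell"]
print(neighborhood_label([0.9, 0.1], seeds, seed_labels))  # greeting
print(neighborhood_label([0.7, 0.7], seeds, seed_labels))  # None
```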

