An Integration Model for Text Classification using Graph Convolutional Network and BERT

2021 ◽  
Vol 2137 (1) ◽  
pp. 012052
Author(s):  
Bingxin Xue ◽  
Cui Zhu ◽  
Xuan Wang ◽  
Wenjun Zhu

Abstract Recently, the Graph Convolutional Network (GCN) has been widely used in text classification and has performed well on tasks considered to have a rich relational structure. However, because the adjacency matrix that GCN constructs is sparse, GCN cannot make full use of context-dependent information in text classification and cannot capture local information. Bidirectional Encoder Representations from Transformers (BERT) has been shown to capture the contextual information within a sentence or document, but its ability to capture global information about the vocabulary of a language is relatively limited, and the latter is precisely the strength of GCN. Therefore, this paper proposes Mutual Graph Convolution Networks (MGCN) to solve these problems. MGCN introduces a semantic dictionary (WordNet), dependency parsing, and BERT: dependency parsing addresses the problem of context dependence, while WordNet supplies additional semantic information. The local information generated by BERT and the global information generated by GCN then interact through an attention mechanism, so that the two representations influence each other and improve the model's classification performance. Experimental results show that our model outperforms previously reported results on three text classification datasets.
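The local/global interaction described in this abstract might be sketched as a mutual attention step between BERT-style token vectors and GCN-style node vectors. This is a minimal NumPy illustration under assumed shapes; the function names, the dot-product scoring, and the additive fusion are assumptions for exposition, not the authors' implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_attention(local_feats, global_feats):
    """Let local (BERT-style) and global (GCN-style) token features
    attend to each other, then fuse the two enriched views.

    local_feats, global_feats: (num_tokens, dim) arrays.
    Returns a (num_tokens, 2 * dim) fused representation.
    """
    # Local features attend over global features ...
    scores_lg = local_feats @ global_feats.T          # (n, n)
    local_enriched = softmax(scores_lg) @ global_feats
    # ... and vice versa, so the two views influence each other.
    scores_gl = global_feats @ local_feats.T
    global_enriched = softmax(scores_gl) @ local_feats
    return np.concatenate([local_feats + local_enriched,
                           global_feats + global_enriched], axis=-1)

rng = np.random.default_rng(0)
local = rng.normal(size=(5, 8))   # stand-in for BERT token vectors
glob = rng.normal(size=(5, 8))    # stand-in for GCN node vectors
fused = mutual_attention(local, glob)
print(fused.shape)  # (5, 16)
```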

Author(s):  
Qingyu Yin ◽  
Weinan Zhang ◽  
Yu Zhang ◽  
Ting Liu

Existing approaches for Chinese zero pronoun resolution overlook semantic information. Because zero pronouns carry no descriptive information of their own, it is difficult to explicitly capture their semantic similarity with antecedents. Moreover, when dealing with candidate antecedents, traditional systems exploit only the local information of a single candidate antecedent, failing to consider, from a global perspective, the underlying information provided by the other candidates. To address these weaknesses, we propose a novel zero-pronoun-specific neural network that represents zero pronouns by utilizing contextual information at the semantic level. In addition, when dealing with candidate antecedents, a two-level candidate encoder is employed to explicitly capture both the local and global information of candidate antecedents. We conduct experiments on the Chinese portion of the OntoNotes 5.0 corpus. Experimental results show that our approach substantially outperforms the state-of-the-art method in various experimental settings.
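The two-level candidate encoding idea above can be sketched very simply: keep each candidate's own (local) vector and append a pooled (global) summary of the whole candidate set, so every candidate is scored with knowledge of the others. Mean pooling and concatenation here are illustrative assumptions, not the paper's exact encoder:

```python
import numpy as np

def encode_candidates(candidate_vecs):
    """Two-level encoding of candidate antecedents (a sketch).

    Level 1 (local): each candidate keeps its own vector.
    Level 2 (global): a mean-pooled summary of all candidates is
    appended to every local vector.
    """
    candidate_vecs = np.asarray(candidate_vecs, dtype=float)
    global_summary = candidate_vecs.mean(axis=0)               # (dim,)
    global_tiled = np.tile(global_summary, (len(candidate_vecs), 1))
    return np.concatenate([candidate_vecs, global_tiled], axis=1)

cands = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enc = encode_candidates(cands)
print(enc.shape)  # (3, 4)
```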


2021 ◽  
Vol 2132 (1) ◽  
pp. 012032
Author(s):  
Bing Ai ◽  
Yibing Wang ◽  
Liang Ji ◽  
Jia Yi ◽  
Ting Wang ◽  
...  

Abstract The graph neural network (GNN) handles intricate structure and fuses global information well, so research has explored GNN techniques for text classification. However, earlier models that fix the entire corpus as one graph face problems such as high memory consumption and the inability to modify the graph's construction. We propose an improved GNN-based model to solve these problems. Instead of fixing the entire corpus as a single graph, the model constructs a separate graph for each text. This method reduces memory consumption while still retaining global information. We conduct experiments on the R8, R52, and 20newsgroups datasets, using accuracy as the evaluation metric. The experiments show that, even with lower memory consumption, our model achieves higher accuracy than existing models on multiple text classification datasets.
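Building one small graph per text, rather than one corpus-level graph, is commonly done with a sliding co-occurrence window over the document's tokens. The sketch below is an assumed construction of that kind (window size and count-based edge weights are illustrative choices, not taken from this paper):

```python
from collections import defaultdict

def build_text_graph(tokens, window=3):
    """Build an adjacency dict for a single text: nodes are the
    text's unique tokens, and two tokens are linked when they
    co-occur within a sliding window. Each document gets its own
    small graph, so no corpus-level graph is held in memory.
    """
    edges = defaultdict(int)
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            u, v = sorted((w, tokens[j]))
            if u != v:
                edges[(u, v)] += 1   # edge weight = co-occurrence count
    return dict(edges)

g = build_text_graph("the cat sat on the mat".split(), window=3)
print(g[("cat", "the")], g[("sat", "the")])  # 1 2
```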


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 148865-148876
Author(s):  
Hengliang Tang ◽  
Yuan Mi ◽  
Fei Xue ◽  
Yang Cao

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Nanxin Wang ◽  
Libin Yang ◽  
Yu Zheng ◽  
Xiaoyan Cai ◽  
Xin Mei ◽  
...  

Heterogeneous information networks (HINs), which contain various types of nodes and links, have been applied in recommender systems. Although HIN-based recommendation approaches perform better than traditional recommendation approaches, they still have several shortcomings: meta-paths are selected manually rather than automatically; meta-path representations are rarely learned explicitly; and the global and local information of each node in the HIN has not been explored simultaneously. To address these deficiencies, we propose a tri-attention neural network (TANN) model for the recommendation task. The proposed TANN model first applies the stud genetic algorithm to automatically select meta-paths. It then learns global and local representations of each node, as well as representations of the meta-paths existing in the HIN. After that, a tri-attention mechanism is proposed to enhance the mutual influence among users, items, and their related meta-paths. Finally, the encoded interaction information among the user, the item, and their related meta-paths, which contains richer semantic information, is used for recommendation. Extensive experiments on the Douban Movie, MovieLens, and Yelp datasets demonstrate the outstanding performance of the proposed approach.
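One leg of a tri-attention scheme like the one described can be sketched as attention over meta-path representations conditioned on a user-item pair. Everything here (the additive query, dot-product scoring, and toy vectors) is an assumed simplification for illustration, not the TANN formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_meta_paths(user_vec, item_vec, path_vecs):
    """Weight meta-path representations by relevance to a user-item
    pair, then return the attended meta-path summary (a sketch)."""
    query = user_vec + item_vec        # simple fused query (assumed)
    scores = path_vecs @ query         # (num_paths,)
    weights = softmax(scores)
    return weights @ path_vecs         # weighted sum of path vectors

user = np.array([1.0, 0.0])
item = np.array([0.0, 1.0])
paths = np.array([[1.0, 1.0], [0.5, 0.0], [0.0, 0.5]])
summary = attend_meta_paths(user, item, paths)
print(summary.shape)  # (2,)
```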


Author(s):  
Muhammad Zulqarnain ◽  
Rozaida Ghazali ◽  
Yana Mazwin Mohmad Hassim ◽  
Muhammad Rehan

Text classification is a fundamental task in several areas of natural language processing (NLP), including word semantic classification, sentiment analysis, question answering, and dialog management. This paper investigates three basic deep learning architectures for text classification: the Deep Belief Network (DBN), the Convolutional Neural Network (CNN), and the Recurrent Neural Network (RNN); these three main types of deep learning architectures have been widely explored for various classification tasks. DBNs have excellent learning capabilities for extracting highly distinguishable features and are good general-purpose models. CNNs are considered better at extracting the positions of related features, while RNNs model long-term sequential dependencies. This paper presents a systematic comparison of DBN, CNN, and RNN on text classification tasks and reports experimental results for the deep models. The aim of this paper is to provide basic guidance on which deep learning models are best suited to the task of text classification.


2021 ◽  
Vol 16 (1) ◽  
pp. 1-23
Author(s):  
Keyu Yang ◽  
Yunjun Gao ◽  
Lei Liang ◽  
Song Bian ◽  
Lu Chen ◽  
...  

Text classification is a fundamental task in content analysis. Nowadays, deep learning has demonstrated promising performance in text classification compared with shallow models. However, almost none of the existing models take advantage of human wisdom to help text classification, even though human beings are more capable than machine learning models at understanding and capturing the implicit semantic information in text. In this article, we try to take guidance from human beings to classify text. We propose Crowd-powered learning for Text Classification (CrowdTC for short). We design questions and post them on a crowdsourcing platform to extract keywords from text, using sampling and clustering techniques to reduce the cost of crowdsourcing. We then present an attention-based neural network and a hybrid neural network to incorporate the extracted keywords as human guidance into deep neural networks. Extensive experiments on public datasets confirm that CrowdTC improves the text classification accuracy of neural networks by using the crowd-powered keyword guidance.
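Incorporating crowd-extracted keywords as guidance can be sketched as an attention-style pooling that up-weights tokens the annotators marked. The boost factor and pooling scheme below are assumptions for illustration, not the CrowdTC networks themselves:

```python
import numpy as np

def keyword_guided_pool(token_vecs, tokens, keywords, boost=2.0):
    """Pool token vectors into one text vector, up-weighting tokens
    that human annotators marked as keywords (a sketch of injecting
    crowd guidance into a neural text representation)."""
    weights = np.array([boost if t in keywords else 1.0 for t in tokens])
    weights = weights / weights.sum()        # normalize to attention weights
    return weights @ np.asarray(token_vecs)  # weighted average of vectors

vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toks = ["stock", "rose", "market"]
doc_vec = keyword_guided_pool(vecs, toks, keywords={"stock", "market"})
print(doc_vec)  # [0.8 0.6]
```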


2020 ◽  
Vol 21 (S1) ◽  
Author(s):  
Dina Abdelhafiz ◽  
Jinbo Bi ◽  
Reda Ammar ◽  
Clifford Yang ◽  
Sheida Nabavi

Abstract Background Automatic segmentation and localization of lesions in mammogram (MG) images are challenging even with advanced methods such as deep learning (DL). We developed a new model, based on the semantic segmentation U-Net architecture, to precisely segment mass lesions in MG images. The proposed end-to-end convolutional neural network (CNN) based model extracts contextual information by combining low-level and high-level features. We trained the proposed model on large publicly available databases (CBIS-DDSM, BCDR-01, and INbreast) and a private database from the University of Connecticut Health Center (UCHC). Results We compared the performance of the proposed model with those of state-of-the-art DL models, including the fully convolutional network (FCN), SegNet, Dilated-Net, the original U-Net, and Faster R-CNN, as well as the conventional region growing (RG) method. The proposed Vanilla U-Net model significantly outperforms the Faster R-CNN model in terms of runtime and the Intersection over Union (IOU) metric. Trained on digitized film-based and fully digitized MG images, the proposed Vanilla U-Net model achieves a mean test accuracy of 92.6%. The proposed model achieves a mean Dice coefficient index (DI) of 0.951 and a mean IOU of 0.909, which show how close the output segments are to the corresponding lesions in the ground-truth maps. Data augmentation was very effective in our experiments, raising the mean DI from 0.922 to 0.951 and the mean IOU from 0.856 to 0.909. Conclusions The proposed Vanilla U-Net based model can be used for precise segmentation of masses in MG images, because the segmentation process incorporates multi-scale spatial context and captures both local and global context to predict a precise pixel-wise segmentation map of an input full MG image.
These detected maps can help radiologists differentiate benign and malignant lesions depending on lesion shape. We show that using transfer learning, introducing augmentation, and modifying the architecture of the original model yield better performance in terms of mean accuracy, mean DI, and mean IOU in detecting mass lesions compared to the other DL and conventional models.
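The two metrics reported above, the Dice coefficient and Intersection over Union, have standard definitions on binary masks, sketched here (the toy masks are illustrative):

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice coefficient and Intersection-over-Union for binary
    segmentation masks: Dice = 2|A∩B| / (|A| + |B|), IOU = |A∩B| / |A∪B|."""
    pred, target = np.asarray(pred, bool), np.asarray(target, bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum())
    iou = inter / union
    return dice, iou

pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 3), round(iou, 3))  # 0.667 0.5
```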


2020 ◽  
Author(s):  
Harshvardhan Sikka

One of the popular directions in Deep Learning (DL) research has been to build larger and more complex deep networks that perform well on several different learning tasks, commonly known as multitask learning. This work is usually done within specific domains, e.g. multitask models that perform captioning, translation, and text classification tasks. Some work has been done on building multimodal/crossmodal networks that combine different neural network primitives (convolutional layers, recurrent layers, mixture-of-experts layers, etc.). This paper explores topics and ideas relevant to large, sparse, multitask networks (LSMNs) and explores the potential for a general approach to building and managing them. A framework to automatically build, update, and interpret modular LSMNs is presented in the context of current tooling and theory.


2020 ◽  
Vol 34 (08) ◽  
pp. 13332-13337
Author(s):  
Neil Mallinar ◽  
Abhishek Shah ◽  
Tin Kam Ho ◽  
Rajendra Ugrani ◽  
Ayush Gupta

Real-world text classification tasks often require many labeled training examples that are expensive to obtain. Recent advances in machine teaching, specifically the data programming paradigm, enable training data sets to be created quickly via a general framework for building weak models, also known as labeling functions, and denoising them through ensemble learning techniques. We present a fast, simple data programming method for augmenting text data sets by generating neighborhood-based weak models with minimal supervision. Furthermore, our method employs an iterative procedure to identify sparsely distributed examples in large volumes of unlabeled data. These iterative data programming techniques improve the newer weak models as more labeled data is confirmed with a human in the loop. We show empirical results on sentence classification tasks, including a task of improving intent recognition in conversational agents.
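A neighborhood-based weak model of the kind described can be sketched as a nearest-neighbor labeling function that copies the label of the most similar seed example, or abstains when nothing is close enough, as labeling functions in data programming are allowed to do. The cosine similarity and threshold here are assumed choices, not the paper's exact method:

```python
import numpy as np

def neighborhood_label(example_vec, labeled_vecs, labels, threshold=0.8):
    """Neighborhood-based weak labeling function (sketch): return the
    label of the nearest labeled example if its cosine similarity
    exceeds the threshold, otherwise abstain (return None)."""
    labeled_vecs = np.asarray(labeled_vecs, dtype=float)
    v = np.asarray(example_vec, dtype=float)
    sims = labeled_vecs @ v / (
        np.linalg.norm(labeled_vecs, axis=1) * np.linalg.norm(v))
    best = int(np.argmax(sims))
    return labels[best] if sims[best] >= threshold else None

seeds = [[1.0, 0.0], [0.0, 1.0]]
seed_labels = ["greeting", "farewell"]
print(neighborhood_label([0.9, 0.1], seeds, seed_labels))  # greeting
print(neighborhood_label([0.7, 0.7], seeds, seed_labels))  # None
```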

