Deep Short Text Classification with Knowledge Powered Attention

Short text classification is one of important tasks in Natural Language Processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous since they have not enough contextual information, which poses a great challenge for classification. In this paper, we retrieve knowledge from external knowledge source to enhance the semantic representation of short texts. We take conceptual information as a kind of knowledge and incorporate it into deep neural networks. For the purpose of measuring the importance of knowledge, we introduce attention mechanisms and propose deep Short Text Classification with Knowledge powered Attention (STCKA). We utilize Concept towards Short Text (CST) attention and Concept towards Concept Set (C-CS) attention to acquire the weight of concepts from two aspects. And we classify a short text with the help of conceptual information. Unlike traditional approaches, our model acts like a human being who has intrinsic ability to make decisions based on observation (i.e., training data for machines) and pays more attention to important knowledge. We also conduct extensive experiments on four public datasets for different tasks. The experimental results and case studies show that our model outperforms the state-of-the-art methods, justifying the effectiveness of knowledge powered attention.

Download Full-text

Joint Representations of Texts and Labels with Compositional Loss for Short Text Classification

Journal of Web Engineering ◽

10.13052/jwe1540-9589.2035 ◽

2021 ◽

Author(s):

Ming Hao ◽

Weijing Wang ◽

Fang Zhou

Keyword(s):

Language Processing ◽

Text Classification ◽

Ground Truth ◽

Language Models ◽

Text Representation ◽

Short Text ◽

Practical Applications ◽

Classical Models ◽

Multi Class Classification ◽

Public Datasets

Short text classification is an important foundation for natural language processing (NLP) tasks. Though, the text classification based on deep language models (DLMs) has made a significant headway, in practical applications however, some texts are ambiguous and hard to classify in multi-class classification especially, for short texts whose context length is limited. The mainstream method improves the distinction of ambiguous text by adding context information. However, these methods rely only the text representation, and ignore that the categories overlap and are not completely independent of each other. In this paper, we establish a new general method to solve the problem of ambiguous text classification by introducing label embedding to represent each category, which makes measurable difference between the categories. Further, a new compositional loss function is proposed to train the model, which makes the text representation closer to the ground-truth label and farther away from others. Finally, a constraint is obtained by calculating the similarity between the text representation and label embedding. Errors caused by ambiguous text can be corrected by adding constraints to the output layer of the model. We apply the method to three classical models and conduct experiments on six public datasets. Experiments show that our method can effectively improve the classification accuracy of the ambiguous texts. In addition, combining our method with BERT, we obtain the state-of-the-art results on the CNT dataset.

Download Full-text

A Classification Model of Legal Consulting Questions Based on Multi-Attention Prototypical Networks

International Journal of Computational Intelligence Systems ◽

10.1007/s44196-021-00053-6 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Jianzhou Feng ◽

Jinman Cui ◽

Qikai Wei ◽

Zhengji Zhou ◽

Yuxiong Wang

Keyword(s):

Supervised Learning ◽

Language Processing ◽

Text Classification ◽

Question Answering ◽

Training Data ◽

Classification Model ◽

Great Progress ◽

Public Datasets ◽

The Cost

AbstractText classification is a research hotspot in the field of natural language processing. Existing text classification models based on supervised learning, especially deep learning models, have made great progress on public datasets. But most of these methods rely on a large amount of training data, and these datasets coverage is limited. In the legal intelligent question-answering system, accurate classification of legal consulting questions is a necessary prerequisite for the realization of intelligent question answering. However, due to lack of sufficient annotation data and the cost of labeling is high, which lead to the poor effect of traditional supervised learning methods under sparse labeling. In response to the above problems, we construct a few-shot legal consulting questions dataset, and propose a prototypical networks model based on multi-attention. For the same category of instances, this model first highlights the key features in the instances as much as possible through instance-dimension level attention. Then it realizes the classification of legal consulting questions by prototypical networks. Experimental results show that our model achieves state-of-the-art results compared with baseline models. The code and dataset are released on https://github.com/cjm0824/MAPN.

Download Full-text

A Sequential Graph Neural Network for Short Text Classification

Algorithms ◽

10.3390/a14120352 ◽

2021 ◽

Vol 14 (12) ◽

pp. 352

Author(s):

Ke Zhao ◽

Lan Huang ◽

Rui Song ◽

Qiang Shen ◽

Hao Xu

Keyword(s):

Language Processing ◽

Text Classification ◽

Short Term Memory ◽

Contextual Information ◽

Extended Model ◽

Short Text ◽

Convolutional Networks ◽

Word Representation ◽

Benchmark Datasets ◽

Sequential Information

Short text classification is an important problem of natural language processing (NLP), and graph neural networks (GNNs) have been successfully used to solve different NLP problems. However, few studies employ GNN for short text classification, and most of the existing graph-based models ignore sequential information (e.g., word orders) in each document. In this work, we propose an improved sequence-based feature propagation scheme, which fully uses word representation and document-level word interaction and overcomes the limitations of textual features in short texts. On this basis, we utilize this propagation scheme to construct a lightweight model, sequential GNN (SGNN), and its extended model, ESGNN. Specifically, we build individual graphs for each document in the short text corpus based on word co-occurrence and use a bidirectional long short-term memory network (Bi-LSTM) to extract the sequential features of each document; therefore, word nodes in the document graph retain contextual information. Furthermore, two different simplified graph convolutional networks (GCNs) are used to learn word representations based on their local structures. Finally, word nodes combined with sequential information and local information are incorporated as the document representation. Extensive experiments on seven benchmark datasets demonstrate the effectiveness of our method.

Download Full-text

Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention

Computational Intelligence and Neuroscience ◽

10.1155/2021/9425655 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Sunil Kumar Prabhakar ◽

Dong-Ok Won

Keyword(s):

Deep Learning ◽

Language Processing ◽

Text Classification ◽

Patient Information ◽

Classification Accuracy ◽

Learning Model ◽

Training Data ◽

Machine Learning Techniques ◽

Medical Text ◽

Deep Learning Model

To unlock information present in clinical description, automatic medical text classification is highly useful in the arena of natural language processing (NLP). For medical text classification tasks, machine learning techniques seem to be quite effective; however, it requires extensive effort from human side, so that the labeled training data can be created. For clinical and translational research, a huge quantity of detailed patient information, such as disease status, lab tests, medication history, side effects, and treatment outcomes, has been collected in an electronic format, and it serves as a valuable data source for further analysis. Therefore, a huge quantity of detailed patient information is present in the medical text, and it is quite a huge challenge to process it efficiently. In this work, a medical text classification paradigm, using two novel deep learning architectures, is proposed to mitigate the human efforts. The first approach is that a quad channel hybrid long short-term memory (QC-LSTM) deep learning model is implemented utilizing four channels, and the second approach is that a hybrid bidirectional gated recurrent unit (BiGRU) deep learning model with multihead attention is developed and implemented successfully. The proposed methodology is validated on two medical text datasets, and a comprehensive analysis is conducted. The best results in terms of classification accuracy of 96.72% is obtained with the proposed QC-LSTM deep learning model, and a classification accuracy of 95.76% is obtained with the proposed hybrid BiGRU deep learning model.

Download Full-text

Explicit Interaction Model towards Text Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016359 ◽

2019 ◽

Vol 33 ◽

pp. 6359-6366 ◽

Cited By ~ 3

Author(s):

Cunxiao Du ◽

Zhaozheng Chen ◽

Fuli Feng ◽

Lei Zhu ◽

Tian Gan ◽

...

Keyword(s):

Language Processing ◽

Text Classification ◽

Deep Neural Networks ◽

Interaction Mechanism ◽

Interaction Model ◽

Classification Task ◽

Fine Grained ◽

Word Level ◽

Benchmark Datasets ◽

Classification Tasks

Text classification is one of the fundamental tasks in natural language processing. Recently, deep neural networks have achieved promising performance in the text classification task compared to shallow models. Despite of the significance of deep models, they ignore the fine-grained (matching signals between words and classes) classification clues since their classifications mainly rely on the text-level representations. To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task. In particular, we design a novel framework, EXplicit interAction Model (dubbed as EXAM), equipped with the interaction mechanism. We justified the proposed approach on several benchmark datasets including both multilabel and multi-class text classification tasks. Extensive experimental results demonstrate the superiority of the proposed method. As a byproduct, we have released the codes and parameter settings to facilitate other researches.

Download Full-text

Towards Robust Text Classification with Semantics-Aware Recurrent Neural Architecture

Machine Learning and Knowledge Extraction ◽

10.3390/make1020034 ◽

2019 ◽

Vol 1 (2) ◽

pp. 575-589 ◽

Cited By ~ 1

Author(s):

Blaž Škrlj ◽

Jan Kralj ◽

Nada Lavrač ◽

Senja Pollak

Keyword(s):

Text Mining ◽

Language Processing ◽

Text Classification ◽

Deep Neural Networks ◽

Semantic Knowledge ◽

Text Documents ◽

Neural Architecture ◽

Classification Tasks ◽

And Gender ◽

Semantic Resources

Deep neural networks are becoming ubiquitous in text mining and natural language processing, but semantic resources, such as taxonomies and ontologies, are yet to be fully exploited in a deep learning setting. This paper presents an efficient semantic text mining approach, which converts semantic information related to a given set of documents into a set of novel features that are used for learning. The proposed Semantics-aware Recurrent deep Neural Architecture (SRNA) enables the system to learn simultaneously from the semantic vectors and from the raw text documents. We test the effectiveness of the approach on three text classification tasks: news topic categorization, sentiment analysis and gender profiling. The experiments show that the proposed approach outperforms the approach without semantic knowledge, with highest accuracy gain (up to 10%) achieved on short document fragments.

Download Full-text

Cognitive Aspects-Based Short Text Representation with Named Entity, Concept and Knowledge

Applied Sciences ◽

10.3390/app10144893 ◽

2020 ◽

Vol 10 (14) ◽

pp. 4893 ◽

Cited By ~ 1

Author(s):

Wenfeng Hou ◽

Qing Liu ◽

Longbing Cao

Keyword(s):

Text Classification ◽

Semantic Representation ◽

Semantic Features ◽

Text Representation ◽

Short Text ◽

Individual Level ◽

Named Entity ◽

Concept Knowledge ◽

Representation Model ◽

Multi Level

Short text is widely seen in applications including Internet of Things (IoT). The appropriate representation and classification of short text could be severely disrupted by the sparsity and shortness of short text. One important solution is to enrich short text representation by involving cognitive aspects of text, including semantic concept, knowledge, and category. In this paper, we propose a named Entity-based Concept Knowledge-Aware (ECKA) representation model which incorporates semantic information into short text representation. ECKA is a multi-level short text semantic representation model, which extracts the semantic features from the word, entity, concept and knowledge levels by CNN, respectively. Since word, entity, concept and knowledge entity in the same short text have different cognitive informativeness for short text classification, attention networks are formed to capture these category-related attentive representations from the multi-level textual features, respectively. The final multi-level semantic representations are formed by concatenating all of these individual-level representations, which are used for text classification. Experiments on three tasks demonstrate our method significantly outperforms the state-of-the-art methods.

Download Full-text

Deep Neural Networks Techniques using for Learning Automata Based Incremental Learning Method

Turkish Journal of Computer and Mathematics Education (TURCOMAT) ◽

10.17762/turcomat.v12i6.1268 ◽

2021 ◽

Vol 12 (6) ◽

pp. 69-73

Author(s):

C. Swetha Reddy Et.al

Keyword(s):

Neural Networks ◽

Language Processing ◽

Visual Recognition ◽

Deep Neural Networks ◽

Learning Automata ◽

Training Data ◽

Neuron Models ◽

Effectiveness Of Training ◽

Learning Machine ◽

Incremental Learning Method

Surprisingly comprehensive learning methods are implemented in many large learning machine data, such as visual recognition and visual language processing. Much of the success of advanced training in recent years is due to leadership training, which requires a set of information for specific tasks, before such training. However, in reality, selected tasks related to personal study are gradually accumulated over time as it is difficult to collect and submit training data manually. It provides a way to continue learning some information columns and examples of steps that are specific to the new class and called additional learning. In this post, we recommend the best machine training method for further training for deep neural networks. The basic idea is to learn a deep system with strong connections that can be "activated" or "turned off" at different stages. The approach you suggest allows you to reduce the distribution of old services as you learn new for example new training, which increases the effectiveness of training in the additional training phase. Experiments with MNIST and CIFAR-100 show that our approach can be implemented in other long-term phases in deep neuron models and achieve better results from zero-base training.

Download Full-text

Efficient processing of GRU based on word embedding for text classification

JOIV International Journal on Informatics Visualization ◽

10.30630/joiv.3.4.289 ◽

2019 ◽

Vol 3 (4) ◽

Cited By ~ 2

Author(s):

Muhammad Zulqarnain ◽

Rozaida Ghazali ◽

Muhammad Ghulam Ghouse ◽

Muhammad Faheem Mushtaq

Keyword(s):

Language Processing ◽

Text Classification ◽

Classification Performance ◽

Word Embedding ◽

Training Data ◽

Superior Performance ◽

Sequential Data ◽

Online Data ◽

Benchmark Datasets ◽

Recurrent Architecture

Text classification has become very serious problem for big organization to manage the large amount of online data and has been extensively applied in the tasks of Natural Language Processing (NLP). Text classification can support users to excellently manage and exploit meaningful information require to be classified into various categories for further use. In order to best classify texts, our research efforts to develop a deep learning approach which obtains superior performance in text classification than other RNNs approaches. However, the main problem in text classification is how to enhance the classification accuracy and the sparsity of the data semantics sensitivity to context often hinders the classification performance of texts. In order to overcome the weakness, in this paper we proposed unified structure to investigate the effects of word embedding and Gated Recurrent Unit (GRU) for text classification on two benchmark datasets included (Google snippets and TREC). GRU is a well-known type of recurrent neural network (RNN), which is ability of computing sequential data over its recurrent architecture. Experimentally, the semantically connected words are commonly near to each other in embedding spaces. First, words in posts are changed into vectors via word embedding technique. Then, the words sequential in sentences are fed to GRU to extract the contextual semantics between words. The experimental results showed that proposed GRU model can effectively learn the word usage in context of texts provided training data. The quantity and quality of training data significantly affected the performance. We evaluated the performance of proposed approach with traditional recurrent approaches, RNN, MV-RNN and LSTM, the proposed approach is obtained better results on two benchmark datasets in the term of accuracy and error rate.

Download Full-text

IDC: Quantitative Evaluation Benchmark of Interpretation Methods for Deep Text Classification Models

10.21203/rs.3.rs-960359/v1 ◽

2021 ◽

Author(s):

Mohammed Khaleel ◽

Lei Qi ◽

Wallapak Tavanapong ◽

Johnny Wong ◽

Adisak Sukul ◽

...

Keyword(s):

Quantitative Evaluation ◽

Language Processing ◽

Text Classification ◽

Deep Neural Networks ◽

Performance Metrics ◽

Ground Truth ◽

Decision Making Process ◽

Research Attention ◽

Prediction Confidence ◽

Insight Into

Abstract Recent advances in deep neural networks have achieved outstanding success in natural language processing. Due to the success and the black-box nature of the deep models, interpretation methods that provide insight into the decision-making process of the models have received an influx of research attention. However, there is no quantitative evaluation comparing interpretation methods for text classification other than observing classification accuracy or prediction confidence when important word grams are removed. This is due to the lack of interpretation ground truth. Manual labeling of a large interpretation ground truth is time-consuming. We propose IDC, a new benchmark for quantitative evaluation of I nterpretation methods for D eep text C lassification models. IDC consists of three methods that take existing text classification ground truth and generate three corresponding pseudo-interpretation ground truth datasets. We propose to use interpretation recall, interpretation precision, and Cohen’s kappa inter-agreement as performance metrics. We used the pseudo ground truth datasets and the metrics to evaluate six interpretation methods.

Download Full-text