Explicit Interaction Model towards Text Classification

Text classification is one of the fundamental tasks in natural language processing. Recently, deep neural networks have achieved promising performance in the text classification task compared to shallow models. Despite of the significance of deep models, they ignore the fine-grained (matching signals between words and classes) classification clues since their classifications mainly rely on the text-level representations. To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task. In particular, we design a novel framework, EXplicit interAction Model (dubbed as EXAM), equipped with the interaction mechanism. We justified the proposed approach on several benchmark datasets including both multilabel and multi-class text classification tasks. Extensive experimental results demonstrate the superiority of the proposed method. As a byproduct, we have released the codes and parameter settings to facilitate other researches.

Download Full-text

Towards Robust Text Classification with Semantics-Aware Recurrent Neural Architecture

Machine Learning and Knowledge Extraction ◽

10.3390/make1020034 ◽

2019 ◽

Vol 1 (2) ◽

pp. 575-589 ◽

Cited By ~ 1

Author(s):

Blaž Škrlj ◽

Jan Kralj ◽

Nada Lavrač ◽

Senja Pollak

Keyword(s):

Text Mining ◽

Language Processing ◽

Text Classification ◽

Deep Neural Networks ◽

Semantic Knowledge ◽

Text Documents ◽

Neural Architecture ◽

Classification Tasks ◽

And Gender ◽

Semantic Resources

Deep neural networks are becoming ubiquitous in text mining and natural language processing, but semantic resources, such as taxonomies and ontologies, are yet to be fully exploited in a deep learning setting. This paper presents an efficient semantic text mining approach, which converts semantic information related to a given set of documents into a set of novel features that are used for learning. The proposed Semantics-aware Recurrent deep Neural Architecture (SRNA) enables the system to learn simultaneously from the semantic vectors and from the raw text documents. We test the effectiveness of the approach on three text classification tasks: news topic categorization, sentiment analysis and gender profiling. The experiments show that the proposed approach outperforms the approach without semantic knowledge, with highest accuracy gain (up to 10%) achieved on short document fragments.

Download Full-text

Universal Adversarial Attack via Conditional Sampling for Text Classification

Applied Sciences ◽

10.3390/app11209539 ◽

2021 ◽

Vol 11 (20) ◽

pp. 9539

Author(s):

Yu Zhang ◽

Kun Shao ◽

Junan Yang ◽

Hui Liu

Keyword(s):

Language Processing ◽

Text Classification ◽

Deep Neural Networks ◽

High Quality ◽

The Face ◽

Adversarial Examples ◽

Specific Prediction ◽

Novel Method ◽

Adversarial Attack ◽

Classification Tasks

Despite deep neural networks (DNNs) having achieved impressive performance in various domains, it has been revealed that DNNs are vulnerable in the face of adversarial examples, which are maliciously crafted by adding human-imperceptible perturbations to an original sample to cause the wrong output by the DNNs. Encouraged by numerous researches on adversarial examples for computer vision, there has been growing interest in designing adversarial attacks for Natural Language Processing (NLP) tasks. However, the adversarial attacking for NLP is challenging because text is discrete data and a small perturbation can bring a notable shift to the original input. In this paper, we propose a novel method, based on conditional BERT sampling with multiple standards, for generating universal adversarial perturbations: input-agnostic of words that can be concatenated to any input in order to produce a specific prediction. Our universal adversarial attack can create an appearance closer to natural phrases and yet fool sentiment classifiers when added to benign inputs. Based on automatic detection metrics and human evaluations, the adversarial attack we developed dramatically reduces the accuracy of the model on classification tasks, and the trigger is less easily distinguished from natural text. Experimental results demonstrate that our method crafts more high-quality adversarial examples as compared to baseline methods. Further experiments show that our method has high transferability. Our goal is to prove that adversarial attacks are more difficult to detect than previously thought and enable appropriate defenses.

Download Full-text

Self-Amplificated Network: Learning fine-grained learner with few samples

Journal of Physics Conference Series ◽

10.1088/1742-6596/2050/1/012006 ◽

2021 ◽

Vol 2050 (1) ◽

pp. 012006

Author(s):

Xili Dai ◽

Chunmei Ma ◽

Jingwei Sun ◽

Tao Zhang ◽

Haigang Gong ◽

...

Keyword(s):

Deep Neural Networks ◽

Classification Problem ◽

The Self ◽

Superior Performance ◽

Query Image ◽

Network Learning ◽

Fine Grained ◽

Support Set ◽

Meta Learning ◽

Benchmark Datasets

Abstract Training deep neural networks from only a few examples has been an interesting topic that motivated few shot learning. In this paper, we study the fine-grained image classification problem in a challenging few-shot learning setting, and propose the Self-Amplificated Network (SAN), a method based on meta-learning to tackle this problem. The SAN model consists of three parts, which are the Encoder, Amplification and Similarity Modules. The Encoder Module encodes a fine-grained image input into a feature vector. The Amplification Module is used to amplify subtle differences between fine-grained images based on the self attention mechanism which is composed of multi-head attention. The Similarity Module measures how similar the query image and the support set are in order to determine the classification result. In-depth experiments on three benchmark datasets have showcased that our network achieves superior performance over the competing baselines.

Download Full-text

The Unreasonable Effectiveness of the Baseline: Discussing SVMs in Legal Text Classification

10.3233/faia210317 ◽

2021 ◽

Author(s):

Benjamin Clavié ◽

Marc Alphonsus

Keyword(s):

Deep Learning ◽

Language Processing ◽

Text Classification ◽

Traditional Approach ◽

Error Reduction ◽

Support Vector ◽

Learning Models ◽

Legal Text ◽

Classification Tasks ◽

Legal Domain

We aim to highlight an interesting trend to contribute to the ongoing debate around advances within legal Natural Language Processing. Recently, the focus for most legal text classification tasks has shifted towards large pre-trained deep learning models such as BERT. In this paper, we show that a more traditional approach based on Support Vector Machine classifiers reaches competitive performance with deep learning models. We also highlight that error reduction obtained by using specialised BERT-based models over baselines is noticeably smaller in the legal domain when compared to general language tasks. We discuss some hypotheses for these results to support future discussions.

Download Full-text

A comparative review on deep learning models for text classification

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v19.i1.pp325-335 ◽

2020 ◽

Vol 19 (1) ◽

pp. 325

Author(s):

Muhammad Zulqarnain ◽

Rozaida Ghazali ◽

Yana Mazwin Mohmad Hassim ◽

Muhammad Rehan

Keyword(s):

Neural Network ◽

Deep Learning ◽

Language Processing ◽

Text Classification ◽

Question Answering ◽

Learning Models ◽

Semantic Classification ◽

Analysis Question ◽

Comparative Review ◽

Classification Tasks

<p>Text classification is a fundamental task in several areas of natural language processing (NLP), including words semantic classification, sentiment analysis, question answering, or dialog management. This paper investigates three basic architectures of deep learning models for the tasks of text classification: Deep Belief Neural (DBN), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), these three main types of deep learning architectures, are largely explored to handled various classification tasks. DBN have excellent learning capabilities to extracts highly distinguishable features and good for general purpose. CNN have supposed to be better at extracting the position of various related features while RNN is modeling in sequential of long-term dependencies. This paper work shows the systematic comparison of DBN, CNN, and RNN on text classification tasks. Finally, we show the results of deep models by research experiment. The aim of this paper to provides basic guidance about the deep learning models that which models are best for the task of text classification.</p>

Download Full-text

Efficient processing of GRU based on word embedding for text classification

JOIV International Journal on Informatics Visualization ◽

10.30630/joiv.3.4.289 ◽

2019 ◽

Vol 3 (4) ◽

Cited By ~ 2

Author(s):

Muhammad Zulqarnain ◽

Rozaida Ghazali ◽

Muhammad Ghulam Ghouse ◽

Muhammad Faheem Mushtaq

Keyword(s):

Language Processing ◽

Text Classification ◽

Classification Performance ◽

Word Embedding ◽

Training Data ◽

Superior Performance ◽

Sequential Data ◽

Online Data ◽

Benchmark Datasets ◽

Recurrent Architecture

Text classification has become very serious problem for big organization to manage the large amount of online data and has been extensively applied in the tasks of Natural Language Processing (NLP). Text classification can support users to excellently manage and exploit meaningful information require to be classified into various categories for further use. In order to best classify texts, our research efforts to develop a deep learning approach which obtains superior performance in text classification than other RNNs approaches. However, the main problem in text classification is how to enhance the classification accuracy and the sparsity of the data semantics sensitivity to context often hinders the classification performance of texts. In order to overcome the weakness, in this paper we proposed unified structure to investigate the effects of word embedding and Gated Recurrent Unit (GRU) for text classification on two benchmark datasets included (Google snippets and TREC). GRU is a well-known type of recurrent neural network (RNN), which is ability of computing sequential data over its recurrent architecture. Experimentally, the semantically connected words are commonly near to each other in embedding spaces. First, words in posts are changed into vectors via word embedding technique. Then, the words sequential in sentences are fed to GRU to extract the contextual semantics between words. The experimental results showed that proposed GRU model can effectively learn the word usage in context of texts provided training data. The quantity and quality of training data significantly affected the performance. We evaluated the performance of proposed approach with traditional recurrent approaches, RNN, MV-RNN and LSTM, the proposed approach is obtained better results on two benchmark datasets in the term of accuracy and error rate.

Download Full-text

IDC: Quantitative Evaluation Benchmark of Interpretation Methods for Deep Text Classification Models

10.21203/rs.3.rs-960359/v1 ◽

2021 ◽

Author(s):

Mohammed Khaleel ◽

Lei Qi ◽

Wallapak Tavanapong ◽

Johnny Wong ◽

Adisak Sukul ◽

...

Keyword(s):

Quantitative Evaluation ◽

Language Processing ◽

Text Classification ◽

Deep Neural Networks ◽

Performance Metrics ◽

Ground Truth ◽

Decision Making Process ◽

Research Attention ◽

Prediction Confidence ◽

Insight Into

Abstract Recent advances in deep neural networks have achieved outstanding success in natural language processing. Due to the success and the black-box nature of the deep models, interpretation methods that provide insight into the decision-making process of the models have received an influx of research attention. However, there is no quantitative evaluation comparing interpretation methods for text classification other than observing classification accuracy or prediction confidence when important word grams are removed. This is due to the lack of interpretation ground truth. Manual labeling of a large interpretation ground truth is time-consuming. We propose IDC, a new benchmark for quantitative evaluation of I nterpretation methods for D eep text C lassification models. IDC consists of three methods that take existing text classification ground truth and generate three corresponding pseudo-interpretation ground truth datasets. We propose to use interpretation recall, interpretation precision, and Cohen’s kappa inter-agreement as performance metrics. We used the pseudo ground truth datasets and the metrics to evaluate six interpretation methods.

Download Full-text

Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/642 ◽

2018 ◽

Cited By ~ 3

Author(s):

Renjie Zheng ◽

Junkun Chen ◽

Xipeng Qiu

Keyword(s):

Deep Learning ◽

Language Processing ◽

Text Classification ◽

Representation Learning ◽

Training Data ◽

Specific Information ◽

Distributed Representation ◽

Source Codes ◽

Multiple Tasks ◽

Classification Tasks

Distributed representation plays an important role in deep learning based natural language processing. However, the representation of a sentence often varies in different tasks, which is usually learned from scratch and suffers from the limited amounts of training data. In this paper, we claim that a good sentence representation should be invariant and can benefit the various subsequent tasks. To achieve this purpose, we propose a new scheme of information sharing for multi-task learning. More specifically, all tasks share the same sentence representation and each task can select the task-specific information from the shared sentence representation with attention mechanisms. The query vector of each task's attention could be either static parameters or generated dynamically. We conduct extensive experiments on 16 different text classification tasks, which demonstrate the benefits of our architecture. Source codes of this paper are available on Github.

Download Full-text

A discourse-aware neural network-based text model for document-level text classification

Journal of Information Science ◽

10.1177/0165551517743644 ◽

2017 ◽

Vol 44 (6) ◽

pp. 715-735 ◽

Cited By ~ 1

Author(s):

Kangwook Lee ◽

Sanggyu Han ◽

Sung-Hyon Myaeng

Keyword(s):

Neural Network ◽

Language Processing ◽

Text Classification ◽

Structure Theory ◽

Distributed Representations ◽

Document Models ◽

Discourse Relations ◽

Classification Tasks ◽

The Impact ◽

Document Level

Capturing semantics scattered across entire text is one of the important issues for Natural Language Processing (NLP) tasks. It would be particularly critical with long text embodying a flow of themes. This article proposes a new text modelling method that can handle thematic flows of text with Deep Neural Networks (DNNs) in such a way that discourse information and distributed representations of text are incorporate. Unlike previous DNN-based document models, the proposed model enables discourse-aware analysis of text and composition of sentence-level distributed representations guided by the discourse structure. More specifically, our method identifies Elementary Discourse Units (EDUs) and their discourse relations in a given document by applying Rhetorical Structure Theory (RST)-based discourse analysis. The result is fed into a tree-structured neural network that reflects the discourse information including the structure of the document and the discourse roles and relation types. We evaluate the document model for two document-level text classification tasks, sentiment analysis and sarcasm detection, with comparisons against the reference systems that also utilise discourse information. In addition, we conduct additional experiments to evaluate the impact of neural network types and adopted discourse factors on modelling documents vis-à-vis the two classification tasks. Furthermore, we investigate the effects of various learning methods, input units on the quality of the proposed discourse-aware document model.

Download Full-text

Text Normalization Using Encoder–Decoder Networks Based on the Causal Feature Extractor

Applied Sciences ◽

10.3390/app10134551 ◽

2020 ◽

Vol 10 (13) ◽

pp. 4551 ◽

Cited By ~ 1

Author(s):

Adrián Javaloy ◽

Ginés García-Mateos

Keyword(s):

Language Processing ◽

Convergence Time ◽

Intermediate Form ◽

Fine Grained ◽

Word Level ◽

Feature Extractor ◽

Decoder Architecture ◽

Different Types ◽

Main Ideas ◽

Text Normalization

The encoder–decoder architecture is a well-established, effective and widely used approach in many tasks of natural language processing (NLP), among other domains. It consists of two closely-collaborating components: An encoder that transforms the input into an intermediate form, and a decoder producing the output. This paper proposes a new method for the encoder, named Causal Feature Extractor (CFE), based on three main ideas: Causal convolutions, dilatations and bidirectionality. We apply this method to text normalization, which is a ubiquitous problem that appears as the first step of many text-to-speech (TTS) systems. Given a text with symbols, the problem consists in writing the text exactly as it should be read by the TTS system. We make use of an attention-based encoder–decoder architecture using a fine-grained character-level approach rather than the usual word-level one. The proposed CFE is compared to other common encoders, such as convolutional neural networks (CNN) and long-short term memories (LSTM). Experimental results show the feasibility of CFE, achieving better results in terms of accuracy, number of parameters, convergence time, and use of an attention mechanism based on attention matrices. The obtained accuracy ranges from 83.5% to 96.8% correctly normalized sentences, depending on the dataset. Moreover, the proposed method is generic and can be applied to different types of input such as text, audio and images.

Download Full-text