Correlation-Guided Representation for Multi-Label Text Classification

Author(s):  
Qian-Wen Zhang
Ximing Zhang
Zhao Yan
Ruifang Liu
Yunbo Cao
...  

Multi-label text classification is an essential task in natural language processing. Existing multi-label classification models generally treat labels as categorical variables and ignore the exploitation of label semantics. In this paper, we view the task as a correlation-guided text representation problem: an attention-based two-step framework is proposed to integrate text information and label semantics by jointly learning words and labels in the same space. In this way, we aim to capture high-order label-label correlations as well as context-label correlations. Specifically, the proposed approach learns token-level representations of words and labels globally through a multi-layer Transformer and constructs an attention vector from the word-label correlation matrix to generate the text representation. This ensures that relevant words receive higher weights than irrelevant words and thus directly optimizes classification performance. Extensive experiments on benchmark multi-label datasets clearly validate the effectiveness of the proposed approach, and further analysis demonstrates that it is competitive both in predicting low-frequency labels and in convergence speed.
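The core mechanism is concrete enough to sketch. Below is a minimal, hypothetical PyTorch illustration (not the authors' code) of attention pooling driven by a word-label correlation matrix: words and labels are embedded in the same space, each word is weighted by its strongest correlation with any label, and the weighted sum gives the text representation. All dimensions and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CorrelationGuidedPooling(nn.Module):
    def __init__(self, vocab_size, num_labels, dim):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)   # token embeddings
        self.label_emb = nn.Embedding(num_labels, dim)  # labels live in the same space

    def forward(self, token_ids):
        W = self.word_emb(token_ids)            # (seq_len, dim)
        L = self.label_emb.weight               # (num_labels, dim)
        corr = W @ L.t()                        # word-label correlation matrix
        # each word's weight = its strongest correlation with any label
        attn = torch.softmax(corr.max(dim=1).values, dim=0)
        return attn @ W                         # weighted text representation

model = CorrelationGuidedPooling(vocab_size=10000, num_labels=54, dim=128)
text_vec = model(torch.tensor([3, 17, 256, 9]))
print(text_vec.shape)  # torch.Size([128])
```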

2019
Vol 7
pp. 139-155
Author(s):
Nikolaos Pappas
James Henderson

Neural text classification models typically treat output labels as categorical variables that lack description and semantics. This forces their parametrization to depend on the label set size, and, hence, they are unable to scale to large label sets or generalize to unseen ones. Existing joint input-label text models overcome these issues by exploiting label descriptions, but they are unable to capture complex label relationships, have rigid parametrization, and their gains on unseen labels often come at the expense of weak performance on the labels seen during training. In this paper, we propose a new input-label model that generalizes over previous such models, addresses their limitations, and does not compromise performance on seen labels. The model consists of a joint nonlinear input-label embedding with controllable capacity and a joint-space-dependent classification unit that is trained with cross-entropy loss to optimize classification performance. We evaluate models on full-resource and low- or zero-resource text classification of multilingual news and biomedical text with a large label set. Our model outperforms monolingual and multilingual models that do not leverage label semantics, as well as previous joint input-label space models, in both scenarios.
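As a rough illustration of the joint input-label idea (a sketch under our own assumptions, not the authors' released code), the snippet below scores a text against label-description embeddings through two small nonlinear maps into a shared joint space. Capacity is controlled by the joint dimension rather than the label-set size, so unseen labels can be scored simply by supplying their description vectors.

```python
import torch
import torch.nn as nn

class JointInputLabelModel(nn.Module):
    """Hypothetical sketch of a nonlinear joint input-label embedding."""
    def __init__(self, text_dim, label_desc_dim, joint_dim):
        super().__init__()
        # capacity is controlled by joint_dim, independent of label-set size
        self.g = nn.Sequential(nn.Linear(text_dim, joint_dim), nn.Tanh())
        self.h = nn.Sequential(nn.Linear(label_desc_dim, joint_dim), nn.Tanh())

    def forward(self, text_vec, label_desc_vecs):
        # logits depend only on label *descriptions*, so unseen labels
        # can be scored by supplying their description vectors
        return self.g(text_vec) @ self.h(label_desc_vecs).t()

model = JointInputLabelModel(text_dim=300, label_desc_dim=300, joint_dim=256)
logits = model(torch.randn(8, 300), torch.randn(50, 300))  # 8 texts, 50 labels
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 50, (8,)))
```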


Author(s):  
Ming Hao
Weijing Wang
Fang Zhou

Short text classification is an important foundation for natural language processing (NLP) tasks. Although text classification based on deep language models (DLMs) has made significant headway, in practical applications some texts remain ambiguous and hard to classify, especially in multi-class classification of short texts whose context length is limited. The mainstream remedy improves the distinction of ambiguous text by adding context information. However, these methods rely only on the text representation and ignore the fact that categories overlap and are not completely independent of each other. In this paper, we establish a new general method for classifying ambiguous text by introducing label embeddings to represent each category, which makes the differences between categories measurable. Further, a new compositional loss function is proposed to train the model, pulling the text representation closer to its ground-truth label and pushing it farther from the others. Finally, a constraint is obtained by calculating the similarity between the text representation and the label embeddings; errors caused by ambiguous text can be corrected by adding this constraint to the output layer of the model. We apply the method to three classical models and conduct experiments on six public datasets. The experiments show that our method effectively improves classification accuracy on ambiguous texts. In addition, combining our method with BERT, we obtain state-of-the-art results on the CNT dataset.
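A minimal sketch of what such a compositional loss could look like (the exact form is our assumption): a cross-entropy term over text-label similarities combined with a term that pulls the text representation toward its ground-truth label embedding.

```python
import torch
import torch.nn.functional as F

def compositional_loss(text_repr, label_emb, target, alpha=0.5):
    """Sketch under assumptions: cross-entropy over text-label similarities
    plus a term pulling each text toward its ground-truth label embedding."""
    sims = F.cosine_similarity(text_repr.unsqueeze(1), label_emb.unsqueeze(0), dim=-1)
    ce = F.cross_entropy(sims / 0.1, target)  # temperature sharpens the softmax
    pull = (1 - sims[torch.arange(len(target)), target]).mean()
    return ce + alpha * pull

loss = compositional_loss(torch.randn(4, 128), torch.randn(10, 128),
                          torch.tensor([1, 0, 7, 3]))
```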


Author(s):  
Muhammad Zulqarnain
Rozaida Ghazali
Muhammad Ghulam Ghouse
Muhammad Faheem Mushtaq

Text classification has become a serious problem for large organizations that must manage huge amounts of online data, and it has been extensively applied in Natural Language Processing (NLP) tasks. Text classification helps users manage and exploit meaningful information that must be sorted into various categories for further use. To classify texts well, this research develops a deep learning approach that obtains better text classification performance than other RNN approaches. The main difficulties in text classification are improving accuracy and coping with data sparsity, since semantic sensitivity to context often hinders classification performance. To overcome these weaknesses, this paper proposes a unified structure to investigate the effects of word embeddings and the Gated Recurrent Unit (GRU) for text classification on two benchmark datasets (Google Snippets and TREC). The GRU is a well-known type of recurrent neural network (RNN) that can process sequential data through its recurrent architecture. Semantically related words tend to lie close to each other in embedding spaces. First, the words in posts are converted into vectors via a word embedding technique. Then, the word sequences in sentences are fed to the GRU to extract the contextual semantics between words. The experimental results show that the proposed GRU model can effectively learn word usage in context given training data, although the quantity and quality of training data significantly affect performance. We compared the proposed approach with traditional recurrent approaches (RNN, MV-RNN, and LSTM); the proposed approach obtains better results on the two benchmark datasets in terms of accuracy and error rate.
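The described pipeline (embed words, feed the sequence to a GRU, classify from the final hidden state) can be sketched in a few lines of PyTorch; the sizes and names below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    """Minimal sketch of the described pipeline: embed words, run the
    sequence through a GRU, classify from the final hidden state."""
    def __init__(self, vocab_size, emb_dim, hidden, num_classes):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, token_ids):                 # (batch, seq_len)
        _, h = self.gru(self.emb(token_ids))      # h: (1, batch, hidden)
        return self.fc(h.squeeze(0))              # class logits

# e.g. TREC question classification has 6 coarse classes
model = GRUClassifier(vocab_size=20000, emb_dim=300, hidden=128, num_classes=6)
logits = model(torch.randint(0, 20000, (4, 12)))
```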


2014
Vol 2014
pp. 1-9
Author(s):
Jin Dai
Xin Liu

Similarity between objects is a core research area of data mining. To reduce interference from the uncertainty of natural language, a similarity measurement between normal cloud models is adopted for text classification research. On this basis, a novel text classifier based on cloud concept jumping up (CCJU-TC) is proposed, which can efficiently convert between qualitative concepts and quantitative data. Through the conversion from a text set to a text information table based on the VSM model, the qualitative text concepts extracted from the same category are jumped up into a whole-category concept. According to the cloud similarity between the test text and each category concept, the test text is assigned to the most similar category. Comparisons among different text classifiers over different feature selection sets show that CCJU-TC not only adapts well to different text features but also outperforms traditional classifiers.
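The cloud-model machinery can be sketched as follows. This is a hedged illustration using the standard backward cloud generator to estimate a normal cloud's digital characteristics (Ex, En, He) and a simple cosine similarity between characteristic vectors; CCJU-TC's exact similarity measurement may differ.

```python
import numpy as np

def cloud_from_samples(xs):
    """Backward cloud generator (a common estimator, not necessarily
    CCJU-TC's): recover the (Ex, En, He) characteristics of a normal cloud."""
    ex = xs.mean()
    en = np.sqrt(np.pi / 2) * np.abs(xs - ex).mean()
    he = np.sqrt(max(xs.var(ddof=1) - en ** 2, 0.0))
    return np.array([ex, en, he])

def cloud_similarity(c1, c2):
    # one simple choice: cosine similarity of the characteristic vectors
    return c1 @ c2 / (np.linalg.norm(c1) * np.linalg.norm(c2) + 1e-12)

def classify(test_cloud, category_clouds):
    # assign the test text's cloud to the most similar category concept
    return max(category_clouds,
               key=lambda k: cloud_similarity(test_cloud, category_clouds[k]))
```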


Author(s):  
Yakobus Wiciaputra
Julio Young
Andre Rusli

With the large amount of text information circulating on the internet, there is a need for a solution that can help process textual data for various purposes. In Indonesia, text information circulating on the internet generally uses two languages, English and Indonesian. This research focuses on building a model that can classify text in more than one language, commonly known as multilingual text classification, implemented with the XLM-RoBERTa model. This study applies the transfer learning concept used by XLM-RoBERTa to build a classification model for Indonesian texts using only the English News Dataset for training, reaching a Matthews Correlation Coefficient (MCC) of 42.2%. When the large Mixed News Dataset (108,190) is used for training, the model achieves its best results: on the large English News Dataset (37,886), an MCC of 90.8%, accuracy of 93.3%, precision of 93.4%, recall of 93.3%, and F1 of 93.3%; on the large Indonesian News Dataset (70,304), an MCC of 86.4% with accuracy, precision, recall, and F1 all at 90.2%.
Keywords: Multilingual Text Classification, Natural Language Processing, News Dataset, Transfer Learning, XLM-RoBERTa
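The zero-shot cross-lingual setup can be sketched with the Hugging Face transformers library (an assumed toolchain; the label count and the fine-tuning loop below are placeholders): fine-tune xlm-roberta-base on English news, then feed Indonesian news to the same model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=5)  # label count is a placeholder

# ... fine-tune on the English News Dataset here ...

# thanks to the shared multilingual encoder, Indonesian input can be
# classified without any Indonesian training examples
batch = tok(["Harga minyak dunia naik tajam pekan ini."],
            return_tensors="pt", truncation=True)
with torch.no_grad():
    pred = model(**batch).logits.argmax(-1)
```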


Author(s):  
Wenfu Liu
Jianmin Pang
Nan Li
Xin Zhou
Feng Yue

Single-label classification technology has difficulty meeting the needs of text classification, and multi-label text classification has become an important research issue in natural language processing (NLP). Extracting semantic features from different levels and granularities of text is a basic and key task in multi-label text classification research. A topic model is an effective method for the automatic organization and induction of text information; it can reveal the latent semantics of documents and analyze the topics contained in massive information. Therefore, this paper proposes a multi-label text classification method based on tALBERT-CNN: an LDA topic model and the ALBERT model are used to obtain the topic vector and semantic context vector of each word (document), a fusion mechanism is adopted to obtain in-depth topic and semantic representations of the document, and the multi-label features of the text are extracted through the TextCNN model to train a multi-label classifier. The experimental results obtained on standard datasets show that the proposed method can extract multi-label features from documents and that its performance is better than that of existing state-of-the-art multi-label text classification algorithms.
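The fusion step can be sketched as follows (the dimensions and the concatenation-based fusion are our assumptions): each token's ALBERT context vector is joined with its LDA topic vector and passed through a TextCNN head with sigmoid-style multi-label outputs.

```python
import torch
import torch.nn as nn

class FusionTextCNN(nn.Module):
    """Sketch of the fusion idea (details assumed): concatenate each token's
    ALBERT context vector with its LDA topic vector, then apply a TextCNN
    head whose logits feed a multi-label (sigmoid/BCE) objective."""
    def __init__(self, ctx_dim, topic_dim, num_labels, n_filters=100):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv1d(ctx_dim + topic_dim, n_filters, k) for k in (2, 3, 4))
        self.fc = nn.Linear(3 * n_filters, num_labels)

    def forward(self, ctx_vecs, topic_vecs):       # (batch, seq, dim) each
        x = torch.cat([ctx_vecs, topic_vecs], -1).transpose(1, 2)
        feats = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(feats, dim=1))    # pair with BCEWithLogitsLoss

model = FusionTextCNN(ctx_dim=768, topic_dim=50, num_labels=20)
logits = model(torch.randn(2, 30, 768), torch.randn(2, 30, 50))
```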


2009
Vol 12 (7)
pp. 5-14
Author(s):
Anh Hoang Tu Nguyen
Chi Tran Kim Nguyen
Phi Hong Nguyen

Text representation models are a very important pre-processing step in various domains such as text mining, information retrieval, and natural language processing. In this paper, we summarize graph-based text representation models. Graph-based models can capture structural information such as the location, order, and proximity of term occurrences, which is discarded by standard vector-based text representation models. We have tested this graph model in a Vietnamese text classification system.
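One common graph-of-words construction (an illustrative assumption, not necessarily the exact model surveyed) links terms that co-occur within a sliding window by weighted directed edges, preserving the order and proximity information that a bag-of-words vector discards.

```python
from collections import defaultdict

def text_to_graph(tokens, window=2):
    """Sketch of a graph-of-words representation: nodes are unique terms,
    directed edges link terms co-occurring within a sliding window,
    weighted by co-occurrence count (order and proximity are preserved)."""
    edges = defaultdict(int)
    for i, u in enumerate(tokens):
        for v in tokens[i + 1: i + 1 + window]:
            edges[(u, v)] += 1
    return edges

g = text_to_graph("phan loai van ban tieng viet".split())
```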


Author(s):  
Shujuan Yu
Danlei Liu
Yun Zhang
Shengmei Zhao
Weigang Wang

As an important branch of Natural Language Processing (NLP), text classification has always faced a bottleneck: how to extract useful text information and effective long-range associations. Through the great efforts of deep learning researchers, deep Convolutional Neural Networks (CNNs) have made remarkable achievements in Computer Vision but remain controversial in NLP tasks. In this paper, we propose a novel deep CNN named Deep Pyramid Temporal Convolutional Network (DPTCN) for short text classification, which mainly consists of a concatenated embedding layer, causal convolutions, 1/2 max pooling down-sampling, and residual blocks. Our work was highly inspired by two well-designed models: the temporal convolutional network for sequence modeling and the deep pyramid CNN for text categorization; their applicability and pertinence guided how we built a model for this particular domain. In the experiments, we evaluate the proposed model on 7 datasets against 6 models and analyze the impact of three different embedding methods. The results show that our work is a good attempt at applying a word-level deep convolutional network to short text classification.
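A single DPTCN stage, as we read the description, might look like the following sketch (padding choices and channel sizes are assumptions): causal convolutions with a residual connection, followed by stride-2 max pooling that halves the sequence length and produces the pyramid shape.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalResBlock(nn.Module):
    """Sketch (assumed details) of one DPTCN stage: causal convolutions with
    a residual connection, then stride-2 max pooling that halves the
    sequence length, giving the 'pyramid' shape."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.pad = k - 1                         # left-pad only => causal
        self.conv1 = nn.Conv1d(channels, channels, k)
        self.conv2 = nn.Conv1d(channels, channels, k)

    def forward(self, x):                        # x: (batch, channels, seq)
        h = F.relu(self.conv1(F.pad(x, (self.pad, 0))))
        h = self.conv2(F.pad(h, (self.pad, 0)))
        return F.max_pool1d(F.relu(h + x), 2)    # residual, then 1/2 pooling

block = CausalResBlock(64)
out = block(torch.randn(1, 64, 32))              # seq 32 -> 16
```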


2021
Vol 11 (1)
pp. 428
Author(s):  
Donghoon Oh
Jeong-Sik Park
Ji-Hwan Kim
Gil-Jin Jang

Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Therefore, phoneme classification performance is a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art classification methods, and classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method to exploit recognition models better suited to different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. Across a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% for the baseline and 71.7% for the proposed hierarchical models, a 2.2% overall improvement.
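The grouping step can be sketched as follows (our reading, with assumed details): phonemes the baseline model frequently confuses are treated as close, and hierarchical clustering over a distance derived from the symmetrized confusion matrix yields the phoneme groups.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_phonemes(confusion, n_groups):
    """Sketch of the grouping step: phonemes the baseline often confuses are
    'close', so cluster on a distance derived from the symmetrized,
    row-normalized confusion matrix."""
    p = confusion / confusion.sum(axis=1, keepdims=True)   # P(pred | true)
    sim = (p + p.T) / 2                                    # symmetrize
    dist = 1 - sim / sim.max()
    np.fill_diagonal(dist, 0)
    # condensed upper-triangle distances for scipy's linkage
    z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
    return fcluster(z, t=n_groups, criterion="maxclust")

# e.g. the folded 39-phoneme TIMIT set; the confusion matrix here is random
groups = cluster_phonemes(np.random.rand(39, 39) + np.eye(39) * 5, n_groups=6)
```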

