text classification
Recently Published Documents





2022, Vol 59 (2), pp. 102798
Haihua Chen, Lei Wu, Jiangping Chen, Wei Lu, Junhua Ding

With the explosion of information on the internet, people feel overwhelmed and find it difficult to choose what they need. Traditional methods of organizing a huge set of original documents are time-consuming, laborious, and far from ideal. Automatic text classification can free users from tedious document processing, make it easier to recognize and distinguish different document contents, bring order and structure to large and complicated document collections, and greatly improve the utilization of information. This paper adopts a term-based model to extract web-semantics features that represent each document. The extracted web-semantics features are used to train a reduced support vector machine. The experimental results show that the proposed method can correctly identify most of the writing styles.
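As a rough illustration of what term-based document features can look like, the sketch below builds TF-IDF vectors from whitespace-separated terms. The abstract does not say which term weighting the authors use, so TF-IDF, the tokenizer, and the toy documents are all assumptions, and the reduced support vector machine itself is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for real term extraction.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({term for doc in tokenized for term in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized for term in set(doc))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = len(doc) or 1  # avoid dividing by zero for an empty document
        vectors.append([counts[term] / length * idf[term] for term in vocab])
    return vocab, vectors


# Hypothetical toy corpus, not taken from the paper.
docs = [
    "cheap flights cheap hotels",
    "football match results",
    "cheap football tickets",
]
vocab, vectors = tfidf_vectors(docs)
```

Each row of `vectors` could then be passed to any classifier that accepts fixed-length numeric features.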

2022, Vol 11 (2), pp. 0-0

In recent times, transfer learning models have been shown to produce good results in text classification tasks such as question answering, summarization, and next-word prediction, but they have not yet been used extensively for hate speech detection. We anticipate that these networks may also give better results on another text classification task, namely hate speech detection. This paper introduces a novel method of hate speech detection based on attention networks, using the BERT attention model. We have conducted exhaustive experiments and evaluations on publicly available datasets using several evaluation metrics (precision, recall, and F1 score). We show that our model outperforms all the state-of-the-art methods by almost 4%. We also discuss in detail the technical challenges faced during the implementation of the proposed model.
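For reference, the three evaluation metrics named in this abstract can be computed for a binary task as in the minimal sketch below. The function name and the toy labels are illustrative assumptions, not data from the paper.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        # F1 is the harmonic mean of precision and recall.
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = hate speech, 0 = not hate speech.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```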

Hu Zhang, Bangze Pan, Ru Li

Legal judgment elements extraction (LJEE) aims to automatically identify the different judgment features in the fact descriptions of legal documents, which helps to improve the accuracy and interpretability of judgment results. In real court rulings, judges usually need to scan both the fact descriptions and the law articles repeatedly to find the relevant information, and it is hard to acquire the key judgment features quickly, so legal judgment elements extraction is a crucial and challenging task for legal judgment prediction. However, most existing methods follow the text classification framework, which fails to model the attentive relations between the law articles and the legal judgment elements. To address this issue, we simulate the working process of human judges and propose a legal judgment elements extraction method with a law article-aware mechanism, which captures the complex semantic correlations between the law articles and the legal judgment elements. Experimental results show that our proposed method achieves significant improvements over other state-of-the-art baselines on the element recognition task dataset. Compared with the BERT-CNN model, the proposed “All labels Law Articles Embedding Model (ALEM)” improves the accuracy, recall, and F1 value by 0.5, 1.4, and 1.0, respectively.

2022, Vol 2022, pp. 1-17
Rukhma Qasim, Waqas Haider Bangyal, Mohammed A. Alqarni, Abdulwahab Ali Almazroi

The text classification problem has been thoroughly studied in information retrieval and data mining. It is beneficial in many areas, including medical diagnosis, health care, targeted marketing, the entertainment industry, and group filtering processes. Recent innovations in both data mining and natural language processing (NLP) have drawn the attention of researchers all over the world to developing automated systems for text classification. NLP allows categorizing documents containing different texts. A huge amount of data is generated on social media sites by their users. Three datasets have been used for experimental purposes: the COVID-19 fake news dataset, the COVID-19 English tweet dataset, and the extremist-non-extremist dataset, which contain news blogs, posts, and tweets related to coronavirus and hate speech. Transfer learning approaches have not previously been evaluated on the COVID-19 fake news and extremist-non-extremist datasets. Therefore, the proposed work applies transfer learning classification models to both of these datasets to check their performance. Models are trained and evaluated on accuracy, precision, recall, and F1-score. Heat maps are also generated for every model. Finally, future directions are proposed.

2022, Vol 7, pp. e831
Xudong Jia, Li Wang

Text classification is a fundamental task in many applications such as topic labeling, sentiment analysis, and spam detection. The syntactic relationships and word sequence of a text are important and useful for text classification. How to model and incorporate them to improve performance is one key challenge. Inspired by human behavior in understanding text, in this paper we combine the syntactic relationship, sequence structure, and semantics for text representation, and propose an attention-enhanced capsule network-based text classification model. Specifically, we use graph convolutional neural networks to encode syntactic dependency trees, build multi-head attention to encode dependency relationships in the text sequence, and finally merge these with semantic information using a capsule network. Extensive experiments on five datasets demonstrate that our approach can effectively improve the performance of text classification compared with state-of-the-art methods. The results also show that the capsule network, graph convolutional neural network, and multi-head attention have an integration effect on text classification tasks.
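To make the idea of encoding a dependency tree with a graph convolution more concrete, the sketch below applies one graph convolutional layer to a tiny word graph. It assumes the widely used symmetric normalization with self-loops, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W); the abstract does not state which variant the authors use, and the edge list, node features, and weights are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalized_adjacency(edges, n):
    """Build D^(-1/2) (A + I) D^(-1/2) for an undirected graph on n nodes."""
    a = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    for i in range(n):
        a[i][i] = 1.0  # self-loop so each word keeps its own features
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


# Hypothetical two-word sentence with a single dependency edge between the words.
edges = [(0, 1)]
a_hat = normalized_adjacency(edges, 2)
h = [[1.0, 0.0], [0.0, 1.0]]  # one feature row per word
w = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only
h_next = gcn_layer(a_hat, h, w)
```

After one layer, each word's representation is a mix of its own features and those of its syntactic neighbor.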

2022, Vol 2022, pp. 1-9
Wanli Luo, Lei Zhang

Internet of Things applications are diverse in nature, and a key aspect of them is multimedia sensors and devices. These IoT multimedia devices form the Internet of Multimedia Things (IoMT). Compared with the Internet of Things, the IoMT generates a large amount of text data with different characteristics and requirements. Traditional machine learning models and single-structure deep learning models cannot effectively capture the emotional information in text, which results in poor classification performance. To address this problem, this paper proposes a deep learning-based text classification method for tourism questions. First, the corpus is trained with the word2vec tool, using the continuous bag-of-words (CBOW) model, to obtain word vector representations of the text. Then, an attention mechanism is introduced into the long short-term memory (LSTM) network, and the attention-based LSTM model is constructed for text feature extraction, which highlights the impact of different words in the input text on the text emotion category. Finally, the text features are fed into a Softmax classifier to obtain the probability distribution over text categories, and the model is trained with the cross-entropy loss function. The experimental results show that the average accuracy, recall, and F value are 0.943, 0.867, and 0.903, respectively, which indicates better classification performance than other methods.
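The last two stages of this pipeline (attention over per-word hidden states, then a Softmax classifier trained with cross-entropy) can be sketched as follows. This is a minimal illustration under stated assumptions: the hidden states are hand-written numbers standing in for LSTM outputs, the attention score is a dot product with a hypothetical query vector, and the classifier weights are made up. The abstract does not specify any of these details.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(dot(h_t, query)) and sum the states."""
    scores = [sum(x * q for x, q in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


def classify(features, class_weights):
    """Linear layer followed by softmax, giving one probability per class."""
    logits = [sum(w * x for w, x in zip(row, features)) for row in class_weights]
    return softmax(logits)


def cross_entropy(probs, target):
    """Negative log-likelihood of the correct class."""
    return -math.log(probs[target])


# Hypothetical stand-ins for LSTM outputs at three time steps.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]                         # assumed attention query vector
class_weights = [[1.0, 0.0], [0.0, 1.0]]   # assumed two-class linear layer

pooled, attn = attention_pool(hidden_states, query)
probs = classify(pooled, class_weights)
loss = cross_entropy(probs, target=0)
```

In this toy input the third time step gets the largest attention weight, because its hidden state has the largest dot product with the query.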
