Imbalanced sentiment classification based on sequence generative adversarial nets

2020 ◽  
Vol 39 (5) ◽  
pp. 7909-7919
Author(s):  
Chuantao Wang ◽  
Xuexin Yang ◽  
Linkai Ding

The purpose of sentiment classification is to judge sentiment tendency automatically. In sentiment classification of text data (such as online reviews), traditional deep learning models focus on algorithm optimization but ignore the imbalanced distribution of samples across classes, which degrades classification performance in practical applications. In this paper, the experiment is divided into two stages. In the first stage, the minority-class samples are used to train a sequence generative adversarial net, so that it learns the features of the minority class in depth. In the second stage, the trained generator is used to generate synthetic minority-class samples, which are mixed with the original samples to balance the sample distribution. The mixed samples are then fed into the deep sentiment classification model to complete its training. Experimental results on the sentiment classification of hotel reviews show that the model performs very well compared with a variety of deep learning models based on classic imbalanced learning methods.
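The second stage described above (generate minority-class samples, then mix them with the originals) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `balance_with_generator`, `toy_generator`, and the example reviews are hypothetical, and `toy_generator` merely stands in for a trained sequence generator.

```python
import random
from collections import Counter


def balance_with_generator(texts, labels, generate, seed=0):
    """Top up each smaller class with generated samples until all classes match.

    `generate(label, n, rng)` is a hypothetical stand-in for a trained
    sequence generator; it must return `n` synthetic texts for `label`.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    mixed = list(zip(texts, labels))
    for label, count in counts.items():
        deficit = target - count
        if deficit > 0:
            mixed.extend((text, label) for text in generate(label, deficit, rng))
    rng.shuffle(mixed)  # interleave original and synthetic samples
    return [t for t, _ in mixed], [y for _, y in mixed]


def toy_generator(label, n, rng):
    # Placeholder only: a real generator would sample new token sequences.
    vocabulary = ["dirty", "noisy", "cold", "small"]
    return [f"room was {rng.choice(vocabulary)}" for _ in range(n)]


texts = ["great stay", "lovely staff", "clean room", "nice view", "awful bed"]
labels = ["pos", "pos", "pos", "pos", "neg"]
balanced_texts, balanced_labels = balance_with_generator(texts, labels, toy_generator)
print(Counter(balanced_labels))
```

The balanced lists would then be passed to whatever classifier is being trained; that step is omitted here.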

2021 ◽  
Vol 25 (3) ◽  
pp. 555-570
Author(s):  
Chuantao Wang ◽  
Xuexin Yang ◽  
Linkai Ding

Sentiment classification aims to solve the problem of automatic judgment of sentiment polarity. In the sentiment classification of text data, such as online reviews, traditional deep learning models concentrate on algorithm optimization but ignore both the imbalanced distribution of samples across classes and the presence of weak tagging information such as ratings and tags. Building on a traditional deep learning model, random oversampling and cost sensitivity are used to increase the contribution of minority-class samples to the model loss function and to keep the model from being biased toward the majority class. Model training is divided into two stages. In the first stage, a large amount of weakly tagged data is used to train the model, yielding a model that captures the sentiment semantics of the data. The parameters trained in the first stage are then used as the initial parameters for the second stage, in which only a small amount of tagged data is used to continue training, which reduces the impact of noise and the need for manually tagged samples. The experimental results show that the method is considerably better than traditional deep learning models in the sentiment classification task of hotel review data.
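Random oversampling and cost sensitivity can each be illustrated in a few lines. The sketch below is an assumption-laden example, not the paper's code: the function names are hypothetical, and the inverse-frequency weighting is one common choice of cost, not necessarily the one the authors used. The abstract also does not say how the two techniques are combined, so they are shown separately.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def inverse_frequency_weights(labels):
    """Per-class loss weights: total / (number_of_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: total / (len(counts) * count) for label, count in counts.items()}


labels = ["pos"] * 8 + ["neg"] * 2
samples = [f"review {i}" for i in range(len(labels))]
_, new_labels = random_oversample(samples, labels)
print(Counter(new_labels))
print(inverse_frequency_weights(labels))
```

With weights like these, each sample's loss term is multiplied by the weight of its class, so errors on the minority class cost more.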


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 54
Author(s):  
Chen Fu ◽  
Jianhua Yang

The problem of classification for imbalanced datasets is frequently encountered in practical applications. The data to be classified in this problem are skewed, i.e., the samples of one class (the minority class) are far fewer than those of the other classes (the majority classes). When dealing with imbalanced datasets, most classifiers share a common limitation: they often achieve better classification performance on the majority classes than on the minority class. To alleviate this limitation, this study proposes a fuzzy rule-based modeling approach using information granules. Information granules, as entities derived and abstracted from data, can be used to describe and capture the characteristics (distribution and structure) of data from both majority and minority classes. Since the geometric characteristics of information granules depend on the distance measures used in the granulation process, the main idea of this study is to construct information granules on each class of imbalanced data using Minkowski distance measures and then to establish the classification models by using “If-Then” rules. The experimental results on synthetic and publicly available datasets show that the proposed Minkowski distance-based method can produce information granules with a series of geometric shapes and construct granular models with satisfying classification performance for imbalanced datasets.
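For reference, the Minkowski distance of order p between two vectors is the p-th root of the sum of the p-th powers of the absolute coordinate differences. The order p controls the shape of the region within a fixed distance of a point: a diamond for p = 1, a circle for p = 2, and a square in the limit p → ∞ (in two dimensions). A minimal sketch follows; the function name is hypothetical, and the granulation procedure itself is not reproduced here.

```python
def minkowski_distance(x, y, p):
    """Minkowski distance of order p (p >= 1) between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")
    if p == float("inf"):
        # Limiting case: Chebyshev distance.
        return max(abs(a - b) for a, b in zip(x, y))
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


a, b = (0, 0), (3, 4)
for order in (1, 2, float("inf")):
    print(order, minkowski_distance(a, b, order))
```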


2019 ◽  
Vol 9 (4) ◽  
pp. 1-20 ◽  
Author(s):  
Nicola Burns ◽  
Yaxin Bi ◽  
Hui Wang ◽  
Terry Anderson

There is a need to automatically classify information from online reviews. Customers want to know useful information about different aspects of a product or service and also the sentiment expressed towards each aspect. This article proposes an Enhanced Twofold-LDA model (Latent Dirichlet Allocation), in which one LDA is used for aspect assignment and another is used for sentiment classification, aiming to automatically determine aspect and sentiment. The enhanced model incorporates domain knowledge (i.e., seed words) to produce more focused topics and can handle two aspects at the sentence level simultaneously. The experimental results show that the Enhanced Twofold-LDA model produces topics more closely related to aspects than the state-of-the-art method ASUM (Aspect and Sentiment Unification Model), while remaining comparable with ASUM on sentiment classification performance.


Entropy ◽  
2021 ◽  
Vol 23 (2) ◽  
pp. 204
Author(s):  
Yuchai Wan ◽  
Hongen Zhou ◽  
Xun Zhang

Coronavirus disease 2019 (COVID-19) has become a major threat to the world. Computed tomography (CT) is an informative tool for the diagnosis of COVID-19 patients. Many deep learning approaches based on CT images have been proposed and have shown promising performance. However, due to the high complexity and non-transparency of deep models, explaining the diagnosis process is challenging, making it hard to evaluate whether such approaches are reliable. In this paper, we propose a visual interpretation architecture for explaining deep learning models and apply the architecture to COVID-19 diagnosis. Our architecture provides a comprehensive interpretation of the deep model from different perspectives, including the training trends, diagnostic performance, learned features, feature extractors, the hidden layers, the support regions for diagnostic decisions, and so on. With the interpretation architecture, researchers can compare and explain classification performance, gain insight into what the deep model learned from images, and obtain support for diagnostic decisions. Our deep model achieves 94.75%, 93.22%, 96.69%, 97.27%, and 91.88% in accuracy, sensitivity, specificity, positive predictive value, and negative predictive value, respectively, which are 8.30%, 4.32%, 13.33%, 10.25%, and 6.19% higher than those of the compared traditional methods. The visualized features in 2-D and 3-D spaces provide the reasons for the superiority of our deep model. Our interpretation architecture would allow researchers to understand more about how and why deep models work, and can be used as an interpretation solution for any deep learning model based on convolutional neural networks. It can also help deep learning methods take a step forward in the clinical COVID-19 diagnosis field.
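The five criteria named above have standard definitions in terms of the binary confusion matrix. The sketch below shows those definitions; the function name is hypothetical and the counts are made-up illustrative numbers, not data from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value (precision)
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Illustrative counts only.
for name, value in diagnostic_metrics(tp=90, fp=5, tn=95, fn=10).items():
    print(f"{name}: {value:.4f}")
```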


Symmetry ◽  
2019 ◽  
Vol 12 (1) ◽  
pp. 8
Author(s):  
Jing Chen ◽  
Jun Feng ◽  
Xia Sun ◽  
Yang Liu

Sentiment classification of forum posts of massive open online courses is essential for educators to make interventions and for instructors to improve learning performance. A lack of monitoring of learners’ sentiments may lead to high course dropout rates. Recently, deep learning has emerged as an outstanding machine learning technique for sentiment classification, which extracts complex features automatically with rich representation capabilities. However, deep neural networks always rely on a large amount of labeled data for supervised training. Constructing large-scale labeled training datasets for sentiment classification is very laborious and time consuming. To address this problem, this paper proposes a co-training, semi-supervised deep learning model for sentiment classification, leveraging limited labeled data and massive unlabeled data simultaneously to achieve performance comparable to that of methods trained on massive labeled data. To satisfy the two-view condition of co-training, we encoded texts into vectors from the views of word embedding and character-based embedding independently, considering words’ external and internal information. To improve classification performance with limited data, we propose a double-check sample selection strategy that selects high-confidence samples to augment the training set iteratively. In addition, we propose a mixed loss function that considers both the asymmetric labeled data and the unlabeled data. Our proposed method achieved an 89.73% average accuracy and a 93.55% average F1-score, about 2.77% and 3.2% higher than baseline methods. Experimental results demonstrate the effectiveness of the proposed model trained on limited labeled data, which performs much better than those trained on massive labeled data.
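The abstract does not spell out the double-check rule. One plausible reading, shown here purely as an assumption, is that an unlabeled sample is pseudo-labeled only when both views predict the same class and both do so with confidence above a threshold. The function name, the threshold of 0.9, and the probability values are all hypothetical.

```python
def select_confident(probs_view_a, probs_view_b, threshold=0.9):
    """Return (index, label) pairs on which both views agree with high confidence.

    Each argument is a list of per-sample class-probability lists, one list
    per co-training view (for example word-level and character-level models).
    """
    selected = []
    for i, (pa, pb) in enumerate(zip(probs_view_a, probs_view_b)):
        label_a = max(range(len(pa)), key=pa.__getitem__)
        label_b = max(range(len(pb)), key=pb.__getitem__)
        agree = label_a == label_b
        if agree and pa[label_a] >= threshold and pb[label_b] >= threshold:
            selected.append((i, label_a))
    return selected


word_view = [[0.95, 0.05], [0.60, 0.40], [0.10, 0.90], [0.97, 0.03]]
char_view = [[0.92, 0.08], [0.30, 0.70], [0.05, 0.95], [0.20, 0.80]]
print(select_confident(word_view, char_view))
```

In an iterative scheme, the selected samples would be moved from the unlabeled pool into the training set before the next round of training.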


2021 ◽  
Author(s):  
Arousha Haghighian Roudsari ◽  
Jafar Afshar ◽  
Wookey Lee ◽  
Suan Lee

Patent classification is an expensive and time-consuming task that has conventionally been performed by domain experts. However, the increase in the number of filed patents and the complexity of the documents make the classification task challenging. The text used in patent documents is not always written in a way to efficiently convey knowledge. Moreover, patent classification is a multi-label classification task with a large number of labels, which makes the problem even more complicated. Hence, automating this expensive and laborious task is essential for assisting domain experts in managing patent documents, facilitating reliable search, retrieval, and further patent analysis tasks. Transfer learning and pre-trained language models have recently achieved state-of-the-art results in many Natural Language Processing tasks. In this work, we focus on investigating the effect of fine-tuning the pre-trained language models, namely, BERT, XLNet, RoBERTa, and ELECTRA, for the essential task of multi-label patent classification. We compare these models with the baseline deep-learning approaches used for patent classification. We use various word embeddings to enhance the performance of the baseline models. The publicly available USPTO-2M patent classification benchmark and M-patent datasets are used for conducting experiments. We conclude that fine-tuning the pre-trained language models on the patent text improves the multi-label patent classification performance. Our findings indicate that XLNet performs the best and achieves a new state-of-the-art classification performance with respect to precision, recall, F1 measure, as well as coverage error, and LRAP.
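LRAP (label ranking average precision) is a standard multi-label ranking metric: for each true label of a sample, it takes the fraction of labels ranked at or above it that are also true, then averages over true labels and over samples. The sketch below is a plain-Python illustration of that definition with a hypothetical function name and made-up scores; treating samples with no true labels (or all labels true) as scoring 1.0 is an assumed convention.

```python
def label_ranking_average_precision(y_true, y_score):
    """LRAP for binary indicator rows `y_true` and real-valued score rows `y_score`."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        if not relevant or len(relevant) == len(truth):
            total += 1.0  # degenerate sample: ranking cannot be wrong
            continue
        sample_sum = 0.0
        for j in relevant:
            # Labels scored at least as high as label j (ties count against it).
            rank = sum(1 for s in scores if s >= scores[j])
            hits = sum(1 for r in relevant if scores[r] >= scores[j])
            sample_sum += hits / rank
        total += sample_sum / len(relevant)
    return total / len(y_true)


y_true = [[1, 0, 0], [0, 0, 1]]
y_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
print(label_ranking_average_precision(y_true, y_score))
```

In the example, the true label is ranked second of three for the first sample (1/2) and third of three for the second (1/3), so the average is 5/12.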


Author(s):  
Fo Hu ◽  
Hong Wang ◽  
Qiaoxiu Wang ◽  
Naishi Feng ◽  
Jichi Chen ◽  
...  

The aim of this study is to quantify acrophobia and provide safety advice for high-altitude workers. Considering that acrophobia is a fuzzy quantity that cannot be accurately evaluated by conventional detection methods, we propose a comprehensive solution to quantify acrophobia. Specifically, this study simulates a virtual reality environment called High-altitude Plank Walking Challenge, which provides a safe and controlled experimental environment for subjects. In addition, a method named Granger Causality Convolutional Neural Network (GCCNN), which combines a convolutional neural network with a Granger causality functional brain network, is proposed to analyze the subjects’ noninvasive scalp EEG signals. Here, the GCCNN method is used to distinguish subjects with severe acrophobia, moderate acrophobia, and no acrophobia in a three-class classification task, or no acrophobia and acrophobia in a two-class classification task. Compared with the mainstream methods, the GCCNN method achieves better classification performance, with an accuracy of 98.74% for the two-class classification task (no acrophobia versus acrophobia) and of 98.47% for the three-class classification task (no acrophobia versus moderate acrophobia versus severe acrophobia). Consequently, our proposed GCCNN method can provide more accurate quantitative results than the comparative methods, making it more competitive in further practical applications.


Symmetry ◽  
2019 ◽  
Vol 11 (12) ◽  
pp. 1440 ◽  
Author(s):  
Erhu Zhang ◽  
Bo Li ◽  
Peilin Li ◽  
Yajun Chen

Deep learning has been successfully applied to classification tasks in many fields due to its good performance in learning discriminative features. However, the application of deep learning to printing defect classification is very rare, and there is almost no research on the classification method for printing defects with imbalanced samples. In this paper, we present a deep convolutional neural network model to extract deep features directly from printed image defects. Furthermore, considering the asymmetry in the number of different types of defect samples (that is, the number of different kinds of defect samples is unbalanced), seven types of over-sampling methods were investigated to determine the best method. To verify the practical applications of the proposed deep model and the effectiveness of the extracted features, a large dataset of printing defect samples was built. All samples were collected from practical printing products in the factory. The dataset includes a coarse-grained dataset with four types of printing samples and a fine-grained dataset with eleven types of printing samples. The experimental results show that the proposed deep model achieves a 96.86% classification accuracy rate on the coarse-grained dataset without adopting over-sampling, which is the highest accuracy compared to the well-known deep models based on transfer learning. Moreover, by adopting the proposed deep model combined with the SVM-SMOTE over-sampling method, the accuracy rate is improved by more than 20% in the fine-grained dataset compared to the method without over-sampling.
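The core idea shared by SMOTE-family over-samplers is to create a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows only that basic interpolation step in plain SMOTE style; it is not SVM-SMOTE, which additionally uses a support vector machine to decide where to generate samples. The function name, parameters, and feature vectors are hypothetical.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


minority_features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
for point in smote_like(minority_features, n_new=3):
    print(point)
```

Because each new point lies on a segment between two existing minority samples, it stays inside the region the minority class already occupies.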

