Text Classification Using Self-Structure Extended Multinomial Naive Bayes

Author(s):  
Arun Solanki ◽  
Rajat Saxena

With the advent of neural networks and subfields such as deep neural networks and convolutional neural networks, text classification can now be performed with high accuracy. Among the many variants of naive Bayes, multinomial naive Bayes is the one typically used for text classification. Many attempts have been made to develop an algorithm that retains the simplicity of multinomial naive Bayes while also incorporating feature dependency. One such effort is structure extended multinomial naive Bayes, which uses one-dependence estimators to capture dependencies: a one-dependence estimator takes one attribute as the parent and treats all other attributes as its children. This chapter proposes self-structure extended multinomial naive Bayes, a hybrid model that combines multinomial naive Bayes with structure extended multinomial naive Bayes. In essence, it attempts to correctly classify the instances that structure extended multinomial naive Bayes misclassifies because there is no direct dependency between their attributes.
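For context, the sketch below shows only the plain multinomial naive Bayes baseline that the proposed hybrid builds on; it is not the chapter's algorithm, and the tiny corpus and labels are made-up placeholders.

```python
# Minimal sketch of the multinomial naive Bayes baseline (illustrative only;
# the proposed self-structure extended hybrid is not reproduced here).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["cheap flights book now", "meeting agenda attached",
        "win a free prize today", "quarterly report draft"]
labels = ["spam", "ham", "spam", "ham"]

# Bag-of-words counts feed the multinomial likelihood P(word | class).
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(docs, labels)

print(model.predict(["free prize flights"]))  # expected: ['spam']
```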

Author(s):  
Ivan Nathaniel Husada ◽  
Hapnes Toba

Internet access is becoming ever easier to obtain, and as a consequence almost all internet users have social media accounts. Social media is widely used to voice opinions, lodge complaints, and discuss topics with other users. Among the many existing platforms, Twitter is particularly popular for these activities, and this user activity makes sentiment analysis on Twitter feasible. In this research, the authors explore sentiment analysis with bag-of-words and Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction on tweets from Indonesian Twitter users. The data obtained are imbalanced, so a method is required to address this; the imbalance is handled with a resampling approach that combines over- and under-sampling strategies. The sentiment analysis accuracies of naïve Bayes and neural network models before and after resampling the input data are also compared. The naïve Bayes methods used are Multinomial Naïve Bayes and Complement Naïve Bayes, while the neural network architectures used for comparison are Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Convolutional Neural Networks, and a combination of Convolutional Neural Networks and Long Short-Term Memory. Our experiments show the following harmonic-mean (F1) scores for the sentiment analysis models: Multinomial Naïve Bayes 55.48, Complement Naïve Bayes 51.33, Recurrent Neural Network 75.70, Long Short-Term Memory 78.36, Gated Recurrent Unit 77.96, Convolutional Neural Network 76.12, and finally the combination of Convolutional Neural Networks and Long Short-Term Memory 81.14.
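The following is an illustrative sketch, not the authors' exact pipeline: TF-IDF features, a combined over-/under-sampling step for class imbalance, and the two naïve Bayes variants compared in the paper. The tiny tweet corpus and labels are hypothetical placeholders.

```python
# Hedged sketch: TF-IDF + combined resampling + two naive Bayes classifiers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB, ComplementNB
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler

tweets = ["pelayanan sangat bagus", "produk mengecewakan sekali",
          "pengiriman lambat dan buruk", "kualitas jelek",
          "tidak sesuai pesanan", "sangat kecewa",
          "cepat dan ramah", "barang rusak saat tiba"]
labels = ["pos", "neg", "neg", "neg", "neg", "neg", "pos", "neg"]

vec = TfidfVectorizer()
X = vec.fit_transform(tweets)

# Over-sample the minority class up to half the majority size, then
# under-sample the majority class down to the new minority size.
X_res, y_res = RandomOverSampler(sampling_strategy=0.5,
                                 random_state=42).fit_resample(X, labels)
X_res, y_res = RandomUnderSampler(sampling_strategy=1.0,
                                  random_state=42).fit_resample(X_res, y_res)

for clf in (MultinomialNB(), ComplementNB()):
    clf.fit(X_res, y_res)
    print(type(clf).__name__,
          clf.predict(vec.transform(["barang bagus sekali"])))
```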


2012 ◽  
Vol 433-440 ◽  
pp. 2881-2886 ◽  
Author(s):  
Run Zhi Li ◽  
Yang Sen Zhang

In this paper, we study the problem of how to combine feature selection models in text classification and present a method that builds a hybrid feature selection model. This hybrid model combines the advantages of four feature selection models: document frequency (DF), mutual information (MI), information gain (IG), and chi-square (CHI). We then use the naive Bayes model as the classifier to verify the effect of the hybrid feature selection model, and experiments show that the hybrid model is correct and effective and achieves good performance in text classification.
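The paper does not give its exact combination rule, so the sketch below uses simple rank averaging over DF, MI, and CHI scores on a toy corpus purely as an illustration of how such a hybrid criterion could feed a naive Bayes classifier.

```python
# Hedged sketch: combine feature-selection scores by rank averaging,
# then train naive Bayes on the selected features (illustrative only).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.naive_bayes import MultinomialNB

docs = ["stock market rises on earnings", "team wins the final match",
        "shares fall after weak earnings", "coach praises the winning team",
        "investors buy bank shares", "player scores in the last minute"]
y = [0, 1, 0, 1, 0, 1]  # 0 = finance, 1 = sports

vec = CountVectorizer()
X = vec.fit_transform(docs)

df_score = np.asarray((X > 0).sum(axis=0)).ravel()            # DF: document frequency
chi_score, _ = chi2(X, y)                                      # CHI: chi-square statistic
mi_score = mutual_info_classif(X, y, discrete_features=True)   # MI (also a proxy for IG)

def to_ranks(scores):
    # Convert raw scores to ranks so differently scaled criteria can be summed.
    return np.argsort(np.argsort(scores))

combined = to_ranks(df_score) + to_ranks(chi_score) + to_ranks(mi_score)
top_k = np.argsort(combined)[-10:]                             # keep the 10 best features

clf = MultinomialNB().fit(X[:, top_k], y)
print("training accuracy:", clf.score(X[:, top_k], y))
```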


Author(s):  
Zhipeng Tan ◽  
Jing Chen ◽  
Qi Kang ◽  
MengChu Zhou ◽  
Abdullah Abusorrah ◽  
...  
