Feature Selection Approach for Twitter Sentiment Analysis and Text Classification Based on Chi-Square and Naïve Bayes

Sentiment analysis is the computational study of opinions, sentiments, and emotions expressed in text. Its basic task is to classify the polarity of text at the document, sentence, or opinion level, that is, to decide whether the text expresses a positive or a negative aspect. In this study, polarity is classified with a machine learning technique, the Naïve Bayes classifier. The criteria for text classification decisions are learned automatically from the training data. Manual classification is still required, because the training data is produced by manual labeling; the label (feature) is a description added to each data item according to its category. During labeling, chi-square feature selection is applied to reduce disturbance (noise) in the classification. The results showed that the frequencies of occurrence of the expected features in the true category and in the false category play an important role in chi-square feature selection. Classification of breaking news with the Naïve Bayes classifier then obtained an accuracy of 83% and a harmonic average of 90.713%.
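As a rough illustration of the chi-square scoring mentioned in this abstract, the sketch below computes the standard chi-square statistic from a 2x2 term/category contingency table and keeps the highest-scoring terms. It is not taken from the paper: the function names `chi_square` and `top_terms`, the tokenized toy documents, and the `pos`/`neg` labels are all assumptions made for the example.

```python
from collections import Counter

def chi_square(a, b, c, d):
    """Chi-square score from a 2x2 term/category contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category without the term
    d: documents outside the category without the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def top_terms(docs, labels, target, k):
    """Rank terms by chi-square against the `target` label; keep the top k."""
    n_target = sum(1 for y in labels if y == target)
    n_other = len(labels) - n_target
    in_target, in_other = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count each term once per document (document frequency).
        (in_target if y == target else in_other).update(set(tokens))
    scores = {}
    for term in set(in_target) | set(in_other):
        a = in_target[term]
        b = in_other[term]
        scores[term] = chi_square(a, b, n_target - a, n_other - b)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Hypothetical toy data, not from the study.
docs = [["good", "movie"], ["good", "day"], ["bad", "movie"], ["bad", "day"]]
labels = ["pos", "pos", "neg", "neg"]
selected = top_terms(docs, labels, "pos", 2)
```

In this toy data, "good" and "bad" each occur in only one category and therefore score highest, while "movie" and "day" are spread evenly across both categories and score zero.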

A New Feature Selection Approach to Naive Bayes Text Classifiers

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001416500038 ◽

2016 ◽

Vol 30 (02) ◽

pp. 1650003 ◽

Cited By ~ 24

Author(s):

Lungan Zhang ◽

Liangxiao Jiang ◽

Chaoqun Li

Keyword(s):

Feature Selection ◽

Naive Bayes ◽

Naïve Bayes ◽

High Dimensional ◽

Text Data ◽

Selection Approach ◽

Search Feature ◽

Text Classifiers ◽

New Feature ◽

Feature Selection Approach

Handling text data is a challenge for machine learning because text data is high dimensional in many cases. Feature selection has been proved to be an effective approach to handling high-dimensional data. Feature selection approaches can be broadly divided into two categories: filter approaches and wrapper approaches. Generally, wrapper approaches achieve higher accuracy than filter approaches, but filter approaches always run faster. To combine the advantages of both, we propose a gain ratio-based hybrid feature selection approach to naive Bayes text classifiers. Like wrapper approaches, the hybrid approach uses base classifiers to evaluate feature subsets, but it does not need to repeatedly search feature subsets and build base classifiers. Experimental results on a large suite of benchmark text datasets show that the proposed hybrid feature selection approach significantly improves the classification accuracy of the original naive Bayes text classifiers without incurring the high time complexity that characterizes wrapper approaches.
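The gain ratio measure named in this abstract can be sketched as information gain divided by the split information of the feature. The code below shows only that measure for a discrete feature; it is not the authors' hybrid selection procedure, and the function names and toy values are assumptions made for the example.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split info."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A constant feature has zero split information and carries no signal.
    return 0.0 if split_info == 0 else info_gain / split_info

# Hypothetical toy data: 1 means the term is present in the document.
labels = ["a", "a", "b", "b"]
informative = gain_ratio([1, 1, 0, 0], labels)    # term tracks the class
uninformative = gain_ratio([1, 0, 1, 0], labels)  # term independent of class
```

A filter-style selector could rank terms by this score and keep the top ones; how the paper combines the score with base classifiers is not described in the abstract.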

Hybrid feature selection approach using bacterial foraging algorithm guided by Naive Bayes classification

2017 8th International Conference on Computing, Communication and Networking Technologies (ICCCNT) ◽

10.1109/icccnt.2017.8204178 ◽

2017 ◽

Cited By ~ 1

Author(s):

Divya Mittal ◽

Manju Bala

Keyword(s):

Feature Selection ◽

Naive Bayes ◽

Naïve Bayes ◽

Bacterial Foraging ◽

Selection Approach ◽

Naive Bayes Classification ◽

Bacterial Foraging Algorithm ◽

Naïve Bayes Classification ◽

Feature Selection Approach

A comparison of feature selection approach between greedy, IG-ratio, Chi-square, and mRMR in educational mining

2015 7th International Conference on Information Technology and Electrical Engineering (ICITEE) ◽

10.1109/iciteed.2015.7408983 ◽

2015 ◽

Cited By ~ 9

Author(s):

Nachirat Rachburee ◽

Wattana Punlumjeak

Keyword(s):

Feature Selection ◽

Chi Square ◽

Selection Approach ◽

Feature Selection Approach

A Novel Feature Selection Approach and Feature Weight Adjustment Technique in Text Classification

2009 Seventh ACIS International Conference on Software Engineering Research, Management and Applications ◽

10.1109/sera.2009.14 ◽

2009 ◽

Author(s):

Yixing Liao ◽

Xuezeng Pan

Keyword(s):

Feature Selection ◽

Text Classification ◽

Selection Approach ◽

Feature Weight ◽

Feature Selection Approach

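KLASIFIKASI TEKS MENGGUNAKAN CHI SQUARE FEATURE SELECTION UNTUK MENENTUKAN KOMIK BERDASARKAN PERIODE, MATERI DAN FISIK DENGAN ALGORITMA NAIVE BAYES (Text Classification Using Chi-Square Feature Selection to Determine Comics Based on Period, Material, and Physical Form with the Naive Bayes Algorithm)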

Compiler ◽

10.28989/compiler.v5i2.171 ◽

2016 ◽

Vol 5 (2) ◽

Author(s):

Siti Anisah ◽

Anton Setiawan Honggowibowo ◽

Asih Pujiastuti

Keyword(s):

Feature Selection ◽

Error Rate ◽

Classification System ◽

Naive Bayes ◽

Naïve Bayes ◽

Chi Square ◽

Oracle Database ◽

Category O ◽

The Difference ◽

Bayes Algorithm

A comic has its own characteristics compared with other types of books. The difference between comics and other books can be seen in the categories of period, material, and physical form. Distinguishing comics from other books calls for a classification system. To address this problem, a classification system was built using chi-square feature selection and the Naive Bayes algorithm to identify comics based on period, material, and physical form. The system was implemented in the Delphi programming language with an Oracle Database. Chi-square feature selection yielded a value of 0.10347 for the comic trait and 1.9531 for the non-comic trait. The data is then classified by the Naive Bayes algorithm. Of 120 titles, 60 comic and non-comic titles were used to build the training classes and 60 comic and non-comic titles were used for testing. The Naive Bayes algorithm achieved 96.67% with a 3.33% error rate for comics, and 90% with a 10% error rate for non-comics. The classification performs well at identifying comics.
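The Naive Bayes classification step described in this abstract can be sketched as a multinomial model with add-one (Laplace) smoothing. This is a generic illustration in Python, not the Delphi and Oracle system from the paper; the function names, the smoothing choice, and the toy tokens standing in for period, material, and physical attributes are all assumptions.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Collect class counts, per-class term counts, and the vocabulary."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, y in zip(docs, labels):
        term_counts[y].update(tokens)
    vocab = {t for counts in term_counts.values() for t in counts}
    return class_counts, term_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior for `tokens`."""
    class_counts, term_counts, vocab = model
    n_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for y in sorted(class_counts):
        total = sum(term_counts[y].values())
        score = math.log(class_counts[y] / n_docs)  # log prior
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                # Add-one smoothing avoids zero probabilities.
                score += math.log((term_counts[y][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = y, score
    return best_label

# Hypothetical toy training data, not from the paper.
train_docs = [["hero", "panel"], ["panel", "ink"],
              ["chapter", "essay"], ["essay", "index"]]
train_labels = ["comic", "comic", "other", "other"]
model = train_nb(train_docs, train_labels)
```

Calling `predict_nb(model, ["panel"])` on this toy model returns `"comic"`, because "panel" appears only in the comic training titles. An error rate like the ones reported above would be the share of held-out titles whose predicted class differs from the true class.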

RFE and Chi-Square Based Feature Selection Approach for Detection of Diabetic Retinopathy

Proceedings of the International Joint Conference on Science and Engineering (IJCSE 2020) ◽

10.2991/aer.k.201124.069 ◽

2020 ◽

Author(s):

Alifah ◽

Titin Siswantining ◽

Devvi Sarwinda ◽

Alhadi Bustamam

Keyword(s):

Feature Selection ◽

Diabetic Retinopathy ◽

Chi Square ◽

Selection Approach ◽

Feature Selection Approach
