Chi-Square Feature Selection Effect On Naive Bayes Classifier Algorithm Performance For Sentiment Analysis Document

Sentiment analysis is the computational study of opinions, sentiments, and emotions expressed in text. Its basic task is to classify the polarity of a text at the level of a document, a sentence, or an opinion, where polarity indicates whether the text expresses a positive or a negative aspect. This study classifies polarity with a machine learning technique, the Naïve Bayes classifier, in which the criteria for classification decisions are learned automatically from training data. Manual classification is still required because the training data come from manual labeling; labeling (feature) refers to the process of annotating each data item according to its category. During labeling, chi-square feature selection is applied to reduce disturbance (noise) in the classification. The results show that the frequencies with which the expected features occur in the true category and in the false category play an important role in chi-square feature selection. Classification of breaking news with the Naïve Bayes classifier achieved an accuracy of 83% and a harmonic mean of 90.713%.
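
As a rough illustration of how chi-square feature selection can use the frequencies of a feature in the true and false categories, the sketch below scores terms from a 2x2 contingency table of document counts. It assumes the common formulation N(AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D)); the abstract does not state which variant the study used, and the function names and toy documents are hypothetical.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score of a term for one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term appears in every document or in none
    return n * (a * d - b * c) ** 2 / denom


def rank_terms(docs, category):
    """Rank terms by chi-square score; docs is a list of (tokens, label)."""
    n_in = sum(1 for _, label in docs if label == category)
    n_out = len(docs) - n_in
    in_counts, out_counts = Counter(), Counter()
    for tokens, label in docs:
        target = in_counts if label == category else out_counts
        target.update(set(tokens))  # count each term once per document
    scores = {}
    for term in set(in_counts) | set(out_counts):
        a, b = in_counts[term], out_counts[term]
        scores[term] = chi_square(a, b, n_in - a, n_out - b)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus, not data from the study.
docs = [
    (["good", "film"], "pos"),
    (["good", "plot"], "pos"),
    (["bad", "film"], "neg"),
    (["bad", "plot"], "neg"),
]
ranked = rank_terms(docs, "pos")
print(ranked)
```

In this toy corpus, "good" and "bad" each occur in only one category and receive the highest scores, while "film" and "plot" are spread evenly and score zero, so a selector that keeps the top-scoring terms would discard them as noise.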


Intrusion Detection Model Using Chi Square Feature Selection and Modified Naïve Bayes Classifier

Proceedings of the 3rd International Symposium on Big Data and Cloud Computing Challenges (ISBCC – 16’) - Smart Innovation, Systems and Technologies; 10.1007/978-3-319-30348-2_7; 2016; pp. 81-91

Cited By ~ 3

Author(s): I. Sumaiya Thaseen, Ch. Aswani Kumar

Keyword(s): Feature Selection, Intrusion Detection, Naive Bayes, Naïve Bayes, Naive Bayes Classifier, Bayes Classifier, Naïve Bayes Classifier, Chi Square, Detection Model


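Analisis Sentiment Tweets Berbahasa Sunda Menggunakan Naive Bayes Classifier dengan Seleksi Feature Chi Squared Statistic (Sentiment Analysis of Sundanese-Language Tweets Using the Naive Bayes Classifier with Chi Squared Statistic Feature Selection)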

Jurnal Informatika Universitas Pamulang; 10.32493/informatika.v4i3.3186; 2019; Vol 4 (3); pp. 87

Author(s): Yono Cahyono, Saprudin Saprudin

Keyword(s): Social Media, Sentiment Analysis, Naive Bayes, Naïve Bayes, Naive Bayes Classifier, Bayes Classifier, Naïve Bayes Classifier, Chi Square, Chi Squared, Use Of Social Media

Social media use in Indonesia is growing very rapidly. Indonesia has a variety of regional languages, one of which is Sundanese, and some people, especially those living in West Java, use Sundanese to express comments, opinions, suggestions, criticisms, and more on social media. This information can serve as valuable data for individuals or organizations in decision making, but the huge amount of data makes it impossible for humans to read and analyze it manually. Sentiment analysis is the process of analyzing, understanding, evaluating, and classifying opinions, emotions, and attitudes towards a particular entity, such as individuals, organizations, products or services, topics, or events, in order to obtain information. The purpose of this research is to show that the Naïve Bayes Classifier (NBC) classification algorithm and the Chi Squared Statistic feature selection method can be used for sentiment analysis of Sundanese-language tweets on Twitter, classifying them into positive, negative, and neutral categories. The results show that Chi Squared Statistic feature selection can reduce irrelevant features in the Naïve Bayes Classifier classification of Sundanese-language tweets, with an accuracy of 78.48%.
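
To make the three-category classification step concrete, here is a minimal sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing. The smoothing choice, the function names, and the English placeholder tokens are all assumptions for illustration; the abstract does not describe the exact model variant, preprocessing, or data used in the study.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Fit log priors and per-class term counts; docs is a list of (tokens, label)."""
    class_docs = Counter(label for _, label in docs)
    term_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        term_counts[label].update(tokens)
        vocab.update(tokens)
    log_priors = {c: math.log(n / len(docs)) for c, n in class_docs.items()}
    return log_priors, term_counts, vocab


def predict_nb(model, tokens):
    """Return the label with the highest posterior log score."""
    log_priors, term_counts, vocab = model
    best_label, best_score = None, -math.inf
    for label in sorted(log_priors):
        total = sum(term_counts[label].values())
        score = log_priors[label]
        for t in tokens:
            if t not in vocab:
                continue  # skip terms never seen in training
            # Add-one smoothing keeps unseen (term, class) pairs above zero.
            score += math.log((term_counts[label][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical placeholder data, not tweets from the study.
train = [
    (["great", "service"], "positive"),
    (["love", "it"], "positive"),
    (["terrible", "service"], "negative"),
    (["hate", "it"], "negative"),
    (["service", "update"], "neutral"),
    (["schedule", "update"], "neutral"),
]
model = train_nb(train)
print(predict_nb(model, ["great", "love"]))
print(predict_nb(model, ["schedule", "update"]))
```

In a pipeline like the one the abstract describes, a feature selection step would shrink the vocabulary before training, so that only the retained terms contribute to each class score.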


SENTIMENT ANALYSIS ON CLOSURE OF ILLEGAL MOVIE STREAMING SITES USING NAÏVE BAYES ALGORITHM

Jurnal Pilar Nusa Mandiri; 10.33480/pilar.v16i1.1306; 2020; Vol 16 (1); pp. 123-128

Author(s): Dinda Ayu Muthia

Keyword(s): Genetic Algorithm, Feature Selection, Sentiment Analysis, Naive Bayes, Naïve Bayes, Naive Bayes Classifier, Bayes Classifier, Selection Methods, Naïve Bayes Classifier, Advantages And Disadvantages

The closure of the illegal movie streaming site IndoXXI was a trending topic on Twitter at the end of 2019, and the reactions of netizens on Twitter showed both positive and negative sentiments. Many studies in the field of sentiment analysis have used data in the form of tweets from Twitter users. Of the many methods used in sentiment analysis research, Naïve Bayes is one, chosen because it is very simple and efficient. The method has advantages and disadvantages: Naïve Bayes is very sensitive to feature selection, and too many features not only increase calculation time but also reduce classification accuracy. To address this disadvantage and improve the performance of the Naïve Bayes classifier, the method is often combined with various feature selection methods. This research aims to classify tweets as positive or negative using the Naïve Bayes classifier combined with the Genetic Algorithm. The accuracy of Naïve Bayes before applying feature selection was 79.55%. After applying the Genetic Algorithm as the feature selection method, accuracy increased to 88.64%, an improvement of 9.09 percentage points.
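
The reported improvement is the difference between the two accuracy figures, not a relative increase. The short check below works through both readings using only the numbers given in the abstract; the variable names are illustrative.

```python
# Accuracy figures quoted in the abstract, in percent.
before = 79.55  # Naive Bayes without feature selection
after = 88.64   # Naive Bayes with Genetic Algorithm feature selection

# Absolute gain, in percentage points.
point_gain = round(after - before, 2)

# Relative gain, as a percentage of the starting accuracy.
relative_gain = round((after - before) / before * 100, 2)

print(f"absolute gain: {point_gain} percentage points")
print(f"relative gain: {relative_gain}% of the baseline accuracy")
```

The absolute gain matches the 9.09 figure in the abstract, while the relative gain over the baseline comes to about 11.43%.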
