Sentiment Analysis of the Odd-Even Policy on the Bekasi Toll Road Using the Naive Bayes Algorithm with Information Gain Optimization

This study applies sentiment analysis with the Naïve Bayes (NB) algorithm to the odd-even system on the Bekasi toll road, a process of automatically understanding, extracting, and processing textual data from social media. The purpose of the study was to determine the accuracy, recall, and precision of opinion mining with the Naïve Bayes algorithm, in order to describe community sentiment on social media towards the effectiveness of the odd-even system on the Bekasi toll road. The research method was text mining of comments on posts about the odd-even system on the Bekasi toll road on Twitter, Instagram, YouTube, and Facebook. The steps were preprocessing, transformation, data mining, and evaluation, followed by Information Gain feature selection, select-by-weight, and application of the NB model. The confusion matrix for the NB model gave an accuracy of 79.55%, a precision of 80.51%, and a sensitivity (recall) of 80.91%. The study concludes that the Naïve Bayes algorithm can be used to analyze sentiment about the odd-even system on the Bekasi toll road.
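
The accuracy, precision, and recall reported above all derive from the counts in a binary confusion matrix. A minimal Python sketch of that arithmetic follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Hypothetical counts for illustration only (not the paper's data).
acc, prec, rec = confusion_metrics(tp=80, fp=20, fn=10, tn=90)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%}")
```

Precision and recall here treat one class as "positive"; a study that averages over both classes would compute the pair once per class and then average.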

Sentiment Analysis of the Odd-Even System on the Bekasi Toll Road Using the Support Vector Machine Algorithm

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v3i2.1050 ◽

2019 ◽

Vol 3 (2) ◽

pp. 243-250

Author(s):

Heru Sukma Utama ◽

Didi Rosiyadi ◽

Bobby Suryo Prakoso ◽

Dedi Ariadarma

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Opinion Mining ◽

Confusion Matrix ◽

Support Vector ◽

Support Vector Machine Algorithm ◽

Toll Road ◽

Svm Algorithm ◽

Svm Model ◽

Textual Data

This study applies sentiment analysis with the Support Vector Machine (SVM) algorithm to the odd-even system on the Bekasi toll road, a process of automatically understanding, extracting, and processing textual data from social media. The purpose of the study was to determine the accuracy, recall, and precision of opinion mining with the Support Vector Machine algorithm, in order to describe community sentiment on social media towards the effectiveness of the odd-even system on the Bekasi toll road. The research method was text mining of comments on posts about the odd-even system on the Bekasi toll road on Twitter, Instagram, YouTube, and Facebook. The steps were preprocessing, transformation, data mining, and evaluation, followed by Information Gain feature selection, select-by-weight, and application of the SVM model. The confusion matrix for the SVM model gave an accuracy of 78.18%, a precision of 74.03%, and a sensitivity (recall) of 86.82%. The study concludes that the Support Vector Machine algorithm can be used to analyze sentiment about the odd-even system on the Bekasi toll road.

Opinion Mining on Social Media Transit Tweets using Text Pre-Processing and Machine Learning Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a4631.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 1015-1025

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Feature Representation ◽

Machine Learning Techniques ◽

Support Vector ◽

The Impact

Capturing public insights about transit systems from social media has become very popular. Regional transportation agencies use social media to provide information to the public and to seek input and ideas for meaningful decision making in transportation activities. This exploratory study attempts to gauge the impact of social media use in transportation planning, which in turn would help transportation administrators identify the day-to-day challenges faced by customers and suggest suitable solutions. The paper presents the effect of pre-processing techniques on the performance of transit opinion analysis. It evaluates several pre-processing methods (stop word removal, stemming, lemmatization, negation handling, and URL removal) with two feature representation models (TF-IDF with unigrams and TF-IDF with bigrams) and three feature selection techniques (information gain, standard deviation, and chi-square) on social media opinions from transit riders. The experimental results are evaluated with four classifiers (Support Vector Machine, Naïve Bayes, Decision Tree, and K-Nearest Neighbor) in terms of accuracy, precision, recall, and F-measure. On the social media transit opinion data, pre-processing with the bigram representation performs better than the other approaches, particularly with Support Vector Machine and Naïve Bayes.
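
The difference between the unigram and bigram TF-IDF representations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses raw term frequency and idf = log(N / df), which is one of several common TF-IDF variants and not necessarily the one used in the paper, and the example comments are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (joined with a space) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf(docs, n=1):
    """Compute TF-IDF weights per document for n-gram features.

    Assumes raw term frequency and idf = log(N / df); other variants exist.
    """
    grams = [ngrams(doc.lower().split(), n) for doc in docs]
    df = Counter(g for doc in grams for g in set(doc))  # document frequency
    n_docs = len(docs)
    weights = []
    for doc in grams:
        tf = Counter(doc)
        weights.append({g: tf[g] * math.log(n_docs / df[g]) for g in tf})
    return weights

# Hypothetical transit comments, for illustration only.
docs = ["bus was late again", "bus was on time", "train was late"]
unigram = tfidf(docs, n=1)
bigram = tfidf(docs, n=2)
```

In this toy corpus the unigram "was" occurs in every document and receives weight zero, while bigrams such as "late again" keep word-order information that unigrams discard.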

Analysis of Sentiment of Moving a National Capital with Feature Selection Naive Bayes Algorithm and Support Vector Machine

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v4i3.1942 ◽

2020 ◽

Vol 4 (3) ◽

pp. 504-512

Author(s):

Faried Zamachsari ◽

Gabriel Vangeran Saragih ◽

Susafa'ati ◽

Windu Gata

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Feature Selection ◽

Public Opinion ◽

Naive Bayes ◽

Naïve Bayes ◽

Capital City ◽

Support Vector ◽

National Capital ◽

Bayes Algorithm

The decision to move Indonesia's capital city to East Kalimantan received mixed responses on social media. The still-high poverty rate and the country's difficult finances are factors in disapproval of the relocation. Twitter, one of the popular social media platforms, is used by the public to express these opinions. The goals of this study are to determine the tendency of community responses to the move of the national capital and to perform sentiment analysis of public opinion on the move using Naive Bayes and Support Vector Machine with feature selection, in order to obtain the highest accuracy. The data are Indonesian-language tweets collected from Twitter by crawling, using the search terms #IbuKotaBaru and #PindahIbuKota. The research stages consisted of data collection from Twitter, polarity labeling, and preprocessing, which comprised case transformation, cleansing, tokenizing, filtering, and stemming. Feature selection was applied to increase accuracy, and the data were then divided into training and testing sets at predetermined ratios. The Support Vector Machine and Naive Bayes methods were then compared to determine which is more accurate. In the data period studied, 24.26% of sentiment about the move to a new capital city was positive and 75.74% was negative. Using RapidMiner, the best accuracy for Naive Bayes with feature selection was 88.24% at a ratio of 9:1, while the best accuracy for Support Vector Machine with feature selection was 78.77% at a ratio of 5:5.
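
The preprocessing stages named above (case transformation, cleansing, tokenizing, filtering) can be sketched as a single Python function. The stop-word list, the regular expressions, and the choice to drop hashtags and mentions are all assumptions made for illustration; stemming is omitted because it needs a language-specific stemmer, and the study itself used RapidMiner operators rather than code like this.

```python
import re

# Tiny hypothetical stop-word list; a real study would use a full Indonesian list.
STOPWORDS = {"yang", "di", "ke", "dan", "ini", "itu"}

def preprocess(tweet, stopwords=STOPWORDS):
    """Apply case folding, cleansing, tokenizing and stop-word filtering."""
    text = tweet.lower()                          # transform case
    text = re.sub(r"https?://\S+", " ", text)     # cleansing: URLs
    text = re.sub(r"[@#]\w+", " ", text)          # cleansing: mentions, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)         # cleansing: digits, punctuation
    tokens = text.split()                         # tokenizing
    return [t for t in tokens if t not in stopwords]  # filtering

print(preprocess("Ibu kota pindah ke Kalimantan! #PindahIbuKota https://example.com"))
```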

Application of Naïve Bayes Algorithm in Sentiment Analysis of Filipino, English and Taglish Facebook Comments

Regular issue - International Journal of Management and Humanities ◽

10.35940/ijmh.e0524.014520 ◽

2020 ◽

Vol 4 (5) ◽

pp. 73-77

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Language Processing ◽

Opinion Mining ◽

Naive Bayes ◽

Naïve Bayes ◽

Product Reviews ◽

Documentary Data ◽

The Social ◽

Bayes Algorithm

The content of the World Wide Web has grown over the past years; it holds a vast and continuously growing amount of multimedia resources, particularly documentary data. One of the major contributors of documentary content is the social media platform Facebook. People, or netizens, on Facebook actively share their opinions about topics or posts that may or may not relate to them. This large amount of accessible documentary data on social media opens research directions in the field of opinion mining. A netizen's comment on a particular post can be either negative or positive. This study examines whether a netizen's opinion or comment about a specific topic posted on Facebook is positive or negative, that is, how she or he feels about it; this can be measured with sentiment analysis. Sentiment analysis combines Natural Language Processing with text analytics and is used to extract useful information from data. The study is based on product reviews written by Filipinos in Filipino, English, and Taglish (mixed Filipino and English). To categorize comments effectively, the Naïve Bayes algorithm was implemented in the developed web system.

Opinion Mining on Culinary Food Customer Satisfaction Using Naïve Bayes Based-on Hybrid Feature Selection

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v15.i1.pp468-475 ◽

2019 ◽

Vol 15 (1) ◽

pp. 468 ◽

Cited By ~ 3

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Classification Model ◽

Consumer Ratings ◽

Bayes Algorithm ◽

Restaurant Owners

Assessing consumer sentiment about culinary food from social media provides useful information for anyone who wants it, especially migrants and tourists; it is also very valuable to food stall and restaurant owners as input for improving food quality. To obtain this information, a sentiment analysis classification model using the naïve Bayes (NB) algorithm was applied. The problem is that the classification accuracy for consumer ratings of culinary food is still not optimal, because the feature weights produced in data preprocessing are not optimal. This paper proposes a hybrid feature selection model that combines information gain (IG) and a genetic algorithm (GA) to overcome the suboptimal selection of feature attributes. The experiments showed that, compared with the other algorithms, the proposed model produced the best accuracy, 93%.
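
The information gain half of such a hybrid selector can be illustrated with a short Python sketch. It scores a term by how much knowing its presence reduces class entropy; the genetic algorithm stage is not shown, and the function names and toy reviews are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG of a term: class entropy minus entropy after splitting on term presence."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in (with_term, without) if part)
    return entropy(labels) - remainder

# Hypothetical tokenized reviews (each a set of words), for illustration only.
docs = [{"enak", "murah"}, {"enak", "bersih"}, {"mahal", "lama"}, {"lama", "murah"}]
labels = ["pos", "pos", "neg", "neg"]
ranked = sorted({t for d in docs for t in d},
                key=lambda t: information_gain(docs, labels, t), reverse=True)
```

Terms that split the classes cleanly ("enak", "lama" in this toy data) rank first, while a term spread evenly across both classes ("murah") scores zero and would be the first to be discarded.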

Sentiment Analysis of the E-Tilang (E-Ticketing) System Using the Naive Bayes Algorithm with Information Gain Optimization

Journal of Informatic and Information Security ◽

10.31599/jiforty.v1i1.137 ◽

2020 ◽

Vol 1 (1) ◽

pp. 19-26

Author(s):

Rakhmi Khalida ◽

Siti Setiawati

Keyword(s):

Sentiment Analysis ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Traffic Violations ◽

The Government ◽

Bayes Algorithm ◽

User Friendly

The Government of Indonesia has taken steps to improve public services for traffic violations by implementing the e-ticketing (e-Tilang) system. The system is a solution for disciplining motorists who commit traffic violations. E-ticketing is also a way to prevent misconduct by law enforcers, from illegal levies and on-the-spot "peace" settlements to the accountability of fine money. This study performs sentiment analysis, or opinion mining, on the e-ticketing system to classify public comments as positive, negative, or neutral. Twitter is one of the channels used to express opinions because it is user friendly, carries up-to-date topics, and gives open access to tweets. Opinions on Twitter are collected and preprocessed; Information Gain feature selection then helps reduce noise caused by irrelevant labels; next, sentiments are classified with the Naïve Bayes algorithm; and finally sentiment polarity is determined. The research resulted in an accuracy of 41.82%, a precision of 50.51%, and a recall of 45.45%. Keywords: sentiment analysis, e-ticketing, Information Gain, Naive Bayes.

A Machine Learning Framework for Improving Classification Performance on Credit Approval

IJID (International Journal on Informatics for Development) ◽

10.14421/ijid.2021.2384 ◽

2021 ◽

Vol 10 (1) ◽

pp. 47-52

Author(s):

Pulung Hendro Prastyo ◽

Septian Eko Prasetyo ◽

Shindy Arti

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Information Gain ◽

Learning Algorithm ◽

Confusion Matrix ◽

Credit Scoring ◽

Research Work ◽

Classification Performance ◽

Naïve Bayes ◽

Bayes Algorithm

Credit scoring is a model commonly used in the decision-making process to refuse or accept loan requests. The credit score model depends on the type of loan or credit and is complemented by various credit factors. At present, there is no accurate model for determining which creditors are eligible for loans. An accurate and automatic model is therefore needed to make it easier for banks to determine appropriate creditors. To address the problem, we propose a new approach that combines a machine learning algorithm (Naïve Bayes), Information Gain (IG), and discretization to classify creditors. This research employed an experimental method using the Weka application. The Australian Credit Approval dataset, which contains 690 instances, was used. In this study, Information Gain is employed as a feature selection method to select relevant features so that the Naïve Bayes algorithm can work optimally. The confusion matrix is used as the evaluator and 10-fold cross-validation as the validator. Based on the experimental results, the proposed method improved classification performance, reaching its highest average accuracy, precision, recall, and F-measure at 86.29%, 86.33%, 86.29%, and 86.30%, respectively. The proposed method also obtains a ROC area of 91.52%, which indicates that it can be classified as an excellent classifier.
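
The 10-fold cross-validation used as the validator can be sketched as an index generator in Python. This is a simplified illustration, not Weka's procedure: the folds here are contiguous and neither shuffled nor stratified by class, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

# 690 matches the dataset size stated in the abstract; shuffling is omitted here.
folds = list(k_fold_indices(690, k=10))
```

Each instance lands in the test set exactly once, so averaging a metric over the ten folds uses every instance for evaluation while never testing on data the model was trained on in that fold.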

Cyberbully Detection Using Term Weighting Scheme and Naïve Bayes Classifier

International Journal of Innovative Computing ◽

10.11113/ijic.v10n1.254 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Rafeena Mohamad Rabii ◽

Maheyzah Md Siraj

Keyword(s):

Social Media ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Term Weighting ◽

Weighting Schemes ◽

Bayes Algorithm

The internet, especially social media, has become a major platform where people interact with each other. Advances in technology let us interact regardless of time and place. Unfortunately, not all of these interactions are good or positive. One negative interaction that can happen online is cyberbullying, which has increased rapidly over the years, whether through social media, email, or texting. It is therefore important to prevent cyberbullying, which is the motivation for this research. Detecting the presence of cyberbullying is one of the main issues in preventing it. Cyberbullying detection can be challenging because of the many languages used in the world, the frequent use of slang and informal language, and the use of special characters such as emoji in online conversation. The aim of this research is to detect text-based cyberbullying in online posts. Two term weighting schemes and two classification algorithms are compared. The weighting schemes are Entropy and Term Frequency - Inverse Document Frequency (TF-IDF), used for feature selection, and the Naïve Bayes algorithm is compared with the Support Vector Machine (SVM) algorithm. The results show that the Naïve Bayes classifier yields the better accuracy, 97.60%, when used with TF-IDF. We hope this research gives insight to other researchers, particularly those interested in a similar area.

Determining Bullying Text Classification Using Naive Bayes Classification on Social Media

Jurnal Varian ◽

10.30812/varian.v4i2.1086 ◽

2021 ◽

Vol 4 (2) ◽

pp. 133-140

Author(s):

Ade Clinton Sitepu ◽

Wanayumini Wanayumini ◽

Zakarias Situmorang

Keyword(s):

Social Media ◽

Naive Bayes ◽

Rapid Development ◽

Confusion Matrix ◽

Area Under The Curve ◽

Naïve Bayes ◽

Cyber Bullying ◽

Training Data ◽

The Media ◽

Bayes Algorithm

Cyber-bullying consists of repeated acts intended to scare, anger, or embarrass those who are targeted. It is occurring alongside the rapid development of technology and social media in society. The media and users need to filter out bullying comments because they can affect the mental state of those who read them, especially when aimed directly at that person. By using information mining, a system is expected to be able to classify information circulating in the community. One classification technique that can be applied to text is Naïve Bayes, an algorithm that performs well in classification. In this research, the precision of the algorithm was measured on a dataset of 1000 comments. The data were first labeled manually as "bully" or "not bully" and then divided into training data and test data. To test the system's ability, the classified data were analyzed using the confusion matrix method. The results showed that the Naïve Bayes algorithm achieved a precision of 87% and an area under the curve (AUC) of 88%. In terms of speed, the Naïve Bayes algorithm performed very well, with a completion time of 0.033 seconds.
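
How a Naïve Bayes text classifier assigns the "bully" or "not bully" label can be shown with a minimal multinomial variant in Python. This is a sketch under stated assumptions (word-count features, add-one smoothing, unseen words ignored), not the authors' implementation; the class name and training comments are invented for illustration.

```python
import math
from collections import Counter, defaultdict

class MultinomialNB:
    """Minimal multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.word_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][w] + 1)
                                      / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical labelled comments, for illustration only.
train = [("you are stupid and ugly".split(), "bully"),
         ("nobody likes you loser".split(), "bully"),
         ("great game well played".split(), "not bully"),
         ("thanks for the help".split(), "not bully")]
model = MultinomialNB().fit([d for d, _ in train], [y for _, y in train])
print(model.predict("you stupid loser".split()))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the add-one smoothing keeps a word seen in only one class from forcing the other class's probability to zero.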

Sentiment Analysis of Toxic Comments in an Online Game Facebook Group Using Naïve Bayes Classification

Jurnal Informatika Universitas Pamulang ◽

10.32493/informatika.v5i3.6571 ◽

2020 ◽

Vol 5 (3) ◽

pp. 356

Author(s):

Renaldy Permana Sidiq ◽

Budi Arif Dermawan ◽

Yuyun Umaidah

Keyword(s):

Social Media ◽

Feature Selection ◽

Naive Bayes ◽

Information Gain ◽

Text Processing ◽

Confusion Matrix ◽

Naïve Bayes ◽

Classification Model ◽

Testing Data ◽

F Measure

Toxic comments are comments by social media users that contain hatred, condescension, threats, or insults. Social media users are on average still teenagers whose temperament is not yet fully under control, which makes their comments a matter of great concern; these comments can be studied through text processing. Sentiment analysis can be used to identify toxic comments by dividing comments into two classes. The data comprised 1,500 comments taken from the private Facebook group of the Arena of Valor community. The dataset is divided into two classes: toxic and non-toxic. This research uses Naive Bayes with TF-IDF transformation and Information Gain feature selection, with an 80:20 split between training and testing data. Three configurations are compared: Naive Bayes without transformation, Naive Bayes with TF-IDF transformation, and Naive Bayes with TF-IDF and Information Gain feature selection. The confusion-matrix evaluation showed that the best classification model uses the 80:20 ratio of training and testing data with TF-IDF transformation, giving an accuracy of 75%, precision of 63%, recall of 67%, and F-measure of 64%.
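
The 80:20 division of the 1,500 comments can be sketched in Python as a seeded shuffle followed by a cut. The function name, the seed, and the use of integer IDs in place of comments are assumptions for illustration; the abstract does not say whether the split was random or stratified.

```python
import random

def train_test_split(items, train_ratio=0.8, seed=42):
    """Shuffle a copy of items and split it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# 1,500 matches the comment count in the abstract; the IDs are placeholders.
train, test = train_test_split(range(1500), train_ratio=0.8)
```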
