Naive Bayes Classifier Optimization on Sentiment Analysis of Hotel Reviews

2020 ◽  
Vol 10 (2) ◽  
pp. 157-168
Author(s):  
Siti Khomsah

Feature extraction plays an important role in sentiment analysis, especially for text data. The Naive Bayes Classifier performs well on low-dimensional feature sets, but its accuracy is not optimal. To obtain an optimal machine learning model, methods such as information gain, evolutionary algorithms, and swarm intelligence algorithms are applied. The objective of this study is to evaluate how well Particle Swarm Optimization (PSO) optimizes the Naive Bayes Classifier. Words are vectorized using TF-IDF. To obtain high PSO performance, the PSO-NBC model is tested with several parameters: the number of particles (k = 3), the number of iterations, the inertia weight, the individual intelligence coefficient (c1 = 1), and the social intelligence coefficient (c2 = 2). The inertia weight is calculated as w = 0.5 + Rand([-1, 1]). In conclusion, PSO is able to search the problem space of text-based sentiment analysis. PSO optimizes Naive Bayes accuracy, with values from 89% to 91.76%. PSO performance depends on the parameters used, especially the number of particles, the number of iterations, and the inertia weight. A large number of particles combined with a higher inertia weight can increase accuracy. Optimal accuracy is reached with 20 to 30 particles.
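
The PSO update described in this abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: it uses the stated coefficients (c1 = 1, c2 = 2) and the inertia formula as written in the abstract, but the function names, the choice to draw one inertia weight per iteration, the clamping of positions to [0, 1], and the `toy_fitness` stand-in (used here in place of Naive Bayes accuracy) are all assumptions.

```python
import random


def random_inertia(rng):
    # Inertia weight as written in the abstract: w = 0.5 + Rand([-1, 1]).
    return 0.5 + rng.uniform(-1.0, 1.0)


def toy_fitness(weights):
    # Hypothetical stand-in for classifier accuracy: peaks when every
    # feature weight equals 0.7. A real run would train and score a model.
    return -sum((x - 0.7) ** 2 for x in weights)


def pso_maximize(fitness, dim, n_particles=20, n_iter=30,
                 c1=1.0, c2=2.0, lo=0.0, hi=1.0, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # global best so far
    history = [gbest_fit]

    for _ in range(n_iter):
        w = random_inertia(rng)  # assumption: one draw per iteration
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp so each coordinate stays a valid feature weight.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history
```

Calling `pso_maximize(toy_fitness, dim=4)` returns the best position found, its fitness, and the best fitness recorded after each iteration. Because the global best is only replaced by a strictly better position, that history never decreases, whatever inertia weight is drawn.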



2019 ◽  
Vol 3 (2) ◽  
pp. 227-232
Author(s):  
Bobby Suryo Prakoso ◽  
Didi Rosiyadi ◽  
Heru Sukma Utama ◽  
Dedi Aridarma

This research is part of text mining and classifies news content that has already been labeled by news category on the detik.com site. The process covers modeling and data processing, from pre-processing, through information gain feature selection, to applying the Naive Bayes Classifier algorithm with Bayesian Boosting. This model obtained evaluation scores for accuracy, recall, and precision of 73.2%. Meanwhile, the more concise model, namely the Naive Bayes Classifier algorithm with Bayesian Boosting, obtained the same evaluation score of 73.2%. The assessment of these evaluation results concludes that applying Information Gain feature selection has no major effect on performance for the Polynomial label condition.
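
Information gain feature selection, as named in this abstract, scores each term by how much knowing whether it occurs in a document reduces the entropy of the class labels. The sketch below is a minimal illustration under stated assumptions: documents are represented as sets of tokens, and the function names and the four toy documents are hypothetical, not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(docs, labels, term):
    # Split the labels by whether the term occurs in each document.
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Weighted entropy after the split; empty partitions contribute nothing.
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_term, without_term) if part)
    return entropy(labels) - conditional


# Made-up toy corpus: each document is a set of tokens.
docs = [{"goal", "match", "the"}, {"goal", "team", "the"},
        {"vote", "law", "the"}, {"vote", "party", "the"}]
labels = ["sport", "sport", "politics", "politics"]

# Rank the vocabulary by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary,
                key=lambda t: information_gain(docs, labels, t),
                reverse=True)
```

In this toy corpus, "goal" and "vote" separate the two classes perfectly and score 1 bit, while "the" appears in every document and scores 0. A selection step would keep only the top-ranked terms before training the classifier.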


CAUCHY ◽  
2021 ◽  
Vol 7 (1) ◽  
pp. 28-39
Author(s):  
Adri Priadana ◽  
Ahmad Ashril Rizal

The COVID-19 pandemic has affected all industries in Indonesia and around the world, including tourism. Researchers have a role in addressing the needs of the tourism industry, especially in designing tourism and business destination management programs and carrying out activities oriented to those needs. Meanwhile, the government has a role in making policy, especially the roadmap for developing the tourism industry. This study aims to track trending topics on Instagram since COVID-19 hit. The trending topics are classified by sentiment using a lexicon-based approach and a Naive Bayes classifier. Instagram data collected since January 2020 show that the five most frequent topics in the tourism sector are health protocols, hotels, homes, streets, and beaches. Sentiment analysis of these five topics with the lexicon-based approach and the Naive Bayes classifier shows that beaches receive strongly positive sentiment (80.87%), while hotels receive the highest negative sentiment (57.89%). Evaluated with a confusion matrix, the sentiment classification achieves accuracy, precision, and recall of 82.53%, 86.99%, and 83.43%, respectively.
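
Accuracy, precision, and recall of the kind reported in this abstract are derived from the four cells of a binary confusion matrix. The sketch below shows the standard definitions. The function name and the example counts are made up for illustration and are not the study's data.

```python
def confusion_metrics(tp, fp, fn, tn):
    # tp/fp/fn/tn: true positives, false positives,
    # false negatives, and true negatives.
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for a positive-vs-negative sentiment classifier.
accuracy, precision, recall = confusion_metrics(tp=8, fp=2, fn=1, tn=9)
```

With these invented counts, accuracy is 17/20, precision is 8/10, and recall is 8/9. Precision can exceed recall, as in the abstract's figures, when the classifier makes few false positive calls relative to the positives it misses.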


2021 ◽  
Author(s):  
Adhitia Erfina ◽  
Moneyta Dholah Rosita Ndk ◽  
Rahmat Hidayat ◽  
Aris Subagja ◽  
Haerul Ramadhan ◽  
...  
