Solving unbalanced data for Thai sentiment analysis

Aspect-Based Sentiment Analysis of Female Daily Reviews Using TF-IDF and Naïve Bayes

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽

10.30865/mib.v5i2.2845 ◽

2021 ◽

Vol 5 (2) ◽

pp. 422

Author(s):

Clarisa Hasya Yutika ◽

Adiwijaya Adiwijaya ◽

Said Al Faraby

Keyword(s):

Sentiment Analysis ◽

Naïve Bayes ◽

Unbalanced Data ◽

Test Results ◽

Product Review ◽

Beauty Products

Product reviews offer considerable benefits to both producers and consumers. Female Daily is a forum that discusses beauty products, and it receives many new reviews every day, so a technique is needed to turn these reviews into valuable information. One such technique is aspect-based sentiment analysis, which analyzes each text to identify its aspects (attributes or components) and then determines the sentiment (positive, negative, or neutral) expressed toward each aspect. Some of the collected reviews mix several languages, so they are first translated into a single language, Indonesian. The reviews are then preprocessed to make them easier to process. Words are weighted using TF-IDF, and sentiment is classified with Complement Naïve Bayes to cope with unbalanced data. In testing, the best F1-score was 62.81%, obtained for data translated into English and then into Indonesian, without stopword removal.
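The TF-IDF weighting and Complement Naïve Bayes steps named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy English reviews, the whitespace tokenizer, the smoothed IDF formula, and the function names (`fit_tfidf`, `train_cnb`, `predict`) are all assumptions made for the example. The idea it shows is that each class is scored from the term weights of all the *other* classes, which makes the estimates less sensitive to a small minority class.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy reviews (not data from the paper).
# The classes are deliberately unbalanced: four positive, one negative.
train = [
    ("love this serum", "positive"),
    ("great texture love it", "positive"),
    ("nice smell great price", "positive"),
    ("love the packaging", "positive"),
    ("bad smell broke out", "negative"),
]

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def fit_tfidf(docs):
    # Smoothed IDF (an assumed variant): log((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(d).items()} for d in docs]
    return idf, vectors

def train_cnb(vectors, labels, vocab, alpha=1.0):
    # For each class c, accumulate term weights from documents NOT in c,
    # then turn the smoothed complement estimates into per-term weights.
    weights = {}
    for c in sorted(set(labels)):
        comp = defaultdict(float)
        for vec, y in zip(vectors, labels):
            if y != c:
                for t, w in vec.items():
                    comp[t] += w
        total = sum(comp.values()) + alpha * len(vocab)
        weights[c] = {t: -math.log((comp[t] + alpha) / total) for t in vocab}
    return weights

def predict(weights, idf, tokens):
    # Terms unseen during training are ignored.
    vec = {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}
    return max(weights, key=lambda c: sum(w * weights[c][t] for t, w in vec.items()))

docs = [tokenize(text) for text, _ in train]
labels = [label for _, label in train]
idf, vectors = fit_tfidf(docs)
weights = train_cnb(vectors, labels, vocab=list(idf), alpha=1.0)

print(predict(weights, idf, tokenize("bad broke out")))
print(predict(weights, idf, tokenize("love great")))
```

In practice a library implementation would usually be preferable; scikit-learn, for instance, provides `TfidfVectorizer` and `ComplementNB`, although the abstract does not say which tools the authors used.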

Addressing the Problem of Unbalanced Data Sets in Sentiment Analysis

Proceedings of the International Conference on Knowledge Discovery and Information Retrieval ◽

10.5220/0004142603060311 ◽

2012 ◽

Keyword(s):

Sentiment Analysis ◽

Unbalanced Data ◽

Data Sets

Study of Sentiment of Governor's Election Opinion in 2018

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset21841124 ◽

2018 ◽

pp. 231-238

Author(s):

Agung Eddy Suryo Saputro ◽

Khairil Anwar Notodiputro ◽

Indahwati A

Keyword(s):

Sentiment Analysis ◽

Naïve Bayes ◽

Addition Method ◽

Sentiment Mining ◽

Positive Sentiment ◽

An historical analysis of species references in American English

Corpora ◽

10.3366/cor.2019.0177 ◽

2019 ◽

Vol 14 (3) ◽

pp. 327-349

Author(s):

Craig Frayne

Keyword(s):

Environmental Change ◽

Sentiment Analysis ◽

Quantitative Methods ◽

English Language ◽

Language Use ◽

American English ◽

Historical Analysis ◽

The Past ◽

Corpus Studies ◽

Google Books

This study uses the two largest available American English language corpora, Google Books and the Corpus of Historical American English (COHA), to investigate relations between ecology and language. The paper introduces ecolinguistics as a promising theme for corpus research. While some previous ecolinguistic research has used corpus approaches, there is a case to be made for quantitative methods that draw on larger datasets. Building on other corpus studies that have made connections between language use and environmental change, this paper investigates whether linguistic references to other species have changed in the past two centuries and, if so, how. The methodology consists of two main parts: an examination of the frequency of common names of species, followed by aspect-level sentiment analysis of concordance lines. Results point to both opportunities and challenges associated with applying corpus methods to ecolinguistic research.
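The two-part methodology described above (term frequencies, then concordance lines as input to sentiment analysis) could be sketched along these lines. The sample sentence, the regex tokenizer, the per-million normalization, and the function names are assumptions for illustration only; the abstract does not specify how the study computed either step, and the sentiment scoring itself is not shown.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumed tokenizer: lowercase alphabetic words (apostrophes allowed).
    return re.findall(r"[a-z']+", text.lower())

def frequency_per_million(tokens, terms):
    # Normalized frequency of each term, so corpora of different sizes
    # (for example, different decades) can be compared.
    counts = Counter(t for t in tokens if t in terms)
    return {term: counts[term] * 1_000_000 / len(tokens) for term in terms}

def concordance(tokens, term, window=3):
    # Keyword-in-context lines: (left context, keyword, right context).
    lines = []
    for i, tok in enumerate(tokens):
        if tok == term:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

# Hypothetical sample text, not drawn from Google Books or COHA.
sample = "The wolf howled at night. Farmers feared the wolf and the bear."
tokens = tokenize(sample)

freqs = frequency_per_million(tokens, {"wolf", "bear"})
kwic = concordance(tokens, "wolf", window=2)

for left, keyword, right in kwic:
    print(f"{left:>12} [{keyword}] {right}")
```

Each concordance line would then be passed to whatever sentiment method is chosen, with the species name treated as the aspect being evaluated.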