Solving unbalanced data for Thai sentiment analysis

Author(s):  
Warunya Wunnasri ◽  
Thanaruk Theeramunkong ◽  
Choochart Haruechaiyasak
2021 ◽  
Vol 5 (2) ◽  
pp. 422
Author(s):  
Clarisa Hasya Yutika ◽  
Adiwijaya Adiwijaya ◽  
Said Al Faraby

Product reviews offer considerable benefits to both producers and consumers. Female Daily is a forum that discusses beauty products, and it receives many reviews every day, so a technique is needed to turn those reviews into valuable information. One such technique is aspect-based sentiment analysis, which analyzes each text to identify its aspects (attributes or components) and then determines the sentiment (positive, negative, or neutral) for each aspect. Some of the collected reviews mix several languages, so they are first translated into a single language, Indonesian. The reviews are then preprocessed, words are weighted using TF-IDF, and sentiment is classified with Complement Naïve Bayes to handle the unbalanced data. In testing, the best F1-score was 62.81%, obtained for data translated into English and then into Indonesian, without stopword removal.
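The weighting and classification steps named in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the toy English tokens, the labels, the smoothing value `alpha`, and the helper names `fit_idf`, `vectorize`, `train_complement_nb`, and `predict` are all assumptions made for illustration. The idea behind Complement Naïve Bayes is that each class's weights are estimated from the documents outside that class, which tends to be more stable when one class has very few examples.

```python
import math
from collections import Counter, defaultdict


def fit_idf(docs):
    """Smoothed inverse document frequency for every term in the training docs."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}


def vectorize(doc, idf):
    """Map a tokenized document to {term: tf * idf}, skipping unseen terms."""
    return {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}


def train_complement_nb(vectors, labels, alpha=1.0):
    """For each class, estimate log-weights from all documents NOT in that class."""
    vocab = {t for vec in vectors for t in vec}
    weights = {}
    for cls in set(labels):
        comp = defaultdict(float)
        for vec, label in zip(vectors, labels):
            if label != cls:
                for term, value in vec.items():
                    comp[term] += value
        total = sum(comp.values()) + alpha * len(vocab)
        weights[cls] = {t: math.log((comp[t] + alpha) / total) for t in vocab}
    return weights


def predict(weights, vector):
    """Choose the class whose complement fits the document worst."""
    def complement_score(cls):
        return sum(v * weights[cls][t] for t, v in vector.items() if t in weights[cls])
    return min(weights, key=complement_score)


# Hypothetical, deliberately unbalanced toy data: four positive reviews, one negative.
train_docs = [
    ["good", "texture", "love"],
    ["love", "scent", "good"],
    ["good", "price"],
    ["nice", "texture", "good"],
    ["bad", "scent", "sticky"],
]
train_labels = ["positive", "positive", "positive", "positive", "negative"]

idf = fit_idf(train_docs)
weights = train_complement_nb([vectorize(d, idf) for d in train_docs], train_labels)

print(predict(weights, vectorize(["bad", "sticky"], idf)))
print(predict(weights, vectorize(["good", "love"], idf)))
```

In practice a library implementation would normally be preferred; this sketch only shows how TF-IDF weights feed into the complement-class estimate.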


Author(s):  
Agung Eddy Suryo Saputro ◽  
Khairil Anwar Notodiputro ◽  
Indahwati A

In 2018, Indonesia held Governor's Elections in 17 provinces. For several months before the elections, news and opinions about them were often trending topics on Twitter. This study aims to describe the results of sentiment mining and to determine the best method for predicting sentiment classes. Sentiment mining is lexicon-based, while the methods used for sentiment analysis are Naive Bayes and C5.0. The results showed that the percentage of positive sentiment in the 17 provinces was greater than that of negative and neutral sentiment. In addition, C5.0 produced better predictions than Naive Bayes.


Corpora ◽  
2019 ◽  
Vol 14 (3) ◽  
pp. 327-349
Author(s):  
Craig Frayne

This study uses the two largest available American English language corpora, Google Books and the Corpus of Historical American English (COHA), to investigate relations between ecology and language. The paper introduces ecolinguistics as a promising theme for corpus research. While some previous ecolinguistic research has used corpus approaches, there is a case to be made for quantitative methods that draw on larger datasets. Building on other corpus studies that have made connections between language use and environmental change, this paper investigates whether linguistic references to other species have changed in the past two centuries and, if so, how. The methodology consists of two main parts: an examination of the frequency of common names of species, followed by aspect-level sentiment analysis of concordance lines. Results point to both opportunities and challenges associated with applying corpus methods to ecolinguistic research.

