Studying the Effects of Text Preprocessing and Ensemble Methods on Sentiment Analysis of Brazilian Portuguese Tweets

Sentiment analysis is the automatic determination of the attitude or emotional state expressed in a text. Many algorithms have been proposed for this task, including ensemble methods, which can considerably reduce the error rates of their individual base learners. In many machine learning tasks, and especially in sentiment analysis, extracting informative features is as important as developing sophisticated classifiers. This study proposes a stacked ensemble method for sentiment analysis that systematically combines six feature extraction methods and three classifiers. The proposed method obtains cross-validation accuracies of 89.6%, 90.7%, and 67.2% on the large movie, Turkish movie, and SemEval-2017 datasets, respectively, outperforming the other classifiers. A Z-test shows that the accuracy improvements are statistically significant at the 99% confidence level.
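The significance check mentioned in the abstract can be illustrated with a two-proportion Z-test on classifier accuracies. This is only a minimal sketch: the abstract does not say which Z-test variant was used, and the baseline accuracy (0.880) and sample size (25000) below are illustrative assumptions, not figures from the study. Only the 0.896 accuracy comes from the abstract.

```python
import math


def two_proportion_z_test(acc_a, acc_b, n_a, n_b):
    """Return (z, two-sided p-value) for the difference between two accuracies.

    Assumes the two accuracies come from independent samples of sizes n_a and n_b.
    """
    # Pooled accuracy under the null hypothesis that both classifiers are equally accurate.
    pooled = (acc_a * n_a + acc_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (acc_a - acc_b) / std_err
    # Two-sided p-value from the standard normal CDF, expressed with erf.
    normal_cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    p_value = 2.0 * (1.0 - normal_cdf)
    return z, p_value


# 0.896 is reported in the abstract; 0.880 and 25000 are hypothetical values.
z, p = two_proportion_z_test(0.896, 0.880, 25000, 25000)
significant_at_99 = p < 0.01  # 99% confidence corresponds to p < 0.01
```

A difference counts as significant at the 99% confidence level when the two-sided p-value falls below 0.01.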


Comparative Sentiment Analysis: Great Britain Versus the United States of America Using Ensemble Methods

Mesterséges intelligencia ◽

10.35406/mi.2020.1.45 ◽

2020 ◽

Vol 2 (1) ◽

pp. 45-57

Author(s):

Sirmad Mahmood Hashmi

Keyword(s):

United States ◽

Great Britain ◽

Sentiment Analysis ◽

United States Of America ◽

Ensemble Methods ◽

The United States


Study Comparison Stemmer to Optimize Text Preprocessing In Sentiment Analysis Indonesian E-Commerce Reviews

10.1109/icdabi53623.2021.9655867 ◽

2021 ◽

Author(s):

Yunita Fatma Faidha ◽

Guruh Fajar Shidik ◽

Ahmad Zainul Fanani

Keyword(s):

Sentiment Analysis ◽

Text Preprocessing


A Complete VADER-Based Sentiment Analysis of Bitcoin (BTC) Tweets during the Era of COVID-19

Big Data and Cognitive Computing ◽

10.3390/bdcc4040033 ◽

2020 ◽

Vol 4 (4) ◽

pp. 33

Author(s):

Toni Pano ◽

Rasha Kashef

Keyword(s):

Machine Learning ◽

Social Media ◽

Prediction Model ◽

Sentiment Analysis ◽

Significant Role ◽

Prediction Models ◽

Financial Sector ◽

Research Gap ◽

Text Preprocessing ◽

The Impact

During the COVID-19 pandemic, many studies have examined the impact of the outbreak on the financial sector, especially on cryptocurrencies. Social media such as Twitter plays a significant role as an indicator for forecasting Bitcoin (BTC) prices. However, there is a research gap in determining the optimal preprocessing strategy for BTC tweets when developing an accurate machine learning prediction model for Bitcoin prices. This paper develops different text preprocessing strategies for correlating the sentiment scores of Twitter text with Bitcoin prices during the COVID-19 pandemic. We explore the effect of different preprocessing functions, features, and time lengths of data on the correlation results. Out of 13 strategies, we find that splitting sentences, removing Twitter-specific tags, or combining the two generally improves the correlation of sentiment scores and volume polarity scores with Bitcoin prices. The prices correlate well with sentiment scores only over shorter timespans. Selecting the optimum preprocessing strategy would help machine learning prediction models achieve better accuracy when compared with the actual prices.
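The two preprocessing steps singled out in the abstract, removing Twitter-specific tags and splitting sentences, can be sketched as follows, together with a Pearson correlation against prices. This is a rough illustration under stated assumptions, not the authors' pipeline: `TOY_LEXICON` is a hypothetical stand-in for a real scorer such as VADER, averaging per-sentence scores is an assumed aggregation rule, and the tweets and prices are made up.

```python
import math
import re

# Hypothetical toy lexicon standing in for a real sentiment scorer such as VADER.
TOY_LEXICON = {"great": 1.0, "up": 0.5, "crash": -1.0, "fear": -0.5}


def remove_twitter_tags(text):
    """Drop URLs and @mentions, and strip the '#' from hashtags (keeping the word)."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = text.replace("#", " ")
    return " ".join(text.split())


def split_sentences(text):
    """Naively split on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def toy_score(sentence):
    """Sum the lexicon values of the words in one sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(TOY_LEXICON.get(w, 0.0) for w in words)


def tweet_score(text):
    """Average the per-sentence scores of a cleaned tweet (assumed aggregation)."""
    sentences = split_sentences(remove_twitter_tags(text))
    if not sentences:
        return 0.0
    return sum(toy_score(s) for s in sentences) / len(sentences)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Made-up daily tweets and prices, for illustration only.
tweets = ["@bob BTC is #great! https://t.co/x", "Fear of a crash.", "BTC going up."]
prices = [9500.0, 8700.0, 9100.0]
correlation = pearson([tweet_score(t) for t in tweets], prices)
```

Swapping `toy_score` for a real scorer and comparing `correlation` across preprocessing variants would mirror the kind of strategy comparison the abstract describes.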
