scholarly journals A Performance Comparison of Feature Extraction Methods for Sentiment Analysis

Author(s):  
Lai Po Hung ◽  
Rayner Alfred
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
C. Sitaula ◽  
A. Basnet ◽  
A. Mainali ◽  
T. B. Shahi

COVID-19 has claimed several human lives to this date. People are dying not only because of physical infection of the virus but also because of mental illness, which is linked to people’s sentiments and psychologies. People’s written texts/posts scattered on the web could help understand their psychology and the state they are in during this pandemic. In this paper, we analyze people’s sentiment based on the classification of tweets collected from the social media platform, Twitter, in Nepal. For this, we, first, propose to use three different feature extraction methods—fastText-based (ft), domain-specific (ds), and domain-agnostic (da)—for the representation of tweets. Among these three methods, two methods (“ds” and “da”) are the novel methods used in this study. Second, we propose three different convolution neural networks (CNNs) to implement the proposed features. Last, we ensemble such three CNNs models using ensemble CNN, which works in an end-to-end manner, to achieve the end results. For the evaluation of the proposed feature extraction methods and CNN models, we prepare a Nepali Twitter sentiment dataset, called NepCOV19Tweets, with 3 classes (positive, neutral, and negative). The experimental results on such dataset show that our proposed feature extraction methods possess the discriminating characteristics for the sentiment classification. Moreover, the proposed CNN models impart robust and stable performance on the proposed features. Also, our dataset can be used as a benchmark to study the COVID-19-related sentiment analysis in the Nepali language.


2021 ◽  
Vol 24 (3) ◽  
pp. 50-54
Author(s):  
Mohammad W.Habib ◽  
◽  
Zainab N. Sultani ◽  

Twitter is considered a significant source of exchanging information and opinion in today's business. Analysis of this data is critical and complex due to the size of the dataset. Sentiment Analysis is adopted to understand and analyze the sentiment of such data. In this paper, a Machine learning approach is employed for analyzing the data into positive or negative sentiment (opinion). Different arrangements of preprocessing techniques are applied to clean the tweets, and various feature extraction methods are used to extract and reduce the dimension of the tweets' feature vector. Sentiment140 dataset is used, and it consists of sentiment labels and tweets, so supervised machine learning models are used, specifically Logistic Regression, Naive Bayes, and Support Vector Machine. According to the experimental results, Logistic Regression was the best amongst other models with all feature extraction techniques.


Sign in / Sign up

Export Citation Format

Share Document