scholarly journals A Hybrid Method of Linguistic and Statistical Features for Arabic Sentiment Analysis

2020 ◽  
Vol 17 (1(Suppl.)) ◽  
pp. 0385
Author(s):  
Ahmed Sabah AL-Jumaili

          Sentiment analysis refers to the task of identifying polarity of positive and negative for particular text that yield an opinion. Arabic language has been expanded dramatically in the last decade especially with the emergence of social websites (e.g. Twitter, Facebook, etc.). Several studies addressed sentiment analysis for Arabic language using various techniques. The most efficient techniques according to the literature were the machine learning due to their capabilities to build a training model. Yet, there is still issues facing the Arabic sentiment analysis using machine learning techniques. Such issues are related to employing robust features that have the ability to discriminate the polarity of sentiments. This paper proposes a hybrid method of linguistic and statistical features along with classification methods for Arabic sentiment analysis. Linguistic features contains stemming and POS tagging, while statistical contains the TF-IDF. A benchmark dataset of Arabic tweets have been used in the experiments. In addition, three classifiers have been utilized including SVM, KNN and ME. Results showed that SVM has outperformed the other classifiers by obtaining an f-score of 72.15%. This indicates the usefulness of using SVM with the proposed hybrid features.

Author(s):  
Ensaf Hussein Mohamed ◽  
Mohammed ElSaid Moussa ◽  
Mohamed Hassan Haggag

Sentiment analysis (SA) is a technique that lets people in different fields such as business, economy, research, government, and politics to know about people’s opinions, which greatly affects the process of decision-making. SA techniques are classified into: lexicon-based techniques, machine learning techniques, and a hybrid between both approaches. Each approach has its limitations and drawbacks, the machine learning approach depends on manual feature extraction, lexicon-based approach relies on sentiment lexicons that are usually unscalable, unreliable, and manually annotated by human experts. Nowadays, word-embedding techniques have been commonly used in SA classification. Currently, Word2Vec and GloVe are some of the most accurate and usable word embedding techniques, which can transform words into meaningful semantic vectors. However, these techniques ignore sentiment information of texts and require a huge corpus of texts for training and generating accurate vectors, which are used as inputs of deep learning models. In this paper, we propose an enhanced ensemble classifier framework. Our framework is based on our previously published lexicon-based method, bag-of-words, and pre-trained word embedding, first the sentence is preprocessed by removing stop-words, POS tagging, stemming and lemmatization, shortening exaggerated word. Second, the processed sentence is passed to three modules, our previous lexicon-based method (Sum Votes), bag-of-words module and semantic module (Word2Vec and Glove) and produced feature vectors. Finally, the previous features vectors are fed into 11 different classifiers. The proposed framework is tested and evaluated over four datasets with five different lexicons, the experiment results show that our proposed model outperforms the previous lexicon based and the machine learning methods individually.


2018 ◽  
Vol 34 (3) ◽  
pp. 569-581 ◽  
Author(s):  
Sujata Rani ◽  
Parteek Kumar

Abstract In this article, an innovative approach to perform the sentiment analysis (SA) has been presented. The proposed system handles the issues of Romanized or abbreviated text and spelling variations in the text to perform the sentiment analysis. The training data set of 3,000 movie reviews and tweets has been manually labeled by native speakers of Hindi in three classes, i.e. positive, negative, and neutral. The system uses WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system has been tested on 100 movie reviews and tweets, and it has been observed that SVM has performed best in comparison to other classifiers, and it has an accuracy of 68% for movie reviews and 82% in case of tweets. The results of the proposed system are very promising and can be used in emerging applications like SA of product reviews and social media analysis. Additionally, the proposed system can be used in other cultural/social benefits like predicting/fighting human riots.


2018 ◽  
Vol 7 (2.32) ◽  
pp. 462
Author(s):  
G Krishna Chaitanya ◽  
Dinesh Reddy Meka ◽  
Vakalapudi Surya Vamsi ◽  
M V S Ravi Karthik

Sentiment or emotion behind a tweet from Twitter or a post from Facebook can help us answer what opinions or feedback a person has. With the advent of growing user-generated blogs, posts and reviews across various social media and online retails, calls for an understanding of these afore mentioned user data acts as a catalyst in building Recommender systems and drive business plans. User reviews on online retail stores influence buying behavior of customers and thus complements the ever-growing need of sentiment analysis. Machine Learning helps us to read between the lines of tweets by proving us with various algorithms like Naïve Bayes, SVM, etc. Sentiment Analysis uses Machine Learning and Natural Language Processing (NLP) to extract, classify and analyze tweets for sentiments (emotions). There are various packages and frameworks in R and Python that aid in Sentiment Analysis or Text Mining in general. 


Author(s):  
V Umarani ◽  
A Julian ◽  
J Deepa

Sentiment analysis has gained a lot of attention from researchers in the last year because it has been widely applied to a variety of application domains such as business, government, education, sports, tourism, biomedicine, and telecommunication services. Sentiment analysis is an automated computational method for studying or evaluating sentiments, feelings, and emotions expressed as comments, feedbacks, or critiques. The sentiment analysis process can be automated using machine learning techniques, which analyses text patterns faster. The supervised machine learning technique is the most used mechanism for sentiment analysis. The proposed work discusses the flow of sentiment analysis process and investigates the common supervised machine learning techniques such as multinomial naive bayes, Bernoulli naive bayes, logistic regression, support vector machine, random forest, K-nearest neighbor, decision tree, and deep learning techniques such as Long Short-Term Memory and Convolution Neural Network. The work examines such learning methods using standard data set and the experimental results of sentiment analysis demonstrate the performance of various classifiers taken in terms of the precision, recall, F1-score, RoC-Curve, accuracy, running time and k fold cross validation and helps in appreciating the novelty of the several deep learning techniques and also giving the user an overview of choosing the right technique for their application.


Sign in / Sign up

Export Citation Format

Share Document