Sentiment Analysis of Arabic Documents

The emergence of Web 2.0 technology has generated a massive amount of raw data by enabling Internet users to post their opinions on the web. Processing this raw data to extract useful information can be a very challenging task. One important kind of information that can be extracted automatically from users' posts is their opinions on different issues. This problem, known as Sentiment Analysis (SA), has been well studied for the English language, and two main approaches have been devised: corpus-based and lexicon-based. This work focuses on the latter approach due to its various challenges and high potential. The discussion takes the reader through the detailed steps of building the two main components of the lexicon-based SA approach: the lexicon and the SA tool. The experiments show that significant effort is still needed to reach a satisfactory level of accuracy for lexicon-based Arabic SA. Nonetheless, they provide an interesting guide for researchers in their ongoing efforts to improve lexicon-based SA.
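
As a rough illustration of the two components named above, the following minimal Python sketch pairs a toy lexicon with a simple scoring tool. The English entries, the weights, the negation list, and the function name `classify` are all hypothetical placeholders, not the paper's lexicon or tool; a real Arabic system would also need Arabic tokenization, stemming, and an Arabic lexicon.

```python
# Toy polarity lexicon: word -> weight (hypothetical values, for illustration only).
LEXICON = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "terrible": -2.0}
# Hypothetical negation words that flip the polarity of the next lexicon word.
NEGATORS = {"not", "never", "no"}


def classify(text: str) -> str:
    """Sum lexicon weights over the tokens and map the total to a label."""
    score = 0.0
    negate = False
    for token in text.lower().split():
        if token in NEGATORS:
            negate = True
            continue
        weight = LEXICON.get(token, 0.0)
        if weight != 0.0:
            score += -weight if negate else weight
            negate = False  # the negation is consumed by the first sentiment word
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Calling `classify("the service was not good")` walks the tokens, sees the negator, and subtracts the weight of "good" instead of adding it. Keeping the lexicon as plain data separate from the scoring function mirrors the lexicon/tool split described in the abstract.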

Towards Improving the Lexicon-Based Approach for Arabic Sentiment Analysis

International Journal of Information Technology and Web Engineering ◽ 10.4018/ijitwe.2014070104 ◽ 2014 ◽ Vol 9 (3) ◽ pp. 55-71 ◽ Cited By ~ 39

Author(s): Nawaf A. Abdulla ◽ Nizar A. Ahmed ◽ Mohammed A. Shehab ◽ Mahmoud Al-Ayyoub ◽ Mohammed N. Al-Kabi ◽ ...

Keyword(s): Web 2.0 ◽ Sentiment Analysis ◽ English Language ◽ High Potential ◽ Raw Data ◽ Satisfactory Level ◽ Web 2.0 Technology ◽ Internet Users ◽ Arabic Sentiment Analysis ◽ The Web

SentiProdBR: Building Domain-Specific Sentiment Lexicons for the Portuguese Language

10.5753/sbbd.2021.17897 ◽ 2021

Author(s): Tiago de Melo

Keyword(s): Decision Making ◽ Sentiment Analysis ◽ Online Reviews ◽ Bayes Theorem ◽ Experimental Results ◽ Product Categories ◽ Domain Specific ◽ Alternative Approaches ◽ The Web

Online reviews are readily available on the Web and widely used for decision-making. However, only a few studies on Portuguese sentiment analysis have been reported, owing to the lack of resources such as domain-specific sentiment lexicons. In this paper, we present an effective methodology that uses probabilities derived from Bayes' theorem to build a set of lexicons, called SentiProdBR, for 10 different product categories in the Portuguese language. Experimental results indicate that our methodology significantly outperforms several alternative approaches to building domain-specific sentiment lexicons.
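
To make the role of Bayes' theorem concrete, here is a minimal sketch that scores a word's polarity from labelled reviews as P(pos | word). It is an illustration under stated assumptions, not the SentiProdBR procedure: the tiny English corpus, the add-one smoothing, and the names `REVIEWS` and `word_polarity` are hypothetical.

```python
from collections import Counter

# Hypothetical labelled reviews for one product category: (text, label).
REVIEWS = [
    ("battery lasts long great phone", "pos"),
    ("great screen great price", "pos"),
    ("battery died fast poor phone", "neg"),
    ("poor screen", "neg"),
]


def word_polarity(word: str, reviews=REVIEWS) -> float:
    """Return P(pos | word) via Bayes' theorem with add-one smoothing."""
    counts = {"pos": Counter(), "neg": Counter()}
    docs = Counter()
    for text, label in reviews:
        counts[label].update(text.split())
        docs[label] += 1
    vocab = set(counts["pos"]) | set(counts["neg"])

    def likelihood(label: str) -> float:
        # P(word | label), smoothed so unseen words do not yield zero.
        total = sum(counts[label].values())
        return (counts[label][word] + 1) / (total + len(vocab))

    prior_pos = docs["pos"] / sum(docs.values())
    prior_neg = 1.0 - prior_pos
    joint_pos = likelihood("pos") * prior_pos
    joint_neg = likelihood("neg") * prior_neg
    return joint_pos / (joint_pos + joint_neg)
```

A domain lexicon could then be assembled by keeping the words whose score lies far from 0.5 in either direction; where to place that cut-off is a design choice this sketch leaves open.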

Web Text Mining

The Oxford Handbook of Computational Linguistics 2nd edition ◽ 10.1093/oxfordhb/9780199573691.013.27 ◽ 2016

Author(s): Ricardo Baeza-Yates ◽ Roi Blanco ◽ Malú Castellanos

Keyword(s): Social Media ◽ Text Mining ◽ Sentiment Analysis ◽ Web Search ◽ Internet Users ◽ Entity Retrieval ◽ Web Text Mining ◽ Text Content ◽ The Web

Web search has become a ubiquitous commodity for Internet users. This fact puts a large number of documents with plenty of text content at our fingertips. To make good use of this data, we need to mine web text. This leads to the two problems covered here: sentiment analysis and entity retrieval in the context of the Web. The first problem answers the question of what people think about a given product or topic, with a particular focus on sentiment analysis in social media. The second problem addresses the issue of answering certain enquiries precisely by returning a particular object: for instance, where the next concert of my favourite band will be or who the best cooks are in a particular region. Where to find these objects and how to retrieve, rank, and display them are tasks related to the entity retrieval problem.

Finding Opinion Strength Using Rule-Based Parsing for Arabic Sentiment Analysis

Lecture Notes in Computer Science - Advances in Soft Computing and Its Applications ◽ 10.1007/978-3-642-45111-9_44 ◽ 2013 ◽ pp. 509-520 ◽ Cited By ~ 10

Author(s): Shereen Oraby ◽ Yasser El-Sonbaty ◽ Mohamad Abou El-Nasr

Keyword(s): Sentiment Analysis ◽ Rule Based ◽ Arabic Sentiment Analysis

Arabic Sentiment Analysis (ASA) Using Deep Learning Approach

Journal of Engineering ◽ 10.31026/j.eng.2020.06.07 ◽ 2020 ◽ Vol 26 (6) ◽ pp. 85-93

Author(s): Abdulhakeem Qusay Al-Bayati ◽ Ahmed S. Al-Araji ◽ Saman Hameed Ameen

Keyword(s): Deep Learning ◽ Sentiment Analysis ◽ Language Processing ◽ Web Sites ◽ Short Term Memory ◽ Morphological Structure ◽ Arabic Language ◽ Feature Representation ◽ Main Task ◽ Arabic Sentiment Analysis

Sentiment analysis is one of the major fields in natural language processing; its main task is to extract sentiments, opinions, attitudes, and emotions from subjective text. Because of its importance in decision making and in people's trust in reviews on web sites, much academic research addresses sentiment analysis problems. Deep Learning (DL) is a powerful Machine Learning (ML) technique whose capacity for feature representation and for discriminating between data has led to state-of-the-art prediction results. In recent years, DL has been widely used in sentiment analysis; however, its application to the Arabic language remains scarce, and most previous research addresses other languages such as English. The proposed model tackles Arabic Sentiment Analysis (ASA) using a DL approach. ASA is a challenging field because Arabic has a richer morphological structure than other languages. In this work, a Long Short-Term Memory (LSTM) deep neural network is used to train the model, combined with a word embedding as the first hidden layer for feature extraction. The results show that an accuracy of about 82% is achievable using the DL method.

Opinion Mining for the Customer Feedback using TextBlob

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽ 10.32628/cseit206418 ◽ 2020 ◽ pp. 72-76

Author(s): Praveen Gujjar J ◽ Prasanna Kumar H R

Keyword(s): Decision Making ◽ Language Processing ◽ Opinion Mining ◽ The Internet ◽ Business Decision ◽ Customer Feedback ◽ Internet Users ◽ Enormous Amount ◽ The Web ◽ Natural Language Processing Task

Advances in web technology have made an enormous amount of data available on the web to internet users. These users give useful feedback, comments, suggestions, or opinions on the products and services available on the web. Analyzing this user-generated data is essential for business decision making. TextBlob is a Python library that offers a simple API for performing certain natural language processing tasks. This paper proposes a method for analyzing customer opinion using TextBlob in order to support decision making, and presents results obtained on such data using the TextBlob API in Python. The paper includes the advantages of the proposed technique and concludes with the challenges marketers face when using this technique in their decision-making.
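
For orientation, a minimal sketch of scoring one piece of feedback with TextBlob might look as follows. It assumes the `textblob` package is installed and relies on its default analyzer, which targets English text. The 0.1 threshold and the helper names `label_from_polarity` and `analyze_feedback` are hypothetical choices for illustration, not the method reported in the paper.

```python
def label_from_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1, 1] to a coarse label (threshold is an assumption)."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def analyze_feedback(text: str) -> str:
    """Score one piece of English feedback with TextBlob's default analyzer."""
    # Imported here so the thresholding helper works without the dependency installed.
    from textblob import TextBlob

    polarity = TextBlob(text).sentiment.polarity  # float in [-1.0, 1.0]
    return label_from_polarity(polarity)
```

Separating the thresholding from the library call makes it easy to tune the neutral band on a sample of real feedback before trusting the labels in a business decision.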

A Hybrid Method of Linguistic and Statistical Features for Arabic Sentiment Analysis

Baghdad Science Journal ◽ 10.21123/bsj.2020.17.1(suppl.).0385 ◽ 2020 ◽ Vol 17 (1(Suppl.)) ◽ pp. 0385

Author(s): Ahmed Sabah AL-Jumaili

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Hybrid Method ◽ Training Model ◽ Arabic Language ◽ Machine Learning Techniques ◽ Statistical Features ◽ Hybrid Features ◽ Pos Tagging ◽ Arabic Sentiment Analysis

Sentiment analysis refers to the task of identifying the polarity (positive or negative) of a text that expresses an opinion. Arabic-language content has expanded dramatically in the last decade, especially with the emergence of social websites (e.g. Twitter, Facebook, etc.). Several studies have addressed sentiment analysis for the Arabic language using various techniques. According to the literature, the most efficient techniques are machine learning techniques, owing to their ability to build a training model. Yet there are still issues facing Arabic sentiment analysis using machine learning techniques. These issues relate to employing robust features that can discriminate the polarity of sentiments. This paper proposes a hybrid method of linguistic and statistical features along with classification methods for Arabic sentiment analysis. The linguistic features comprise stemming and POS tagging, while the statistical features comprise TF-IDF. A benchmark dataset of Arabic tweets has been used in the experiments. In addition, three classifiers have been utilized: SVM, KNN and ME. Results showed that SVM outperformed the other classifiers, obtaining an f-score of 72.15%. This indicates the usefulness of using SVM with the proposed hybrid features.
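
The combination of a linguistic step (stemming) with a statistical one (TF-IDF) can be sketched in a few lines of plain Python. This is only an illustration: the suffix-stripping `stem` function is a hypothetical English stand-in for a real Arabic stemmer, POS tagging and the classifiers are omitted, and the unsmoothed TF-IDF formula used here is one common variant rather than necessarily the paper's.

```python
import math
from collections import Counter

SUFFIXES = ("ing", "ed", "s")  # hypothetical toy stemmer rules, English only


def stem(token: str) -> str:
    """Strip the first matching suffix; a stand-in for a real Arabic stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def tfidf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {stem: tf-idf weight} mapping per document."""
    stemmed = [[stem(t) for t in doc.lower().split()] for doc in docs]
    # Document frequency: in how many documents each stem occurs.
    doc_freq = Counter(term for tokens in stemmed for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in stemmed:
        term_freq = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors
```

Stemming before counting lets inflected forms such as "loved" and "loving" share one feature, which is the kind of interaction between linguistic and statistical features the abstract describes. The resulting vectors would then be passed to a classifier such as an SVM.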

Different valuable tools for Arabic sentiment analysis: a comparative evaluation

International Journal of Electrical and Computer Engineering (IJECE) ◽ 10.11591/ijece.v11i1.pp753-762 ◽ 2021 ◽ Vol 11 (1) ◽ pp. 753

Author(s): Youssra Zahidi ◽ Yacine El Younoussi ◽ Yassine Al-Amrani

Keyword(s): Sentiment Analysis ◽ Programming Languages ◽ Language Processing ◽ Comparative Evaluation ◽ Research Work ◽ Arabic Language ◽ Arabic Natural Language Processing ◽ Arabic Sentiment Analysis ◽ Python Programming ◽ Research Domain

Arabic natural language processing (ANLP) is a subfield of artificial intelligence (AI) that aims to build applications for the Arabic language, such as Arabic sentiment analysis (ASA), which classifies the feelings and emotions expressed in a text to determine the attitude of the writer (neutral, negative or positive). When working on ASA, researchers often use various tools in their research projects without explaining the reason for their choice, or they choose a set of libraries according to their knowledge of a specific programming language. Because of the abundance of Java and Python libraries in the ANLP field, especially for ASA, we rely on these two programming languages in our research work. This paper presents an in-depth comparative evaluation of different valuable Python and Java libraries to identify the most useful ones for ASA. Based on a large variety of influential works in the domain of ASA, we deduce that the NLTK, Gensim and TextBlob libraries are the most useful for ASA tasks in Python. Regarding Java, we conclude that the Weka and CoreNLP tools are the most used and achieve great results in this research domain.

Syntactic- and morphology-based text augmentation framework for Arabic sentiment analysis

PeerJ Computer Science ◽ 10.7717/peerj-cs.469 ◽ 2021 ◽ Vol 7 ◽ pp. e469

Author(s): Rehab Duwairi ◽ Ftoon Abushaqra

Keyword(s): Sentiment Analysis ◽ Arabic Language ◽ Video Data ◽ High Quality ◽ Novel Approach ◽ Augmentation Techniques ◽ The Rich ◽ Increase In Accuracy ◽ Arabic Sentiment Analysis ◽ The Impact

Arabic is a challenging language for automatic processing. This is due to several intrinsic reasons, such as its multiple dialects, ambiguous syntax, syntactical flexibility and diacritics. Machine learning and deep learning frameworks require big datasets for training to ensure accurate predictions. This leads to another challenge faced by researchers using Arabic text, as high-quality Arabic textual datasets are still scarce. In this paper, an intelligent framework for expanding or augmenting Arabic sentences is presented. The sentences were initially labelled by human annotators for sentiment analysis. The novel approach presented in this work relies on the rich morphology of Arabic, synonymy lists, syntactical or grammatical rules, and negation rules to generate new sentences from the seed sentences with their proper labels. Most augmentation techniques target image or video data. This study is the first work to target text augmentation for the Arabic language. Using this framework, we were able to increase the size of the initial seed datasets tenfold. Experiments that assess the impact of this augmentation on sentiment analysis showed a 42% average increase in accuracy, due to the reliability and the high quality of the rules used to build this framework.
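
Two of the rule types mentioned above, synonym substitution and negation, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' framework: the English synonym list, the single negation word "not", and the names `SYNONYMS`, `FLIP`, and `augment` are hypothetical, and the morphological and syntactic rules are left out.

```python
# Hypothetical synonym list; an English stand-in for Arabic synonymy resources.
SYNONYMS = {"good": ["great", "fine"], "bad": ["poor"]}
# Label mapping applied when a negation rule fires.
FLIP = {"positive": "negative", "negative": "positive"}


def augment(sentence: str, label: str) -> list[tuple[str, str]]:
    """Generate new labelled sentences from one seed sentence."""
    tokens = sentence.split()
    generated = []
    for i, token in enumerate(tokens):
        for synonym in SYNONYMS.get(token, []):
            # Synonym rule: swap one word, keep the original label.
            swapped = tokens[:i] + [synonym] + tokens[i + 1:]
            generated.append((" ".join(swapped), label))
            # Negation rule: negate the swapped word and flip the label.
            negated = tokens[:i] + ["not", synonym] + tokens[i + 1:]
            generated.append((" ".join(negated), FLIP.get(label, label)))
    return generated
```

Each rule produces its label together with the new sentence, so the generated data needs no further human annotation. Labels outside `FLIP`, such as a neutral class, are passed through unchanged in this sketch.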
