A Method of Deep Learning Tackles Sentiment Analysis Problem in Arabic Texts

Iraqi Journal of Computer Communication Control and System Engineering ◽

10.33103/uot.ijccce.20.4.2 ◽

2020 ◽

pp. 9-20

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Human Life ◽

Morphological Structure ◽

Arabic Language ◽

Written Text ◽

Hidden Layer ◽

Arabic Sentiment Analysis

Sentiment Analysis (SA) is a field of Natural Language Processing (NLP) whose goal is to extract the emotion, sentiment or more general opinion expressed in a human-written text. Opinions and emotions play a central role in human life. Therefore, there are many academic researches in this field for processing many languages like English However, there is scarce in its implementation with addressing Arabic Sentiment Analysis (ASA). It is a challenging field where Arabic language has a rich morphological structure and there are many other defies more than in other languages. For that, the proposed model tackles ASA by using a Deep Learning approach. In this work, one of word embedding methods, such as a first hidden layer for features extracting from the input dataset and Long Short-Term Memory (LSTM) as a deep neural network, has been used for training. The model combined with Softmax layer is applied to turn numeric outputs from LSTM layer into probabilities to classify the outputs to positive or negative. There are two datasets that are used for training the model separately with each one. The first one is ASTD dataset as a dialectal Arabic type about different tweets from internet, the results with this dataset is compared with another academic work that used the same one. The results from this work outperforms through accuracy about 14.95% and F-score about 15.14% more than what performed in the previous work. The second one is HTL dataset as a modern standard Arabic type about opinions of reviewers on different hotels from several countries. This dataset is bigger in size than the first one to show the size effect on the results of this model. So, the accuracy increased about 11% and F-score about 10.8% more than what performed with the first dataset.

Download Full-text

Arabic Sentiment Analysis (ASA) Using Deep Learning Approach

Journal of Engineering ◽

10.31026/j.eng.2020.06.07 ◽

2020 ◽

Vol 26 (6) ◽

pp. 85-93

Author(s):

Abdulhakeem Qusay Al-Bayati ◽

Ahmed S. Al-Araji ◽

Saman Hameed Ameen

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Web Sites ◽

Short Term Memory ◽

Morphological Structure ◽

Arabic Language ◽

Feature Representation ◽

Main Task ◽

Arabic Sentiment Analysis

Sentiment analysis is one of the major fields in natural language processing whose main task is to extract sentiments, opinions, attitudes, and emotions from a subjective text. And for its importance in decision making and in people's trust with reviews on web sites, there are many academic researches to address sentiment analysis problems. Deep Learning (DL) is a powerful Machine Learning (ML) technique that has emerged with its ability of feature representation and differentiating data, leading to state-of-the-art prediction results. In recent years, DL has been widely used in sentiment analysis, however, there is scarce in its implementation in the Arabic language field. Most of the previous researches address other languages like English. The proposed model tackles Arabic Sentiment Analysis (ASA) by using a DL approach. ASA is a challenging field where Arabic language has a rich morphological structure more than other languages. In this work, Long Short-Term Memory (LSTM) as a deep neural network has been used for training the model combined with word embedding as a first hidden layer for features extracting. The results show an accuracy of about 82% is achievable using DL method.

Download Full-text

ASA: A framework for Arabic sentiment analysis

Journal of Information Science ◽

10.1177/0165551519849516 ◽

2019 ◽

Vol 46 (4) ◽

pp. 544-559 ◽

Cited By ~ 4

Author(s):

Ahmed Oussous ◽

Fatima-Zahra Benjelloun ◽

Ayoub Ait Lahcen ◽

Samir Belfkih

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Opinion Mining ◽

Short Term Memory ◽

Research Area ◽

Support Vector ◽

Learning Models ◽

Arabic Natural Language Processing ◽

Arabic Sentiment Analysis

Sentiment analysis (SA), also known as opinion mining, is a growing important research area. Generally, it helps to automatically determine if a text expresses a positive, negative or neutral sentiment. It enables to mine the huge increasing resources of shared opinions such as social networks, review sites and blogs. In fact, SA is used by many fields and for various languages such as English and Arabic. However, since Arabic is a highly inflectional and derivational language, it raises many challenges. In fact, SA of Arabic text should handle such complex morphology. To better handle these challenges, we decided to provide the research community and Arabic users with a new efficient framework for Arabic Sentiment Analysis (ASA). Our primary goal is to improve the performance of ASA by exploiting deep learning while varying the preprocessing techniques. For that, we implement and evaluate two deep learning models namely convolutional neural network (CNN) and long short-term memory (LSTM) models. The framework offers various preprocessing techniques for ASA (including stemming, normalisation, tokenization and stop words). As a result of this work, we first provide a new rich and publicly available Arabic corpus called Moroccan Sentiment Analysis Corpus (MSAC). Second, the proposed framework demonstrates improvement in ASA. In fact, the experimental results prove that deep learning models have a better performance for ASA than classical approaches (support vector machines, naive Bayes classifiers and maximum entropy). They also show the key role of morphological features in Arabic Natural Language Processing (NLP).

Download Full-text

Empirical Evaluation of Shallow and Deep Learning Classifiers for Arabic Sentiment Analysis

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3466171 ◽

2022 ◽

Vol 21 (1) ◽

pp. 1-25

Author(s):

Ali Bou Nassif ◽

Abdollah Masoud Darya ◽

Ashraf Elnagar

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Empirical Evaluation ◽

Detailed Comparison ◽

Learning Models ◽

Similar Work ◽

Learning Classifiers ◽

Arabic Sentiment Analysis ◽

Gated Recurrent Units

This work presents a detailed comparison of the performance of deep learning models such as convolutional neural networks, long short-term memory, gated recurrent units, their hybrids, and a selection of shallow learning classifiers for sentiment analysis of Arabic reviews. Additionally, the comparison includes state-of-the-art models such as the transformer architecture and the araBERT pre-trained model. The datasets used in this study are multi-dialect Arabic hotel and book review datasets, which are some of the largest publicly available datasets for Arabic reviews. Results showed deep learning outperforming shallow learning for binary and multi-label classification, in contrast with the results of similar work reported in the literature. This discrepancy in outcome was caused by dataset size as we found it to be proportional to the performance of deep learning models. The performance of deep and shallow learning techniques was analyzed in terms of accuracy and F1 score. The best performing shallow learning technique was Random Forest followed by Decision Tree, and AdaBoost. The deep learning models performed similarly using a default embedding layer, while the transformer model performed best when augmented with araBERT.

Download Full-text

A Tweet Sentiment Classification Approach Using a Hybrid Stacked Ensemble Technique

Information ◽

10.3390/info12090374 ◽

2021 ◽

Vol 12 (9) ◽

pp. 374

Author(s):

Babacar Gaye ◽

Dezheng Zhang ◽

Aziguli Wulamu

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

State Of The Art ◽

Accuracy Score ◽

Learning Models ◽

Proposed Model

With the extensive availability of social media platforms, Twitter has become a significant tool for the acquisition of peoples’ views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We also compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest. We also included state-of-the-art deep learning models in comparison with the proposed model. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 Score. Empirical results showed that our proposed approach manifested state-of-the-art results by achieving an accuracy score of 99%.

Download Full-text

A powerful comparison of deep learning frameworks for Arabic sentiment analysis

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i1.pp745-752 ◽

2021 ◽

Vol 11 (1) ◽

pp. 745

Author(s):

Youssra Zahidi ◽

Yacine El Younoussi ◽

Yassine Al-Amrani

Keyword(s):

Deep Learning ◽

Comparative Analysis ◽

Sentiment Analysis ◽

Programming Languages ◽

Language Processing ◽

Large Set ◽

Language Familiarity ◽

Arabic Natural Language Processing ◽

Arabic Sentiment Analysis ◽

Python Programming

Deep learning (DL) is a machine learning (ML) subdomain that involves algorithms taken from the brain function named artificial neural networks (ANNs). Recently, DL approaches have gained major accomplishments across various Arabic natural language processing (ANLP) tasks, especially in the domain of Arabic sentiment analysis (ASA). For working on Arabic SA, researchers can use various DL libraries in their projects, but without justifying their choice or they choose a group of libraries relying on their particular programming language familiarity. We are basing in this work on Java and Python programming languages because they have a large set of deep learning libraries that are very useful in the ASA domain. This paper focuses on a comparative analysis of different valuable Python and Java libraries to conclude the most relevant and robust DL libraries for ASA. Throw this comparative analysis, and we find that: TensorFlow, Theano, and Keras Python frameworks are very popular and very used in this research domain.

Download Full-text

Deep Learning Application to Ensemble Learning—The Simple, but Effective, Approach to Sentiment Classifying

Applied Sciences ◽

10.3390/app9132760 ◽

2019 ◽

Vol 9 (13) ◽

pp. 2760 ◽

Cited By ~ 4

Author(s):

Khai Tran ◽

Thi Phan

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Ensemble Learning ◽

Language Processing ◽

Short Term Memory ◽

Learning Model ◽

Sentiment Classification ◽

Machine Learning Techniques ◽

Support Vector ◽

Deep Learning Model

Sentiment analysis is an active research area in natural language processing. The task aims at identifying, extracting, and classifying sentiments from user texts in post blogs, product reviews, or social networks. In this paper, the ensemble learning model of sentiment classification is presented, also known as CEM (classifier ensemble model). The model contains various data feature types, including language features, sentiment shifting, and statistical techniques. A deep learning model is adopted with word embedding representation to address explicit, implicit, and abstract sentiment factors in textual data. The experiments conducted based on different real datasets found that our sentiment classification system is better than traditional machine learning techniques, such as Support Vector Machines and other ensemble learning systems, as well as the deep learning model, Long Short-Term Memory network, which has shown state-of-the-art results for sentiment analysis in almost corpuses. Our model’s distinguishing point consists in its effective application to different languages and different domains.

Download Full-text

An Assessment of Deep Learning Models and Word Embeddings for Toxicity Detection within Online Textual Comments

Electronics ◽

10.3390/electronics10070779 ◽

2021 ◽

Vol 10 (7) ◽

pp. 779

Author(s):

Danilo Dessì ◽

Diego Reforgiato Recupero ◽

Harald Sack

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Online Communication ◽

Good Choice ◽

Learning Technologies ◽

Word Embeddings ◽

Emotion Detection ◽

Digital Platforms

Today, increasing numbers of people are interacting online and a lot of textual comments are being produced due to the explosion of online communication. However, a paramount inconvenience within online environments is that comments that are shared within digital platforms can hide hazards, such as fake news, insults, harassment, and, more in general, comments that may hurt someone’s feelings. In this scenario, the detection of this kind of toxicity has an important role to moderate online communication. Deep learning technologies have recently delivered impressive performance within Natural Language Processing applications encompassing Sentiment Analysis and emotion detection across numerous datasets. Such models do not need any pre-defined hand-picked features, but they learn sophisticated features from the input datasets by themselves. In such a domain, word embeddings have been widely used as a way of representing words in Sentiment Analysis tasks, proving to be very effective. Therefore, in this paper, we investigated the use of deep learning and word embeddings to detect six different types of toxicity within online comments. In doing so, the most suitable deep learning layers and state-of-the-art word embeddings for identifying toxicity are evaluated. The results suggest that Long-Short Term Memory layers in combination with mimicked word embeddings are a good choice for this task.

Download Full-text

Different valuable tools for Arabic sentiment analysis: a comparative evaluation

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i1.pp753-762 ◽

2021 ◽

Vol 11 (1) ◽

pp. 753

Author(s):

Youssra Zahidi ◽

Yacine El Younoussi ◽

Yassine Al-Amrani

Keyword(s):

Sentiment Analysis ◽

Programming Languages ◽

Language Processing ◽

Comparative Evaluation ◽

Research Work ◽

Arabic Language ◽

Arabic Natural Language Processing ◽

Arabic Sentiment Analysis ◽

Python Programming ◽

Research Domain

Arabic Natural language processing (ANLP) is a subfield of artificial intelligence (AI) that tries to build various applications in the Arabic language like Arabic sentiment analysis (ASA) that is the operation of classifying the feelings and emotions expressed for defining the attitude of the writer (neutral, negative or positive). In order to work on ASA, researchers can use various tools in their research projects without explaining the cause behind this use, or they choose a set of libraries according to their knowledge about a specific programming language. Because of their libraries' abundance in the ANLP field, especially in ASA, we are relying on JAVA and Python programming languages in our research work. This paper relies on making an in-depth comparative evaluation of different valuable Python and Java libraries to deduce the most useful ones in Arabic sentiment analysis (ASA). According to a large variety of great and influential works in the domain of ASA, we deduce that the NLTK, Gensim and TextBlob libraries are the most useful for Python ASA task. In connection with Java ASA libraries, we conclude that Weka and CoreNLP tools are the most used, and they have great results in this research domain.

Download Full-text

A Hybrid CNN-LSTM: A Deep Learning Approach for Consumer Sentiment Analysis Using Qualitative User-Generated Contents

ACM Transactions on Asian and Low-Resource Language Information Processing ◽

10.1145/3457206 ◽

2021 ◽

Vol 20 (5) ◽

pp. 1-15

Author(s):

Praphula Kumar Jain ◽

Vijayalakshmi Saravanan ◽

Rajendra Pamula

Keyword(s):

Deep Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Online Reviews ◽

Web Content ◽

Proposed Model ◽

Social Media Platforms ◽

Information And Communication ◽

Better Than

With the fastest growth of information and communication technology (ICT), the availability of web content on social media platforms is increasing day by day. Sentiment analysis from online reviews drawing researchers’ attention from various organizations such as academics, government, and private industries. Sentiment analysis has been a hot research topic in Machine Learning (ML) and Natural Language Processing (NLP). Currently, Deep Learning (DL) techniques are implemented in sentiment analysis to get excellent results. This study proposed a hybrid convolutional neural network-long short-term memory (CNN-LSTM) model for sentiment analysis. Our proposed model is being applied with dropout, max pooling, and batch normalization to get results. Experimental analysis carried out on Airlinequality and Twitter airline sentiment datasets. We employed the Keras word embedding approach, which converts texts into vectors of numeric values, where similar words have small vector distances between them. We calculated various parameters, such as accuracy, precision, recall, and F1-measure, to measure the model’s performance. These parameters for the proposed model are better than the classical ML models in sentiment analysis. Our results analysis demonstrates that the proposed model outperforms with 91.3% accuracy in sentiment analysis.

Download Full-text

A COMBINED DEEP LEARNING MODEL FOR PERSIAN SENTIMENT ANALYSIS

IIUM Engineering Journal ◽

10.31436/iiumej.v20i1.1036 ◽

2019 ◽

Vol 20 (1) ◽

pp. 129-139 ◽

Cited By ~ 2

Author(s):

Zahra Bokaee Nezhad ◽

Mohammad Ali Deihimi

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Proposed Model ◽

Long Short Term Memory ◽

Deep Learning Model

With increasing members in social media sites today, people tend to share their views about everything online. It is a convenient way to convey their messages to end users on a specific subject. Sentiment Analysis is a subfield of Natural Language Processing (NLP) that refers to the identification of users’ opinions toward specific topics. It is used in several fields such as marketing, customer services, etc. However, limited works have been done on Persian Sentiment Analysis. On the other hand, deep learning has recently become popular because of its successful role in several Natural Language Processing tasks. The objective of this paper is to propose a novel hybrid deep learning architecture for Persian Sentiment Analysis. According to the proposed model, local features are extracted by Convolutional Neural Networks (CNN) and long-term dependencies are learned by Long Short Term Memory (LSTM). Therefore, the model can harness both CNN's and LSTM's abilities. Furthermore, Word2vec is used for word representation as an unsupervised learning step. To the best of our knowledge, this is the first attempt where a hybrid deep learning model is used for Persian Sentiment Analysis. We evaluate the model on a Persian dataset that is introduced in this study. The experimental results show the effectiveness of the proposed model with an accuracy of 85%. ABSTRAK: Hari ini dengan ahli yang semakin meningkat di laman media sosial, orang cenderung untuk berkongsi pandangan mereka tentang segala-galanya dalam talian. Ini adalah cara mudah untuk menyampaikan mesej mereka kepada pengguna akhir mengenai subjek tertentu. Analisis Sentimen adalah subfield Pemprosesan Bahasa Semula Jadi yang merujuk kepada pengenalan pendapat pengguna ke arah topik tertentu. Ia digunakan dalam beberapa bidang seperti pemasaran, perkhidmatan pelanggan, dan sebagainya. Walau bagaimanapun, kerja-kerja terhad telah dilakukan ke atas Analisis Sentimen Parsi. Sebaliknya, pembelajaran mendalam baru menjadi popular kerana peranannya yang berjaya dalam beberapa tugas Pemprosesan Bahasa Asli (NLP). Objektif makalah ini adalah mencadangkan senibina pembelajaran hibrid yang baru dalam Analisis Sentimen Parsi. Menurut model yang dicadangkan, ciri-ciri tempatan ditangkap oleh Rangkaian Neural Convolutional (CNN) dan ketergantungan jangka panjang dipelajari oleh Long Short Term Memory (LSTM). Oleh itu, model boleh memanfaatkan kebolehan CNN dan LSTM. Selain itu, Word2vec digunakan untuk perwakilan perkataan sebagai langkah pembelajaran tanpa pengawasan. Untuk pengetahuan yang terbaik, ini adalah percubaan pertama di mana model pembelajaran mendalam hibrid digunakan untuk Analisis Sentimen Persia. Kami menilai model pada dataset Persia yang memperkenalkan dalam kajian ini. Keputusan eksperimen menunjukkan keberkesanan model yang dicadangkan dengan ketepatan 85%.

Download Full-text