A Method of Deep Learning Tackles Sentiment Analysis Problem in Arabic Texts

Sentiment Analysis (SA) is a field of Natural Language Processing (NLP) whose goal is to extract the emotion, sentiment, or more general opinion expressed in a human-written text. Opinions and emotions play a central role in human life, so there is much academic research in this field for languages such as English. However, work on Arabic Sentiment Analysis (ASA) remains scarce. ASA is challenging because Arabic has a rich morphological structure and poses more difficulties than other languages. The proposed model therefore tackles ASA with a Deep Learning approach. In this work, a word embedding method is used as the first hidden layer to extract features from the input dataset, and Long Short-Term Memory (LSTM) is used as the deep neural network for training. A Softmax layer turns the numeric outputs of the LSTM layer into probabilities in order to classify each output as positive or negative. Two datasets are used, and the model is trained on each one separately. The first is the ASTD dataset, which consists of dialectal Arabic tweets; the results on this dataset are compared with another academic work that used the same dataset. This work outperforms the previous work by about 14.95% in accuracy and about 15.14% in F-score. The second is the HTL dataset, which consists of Modern Standard Arabic reviews of hotels in several countries. This dataset is larger than the first and is used to show the effect of dataset size on the model's results. On it, accuracy increased by about 11% and F-score by about 10.8% compared with the first dataset.
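The Softmax step described in the abstract can be illustrated with a minimal sketch. It assumes the LSTM layer emits one real-valued score per class; the scores, the label order in `LABELS`, and the function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable without
    # changing the resulting probabilities.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical ordering of the two sentiment classes.
LABELS = ("negative", "positive")

def classify(scores):
    """Return the most probable label together with all class probabilities."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

# Made-up scores standing in for the output of the LSTM layer.
label, probs = classify([0.3, 1.7])
```

Because the larger score belongs to the second class, `label` is `"positive"` here, and the two entries of `probs` sum to 1.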

2020 ◽  
Vol 26 (6) ◽  
pp. 85-93
Author(s):  
Abdulhakeem Qusay Al-Bayati ◽  
Ahmed S. Al-Araji ◽  
Saman Hameed Ameen

Sentiment analysis is one of the major fields in natural language processing; its main task is to extract sentiments, opinions, attitudes, and emotions from subjective text. Because of its importance in decision making and in people's trust in reviews on websites, much academic research addresses sentiment analysis problems. Deep Learning (DL) is a powerful Machine Learning (ML) technique whose ability to learn feature representations and discriminate between data has led to state-of-the-art prediction results. In recent years DL has been widely used in sentiment analysis; however, its application to Arabic remains scarce, and most previous research addresses other languages such as English. The proposed model tackles Arabic Sentiment Analysis (ASA) using a DL approach. ASA is challenging because Arabic has a richer morphological structure than other languages. In this work, Long Short-Term Memory (LSTM) is used as the deep neural network for training the model, combined with word embedding as the first hidden layer for feature extraction. The results show that an accuracy of about 82% is achievable using the DL method.


2019 ◽  
Vol 46 (4) ◽  
pp. 544-559 ◽  
Author(s):  
Ahmed Oussous ◽  
Fatima-Zahra Benjelloun ◽  
Ayoub Ait Lahcen ◽  
Samir Belfkih

Sentiment analysis (SA), also known as opinion mining, is an increasingly important research area. Generally, it helps to automatically determine whether a text expresses a positive, negative or neutral sentiment. It makes it possible to mine the huge and growing resources of shared opinions such as social networks, review sites and blogs. SA is used in many fields and for various languages such as English and Arabic. However, since Arabic is a highly inflectional and derivational language, it raises many challenges, and SA of Arabic text must handle this complex morphology. To better handle these challenges, we decided to provide the research community and Arabic users with a new efficient framework for Arabic Sentiment Analysis (ASA). Our primary goal is to improve the performance of ASA by exploiting deep learning while varying the preprocessing techniques. For that, we implement and evaluate two deep learning models, namely convolutional neural network (CNN) and long short-term memory (LSTM) models. The framework offers various preprocessing techniques for ASA (including stemming, normalisation, tokenization and stop words). As a result of this work, we first provide a new rich and publicly available Arabic corpus called Moroccan Sentiment Analysis Corpus (MSAC). Second, the proposed framework demonstrates improvement in ASA. The experimental results show that deep learning models perform better for ASA than classical approaches (support vector machines, naive Bayes classifiers and maximum entropy). They also show the key role of morphological features in Arabic Natural Language Processing (NLP).


Author(s):  
Ali Bou Nassif ◽  
Abdollah Masoud Darya ◽  
Ashraf Elnagar

This work presents a detailed comparison of the performance of deep learning models such as convolutional neural networks, long short-term memory, gated recurrent units, their hybrids, and a selection of shallow learning classifiers for sentiment analysis of Arabic reviews. Additionally, the comparison includes state-of-the-art models such as the transformer architecture and the araBERT pre-trained model. The datasets used in this study are multi-dialect Arabic hotel and book review datasets, which are some of the largest publicly available datasets for Arabic reviews. Results showed deep learning outperforming shallow learning for binary and multi-label classification, in contrast with the results of similar work reported in the literature. This discrepancy in outcome was caused by dataset size as we found it to be proportional to the performance of deep learning models. The performance of deep and shallow learning techniques was analyzed in terms of accuracy and F1 score. The best performing shallow learning technique was Random Forest followed by Decision Tree, and AdaBoost. The deep learning models performed similarly using a default embedding layer, while the transformer model performed best when augmented with araBERT.


Information ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 374
Author(s):  
Babacar Gaye ◽  
Dezheng Zhang ◽  
Aziguli Wulamu

With the extensive availability of social media platforms, Twitter has become a significant tool for acquiring people's views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) networks as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest, and also with state-of-the-art deep learning models. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 score. Empirical results showed that our proposed approach achieved state-of-the-art results, with an accuracy score of 99%.
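The stacking idea in this abstract, base classifiers whose outputs feed a logistic regression meta classifier, can be sketched as follows. This is only an illustration under stated assumptions: the three LSTM base models are replaced by made-up probability columns in `base_probs`, and the meta classifier is a small hand-written logistic regression trained by gradient descent, not the authors' implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict(weights, bias, x):
    """Return 1 (positive) or 0 (negative) for one row of base-model outputs."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(z) >= 0.5 else 0

# Hypothetical held-out outputs: each row holds P(positive) from three base
# models for one tweet. In the abstract these would come from three LSTMs.
base_probs = [
    [0.9, 0.8, 0.7],
    [0.8, 0.6, 0.9],
    [0.7, 0.9, 0.6],
    [0.2, 0.3, 0.1],
    [0.1, 0.4, 0.2],
    [0.3, 0.2, 0.3],
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_logistic_regression(base_probs, labels)
predictions = [predict(weights, bias, row) for row in base_probs]
```

In practice the meta classifier is usually trained on base-model predictions for data the base models did not see during training, so that it does not simply learn their overfitting.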


Author(s):  
Youssra Zahidi ◽  
Yacine El Younoussi ◽  
Yassine Al-Amrani

Deep learning (DL) is a machine learning (ML) subdomain that involves algorithms inspired by brain function, called artificial neural networks (ANNs). Recently, DL approaches have achieved major successes across various Arabic natural language processing (ANLP) tasks, especially in the domain of Arabic sentiment analysis (ASA). When working on Arabic SA, researchers can use various DL libraries in their projects, but they often do so without justifying their choice, or they choose a group of libraries based on their familiarity with a particular programming language. In this work we focus on the Java and Python programming languages because they have a large set of deep learning libraries that are very useful in the ASA domain. This paper presents a comparative analysis of different valuable Python and Java libraries in order to identify the most relevant and robust DL libraries for ASA. Through this comparative analysis, we find that the TensorFlow, Theano, and Keras Python frameworks are very popular and widely used in this research domain.


2019 ◽  
Vol 9 (13) ◽  
pp. 2760 ◽  
Author(s):  
Khai Tran ◽  
Thi Phan

Sentiment analysis is an active research area in natural language processing. The task aims at identifying, extracting, and classifying sentiments from user texts in blog posts, product reviews, or social networks. In this paper, an ensemble learning model for sentiment classification is presented, known as CEM (classifier ensemble model). The model uses various data feature types, including language features, sentiment shifting, and statistical techniques. A deep learning model is adopted with word embedding representation to address explicit, implicit, and abstract sentiment factors in textual data. Experiments conducted on different real datasets found that our sentiment classification system is better than traditional machine learning techniques, such as Support Vector Machines, and other ensemble learning systems, as well as the deep learning model Long Short-Term Memory network, which has shown state-of-the-art results for sentiment analysis on almost all corpora. Our model's distinguishing feature is its effective application to different languages and different domains.


Electronics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 779
Author(s):  
Danilo Dessì ◽  
Diego Reforgiato Recupero ◽  
Harald Sack

Today, increasing numbers of people are interacting online and a lot of textual comments are being produced due to the explosion of online communication. However, a major drawback of online environments is that comments shared on digital platforms can hide hazards, such as fake news, insults, harassment, and, more generally, comments that may hurt someone's feelings. In this scenario, the detection of this kind of toxicity plays an important role in moderating online communication. Deep learning technologies have recently delivered impressive performance within Natural Language Processing applications encompassing Sentiment Analysis and emotion detection across numerous datasets. Such models do not need any pre-defined hand-picked features; they learn sophisticated features from the input datasets by themselves. In this domain, word embeddings have been widely used as a way of representing words in Sentiment Analysis tasks, proving to be very effective. Therefore, in this paper, we investigated the use of deep learning and word embeddings to detect six different types of toxicity within online comments. In doing so, the most suitable deep learning layers and state-of-the-art word embeddings for identifying toxicity are evaluated. The results suggest that Long Short-Term Memory layers in combination with mimicked word embeddings are a good choice for this task.


Author(s):  
Youssra Zahidi ◽  
Yacine El Younoussi ◽  
Yassine Al-Amrani

Arabic natural language processing (ANLP) is a subfield of artificial intelligence (AI) that aims to build applications for the Arabic language, such as Arabic sentiment analysis (ASA), which classifies the feelings and emotions expressed in a text in order to determine the attitude of the writer (neutral, negative or positive). To work on ASA, researchers can use various tools in their research projects, but they often do so without explaining the reason for their choice, or they choose a set of libraries according to their knowledge of a specific programming language. Because of the abundance of their libraries in the ANLP field, especially for ASA, we rely on the Java and Python programming languages in our research work. This paper makes an in-depth comparative evaluation of different valuable Python and Java libraries to identify the most useful ones for ASA. Based on a large variety of influential works in the domain of ASA, we deduce that the NLTK, Gensim and TextBlob libraries are the most useful for ASA in Python. Regarding Java libraries for ASA, we conclude that the Weka and CoreNLP tools are the most used and give strong results in this research domain.


Author(s):  
Praphula Kumar Jain ◽  
Vijayalakshmi Saravanan ◽  
Rajendra Pamula

With the rapid growth of information and communication technology (ICT), the availability of web content on social media platforms is increasing day by day. Sentiment analysis of online reviews is drawing researchers' attention from various organizations such as academia, government, and private industry. Sentiment analysis has been a hot research topic in Machine Learning (ML) and Natural Language Processing (NLP). Currently, Deep Learning (DL) techniques are applied to sentiment analysis to obtain excellent results. This study proposes a hybrid convolutional neural network-long short-term memory (CNN-LSTM) model for sentiment analysis. Our proposed model is applied with dropout, max pooling, and batch normalization. Experimental analysis was carried out on the Airlinequality and Twitter airline sentiment datasets. We employed the Keras word embedding approach, which converts texts into vectors of numeric values, where similar words have small vector distances between them. We calculated various metrics, such as accuracy, precision, recall, and F1-measure, to measure the model's performance. These metrics are better for the proposed model than for the classical ML models in sentiment analysis. Our results demonstrate that the proposed model outperforms them, with 91.3% accuracy in sentiment analysis.
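The four evaluation measures named in this abstract follow directly from the confusion counts of a binary classifier. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy labels are illustrative and do not come from the paper's experiments.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy example: 8 reviews, 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

With three true positives, three true negatives, one false positive and one false negative, all four measures equal 0.75 for this toy input.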


2019 ◽  
Vol 20 (1) ◽  
pp. 129-139 ◽  
Author(s):  
Zahra Bokaee Nezhad ◽  
Mohammad Ali Deihimi

With the growing membership of social media sites today, people tend to share their views about everything online. It is a convenient way to convey their messages to end users on a specific subject. Sentiment Analysis is a subfield of Natural Language Processing (NLP) that refers to the identification of users' opinions toward specific topics. It is used in several fields such as marketing and customer services. However, limited work has been done on Persian Sentiment Analysis. On the other hand, deep learning has recently become popular because of its successful role in several Natural Language Processing tasks. The objective of this paper is to propose a novel hybrid deep learning architecture for Persian Sentiment Analysis. In the proposed model, local features are extracted by Convolutional Neural Networks (CNN) and long-term dependencies are learned by Long Short Term Memory (LSTM). Therefore, the model can harness both CNN's and LSTM's abilities. Furthermore, Word2vec is used for word representation as an unsupervised learning step. To the best of our knowledge, this is the first attempt where a hybrid deep learning model is used for Persian Sentiment Analysis. We evaluate the model on a Persian dataset that is introduced in this study. The experimental results show the effectiveness of the proposed model with an accuracy of 85%.
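The convolutional front end of a hybrid CNN-LSTM model, where a filter slides over consecutive word vectors to pick up local features, can be sketched in plain Python. The word vectors, the filter, and the function names below are made up for illustration; the LSTM that would consume the pooled features is omitted, and nothing here reproduces the paper's architecture or hyperparameters.

```python
def conv1d_relu(sequence, kernel):
    """Slide a filter over consecutive word vectors and apply ReLU.

    sequence: list of word vectors (one list of floats per word)
    kernel:   list of weight vectors, one per position in the window
    """
    width = len(kernel)
    feature_map = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        total = sum(w * x
                    for k_vec, x_vec in zip(kernel, window)
                    for w, x in zip(k_vec, x_vec))
        feature_map.append(max(0.0, total))  # ReLU activation
    return feature_map

def max_pool(values, size):
    """Keep the largest activation in each non-overlapping block."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Five hypothetical 2-dimensional word vectors (e.g. from Word2vec).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 0.0]]
# One hypothetical filter spanning two consecutive words.
kernel = [[1.0, 0.0], [0.0, 1.0]]

feature_map = conv1d_relu(sentence, kernel)
pooled = max_pool(feature_map, 2)
```

In a hybrid model of this kind, the pooled sequence from many such filters would be passed to the LSTM, which learns the longer-range dependencies.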

