Arabic Sentiment Analysis on Chewing Khat Leaves using Machine Learning and Ensemble Methods

2021 ◽  
Vol 11 (2) ◽  
pp. 6845-6848
Author(s):  
W. M. S. Yafooz ◽  
E. A. Hizam ◽  
W. A. Alromema

Sentiment analysis plays an important role in capturing people's opinions or feelings towards events, products, topics, or services, which helps businesses improve their products. Governments and organizations likewise investigate and address current social issues by analyzing public perspectives and feelings. This study examines the habit of chewing Khat (qat) leaves in Yemeni society. Chewing Khat leaves is a common habit in Yemen and East Africa. The paper proposes a model to detect information about the Khat chewing habit, how people explore it, and the preference for Khat leaves among Arabic people. A dataset of user comments on 18 YouTube videos was prepared using several natural language processing techniques. Several experiments were conducted using six machine learning classifiers and four ensemble methods. Support Vector Machine and Linear Regression reached almost 80% accuracy, whereas XGBoost was the most accurate ensemble method, reaching 77%.

Sentiment analysis is the study of individuals' opinions and feedback about an entity, which can be an item, a service, a movie, a person, or an event. These opinions are mostly expressed as remarks or reviews. With social networks, forums, and websites, such reviews have become a significant factor in a customer's decision whether to buy something. Today, large scalable computing environments provide sophisticated ways of carrying out data-intensive natural language processing (NLP) and machine learning tasks to examine these reviews. One such task is text classification, an effective method for predicting customers' sentiment. In this paper, we focus our sentiment analysis work on a movie review database. We examine the sentiment expressed in order to rate the polarity of movie reviews on a scale of 0 (highly disliked) to 4 (highly preferred), perform feature extraction and ranking, and use these features to train our multilabel classifier to assign each movie review its correct rating. The paper combines sentiment analysis using feature-based opinion mining with supervised machine learning. The main focus is to determine the polarity of reviews using nouns, verbs, and adjectives as opinion words. In addition, a comparative study of different classification approaches was performed to determine the most appropriate classifier for our problem domain. In our study, we used six machine learning algorithms: Naïve Bayes, Logistic Regression, SVM (Support Vector Machine), RF (Random Forest), KNN (K-nearest neighbors), and SoftMax Regression.
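As an illustration of the first classifier in that list, the following is a minimal sketch of multinomial Naïve Bayes with add-one smoothing applied to the 0 to 4 rating scale. The toy reviews, the whitespace tokenizer, and the function names `train_nb` and `predict_nb` are assumptions made for the example; the abstract does not describe the authors' features or implementation.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Collect per-class document and word counts for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: one review per rating, 0 (highly disliked) to 4 (highly preferred).
reviews = [
    "terrible boring waste",
    "weak dull plot",
    "average okay film",
    "good fun story",
    "brilliant superb masterpiece",
]
ratings = [0, 1, 2, 3, 4]
model = train_nb(reviews, ratings)
predicted = predict_nb(model, "superb brilliant acting")
```

With real data, the whitespace tokenizer would be replaced by whatever feature extraction and ranking step is actually used, for example restricting tokens to nouns, verbs, and adjectives as the abstract describes.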


Author(s):  
Subhadip Chandra ◽  
Randrita Sarkar ◽  
Sayon Islam ◽  
Soham Nandi ◽  
Avishto Banerjee ◽  
...  

Sentiment analysis is the systematic identification, extraction, quantification, and study of affective states and subjective information using natural language processing, text analysis, computational linguistics, and biometrics. People frequently use Twitter, one of numerous popular social media platforms, to convey their thoughts and opinions about a business, a product, or a service. Analysis of tweet sentiment is particularly useful for detecting whether people hold a positive, negative, or neutral opinion. This study assesses public opinion about an individual, activity, commodity, or organization. The Twitter API is used to retrieve tweets directly from Twitter and build a sentiment classification for them. The paper applies two separate approaches to the Twitter data: lexicon-based and machine learning-based. The lexicon-based approach is further divided into corpus-based and dictionary-based methods. Machine learning approaches such as Support Vector Machine (SVM), Naïve Bayes, and Maximum Entropy are used to analyse the Twitter data. Neural network (NN) and decision tree-based sentiment analysis are also covered, to compare the accuracy of the approaches across various data ranges. Graphs and confusion matrices are used to visualise the results of the analysis for positive, negative, and neutral remarks.
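The dictionary-based branch of the lexicon approach can be sketched in a few lines. The word lists below are tiny hypothetical stand-ins for a real sentiment lexicon, and the rule that a negation flips only the next token is a simplifying assumption; the abstract does not state which lexicon or rules the authors used.

```python
import re

# Hypothetical mini-lexicons; a real system would load a published sentiment dictionary.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}
NEGATIONS = {"not", "no", "never"}


def lexicon_sentiment(tweet):
    """Label a tweet positive, negative, or neutral by summing word polarities."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    score = 0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True  # flip the polarity of the next token only
            continue
        polarity = (token in POSITIVE) - (token in NEGATIVE)  # +1, -1, or 0
        score += -polarity if negate else polarity
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


labels = [
    lexicon_sentiment("I love this phone, great battery"),
    lexicon_sentiment("not good at all"),
    lexicon_sentiment("The parcel arrived on Tuesday"),
]
```

Unlike the machine learning approaches named above, this method needs no labelled training data, which is why the two families are usually compared on the same tweets.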


The main objective of this paper is to analyze reviews of e-commerce products in social media big data. It provides online shoppers with useful results about product quality, and gives businesses decision-making support regarding the products customers most like and buy. The work covers all features or opinion words available in tweets, such as capitalized words, sequences of repeated letters, emoji, slang words, exclamatory words, intensifiers, modifiers, conjunctions, and negation words. Existing work has considered only two or three features when performing sentiment analysis with natural language processing (NLP) and machine learning. In the proposed work, the familiar machine learning classification models Multinomial Naïve Bayes, Support Vector Machine, Decision Tree Classifier, and Random Forest Classifier are used for sentiment classification. The sentiment classification serves as a decision support system for both customers and businesses.
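Several of the surface features listed above can be counted with the standard library alone. The sketch below is only an illustration: the function name `surface_features`, the two small word lists, and the emoji code-point ranges are assumptions, and slang, modifiers, and conjunctions are left out for brevity.

```python
import re

# Hypothetical word lists; the abstract does not give the ones actually used.
INTENSIFIERS = {"very", "really", "so", "extremely"}
NEGATIONS = {"not", "no", "never"}
# Rough emoji ranges (pictographs plus miscellaneous symbols); not exhaustive.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")


def surface_features(text):
    """Count simple opinion-bearing surface cues in a tweet."""
    words = re.findall(r"[A-Za-z']+", text)
    lower = [w.lower() for w in words]
    return {
        # Words written entirely in capitals, e.g. "SO"
        "all_caps_words": sum(1 for w in words if len(w) > 1 and w.isupper()),
        # Words with a letter repeated three or more times, e.g. "goooood"
        "elongated_words": sum(1 for w in words if re.search(r"(.)\1{2,}", w)),
        "emoji": len(EMOJI.findall(text)),
        "exclamations": text.count("!"),
        "intensifiers": sum(1 for w in lower if w in INTENSIFIERS),
        "negations": sum(1 for w in lower if w in NEGATIONS),
    }


features = surface_features("This phone is SO goooood!!! \U0001F60D not bad")
```

The resulting dictionary can be appended to a bag-of-words vector before it is passed to any of the classifiers named in the abstract.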


2021 ◽  
Vol 11 (19) ◽  
pp. 9292
Author(s):  
Noman Islam ◽  
Asadullah Shaikh ◽  
Asma Qaiser ◽  
Yousef Asiri ◽  
Sultan Almakdi ◽  
...  

In recent years, the consumption of social media content to keep up with global news and to verify its authenticity has become a considerable challenge. Social media enables us to easily access news anywhere, anytime, but it also gives rise to the spread of fake news, thereby delivering false information. This also has a negative impact on society. Therefore, it is necessary to determine whether or not news spreading over social media is real. This will allow for confusion among social media users to be avoided, and it is important in ensuring positive social development. This paper proposes a novel solution by detecting the authenticity of news through natural language processing techniques. Specifically, this paper proposes a novel scheme comprising three steps, namely, stance detection, author credibility verification, and machine learning-based classification, to verify the authenticity of news. In the last stage of the proposed pipeline, several machine learning techniques are applied, such as decision trees, random forest, logistic regression, and support vector machine (SVM) algorithms. For this study, the fake news dataset was taken from Kaggle. The experimental results show an accuracy of 93.15%, precision of 92.65%, recall of 95.71%, and F1-score of 94.15% for the support vector machine algorithm. The SVM is better than the second best classifier, i.e., logistic regression, by 6.82%.
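The four reported metrics are related through the confusion matrix, and F1 is the harmonic mean of precision and recall. The sketch below uses the standard definitions; the confusion counts in the usage lines are hypothetical, since the abstract reports only the final percentages.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall, f1_score(precision, recall)


# Hypothetical counts, for illustration only.
accuracy, precision, recall, f1 = classification_metrics(tp=8, fp=2, fn=2, tn=8)

# Consistency check on the figures reported for the SVM.
svm_f1 = f1_score(92.65, 95.71)
```

Plugging in the reported precision and recall gives an F1 of about 94.16%, which agrees with the reported 94.15% up to rounding of the inputs.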


Sentiment analysis is an area of natural language processing (NLP) and machine learning in which text is categorized into predefined classes, i.e., positive and negative. As the internet and social media both grow day by day, they now generate far more customer feedback than before. Text generated through social media, blogs, posts, product reviews, etc. has become the best-suited source for gauging consumer sentiment about a particular product. Features are an important input to the classification task: the better the features are optimized, the more accurate the results. Therefore, this research paper proposes a hybrid feature selection method that combines particle swarm optimization (PSO) and cuckoo search. Due to the subjective nature of social media reviews, the hybrid feature selection technique outperforms the traditional technique. Performance measures such as F-measure, recall, precision, and accuracy are tested on a Twitter dataset using a Support Vector Machine (SVM) classifier and compared with a convolutional neural network. Experimental results on these measures show that the proposed work outperforms existing work.


2021 ◽  
Vol 10 (1) ◽  
pp. 13-15
Author(s):  
Kevin Perdana ◽  
Titania Pricillia ◽  
Zulfachmi

Sentiment analysis refers to Natural Language Processing techniques, here classified as Unsupervised Learning, that identify positive, negative, or neutral opinions. Many of these opinions arrive through Twitter, where comments are concise because each post is limited to a maximum of 140 characters. Previous research on the accuracy of sentiment analysis carried out with one NLP library, TextBlob, has shown that Unsupervised Learning does not produce very good scores. In a case study of the Telkomsel service, the authors took a dataset from Twitter, and analysis with TextBlob alone yielded an accuracy value of only 58.59%. Optimization is done by adding the Support Vector Machine method, which belongs to the Supervised Learning category. The best result obtained in this study is a value of 75%.


Sentiment analysis is the process of discovering and analyzing customer feedback using Natural Language Processing (NLP). With growing interest in rating levels, sentiment is treated as an attribute of the particular aspects or features being expressed, and more attention is being paid to the problem of detecting customer reviews. Despite the wide use and popularity of some methods, a better technique for identifying the polarity of text data is hard to find. Machine learning has recently attracted attention as an approach to sentiment analysis. This work evaluates the performance of several Machine Learning (ML) classifiers, namely logistic regression, Naive Bayes, Support Vector Machine (SVM), and Neural Network (NN). To show their effectiveness in sentiment mining of customer product reviews, customer feedback was collected from Grocery and Gourmet Food. Nearly 90 thousand customer reviews, with fields including product ID, rating, review text, review time, reviewer ID, and reviewer name, are used in this analysis. The performance of the classifiers is measured in terms of accuracy, specificity, and sensitivity. Based on the experimental results, the better machine learning classification algorithm is proposed for sentiment mining of online shopping customer review data.


Author(s):  
Erick Omuya ◽  
George Okeyo ◽  
Michael Kimwele

Social media has been embraced by different people as a convenient and official medium of communication. People write messages and attach images and videos on Twitter, Facebook, and other social media, which they then share. These updates generate a lot of data that is rich in sentiment. Sentiment analysis has been used to determine the opinions of clients, for instance about a particular product or company. Knowledge-based and machine learning approaches are among the strategies that have been used to analyze these sentiments. The performance of sentiment analysis is, however, degraded by noise, the curse of dimensionality, the data domains, and the size of the data used for training and testing. This research aims to develop a model for sentiment analysis in which dimensionality reduction and the use of different parts of speech improve performance. It uses natural language processing for filtering, storing, and performing sentiment analysis on data from social media. The model is tested using Naïve Bayes, Support Vector Machine, and K-Nearest Neighbor machine learning algorithms, and its performance is compared with that of two other sentiment analysis models. Experimental results show that the model improves sentiment analysis performance using machine learning techniques.


2019 ◽  
Vol 46 (4) ◽  
pp. 544-559 ◽  
Author(s):  
Ahmed Oussous ◽  
Fatima-Zahra Benjelloun ◽  
Ayoub Ait Lahcen ◽  
Samir Belfkih

Sentiment analysis (SA), also known as opinion mining, is a growing and important research area. Generally, it helps to automatically determine whether a text expresses a positive, negative, or neutral sentiment. It makes it possible to mine the large and growing body of shared opinions found on social networks, review sites, and blogs. SA is used in many fields and for various languages, such as English and Arabic. However, since Arabic is a highly inflectional and derivational language, it raises many challenges, and SA of Arabic text must handle this complex morphology. To better handle these challenges, we provide the research community and Arabic users with a new, efficient framework for Arabic Sentiment Analysis (ASA). Our primary goal is to improve the performance of ASA by exploiting deep learning while varying the preprocessing techniques. To that end, we implement and evaluate two deep learning models, namely convolutional neural network (CNN) and long short-term memory (LSTM) models. The framework offers various preprocessing techniques for ASA (including stemming, normalisation, tokenization, and stop-word removal). As a result of this work, we first provide a new, rich, and publicly available Arabic corpus called the Moroccan Sentiment Analysis Corpus (MSAC). Second, the proposed framework demonstrates improvement in ASA. The experimental results show that deep learning models perform better for ASA than classical approaches (support vector machines, naive Bayes classifiers, and maximum entropy). They also show the key role of morphological features in Arabic Natural Language Processing (NLP).
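To make the preprocessing options concrete, here is a minimal sketch of normalisation, tokenization, and stop-word removal for Arabic text. The specific rules (dropping diacritics and tatweel, unifying alef variants, mapping alef maqsura to yeh) follow a common convention and are an assumption, as are the three-word stop list and the function names; the framework's own rules are not given in the abstract, and stemming is omitted.

```python
import re

# Arabic short vowels, tanween, shadda, and sukun (U+064B to U+0652).
DIACRITICS = re.compile("[\u064B-\u0652]")
TATWEEL = "\u0640"  # elongation character
# Alef with madda, hamza above, or hamza below.
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")


def normalize(text):
    """Strip diacritics and tatweel, then unify common letter variants."""
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    text = ALEF_VARIANTS.sub("\u0627", text)  # map to bare alef
    text = text.replace("\u0649", "\u064A")   # alef maqsura to yeh
    return text


# Tiny hypothetical stop list: "fi" (in), "min" (from), "ala" (on).
# It is normalized so that it matches normalized tokens.
STOP_WORDS = {
    normalize(w)
    for w in ("\u0641\u064A", "\u0645\u0646", "\u0639\u0644\u0649")
}


def preprocess(text):
    """Normalize, tokenize on runs of Arabic letters, and drop stop words."""
    tokens = re.findall("[\u0621-\u064A]+", normalize(text))
    return [t for t in tokens if t not in STOP_WORDS]


# "Ahmad" written with hamza and full diacritics.
vocalized = "\u0623\u064E\u062D\u0652\u0645\u064E\u062F"
plain = normalize(vocalized)
tokens = preprocess("\u0645\u0646 " + vocalized)  # "from Ahmad"
```

Varying which of these steps are switched on is one way to run the kind of preprocessing comparison the abstract describes.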


Author(s):  
Hendri Murfi ◽  
Furida Lusi Siagian ◽  
Yudi Satria

Purpose: The purpose of this paper is to analyze topics as alternative features for sentiment analysis in Indonesian tweets.
Design/methodology/approach: Given Indonesian tweets, the sentiment analysis process starts by extracting features from the tweets. The features are words or topics. The authors use non-negative matrix factorization to extract the topics and apply a support vector machine to classify the tweets into their sentiment classes.
Findings: The authors analyze the accuracy using two-class and three-class sentiment analysis data sets. Both data sets concern sentiment towards candidates in the Indonesian presidential election. The experiments show that the standard word features give better accuracy than the topic features for two-class sentiment analysis. Moreover, the topic features can slightly improve the accuracy of the standard word features. The topic features can also improve the accuracy of the standard word features for three-class sentiment analysis.
Originality/value: The standard textual data representation for sentiment analysis using machine learning is the bag of words and its extensions, mainly created by natural language processing. This paper applies topics as novel features for machine learning-based sentiment analysis in Indonesian tweets.
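The topic-extraction step can be stated compactly. As a sketch under assumptions, let V be the nonnegative term-by-tweet matrix with m terms and n tweets, and let k be the number of topics; non-negative matrix factorization then seeks nonnegative factors W and H whose product approximates V. The Frobenius-norm objective and the multiplicative update rules shown here are the textbook formulation; the abstract does not say which objective or solver the authors used.

```latex
\min_{W \ge 0,\; H \ge 0} \; \lVert V - WH \rVert_F^2,
\qquad
V \in \mathbb{R}_{\ge 0}^{m \times n},\;
W \in \mathbb{R}_{\ge 0}^{m \times k},\;
H \in \mathbb{R}_{\ge 0}^{k \times n}

H \leftarrow H \odot \frac{W^{\top} V}{W^{\top} W H},
\qquad
W \leftarrow W \odot \frac{V H^{\top}}{W H H^{\top}}
```

Here the product and the fraction in each update are element-wise. Each column of W is a topic expressed as term weights, and each column of H gives one tweet's topic weights, which can serve as the feature vector passed to the support vector machine.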

