A Comparative Approach for Opinion Spam Detection Using Sentiment Analysis

Author(s):  
Ashish Singh ◽  
Kakali Chatterjee
2020 ◽  
Vol 14 ◽  
Author(s):  
Meesala Shobha Rani ◽  
Sumathy Subramanian

With the vast development of Web 2.0 internet technology, millions of people are sharing their opinions on different social networking sites. To obtain the necessary information from the huge volume of user-generated data, the research community is paying growing attention to sentiment analysis. The growth and prominence of sentiment analysis have kept pace with the rise of social media and networking sites. Users generally use natural language for speaking, writing, and expressing their views based on various sentiment orientations, ratings, and the features of different products, topics, and issues. This creates ambiguity for customers who try to form an opinion or reach a decision based on such comments. To overcome the challenges of user-generated content, such as noisy and irrelevant information and fake reviews, there is significant demand for effective sentiment analysis methodologies. This study presents an exhaustive survey of the existing methodologies and highlights the challenges and performance factors of various approaches to sentiment analysis, including text preprocessing, opinion spam detection, and aspect-level sentiment analysis.

Background: User-generated content is growing all over the globe, and people increasingly express their views on various aspects on social media. Opinionated text is difficult to interpret, and it is hard to arrive at a conclusion based on the feedback gathered from reviews on various sites. Hence, the significance of sentiment analysis for analyzing user-generated data is growing.

Objective: The paper presents an exhaustive review that provides an overview of the pros and cons of the existing techniques and highlights the current techniques in sentiment analysis, namely text preprocessing, opinion spam detection, and aspect-level sentiment analysis based on machine learning and deep learning. This will be useful to researchers who want to focus on specific challenges and identify the most common ones in order to work toward new solutions.


2020 ◽  
Vol 5 (2) ◽  
pp. 76-110
Author(s):  
Ajay Rastogi ◽  
Monica Mehrotra ◽  
Syed Shafat Ali

Purpose: This paper aims to analyze the effectiveness of two major types of features, metadata-based (behavioral) and content-based (textual), in opinion spam detection.

Design/methodology/approach: Based on spam-detection perspectives, our approach works in three settings: review-centric (spam detection), reviewer-centric (spammer detection) and product-centric (spam-targeted product detection). In addition, to negate any kind of classifier bias, we employ four classifiers to get a better and unbiased reflection of the obtained results. We also propose a new set of features, which are compared against some well-known related works. The experiments performed on two real-world datasets show the effectiveness of different features in opinion spam detection.

Findings: Our findings indicate that behavioral features are more efficient as well as more effective than textual features at detecting opinion spam across all three settings. In addition, models trained on hybrid features produce results closer to those trained on behavioral features than to those trained on textual features, further establishing behavioral features as the dominant indicators of opinion spam. The features used in this work provide an improvement over existing features used in other related works. Furthermore, the computation time analysis for the feature extraction phase shows that behavioral features are more cost-efficient than textual features.

Research limitations: The analyses conducted in this paper are limited to two well-known datasets, viz., YelpZip and YelpNYC of Yelp.com.

Practical implications: The results obtained in this paper can be used to improve the detection of opinion spam; researchers may work on improving and developing feature engineering and selection techniques focused more on metadata information.

Originality/value: To the best of our knowledge, this study is the first of its kind to consider three perspectives (review-, reviewer- and product-centric) and four classifiers to analyze the effectiveness of opinion spam detection using two major types of features. This study also introduces some novel features, which help to improve the performance of opinion spam detection methods.
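
To make the distinction between behavioral and textual features concrete, the following is a minimal Python sketch. The abstract does not list the actual feature set, so the features shown here (maximum reviews per day, mean rating deviation, word count, exclamation count), the function names, and the toy data are all assumptions chosen for illustration, not the authors' method.

```python
from collections import defaultdict
from statistics import mean

# Toy reviews: (reviewer_id, product_id, rating, date, text).
# All values are invented for illustration; they are not from YelpZip or YelpNYC.
REVIEWS = [
    ("u1", "p1", 5, "2020-01-01", "Great food and friendly staff"),
    ("u1", "p2", 5, "2020-01-01", "Best place ever"),
    ("u1", "p3", 5, "2020-01-01", "Amazing"),
    ("u2", "p1", 2, "2020-01-03", "Slow service and the soup was cold"),
    ("u3", "p1", 2, "2020-01-05", "Overpriced for what you get"),
]


def behavioral_features(reviews):
    """Reviewer-centric features derived from metadata only (hypothetical choice)."""
    # Mean rating per product, used as the reference point for deviations.
    product_ratings = defaultdict(list)
    for _, product, rating, _, _ in reviews:
        product_ratings[product].append(rating)
    product_mean = {p: mean(r) for p, r in product_ratings.items()}

    per_day = defaultdict(lambda: defaultdict(int))
    deviations = defaultdict(list)
    for reviewer, product, rating, date, _ in reviews:
        per_day[reviewer][date] += 1
        deviations[reviewer].append(abs(rating - product_mean[product]))

    return {
        reviewer: {
            "max_reviews_per_day": max(per_day[reviewer].values()),
            "mean_rating_deviation": mean(deviations[reviewer]),
        }
        for reviewer in per_day
    }


def textual_features(reviews):
    """Review-centric features derived from the text only (hypothetical choice)."""
    return [
        {"word_count": len(text.split()), "exclamations": text.count("!")}
        for _, _, _, _, text in reviews
    ]


if __name__ == "__main__":
    print(behavioral_features(REVIEWS))
    print(textual_features(REVIEWS))
```

In this sketch the behavioral features need only counts and averages over metadata, whereas textual features require processing every review body. This is one plausible reason metadata features can be cheaper to extract, although the paper's own timing analysis is not reproduced here.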


2018 ◽  
Vol 24 (2) ◽  
pp. 1437-1442
Author(s):  
Shirin Noekhah ◽  
Naomie Binti Salim ◽  
Nor Hawaniah Zakaria

Author(s):  
Subhadip Chandra ◽  
Randrita Sarkar ◽  
Sayon Islam ◽  
Soham Nandi ◽  
Avishto Banerjee ◽  
...  

Sentiment analysis is the systematic identification, extraction, quantification, and study of affective states and subjective information using natural language processing, text analysis, computational linguistics, and biometrics. People frequently use Twitter, one of many popular social media platforms, to convey their thoughts and opinions about a business, a product, or a service. Analysing tweet sentiment is particularly useful for detecting whether people hold a positive, negative, or neutral opinion. This study assesses public opinion about an individual, activity, commodity, or organization. The Twitter API is used in this article to retrieve tweets directly from Twitter and build a sentiment classification for them. Two separate approaches are applied to the Twitter data, viz., lexicon-based and machine learning-based. The lexicon-based approach is further divided into corpus-based and dictionary-based methods. Machine learning approaches such as Support Vector Machine (SVM), Naïve Bayes, and Maximum Entropy are used to analyse the Twitter data. Neural network (NN) and decision tree-based sentiment analysis are also covered, to determine which approaches achieve better accuracy across various data ranges. Graphs and confusion matrices are used to visualise the results of the analysis for positive, negative, and neutral opinions.
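
As a rough illustration of the dictionary-based lexicon approach mentioned above, a minimal Python sketch follows. The word lists, the function name `lexicon_sentiment`, and the simple count-based scoring rule are assumptions made for this example. The abstract does not specify which lexicon or scoring scheme the authors used, and real systems rely on much larger dictionaries and handle negation and intensity.

```python
import re

# Tiny hand-made lexicon, for illustration only (an assumption, not the paper's lexicon).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "awful"}


def lexicon_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    # Lowercase and keep alphabetic tokens (apostrophes allowed inside words).
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    score = positive_hits - negative_hits
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in [
        "I love this phone, great battery",
        "Terrible service, bad food",
        "The package arrived on Tuesday",
    ]:
        print(tweet, "->", lexicon_sentiment(tweet))
```

A classifier of this kind needs no training data, which is the main practical contrast with the machine learning approaches (SVM, Naïve Bayes, Maximum Entropy) that learn from labelled tweets.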


2020 ◽  
Vol 28 (1) ◽  
pp. 83-94
Author(s):  
Enaitz Ezpeleta ◽  
Iñaki Velez de Mendizabal ◽  
José María Gómez Hidalgo ◽  
Urko Zurutuza

Unsolicited email campaigns remain one of the biggest threats, affecting millions of users per day. In recent years, several techniques to detect unsolicited emails have been developed. This work provides means to validate the hypothesis that the intention of email messages can be identified using sentiment analysis and personality recognition techniques. These techniques will provide new features that improve current spam classification techniques. We combine personality recognition and sentiment analysis techniques to analyse email content. We enrich a publicly available dataset by adding these features to each message, first separately and then in combination, creating new datasets. We apply several combinations of the best email spam classifiers and filters to each dataset in order to compare results.

