Opinion Spam Detection in Online Reviews

2017 ◽  
Vol 16 (04) ◽  
pp. 1750036 ◽  
Author(s):  
Ajay Rastogi ◽  
Monica Mehrotra

Online reviews are the most valuable sources of information about customer opinions and are considered the pillars on which the reputation of an organisation is built. From a customer’s perspective, review information is key to making a proper decision regarding an online purchase. Reviews are generally considered an unbiased opinion of an individual’s personal experience with a product, but the underlying truth about these reviews tells a different story. Spammers exploit these review platforms illegally because of the incentives involved in writing fake reviews, trying to gain an advantage over competitors, which has resulted in an explosive growth of opinion spamming. The present study analyses and categorises the available literature on opinion spamming according to three detection targets: (1) opinion spam, (2) opinion spammers, and (3) collusive opinion spammer groups. The study further divides opinion spamming into three types based on textual and linguistic, behavioural, and relational features. Moreover, several state-of-the-art machine-learning techniques for opinion spam detection are discussed. The study concludes with a summary of the research articles on opinion spam detection and some interesting results to assist researchers in further exploration of the domain.

Nowadays, when someone decides to book a hotel, previous online reviews play a major role in determining the best hotel within the customer's budget. Online reviews are the most important source of the information used to analyse public opinion. Because reviews have a high impact on business, hotel owners pay close attention to customer feedback and past online reviews. However, not all reviews are truthful and trustworthy; some people intentionally write fake reviews to promote or to defame a hotel. It is therefore essential to develop techniques for analysing reviews. Various machine learning techniques, such as supervised learning, text mining, unsupervised learning, semi-supervised learning, and reinforcement learning, can be used to detect fake reviews. This paper outlines how machine learning techniques can be applied to the analysis of past online hotel reviews and, based on these observations, suggests the most suitable technique for a particular situation.


2020 ◽  
Vol 3 (2) ◽  
pp. 46-54
Author(s):  
Abhinandan V. ◽  
Aishwarya C. A. ◽  
Arshiya Sultana

Online reviews play a vital role in today's business and commerce. In the world of e-commerce, reviews are the best signs of success and failure. Businesses that have good reviews get a lot of free exposure on websites, and pages that have good reviews show up at the top of the search results. Fake reviews are everywhere online. A fake online review is one written by someone who has not actually used the product or service. Because of cut-throat competition, sellers are now willing to resort to unfair means to make their products stand out. This work introduces some supervised machine learning techniques to detect fake online reviews and to block the malicious users who post them.


Author(s):  
Gray Stanton ◽  
Athirai A. Irissappane

Online reviews have become a vital source of information in purchasing a service (product). Opinion spammers manipulate reviews, affecting the overall perception of the service. A key challenge in detecting opinion spam is obtaining ground truth. Though there exists a large set of reviews, only a few of them have been labeled spam or non-spam. We propose spamGAN, a generative adversarial network which relies on limited labeled data as well as unlabeled data for opinion spam detection. spamGAN improves the state-of-the-art GAN based techniques for text classification. Experiments on TripAdvisor data show that spamGAN outperforms existing techniques when labeled data is limited. spamGAN can also generate reviews with reasonable perplexity.


2019 ◽  
Vol 12 (2) ◽  
pp. 87
Author(s):  
Yuanchao Liu ◽  
Bo Pang

Online reviews play an increasingly important role in the purchase decisions of potential customers. Incidentally, driven by the desire to gain profit or publicity, spammers may be hired to write fake reviews and promote or demote the reputation of products or services. Correspondingly, opinion spam detection has attracted attention from both business and research communities in recent years. However, unlike other tasks such as news classification or blog classification, the existing review spam datasets are typically limited due to the expensiveness of human annotation, which may further affect detection performance even if excellent classifiers have been developed. We propose a novel approach in this paper to boost opinion spam detection performance by fully utilizing the existing labelled small-size dataset. We first design an annotation extension scheme that uses extra tree classifiers to train multiple estimators and then iteratively generate reliable labelled samples from unlabeled ones. Subsequently, we examine neural network scenarios on a newly extended dataset to learn the distributed representation. Experimental results suggest that the proposed approach has better generalization capability and improved performance than state-of-the-art methods.
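The annotation extension idea in this abstract (several estimators vote, and an unlabelled review is added to the labelled set only when they agree) can be illustrated roughly as follows. This is a minimal sketch and not the authors' method: a 1-nearest-neighbour rule on bootstrap resamples stands in for the extra tree classifiers, and the function names, the two-dimensional feature vectors, and the unanimous-vote rule are all assumptions made for illustration.

```python
import random


def nearest_label(train, x):
    """Stand-in estimator (assumption): label of the closest training point."""
    closest = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)))
    return closest[1]


def extend_annotations(labelled, unlabelled, n_estimators=5, rounds=3, seed=0):
    """Iteratively move samples from the unlabelled pool to the labelled set.

    A sample is pseudo-labelled only when every estimator predicts the same label.
    """
    rng = random.Random(seed)
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(rounds):
        # Each estimator sees a different bootstrap resample of the labelled set.
        resamples = [[rng.choice(labelled) for _ in labelled] for _ in range(n_estimators)]
        still_unlabelled = []
        for x in pool:
            votes = {nearest_label(sample, x) for sample in resamples}
            if len(votes) == 1:
                labelled.append((x, votes.pop()))  # unanimous: accept the pseudo-label
            else:
                still_unlabelled.append(x)  # disagreement: keep for a later round
        if len(still_unlabelled) == len(pool):
            break  # no progress in this round, so stop early
        pool = still_unlabelled
    return labelled, pool


# Hypothetical toy data: two numeric features per review.
labelled = [((0.9, 0.8), "spam"), ((0.1, 0.2), "genuine"),
            ((0.8, 0.9), "spam"), ((0.2, 0.1), "genuine")]
unlabelled = [(0.85, 0.9), (0.15, 0.1), (0.5, 0.5)]
extended, remaining = extend_annotations(labelled, unlabelled)
```

Requiring unanimity trades coverage for reliability: ambiguous samples stay unlabelled rather than risk adding noisy labels to the training set.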


Energies ◽  
2021 ◽  
Vol 14 (16) ◽  
pp. 4776
Author(s):  
Seyed Mahdi Miraftabzadeh ◽  
Michela Longo ◽  
Federica Foiadelli ◽  
Marco Pasetti ◽  
Raul Igual

The recent advances in computing technologies and the increasing availability of large amounts of data in smart grids and smart cities are generating new research opportunities in the application of Machine Learning (ML) for improving the observability and efficiency of modern power grids. However, as the number and diversity of ML techniques increase, questions arise about their performance and applicability, and on the most suitable ML method depending on the specific application. Trying to answer these questions, this manuscript presents a systematic review of the state-of-the-art studies implementing ML techniques in the context of power systems, with a specific focus on the analysis of power flows, power quality, photovoltaic systems, intelligent transportation, and load forecasting. The survey investigates, for each of the selected topics, the most recent and promising ML techniques proposed by the literature, by highlighting their main characteristics and relevant results. The review revealed that, when compared to traditional approaches, ML algorithms can handle massive quantities of data with high dimensionality, by allowing the identification of hidden characteristics of (even) complex systems. In particular, even though very different techniques can be used for each application, hybrid models generally show better performances when compared to single ML-based models.


2021 ◽  
Vol 13 (1) ◽  
pp. 1-16
Author(s):  
Michela Fazzolari ◽  
Francesco Buccafurri ◽  
Gianluca Lax ◽  
Marinella Petrocchi

Over the past few years, online reviews have become very important, since they can influence the purchase decisions of consumers and the reputation of businesses. Therefore, the practice of writing fake reviews can have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially ones based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that the use of the distributional features improves the performance of classifiers.
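One plausible reading of the Cumulative Relative Frequency Distribution mentioned in this abstract is that each raw feature value is replaced by the share of observations that are less than or equal to it. The sketch below illustrates that reading only; the function name and the toy review-length feature are assumptions, and the paper's exact construction may differ.

```python
from bisect import bisect_right


def cumulative_relative_frequency(values):
    """Return a function mapping x to the share of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")

    def crf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return crf


# Hypothetical feature: review length in words for five reviews.
review_lengths = [12, 40, 40, 95, 300]
crf = cumulative_relative_frequency(review_lengths)
distributional_feature = [crf(x) for x in review_lengths]
print(distributional_feature)  # [0.2, 0.6, 0.6, 0.8, 1.0]
```

The transformed values always lie in (0, 1], so features measured on very different scales become directly comparable before they are passed to a classifier.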


Author(s):  
Rashida Ali ◽  
Ibrahim Rampurawala ◽  
Mayuri Wandhe ◽  
Ruchika Shrikhande ◽  
Arpita Bhatkar

The Internet provides a medium for connecting with individuals of similar or different interests, creating a hub. Since a huge number of users participate on these platforms, a user can receive a high volume of messages from different individuals, resulting in chaos and unwanted messages. These messages sometimes contain true information and sometimes false information, which confuses users and is the first step towards spam messaging. A spam message is an irrelevant and unsolicited message sent by a known or unknown user, which may create a sense of insecurity among users. In this paper, different machine learning algorithms were trained and tested with natural language processing (NLP) to classify messages as spam or ham.
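The abstract does not name the algorithms that were tested. As an illustration of spam/ham classification in general, here is a minimal sketch of one common baseline, multinomial naive Bayes with add-one smoothing over a bag of words. The function names and the four training messages are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train_naive_bayes(messages):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in messages:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return {"word_counts": word_counts, "label_counts": label_counts, "vocab": vocab}


def classify(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    total = sum(model["label_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["label_counts"].items():
        counts = model["word_counts"][label]
        denom = sum(counts.values()) + vocab_size
        score = math.log(n / total)  # log prior
        for tok in tokenize(text):
            if tok in model["vocab"]:  # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
training = [
    ("win a free prize now", "spam"),
    ("free cash offer click now", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("please send the report today", "ham"),
]
model = train_naive_bayes(training)
print(classify(model, "claim your free prize"))  # spam
print(classify(model, "lunch meeting today"))    # ham
```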

