Online Fraud Review Detection Using Data Mining

Online reviews have a great impact on today’s business and commerce. Purchase decisions for online products depend largely on reviews written by other users, and many people now rely on opinions posted on social media when deciding whether to buy a product or service. Opinion spam detection is a hard problem because many fake reviews are created by organizations or individuals for various purposes. Opportunistic individuals or groups write fake reviews to mislead readers or automated detection systems, promoting target products or damaging their reputations for their own interests. This paper introduces several semi-supervised and supervised text mining models for detecting fake online reviews and compares the efficiency of both techniques on a dataset of hotel reviews.

2020 ◽  
Vol 17 (12) ◽  
pp. 5464-5468
Author(s):  
Ch. V. Bhargavi ◽  
G. Mani ◽  
G. Jyothi ◽  
K. Venkat Rao ◽  
E. Laxmi Lydia

Most people require genuine information about an online product. Before spending money on a particular product, they can analyze the reviews on the website. In this scenario, however, they cannot tell whether the product is fake or genuine. In general, some of the favorable reviews on these websites are added by the company's own technical staff to make the product popular. These staff belong to media and social organization teams and give well-rated reviews on behalf of their own firm. Because of this falsification in the website's reviews, online purchasers fail to identify fake products. In this research, an SVM classification mechanism is used to detect fake reviews by using the IP address. This implementation helps users find the correct reviews of an online product. In this approach, accuracy is reported to improve by 98.79% and the F1-score by 10%.
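For reference, the accuracy and F1-score mentioned above are both derived from confusion-matrix counts. The sketch below shows the standard formulas; the counts in the usage note are made up for illustration (they are not results from the study), and treating "fake" as the positive class is an assumption.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) from confusion-matrix counts.

    Assumption: 'fake' is the positive class, so tp = fake reviews
    correctly flagged and fp = genuine reviews wrongly flagged.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when no positives are predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1
```

For example, `accuracy_and_f1(8, 2, 2, 8)` describes a classifier that flags 8 of 10 fake reviews and wrongly flags 2 of 10 genuine ones.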


2021 ◽  
Vol 13 (1) ◽  
pp. 1-16
Author(s):  
Michela Fazzolari ◽  
Francesco Buccafurri ◽  
Gianluca Lax ◽  
Marinella Petrocchi

Over the past few years, online reviews have become very important, since they can influence the purchase decisions of consumers and the reputation of businesses. Therefore, the practice of writing fake reviews can have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially ones based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that the distributional features improve the performance of classifiers.
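As a rough illustration of what a cumulative relative frequency transform looks like, the sketch below replaces each raw feature value with the fraction of observations that are less than or equal to it. The abstract does not spell out the authors' exact construction, so the function names and this particular formulation are assumptions.

```python
from collections import Counter


def cumulative_relative_frequency(values):
    """Map each distinct value to the fraction of observations <= that value."""
    counts = Counter(values)
    total = len(values)
    running = 0
    crf = {}
    for value in sorted(counts):
        running += counts[value]
        crf[value] = running / total
    return crf


def to_distributional_feature(values):
    """Replace every raw value by its cumulative relative frequency (hypothetical helper)."""
    crf = cumulative_relative_frequency(values)
    return [crf[v] for v in values]
```

Applied to a feature such as the number of reviews per user, the transformed value says how a user ranks against the whole population rather than giving an absolute count.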


Author(s):  
Mayuri Manikrao Patil ◽  
Snehal Nimba Nikumbh ◽  
Aparna Parshwanath Parigond

A customer’s decision to purchase a product or service is primarily influenced by online reviews. Customers use online reviews, which are valuable sources of information, to understand public opinion on products and services. This dependence on online reviews raises the concern that violators could post deceitful reviews in order to artificially promote or decry products and services. This practice is known as opinion spam: spammers manipulate reviews by writing fake, untruthful, or deceptive reviews to profit, boost their own products, and devalue a competitor’s products. To tackle this issue, we propose to build a fraud risk management system and removal model. It captures fraudulent transactions based on user behavior and network, analyses them in real time using data mining, and predicts suspicious users and transactions. In this system, we use two techniques, NLP and TF-IDF, to differentiate between fake and genuine reviews or feedback received from customers.
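To make the TF-IDF step concrete, here is a minimal sketch of one common formulation: term frequency normalized by document length, multiplied by the unsmoothed inverse document frequency log(N / df). Whitespace tokenization and this exact variant are assumptions; the abstract does not say which variant the system uses.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumptions: lowercase whitespace tokenization, tf = count / doc length,
    idf = log(N / document frequency) with no smoothing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

A term that occurs in every review gets weight zero under this variant, while terms concentrated in a few reviews score higher; the resulting vectors would then be passed to a classifier.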


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Saleh Nagi Alsubari ◽  
Sachin N. Deshmukh ◽  
Mosleh Hmoud Al-Adhaileh ◽  
Fawaz Waselalla Alsaade ◽  
Theyazn H. H. Aldhyani

Online product reviews play a major role in the success or failure of an e-commerce business. Before procuring products or services, shoppers usually go through the online reviews posted by previous customers to learn the details of products and make purchasing decisions. However, specific products can be boosted or harmed by fake reviews written by fraudsters. These reviews can cause financial loss to e-commerce businesses and mislead consumers into the wrong decision to search for alternative products. Thus, a fake review detection system is needed for e-commerce businesses. The proposed methodology uses four standard fake review datasets from multiple domains, including hotels, restaurants, Yelp, and Amazon. Preprocessing methods such as stopword removal, punctuation removal, and tokenization are applied, together with sequence padding to give input sequences a fixed length during training, validation, and testing of the model. Because the datasets differ in size, separate input word-embedding matrices of n-gram features of the review text are created with the help of a word-embedding layer, which is one component of the proposed model. Convolutional and max-pooling layers of the CNN technique are implemented for feature extraction and dimensionality reduction, respectively. Based on gate mechanisms, the LSTM layer is combined with the CNN technique for learning and handling the contextual information of n-gram features of the review text. Finally, a sigmoid activation function in the last layer of the proposed model receives the output of the previous layer and performs binary classification of the review text as fake or truthful. In this paper, the proposed CNN-LSTM model is evaluated in two types of experiments: in-domain and cross-domain. In the in-domain experiments, the model is applied to each dataset individually, while in the cross-domain experiment, all datasets are combined into a single data frame and evaluated together. In the in-domain experiments, the model achieved test accuracies of 77%, 85%, 86%, and 87% on the restaurant, hotel, Yelp, and Amazon datasets, respectively. In the cross-domain experiment, the proposed model attained 89% accuracy. Furthermore, a comparison of the in-domain results with existing approaches, based on the accuracy metric, shows that the proposed model outperformed the compared methods.
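The preprocessing steps named in this abstract (punctuation removal, stopword removal, tokenization, and padding to a fixed length) can be sketched in plain Python as follows. The stopword list, the function names, and the choice to pad at the end of the sequence are illustrative assumptions, not details taken from the paper.

```python
import string

# Tiny illustrative stopword list (assumption); real pipelines use a fuller one.
STOPWORDS = {"the", "a", "an", "was", "is", "and", "of", "to"}


def preprocess(text):
    """Lowercase, strip punctuation, split on whitespace, drop stopwords."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in no_punct.split() if tok not in STOPWORDS]


def build_vocab(token_lists):
    """Assign integer ids starting at 1; id 0 is reserved for padding."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode_and_pad(tokens, vocab, max_len, pad_id=0):
    """Encode tokens as ids (unknown tokens are dropped), truncate, then pad at the end."""
    ids = [vocab[tok] for tok in tokens if tok in vocab][:max_len]
    return ids + [pad_id] * (max_len - len(ids))
```

Every review then becomes an integer sequence of the same length, which is the shape an embedding layer followed by convolutional and LSTM layers expects as input.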


Author(s):  
Aneel Narayanapur ◽  
Pavankumar Naik ◽  
Suraksha G ◽  
Pavitra S I ◽  
Shruddha Mudigoudar ◽  
...  

Online reviews have a great impact on today's business and commerce. Purchase decisions for online products depend largely on reviews written by other users. Hence, opportunistic individuals or groups try to manipulate product reviews for their own interests. This paper introduces several semi-supervised and supervised text mining models for detecting fake online reviews and compares the efficiency of both techniques on a data set of hotel reviews.


2017 ◽  
Vol 16 (04) ◽  
pp. 1750036 ◽  
Author(s):  
Ajay Rastogi ◽  
Monica Mehrotra

Online reviews are the most valuable sources of information about customer opinions and are considered the pillars on which the reputation of an organisation is built. From a customer’s perspective, review information is key to making a proper decision regarding an online purchase. Reviews are generally considered an unbiased opinion of an individual’s personal experience with a product, but the underlying truth about these reviews tells a different story. Spammers exploit these review platforms illegally because of incentives involved in writing fake reviews, thereby trying to gain an advantage over competitors resulting in an explosive growth of opinion spamming. The present study analyses and categorises the available literature on opinion spamming according to three detection targets: (1) opinion spam, (2) opinion spammers, and (3) collusive opinion spammer groups. The study further highlights and divides opinion spamming into three types based on textual and linguistic, behavioural, and relational features. Moreover, several state-of-the-art machine-learning techniques for opinion spam detection have also been discussed in the study. It concludes with a summary of the research articles on opinion spam detection and some interesting results to assist researchers for further exploration of the domain.


2019 ◽  
Vol 12 (2) ◽  
pp. 87
Author(s):  
Yuanchao Liu ◽  
Bo Pang

Online reviews play an increasingly important role in the purchase decisions of potential customers. Driven by the desire for profit or publicity, spammers may be hired to write fake reviews that promote or demote the reputation of products or services. Consequently, opinion spam detection has attracted attention from both business and research communities in recent years. However, unlike other tasks such as news classification or blog classification, existing review spam datasets are typically small because human annotation is expensive, which can limit detection performance even when excellent classifiers are available. We propose a novel approach in this paper to boost opinion spam detection performance by fully utilizing the existing small labelled dataset. We first design an annotation extension scheme that uses extra tree classifiers to train multiple estimators and then iteratively generates reliable labelled samples from unlabelled ones. Subsequently, we examine neural network scenarios on the newly extended dataset to learn the distributed representation. Experimental results suggest that the proposed approach has better generalization capability and performance than state-of-the-art methods.
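The general idea of iteratively extending a small labelled set can be sketched as a self-training loop: train several estimators, pseudo-label the unlabelled samples on which all of them agree, and repeat. To stay self-contained, the sketch below swaps the extra tree classifiers for a much simpler stand-in (one nearest-centroid rule per feature), and the unanimous-vote criterion is likewise an assumption rather than the authors' scheme.

```python
def train_estimator(labelled, feature):
    """Nearest-centroid rule on one feature (simple stand-in for one tree)."""
    by_class = {}
    for x, y in labelled:
        by_class.setdefault(y, []).append(x[feature])
    means = {y: sum(v) / len(v) for y, v in by_class.items()}

    def predict(x):
        return min(means, key=lambda y: abs(x[feature] - means[y]))

    return predict


def extend_labels(labelled, unlabelled, max_rounds=5):
    """Move unlabelled samples into the labelled set when all estimators agree."""
    labelled = list(labelled)
    pool = list(unlabelled)
    n_features = len(labelled[0][0])
    for _ in range(max_rounds):
        # Retrain the ensemble on the current (possibly extended) labelled set.
        estimators = [train_estimator(labelled, f) for f in range(n_features)]
        remaining, added = [], 0
        for x in pool:
            votes = {est(x) for est in estimators}
            if len(votes) == 1:  # unanimous vote counts as "reliable"
                labelled.append((x, votes.pop()))
                added += 1
            else:
                remaining.append(x)
        pool = remaining
        if added == 0 or not pool:
            break
    return labelled, pool
```

Samples on which the estimators disagree stay unlabelled, so the extended set grows only where the ensemble is consistent; a stricter or looser agreement rule trades label quality against quantity.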


2012 ◽  
Vol 11 (1) ◽  
Author(s):  
Puguh Suharso

The era of globalisation is leading the people of the world into social interaction with one another and into economic competition. How well is the DKI Jakarta Government prepared to face global competition in its effort to raise the standard of living to that of an advanced society? Several indicators serve as criteria for measuring progress toward that goal, namely the infrastructure needed by entrepreneurs: permits, taxation, labour legislation, road traffic, customs and harbours, public infrastructure services, land use, security conditions, access to business finance, and business environment conditions. The analysis uses data gathered from the opinions of entrepreneurs in the operational area. Its aim is to measure how well, on each indicator, the DKI Jakarta Government is prepared to face global competition. The study concludes that the DKI Jakarta Government is reasonably well prepared. The only weak indicator is taxation, which falls in the bad category, while the other indicators are reasonably good. The parameters behind the weak taxation score are the time needed to process taxes, the number and variety of regional levies, the number and variety of regional taxes, and the clarity of tax procedures.


2021 ◽  
pp. 002224372199110
Author(s):  
Joy Lu ◽  
Eric T. Bradlow ◽  
J. Wesley Hutchinson

Online educational platforms increasingly allow learners to consume content at their own pace with on-demand formats, in contrast to the synchronous content of traditional education. Thus, it is important to understand and model learner engagement within these environments. Using data from four business courses hosted on Coursera, we model learner behavior as a two-stage decision process, with the first stage determining across-day continuation versus quitting and the second stage determining within-day choices among lectures, quizzes, and breaks. By modeling the heterogeneity across learners pursuing lecture and quiz completion goals, we capture different patterns of consumption that correspond to extant theories of goal progress within an empirical field setting. We find that most individuals exhibit a learning style where lecture utility changes as an inverted-U-shaped function of current progress. Our model may also be used as an early detection system to anticipate changes in engagement and allows us to relate learning styles to final performance outcomes and enrollment in additional courses. Finally, we examine the role of quizzes in how consumption patterns vary across learners in different courses and between those who have paid or not paid for the option to earn a course certificate.


Author(s):  
Falk Schwendicke ◽  
Akhilanand Chaurasia ◽  
Lubaina Arsiwala ◽  
Jae-Hong Lee ◽  
Karim Elhennawy ◽  
...  

Objectives: Deep learning (DL) has been increasingly employed for automated landmark detection, e.g., for cephalometric purposes. We performed a systematic review and meta-analysis to assess the accuracy and underlying evidence for DL for cephalometric landmark detection on 2-D and 3-D radiographs. Methods: Diagnostic accuracy studies published in 2015–2020 in Medline/Embase/IEEE/arXiv and employing DL for cephalometric landmark detection were identified and extracted by two independent reviewers. Random-effects meta-analysis, subgroup analysis, and meta-regression were performed, and study quality was assessed using QUADAS-2. The review was registered (PROSPERO no. 227498). Data: From 321 identified records, 19 studies (published 2017–2020) were included, all employing convolutional neural networks, mainly on 2-D lateral radiographs (n=15), using data from publicly available datasets (n=12), and testing the detection of a mean of 30 (SD: 25; range: 7–93) landmarks. The reference test was established by two experts (n=11), one expert (n=4), three experts (n=3), or a set of annotators (n=1). Risk of bias was high, and applicability concerns were detected for most studies, mainly regarding data selection and reference test conduct. Landmark prediction error centered around a 2-mm error threshold (mean: –0.581 mm; 95% CI: –1.264 to 0.102 mm). The proportion of landmarks detected within this 2-mm threshold was 0.799 (0.770 to 0.824). Conclusions: DL shows relatively high accuracy for detecting landmarks on cephalometric imagery. The overall body of evidence is consistent but suffers from high risk of bias. Demonstrating robustness and generalizability of DL for landmark detection is needed. Clinical significance: Existing DL models show consistent and largely high accuracy for automated detection of cephalometric landmarks. The majority of studies so far focused on 2-D imagery; data on 3-D imagery are sparse but promising. Future studies should focus on demonstrating generalizability, robustness, and clinical usefulness of DL for this objective.
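The "proportion of landmarks detected within a 2-mm threshold" reported above can be computed per image as in the sketch below. It assumes landmark coordinates are already expressed in millimetres and uses Euclidean distance as the error; the function name and the sample points in the usage note are hypothetical.

```python
import math


def success_detection_rate(predicted, reference, threshold_mm=2.0):
    """Fraction of landmarks whose Euclidean error is within threshold_mm.

    Assumption: both lists hold coordinates in millimetres, in the same order.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    errors = [math.dist(p, r) for p, r in zip(predicted, reference)]
    within = sum(1 for e in errors if e <= threshold_mm)
    return within / len(errors)
```

For instance, with predictions `[(0, 0), (3, 4), (1, 1)]` against references `[(0, 1), (0, 0), (1, 1)]`, the errors are 1 mm, 5 mm, and 0 mm, so two of the three landmarks fall within the threshold.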

