Online Fraud Review Detection Using Data Mining

International Journal for Research in Engineering Application & Management ◽

10.35291/2454-9150.2020.0416 ◽

2020 ◽

pp. 326-330

Keyword(s):

Detection System ◽

Online Reviews ◽

Automated Detection ◽

Product Reviews ◽

Online Fraud ◽

The People ◽

Or Groups ◽

Opinion Spam ◽

Using Data ◽

Fake Reviews

Online reviews have a great impact on today’s business and commerce. Purchase decisions for online products depend mostly on reviews given by users. Nowadays, many people rely on opinions from social media when deciding whether to buy a product or service. Opinion spam detection is a laborious and hard problem because many fake reviews are created by organizations or individuals for various purposes. They write fake reviews to mislead readers or automated detection systems, promoting target products or demoting them to damage their reputations; in this way, opportunistic individuals or groups try to manipulate product reviews for their own interests. This paper introduces some semi-supervised and supervised text mining models to detect fake online reviews and compares the efficiency of both techniques on a dataset containing hotel reviews.


A Real and Accurate Fake Product Detection System and Generate Original Reviews Using Data Mining Mechanism

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9440 ◽

2020 ◽

Vol 17 (12) ◽

pp. 5464-5468

Author(s):

Ch. V. Bhargavi ◽

G. Mani ◽

G. Jyothi ◽

K. Venkat Rao ◽

E. Laxmi Lydia

Keyword(s):

Data Mining ◽

Social Organization ◽

Detection System ◽

Ip Address ◽

Svm Classification ◽

The People ◽

Using Data ◽

Fake Reviews ◽

Technical People ◽

Good Rating

Most people require genuine information about an online product. Before spending money on a particular product, they can analyze the various reviews on the website. In this scenario, they cannot tell whether the product is fake or genuine. In general, some of the good reviews on websites are added by the company's own technical people to make the product popular. These people belong to media and social organization teams, and they give reviews with a good rating on behalf of their own firm. Online purchasers cannot identify the fake product because of this falsification in the website's reviews. In this research, the SVM classification mechanism has been used to detect fake reviews by using the IP address. This implementation helps users find the correct reviews of an online product. The reported accuracy is improved by 98.79%, and the F1-score increases by 10%.


Experience

Journal of Data and Information Quality ◽

10.1145/3439307 ◽

2021 ◽

Vol 13 (1) ◽

pp. 1-16

Author(s):

Michela Fazzolari ◽

Francesco Buccafurri ◽

Gianluca Lax ◽

Marinella Petrocchi

Keyword(s):

Relative Frequency ◽

Service Providers ◽

Real Data ◽

Online Reviews ◽

Purchase Decision ◽

The Past ◽

Supervised Classifiers ◽

Relative Frequency Distribution ◽

Opinion Spam ◽

Fake Reviews

Over the past few years, online reviews have become very important, since they can influence the purchase decisions of consumers and the reputation of businesses. Therefore, the practice of writing fake reviews can have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially ones based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that the use of the distributional features improves the performance of classifiers.
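As a rough illustration of what a cumulative relative frequency distribution of a feature looks like, the sketch below maps each raw feature value to the fraction of observations that are less than or equal to it. The feature (review length in words), the sample values, and the function name are hypothetical; the abstract does not say exactly how the authors re-engineer their features, so this is only one plausible reading.

```python
from bisect import bisect_right


def cumulative_relative_frequency(values):
    """Return a function mapping x to the fraction of observed values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")

    def crf(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(ordered, x) / n

    return crf


# Hypothetical raw feature: review length in words for six reviews.
review_lengths = [12, 40, 40, 75, 120, 300]
crf = cumulative_relative_frequency(review_lengths)

# Replace each raw value with its cumulative relative frequency.
distributional_feature = [crf(v) for v in review_lengths]
```

A value transformed this way always lies in [0, 1] and expresses where a review sits relative to the rest of the data set rather than its absolute magnitude.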


Fake Product Monitoring and Removal for Genuine Product Feedback

International Journal of Emerging Science and Engineering ◽

10.35940/ijese.a2494.037121 ◽

2021 ◽

Vol 7 (1) ◽

pp. 1-3

Author(s):

Mayuri Manikrao Patil ◽

Snehal Nimba Nikumbh ◽

Aparna Parshwanath Parigond

Keyword(s):

Online Reviews ◽

Sources Of Information ◽

The Public ◽

Fraud Risk ◽

Network Analyses ◽

User Behaviors ◽

Opinion Spam ◽

Using Data ◽

Fraud Risk Management ◽

Removal Model

A customer’s decision to purchase a product or service is primarily influenced by online reviews. Customers use online reviews, which are valuable sources of information, to understand public opinion on products and services. Dependence on online reviews gives rise to the concern that violators could post deceitful reviews in order to artificially promote or decry products and services. This practice is known as Opinion Spam, where spammers manipulate reviews by writing fake, untruthful, or deceptive reviews to make a profit, boost their own products, and devalue a competitor’s products. To tackle this issue, we propose to build a fraud risk management system and removal model. It captures fraudulent transactions based on user behaviors and network, analyses them in real time using Data Mining, and accurately predicts the suspicious users and transactions. In this system, we use two techniques, NLP and TF-IDF, to differentiate between fake and genuine reviews or feedback received from customers.
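The abstract names TF-IDF but gives no details of how it is applied. As a minimal sketch of the weighting itself, assuming whitespace tokenization, term frequency normalized by document length, and the plain log(N / df) form of inverse document frequency, it could be computed as follows. The sample reviews and the function name are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical review texts.
reviews = [
    "great hotel great staff",
    "terrible hotel rude staff",
    "best product ever buy now",
]
weights = tf_idf(reviews)
```

Terms that appear in many reviews (here "hotel" and "staff") receive lower weights than terms concentrated in one review, which is what makes the resulting vectors usable as classifier input.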


Development of Integrated Neural Network Model for Identification of Fake Reviews in E-Commerce Using Multidomain Datasets

Applied Bionics and Biomechanics ◽

10.1155/2021/5522574 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Saleh Nagi Alsubari ◽

Sachin N. Deshmukh ◽

Mosleh Hmoud Al-Adhaileh ◽

Fawaz Waselalla Alsaade ◽

Theyazn H. H. Aldhyani

Keyword(s):

Detection System ◽

Input Sequence ◽

Online Reviews ◽

Word Embedding ◽

Input Word ◽

Wrong Decision ◽

Cross Domain ◽

Proposed Model ◽

N Gram ◽

Fake Reviews

Online product reviews play a major role in the success or failure of an E-commerce business. Before procuring products or services, shoppers usually go through the online reviews posted by previous customers to get recommendations and product details and to make purchasing decisions. Nevertheless, it is possible to enhance or hamper specific E-business products by posting fake reviews, which are written by persons called fraudsters. These reviews can cause financial loss to E-commerce businesses and mislead consumers into taking the wrong decision when searching for alternative products. Thus, developing a fake review detection system is ultimately required for E-commerce business. The proposed methodology uses four standard fake review datasets from multiple domains: hotels, restaurants, Yelp, and Amazon. Preprocessing methods such as stopword removal, punctuation removal, and tokenization are performed, as well as sequence padding so that the input sequences have a fixed length during training, validation, and testing of the model. As this methodology uses datasets of different sizes, various input word-embedding matrices of n-gram features of the review text are created with the help of a word-embedding layer, which is one component of the proposed model. Convolutional and max-pooling layers of the CNN technique are implemented for feature extraction and dimensionality reduction, respectively. Based on gate mechanisms, the LSTM layer is combined with the CNN technique for learning and handling the contextual information of n-gram features of the review text. Finally, a sigmoid activation function as the last layer of the proposed model receives the input sequences from the previous layer and performs binary classification of the review text into fake or truthful. In this paper, the proposed CNN-LSTM model was evaluated in two types of experiments: in-domain and cross-domain. For an in-domain experiment, the model is applied to each dataset individually, while in a cross-domain experiment, all datasets are gathered into a single data frame and evaluated as a whole. In the in-domain experiments, the model's test accuracy was 77%, 85%, 86%, and 87% for the restaurant, hotel, Yelp, and Amazon datasets, respectively. In the cross-domain experiment, the proposed model attained 89% accuracy. Furthermore, a comparative analysis of the in-domain results with existing approaches, based on the accuracy metric, shows that the proposed model outperformed the compared methods.
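The preprocessing steps listed in the abstract (punctuation removal, stopword removal, tokenization, and padding to a fixed length) can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' pipeline: the stopword list, the vocabulary scheme (index 0 reserved for padding), the choice to pad at the end, and all function names are hypothetical.

```python
import string

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "to"}


def preprocess(text):
    """Lowercase, strip ASCII punctuation, tokenize on whitespace, drop stopwords."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]


def build_vocab(token_lists):
    """Assign each token an integer id starting at 1 (0 is kept for padding)."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def encode_and_pad(tokens, vocab, max_len, pad_id=0):
    """Map tokens to ids, truncate to max_len, and right-pad with pad_id."""
    ids = [vocab[tok] for tok in tokens if tok in vocab][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Hypothetical review texts.
reviews = ["The room was clean, and the staff friendly!", "Terrible."]
token_lists = [preprocess(r) for r in reviews]
vocab = build_vocab(token_lists)
padded = [encode_and_pad(t, vocab, max_len=5) for t in token_lists]
```

Every row of `padded` has the same length, which is the property an embedding layer followed by convolutional and LSTM layers needs in order to process reviews in batches.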


Fake Detection of Online Reviews using Semi-Supervised and Supervised Learning

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit2063112 ◽

2020 ◽

pp. 428-431

Author(s):

Aneel Narayanapur ◽

Pavankumar Naik ◽

Suraksha G ◽

Pavitra S I ◽

Shruddha Mudigoudar ◽

...

Keyword(s):

Decision Making ◽

Text Mining ◽

Supervised Learning ◽

Online Reviews ◽

Product Reviews ◽

Data Set ◽

Or Groups

Online reviews have a great impact on today's business and commerce. Purchase decisions for online products depend mostly on reviews given by users. Hence, opportunistic individuals or groups try to manipulate product reviews for their own interests. This paper introduces some semi-supervised and supervised text mining models to detect fake online reviews and compares the efficiency of both techniques on a data set containing hotel reviews.


Opinion Spam Detection in Online Reviews

Journal of Information & Knowledge Management ◽

10.1142/s0219649217500368 ◽

2017 ◽

Vol 16 (04) ◽

pp. 1750036 ◽

Cited By ~ 6

Author(s):

Ajay Rastogi ◽

Monica Mehrotra

Keyword(s):

Personal Experience ◽

State Of The Art ◽

Online Reviews ◽

Machine Learning Techniques ◽

Spam Detection ◽

Sources Of Information ◽

Online Purchase ◽

Learning Techniques ◽

Opinion Spam ◽

Fake Reviews

Online reviews are the most valuable sources of information about customer opinions and are considered the pillars on which the reputation of an organisation is built. From a customer’s perspective, review information is key to making a proper decision regarding an online purchase. Reviews are generally considered an unbiased opinion of an individual’s personal experience with a product, but the underlying truth about these reviews tells a different story. Spammers exploit these review platforms illegally because of incentives involved in writing fake reviews, thereby trying to gain an advantage over competitors resulting in an explosive growth of opinion spamming. The present study analyses and categorises the available literature on opinion spamming according to three detection targets: (1) opinion spam, (2) opinion spammers, and (3) collusive opinion spammer groups. The study further highlights and divides opinion spamming into three types based on textual and linguistic, behavioural, and relational features. Moreover, several state-of-the-art machine-learning techniques for opinion spam detection have also been discussed in the study. It concludes with a summary of the research articles on opinion spam detection and some interesting results to assist researchers for further exploration of the domain.


Opinion Spam Detection based on Annotation Extension and Neural Networks

Computer and Information Science ◽

10.5539/cis.v12n2p87 ◽

2019 ◽

Vol 12 (2) ◽

pp. 87

Author(s):

Yuanchao Liu ◽

Bo Pang

Keyword(s):

Online Reviews ◽

Detection Performance ◽

Spam Detection ◽

Purchase Decisions ◽

Novel Approach ◽

Review Spam ◽

Opinion Spam ◽

Improved Performance ◽

Fake Reviews ◽

Potential Customers

Online reviews play an increasingly important role in the purchase decisions of potential customers. Driven by the desire to gain profit or publicity, spammers may be hired to write fake reviews that promote or demote the reputation of products or services. Correspondingly, opinion spam detection has attracted attention from both business and research communities in recent years. However, unlike other tasks such as news classification or blog classification, the existing review spam datasets are typically small because human annotation is expensive, which may limit detection performance even when excellent classifiers are available. We propose a novel approach in this paper to boost opinion spam detection performance by fully utilizing the existing small labelled dataset. We first design an annotation extension scheme that uses extra tree classifiers to train multiple estimators and then iteratively generates reliable labelled samples from unlabelled ones. Subsequently, we examine neural network scenarios on the newly extended dataset to learn the distributed representation. Experimental results suggest that the proposed approach generalizes better and performs better than state-of-the-art methods.
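The iterative idea behind annotation extension (train several estimators on the labelled data, then promote unlabelled samples to labelled ones only when the result looks reliable) can be sketched as follows. Two things here are assumptions rather than the paper's method: the extra tree classifiers are swapped for simple one-feature nearest-centroid classifiers so the example stays dependency-free, and "reliable" is taken to mean a unanimous vote of the committee. The data and all names are hypothetical.

```python
def train_centroid(labelled, feature):
    """Fit a 1-D nearest-centroid classifier on a single feature index."""
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x[feature]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: sums[y] / counts[y] for y in sums}
    # Predict the label whose centroid is closest on this feature.
    return lambda x: min(centroids, key=lambda y: abs(x[feature] - centroids[y]))


def extend_annotations(labelled, unlabelled, n_features, max_rounds=5):
    """Iteratively move unanimously labelled samples into the labelled set."""
    labelled, unlabelled = list(labelled), list(unlabelled)
    for _ in range(max_rounds):
        # Stand-in committee: one estimator per feature (not extra trees).
        committee = [train_centroid(labelled, f) for f in range(n_features)]
        still_unlabelled, added = [], 0
        for x in unlabelled:
            votes = {clf(x) for clf in committee}
            if len(votes) == 1:  # unanimous vote, treated as reliable
                labelled.append((x, votes.pop()))
                added += 1
            else:
                still_unlabelled.append(x)
        unlabelled = still_unlabelled
        if added == 0:  # nothing new was labelled, so stop early
            break
    return labelled, unlabelled


# Hypothetical two-feature samples.
seed = [((0.1, 0.2), "truthful"), ((0.9, 0.8), "fake")]
pool = [(0.2, 0.1), (0.8, 0.9), (0.2, 0.9)]
extended, remaining = extend_annotations(seed, pool, n_features=2)
```

In this toy run the two unambiguous samples are added to the labelled set, while the sample on which the estimators disagree stays unlabelled instead of being given a possibly wrong label.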


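MODEL UNTUK MENILAI KESIAPAN PEMDA “DKI JAKARTA” DALAM PENGEMBANGAN WILAYAH BERBASIS DUNIA USAHA [A Model for Assessing the Readiness of the “DKI Jakarta” Regional Government for Business-Based Regional Development]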

Jurnal Sains dan Teknologi Indonesia ◽

10.29122/jsti.v11i1.816 ◽

2012 ◽

Vol 11 (1) ◽

Author(s):

Puguh Suharso

Keyword(s):

Data Gathering ◽

Business Environment ◽

Standard Of Living ◽

Achievement Level ◽

Global Competitiveness ◽

The People ◽

Frame Work ◽

Using Data ◽

Level Of Effort ◽

Research Analysis

The era of globalisation is under way and is leading the people of the world into social interaction with one another and into economic competition. How well is the DKI Jakarta Government prepared to face global competitiveness, within the framework of raising the standard of living to that of an advanced society? Several indicators are used as criteria to measure the level of achievement of this effort, i.e. the infrastructure needed by entrepreneurs, such as: permits, taxation, the labour act, road traffic, customs and harbours, public infrastructure services, land use, security conditions, access to business finance, and business environment conditions. The research analysis was carried out using data gathered from the opinions of entrepreneurs in the operational area. The aim of the analysis is to measure, for each indicator, how well the DKI Jakarta Government is prepared to face global competitiveness. The research concludes that the DKI Jakarta Government is reasonably well prepared to face global competitiveness. The only weak indicator is taxation, which falls in the bad category, while the other indicators are reasonably good. The parameters behind the weak taxation result are the time needed for tax services, the number and variety of regional retributions, the number and variety of regional taxes, and the clarity of the tax procedure.


EXPRESS: Testing Theories of Goal Progress in Online Learning

Journal of Marketing Research ◽

10.1177/0022243721991100 ◽

2021 ◽

pp. 002224372199110

Author(s):

Joy Lu ◽

Eric T. Bradlow ◽

J. Wesley Hutchinson

Keyword(s):

Learning Styles ◽

Learning Style ◽

Detection System ◽

Traditional Education ◽

Performance Outcomes ◽

Learner Engagement ◽

Goal Progress ◽

Business Courses ◽

Using Data

Online educational platforms increasingly allow learners to consume content at their own pace with on-demand formats, in contrast to the synchronous content of traditional education. Thus, it is important to understand and model learner engagement within these environments. Using data from four business courses hosted on Coursera, we model learner behavior as a two-stage decision process, with the first stage determining across-day continuation versus quitting and the second stage determining within-day choices among lectures, quizzes, and breaks. By modeling the heterogeneity across learners pursuing lecture and quiz completion goals, we capture different patterns of consumption that correspond to extant theories of goal progress within an empirical field setting. We find that most individuals exhibit a learning style where lecture utility changes as an inverted-U-shaped function of current progress. Our model may also be used as an early detection system to anticipate changes in engagement and allows us to relate learning styles to final performance outcomes and enrollment in additional courses. Finally, we examine the role of quizzes in how consumption patterns vary across learners in different courses and between those who have paid or not paid for the option to earn a course certificate.


Deep learning for cephalometric landmark detection: systematic review and meta-analysis

Clinical Oral Investigations ◽

10.1007/s00784-021-03990-w ◽

2021 ◽

Author(s):

Falk Schwendicke ◽

Akhilanand Chaurasia ◽

Lubaina Arsiwala ◽

Jae-Hong Lee ◽

Karim Elhennawy ◽

...

Keyword(s):

Systematic Review ◽

Deep Learning ◽

Meta Analysis ◽

High Accuracy ◽

Risk Of Bias ◽

Automated Detection ◽

Reference Test ◽

Landmark Detection ◽

Future Studies ◽

Using Data

Objectives: Deep learning (DL) has been increasingly employed for automated landmark detection, e.g., for cephalometric purposes. We performed a systematic review and meta-analysis to assess the accuracy and underlying evidence for DL for cephalometric landmark detection on 2-D and 3-D radiographs. Methods: Diagnostic accuracy studies published in 2015–2020 in Medline/Embase/IEEE/arXiv and employing DL for cephalometric landmark detection were identified and extracted by two independent reviewers. Random-effects meta-analysis, subgroup analysis, and meta-regression were performed, and study quality was assessed using QUADAS-2. The review was registered (PROSPERO no. 227498). Data: From 321 identified records, 19 studies (published 2017–2020) were included, all employing convolutional neural networks, mainly on 2-D lateral radiographs (n=15), using data from publicly available datasets (n=12), and testing the detection of a mean of 30 (SD: 25; range: 7–93) landmarks. The reference test was established by two experts (n=11), 1 expert (n=4), 3 experts (n=3), and a set of annotators (n=1). Risk of bias was high, and applicability concerns were detected for most studies, mainly regarding the data selection and reference test conduct. Landmark prediction error centered around a 2-mm error threshold (mean: –0.581 mm; 95% CI: –1.264 to 0.102 mm). The proportion of landmarks detected within this 2-mm threshold was 0.799 (0.770 to 0.824). Conclusions: DL shows relatively high accuracy for detecting landmarks on cephalometric imagery. The overall body of evidence is consistent but suffers from high risk of bias. Demonstrating robustness and generalizability of DL for landmark detection is needed. Clinical significance: Existing DL models show consistent and largely high accuracy for automated detection of cephalometric landmarks. The majority of studies so far focused on 2-D imagery; data on 3-D imagery are sparse, but promising. Future studies should focus on demonstrating generalizability, robustness, and clinical usefulness of DL for this objective.
