A proposed system for opinion mining using machine learning, NLP and classifiers

Consumer reviews are now part of everyday life. Users read reviews before making a purchase, or save them to find the best product by comparing reviews. From the customer's viewpoint, reviews play a vital role in decisions about online purchases, which also gives spammers an incentive to write fake reviews that inflate or damage a product's reputation. Spammers exploit these platforms illegally for financial gain or to undermine competitors, which has caused explosive growth in opinion spamming through forged reviews. Existing studies analyse and categorize opinion spam detection according to three targets: opinion spam, opinion spammers, and collusive opinion spammer groups, so that false opinions can be avoided. Opinion spamming is further divided into three types based on textual and linguistic, behavioral, and relational features. The motivation behind this work is to study the dynamics of spam diffusion and to extract the latent features that fuel the diffusion process. User-based and content-based features are used to categorize content as spam or non-spam. The contributions of this work are building a dataset that serves as the ground truth for analyzing how fraudulent/genuine and spam/non-spam information diffuses, and analyzing the effect of topics on the diffusibility of spam and non-spam information. The paper carries out an in-depth analysis of Twitter spam diffusion.

Opinion Spam Detection in Online Reviews

Journal of Information & Knowledge Management ◽

10.1142/s0219649217500368 ◽

2017 ◽

Vol 16 (04) ◽

pp. 1750036 ◽

Cited By ~ 6

Author(s):

Ajay Rastogi ◽

Monica Mehrotra

Keyword(s):

Personal Experience ◽

State Of The Art ◽

Online Reviews ◽

Machine Learning Techniques ◽

Spam Detection ◽

Sources Of Information ◽

Online Purchase ◽

Learning Techniques ◽

Opinion Spam ◽

Fake Reviews

Online reviews are the most valuable sources of information about customer opinions and are considered the pillars on which the reputation of an organisation is built. From a customer’s perspective, review information is key to making a proper decision regarding an online purchase. Reviews are generally considered an unbiased opinion of an individual’s personal experience with a product, but the underlying truth about these reviews tells a different story. Spammers exploit these review platforms illegally because of the incentives involved in writing fake reviews, thereby trying to gain an advantage over competitors, which has resulted in an explosive growth of opinion spamming. The present study analyses and categorises the available literature on opinion spamming according to three detection targets: (1) opinion spam, (2) opinion spammers, and (3) collusive opinion spammer groups. The study further divides opinion spamming into three types based on textual and linguistic, behavioural, and relational features. Moreover, several state-of-the-art machine-learning techniques for opinion spam detection are discussed in the study. It concludes with a summary of the research articles on opinion spam detection and some interesting results to assist researchers in further exploration of the domain.

Experience

Journal of Data and Information Quality ◽

10.1145/3439307 ◽

2021 ◽

Vol 13 (1) ◽

pp. 1-16

Author(s):

Michela Fazzolari ◽

Francesco Buccafurri ◽

Gianluca Lax ◽

Marinella Petrocchi

Keyword(s):

Relative Frequency ◽

Service Providers ◽

Real Data ◽

Online Reviews ◽

Purchase Decision ◽

The Past ◽

Supervised Classifiers ◽

Relative Frequency Distribution ◽

Opinion Spam ◽

Fake Reviews

Over the past few years, online reviews have become very important, since they can influence the purchase decisions of consumers and the reputation of businesses. Therefore, the practice of writing fake reviews can have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially approaches based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that the distributional features improve classifier performance.
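
The abstract does not spell out how the features are re-engineered, so the following is only a minimal sketch of one plausible reading: each raw feature value is replaced by its cumulative relative frequency, i.e. the fraction of observations less than or equal to it. The function name and the review-length feature are hypothetical and used purely for illustration.

```python
from bisect import bisect_right


def cumulative_relative_frequency(values):
    """Return a function mapping x to the fraction of `values` that are <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("values must not be empty")

    def crf(x):
        # bisect_right gives the number of observations that are <= x.
        return bisect_right(ordered, x) / n

    return crf


# Hypothetical feature: review length in words for five reviews.
review_lengths = [12, 40, 40, 95, 300]
crf = cumulative_relative_frequency(review_lengths)

# Replace each raw value with its position in the empirical distribution.
distributional_feature = [crf(v) for v in review_lengths]
print(distributional_feature)
```

A transformation of this kind maps every feature onto the range (0, 1] regardless of its original scale, which may be one reason a distributional encoding helps a supervised classifier.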

INTEGRATING FUZZY C-MEANS AND MAHALANOBIS METRIC CLASSIFICATION FOR EXUDATE DETECTION IN COLOR FUNDUS IMAGING

Journal of Mechanics in Medicine and Biology ◽

10.1142/s0219519415500852 ◽

2015 ◽

Vol 15 (05) ◽

pp. 1550085 ◽

Cited By ~ 2

Author(s):

MADHURI TASGAONKAR ◽

MADHURI KHAMBETE

Keyword(s):

Diabetic Retinopathy ◽

Early Detection ◽

Ground Truth ◽

Vital Role ◽

Detection Accuracy ◽

Fuzzy C Means ◽

Fundus Imaging ◽

Imaging Detection ◽

Retinal Structure

Diabetes affects the retinal structure of a diabetic patient by generating various lesions. Early detection of these lesions can prevent loss of vision. Fundus imaging can make automated detection readily accessible to the masses. Detection of exudates is significant in diabetic retinopathy (DR), as they are early signs and can cause blindness. Finding the exact location and the correct number of exudates plays a vital role in the overall treatment of a patient. This paper presents an algorithm for automatic detection of exudates for DR. The algorithm combines the advantages of supervised and unsupervised techniques. It uses fuzzy C-means (FCM) segmentation at the coarse level and the Mahalanobis metric for finer classification of the segmented pixels. The Mahalanobis criterion gives weight to the most relevant features and thus proves to be a better classifier. The results are validated using the DIARETDB0 and DIARETDB1 databases and the ground truth provided with them. This evaluation yielded 95.77% detection accuracy.
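
To illustrate the finer classification step only, here is a minimal sketch of nearest-class assignment under the Mahalanobis distance. It is not the authors' implementation: the FCM segmentation stage is omitted, the two-dimensional (intensity, contrast) features and all sample values are hypothetical, and the helper names are assumptions.

```python
import math


def mean_and_cov(samples):
    """Mean vector and 2x2 sample covariance of 2-D feature vectors."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from a class (mean, cov)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # d^2 = diff^T * inverse(cov) * diff, with the 2x2 inverse written out.
    squared = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(squared)


# Hypothetical labelled training pixels: (intensity, contrast).
training = {
    "exudate": [(0.90, 0.80), (0.80, 0.70), (0.85, 0.90), (0.95, 0.75)],
    "background": [(0.20, 0.10), (0.30, 0.20), (0.25, 0.05), (0.10, 0.15)],
}
models = {label: mean_and_cov(samples) for label, samples in training.items()}


def classify(pixel):
    """Assign the pixel to the class with the smallest Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis(pixel, *models[label]))


print(classify((0.88, 0.80)), classify((0.20, 0.10)))
```

Unlike the Euclidean distance, the Mahalanobis distance rescales each direction by the class covariance, so features with low within-class variance count for more in the decision.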

Classification of Fake Product Ratings Using a Timeline Based Approach

International Journal of Business Administration and Management Research ◽

10.24178/ijbamr.2017.3.2.12 ◽

2017 ◽

Vol 3 (2) ◽

pp. 12 ◽

Cited By ~ 1

Author(s):

Neha Thomas ◽

Susan Elias

Keyword(s):

Language Processing ◽

Opinion Mining ◽

Optimal Point ◽

Linear Classifiers ◽

Wide Range ◽

Text Content ◽

Classification Tool ◽

Fake Reviews ◽

Product Ratings

Detection of fake reviews and reviewers is currently a challenging problem in cyberspace, primarily because the methods used to fake reviews change dynamically. Several aspects must be considered when analyzing reviews to classify them effectively as genuine or fake. Sentiment analysis, opinion mining and intent mining are fields of research that try to accomplish this goal through Natural Language Processing of the text content of the review. In this paper, an approach that uses the review ratings evaluated along a timeline is presented. An Amazon dataset comprising ratings for a wide range of products was used for the analysis presented here. The analysis of the ratings was carried out for an electronic product over a period of six years. The computed average rating helps to identify linear classifiers that define solution boundaries within the data space. This enables product-specific classification of review ratings, and suitable recommendations can also be generated automatically. The paper explains a methodology to evaluate the average product ratings over time and presents the research outcomes using a novel classification tool. The proposed approach helps to determine the optimal point to distinguish between fake and genuine ratings for each product. Index Terms: Fake Reviews, Fake Ratings, Product Ratings, Online Shopping, Amazon Dataset.
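
The abstract does not give the classification rule, so the following is only a rough sketch of the general idea of evaluating ratings along a timeline: each rating is compared with the average of all earlier ratings for the same product, and large deviations are flagged. The threshold, the function names and the sample ratings are all hypothetical assumptions, not values from the paper.

```python
from datetime import date


def deviations_from_prior_average(ratings):
    """For each rating after the first, return (day, stars, prior average)."""
    ordered = sorted(ratings)  # chronological order by date
    results = []
    total = 0.0
    for count, (day, stars) in enumerate(ordered):
        if count > 0:
            results.append((day, stars, total / count))
        total += stars
    return results


def flag_suspicious(ratings, threshold):
    """Flag ratings further than `threshold` stars from the prior average."""
    return [
        (day, stars)
        for day, stars, prior_avg in deviations_from_prior_average(ratings)
        if abs(stars - prior_avg) > threshold
    ]


# Hypothetical ratings for a single product over several years.
ratings = [
    (date(2012, 1, 5), 4),
    (date(2012, 3, 9), 5),
    (date(2013, 6, 1), 4),
    (date(2014, 2, 14), 1),
    (date(2015, 8, 30), 5),
]

flagged = flag_suspicious(ratings, threshold=2.0)
print(flagged)
```

In this sketch the threshold plays the role of a fixed decision boundary; choosing it per product is the kind of tuning the abstract describes as finding an optimal point.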

Spam Diffusion in Social Networking Media using Latent Dirichlet Allocation

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i7898.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 881-885

Keyword(s):

Online Social Networks ◽

Topic Modeling ◽

Information Diffusion ◽

Latent Dirichlet Allocation ◽

Good Accuracy ◽

Ground Truth ◽

Online Social Media ◽

Diffusion Dynamics ◽

Dirichlet Allocation

Just as web spam has been a major threat to almost every aspect of the World Wide Web, social spam, especially in information diffusion, poses a serious threat to the utility of online social media. To combat this challenge, the significance and impact of such entities and content should be analyzed critically. To address this issue, this work uses Twitter as a case study, models the content through topic modeling, and couples it with user-oriented features to achieve good accuracy. Latent Dirichlet Allocation (LDA), a widely used topic modeling technique, is applied to capture the latent topics in the tweet documents. The contribution of this work is twofold: constructing a dataset that serves as the ground truth for analyzing the diffusion dynamics of spam and non-spam information, and analyzing the effect of topics on diffusibility. Exhaustive experiments clearly reveal the variation in topics shared by spam and non-spam tweets. The rise in popularity of online social networks (OSNs) attracts not only legitimate users but also spammers. Legitimate users use OSN services for legitimate purposes, i.e., maintaining relationships with friends and colleagues, sharing information of interest, and increasing the reach of their business through advertising.
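
As a rough illustration of how LDA extracts latent topics from short documents, here is a minimal collapsed Gibbs sampler written with the standard library only. It is a teaching sketch, not the pipeline used in the paper: the tokenized tweets, the hyperparameters `alpha` and `beta`, the iteration count and the function name are all assumptions.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return (vocab, topic-word probs)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]      # topic counts per document
    topic_word = [[0] * V for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                     # total words per topic
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + V * beta) for v in range(V)]
        for k in range(num_topics)
    ]
    return vocab, phi


# Hypothetical, already tokenized tweets.
tweets = [
    ["win", "free", "prize", "click", "link"],
    ["free", "prize", "click", "now"],
    ["meeting", "friends", "dinner", "tonight"],
    ["dinner", "friends", "weekend", "plans"],
]
vocab, phi = lda_gibbs(tweets, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(k, [word for _, word in top])
```

Each row of `phi` is a probability distribution over the vocabulary for one topic; comparing the rows learned from spam and non-spam tweets is the kind of analysis the abstract refers to.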

Mining Sentiment Using Conversation Ontology

Advancing Information Management through Semantic Web Concepts and Ontologies ◽

10.4018/978-1-4666-2494-8.ch016 ◽

2013 ◽

pp. 302-315

Author(s):

Priti Srinivas Sajja ◽

Rajendra Akerkar

Keyword(s):

Opinion Mining ◽

Vital Role ◽

Statistical Properties ◽

Text Documents ◽

Information Monitoring ◽

Sentiment Mining ◽

Numeric Data ◽

Structural Aspects

The research in the field of opinion mining has been ongoing for several years, and many models and techniques have been proposed. One of the techniques that can address the need for automated information monitoring to help to identify the trends and patterns that matter is sentiment mining. Existing approaches enable the analysis of a large number of text documents, mainly based on their statistical properties and possibly combined with numeric data. Most approaches are limited to simple word counts and largely ignore semantic and structural aspects of content. Conversation plays a vital role in expressing and promoting an opinion. In this chapter, the authors discuss the concept of ontology and propose a framework that allows the incorporation of information on conversation structure in the models for sentiment discovery in text.

A survey on bankers’ perception of corporate social responsibility in India

Social Responsibility Journal ◽

10.1108/srj-11-2016-0198 ◽

2019 ◽

Vol 16 (2) ◽

pp. 225-253

Author(s):

Suvendu Kumar Pratihari ◽

Shigufta Hena Uzma

Keyword(s):

Corporate Social Responsibility ◽

Social Responsibility ◽

Banking Sector ◽

Vital Role ◽

Sustainable Growth ◽

Competitive Advantages ◽

Content Type ◽

Depth Analysis ◽

Corporate Social

Purpose: The purpose of this paper is to understand the perception of bankers towards an integrated approach to corporate social responsibility (CSR) initiatives as a strategic way of achieving sustainable growth of the banking sector. The paper additionally provides insights into different CSR initiatives and their implementation process in the context of scheduled commercial banks (SCBs) of India. Design/methodology/approach: The study is exploratory and adopts the qualitative approach of primary research methodology, using a non-random stratified sampling method. The localist approach of the face-to-face interview was applied to collect data from 26 elite-class respondents from 13 SCBs. The interview method was semi-structured and open-ended. Conformity, trustworthiness, credibility, transferability and dependability tests of the study ensured the quality of the data. Findings: The study reveals that bankers perceive CSR as a moral obligation for the benefit of society, beyond regular banking operations. Further, the study finds that CSR initiatives play a vital role in establishing the bank's image, brand and reputation, as well as in building a strong bond of trust between the employees and the bank management. Besides, CSR activities help to cultivate a better culture by improving the quality of customer service to achieve competitive advantages. Research limitations/implications: The findings of the study represent a significant contribution to CSR theory from the interface of banking and society. Significantly, the results confirm that CSR initiatives play a vital role in building trust and minimising the gap between the employees and the management of the bank. Banks can increase their acceptance in society and achieve competitive advantage by integrating CSR objectives with business objectives to strengthen the corporate personality and brand. Practical implications: The study will help practitioners to develop the social identity of their firm to achieve competitive advantages in the long run. Bankers can channel their limited resources while planning, designing and implementing different CSR activities in line with the overall goal of the bank in a cost-effective way. The study is confined to public and private SCBs and limited to the geographical scope of one state in India. Therefore, further exploration may be carried out by considering other banks and geographic regions in India and different cross-cultural settings. Originality/value: The originality of the study lies in the in-depth analysis and quality check of the data. The results can contribute significant value to the qualitative method of conducting research.

Towards online anti-opinion spam: Spotting fake reviews from the review sequence

2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014) ◽

10.1109/asonam.2014.6921594 ◽

2014 ◽

Cited By ~ 28

Author(s):

Yuming Lin ◽

Tao Zhu ◽

Hao Wu ◽

Jingwei Zhang ◽

Xiaoling Wang ◽

...

Keyword(s):

Opinion Spam ◽

Fake Reviews

Centrality measure based approach for detection of malicious nodes in twitter social network

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.5.21147 ◽

2018 ◽

Vol 7 (4.5) ◽

pp. 518 ◽

Cited By ~ 1

Author(s):

Krishna Das ◽

Smriti Kumar Sinha

Keyword(s):

Social Networks ◽

Social Network ◽

Information Diffusion ◽

Opinion Mining ◽

Theoretical Method ◽

Short Paper ◽

Malicious Nodes ◽

Data Set ◽

Centrality Measure ◽

The Social

In this short paper, a mathematical approach based on a network structural measure, the centrality measure, is used to detect malicious nodes in the Twitter social network. One of the objectives in analysing social networks is to detect malicious nodes that show anomalous behaviour. There are different approaches to anomaly detection in social networks, such as opinion mining methods, behavioural methods and network structural approaches. The centrality measure, a graph-theoretical method related to social network structure, can be used to categorize a node either as popular and influential or as non-influential and anomalous. Using this approach, we analyzed the Twitter social network to remove anomalous nodes from the nodes-edges Twitter data set. Removing such nodes, which are not important for information diffusion, makes the social network cleaner and speeds up information propagation.
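
The abstract does not say which centrality measure is used, so the sketch below assumes degree centrality, the simplest option, and prunes nodes that fall below a cut-off. The threshold, the function names and the toy edge list are hypothetical.

```python
def degree_centrality(edges):
    """Degree centrality of each node in an undirected graph: degree / (n - 1)."""
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def prune_low_centrality(edges, threshold):
    """Drop every edge that touches a node with centrality below `threshold`."""
    centrality = degree_centrality(edges)
    keep = {node for node, score in centrality.items() if score >= threshold}
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Hypothetical nodes-edges data set (undirected follower links).
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

centrality = degree_centrality(edges)
cleaned = prune_low_centrality(edges, threshold=0.3)
print(centrality)
print(cleaned)
```

Other measures such as betweenness or closeness centrality could be substituted for the scoring function without changing the pruning step.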

Stenosis Detection Algorithm for Coronary Angiograms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8743.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 2362-2367

Keyword(s):

Coronary Artery ◽

Coronary Vessels ◽

Ground Truth ◽

Detection Algorithm ◽

Vital Role ◽

Coronary Artery Segment ◽

Blob Analysis ◽

Artery Segment ◽

Stenosis Detection ◽

Background Image

Automatic detection of blocks in angiographic images is a challenging task. Features such as the contrast and gradient of the vessels and the background image play a vital role in detecting blocks in X-ray angiograms. Currently, doctors identify blocks in the coronary vessels manually. An automated tool is needed to identify blocks in the blood vessels of the heart and assist doctors in diagnosis. The spatiotemporal nature of the angiography sequences is used to isolate the coronary artery tree. The coronary artery segment is tracked frame by frame, and the arterial width surface is detected in each image. Stenosis is identified using the persistent minima of the coronary vessel surface and blob analysis. The proposed method was tested on a dataset of 42 patients. Its performance was evaluated by comparing the blocks identified by the algorithm with hand-labelled ground truth images provided by experts. The proposed method achieves an accuracy of 95.5% on 42 patients with a total of 60 image runs.
