A proposed system for opinion mining using machine learning, NLP and classifiers

Author(s):  
Poonam Tanwar ◽  
Priyanka Rai

Consumer reviews are now part of everyday life. Users read reviews before a purchase, or save them to find the best product by comparing reviews. From the customer's viewpoint, reviews play a vital role in deciding on an online purchase; they also give spammers an opening to write fake reviews that can inflate or defame the reputation of any product. Spammers exploit these platforms illegally because financial incentives are involved in writing fake reviews: they seek financial gain or an advantage over a competitor, which has caused explosive growth in opinion spamming, the writing of forged or fake reviews. Existing studies analyse and categorize opinion spamming according to three detection targets (opinion spam, opinion spammers, and collusive opinion spammer groups) so that false opinions can be avoided. Opinion spamming is further divided into three types based on textual and linguistic, behavioral, and relational features. The motivation behind this work is to study the dynamics of spam diffusion and extract the latent features that fuel the diffusion process. User-based and content-based features are used to categorize content as spam or non-spam. The contributions of this work are building a dataset that serves as the ground truth for analyzing how fraudulent and genuine (spam and non-spam) information diffuses, and analyzing the effect of topics on the diffusibility of spam and non-spam information. The paper carries out an in-depth analysis of Twitter spam diffusion.

2017 ◽  
Vol 16 (04) ◽  
pp. 1750036 ◽  
Author(s):  
Ajay Rastogi ◽  
Monica Mehrotra

Online reviews are the most valuable sources of information about customer opinions and are considered the pillars on which the reputation of an organisation is built. From a customer’s perspective, review information is key to making a proper decision regarding an online purchase. Reviews are generally considered an unbiased opinion of an individual’s personal experience with a product, but the underlying truth about these reviews tells a different story. Spammers exploit these review platforms illegally because of incentives involved in writing fake reviews, thereby trying to gain an advantage over competitors resulting in an explosive growth of opinion spamming. The present study analyses and categorises the available literature on opinion spamming according to three detection targets: (1) opinion spam, (2) opinion spammers, and (3) collusive opinion spammer groups. The study further highlights and divides opinion spamming into three types based on textual and linguistic, behavioural, and relational features. Moreover, several state-of-the-art machine-learning techniques for opinion spam detection have also been discussed in the study. It concludes with a summary of the research articles on opinion spam detection and some interesting results to assist researchers for further exploration of the domain.


2021 ◽  
Vol 13 (1) ◽  
pp. 1-16
Author(s):  
Michela Fazzolari ◽  
Francesco Buccafurri ◽  
Gianluca Lax ◽  
Marinella Petrocchi

Over the past few years, online reviews have become very important, since they can influence the purchase decisions of consumers and the reputation of businesses. Therefore, the practice of writing fake reviews can have severe consequences for customers and service providers. Various approaches have been proposed for detecting opinion spam in online reviews, especially ones based on supervised classifiers. In this contribution, we start from a set of effective features used for classifying opinion spam and re-engineer them by considering the Cumulative Relative Frequency Distribution of each feature. Through an experimental evaluation carried out on real data from Yelp.com, we show that the distributional features improve the performance of the classifiers.
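To make the distributional re-engineering concrete, here is a minimal sketch that replaces a raw feature value with its cumulative relative frequency, that is, the fraction of training observations less than or equal to it. The abstract does not give the authors' exact formulation, so this uses the standard empirical definition; the function name `crfd_transform` and the example feature (review length in words) are hypothetical.

```python
from bisect import bisect_right


def crfd_transform(train_values, values):
    """Map each value to the fraction of training observations <= that value.

    This is the standard empirical cumulative relative frequency; it is an
    assumption that the paper's distributional features are built this way.
    """
    ordered = sorted(train_values)
    n = len(ordered)
    if n == 0:
        raise ValueError("train_values must not be empty")
    # bisect_right counts how many training values are <= v.
    return [bisect_right(ordered, v) / n for v in values]


# Hypothetical feature: review length in words for five training reviews.
lengths = [12, 40, 40, 85, 300]
print(crfd_transform(lengths, [40, 100]))  # [0.6, 0.8]
```

One effect of such a transform is that every feature ends up on the same 0 to 1 scale, expressed relative to how the feature is distributed in the training data rather than in its raw units.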


2015 ◽  
Vol 15 (05) ◽  
pp. 1550085 ◽  
Author(s):  
MADHURI TASGAONKAR ◽  
MADHURI KHAMBETE

Diabetes affects the retinal structure of a diabetic patient by generating various lesions. Early detection of these lesions can prevent loss of vision. Fundus imaging can make automated detection readily available to the masses. Detection of exudates is significant in diabetic retinopathy (DR), as they are early signs of the disease and can cause blindness. Finding the exact location as well as the correct number of exudates plays a vital role in the overall treatment of a patient. This paper presents an algorithm for automatic detection of exudates for DR. The algorithm combines the advantages of supervised and unsupervised techniques. It uses fuzzy C-means (FCM) segmentation at the coarse level and the Mahalanobis metric for finer classification of the segmented pixels. The Mahalanobis criterion gives significance to the most relevant features and thus proves to be a better classifier. The results are validated using the DIARETDB0 and DIARETDB1 databases and the ground truth provided with them. This evaluation yielded 95.77% detection accuracy.
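The Mahalanobis step can be illustrated with a small sketch. It covers only the distance computation, not the FCM segmentation, and it assumes two features per pixel so that the covariance matrix can be inverted by hand; the abstract does not list the paper's actual features, so the feature pairs and function names below are hypothetical.

```python
import math


def mean_and_cov(points):
    """Return the mean and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of a 2-D point x from a class (mean, cov)."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(max(q, 0.0))


# Unit covariance reduces to the Euclidean distance.
print(mahalanobis((3.0, 4.0), (0.0, 0.0), (1.0, 0.0, 1.0)))  # 5.0
```

In a classifier built this way, a candidate pixel would be assigned to whichever class (for example, exudate or background) has the smaller distance, so features with high variance within a class weigh less in the decision.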


Author(s):  
Neha Thomas ◽  
Susan Elias

Abstract: Detection of fake reviews and reviewers is currently a challenging problem in cyberspace. It is challenging primarily because the methods used to fake reviews keep changing. Several aspects must be considered when analyzing reviews in order to classify them effectively as genuine or fake. Sentiment analysis, opinion mining, and intent mining are fields of research that try to accomplish this goal through Natural Language Processing of the text content of the review. In this paper, an approach that uses the review ratings evaluated along a timeline is presented. An Amazon dataset comprising ratings for a wide range of products was used for the analysis presented here. The analysis of the ratings was carried out for an electronic product over a period of six years. The computed average rating helps to identify linear classifiers that define solution boundaries within the data space. This enables a product-specific classification of review ratings, and suitable recommendations can also be generated automatically. The paper explains a methodology to evaluate the average product ratings over time and presents the research outcomes using a novel classification tool. The proposed approach helps to determine the optimal point for distinguishing between fake and genuine ratings for each product.
Index Terms: Fake reviews, Fake Ratings, Product Ratings, Online Shopping, Amazon Dataset.
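As a rough illustration of evaluating ratings along a timeline, the sketch below averages star ratings per year and flags ratings that sit far from that year's average. The abstract does not describe the classification rule in detail, so the fixed `max_gap` threshold, the function names, and the sample data are assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def average_rating_by_year(ratings):
    """ratings: list of (year, stars). Returns {year: mean stars} in year order."""
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum of stars, count]
    for year, stars in ratings:
        totals[year][0] += stars
        totals[year][1] += 1
    return {year: s / c for year, (s, c) in sorted(totals.items())}


def flag_outliers(ratings, max_gap=2.0):
    """Return ratings whose distance from their year's average exceeds max_gap.

    max_gap is a hypothetical cut-off; the paper determines a product-specific
    boundary instead of a fixed one.
    """
    averages = average_rating_by_year(ratings)
    return [(y, s) for y, s in ratings if abs(s - averages[y]) > max_gap]


# Hypothetical ratings for one product.
sample = [(2010, 4), (2010, 5), (2010, 3),
          (2011, 5), (2011, 5), (2011, 5), (2011, 1)]
print(average_rating_by_year(sample))  # {2010: 4.0, 2011: 4.0}
print(flag_outliers(sample))           # [(2011, 1)]
```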


Just as web spam has been a major threat to almost every aspect of the current World Wide Web, social spam, especially in information diffusion, has become a serious threat to the utility of online social media. To combat this challenge, the significance and impact of such entities and content should be analyzed critically. To address this issue, this work used Twitter as a case study, modeled the content of the information through topic modeling, and coupled it with user-oriented features to handle the problem with good accuracy. Latent Dirichlet Allocation (LDA), a widely used topic modeling technique, is applied to capture the latent topics in the tweet documents. The major contribution of this work is twofold: constructing a dataset that serves as the ground truth for analyzing the diffusion dynamics of spam and non-spam information, and analyzing the effects of topics on diffusibility. Exhaustive experiments clearly reveal the variation in topics shared by spam and non-spam tweets. The rise in popularity of online social networks (OSNs) attracts not only legitimate users but also spammers. Legitimate users use the services of OSNs for a good purpose, i.e., maintaining relations with friends/colleagues, sharing information of interest, increasing the reach of their business through advertisings


Author(s):  
Priti Srinivas Sajja ◽  
Rajendra Akerkar

The research in the field of opinion mining has been ongoing for several years, and many models and techniques have been proposed. One of the techniques that can address the need for automated information monitoring to help to identify the trends and patterns that matter is sentiment mining. Existing approaches enable the analysis of a large number of text documents, mainly based on their statistical properties and possibly combined with numeric data. Most approaches are limited to simple word counts and largely ignore semantic and structural aspects of content. Conversation plays a vital role in expressing and promoting an opinion. In this chapter, the authors discuss the concept of ontology and propose a framework that allows the incorporation of information on conversation structure in the models for sentiment discovery in text.


2019 ◽  
Vol 16 (2) ◽  
pp. 225-253
Author(s):  
Suvendu Kumar Pratihari ◽  
Shigufta Hena Uzma

Purpose: The purpose of this paper is to understand bankers' perception of an integrated approach to corporate social responsibility (CSR) initiatives as a strategic way of achieving sustainable growth of the banking sector. The paper additionally provides insights into different CSR initiatives and their implementation process in the context of scheduled commercial banks (SCBs) of India.

Design/methodology/approach: The study is exploratory and follows the qualitative approach of primary research methodology, adopting a non-random stratified sampling method. The localist approach of the face-to-face interview was applied to collect data from 26 elite-class respondents from 13 SCBs. The interviews were semi-structured and open-ended. Tests of conformity, trustworthiness, credibility, transferability, and dependability ensured the quality of the data.

Findings: The study reveals that bankers perceive CSR as a moral obligation for the benefit of society, beyond regular banking operations. Further, the study finds that CSR initiatives play a vital role in establishing the bank's image, brand, and reputation, as well as in building a strong bond of trust between the employees and the bank management. In addition, CSR activities help cultivate a better culture by improving the quality of customer service to achieve competitive advantage.

Research limitations/implications: The findings of the study represent a significant contribution to CSR theory from the interface of banking and society. Significantly, the results confirm that CSR initiatives play a vital role in building trust and minimising the gap between the employees and the management of the bank. Banks can increase their acceptance in society and achieve competitive advantage by integrating CSR objectives with business objectives to strengthen the corporate personality and brand.

Practical implications: The study will help practitioners develop the social identity of their firm to achieve competitive advantage in the long run. Bankers can channel their limited resources when planning, designing, and implementing different CSR activities in line with the overall goal of the bank in a cost-effective way. The study is confined to public and private SCBs and limited to the geographical scope of one state in India. Therefore, further exploration may be carried out by considering other banks and geographic regions in India and different cross-cultural settings.

Originality/value: The originality of the study lies in the in-depth analysis and quality check of the data. The results can contribute significant value to the qualitative method of conducting research.


2018 ◽  
Vol 7 (4.5) ◽  
pp. 518 ◽  
Author(s):  
Krishna Das ◽  
Smriti Kumar Sinha

In this short paper, a mathematical approach based on centrality, a structural measure of networks, is used to detect malicious nodes in the Twitter social network. One of the objectives in analysing social networks is to detect malicious nodes that show anomalous behaviour. There are different approaches to anomaly detection in social networks, such as opinion mining methods, behavioural methods, and network structural approaches. A centrality measure, a graph-theoretical method related to social network structure, can be used to categorize a node either as popular and influential or as non-influential and anomalous. Using this approach, we have analyzed the Twitter social network to remove anomalous nodes from the nodes-edges Twitter data set. Removing such nodes, which are not important for information diffusion, makes the social network cleaner and speeds up information propagation.
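A centrality-based filter of this kind can be sketched as follows. The abstract does not say which centrality measure the authors use, so this example assumes degree centrality (a node's number of neighbours divided by the maximum possible) on an undirected edge list; the threshold, function names, and toy graph are hypothetical.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbours = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbours[u].add(v)
        neighbours[v].add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def low_centrality_nodes(edges, threshold):
    """Nodes whose degree centrality falls below a hypothetical threshold."""
    scores = degree_centrality(edges)
    return sorted(node for node, c in scores.items() if c < threshold)


# Toy follower graph: "a" is well connected, "e" hangs off a single edge.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]
print(degree_centrality(edges)["a"])     # 0.75
print(low_centrality_nodes(edges, 0.3))  # ['e']
```

Other measures such as betweenness or closeness centrality would rank nodes differently, so the choice of measure and threshold decides which nodes are treated as non-influential.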


Automatic detection of blocks in angiographic images is a challenging task. Features such as the contrast and gradient of the vessels and the background image play a vital role in detecting blocks in X-ray angiograms. At present, doctors identify blocks in the coronary vessels manually. An automation tool is needed to identify blocks in the blood vessels of the heart and help doctors in the diagnostic process. The spatiotemporal nature of the angiography sequences is used to isolate the coronary artery tree. The coronary artery segment is tracked frame by frame, and the arterial width surface is detected in each image. Stenosis is identified using the persistent minima of the coronary vessel surface and blob analysis. The proposed method was tested on a dataset of 42 patients. Its performance was evaluated by comparing the blocks identified by the algorithm with hand-labelled ground truth images provided by experts. The proposed method achieves an accuracy of 95.5% on 42 patients with a total of 60 image runs.

