A Method of Product Selection Based on Online Reviews

2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Xia Liang ◽  
Jie Guo ◽  
Yan Sun ◽  
Xiaoxiao Liu

With the rapid development of information technology and the market economy, global e-commerce platforms have grown quickly, and online reviews are now widely available on them as a record of customers' experience with products. When ranking alternative products based on online reviews, making full use of the information in the reviews to represent their sentiment analysis results is an important prerequisite for decision analysis. To this end, we propose a method for measuring the time utility and support utility of online reviews, followed by a method for representing the sentiment analysis results in the form of linguistic distributions. In addition, because the attributes and their weights are unknown, we propose a method for extracting product attributes from online reviews using the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm, and we determine the objective weights of the attributes through the Criteria Importance Through Intercriteria Correlation (CRITIC) method. To highlight the differences between the alternatives, the roulette wheel selection algorithm is first used to randomly select product attributes. The alternative products can then be ranked by the extended Multi-Attributive Border Approximation area Comparison (MABAC) method with mixed information. Finally, we illustrate the applicability of the proposed method through a case study of selecting a 5G mobile phone and a simulation experiment.
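As a rough illustration of the CRITIC step mentioned in this abstract, the sketch below computes objective attribute weights from a small score matrix. It follows the commonly cited textbook form of CRITIC (min-max normalisation, then standard deviation multiplied by the summed conflict 1 - r), which may differ from the paper's exact formulation. The function name `critic_weights` and the numbers in `scores` are hypothetical, all attributes are assumed to be benefit-type, and every column is assumed to vary.

```python
from math import sqrt


def critic_weights(matrix):
    """Return CRITIC-style objective weights for the columns of `matrix`.

    `matrix` is a list of rows, one per alternative. Assumptions: all
    attributes are benefit-type and every column has at least two
    distinct values.
    """
    n_cols = len(matrix[0])
    cols = [[row[j] for row in matrix] for j in range(n_cols)]

    # Min-max normalise each attribute to [0, 1].
    norm = []
    for col in cols:
        lo, hi = min(col), max(col)
        norm.append([(x - lo) / (hi - lo) for x in col])

    def mean(xs):
        return sum(xs) / len(xs)

    def std(xs):
        m = mean(xs)
        return sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

    def pearson(xs, ys):
        mx, my = mean(xs), mean(ys)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        var_x = sum((x - mx) ** 2 for x in xs)
        var_y = sum((y - my) ** 2 for y in ys)
        return cov / sqrt(var_x * var_y)

    # Information content = contrast (std) * conflict (sum of 1 - r).
    info = []
    for j in range(n_cols):
        conflict = sum(1 - pearson(norm[j], norm[k]) for k in range(n_cols))
        info.append(std(norm[j]) * conflict)

    total = sum(info)
    return [c / total for c in info]


# Hypothetical sentiment scores: 4 products (rows) x 3 attributes (columns).
scores = [
    [0.7, 0.2, 0.9],
    [0.4, 0.8, 0.5],
    [0.9, 0.5, 0.1],
    [0.3, 0.6, 0.6],
]
weights = critic_weights(scores)
print([round(w, 3) for w in weights])
```

Attributes that vary more across products and correlate less with the other attributes receive larger weights, which is the intuition behind calling the weights objective.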

2021 ◽  
Vol 143 (8) ◽  
Author(s):  
Junegak Joung ◽  
Harrison M. Kim

The importance–performance analysis (IPA) is a widely used technique to guide strategic planning for the improvement of customer satisfaction. Compared with surveys, large numbers of online reviews can be collected easily and at lower cost, which makes them a promising source for the IPA. This paper proposes an approach for conducting the IPA from online reviews for product design. Product attributes are first identified from online reviews by latent Dirichlet allocation. The performance of the identified attributes is then estimated by the aspect-based sentiment analysis of IBM Watson. Finally, the importance of the identified attributes is estimated by evaluating the effect of the sentiments of each product attribute on the overall rating using an explainable deep neural network. A Shapley additive explanation-based method is proposed to estimate the importance values of product attributes with low variance by combining the effects of the input features from multiple high-performing neural networks. A case study of smartphones is presented to demonstrate the proposed approach. The performance and importance estimates of the proposed approach are compared with those of previous sentiment analysis and neural network-based methods, and the results show that the proposed approach performs the IPA more reliably. The proposed approach requires minimal manual operation and can help companies make decisions rapidly and effectively compared with survey-based methods.


Author(s):  
Kranti Vithal Ghag ◽  
Ketan Shah

The bag-of-words approach is popularly used for sentiment analysis. It maps the terms in the reviews to term-document vectors and thus disrupts the syntactic structure of the sentences in the reviews. Associations among the terms, or the semantic structure of the sentences, are also not preserved. This research work focuses on classifying sentiments by considering the syntactic and semantic structure of the sentences in the review. To improve accuracy, sentiment classifiers based on relative frequency, average frequency, and term frequency-inverse document frequency were proposed. To handle terms with apostrophes, preprocessing techniques were extended. To focus on opinionated content, subjectivity extraction was performed at the phrase level. Experiments were performed on Pang & Lee's, Kaggle's, and UCI's datasets. Classifiers were also evaluated on UCI's Product and Restaurant dataset. Sentiment classification accuracy improved from 67.9% for a comparable term weighting technique, DeltaTFIDF, to as much as 77.2% for the proposed classifiers. Introducing the proposed concept-based approach, subjectivity extraction, and the extensions to preprocessing techniques improved the accuracy to 93.9%.


Author(s):  
Muhammet Sinan Basarslan ◽  
Fatih Kayaalp

Social media has become an important part of everyday life due to the widespread use of the Internet, and Twitter is among the most widely used social media services in the world. People share their opinions by writing tweets on numerous subjects, such as politics, sports, and the economy. Millions of tweets per day create a huge dataset, which has drawn the attention of data scientists to these data for sentiment analysis. Sentiment analysis aims to identify users' social media posts about a specific topic and categorize them as positive, negative, or neutral. This study investigates the effect of different types of text representation on the performance of sentiment analysis. Two datasets were used in the experiments: the first consists of user reviews of movies from IMDB, labeled by Kotzias, and the second consists of English-language tweets about health topics posted in 2019 and collected using the Twitter API. The Python programming language was used both to implement the classification models, based on the Naïve Bayes (NB), Support Vector Machines (SVM), and Artificial Neural Networks (ANN) algorithms, and to categorize the sentiments as positive, negative, or neutral. Feature extraction from the datasets was performed using the Term Frequency-Inverse Document Frequency (TF-IDF) and Word2Vec (W2V) modeling techniques. The accuracy of the classification algorithms was then compared. According to the experimental results, the Artificial Neural Network achieved the best accuracy on both datasets.


Author(s):  
Syaifulloh Amien Pandega Perdana ◽  
Teguh Bharata Aji ◽  
Ridi Ferdiana

Customer reviews are opinions on the quality of goods or services as experienced by consumers. They contain information that is useful to both consumers and providers of goods or services. The large volume of customer reviews available on websites calls for a framework that extracts sentiment automatically. A single customer review often covers many aspects, so Aspect Based Sentiment Analysis (ABSA) must be used to determine the polarity of each aspect. One important task in ABSA is Aspect Category Detection. Machine learning methods for Aspect Category Detection have been widely applied to English-language domains, but only rarely to Indonesian-language ones. This paper compares the performance of three machine learning algorithms, namely Naïve Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF), on Indonesian-language customer reviews, using Term Frequency–Inverse Document Frequency (TF-IDF) for term weighting. The results show that RF outperforms NB and SVM in three different domains (restaurant, hotel, and e-commerce), with f1-scores of 84.3%, 85.7%, and 89.3%, respectively.


2020 ◽  
Author(s):  
Eliseu Guimarães ◽  
Jonnathan Carvalho ◽  
Aline Paes ◽  
Alexandre Plastino

Sentiment analysis on social media data can be a challenging task, among other reasons because labeled data for training is not always available. Transfer learning approaches address this problem by leveraging a labeled source domain to obtain a model for a target domain that is different from, but related to, the source domain. The question that arises is how to choose proper source data for training the target classifier, a choice that can be made by measuring the similarity between source and target data with distance metrics. This article investigates the relation between these distance metrics and the classifiers' performance. For this purpose, we propose to evaluate four metrics combined with distinct dataset representations. Computational experiments, conducted in the Twitter sentiment analysis scenario, showed that the cosine similarity metric combined with bag-of-words normalized with term frequency-inverse document frequency presented the best results in terms of predictive power, outperforming even the classifiers trained with the target dataset in many cases.
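The source-selection idea in this abstract can be sketched as follows: represent each dataset as a TF-IDF-weighted bag of words and rank candidate source datasets by cosine similarity to the target. This is only a minimal sketch under stated assumptions. Each dataset is pooled into one token list, a smoothed IDF variant is used so that terms shared by all datasets still contribute, and the toy texts and the names `tfidf_vectors`, `cosine`, `target`, and `sources` are hypothetical rather than taken from the article.

```python
from collections import Counter
from math import log, sqrt


def tfidf_vectors(docs):
    """Map each tokenised document to a {term: tf-idf} dict.

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1, which is an
    assumption of this sketch rather than the article's stated formula.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical pooled text for one target and two candidate source datasets.
target = "battery life is great but screen is dim".split()
sources = {
    "phones": "great screen and great battery on this phone".split(),
    "movies": "the plot is dull and the acting is flat".split(),
}

vecs = tfidf_vectors([target] + list(sources.values()))
ranking = sorted(
    ((cosine(vecs[0], v), name) for name, v in zip(sources, vecs[1:])),
    reverse=True,
)
best_source = ranking[0][1]
print(ranking)
```

In this toy setup the "phones" source ranks above "movies" because it shares more distinctive vocabulary with the target; whether a higher similarity actually yields a better classifier is the empirical question the article studies.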


2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Xiaohong Wang ◽  
Shuang Dong

With the rapid development of online shopping, an open question is how to explore the value of online reviews so that they can play their full role in potential users' purchasing decisions. Based on text mining and quantitative analysis, this paper studies the sentiment analysis of online reviews on a B2C shopping website. The main attributes of a commodity or service are extracted based on the order of word frequency in the online reviews. A text analysis method is used to judge the relationship between the attributes of the commodity or service and the associated emotional words. The fine-grained sentiment polarity and intensity of the attributes are identified to analyze users' concerns and preferences. The research shows that users pay more attention to the configuration and after-sales service of mobile phones. They have a positive sentiment orientation towards most attributes, especially the unlocking function, the hand-feel attribute, and the logistics service; a neutral orientation towards the battery and memory attributes; and a negative orientation towards the membrane of the mobile phone. The results can provide a reference for consumers making purchasing decisions, for enterprises improving product quality, and for shopping platforms optimizing their service.


2020 ◽  
pp. 1-10
Author(s):  
Junegak Joung ◽  
Harrison M. Kim

Identifying product attributes from the perspective of a customer is essential to measure the satisfaction, importance, and Kano category of each product attribute for product design. This paper proposes automated keyword filtering to identify product attributes from online customer reviews based on latent Dirichlet allocation. The preprocessing for latent Dirichlet allocation is important because it affects the results of topic modeling; however, previous research performed latent Dirichlet allocation either without removing noise keywords or by manually eliminating them. The proposed method improves the preprocessing for latent Dirichlet allocation by automatically filtering out noise keywords that are not related to the product. A case study of Android smartphones is performed to validate the proposed method. The latent Dirichlet allocation results obtained with the proposed method are compared with those of a previous method, and the proposed method exhibits higher performance.


2021 ◽  
Vol 1 ◽  
pp. 417-426
Author(s):  
Kangcheng Lin ◽  
Harrison Kim

With the growth of online marketplaces and social media, product designers have seen exponential growth in the data available to them, which can serve as an extremely valuable source of information communicated by customers without geographical limitations. The data reveal customers' preferences, which can be expensive and slow to obtain via traditional methods such as surveys and questionnaires. While methods have been proposed in the literature to extract product information and make inferences from online data, they have limitations, especially in providing reliable results and in dealing with data sparsity. Therefore, this paper proposes a method to conduct an importance-performance analysis from online reviews. The major steps of this method are using latent Dirichlet allocation (LDA) to identify product attributes, using the IBM Watson Natural Language Understanding tool to perform aspect-based sentiment analysis, and using an XGBoost model to infer product attribute importance from the collected dataset. In our case study, we collected over 150,000 text reviews of more than 3,000 laptops from Amazon.


2019 ◽  
Vol 6 (1) ◽  
pp. 138-149
Author(s):  
Ukhti Ikhsani Larasati ◽  
Much Aziz Muslim ◽  
Riza Arifudin ◽  
Alamsyah Alamsyah

Text mining techniques can be used to process data. Processing large volumes of text requires a machine to explore the opinions it contains, whether positive or negative. Sentiment analysis is a process that applies text mining methods to determine whether the content of a text dataset is positive or negative. The support vector machine is one of the classification algorithms that can be used for sentiment analysis; however, it works less well on large datasets. In addition, one constraint in text mining is the number of attributes used: a large number of attributes reduces the performance of the classifier and yields low accuracy. The purpose of this research is to increase the accuracy of the support vector machine by applying feature selection and feature weighting. Feature selection removes a large number of irrelevant attributes; in this study, the top K = 500 features are selected. Feature weighting is then applied to calculate the weight of each selected attribute. The feature selection method is the chi-square statistic, and feature weighting uses Term Frequency-Inverse Document Frequency (TFIDF). In experiments run in Matlab R2017b with 10-fold cross-validation, the support vector machine without the chi-square statistic and TFIDF reached an accuracy of 68.7%, while the support vector machine with the chi-square statistic and TFIDF reached 80.2%, an increase of 11.5 percentage points.
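The chi-square feature selection step described in this abstract can be illustrated with a small sketch. It scores each term with the standard chi-square statistic of its 2x2 term-presence by class table and keeps the top K terms. The study itself used Matlab and K = 500; the Python code, the toy documents, and the names `chi_square_scores` and `top_k_terms` are assumptions made only for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels, positive="pos"):
    """Score each term by the chi-square statistic of its 2x2 term/class table."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos

    # Document frequencies of each term within each class.
    in_pos, in_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (in_pos if y == positive else in_neg).update(set(tokens))

    scores = {}
    for term in set(in_pos) | set(in_neg):
        a, b = in_pos[term], in_neg[term]  # documents containing the term
        c, d = n_pos - a, n_neg - b        # documents lacking the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus: two positive and two negative reviews.
docs = [
    "good phone good battery".split(),
    "good screen".split(),
    "bad battery".split(),
    "bad screen bad phone".split(),
]
labels = ["pos", "pos", "neg", "neg"]

scores = chi_square_scores(docs, labels)
selected = top_k_terms(scores, k=2)
print(selected)
```

Terms such as "battery" that appear equally often in both classes score zero and are dropped, while class-specific terms survive; the surviving terms would then be weighted with TFIDF before being passed to the classifier.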


Information ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 486
Author(s):  
Xiaoyan Zhang ◽  
Qiang Yan ◽  
Simin Zhou ◽  
Linye Ma ◽  
Siran Wang

The number of consumers playing virtual reality games is booming. To speed up product iteration, the user experience team needs to collect and analyze unsatisfying experiences in a timely manner. In this paper, we aim to detect the unsatisfying experiences hidden in online reviews of virtual reality exergames using a deep learning method and to identify the unmet psychological needs of users based on self-determination theory. Convolutional neural networks for sentence classification (textCNN) are used in this study to classify online reviews with unsatisfying experiences. For comparison, we set eXtreme gradient boosting (XGBoost) with lexical features as the machine learning baseline. Term frequency-inverse document frequency (TF-IDF) is used to extract keywords from every set of classified reviews. The micro-F1 score of the textCNN classifier is 90.00, which is better than the 82.69 of XGBoost. The top 10 keywords of every set of reviews reflect relevant topics of unmet psychological needs. This paper explores the potential problems causing unsatisfying experiences and unmet psychological needs in virtual reality exergames through text mining and supplements experimental studies of virtual reality exergames.
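For readers unfamiliar with the micro-F1 score reported in this abstract, the sketch below shows how it is commonly computed: true positives, false positives, and false negatives are pooled over all labels before precision and recall are combined. The label names and predictions are hypothetical, and the abstract does not state whether the task was treated as single-label or multi-label, so the set-based form here is an assumption.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over examples whose labels are given as sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels correctly predicted
        fp += len(pred - truth)   # labels predicted but not true
        fn += len(truth - pred)   # true labels that were missed
    total = 2 * tp + fp + fn
    return 2 * tp / total if total else 0.0


# Hypothetical gold and predicted labels for three reviews.
true_labels = [{"autonomy"}, {"competence", "relatedness"}, set()]
pred_labels = [{"autonomy"}, {"competence"}, {"relatedness"}]

score = micro_f1(true_labels, pred_labels)
print(round(score, 4))
```

Here two labels are predicted correctly, one is spurious, and one is missed, so the score is 2*2 / (2*2 + 1 + 1) = 2/3. The abstract reports its scores on a 0 to 100 scale, which corresponds to multiplying this value by 100.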

