Multi-task learning based on question–answering style reviews for aspect category classification and aspect term extraction on GPU clusters

2020 ◽  
Vol 23 (3) ◽  
pp. 1973-1986
Author(s):  
Hanqian Wu ◽  
Siliang Cheng ◽  
Zhike Wang ◽  
Shangbin Zhang ◽  
Feng Yuan

Abstract: Cluster computing technologies are rapidly advancing, and user-generated online reviews are booming in the current Internet and e-commerce environment. The latest question–answering (Q&A)-style reviews are novel, abundant and easily digestible product reviews that also contain a wealth of valuable information for customers. In this paper, we mine the valuable aspect information about products contained in these reviews on GPU clusters. To achieve this goal, we address two subtasks of aspect-based sentiment analysis: aspect term extraction (ATE) and aspect category classification (ACC). Most previous works focused on only one task or solved the two tasks separately, even though they are highly interrelated, and so did not make full use of abundant training resources. To address this problem, we propose a novel multi-task neural learning model to jointly handle these two tasks and explore the performance of our model on GPU clusters. We conducted extensive comparative experiments on an annotated corpus and found that our proposed model outperforms several baseline models on the ATE and ACC tasks on GPU clusters, yielding significant strides in data mining for these types of reviews.
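
One common way to train a single model on two related tasks is to minimise a weighted sum of the per-task losses. The Python sketch below illustrates that generic pattern for ATE (treated here as per-token tagging) and ACC (treated as one label per review). The function names, the tag layout, and the equal default weights are assumptions for illustration, not details taken from the paper.

```python
import math


def cross_entropy(probs, gold_index):
    """Negative log-likelihood of the gold label under a probability vector."""
    return -math.log(probs[gold_index])


def joint_loss(ate_token_probs, ate_gold_tags, acc_probs, acc_gold,
               ate_weight=1.0, acc_weight=1.0):
    """Weighted sum of an ATE tagging loss and an ACC classification loss.

    ate_token_probs: one probability vector over tags per token (hypothetical layout).
    ate_gold_tags:   gold tag index per token.
    acc_probs:       probability vector over aspect categories for the review.
    acc_gold:        gold category index.
    The weights are illustrative hyperparameters, not values from the paper.
    """
    # Average the per-token tagging loss so long reviews do not dominate.
    ate_loss = sum(
        cross_entropy(p, g) for p, g in zip(ate_token_probs, ate_gold_tags)
    ) / len(ate_gold_tags)
    acc_loss = cross_entropy(acc_probs, acc_gold)
    return ate_weight * ate_loss + acc_weight * acc_loss
```

In a sketch like this, both task heads would share the same encoder, so the gradient of the joint loss updates the shared parameters with signal from both tasks.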

2022 ◽  
Vol 3 (4) ◽  
pp. 283-294
Author(s):  
M. Duraipandian ◽  
R. Vinothkanna

Customers post online product reviews based on their own experience. They may share their thoughts and comments on items on online shopping websites. Sentiment analysis comprises processing opinions or ideas and sorting highly rated reviews according to how well the product satisfies customers. Opinion mining is a technique for extracting useful data from large amounts of text in order to enhance or expand a company's operations. According to consumer evaluations, many goods are not as good as they seem. Buyers commonly submit their thoughts on a product but then forget to rate it. Prior data preprocessing makes feature extraction by the CNN approach more efficient. The proposed methodology breaks down each user's rating prediction model into two parts: one based on the review text and the other based on the user rating matrix, with the help of CNN feature engineering. The goal of this study is to classify all reviews into ratings using an SVM model. The proposed classification model predicts ratings for online reviews efficiently and with good accuracy. For reviews without ratings, a further sentiment prediction is generated using multiple classifiers. The proposed model is refined using helpfulness ratings and evaluated on a small number of metrics, such as accuracy, F1 score, sensitivity, and precision. According to studies using the standard benchmark dataset, the accuracy of customized recommendation services, user happiness, and corporate trust may all be enhanced by including review helpfulness information in the recommender system.


Author(s):  
Seema Rani ◽  
Avadhesh Kumar ◽  
Naresh Kumar

Background: Duplicate content often corrupts the filtering mechanism in online question answering. Moreover, as users are usually more comfortable conversing in their native language, transliteration adds to the challenges in detecting duplicate questions. This compromises response time and increases answer overload. Thus, it has become crucial to build intelligent semantic filters that match linguistically disparate questions by meaning. Objective: Most of the research on duplicate question detection has been done on mono-lingual, mostly English, Q&A platforms. The aim is to build a model which extends the cognitive capabilities of machines to interpret, comprehend and learn features for semantic matching in transliterated bi-lingual Hinglish (Hindi + English) data acquired from different Q&A platforms. Method: In the proposed DQDHinglish (Duplicate Question Detection) model, language transformation (transliteration and translation) is first applied to convert the bi-lingual transliterated question into mono-lingual, English-only text. Next, a hybrid of a Siamese neural network containing two identical Long Short-Term Memory (LSTM) models and a multi-layer perceptron network is proposed to detect semantically similar question pairs. The Manhattan distance function is used as the similarity measure. Result: A dataset was prepared by scraping 100 question pairs from various social media platforms, such as Quora and TripAdvisor. The performance of the proposed model was evaluated on the basis of accuracy and F-score. The proposed DQDHinglish achieves a validation accuracy of 82.40%. Conclusion: A deep neural model was introduced to find a semantic match between an English question and a Hinglish (Hindi + English) question, such that questions with similar intent can be combined to enable fast and efficient information processing and delivery. A dataset was created and the proposed model was evaluated on the basis of performance accuracy. To the best of our knowledge, this work is the first reported study on transliterated Hinglish semantic question matching.
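
A Manhattan-distance similarity between the two encoded questions can be sketched in Python as follows. The abstract names only the distance function; mapping the distance into the range (0, 1] with a negative exponential is an assumption borrowed from common Siamese LSTM practice, and the function names and example vectors are illustrative.

```python
import math


def manhattan_distance(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def manhattan_similarity(u, v):
    """Map the L1 distance into (0, 1]; identical vectors score 1.0.

    The exp(-distance) form is an assumed choice, not stated in the abstract.
    """
    return math.exp(-manhattan_distance(u, v))


# Hypothetical encodings of two questions, standing in for LSTM outputs.
question_a = [0.2, 0.7, 0.1]
question_b = [0.2, 0.6, 0.3]
score = manhattan_similarity(question_a, question_b)
```

A pair would then be flagged as duplicate when the score exceeds a threshold tuned on validation data.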


2013 ◽  
Vol 427-429 ◽  
pp. 2614-2617
Author(s):  
Qing Xi Peng

Online reviews, as a new textual domain, offer a unique proposition for sentiment analysis. Their short document length suggests any sentiment they contain is compact and explicit. Although supervised methods have obtained good results, they require training on a large corpus beforehand. Recently, topic models have been introduced for the simultaneous analysis of sentiment in the document. However, the LDA model assumes that, given the parameters, the words in the document are all independent. This is obviously not the case: the words in the document express the sentiment of the author. This paper proposes a model to solve the problem. We assume that the sentiments are related to the topics in the documents, and a sentiment layer is added to the LDA model to improve it. Experimental results on the dataset demonstrate the advantage of the proposed model.


2021 ◽  
Vol 2 (2) ◽  
pp. 27-39
Author(s):  
Charles C. Willow

This paper uses data analytics to investigate the relationship between consumer purchase decisions and online reviews. The attributes associated with purchase decisions comprise nationalism and consumer preference, which are correlated with online reviews using big data analytics. To date, only a small fraction of studies have sought to correlate nationalism and ethnocentrism with big data analytics. Globally accepted generic products were selected to expedite the process of data engineering. Two sets were arranged: passenger automobiles for transportation, with an estimated $9 trillion global market, and the smartphone, boasting a market size of approximately $5 billion. Both products present minimal cultural, linguistic, gender, age, and/or custom barriers of entry for prospective digital consumers, thereby allowing relatively unrestricted engagement with online reviews and purchases. A series of hypothesis tests indicates that there is a positive correlation between nationalism and automobiles. For smartphones, however, nationalism had nominal control factors. Multivariate analytics were performed using R and Tableau Public.


Kybernetes ◽  
2019 ◽  
Vol 48 (6) ◽  
pp. 1355-1372 ◽  
Author(s):  
Ying Huang ◽  
Nu-nu Wang ◽  
Hongyu Zhang ◽  
Jianqiang Wang

Purpose: The purpose of this paper is to propose a model for product recommendation that improves on the recommendation accuracy of the current search engines used in e-commerce platforms like Tmall.com. Design/methodology/approach: First, the proposed model comprehensively considers price, trust and online reviews, which all represent critical factors in consumers’ purchasing decisions. Second, the model introduces quantization methods for these criteria, incorporating fuzzy theory. Third, the model uses a distance measure between two single-valued neutrosophic sets, based on the prioritized average operator, to consolidate the influences of positive, neutral and negative comments. Finally, the model uses multi-criteria decision-making methods to integrate the influences of price, trust and online reviews on purchasing decisions to generate recommendations. Findings: To demonstrate the feasibility and efficiency of the proposed model, a case study is conducted based on Tmall.com. The results of the case study indicate that the recommendations of our model perform better than those of the current search engines of Tmall.com. The proposed model can significantly improve the accuracy of product recommendations based on search engines. Originality/value: The product recommendation method can meet the critical challenge posed by the search engines on e-commerce platforms. In addition, the proposed method could be used in practice to develop a new application for e-commerce platforms.


2016 ◽  
Vol 31 (2) ◽  
pp. 97-123 ◽  
Author(s):  
Alfred Krzywicki ◽  
Wayne Wobcke ◽  
Michael Bain ◽  
John Calvo Martinez ◽  
Paul Compton

Abstract: Data mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.


2014 ◽  
Vol 631-632 ◽  
pp. 1053-1056
Author(s):  
Hui Xia

The paper addresses the problem of limited resources for optimizing the efficiency, reliability, scalability and security of data in distributed cluster systems with huge datasets. The study's experimental results indicate that the developed MapReduce tool improved data optimization. The system exhibits poor speedup with smaller datasets, but reasonable speedup is achieved with datasets large enough to match the number of computing nodes, reducing execution time by 30% compared to normal data mining and processing. The MapReduce tool is able to handle data growth, especially with a larger number of computing nodes. Scaleup grows gracefully as the data and the number of computing nodes increase. Security of data is guaranteed at all computing nodes, since data is replicated at various nodes on the cluster system, which also makes it reliable. Our implementation of MapReduce runs in the distributed cluster computing environment of a national education web portal and is highly scalable.
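
The 30% reduction in execution time reported above can be restated as a speedup factor. A minimal Python sketch of that arithmetic follows, assuming the conventional definition of speedup as baseline time divided by improved time (the abstract does not define it); the function names and the example timings are illustrative.

```python
def speedup(baseline_seconds, improved_seconds):
    """Conventional speedup: how many times faster the improved run is."""
    return baseline_seconds / improved_seconds


def speedup_from_reduction(fraction_saved):
    """Speedup implied by cutting execution time by a given fraction.

    A run that saves `fraction_saved` of the baseline time takes
    (1 - fraction_saved) of it, so the speedup is 1 / (1 - fraction_saved).
    """
    if not 0.0 <= fraction_saved < 1.0:
        raise ValueError("fraction_saved must be in [0, 1)")
    return 1.0 / (1.0 - fraction_saved)


# A 30% reduction corresponds to a speedup of roughly 1.43x.
reported = speedup_from_reduction(0.30)

# Same result from hypothetical timings: 100 s baseline, 70 s improved.
from_timings = speedup(100.0, 70.0)
```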


Author(s):  
Alan D. Smith

In an age of public mistrust of the most basic institutions, businesses are not exempt. Essentially all e-tailers want to deliver personalized, real-time communications to customers that are tailored to their interests and preferences, and that are based on big data mining that customers will value over privacy concerns. This is an era in which e-commerce retailers continue to dominate the marketplace, and it is essential that consumers are able to trust the manufacturers, retailers, and the service/product reviews that they read online. Such trust is particularly important if their ultimate purchase decision is to be a successful one. A survey of middle-level managers was analyzed to identify the basic elements of e-personalization, namely online purchasing behaviors, personalized communications, information-retrieval services, degree of personal web presence, quality assurance of customer service, and the promotion of customization services. These elements were found to be conceptually and statistically related to the retailer benefits of increased buying and customer loyalty.


2019 ◽  
Vol 11 (3) ◽  
pp. 81-97
Author(s):  
Chao Li ◽  
Jun Xiang ◽  
Shiqiang Chen

Reviews can reflect the degree of consumers' satisfaction and their views on product quality, and consumers tend to read product reviews to get helpful information about product quality before placing an order on e-commerce platforms. However, existing research mainly focuses on the assessment of review quality, fake review detection, and opinion mining, and there is little research that assesses product quality objectively and quantitatively from the perspective of product features based on reviews. Therefore, the authors propose a method to assess product quality based on reviews at the granularity of product features. The authors define the related quality dimensions and develop the corresponding assessment models, assess the quality of reviews crawled from an e-commerce platform, then extract product features and opinion words from the high-quality reviews, and finally assess product quality on the extracted, consumer-concerned features. Experimental results demonstrate that the methodology can assess product quality on any feature objectively and quantitatively.

