A Three-Stage method for Data Text Mining: Using UGC in Business Intelligence Analysis

Symmetry ◽  
2019 ◽  
Vol 11 (4) ◽  
pp. 519 ◽  
Author(s):  
Saura ◽  
Bennett

The global development of the Internet, which has enabled the analysis of large amounts of data and the services linked to their use, has led companies to modify their business strategies in search of new ways to increase marketing productivity and profitability. Many strategies are based on business intelligence (BI) and marketing intelligence (MI), which make it possible to extract profitable knowledge and insights from the large amounts of data generated by company customers in digital environments. In this context, the present study proposes a three-stage research methodology based on data text mining (DTM). In further research, this methodology can be used in business intelligence analysis (BIA) strategies to analyze user-generated content (UGC) in social networks and on digital platforms. The proposed methodology unfolds in the following three stages. First, a Latent Dirichlet Allocation (LDA) model is used to determine the topics in the database. Second, a sentiment analysis (SA) is applied to the LDA results to divide the topics identified in the sample into three sentiments. Third, textual analysis (TA) with data text mining techniques is applied to the topics in each sentiment. The proposed methodology offers important advances in data text mining in terms of accuracy, reliability and insight generation for both researchers and practitioners seeking to improve BIA processes in business and other sectors.
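The second and third stages of such a pipeline can be illustrated with a small sketch. It is not the authors' implementation: it assumes each document already carries a topic label from a prior LDA stage, it assumes the three sentiments are positive, negative and neutral (the abstract does not name them), and it substitutes a hypothetical toy lexicon and plain term counts for a real sentiment model and textual analysis.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a real study would use a trained sentiment model.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "bad"}


def sentiment(text):
    """Label a text positive, negative or neutral by counting lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def top_terms_by_sentiment(docs, n=2):
    """Group (topic, text) pairs by topic and sentiment, then count terms.

    The topic labels are assumed to come from an earlier LDA stage.
    """
    counts = defaultdict(Counter)
    for topic, text in docs:
        counts[(topic, sentiment(text))].update(text.lower().split())
    return {key: [w for w, _ in c.most_common(n)] for key, c in counts.items()}


# Hypothetical user-generated content with pre-assigned topics.
docs = [
    ("delivery", "fast delivery great service"),
    ("delivery", "slow delivery broken box"),
    ("price", "price is price"),
]
result = top_terms_by_sentiment(docs)
```

Here `result` maps each (topic, sentiment) pair to its most frequent terms, which is the shape of output the third stage would hand to an analyst.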

2020 ◽  
Vol 202 ◽  
pp. 16005
Author(s):  
Chashif Syadzali ◽  
Suryono Suryono ◽  
Jatmiko Endro Suseno

Customer behavior classification can assist companies in conducting business intelligence analysis. Data mining techniques can classify customer behavior using the K-Nearest Neighbor algorithm, based on a customer life cycle consisting of prospect, responder, active and former. The data used for classification include age, gender, number of donations, donation retention and number of user visits. Classifying 2,114 records yields the following category shares: active 1.18%, prospect 8.99%, responder 4.26% and former 85.57%. Testing values of K from K = 1 to K = 20, the system reaches its highest accuracy, 94.3731%, at K = 4. The classification of user behavior obtained from the training data can be used in business intelligence analysis, helping companies determine business strategies by identifying the optimal target market.
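The sweep over K described above can be sketched as follows. This is a minimal illustration, not the authors' system: the two features, the toy records, the Euclidean distance and the tie-breaking rule (prefer the smallest K) are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def best_k(train, test, k_values):
    """Score every candidate K and return the best one plus all scores."""
    scores = {k: accuracy(train, test, k) for k in k_values}
    # Assumption: on equal accuracy, prefer the smallest K.
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores


# Hypothetical records: (number of visits, number of donations) -> stage.
train = [
    ((1.0, 0.0), "prospect"), ((2.0, 0.0), "prospect"),
    ((5.0, 1.0), "responder"), ((6.0, 1.0), "responder"),
    ((20.0, 8.0), "active"), ((22.0, 9.0), "active"),
]
test = [
    ((1.5, 0.0), "prospect"),
    ((5.5, 1.0), "responder"),
    ((21.0, 8.5), "active"),
]
k, scores = best_k(train, test, range(1, 4))
```

With real data the features would normally be scaled first, since unscaled distances let the feature with the largest range dominate the neighbor search.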


Proceedings ◽  
2018 ◽  
Vol 2 (18) ◽  
pp. 1170
Author(s):  
Yerai Doval ◽  
David Vilares

User-generated content published on microblogging social platforms constitutes an invaluable source of information for diverse purposes: health surveillance, business intelligence, political analysis, etc. We present an overview of our work in the field of microtext processing, covering the entire pipeline from input preprocessing to high-level text mining applications.


2021 ◽  
Vol 16 (4) ◽  
pp. 1042-1065
Author(s):  
Anne Gottfried ◽  
Caroline Hartmann ◽  
Donald Yates

The business intelligence (BI) market has grown at a tremendous rate in the past decade due to technological advancements, big data and the availability of open source content. Despite this growth, the use of open government data (OGD) as a source of information is very limited in the private sector due to a lack of knowledge of its benefits. Scant evidence on the use of OGD by private organizations suggests that it can lead to the creation of innovative ideas as well as assist in making better informed decisions. Given the benefits but lack of use of OGD to generate business intelligence, we extend research in this area by exploring how OGD can be used to generate business intelligence for the identification of market opportunities and strategy formulation, an area of research that is still in its infancy. Using a two-industry case study approach (footwear and lumber), we use latent Dirichlet allocation (LDA) topic modeling to extract emerging topics in these two industries from OGD, and a data visualization tool (pyLDAvis) to visualize the topics in order to interpret and transform the data into business intelligence. Additionally, we perform an environmental scan of the two industries to validate the usability of the information obtained. The results provide evidence that OGD can be a valuable source of information for generating business intelligence and demonstrate how topic modeling and visualization tools can assist organizations in extracting and analyzing information for the identification of market opportunities.
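To make the topic modeling step concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the standard library only. It is not the authors' pipeline: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count and the toy documents are all assumptions, and a real study would more likely use an established library.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized docs; return (theta, topic_word, vocab)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # tokens per doc, topic
    topic_word = [[0] * V for _ in range(num_topics)]   # tokens per topic, word
    topic_total = [0] * num_topics                      # tokens per topic
    assign = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = word_id[w]
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word, vocab


# Hypothetical tokenized documents from two industries.
docs = [
    ["shoe", "leather", "sole", "shoe"],
    ["lumber", "timber", "wood", "lumber"],
    ["leather", "shoe", "boot"],
    ["wood", "timber", "sawmill"],
]
theta, topic_word, vocab = lda_gibbs(docs, num_topics=2)
```

Each row of `theta` gives one document's topic proportions, and each row of `topic_word` gives the word counts that a visualization tool would summarize per topic.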


2018 ◽  
Vol 3 (3) ◽  
pp. 213-230 ◽  
Author(s):  
Filippo Gilardi ◽  
Celia Lam ◽  
K Cohen Tan ◽  
Andrew White ◽  
Shuxin Cheng ◽  
...  

The relationship between online media platforms in China and fan groups is a dynamic one when it comes to the distribution of international TV series and other media content, as media platforms incorporate user-generated content to encourage or foster audience engagement. Through a series of case studies, this article investigates how international TV series are acquired, distributed, marketed and curated on Chinese online video platforms. This helps to identify specific strategies and themes used by these platforms to promote international content and engage users. These marketing techniques, however, are not always as successful as expected, suggesting the need for a closer examination of the types of engagement sought by media platforms, and the ways in which Chinese audiences have responded within their cultural context.


Author(s):  
Byung-Kwon Park ◽  
Il-Yeol Song

As the amount of data grows rapidly inside and outside the enterprise, it is becoming important to analyze all of it seamlessly to obtain total business intelligence. The data can be classified into two categories: structured and unstructured. Because most business data are unstructured text documents, including Web pages on the Internet, we need a Text OLAP solution to perform multidimensional analysis of text documents in the same way as structured relational data. We first survey representative works that demonstrate how text mining and information retrieval, the major technologies for handling text data, can be applied to multidimensional analysis of text documents. We then survey representative works that demonstrate how unstructured text documents and structured relational data can be associated and consolidated to obtain total business intelligence. Finally, we present a future business intelligence platform architecture as well as related research topics. We expect that the proposed total heterogeneous business intelligence architecture, which integrates information retrieval, text mining, and information extraction technologies together with relational OLAP technologies, would make a better platform for total business intelligence.


2019 ◽  
Vol 33 (4) ◽  
pp. 369-379 ◽  
Author(s):  
Xia Liu

Purpose: Social bots are prevalent on social media. Malicious bots can severely distort the true voices of customers. This paper aims to examine social bots in the context of big data of user-generated content. In particular, the author investigates the scope of information distortion for 24 brands across seven industries. Furthermore, the author studies the mechanisms that make social bots viral. Last, approaches to detecting and preventing malicious bots are recommended.

Design/methodology/approach: A Twitter data set of 29 million tweets was collected. Latent Dirichlet allocation and word cloud were used to visualize unstructured big data of textual content. Sentiment analysis was used to automatically classify 29 million tweets. A fixed-effects model was run on the final panel data.

Findings: The findings demonstrate that social bots significantly distort brand-related information across all industries and among all brands under study. Moreover, Twitter social bots are significantly more effective at spreading word of mouth. In addition, social bots use volumes and emotions as major effective mechanisms to influence and manipulate the spread of information about brands. Finally, the bot detection approaches are effective at identifying bots.

Research limitations/implications: As brand companies use social networks to monitor brand reputation and engage customers, it is critical for them to distinguish true consumer opinions from fake ones which are artificially created by social bots.

Originality/value: This is the first big data examination of social bots in the context of brand-related user-generated content.


2019 ◽  
Vol 62 (2) ◽  
pp. 195-215
Author(s):  
Frederik Situmeang ◽  
Nelleke de Boer ◽  
Austin Zhang

The purpose of this study is to contribute to the marketing literature and practice by describing a research methodology to identify latent dimensions of customer satisfaction in product reviews, and by examining the relationship between these attributes and customer satisfaction. Previous research on product reviews has largely relied only on quantitative ratings, either stars or review scores. Advanced techniques for text mining provide the opportunity to extract meaning from online customer reviews. By analyzing 51,110 online reviews for 1,610 restaurants via latent Dirichlet allocation, this study uncovers 30 latent dimensions that are determinants of customer satisfaction. Furthermore, this study develops measurements of sentiment and innovativeness as moderators of the effect of these latent attributes on satisfaction.

