The Voice of Drug Consumers: Online Textual Review Analysis Using Structural Topic Model

Author(s):  
Lifeng He ◽  
Dongmei Han ◽  
Xiaohang Zhou ◽  
Zheng Qu

Many web-based pharmaceutical e-commerce platforms allow consumers to post open-ended textual reviews based on their purchase experiences. Understanding the true voice of consumers by analyzing such a large amount of user-generated content is of great significance to pharmaceutical manufacturers and e-commerce websites. The aim of this paper is to automatically extract hidden topics from web-based drug reviews using the structural topic model (STM) to examine consumers’ concerns when they buy drugs online. The STM is a probabilistic extension of Latent Dirichlet Allocation (LDA) that allows document-level covariates to be incorporated into the model. This innovation allows us to capture consumer dissatisfaction along with its dynamics over time. We extract 12 topics, five of which are negative topics representing consumer dissatisfaction; their prevalence in negative reviews is substantially higher than in positive reviews. We also conclude that the prevalence of these five negative topics has not decreased over time. Furthermore, our results reveal that the prevalence of price-related topics has decreased significantly in positive reviews, which indicates that low-price strategies are becoming less attractive to customers. To the best of our knowledge, our work is the first study using STM to analyze the unstructured textual data of drug reviews; it enhances the understanding of drug consumers’ concerns and contributes to the pharmaceutical e-commerce literature.
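As a rough illustration of how document-level covariates enter the STM, the standard topic-prevalence formulation can be sketched as follows. The abstract does not give the authors' exact specification, so the choice of covariates in $X_d$ (for example, a review's sentiment label and posting time) is an assumption made here for illustration only.

```latex
\begin{align*}
\theta_d \mid X_d, \gamma, \Sigma &\sim \mathrm{LogisticNormal}(X_d \gamma,\ \Sigma) \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d) \\
w_{d,n} \mid z_{d,n}, \beta &\sim \mathrm{Multinomial}(\beta_{z_{d,n}})
\end{align*}
```

Here $\theta_d$ is the topic-proportion vector of review $d$, $X_d$ its covariate vector, $\gamma$ the matrix of covariate coefficients, $z_{d,n}$ the topic assigned to the $n$-th word, and $\beta_k$ the word distribution of topic $k$. When $X_d$ carries no information, the prior mean of $\theta_d$ is the same for every document and the model behaves much like a correlated topic model without covariates.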

Author(s):  
Annamaria Bianchi ◽  
Camilla Salvatore ◽  
Silvia Biffignandi

Social media are fundamental in creating new opportunities for firms, and they represent a relevant tool for communicating and engaging with customers. The purpose of this paper is to analyse the communication of Corporate Social Responsibility (CSR) activities on Twitter. We consider the listed companies included in the Dow Jones Industrial Average Index and we implement a topic model analysis on their timelines. In order to identify the topics discussed, their correlation, and their evolution over time and across sectors, we apply the Structural Topic Model algorithm, which allows the model to be estimated with document-level metadata included. This model proves to be a powerful tool for topic detection and for estimating the effects of document-level metadata. Indeed, we find that the topics are overall well identified, and the model captures signals from the data. Finally, we discuss issues related to the validity of the analysis, including data quality problems.


Author(s):  
Yi Sun ◽  
Teruaki Hayashi ◽  
Yukio Ohsawa

Deciding when and which products to recommend to whom is always an essential issue for retailers. In this study, we propose a mixed framework with two components to capture customer buying behavior and its changes over time, and to visualize these results to better help retailers choose and target products strategically for marketing. In this framework, a topic model is first used to extract customers’ purchase behavior instead of association rules or K-means, which are mainly used in the marketing field. To automatically choose the optimal number of topics, we implement an approach proposed by Koltcov et al. on point-of-sale (POS) data from a supermarket. Meanwhile, to grasp the change of topics over time, we divided monthly POS data in half and applied the topic model with Renyi entropy to each half separately. The results suggest that splitting data might be a better way to understand customer behavior. Second, we consider how to develop an effective way to visualize the results of the topic model, which is essential because, in a supermarket context, simply knowing which product categories are included under which topics is not enough to inform how a supermarket promotes its products. To address this, we design a three-layer visualization approach to better interpret the topic model results and to help retailers design targeted promotion strategies. The design of visualization has been overlooked by studies related to the use of topic models on supermarket data. Finally, to demonstrate the usefulness of our proposed framework, we conduct a simple scenario-based analysis comparing our framework with other models, such as Latent Dirichlet Allocation (LDA) and the Dynamic Topic Model (DTM). The results show that for most periods, our proposed framework outperforms LDA and DTM.


Author(s):  
Subasish Das ◽  
Anandi Dutta ◽  
Marcus A. Brewer

This study employs two topic models to perform trend mining on a large body of textual data, determining trends in research topics across immense collections of unstructured documents over the years. The data were collected from the titles and abstracts of the papers published in Transportation Research Record: Journal of the Transportation Research Board since 1974. The content of these papers is well suited to examining research trends in various fields of research because it comprises a large volume of text. In previous studies, exploratory analysis tools such as text mining were used to provide descriptive information about the data. However, this method does not provide researchers with quantifications of the topics and their correlations. Furthermore, the contents examined in this study are largely unstructured, and therefore they require faster machine learning algorithms to decipher them. For these reasons, the research team chose to employ two topic modeling tools, latent Dirichlet allocation and the structural topic model, to perform trend mining. This analysis succeeded in extracting 20 main topics, identified by keywords, from the data. The research team also developed two interactive topic model visualization tools that can be used to extract topics from journal titles and abstracts, respectively. The findings from this study provide researchers with a further understanding of research patterns within the ever-evolving area of transportation engineering studies.


Author(s):  
Subasish Das ◽  
Karen Dixon ◽  
Xiaoduan Sun ◽  
Anandi Dutta ◽  
Michelle Zupancich

Proceedings of journal and conference papers are good sources of big textual data to examine research trends in various branches of science. The contents, usually unstructured in nature, require fast machine-learning algorithms to be deciphered. Exploratory analysis through text mining usually provides the descriptive nature of the contents but lacks quantification of the topics and their correlations. Topic models are algorithms designed to discover the main theme or trend in massive collections of unstructured documents. Through the use of a structural topic model, an extension of latent Dirichlet allocation, this study introduced distinct topic models on the basis of the relative frequencies of the words used in the abstracts of 15,357 TRB compendium papers. With data from 7 years (2008 through 2014) of TRB annual meeting compendium papers, the 20 most dominant topics emerged from a bag of 4 million words. The findings of this study contributed to the understanding of topical trends in the complex and evolving field of transportation engineering research.


Author(s):  
Xiwen Bai ◽  
Xiunian Zhang ◽  
Kevin X. Li ◽  
Yaoming Zhou ◽  
Kum Fai Yuen

Author(s):  
Simeon Dekker

The ‘diatribe’ is a dialogical mode of exposition, originating in Hellenistic Greek, where the author dramatically performs different voices in a polemical-didactic discourse. The voice of a fictitious opponent is often disambiguated by means of parenthetical verba dicendi, especially φησί(ν). Although diatribal texts were widely translated into Slavic in the Middle Ages, the textual history of the Zlatostruj collection of Chrysostomic homilies especially suits an investigation not only of how Greek ‘diatribal’ verbs were translated, but also how the Slavic verbs were transmitted or developed in different textual traditions. Over time, Slavic redactional activity led to a homogenization of verb forms. The initial variety of the original translation was partly eliminated, and the verb forms "Equation missing" and "Equation missing" became more firmly established as prototypical diatribal formulae. Especially the (increased) use of the 2sg form "Equation missing" has theoretical consequences for the text’s dialogical structure. Thus, an important dialogical component of the diatribe was reinforced in the Zlatostruj’s textual history on Slavic soil.


Author(s):  
Xi Liu ◽  
Yongfeng Yin ◽  
Haifeng Li ◽  
Jiabin Chen ◽  
Chang Liu ◽  
...  

Existing intelligent software defect classification approaches do not consider radar characteristics and prior statistical information. Thus, when these approaches are applied to radar software testing and validation, the precision and recall of defect classification are poor, which impairs the effective reuse of software defects. To solve this problem, a new intelligent defect classification approach based on the latent Dirichlet allocation (LDA) topic model is proposed for radar software in this paper. The proposed approach includes the defect text segmentation algorithm based on the dictionary of the radar domain, the modified LDA model incorporating radar software requirements, and the top acquisition and classification approach for radar software defects based on the modified LDA model. The proposed approach is applied to typical radar software defects to validate its effectiveness and applicability. The application results illustrate that the prediction precision and recall of the proposed approach are improved by up to 15 to 20% compared with the other defect classification approaches. Thus, the proposed approach can be applied effectively in the segmentation and classification of radar software defects to improve the adequacy with which defects in radar software are identified.
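For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how per-class precision and recall are computed for a defect classifier. It is not the authors' evaluation code; the defect labels (`"timing"`, `"interface"`) and the function name are hypothetical and chosen only for illustration.

```python
def precision_recall(true_labels, predicted_labels, target):
    """Return (precision, recall) for one defect class.

    precision = TP / (TP + FP): share of predictions of `target` that are right.
    recall    = TP / (TP + FN): share of true `target` defects that were found.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    # Guard against empty denominators (class never predicted or never present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for five defect reports.
true_labels = ["timing", "timing", "interface", "interface", "timing"]
predicted = ["timing", "interface", "interface", "timing", "timing"]
p, r = precision_recall(true_labels, predicted, "timing")
```

In this toy data, two "timing" defects are correctly identified, one "interface" defect is mislabeled as "timing", and one "timing" defect is missed, so both precision and recall for the "timing" class are 2/3.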


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 415
Author(s):  
Jinli Wang ◽  
Yong Fan ◽  
Hui Zhang ◽  
Libo Feng

Tracking scientific and technological (S&T) research hotspots can help scholars to grasp the status of current research and develop regular patterns in the field over time. It contributes to the generation of new ideas and plays an important role in promoting the writing of scientific research projects and scientific papers. Patents are important S&T resources, which can reflect the development status of the field. In this paper, we use topic modeling, topic intensity, and evolutionary computing models to discover research hotspots and development trends in the field of blockchain patents. First, we propose a time-based dynamic latent Dirichlet allocation (TDLDA) modeling method based on a probabilistic graph model and knowledge representation learning for patent text mining. Second, we present a computational model, topic intensity (TI), that expresses the topic strength and evolution. Finally, the point-wise mutual information (PMI) value is used to evaluate topic quality. We obtain 20 hot topics through TDLDA experiments and rank them according to the strength calculation model. The topic evolution model is used to analyze the topic evolution trend from the perspectives of rising, falling, and stable. From the experiments we found that 8 topics showed an upward trend, 6 topics showed a downward trend, and 6 topics became stable or fluctuated. Compared with the baseline method, TDLDA can have the best effect when K is 40 or less. TDLDA is an effective topic model that can extract hot topics and evolution trends of blockchain patent texts, which helps researchers to more accurately grasp the research direction and improves the quality of project application and paper writing in the blockchain technology domain.
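The PMI-based evaluation of topic quality mentioned above can be illustrated with a small sketch. The abstract does not say how probabilities are estimated, so this version assumes document-level co-occurrence counts and averages PMI over pairs of a topic's top words, skipping pairs that never co-occur; the function names and the toy corpus are hypothetical.

```python
import math
from itertools import combinations


def pair_pmi(w1, w2, documents):
    """PMI(w1, w2) = log( p(w1, w2) / (p(w1) * p(w2)) ), with probabilities
    estimated as the fraction of documents containing the word(s).
    Returns None when the pair never co-occurs (PMI undefined)."""
    doc_sets = [set(doc) for doc in documents]
    n = len(doc_sets)
    p1 = sum(w1 in d for d in doc_sets) / n
    p2 = sum(w2 in d for d in doc_sets) / n
    p12 = sum(w1 in d and w2 in d for d in doc_sets) / n
    if p12 == 0:
        return None
    return math.log(p12 / (p1 * p2))


def topic_pmi(top_words, documents):
    """Average PMI over all co-occurring pairs of a topic's top words.
    Simplification (assumption): pairs that never co-occur are skipped."""
    scores = [pair_pmi(a, b, documents) for a, b in combinations(top_words, 2)]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else 0.0


# Toy corpus of tokenized patent texts (hypothetical).
docs = [
    ["block", "chain", "ledger"],
    ["block", "chain"],
    ["patent", "ledger"],
    ["patent", "claim"],
]
score = topic_pmi(["block", "chain"], docs)
```

A higher average PMI indicates that a topic's top words tend to appear together more often than chance would predict, which is commonly read as a sign of a more coherent topic.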

