TopicInk: Visualizing Disaster-Related Textual Data Using LDA Topic Modeling: Vast Challenge 2019: Honorable Mention for Clear Articulation of Methodology

Abstract: Social media enable companies to assess consumers’ opinions, complaints and needs. The systematic, data-driven analysis of social media to generate business value is summarized under the term Social Media Analytics, which includes statistical, network-based and language-based approaches. We focus on textual data and investigate which conversation topics arise on Twitter during a new product introduction and how the overall sentiment develops during and after the event. The analysis with Natural Language Processing tools is conducted in two languages and four countries, so that cultural differences in tonality and customer needs can be identified for the product. Different methods of sentiment analysis and topic modeling are compared to assess their usability for social media and for the respective languages, English and German. Furthermore, we illustrate the importance of preprocessing steps when applying these methods and identify relevant product insights.

Topic modeling in software engineering research

Empirical Software Engineering, DOI 10.1007/s10664-021-10026-0, 2021, Vol 26 (6)

Author(s): Camila Costa Silva, Matthias Galster, Fabian Gilson

Keyword(s): Software Engineering, Topic Modeling, Latent Dirichlet Allocation, Empirical Studies, Engineering Research, Bug Reports, Textual Data, Modeling Techniques, Software Engineering Research, Support Software

Abstract: Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of textual documents. In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to support software engineering tasks (e.g., to support source code comprehension). Topic modeling needs to be applied carefully (e.g., depending on the type of textual data analyzed and modeling parameters). Our study aims at describing how topic modeling has been applied in software engineering research with a focus on four aspects: (1) which topic models and modeling techniques have been applied, (2) which textual inputs have been used for topic modeling, (3) how textual data was “prepared” (i.e., pre-processed) for topic modeling, and (4) how generated topics (i.e., word clusters) were named to give them a human-understandable meaning. We analyzed topic modeling as applied in 111 papers from ten highly-ranked software engineering venues (five journals and five conferences) published between 2009 and 2020. We found that (1) LDA and LDA-based techniques are the most frequent topic modeling techniques, (2) developer communication and bug reports have been modelled most, (3) data pre-processing and modeling parameters vary considerably and are often vaguely reported, and (4) manual topic naming (such as deducing names based on frequent words in a topic) is common.
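To make the idea of extracting word clusters from a corpus concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not taken from the paper: the function names `tokenize` and `fit_lda`, the stop-word list, the toy corpus, and the hyperparameter values (`alpha`, `beta`, the iteration count) are all illustrative assumptions.

```python
import random
from collections import Counter

def tokenize(text, stopwords):
    """Toy pre-processing: lowercase, keep alphabetic tokens, drop stop words."""
    words = "".join(c.lower() if c.isalpha() else " " for c in text).split()
    return [w for w in words if w not in stopwords]

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over a list of token lists (a sketch)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    # Most frequent words per topic, which a human would then name.
    top_words = [[w for w, c in tw.most_common(3) if c > 0] for tw in topic_word]
    return theta, top_words

STOPWORDS = {"the", "a", "in", "on", "when", "is", "and", "to"}
corpus = [
    "The build fails when the compiler crashes on the parser",
    "Compiler crash in the parser module",
    "Users discuss the login page and the password reset",
    "Password reset page is broken",
]
docs = [tokenize(text, STOPWORDS) for text in corpus]
theta, top_words = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

On a corpus this small the resulting topics are not meaningful; the sketch only shows where pre-processing choices and modeling parameters enter, which are the details the study reports as often vaguely documented.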

Topic Modeling as a Strategy of Inquiry in Organizational Research: A Tutorial With an Application Example on Organizational Culture

Organizational Research Methods, DOI 10.1177/1094428118773858, 2018, Vol 22 (4), pp. 941-968, Cited By ~ 19

Author(s): Theresa Schmiedel, Oliver Müller, Jan vom Brocke

Keyword(s): Organizational Culture, Text Mining, Topic Modeling, Future Research, Data Sets, Organizational Research, Fortune 500 Companies, Textual Data, Complementary Strategy, Depth Interviews

Research has emphasized the limitations of qualitative and quantitative approaches to studying organizational phenomena. For example, in-depth interviews are resource-intensive, while questionnaires with closed-ended questions can only measure predefined constructs. With the recent availability of large textual data sets and increased computational power, text mining has become an attractive method that has the potential to mitigate some of these limitations. Thus, we suggest applying topic modeling, a specific text mining technique, as a new and complementary strategy of inquiry to study organizational phenomena. In particular, we outline the potential of structural topic modeling for organizational research and provide a step-by-step tutorial on how to apply it. Our application example builds on 428,492 reviews of Fortune 500 companies from the online platform Glassdoor, on which employees can evaluate organizations. We demonstrate how structural topic models allow researchers to inductively identify topics that matter to employees and quantify their relationship with employees’ perception of organizational culture. We discuss the advantages and limitations of topic modeling as a research method and outline how future research can apply the technique to study organizational phenomena.

Avoiding the Ambiguity with Textual Data: Topic Modeling for Linear Models

SSRN Electronic Journal, DOI 10.2139/ssrn.3014118, 2017

Author(s): Yuan Cheng, Shawn Mankad

Keyword(s): Topic Modeling, Linear Models, Textual Data

Integrating Big Data Into Evaluation: R Code for Topic Identification and Modeling

American Journal of Evaluation, DOI 10.1177/10982140211031640, 2021, pp. 109821402110316

Author(s): Dakota W. Cintron, Bianca Montrosse-Moorhead

Keyword(s): Big Data, Topic Modeling, Data Analytics, Latent Dirichlet Allocation, Big Data Analytics, Specific Topic, Topic Identification, Ethical Concerns, Textual Data, R Packages

Despite the rising popularity of big data, there is speculation that evaluators have been slow adopters of these new statistical approaches. Several possible reasons have been offered for why this is the case: ethical concerns, institutional capacity, and evaluator capacity and values. In this method note, we address one of these barriers and aim to build evaluator capacity to integrate big data analytics into their studies. We focus our efforts on a specific topic modeling technique referred to as latent Dirichlet allocation (LDA) because of the ubiquity of qualitative textual data in evaluation. Given current equity debates, both within evaluation and the communities in which we practice, we analyze 1,796 tweets that use the hashtag #equity with the R packages topicmodels and ldatuning to illustrate the use of LDA. Furthermore, a freely available workbook for implementing LDA topic modeling is provided as Supplemental Material Online.

Text Mining in Organizational Research

Organizational Research Methods, DOI 10.1177/1094428117722619, 2017, Vol 21 (3), pp. 733-765, Cited By ~ 46

Author(s): Vladimer B. Kobayashi, Stefan T. Mol, Hannah A. Berkers, Gábor Kismihók, Deanne N. Den Hartog

Keyword(s): Text Mining, Knowledge Discovery, Topic Modeling, Analytical Techniques, Job Analysis, Organizational Research, Textual Data, Different Types, Research Questions, New Research

Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.
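As a small illustration of technique (b), distance and similarity computing, the following sketch compares short texts by the cosine similarity of their word-count vectors. It is an assumption-laden example rather than the article's procedure: the helper names `bag_of_words` and `cosine_similarity` and the job-title strings are hypothetical.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Represent a text as lowercase word counts."""
    words = "".join(c.lower() if c.isalnum() else " " for c in text).split()
    return Counter(words)

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical job-vacancy titles, used only to exercise the functions.
vacancy_a = bag_of_words("data analyst")
vacancy_b = bag_of_words("data engineer")
vacancy_c = bag_of_words("warehouse operator")

print(cosine_similarity(vacancy_a, vacancy_b))  # one of two words shared
print(cosine_similarity(vacancy_a, vacancy_c))  # no words shared
```

Raw counts are the simplest possible representation; weighting schemes such as TF-IDF are a common refinement and would slot in where `bag_of_words` builds the vector.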

Text-mining open-ended survey responses using structural topic modeling: A practical demonstration to understand parents’ coping methods during COVID-19 pandemic in Singapore

DOI 10.31219/osf.io/enzst, 2021

Author(s): Gerard Chung, Maria Rodriguez, Paul Lanier, Daniel Gibbs

Keyword(s): Text Mining, Topic Modeling, Topic Model, Research Process, Estimation Model, Coping Methods, Model Interpretation, Textual Data, Survey Responses, Structural Topic Modeling

Objective: Open-ended survey questions crucially contribute to researchers’ understanding of respondents’ experiences. However, analyzing open-ended responses using human coders is labor-intensive and prone to inconsistencies. Structural topic modeling (STM) is a text mining method that discovers topics from textual data. We demonstrate the use of STM to analyze open-ended survey responses to understand how parents coped during the COVID-19 lock-down in Singapore. Method: We administered online surveys to 199 parents in Singapore during the COVID-19 lock-down. To illustrate an STM analysis, we demonstrated a workflow that includes steps in data preprocessing, model estimation, model selection, and model interpretation. Results: An 18-topic model best fitted the data based on model diagnostics and researchers’ expertise. Prevalent coping methods described by respondents include “Spousal Support”, “Routines/Schedules” and “Managing Expectations”. Topic prevalence for some topics varies with respondents’ levels of parenting stress and whether parents were fathers or mothers. Conclusion: STM offers an efficient, valid, and replicable way to analyze textual data such as open-ended survey responses and case notes that can complement researchers’ knowledge and skills. STM can be used as part of a multistage research process or to support other analyses such as clarifying quantitative findings and identifying preliminary themes from qualitative data.

Topic Modeling in Management Research: Rendering New Theory from Textual Data

Academy of Management Annals, DOI 10.5465/annals.2017.0099, 2019, Vol 13 (2), pp. 586-632, Cited By ~ 39

Author(s): Timothy R. Hannigan, Richard F. J. Haans, Keyvan Vakili, Hovig Tchalian, Vern L. Glaser, ...

Keyword(s): Topic Modeling, Management Research, Textual Data

Modeling longitudinal dynamics in textual data

PsycEXTRA Dataset, DOI 10.1037/e482892008-001, 2003

Author(s): Kevin Dooley, Steven Corman

Keyword(s): Longitudinal Dynamics, Textual Data

Innovative Approach to Information Search by Example of a Patent Analysis of an Import Substitution Plan

Экономическая наука современной России, DOI 10.33293/1609-1442-2020-1(88)-143-157, 2020, pp. 143-157

Author(s): Maria A. Milkova

Keyword(s): Information Search, Topic Modeling, Cognitive Biases, A Priori, Import Substitution, Innovative Approach, Iterative Search, Comprehensive Picture, Priori Information, Selection Of

Information now accumulates so rapidly that the usual concept of iterative search requires revision. In a world oversaturated with information, covering and analyzing a problem comprehensively places high demands on search methods. An innovative approach to search should flexibly take into account the large amount of already accumulated knowledge and a priori requirements for the results. The results, in turn, should immediately provide a roadmap of the area being studied, with as much detail as possible. Search based on topic modeling, so-called topic search, meets all of these requirements and thereby streamlines work with information, increases the efficiency of knowledge production, and avoids cognitive biases in the perception of information, which matters at both the micro and macro level. To demonstrate topic search, the article analyzes an import substitution program using patent data. The program includes plans for 22 industries and contains more than 1,500 products and technologies proposed for import substitution. Patent search based on topic modeling makes it possible to search directly by blocks of a priori information (the terms of the industrial import substitution plans) and to obtain a selection of relevant documents for each industry. This approach not only provides a comprehensive picture of the effectiveness of the program as a whole, but also yields, in visual form, more detailed information about which groups of products and technologies have been patented.
