Implementation and comparison of topic modeling techniques based on user reviews in e-commerce recommendations

Intricate user behaviors can be understood by discovering user interests from their reviews. Topic modeling techniques have been extensively explored to discover latent user interests from user reviews. However, a topic extracted by topic modeling techniques can be a mixture of several quite different concepts and thus less interpretable. In this paper, the authors present a method that uses topic modeling techniques to discover a large number of topics and applies hierarchical clustering to generate a much smaller number of interpretable User-Concerns. These User-Concerns are further compared with topics generated by Latent Dirichlet Allocation (LDA) and Pachinko Allocation Model (PAM) and shown to be more coherent and interpretable. The authors cut the linkage tree formed during the hierarchical clustering at different levels to generate a hierarchy of User-Concerns. They also discuss how collaborative-filtering-based recommendation systems can be enriched by infusing additional user-behavioral knowledge from such a hierarchy.
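
The clustering step described in this abstract can be illustrated with a minimal sketch. It assumes the topics are already available as word-probability vectors; the toy matrix, the cosine distance, and the average-linkage rule are illustrative assumptions, since the abstract does not specify them. Stopping the merging at different cluster counts plays the role of cutting the linkage tree at different levels.

```python
from itertools import combinations
from math import sqrt

# Hypothetical topic-word distributions over the vocabulary
# [battery, charge, screen, display, delivery, shipping].
# A real run would take these from a fitted topic model.
TOPICS = [
    [0.50, 0.40, 0.05, 0.05, 0.00, 0.00],
    [0.40, 0.50, 0.00, 0.10, 0.00, 0.00],
    [0.05, 0.05, 0.50, 0.40, 0.00, 0.00],
    [0.00, 0.10, 0.40, 0.50, 0.00, 0.00],
    [0.00, 0.00, 0.00, 0.00, 0.60, 0.40],
    [0.00, 0.00, 0.05, 0.00, 0.45, 0.50],
]


def cosine_distance(p, q):
    """Return 1 minus the cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


def agglomerate(topics, n_clusters):
    """Average-linkage agglomerative clustering, stopped at n_clusters."""
    clusters = [[i] for i in range(len(topics))]

    def average_distance(pair):
        a, b = pair
        total = sum(
            cosine_distance(topics[i], topics[j])
            for i in clusters[a]
            for j in clusters[b]
        )
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > n_clusters:
        # combinations yields a < b, so deleting b leaves index a valid.
        a, b = min(combinations(range(len(clusters)), 2), key=average_distance)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Two "cut levels" of the same hierarchy: a fine one and a coarse one.
fine = agglomerate(TOPICS, 3)
coarse = agglomerate(TOPICS, 2)
print(fine)
print(coarse)
```

In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` followed by `fcluster` would usually replace the hand-written loop; the pure-Python version is used here only to keep the sketch dependency-free.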

Analyzing the effect of brand building activities on brand image using topic modeling techniques

Proceedings of the 31st Annual ACM Symposium on Applied Computing - SAC '16 ◽

10.1145/2851613.2852010 ◽

2016 ◽

Author(s):

Kapil Kaushik

Keyword(s):

Topic Modeling ◽

Brand Image ◽

Brand Building ◽

Modeling Techniques ◽

Building Activities

On Building an Interpretable Topic Modeling Approach for the Urdu Language

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/740 ◽

2020 ◽

Author(s):

Zarmeen Nasim

Keyword(s):

Feature Extraction ◽

Deep Learning ◽

Topic Modeling ◽

Language Modeling ◽

Low Resource ◽

Modeling Approach ◽

Part Of Speech ◽

Modeling Techniques ◽

Pos Tagger

This research is an endeavor to combine deep-learning-based language modeling with classical topic modeling techniques to produce interpretable topics for a given set of documents in Urdu, a low-resource language. The existing topic modeling techniques produce a collection of words, often un-interpretable, as suggested topics without integrating them into a semantically correct phrase/sentence. The proposed approach would first build an accurate Part of Speech (POS) tagger for the Urdu Language using a publicly available corpus of many million sentences. Using semantically rich feature extraction approaches including Word2Vec and BERT, the proposed approach, in the next step, would experiment with different clustering and topic modeling techniques to produce a list of potential topics for a given set of documents. Finally, this list of topics would be sent to a labeler module to produce syntactically correct phrases that will represent interpretable topics.

Applying topic modeling techniques to degraded texts

Proceedings of the Sixth International Conference on Technological Ecosystems for Enhancing Multiculturality - TEEM'18 ◽

10.1145/3284179.3284319 ◽

2018 ◽

Author(s):

Carlos G. Figuerola

Keyword(s):

Topic Modeling ◽

Modeling Techniques

Examining the performance of topic modeling techniques in Twitter trends extraction

The International Conference on Information Networking 2014 (ICOIN2014) ◽

10.1109/icoin.2014.6799706 ◽

2014 ◽

Cited By ~ 1

Author(s):

Mutia N. Kurniati ◽

Woo-Jong Ryu ◽

Md. Hijbul Alam ◽

SangKeun Lee

Keyword(s):

Topic Modeling ◽

Modeling Techniques

TOPIC MODELING IN CLINICAL REPORTS - A SURVEY

International Journal of Advanced Information and Communication Technology ◽

10.46532/ijaict-2020002 ◽

2020 ◽

pp. 6-10

Author(s):

Ponmalar R ◽

Ponnarasi D ◽

Sangeetha A ◽

Kingsy Grace R

Keyword(s):

Text Mining ◽

Topic Modeling ◽

Unstructured Data ◽

Modeling Techniques ◽

Mining Tool ◽

Text Mining Tool ◽

Extract Information ◽

Meaningful Data

Text mining is a process of converting unstructured data into meaningful data. It may be loosely characterized as the process of analyzing text to extract information that is useful for particular purposes. Topic modeling is a form of text mining, a way of identifying patterns in a corpus. The topics produced by topic modeling techniques are clusters of similar words that frequently occur together. Topic modeling is also a frequently used text-mining tool for discovery of hidden semantic structures in a text body. Intuitively, if a document is about a particular topic, one would expect particular words to appear in the document more or less frequently. This paper presents a survey on topic modeling in clinical documents.
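
The intuition that topics are groups of words that frequently occur together can be sketched with plain co-occurrence counting. This is not a topic model, only an illustration of the signal such models exploit; the four short documents are invented for the example and do not come from the surveyed paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered word pair, how many documents contain both."""
    counts = Counter()
    for doc in documents:
        # Sorting makes each pair appear in one canonical (alphabetical) order.
        words = sorted(set(doc.lower().split()))
        counts.update(combinations(words, 2))
    return counts


# Hypothetical snippets standing in for clinical notes.
DOCS = [
    "patient reports chest pain",
    "chest pain after exercise",
    "patient prescribed insulin dose",
    "insulin dose adjusted",
]

counts = cooccurrence_counts(DOCS)
print(counts[("chest", "pain")])
print(counts[("dose", "insulin")])
print(counts[("chest", "insulin")])
```

Pairs such as ("chest", "pain") share documents while ("chest", "insulin") never do, which is the kind of pattern a topic model would group into separate topics.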

An Investigation on The Effectiveness of Employing Topic Modeling Techniques to Provide Topic Awareness For Conversational Agents

10.18653/v1/p16-3012 ◽

2016 ◽

Author(s):

Omid Moradiannasab

Keyword(s):

Topic Modeling ◽

Conversational Agents ◽

Modeling Techniques

Topic modeling in software engineering research

Empirical Software Engineering ◽

10.1007/s10664-021-10026-0 ◽

2021 ◽

Vol 26 (6) ◽

Author(s):

Camila Costa Silva ◽

Matthias Galster ◽

Fabian Gilson

Keyword(s):

Software Engineering ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Empirical Studies ◽

Engineering Research ◽

Bug Reports ◽

Textual Data ◽

Modeling Techniques ◽

Software Engineering Research ◽

Support Software

Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of textual documents. In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to support software engineering tasks (e.g., to support source code comprehension). Topic modeling needs to be applied carefully (e.g., depending on the type of textual data analyzed and modeling parameters). Our study aims at describing how topic modeling has been applied in software engineering research with a focus on four aspects: (1) which topic models and modeling techniques have been applied, (2) which textual inputs have been used for topic modeling, (3) how textual data was “prepared” (i.e., pre-processed) for topic modeling, and (4) how generated topics (i.e., word clusters) were named to give them a human-understandable meaning. We analyzed topic modeling as applied in 111 papers from ten highly-ranked software engineering venues (five journals and five conferences) published between 2009 and 2020. We found that (1) LDA and LDA-based techniques are the most frequent topic modeling techniques, (2) developer communication and bug reports have been modeled most, (3) data pre-processing and modeling parameters vary quite a bit and are often vaguely reported, and (4) manual topic naming (such as deducing names based on frequent words in a topic) is common.
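
Manual topic naming from frequent words, as mentioned in this abstract, typically starts from a ranked list of each topic's heaviest words. A minimal sketch of that ranking step follows; the vocabulary, the weights, and the helper name `top_words` are hypothetical and are not taken from the study.

```python
def top_words(topic_word_weights, vocabulary, k=3):
    """Return the k highest-weighted words of one topic, heaviest first."""
    ranked = sorted(
        zip(vocabulary, topic_word_weights),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [word for word, _ in ranked[:k]]


# Hypothetical vocabulary and per-topic word weights.
VOCAB = ["crash", "exception", "stacktrace", "merge", "branch", "commit"]
TOPIC_A = [0.30, 0.25, 0.20, 0.05, 0.10, 0.10]
TOPIC_B = [0.05, 0.05, 0.02, 0.28, 0.25, 0.35]

print(top_words(TOPIC_A, VOCAB))
print(top_words(TOPIC_B, VOCAB))
```

A human analyst might then label the first topic something like "runtime failures" and the second "version control"; that naming step remains a judgment call and is not automated here.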
