Probabilistic Topic Models
Recently Published Documents


TOTAL DOCUMENTS

61
(FIVE YEARS 8)

H-INDEX

17
(FIVE YEARS 2)

2021 ◽  
pp. 016224392110544
Author(s):  
T. E. de Wildt ◽  
I. R. van de Poel ◽  
E. J. L. Chappin

We propose a new approach for tracing value change. Value change may lead to a mismatch between current value priorities in society and the values for which technologies were designed in the past; energy technologies based on fossil fuels, for example, were developed when sustainability was not yet considered an important value. Better anticipation of value change is essential to avoid technologies losing social acceptance and moral acceptability. While value change can be studied historically and qualitatively, we propose a more quantitative approach based on large text corpora. It uses probabilistic topic models, which allow us to trace (new) values that are (still) latent. We demonstrate the approach for five types of value change in technology. The approach is useful for testing hypotheses about value change, such as verifying whether value change has occurred and identifying patterns of value change, and it can be applied to various technologies and text corpora, including scientific articles, newspaper articles, and policy documents.
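The core idea, tracing how topic prevalence shifts between time periods, can be sketched with a generic LDA implementation. This is a minimal illustration, not the authors' method: the corpus, period labels, and topic count below are invented, and scikit-learn's `LatentDirichletAllocation` stands in for whatever topic model the article uses.

```python
# Sketch: trace topic prevalence across time slices with a generic LDA.
# Corpus, periods, and topic count are invented for illustration only.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "coal power plant energy supply cheap reliable",
    "oil gas energy economic growth industry",
    "solar wind renewable energy sustainable climate",
    "climate change emissions sustainable policy energy",
]
periods = ["1990s", "1990s", "2010s", "2010s"]  # one label per document

X = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)  # rows: per-document topic distributions

# Average topic weight per period: a topic whose weight rises over time
# is a candidate for a value gaining salience (e.g. sustainability).
for period in sorted(set(periods)):
    mask = np.array([p == period for p in periods])
    print(period, doc_topic[mask].mean(axis=0).round(3))
```

Comparing the per-period averages is the simplest way to operationalize "value change" as a shift in latent-topic prevalence; the article's five types of value change would correspond to different temporal patterns in these curves.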


Information ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 221
Author(s):  
Andreas Hamm ◽  
Simon Odrowski

Network-based procedures for topic detection in large text collections offer an intuitive alternative to probabilistic topic models. We present in detail a method designed with the requirements of domain experts in mind. Like similar methods, it employs community detection in term co-occurrence graphs, but it is enhanced by a resolution parameter that can be used to change the targeted topic granularity. We also establish a term ranking and use semantic word embeddings to present term communities in a way that facilitates their interpretation. We demonstrate the application of our method on a widely used corpus of general news articles and report detailed social-science expert evaluations of the topics detected at various resolutions. A comparison with topics detected by latent Dirichlet allocation is also included. Finally, we discuss factors that influence topic interpretation.
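The general pipeline described here, building a term co-occurrence graph and extracting term communities at a chosen resolution, can be sketched with NetworkX. This is an assumed reconstruction of the generic technique, not the authors' implementation: the toy documents are invented, and NetworkX's `greedy_modularity_communities` (with its `resolution` argument) stands in for whatever community-detection algorithm the article uses.

```python
# Sketch: term co-occurrence graph + resolution-controlled community
# detection. Toy data and algorithm choice are illustrative assumptions.
from itertools import combinations
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

docs = [
    ["market", "stocks", "trade"],
    ["stocks", "trade", "economy"],
    ["match", "goal", "team"],
    ["team", "goal", "league"],
]

# Edge weight = number of documents in which both terms co-occur.
G = nx.Graph()
for tokens in docs:
    for u, v in combinations(sorted(set(tokens)), 2):
        w = G[u][v]["weight"] + 1 if G.has_edge(u, v) else 1
        G.add_edge(u, v, weight=w)

# Higher resolution favours more, smaller communities (finer-grained
# topics); lower resolution merges them into coarser topics.
communities = greedy_modularity_communities(G, weight="weight", resolution=1.0)
print([sorted(c) for c in communities])
```

Each resulting term community is read as one topic; re-running with a different `resolution` lets a domain expert zoom between coarse themes and fine subtopics, which is the tunable granularity the abstract emphasizes.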


2020 ◽  
Vol 1 (6) ◽  
Author(s):  
Anis Tissaoui ◽  
Salma Sassi ◽  
Richard Chbeir

2020 ◽  
Vol 46 (1) ◽  
pp. 95-134
Author(s):  
Shudong Hao ◽  
Michael J. Paul

Probabilistic topic modeling is a common first step in cross-lingual tasks, used to enable knowledge transfer and extract multilingual features. Although many multilingual topic models have been developed, their assumptions about the training corpus vary considerably, and it is not clear how well the different models perform under different training conditions. In this article, we systematically study the knowledge transfer mechanisms behind different multilingual topic models and, through a broad set of experiments with four models on ten languages, provide empirical insights that can inform the selection and future development of multilingual topic models.
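One common transfer mechanism in multilingual topic modeling is linking comparable documents across languages so they share a topic distribution. A crude way to illustrate this, not any specific model from the article, is to concatenate each aligned pair into one pseudo-document and train a single monolingual LDA on the result; the English/German snippets below are invented.

```python
# Sketch of document-link transfer: aligned pairs are concatenated so one
# LDA learns a shared topic space across languages. Data is invented and
# concatenation is a crude stand-in for a true multilingual topic model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

pairs = [
    ("football match goal team", "fussball spiel tor mannschaft"),
    ("football league team season", "fussball liga mannschaft saison"),
    ("bank market stocks economy", "bank markt aktien wirtschaft"),
    ("market trade economy growth", "markt handel wirtschaft wachstum"),
]
pseudo_docs = [en + " " + de for en, de in pairs]

X = CountVectorizer().fit_transform(pseudo_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)  # shared topic space for both languages

# Words from both languages that co-occur across linked documents end up
# in the same topics, which is the knowledge being transferred.
print(doc_topic.round(2))
```

The models compared in the article differ precisely in which such links they assume, e.g. parallel documents versus dictionary entries, which is why their behaviour diverges under different training conditions.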


2019 ◽  
Vol 17 (4) ◽  
pp. 365-379
Author(s):  
Mehdi Allahyari ◽  
Seyedamin Pouriyeh ◽  
Krys Kochut

2019 ◽  
Vol 46 (3) ◽  
pp. 406-418 ◽  
Author(s):  
Ekin Ekinci ◽  
Sevinç İlhan Omurca

Latent Dirichlet allocation (LDA) is a probabilistic topic model that discovers the latent topic structure of a document collection. The basic assumption underlying LDA is that each document is a probabilistic mixture of latent topics, each topic is a probability distribution over words, and each document is modelled as a bag of words. Topic models such as LDA are effective at learning hidden topics, but they do not take into account the deeper semantic knowledge of a document. In this article, we propose a novel method based on topic modelling to determine the latent aspects of online review documents. In the proposed model, called Concept-LDA, the feature space of reviews is enriched with concepts and named entities extracted with Babelfy, so that the resulting topics contain not only co-occurring words but also semantically related words. Performance in terms of topic coherence and topic quality is reported on 10 publicly available datasets, and it is demonstrated that Concept-LDA achieves better topic representations than LDA alone, as measured by topic coherence and F-measure. The topic representations learned by Concept-LDA enable accurate and straightforward aspect extraction in an aspect-based sentiment analysis system.
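The enrichment step behind this idea, extending each review's token list with concept labels before the bag-of-words is built, can be sketched in a few lines. The lookup table below is a hypothetical stand-in for Babelfy's concept and entity extraction, and the `bn:`-style labels are invented placeholders, not real BabelNet identifiers.

```python
# Sketch of the feature-space enrichment step: tokens are extended with
# concept labels before topic modelling. CONCEPTS is a hypothetical
# stand-in for Babelfy output; the labels are invented placeholders.
CONCEPTS = {
    "battery": "bn:energy_storage",
    "screen": "bn:display_device",
    "waiter": "bn:service_staff",
}

def enrich(tokens):
    """Return the tokens plus any concept labels their words map to."""
    return tokens + [CONCEPTS[t] for t in tokens if t in CONCEPTS]

review = ["the", "battery", "and", "screen", "are", "great"]
enriched = enrich(review)
print(enriched)
```

The enriched documents are then fed to a standard LDA, so reviews mentioning different surface words that map to the same concept can still be grouped into the same topic, which is what lifts topic coherence over plain LDA.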


Author(s):  
Murugan Anandarajan ◽  
Chelsey Hill ◽  
Thomas Nolan
