Topic Models For Feature Selection in Document Clustering

Author(s): Anna Drummond, Zografoula Vagena, Chris Jermaine
2013, Vol 22 (05), pp. 1360008
Author(s): Patricia J. Crossno, Andrew T. Wilson, Timothy M. Shead, Warren L. Davis, Daniel M. Dunlavy

We present a new approach for analyzing topic models using visual analytics. We have developed TopicView, an application for visually comparing and exploring multiple models of text corpora, as a prototype for this type of analysis tool. TopicView uses multiple linked views to visually analyze conceptual and topical content, document relationships identified by models, and the impact of models on the results of document clustering. As case studies, we examine models created using two standard approaches: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Conceptual content is compared through the combination of (i) a bipartite graph matching LSA concepts with LDA topics based on the cosine similarities of model factors and (ii) a table containing the terms for each LSA concept and LDA topic listed in decreasing order of importance. Document relationships are examined through the combination of (i) side-by-side document similarity graphs, (ii) a table listing the weights for each document's contribution to each concept/topic, and (iii) a full text reader for documents selected in either of the graphs or the table. The impact of LSA and LDA models on document clustering applications is explored through similar means, using proximities between documents and cluster exemplars for graph layout edge weighting and table entries. We demonstrate the utility of TopicView's visual approach to model assessment by comparing LSA and LDA models of several example corpora.
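
The concept-to-topic comparison described in the abstract can be illustrated with a minimal sketch. It assumes each LSA concept and each LDA topic is available as a row vector of term weights over a shared vocabulary. The function names, the toy matrices, and the 0.5 edge threshold are hypothetical and are not taken from TopicView itself; the abstract does not say how edges are filtered or how LSA sign ambiguity is handled, so the use of absolute values here is an assumption.

```python
import numpy as np

def cosine_similarity_matrix(lsa_factors, lda_factors):
    """Return an (n_concepts x n_topics) matrix of cosine similarities
    between rows of the two term-weight matrices."""
    a = np.asarray(lsa_factors, dtype=float)
    b = np.asarray(lda_factors, dtype=float)
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

def bipartite_edges(similarity, threshold=0.5):
    """List (concept, topic, weight) edges with |cosine| >= threshold.
    The threshold is an illustrative assumption, not a documented setting."""
    rows, cols = np.nonzero(np.abs(similarity) >= threshold)
    return [(int(i), int(j), float(similarity[i, j])) for i, j in zip(rows, cols)]

def top_terms(factor, vocabulary, k=3):
    """Return the k terms of one concept or topic in decreasing order of
    importance, measured here by absolute weight (an assumption)."""
    order = np.argsort(-np.abs(np.asarray(factor, dtype=float)))[:k]
    return [vocabulary[i] for i in order]

# Toy data: 2 LSA concepts and 2 LDA topics over a 4-term vocabulary.
vocabulary = ["topic", "model", "graph", "cluster"]
lsa = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 1.0, 0.0]]
lda = [[0.9, 0.1, 0.0, 0.0],   # each LDA row sums to 1
       [0.0, 0.5, 0.5, 0.0]]

sim = cosine_similarity_matrix(lsa, lda)
edges = bipartite_edges(sim)
for concept, topic, weight in edges:
    print(f"LSA concept {concept} <-> LDA topic {topic}: {weight:.3f}")
print(top_terms(lda[0], vocabulary, k=2))
```

In this toy case each concept pairs with one topic. With real models a concept may connect to several topics or to none, which is the kind of mismatch a bipartite view would expose.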


2017, Vol 84, pp. 24-36
Author(s): Laith Mohammad Abualigah, Ahamad Tajudin Khader, Mohammed Azmi Al-Betar, Osama Ahmad Alomari

Author(s): El Moukhtar Zemmouri, Bouchra Frikh, Brahim Ouhbi, Asmaa Benghabrit, Hicham Behja
