A Visual Analytics Approach for Interactive Document Clustering

We present a new approach for analyzing topic models using visual analytics. We have developed TopicView, an application for visually comparing and exploring multiple models of text corpora, as a prototype for this type of analysis tool. TopicView uses multiple linked views to visually analyze conceptual and topical content, document relationships identified by models, and the impact of models on the results of document clustering. As case studies, we examine models created using two standard approaches: Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). Conceptual content is compared through the combination of (i) a bipartite graph matching LSA concepts with LDA topics based on the cosine similarities of model factors and (ii) a table containing the terms for each LSA concept and LDA topic listed in decreasing order of importance. Document relationships are examined through the combination of (i) side-by-side document similarity graphs, (ii) a table listing the weights for each document's contribution to each concept/topic, and (iii) a full text reader for documents selected in either of the graphs or the table. The impact of LSA and LDA models on document clustering applications is explored through similar means, using proximities between documents and cluster exemplars for graph layout edge weighting and table entries. We demonstrate the utility of TopicView's visual approach to model assessment by comparing LSA and LDA models of several example corpora.
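The concept/topic matching described in the abstract can be illustrated with a minimal sketch. It assumes each LSA concept and LDA topic is available as a row vector of term weights over a shared vocabulary. The function names, the toy vocabulary, the factor values, and the 0.5 threshold are all hypothetical and are not taken from TopicView.

```python
import numpy as np

def cosine_similarity_matrix(lsa_factors, lda_factors):
    """Cosine similarity between every LSA concept and every LDA topic.

    Both inputs are (n_factors, n_terms) arrays over the same vocabulary.
    """
    a = lsa_factors / np.linalg.norm(lsa_factors, axis=1, keepdims=True)
    b = lda_factors / np.linalg.norm(lda_factors, axis=1, keepdims=True)
    return a @ b.T

def bipartite_edges(similarity, threshold=0.5):
    """Keep (concept, topic, weight) edges whose |cosine| meets the threshold.

    The threshold is an assumption made for this sketch.
    """
    rows, cols = np.nonzero(np.abs(similarity) >= threshold)
    return [(int(i), int(j), float(similarity[i, j])) for i, j in zip(rows, cols)]

def top_terms(factor, vocabulary, k=3):
    """Terms of one concept/topic in decreasing order of absolute weight."""
    order = np.argsort(-np.abs(factor))[:k]
    return [vocabulary[i] for i in order]

# Toy data (hypothetical): two LSA concepts and two LDA topics over five terms.
vocabulary = ["graph", "node", "edge", "gene", "cell"]
lsa = np.array([[0.7, 0.6, 0.4, 0.0, 0.0],
                [0.0, 0.0, 0.1, 0.8, 0.6]])
lda = np.array([[0.05, 0.05, 0.10, 0.50, 0.30],
                [0.40, 0.30, 0.20, 0.05, 0.05]])

similarity = cosine_similarity_matrix(lsa, lda)
edges = bipartite_edges(similarity)
```

Each returned edge links one LSA concept to one LDA topic, and `top_terms` gives the kind of ordered term list that the abstract's table view describes.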

BrainTrawler: A Web-Based Visual Analytics Framework for Big Brain Network Data in their Spatial Context

10.26226/morressier.5b31ec402afeeb001345bd32 ◽

2018 ◽

Author(s):

Florian Ganglberger ◽

Katja Bühler

Keyword(s):

Visual Analytics ◽

Brain Network ◽

Spatial Context ◽

Network Data ◽

Web Based

Visual Analytics Need, Process, Scope, Tools and Techniques and Challenges

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i10.153158 ◽

2018 ◽

Vol 6 (10) ◽

pp. 153-158

Author(s):

R. Shankar ◽

S. Duraisamy

Keyword(s):

Visual Analytics ◽

Tools And Techniques

Open challenges in visual analytics for security information and event management

Information and Control Systems ◽

10.31799/1684-8853-2019-2-57-67 ◽

2019 ◽

pp. 57-67

Author(s):

E. S. Novikova ◽

I. V. Kotenko

Keyword(s):

Visual Analytics ◽

Event Management ◽

Security Information

Exploratory Visual Analytics of a Dynamically Built Network of Nodes in a WebGL-Enabled Browser

10.21236/ada597717 ◽

2014 ◽

Author(s):

Andrew M. Neiderer

Keyword(s):

Visual Analytics

Comparision of Different Distance Measure Methods in Text Document Clustering

INTERNATIONAL JOURNAL OF RESEARCH AND ENGINEERING ◽

10.21276/ijre.2018.5.7.2 ◽

2018 ◽

Vol 5 (7) ◽

Author(s):

Yin Min Tun ◽

Keyword(s):

Distance Measure ◽

Document Clustering ◽

Text Document ◽

Measure Methods

An Improved β-hill Climbing Optimization Technique for Solving the Text Documents Clustering Problem

Current Medical Imaging (formerly Current Medical Imaging Reviews) ◽

10.2174/1573405614666180903112541 ◽

2020 ◽

Vol 16 (4) ◽

pp. 296-306 ◽

Cited By ~ 3

Author(s):

Laith Mohammad Abualigah ◽

Essam Said Hanandeh ◽

Ahamad Tajudin Khader ◽

Mohammed Abdallh Otair ◽

Shishir Kumar Shandilya

Keyword(s):

Optimization Technique ◽

Document Clustering ◽

Text Clustering ◽

Hill Climbing ◽

Text Documents ◽

Clustering Problem ◽

Text Document ◽

Text Information ◽

Amount Of Knowledge ◽

The Hill

Background: As the volume of text documents on Internet pages grows, managing such a large amount of information becomes increasingly difficult. Text clustering is commonly treated as an optimization problem: organizing a large amount of text information into a set of coherent clusters of comparable documents. Aims: This paper presents a novel local search technique, β-hill climbing, for the text document clustering problem; the technique is modeled so that similar documents are partitioned into the same cluster. Methods: The β parameter is the primary innovation in the β-hill climbing technique. It is introduced to balance local and global search. Local search methods, such as k-medoid and k-means, have been applied successfully to the text document clustering problem. Results: Experiments were conducted on eight standard benchmark text datasets with different characteristics taken from the Laboratory of Computational Intelligence (LABIC). The proposed β-hill climbing achieved better results than the original hill climbing technique on the text clustering problem. Conclusion: Adding the β operator to hill climbing improves text clustering performance.
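To make the role of the β parameter concrete, here is a minimal sketch of hill climbing with a β-style perturbation applied to clustering. The abstract does not specify the objective function, the neighbourhood move, or the parameter values, so everything here is an assumption for illustration: the cost is the sum of squared distances to cluster centroids, the local move reassigns one document, and the β step re-draws each label at random with probability `beta`. It is not the authors' implementation.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def clustering_cost(docs, labels, k):
    """Sum of squared distances from each document to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = [d for d, lab in zip(docs, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        centroid = [sum(col) / len(members) for col in zip(*members)]
        cost += sum(sq_dist(m, centroid) for m in members)
    return cost

def beta_hill_climbing(docs, k, beta=0.05, iterations=2000, seed=0):
    """Hill climbing over cluster labels with an assumed beta perturbation."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in docs]
    best = clustering_cost(docs, labels, k)
    for _ in range(iterations):
        candidate = labels[:]
        # Local search step: reassign one randomly chosen document.
        candidate[rng.randrange(len(docs))] = rng.randrange(k)
        # Beta step (global exploration): re-draw each label with probability beta.
        for i in range(len(candidate)):
            if rng.random() < beta:
                candidate[i] = rng.randrange(k)
        cost = clustering_cost(docs, candidate, k)
        if cost <= best:  # accept only candidates that are no worse
            labels, best = candidate, cost
    return labels, best

# Toy document vectors (hypothetical): two well-separated groups.
docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, best = beta_hill_climbing(docs, k=2)
```

With `beta=0` the loop reduces to plain hill climbing over single reassignments. Larger values of `beta` perturb more labels per iteration, which is one way to read the abstract's statement that β balances local and global search.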
