Document visualization: an overview of current research

Collections of scientific publications are growing rapidly, and scientists now have access to portals containing large numbers of documents. Such volumes of data are difficult to investigate manually. Document visualization methods are used to reduce labor costs, find relevant and similar documents, evaluate the scientific contribution of particular publications, and reveal hidden links between documents. These methods can be based on various models of document representation. In recent years, word embedding methods for natural language processing have become extremely popular, and methods that derive vector representations of whole documents from text collections have followed. Although many document analysis systems exist, new methods can offer new insights into collections, perform better on large collections, or uncover new relationships between documents. This article discusses two methods, Paper2vec and Cite2vec, that obtain vector representations of documents using citation information. It briefly describes both methods for analyzing collections of scientific publications and reports experiments with them, including visualizations of their results and a description of the problems that arise.

Document visualization on small displays

10.14711/thesis-b680868 ◽

2000 ◽

Author(s):

Ka Kit Hoi

Keyword(s):

Document Visualization

Making sense of an Electronic Document - Visualization Strategies for Concept Presentation

2006 10th IEEE International Enterprise Distributed Object Computing Conference Workshops (EDOCW'06) ◽

10.1109/edocw.2006.45 ◽

2006 ◽

Author(s):

Naresh Kumar Agarwal ◽

Danny C. C. Poo

Keyword(s):

Electronic Document ◽

Making Sense ◽

Document Visualization

Document Visualization on Small Displays

Mobile Data Management - Lecture Notes in Computer Science ◽

10.1007/3-540-36389-0_18 ◽

2002 ◽

pp. 262-278 ◽

Cited By ~ 2

Author(s):

Ka Kit Hoi ◽

Dik Lun Lee ◽

Jianliang Xu

Keyword(s):

Document Visualization

Document Visualization Based on Semantic Graphs

10.1109/iv.2009.57 ◽

2009 ◽

Cited By ~ 13

Author(s):

Delia Rusu ◽

Blaž Fortuna ◽

Dunja Mladenic ◽

Marko Grobelnik ◽

Ruben Sipoš

Keyword(s):

Document Visualization

Enhancing an Evolving Tree-based text document visualization model with Fuzzy c-Means clustering

2013 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) ◽

10.1109/fuzz-ieee.2013.6622363 ◽

2013 ◽

Cited By ~ 2

Author(s):

Wui Lee Chang ◽

Kai Meng Tay ◽

Chee Peng Lim

Keyword(s):

Fuzzy C Means ◽

Text Document ◽

Fuzzy C Means Clustering ◽

Evolving Tree ◽

Document Visualization

Exploiting extra-textual and linguistic information in keyphrase extraction

Natural Language Engineering ◽

10.1017/s1351324914000126 ◽

2014 ◽

Vol 22 (1) ◽

pp. 73-95 ◽

Cited By ~ 6

Author(s):

Gábor Berend

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Language Processing ◽

State Of The Art ◽

Keyphrase Extraction ◽

Textual Information ◽

Multiword Expressions ◽

Pos Tagging ◽

Multiple Datasets ◽

Document Visualization

Keyphrases are the most important phrases of documents, which makes them suitable for improving natural language processing tasks, including information retrieval, document classification, document visualization, summarization and categorization. Here, we propose a supervised framework augmented by novel extra-textual information derived primarily from Wikipedia. Wikipedia is utilized in such an advantageous way that, unlike most other methods relying on Wikipedia, a full textual index of all the Wikipedia articles is not required by our approach, as we only exploit the category hierarchy and a list of multiword expressions derived from Wikipedia. This approach is not only less resource intensive, but also produces comparable or superior results compared to previous similar works. Our thorough evaluations also suggest that the proposed framework performs consistently well on multiple datasets, being competitive with, or even outperforming, the results obtained by other state-of-the-art methods. Besides introducing features that incorporate extra-textual information, we also experimented with a novel way of representing features that are derived from the POS tagging of the keyphrase candidates.

Using Luhn’s Automatic Abstract Method to Create Graphs of Words for Document Visualization

Social Networking ◽

10.4236/sn.2014.32008 ◽

2014 ◽

Vol 03 (02) ◽

pp. 65-70

Author(s):

Luiz Cláudio Santos Silva ◽

Renelson Ribeiro Sampaio

Keyword(s):

Abstract Method ◽

Document Visualization

Information Access in the Digital Era - Document Visualization Strategy

10.31219/osf.io/wyjs7 ◽

2020 ◽

Author(s):

Francisco Carlos Paletta ◽

Armando Manuel Barreiros da Silva

Keyword(s):

Computational Intelligence ◽

Information Access ◽

Digital Transformation ◽

Decision Making Process ◽

Access To Information ◽

Digital Era ◽

Knowledge Organization Systems ◽

Computational Intelligence Methods ◽

Visualization Strategy ◽

Document Visualization

In this work, we focus on the document visualization strategy to support access to information in the digital era. First, we discuss the dynamics of the document visualization approach and its ability to generate innovations with a direct impact on the competitive digital transformation scenario. Second, we discuss visualization and computational intelligence methods, such as data mining and knowledge discovery, as important tools for improving the decision-making process. Then we present the concept of knowledge organization systems and the main challenges related to the document visualization strategy. Finally, we discuss how digital and visual literacies have become common to the way we read and view information and communicate with others, in order to meet the demands of the transformations of the digital age.