Web Summarization and Browsing Through Semantic Tag Clouds

Author(s):  
Antonio M. Rinaldi

Managing electronic documents remains an open issue in the digital era. It becomes a challenging problem on the internet, where the large amount of data calls for more efficient and effective methods and techniques for mining and representing information. In this context, document summarization, browsing processes and visualization techniques have had a great impact on several dimensions of user information perception. The use of ontologies for knowledge representation has also grown rapidly in recent years across several application domains, together with social-based techniques such as tag clouds. This form of visualization is becoming particularly useful in the interaction between users and social applications, where a huge amount of data requires effective and efficient interfaces. In this article, the authors propose a novel methodology, based on a combination of ontologies and tag clouds, for browsing and summarizing web document collections; they call this tool the Semantic Tag Cloud.

2003 ◽  
Vol 12 (4) ◽  
pp. 320-332 ◽  
Author(s):  
Maria Halkidi ◽  
Benjamin Nguyen ◽  
Iraklis Varlamis ◽  
Michalis Vazirgiannis

The themes and examples examined in this chapter concern the fast-growing field of visualization. First, basic terms (data, information, knowledge, dimensions, and variables) are discussed before turning to visualization issues. The next part of the text gives an overview of the basics of visualization techniques (data, information, and knowledge visualization) and describes tools and techniques used in visualization, such as data mining, clusters and biclustering, concept mapping, knowledge maps, network visualization, Web-search result visualization, open source intelligence, visualization of the Semantic Web, visual analytics, and tag cloud visualization. This is followed by some remarks on music visualization. The next part of the chapter is about the meaning and the role of visualization in various kinds of presentations. The discussion covers concept visualization in visual learning, visualization in education, collaborative visualization, professions that employ visualization skills, and well-known examples of visualization that have advanced science. Comments on cultural heritage knowledge visualization conclude the chapter.


Author(s):  
Mohd Vasim Ahamad ◽  
Misbahul Haque ◽  
Mohd Imran

In the present digital era, more data are generated and collected than ever before. However, this huge amount of data is of no use until it is converted into useful information. Such data, coming from many sources in various formats and with greater complexity, is called big data. To convert big data into meaningful information, the authors use different analytical approaches. Information extracted by applying big data analytics methods can be used in business decision making, fraud detection, healthcare services, the education sector, machine learning, extreme personalization, etc. This chapter presents the basics of big data and big data analytics. Big data analysts face many challenges in storing, managing, and analyzing big data, and the chapter details the challenges in each of these dimensions. Furthermore, recent trends in big data analytics and future directions for big data researchers are also described.


Author(s):  
Sumit Arun Hirve ◽  
Pradeep Reddy C. H.

Traditional data visualization techniques are still immature: they suffer from several challenges and lack the ability to handle huge amounts of data, particularly at the gigabyte and terabyte scale. In this research, we propose an R tool and data analytics framework for handling a large volume of stored commercial market data and discovering knowledge patterns in the dataset in order to convey the derived conclusions. In this chapter, we elaborate on pre-processing a commercial market dataset using the R tool and its packages for information and visual analytics. We suggest a recommendation system that identifies whether a food entry inserted into the database is hygienic or non-hygienic, based on its quality-preservation attributes. For a precise recommendation system with strong predictive accuracy, we focus on algorithms such as J48 and Naive Bayes and use whichever performs better in an accuracy comparison. Such a system, when combined with the R language, can potentially be used for enhanced decision making.


2018 ◽  
Vol 9 (2) ◽  
pp. 46-68
Author(s):  
Omar Khrouf ◽  
Kais Khrouf ◽  
Jamel Feki

The number of textual documents generated and stored has exploded in recent years. Effective management of these documents is essential for better exploitation in decisional analyses. In this context, the authors first propose their CobWeb multidimensional model, which is based on standard facets and dedicated to the OLAP (on-line analytical processing) of XML documents; it aims to provide decision makers with facilities for expressing their analytical queries. Secondly, they suggest new visualization operators for OLAP query results, introducing tag clouds as a means of helping decision makers display OLAP results in an intuitive format and focus on the main concepts. The authors have developed a software prototype called MQF (Multidimensional Query based on Facets) to support their proposals and tested it on documents from the PubMed collection.


2020 ◽  
pp. 147387162096663
Author(s):  
Úrsula Torres Parejo ◽  
Jesús R Campaña ◽  
M Amparo Vila ◽  
Miguel Delgado

Tag clouds are tools that have been widely used on the Internet since their conception. The main applications of these textual visualizations are information retrieval, content representation and browsing of the original text from which the tags are generated. Despite the extensive use of tag clouds, their enormous popularity and the amount of research on different aspects of them, few studies have summarized their most important features when they serve as tools for information retrieval and content representation. In this paper we present a summary of the main characteristics of tag clouds found in the literature, such as their different functions, designs and negative aspects. We also summarize the most popular metrics used to capture the structural properties of a tag cloud generated from query results, as well as other measures for evaluating the quality of a tag cloud as a tool for content representation. The different methods for tagging and the semantic association processes in tag clouds are also considered. Finally, we give a list of alternative visual interfaces, which makes this study a useful starting point for researchers who want to study content representation and information retrieval interfaces in greater depth.


Author(s):  
Yusef Hassan Montero ◽  
Víctor Herrero-Solana ◽  
Vicente Guerrero-Bote

Tag clouds are interface components in the form of a compact list of keywords that allow the user to explore and browse document collections. Although they have enjoyed great popularity on the Web in recent years, as visual interfaces for information retrieval they also present evident usability problems. This work investigates the usability of tag clouds through a literature review and a user study employing eye-tracking techniques. The results show that both font size and the shape of the tag cloud clearly influence users' visual exploration. Regarding tag ordering, although alphabetical ordering offers no advantage in exploratory search tasks, the results suggest that semantic grouping does not improve efficiency in visually locating tags either. Finally, possible improvements to the presentation of semantically grouped tag clouds are proposed, along with future lines of research.


Author(s):  
Muhammad Abulaish ◽  
Tarique Anwar

Tag clouds have become an effective tool for quickly perceiving the most prominent terms embedded within textual data. They help readers grasp the main theme of a corpus without exploring the pile of documents. However, the effectiveness of tag clouds in conceptualizing text corpora is directly proportional to the quality of the tags. In this paper, the authors propose a keyphrase-based tag cloud generation framework. In contrast to existing tag cloud generation systems, which use single words as tags and their frequency counts to determine the font size of the tags, the proposed framework identifies feasible keyphrases and uses them as tags. The font size of a keyphrase is determined as a function of its relevance weight. Instead of using partial or full parsing, which is inefficient for lengthy sentences and inaccurate for sentences that do not follow proper grammatical structure, the proposed method applies n-gram techniques followed by various heuristics-based refinements to identify candidate phrases from text documents. A rich set of lexical and semantic features is identified to characterize the candidate phrases and determine their keyphraseness and relevance weights. The authors also propose a font-size determination function, which uses the relevance weights of the keyphrases to determine their relative font size for tag cloud visualization. The efficacy of the proposed framework is established through experimentation and comparison with existing state-of-the-art tag cloud generation methods.
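
The two steps named in this abstract, n-gram candidate extraction and weight-driven font sizing, can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not give their heuristics, features, or font-size function, so the sketch assumes plain contiguous n-grams as candidates, uses raw counts as a stand-in for relevance weights, and maps weights to font sizes by linear scaling. The names `candidate_ngrams` and `font_sizes` and the pixel range are hypothetical.

```python
from collections import Counter


def candidate_ngrams(tokens, max_n=3):
    """Return every contiguous n-gram (n = 1..max_n) as a tuple of tokens.

    Assumption: no heuristic filtering is applied; the abstract mentions
    refinements but does not describe them.
    """
    return [
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def font_sizes(weights, min_px=10, max_px=36):
    """Linearly map weights to integer font sizes in [min_px, max_px].

    Assumption: linear scaling is an illustrative choice, not the
    font-size determination function proposed in the paper.
    """
    lo, hi = min(weights.values()), max(weights.values())
    if hi == lo:
        # All weights equal: give every tag the midpoint size.
        mid = (min_px + max_px) // 2
        return {tag: mid for tag in weights}
    scale = (max_px - min_px) / (hi - lo)
    return {tag: round(min_px + (w - lo) * scale) for tag, w in weights.items()}


tokens = "tag clouds help users browse tag clouds".split()
# Raw counts stand in for relevance weights (an assumption for this sketch).
counts = Counter(candidate_ngrams(tokens, max_n=2))
sizes = font_sizes({" ".join(gram): c for gram, c in counts.items()})
```

In this toy input the phrase "tag clouds" occurs twice and therefore receives the largest size, while phrases seen once receive the smallest; a real system would replace the counts with weights derived from lexical and semantic features.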


Author(s):  
Alison Smith ◽  
Tak Yeon Lee ◽  
Forough Poursabzi-Sangdeh ◽  
Jordan Boyd-Graber ◽  
Niklas Elmqvist ◽  
...  

Probabilistic topic models are important tools for indexing, summarizing, and analyzing large document collections by their themes. However, promoting end-user understanding of topics remains an open research problem. We compare labels generated by users given four topic visualization techniques—word lists, word lists with bars, word clouds, and network graphs—against each other and against automatically generated labels. Our basis of comparison is participant ratings of how well labels describe documents from the topic. Our study has two phases: a labeling phase where participants label visualized topics and a validation phase where different participants select which labels best describe the topics’ documents. Although all visualizations produce similar quality labels, simple visualizations such as word lists allow participants to quickly understand topics, while complex visualizations take longer but expose multi-word expressions that simpler visualizations obscure. Automatic labels lag behind user-created labels, but our dataset of manually labeled topics highlights linguistic patterns (e.g., hypernyms, phrases) that can be used to improve automatic topic labeling algorithms.


2013 ◽  
Vol 760-762 ◽  
pp. 2060-2063
Author(s):  
Yuan Zhang ◽  
Yun Lin

Tag clouds are now very popular on websites because they can engage web users in effective information retrieval. Although there has been much research on English tag clouds in recent years, little is known about Chinese tag clouds. In this paper, we investigate the layout of the Chinese tag cloud and analyze eight of its visual features according to users' browsing behavior. Our results could provide a theoretical reference for tag cloud designers.

