Evaluating Visual Representations for Topic Understanding and Their Effects on Manually Generated Topic Labels

Probabilistic topic models are important tools for indexing, summarizing, and analyzing large document collections by their themes. However, promoting end-user understanding of topics remains an open research problem. We compare labels generated by users given four topic visualization techniques—word lists, word lists with bars, word clouds, and network graphs—against each other and against automatically generated labels. Our basis of comparison is participant ratings of how well labels describe documents from the topic. Our study has two phases: a labeling phase where participants label visualized topics and a validation phase where different participants select which labels best describe the topics’ documents. Although all visualizations produce similar quality labels, simple visualizations such as word lists allow participants to quickly understand topics, while complex visualizations take longer but expose multi-word expressions that simpler visualizations obscure. Automatic labels lag behind user-created labels, but our dataset of manually labeled topics highlights linguistic patterns (e.g., hypernyms, phrases) that can be used to improve automatic topic labeling algorithms.

Improving Topic Models with Latent Feature Word Representations

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00140 ◽

2015 ◽

Vol 3 ◽

pp. 299-313 ◽

Cited By ~ 94

Author(s):

Dat Quoc Nguyen ◽

Richard Billingsley ◽

Lan Du ◽

Mark Johnson

Keyword(s):

High Performance ◽

Feature Vector ◽

Topic Models ◽

Document Collections ◽

Probabilistic Topic Models ◽

New Models ◽

Classification Tasks ◽

Latent Topics ◽

Vector Representations ◽

Feature Word

Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks. In this paper, we extend two different Dirichlet multinomial topic models by incorporating latent feature vector representations of words trained on very large corpora to improve the word-topic mapping learnt on a smaller corpus. Experimental results show that by using information from the external corpora, our new models produce significant improvements on topic coherence, document clustering and document classification tasks, especially on datasets with few or short documents.

Model of stochastic process of restoration of working capacity of agricultural machine in inertial systems with delay

Naukovij žurnal «Tehnìka ta energetika» ◽

10.31548/machenergy2020.03.143 ◽

2020 ◽

Vol 11 (3) ◽

pp. 143-150

Author(s):

I. L. Rogovskii ◽

Keyword(s):

Research Problem ◽

Theoretical Research ◽

Working Capacity ◽

Maintenance System ◽

Healthy State ◽

Maintenance Service ◽

System Maintenance ◽

Basic Functions ◽

Two Phases ◽

Complex Indicators

The article analyzes existing agricultural machines in a healthy state, followed by work on the maintenance system under the conditions of reform of the agrarian sector. Maintenance refers to the complex of works that preserve the working capacity or serviceability of products during use, through adjusting, knowledgeable, filling and retaining work. To assess the alternatives, it is advisable to conduct a morphological analysis of the entire set of possible solutions to the research problem, presented in a morphological matrix that lists the basic functions of the machine and the options for implementing them. Theoretical research answered two fundamental questions: how the maintenance system should change depending on the level of development of agricultural production, and what parameters the maintenance service system must have to perform the appropriate intervention with the minimum technologically necessary expenditure of resources and investment. The level of maintenance of agricultural machinery is assessed on the totality of organizational and technical factors, formalized through partial and complex indicators, in two phases. The first phase evaluates the indicators for each factor separately. The second assesses a complex indicator (for all factors).

StanceVis Prime: visual analysis of sentiment and stance in social media texts

Journal of Visualization ◽

10.1007/s12650-020-00684-5 ◽

2020 ◽

Vol 23 (6) ◽

pp. 1015-1034

Author(s):

Kostiantyn Kucher ◽

Rafael M. Martins ◽

Carita Paradis ◽

Andreas Kerren

Keyword(s):

Social Media ◽

Visual Analytics ◽

Visual Analysis ◽

Research Problem ◽

Data Sources ◽

Data Series ◽

Text Documents ◽

Document Collections ◽

Social Media Data ◽

Media Data

Text visualization and visual text analytics methods have been successfully applied for various tasks related to the analysis of individual text documents and large document collections such as summarization of main topics or identification of events in discourse. Visualization of sentiments and emotions detected in textual data has also become an important topic of interest, especially with regard to the data originating from social media. Despite the growing interest in this topic, the research problem related to detecting and visualizing various stances, such as rudeness or uncertainty, has not been adequately addressed by the existing approaches. The challenges associated with this problem include the development of the underlying computational methods and visualization of the corresponding multi-label stance classification results. In this paper, we describe our work on a visual analytics platform, called StanceVis Prime, which has been designed for the analysis of sentiment and stance in temporal text data from various social media data sources. The use case scenarios intended for StanceVis Prime include social media monitoring and research in sociolinguistics. The design was motivated by the requirements of collaborating domain experts in linguistics as part of a larger research project on stance analysis. Our approach involves consuming documents from several text stream sources and applying sentiment and stance classification, resulting in multiple data series associated with source texts. StanceVis Prime provides the end users with an overview of similarities between the data series based on dynamic time warping analysis, as well as detailed visualizations of data series values. Users can also retrieve and conduct both distant and close reading of the documents corresponding to the data series. We demonstrate our approach with case studies involving political targets of interest and several social media data sources and report preliminary user feedback received from a domain expert.
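
The abstract above mentions comparing data series by dynamic time warping (DTW). As a rough illustration only, and not the StanceVis Prime implementation, the following sketch computes the textbook DTW distance between two numeric series. The function name `dtw_distance` and the absolute-difference local cost are assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Hypothetical helper for illustration; uses |x - y| as the local cost.
    Runs in O(len(a) * len(b)) time and memory.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Two series with the same shape but different pacing align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because DTW lets one series stretch or compress in time, series with similar shape but shifted timing score as close, which is presumably why it suits an overview of similarity between temporal data series.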

Segmentation and Identification of Vertebrae in CT Scans Using CNN, k-Means Clustering and k-NN

Informatics ◽

10.3390/informatics8020040 ◽

2021 ◽

Vol 8 (2) ◽

pp. 40

Author(s):

Nicola Altini ◽

Giuseppe De Giosa ◽

Nicola Fragasso ◽

Claudia Coscia ◽

Elena Sibilano ◽

...

Keyword(s):

Machine Learning ◽

Large Scale ◽

Machine Learning Algorithms ◽

Ct Scans ◽

Computer Assisted ◽

Dice Coefficient ◽

Labeling Algorithms ◽

Two Phases ◽

Vertebrae Segmentation ◽

Whole Spine

The accurate segmentation and identification of vertebrae provides the foundation for spine analysis, including fractures, malfunctions and other visual insights. The large-scale vertebrae segmentation challenge (VerSe), organized as a competition at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference, is aimed at vertebrae segmentation and labeling. In this paper, we propose a framework that addresses the tasks of vertebrae segmentation and identification by exploiting both deep learning and classical machine learning methodologies. The proposed solution comprises two phases: a binary fully automated segmentation of the whole spine, which exploits a 3D convolutional neural network, and a semi-automated procedure that locates vertebrae centroids using traditional machine learning algorithms. Unlike other approaches, the proposed method has the added advantage of not requiring single vertebra-level annotations for training. A dataset of 214 CT scans has been extracted from the VerSe’20 challenge data for training, validating and testing the proposed approach. In addition, to evaluate the robustness of the segmentation and labeling algorithms, 12 CT scans from subjects affected by severe, moderate and mild scoliosis have been collected from a local medical clinic. On the designated test set from the VerSe’20 data, the binary spine segmentation stage obtained a binary Dice coefficient of 89.17%, whilst the vertebrae identification stage reached an average multi-class Dice coefficient of 90.09%. To ensure the reproducibility of the algorithms developed here, the code has been made publicly available.
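
The results above are reported as binary and average multi-class Dice coefficients. For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), on flattened masks. It is not the authors' published code; the function names, the per-label averaging scheme, and the convention for two empty masks are assumptions made for this example.

```python
def dice_coefficient(pred, target):
    """Dice overlap between two flattened binary masks (sequences of 0/1 or bools)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * overlap / total


def mean_multiclass_dice(pred_labels, target_labels, labels):
    """Average of per-label Dice scores (assumed unweighted mean over `labels`)."""
    scores = [
        dice_coefficient(
            [p == k for p in pred_labels],
            [t == k for t in target_labels],
        )
        for k in labels
    ]
    return sum(scores) / len(scores)


# One overlapping voxel, two predicted and one true: 2 * 1 / (2 + 1).
print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

In the multi-class case each vertebra label is treated as its own binary mask, so a small vertebra contributes as much to the average as a large one under this assumed scheme.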

Modelling User Perception of Online Visualisation in Real Estate Marketplaces

10.29007/v571 ◽

2018 ◽

Author(s):

Osama Bin Usuf ◽

Mehrafarin Takin ◽

Samad Mohammad Ebrahimzadeh Sepasgoza

Keyword(s):

Real Estate ◽

3D Model ◽

User Preference ◽

Key Factors ◽

Developed Country ◽

Web Based ◽

Customer Perception ◽

Factors Influencing ◽

Two Phases ◽

Visualization Techniques

An increased internet penetration rate has made internet marketing an integral part of the real estate industry. This may result in an inefficient process for buyers and sellers due to the need for physical inspection. The aim of this study is to present key factors influencing users’ decisions to use a web-based technology for real estate purposes. This is an ongoing study with two phases: developing a framework based on a case study, and conducting a survey to measure customer perception of incorporating online visualization techniques. The paper presents the result of the first phase, which evaluates real estate marketing platforms as case studies in Pakistan and Australia. While the initial results show that physical inspections are still required before deciding on a property transaction, it was found that the number of inspections can be reduced by incorporating a 3D model of the property into the listing platform. In addition, it was observed that clarity of search results and provision of a 3D model are among the key factors influencing users’ preference to use the website again. This reinforces the idea that advanced visualization techniques can improve the current reliability issues faced by customers and may also streamline transactions. The study will be extended by conducting the designed survey in two target countries, one developed and one developing, to compare the features most popular with international customers.

Web Summarization and Browsing Through Semantic Tag Clouds

International Journal of Intelligent Information Technologies ◽

10.4018/ijiit.2019070101 ◽

2019 ◽

Vol 15 (3) ◽

pp. 1-23 ◽

Cited By ~ 1

Author(s):

Antonio M. Rinaldi

Keyword(s):

Huge Amount ◽

Document Collections ◽

Electronic Documents ◽

CARTOGRAPHIC VISUALIZATION OF OUTPUTS FOR SPATIAL DECISION-MAKING IN REGIONAL DEVELOPMENT

Geodesy and Cartography ◽

10.3846/20296991.2015.1120431 ◽

2015 ◽

Vol 41 (4) ◽

pp. 174-184 ◽

Cited By ~ 2

Author(s):

Aleš Ruda

Keyword(s):

Decision Making ◽

Regional Development ◽

Risk Model ◽

Weighted Average ◽

Weighted Sum Method ◽

Spatial Decision Making ◽

Thematic Cartography ◽

Cartographic Visualization ◽

Two Phases ◽

Visualization Techniques

Regional development is full of planning and decision making, and precise results for spatial decision making (SDM) are essential. On the one hand, there are many approaches to processing input data; on the other hand, thematic cartography also operates with many visualization methods and techniques. Some loss of accuracy in the results is to be expected, because there are two phases (data processing during SDM and cartographic visualization) in which accuracy may be distorted. In both phases, processing recommendations must be followed. The selection of a spatial decision-making method must follow the intended aims, as must the choice of visualization techniques and the setting of their parameters (especially during reclassification, interpolation or generalization). The paper proposes an elementary scheme of SDM and related visualization, illustrated by two case studies (CS). The first CS presents a proposal of composite indicators followed by the weighted sum method using heuristic approaches, with the aim of identifying the influence of tourism on the landscape. Combined visualization techniques for quantitative and qualitative data are presented. The second CS uses the ordered weighted average method to find the best place for building a new public logistics centre. Constraints and factors represent key indicators, and the subsequent factor and order weights make it possible to propose the best accepted risk model. In this case, grid maps describe the derived values, and the chosen reclassification documents their conversion into linguistic variables.
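
The two aggregation methods named in this abstract can be sketched briefly. The example below is illustrative only and is not taken from the paper: it assumes criterion scores already standardized to a common 0 to 1 scale, uses the common definition of the ordered weighted average in which order weights are applied to scores sorted from highest to lowest, and omits the factor-weighting step that GIS workflows often apply first. The function names are hypothetical.

```python
def _check_weights(values, weights):
    """Validate that weights match the values and sum to 1 (within tolerance)."""
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per value")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")


def weighted_sum(scores, factor_weights):
    """Weighted sum: each weight is tied to a specific criterion."""
    _check_weights(scores, factor_weights)
    return sum(w * s for w, s in zip(factor_weights, scores))


def ordered_weighted_average(scores, order_weights):
    """OWA: weights are tied to rank positions, not to specific criteria.

    Assumption: scores are ranked from highest to lowest before weighting.
    """
    _check_weights(scores, order_weights)
    ranked = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(order_weights, ranked))


cell_scores = [0.2, 0.9, 0.5]  # hypothetical standardized scores for one grid cell
print(weighted_sum(cell_scores, [0.5, 0.3, 0.2]))
print(ordered_weighted_average(cell_scores, [0.0, 0.0, 1.0]))  # all weight on the worst score
```

Shifting the order weights toward the lowest-ranked score gives a cautious, low-risk evaluation, while shifting them toward the highest-ranked score gives an optimistic one, which is how order weights relate to the choice of an accepted risk model.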

Mapping texts through dimensionality reduction and visualization techniques for interactive exploration of document collections

10.1117/12.650899 ◽

2006 ◽

Author(s):

Alneu de Andrade Lopes ◽

Rosane Minghim ◽

Vinícius Melo ◽

Fernando V. Paulovich

Keyword(s):

Dimensionality Reduction ◽

Document Collections ◽

Interactive Exploration ◽

Visualization Techniques

USING MULTIPLE PARADIGM RESEARCH METHODOLOGIES TO GAIN NEW INSIGHTS INTO ENTREPRENEURIAL MOTIVATIONS

Journal of Enterprising Culture ◽

10.1142/s0218495807000137 ◽

2007 ◽

Vol 15 (03) ◽

pp. 219-241 ◽

Cited By ~ 19

Author(s):

JODYANNE KIRKWOOD ◽

COLIN CAMPBELL-HUNT

Keyword(s):

Research Methodology ◽

Mail Survey ◽

Research Problem ◽

Research Process ◽

Theoretical Development ◽

Second Phase ◽

Research Methodologies ◽

Organizational Research ◽

Face To Face ◽

Two Phases

Much of the extant entrepreneurship research has focused on studying the field using positivist research methodologies and little attention has been paid to interpretive methodologies or the use of multiple paradigms. The focus of the paper is on illustrating how we applied a multiple paradigm research methodology to an existing research problem. Specifically, the research was concerned with gender differences in motivations for becoming an entrepreneur. We explain how a multiple paradigm research methodology enabled us to gain new insights into an aspect of entrepreneurship where results of the prior research were not only contradictory, but also lacked a focus on theoretical development. Our research process involved two phases. First, a mail survey which was designed to replicate existing studies was administered to 289 entrepreneurs. The second phase involved in-depth face-to-face interviews with 50 entrepreneurs (25 men and 25 women) who responded to the mail survey. Theoretical contributions regarding entrepreneurial motivation are detailed, as are some more general implications of using multiple paradigm research methodologies in other entrepreneurship and organizational research.

Precipitation of NiHfSi phase in NiAl single crystals containing Hf

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100139032 ◽

1995 ◽

Vol 53 ◽

pp. 532-533

Author(s):

A. Garg ◽

R. D. Noebe ◽

R. Darolia

Keyword(s):

High Temperature ◽

Single Crystals ◽

High Temperature Strength ◽

Bridgman Technique ◽

Shell Mold ◽

Third Phase ◽

Unknown Phase ◽

Transmission Electron ◽

Two Phases ◽

Nial Matrix

Small additions of Hf to NiAl produce a significant increase in the high-temperature strength of single crystals. Hf has a very limited solubility in NiAl and, in the presence of Si, results in a high density of G-phase (Ni16Hf6Si7) cuboidal precipitates and some G-platelets in a NiAl matrix. These precipitates have an F.C.C. structure and nucleate on {100}NiAl planes with almost perfect coherency and a cube-on-cube orientation relationship (O.R.). However, G-phase is metastable and after prolonged aging at high temperature dissolves at the expense of a more stable Heusler (β'-Ni2AlHf) phase. In addition to these two phases, a third phase was shown to be present in a NiAl-0.3 at.% Hf alloy, but was not previously identified (Fig. 4 of ref. 2). In this work, we report the morphology, crystal structure, O.R., and stability of this unknown phase, which were determined using conventional and analytical transmission electron microscopy (TEM). Single crystals of NiAl containing 0.5 at.% Hf were grown by a Bridgman technique. Chemical analysis indicated that these crystals also contained Si, which was not an intentional alloying addition but was picked up from the shell mold during directional solidification.
