Give Chance a Chance: Modeling Density to Enhance Scatter Plot Quality through Random Data Sampling

2006 ◽  
Vol 5 (2) ◽  
pp. 95-110 ◽  
Author(s):  
Enrico Bertini ◽  
Giuseppe Santucci

The problem of visualizing huge amounts of data is well known in information visualization. Dealing with a large number of items forces almost any kind of Infovis technique to reveal its limits in terms of expressivity and scalability. In this paper we focus on 2D scatter plots, proposing a ‘feature preservation’ approach, based on the idea of modeling the visualization in a virtual space in order to analyze its features (e.g., absolute density, relative density, etc.). In this way we provide a formal framework to measure the visual overlapping, obtaining precise quality metrics about the visualization degradation and devising automatic sampling strategies able to improve the overall image quality. Metrics and algorithms have been improved through suitable user studies.
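The 'feature preservation' idea above can be illustrated with a minimal density-aware sampling sketch (not the authors' exact algorithm): estimate absolute density on a virtual grid, then thin dense cells while leaving sparse ones intact, so low-density features of the scatter plot survive. The grid resolution and the `max_per_cell` cap are illustrative assumptions.

```python
import numpy as np

def density_aware_sample(x, y, bins=32, max_per_cell=10, rng=None):
    # Estimate absolute density on a virtual grid, then keep each point
    # with probability min(1, max_per_cell / cell_count): dense cells are
    # thinned toward roughly max_per_cell points, sparse cells are kept.
    rng = np.random.default_rng(rng)
    counts, xe, ye = np.histogram2d(x, y, bins=bins)
    ix = np.digitize(x, xe[1:-1])   # cell index 0..bins-1 for each point
    iy = np.digitize(y, ye[1:-1])
    cell = counts[ix, iy]           # local density at each point's cell
    keep = rng.random(len(x)) < np.minimum(1.0, max_per_cell / cell)
    return keep                     # boolean mask of retained points
```

Isolated points fall in cells with count 1, so they are kept with probability 1, which is the sense in which sparse features are preserved while overplotted regions are subsampled.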

2018 ◽  
Vol 35 (7) ◽  
pp. 1505-1519 ◽  
Author(s):  
Yu-Chiao Liang ◽  
Matthew R. Mazloff ◽  
Isabella Rosso ◽  
Shih-Wei Fang ◽  
Jin-Yi Yu

The ability to construct nitrate maps in the Southern Ocean (SO) from sparse observations is important for marine biogeochemistry research, as it offers a geographical estimate of biological productivity. The goal of this study is to infer the skill of constructed SO nitrate maps using varying data sampling strategies. The mapping method uses multivariate empirical orthogonal functions (MEOFs) constructed from nitrate, salinity, and potential temperature (N-S-T) fields from a biogeochemical general circulation model simulation. Synthetic N-S-T datasets are created by sampling modeled N-S-T fields in specific regions, determined either by random selection or by selecting regions over a certain threshold of nitrate temporal variances. The first 500 MEOF modes, determined by their capability to reconstruct the original N-S-T fields, are projected onto these synthetic N-S-T data to construct time-varying nitrate maps. Normalized root-mean-square errors (NRMSEs) are calculated between the constructed nitrate maps and the original modeled fields for different sampling strategies. The sampling strategy based on nitrate variances is shown to yield maps with lower NRMSEs than random sampling. A k-means cluster method that considers the combined N-S-T variances to identify key regions in which to insert data is most effective in reducing the mapping errors. These findings are further quantified by a series of mapping error analyses that also address the significance of data sampling density. The results provide a sampling framework to prioritize the deployment of biogeochemical Argo floats for constructing nitrate maps.
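The core mapping step, fitting truncated mode amplitudes to sparse observations and scoring the reconstruction with an NRMSE, can be sketched as follows. The least-squares projection and the range-based normalization are assumptions for illustration, not the study's exact procedure.

```python
import numpy as np

def reconstruct_from_sparse(modes, obs_idx, obs_vals):
    # Fit mode amplitudes to the sparse observations by least squares,
    # then reconstruct the full field as modes @ amplitudes.
    amps, *_ = np.linalg.lstsq(modes[obs_idx, :], obs_vals, rcond=None)
    return modes @ amps

def nrmse(truth, estimate):
    # Root-mean-square error normalized by the range of the true field.
    rmse = np.sqrt(np.mean((truth - estimate) ** 2))
    return rmse / (truth.max() - truth.min())
```

Comparing `nrmse` for index sets chosen at random versus sets concentrated where the field's temporal variance is high is a small-scale analogue of the sampling-strategy comparison in the abstract.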


2018 ◽  
Vol 37 (3) ◽  
pp. 625-662 ◽  
Author(s):  
M. Behrisch ◽  
M. Blumenschein ◽  
N. W. Kim ◽  
L. Shao ◽  
M. El-Assady ◽  
...  

Author(s):  
Gulsebnem Bishop

Statistics can be used to describe, model, and predict archaeological data, provided that the analyst has an understanding of the strengths and limitations of their data type and has a well-defined statistical population. This chapter discusses the major types of archaeological data, sampling strategies, and statistics appropriate for both describing and predicting outcomes for simple and complex ceramic datasets. Description and modeling of complex data can be done with many tools, ranging from simple charts and histograms to more complicated methods such as the T-Test, Chi-Square Test, Multi-Response Permutation Procedure (MRPP), and Kernel Density Estimation (KDE), as well as Principal Components Analysis (PCA).
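As a worked example of one of the tests mentioned, the Pearson chi-square statistic for a contingency table of sherd counts (ceramic types by site; the table in the test is a hypothetical dataset) can be computed directly:

```python
import numpy as np

def chi_square_stat(table):
    # Pearson chi-square for an r x c contingency table of counts:
    # expected[i, j] = row_total[i] * col_total[j] / grand_total,
    # statistic = sum over cells of (observed - expected)^2 / expected.
    table = np.asarray(table, dtype=float)
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    stat = ((table - expected) ** 2 / expected).sum()
    dof = (table.shape[0] - 1) * (table.shape[1] - 1)
    return stat, dof
```

The statistic is then compared against the chi-square distribution with `dof` degrees of freedom to test whether ceramic type frequencies are independent of site.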


Author(s):  
Peter Holtz ◽  
Nicole Kronberger ◽  
Wolfgang Wagner

Within Internet forums, members of certain (online) communities discuss matters of concern to the respective groups, with comparatively few social restraints. For radical, extremist, and other ideologically “sensitive” groups and organizations in particular, Internet forums are a very efficient and widely used tool to connect members, inform others about the group’s agenda, and attract new members. Whereas members of such groups may be reluctant to express their opinions in interviews or surveys, we argue that Internet forums can yield an abundance of useful “natural” discursive data for social scientific research. Based on two exemplary studies, we present a practical guide for the analysis of such data, including data-sampling strategies, the refinement of the data for computer-assisted qualitative and quantitative analysis, and strategies for in-depth analysis. The first study is an in-depth analysis of discourses within a German neo-Nazi discussion board. In the second, nine online forums for young German Muslims were analyzed and compared. Advantages and potential issues with analyzing Internet forums are discussed.


2020 ◽  
Vol 34 (04) ◽  
pp. 5989-5996 ◽  
Author(s):  
Xiaoyu Tao ◽  
Xiaopeng Hong ◽  
Xinyuan Chang ◽  
Yihong Gong

In this paper, we propose a novel single-task continual learning framework named Bi-Objective Continual Learning (BOCL). BOCL aims both to consolidate historical knowledge and to learn from new data. On one hand, we propose to preserve the old knowledge using a small set of pillars, and develop the pillar consolidation (PLC) loss to preserve the old knowledge and alleviate the catastrophic forgetting problem. On the other hand, we develop the contrastive pillar (CPL) loss term to improve the classification performance, and examine several data sampling strategies for efficient onsite learning from ‘new’ data with a reasonable amount of computational resources. Comprehensive experiments on CIFAR10/100, CORe50, and a subset of ImageNet validate the BOCL framework. We also compare the accuracy of different sampling strategies when used to fine-tune a given CNN model. The code will be released.
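One simple candidate for the kind of data sampling strategy examined here is uniform reservoir sampling over the incoming stream, which maintains a fixed-size rehearsal buffer in which every past example is retained with equal probability. This is a generic sketch, not the paper's pillar-selection scheme:

```python
import random

class ReservoirBuffer:
    # Uniform reservoir sampling: after n items have streamed past, the
    # buffer of size k holds each of them with probability k / n. This
    # keeps memory constant regardless of how much 'new' data arrives.
    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.seen = 0
        self.items = []
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)       # fill phase: keep everything
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item      # replace a random slot
```

During training, minibatches mixing fresh data with draws from the buffer are a common way to apply such a buffer to rehearsal-based continual learning.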


Author(s):  
Zhi-Jiang Liu ◽  
Vera Levina ◽  
Yuliya Frolova

The rapid development of computer visualization techniques as well as virtual and augmented reality has led to the possibility of perfect data visualization and the creation of a special virtual space for educating the new generation. Simultaneously, the increase in the amount of data to be processed requires a proper selection and presentation of data for solving specific problems. Education sets such tasks as 1) improving the efficiency of presenting information and its assimilation by students, and 2) increasing the convenience and quality of the teachers’ work. The purpose of this study is to test whether more productive visualized learning material accelerates and improves the teacher’s response to students. Meanwhile, the created visualization system was based on minimizing the effort and cost of its preparation and ongoing support: only free cloud-based services and visualization tools were used. Students were given the opportunity to constantly control their learning process in real time and to create education markers with the help of a perspicuous visual environment. To create the visualization system, already existing work on the implementation and verification of such systems was used. The study was based on the results of applying this technology: a survey of 300 students from three universities in China, Russia, and Kazakhstan was conducted. The control group consisted of 150 students from the same universities who did not use visualization to master the same educational material. According to the results of the study, students who used information visualization showed a sharp increase in the subjective assessment of the speed and quality of their learning (58.58% and 37.73%, respectively, of the total number of participants gave a high rating, while in the control group only 12.25% did). Further, the level of anxiety associated with the assimilation of new language material decreased significantly (13.54% in the study group did not feel anxiety, compared with only 7% in the control group).

