Visual analytics for BigData variety and its behaviours

2015 ◽  
Vol 12 (4) ◽  
pp. 1171-1191 ◽  
Author(s):  
Jinson Zhang ◽  
Mao Huang ◽  
Zhao-Peng Meng

BigData, defined as structured and unstructured data containing images, videos, texts, audio and other forms of data collected from multiple datasets, is too big, too complex and moves too fast to analyze using traditional methods. This has given rise to a few issues that must be addressed: 1) how to analyze BigData across multiple datasets, 2) how to classify the different data forms, 3) how to identify BigData patterns based on its behaviours, and 4) how to visualize BigData attributes in order to gain a better understanding of the data. It is therefore necessary to establish a new framework for BigData analysis and visualization. In this paper, we extend our previous work on classifying BigData attributes into the "5Ws" dimensions based on different data behaviours. Our approach not only classifies BigData attributes for different data forms across multiple datasets, but also establishes the "5Ws" densities to represent the characteristics of data flow patterns. We use additional non-dimensional parallel axes in parallel coordinates to display the "5Ws" sending and receiving densities, which provide more analytic features for BigData analysis. The experiment shows that our approach with parallel coordinate visualization can be efficiently used for BigData analysis and visualization.
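As a rough illustration of the sending and receiving densities mentioned above, the following minimal sketch assumes that a density is simply the fraction of records that share a given value in one dimension. That definition, the field names `sender` and `receiver`, and the sample records are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical records; the field names are assumptions for illustration only.
records = [
    {"sender": "alice", "receiver": "server1"},
    {"sender": "alice", "receiver": "server2"},
    {"sender": "bob", "receiver": "server1"},
    {"sender": "carol", "receiver": "server1"},
]


def densities(records, key):
    """Return the fraction of records carrying each value of `key`.

    Assumption: "density" here means a simple relative frequency.
    """
    counts = Counter(record[key] for record in records)
    total = len(records)
    return {value: count / total for value, count in counts.items()}


sending = densities(records, "sender")
receiving = densities(records, "receiver")
print(sending)
print(receiving)
```

Under this assumption, each density could be plotted on an extra axis next to the dimension it summarizes.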

2013 ◽  
Vol 14 (1) ◽  
pp. 51-61 ◽  
Author(s):  
Fabian Fischer ◽  
Johannes Fuchs ◽  
Florian Mansmann ◽  
Daniel A Keim

The enormous growth of data in the last decades has led to a wide variety of different database technologies. Nowadays, we are capable of storing vast amounts of structured and unstructured data. To address the challenge of exploring and making sense of big data using visual analytics, tight integration of such backend services is needed. In this article, we introduce BANKSAFE, which was built for the VAST Challenge 2012 and won the outstanding comprehensive submission award. BANKSAFE is based on modern database technologies and is capable of visually analyzing vast amounts of monitoring data and security-related datasets of large-scale computer networks. To better describe and demonstrate the visualizations, we utilize the Visual Analytics Science and Technology (VAST) Challenge 2012 as a case study. Additionally, we discuss lessons learned during the design and development of BANKSAFE, which are also applicable to other visual analytics applications for big data.


1996 ◽  
Vol 31 (5) ◽  
pp. 278-290 ◽  
Author(s):  
Vugranam C. Sreedhar ◽  
Guang R. Gao ◽  
Yong-Fong Lee

2020 ◽  
Vol 5 (3) ◽  
pp. 172-177
Author(s):  
Mislav Radic ◽  
Tracy M Frech

Since it was first used in 1997, the term “big data” has been popularized; however, the concept of big data is relatively new to medicine. Big data refers to methods and techniques to systematically retrieve, collect, manage, and analyze very large and complex sets of structured and unstructured data that cannot be sufficiently processed using traditional data processing methods. Integrating big data in rare diseases with low prevalence and incidence, like systemic sclerosis, is of particular importance. We conducted a literature review of the use of big data in systemic sclerosis. The volume of data on systemic sclerosis has grown steadily in recent years; however, big data methods have not been readily used. This inexhaustible source of data needs to be used more to unleash its full potential.


2019 ◽  
Vol 8 (3) ◽  
pp. 1950-1955

With the growing intricacy of data generated and processed across various platforms today, the need for consistency has grown. Structured data can be used for a number of purposes that are not feasible with unstructured data. The purpose of this study was to convert data in Portable Document Format (PDF) from unstructured to structured form with the help of a new framework using the concept of Binary Decision Diagrams and Boolean operations. Binary decision diagrams are data structures for representing Boolean functions, which take Boolean inputs and generate a Boolean output, hence creating a binary diagram. This research was carried out mainly to show how a large amount of data can be stored easily in the form of bits. The focus is on retrieving meaningful information from unstructured textual data in PDF documents using Boolean operations and a bag model, saving the meaningful keywords in the form of binary decision trees, and later clustering the documents based on the commonalities between them. This research presents a way of increasing the efficiency of converting unstructured data to structured data in PDF and of saving a huge amount of data in the form of bits using this novel framework.
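The idea of storing keywords as bits and comparing documents with Boolean operations can be sketched as follows. This is a simplified illustration that uses plain integer bitmasks rather than binary decision diagrams, and it is not the authors' framework. The vocabulary, the sample texts, and the function name `to_bits` are hypothetical.

```python
# Hypothetical keyword vocabulary; bit i of a mask stands for vocabulary[i].
vocabulary = ["data", "boolean", "diagram", "cluster"]


def to_bits(text, vocabulary):
    """Encode which vocabulary keywords occur in `text` as an integer bitmask."""
    words = set(text.lower().split())
    bits = 0
    for i, keyword in enumerate(vocabulary):
        if keyword in words:
            bits |= 1 << i
    return bits


# Hypothetical document texts (already extracted from their PDFs).
doc_a = to_bits("Boolean data in a diagram", vocabulary)
doc_b = to_bits("Cluster the data", vocabulary)

shared = doc_a & doc_b  # Boolean AND: keywords present in both documents
either = doc_a | doc_b  # Boolean OR: keywords present in at least one

shared_keywords = [k for i, k in enumerate(vocabulary) if (shared >> i) & 1]

# Share of common keywords, one possible commonality measure for clustering.
similarity = bin(shared).count("1") / bin(either).count("1")
print(shared_keywords, similarity)
```

Each document then costs one bit per vocabulary keyword, and the commonality between two documents reduces to a pair of bitwise operations.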


2017 ◽  
Vol 18 (1) ◽  
pp. 3-32 ◽  
Author(s):  
Boris Kovalerchuk ◽  
Vladimir Grishin

Preserving all multidimensional data in two-dimensional visualization is a long-standing problem in Visual Analytics, Machine Learning/Data Mining, and Multiobjective Pareto Optimization. While Parallel and Radial (Star) coordinates preserve all n-D data in two dimensions, they are not sufficient to address visualization challenges, such as occlusion, for all possible datasets. More such methods are needed. Recently, the concepts of lossless General Line Coordinates that generalize Parallel, Radial, Cartesian, and other coordinates were proposed, with initial exploration and application of several subclasses of General Line Coordinates such as Collocated Paired Coordinates and Star Collocated Paired Coordinates. This article explores and enhances the benefits of General Line Coordinates. It shows ways to increase the expressiveness of General Line Coordinates, including decreasing occlusion and simplifying visual patterns while preserving all n-D data in two dimensions, by adjusting General Line Coordinates for given n-D datasets. The adjustments include relocating, rescaling, and other transformations of General Line Coordinates. One of the major sources of benefits of General Line Coordinates relative to Parallel Coordinates is that the visual representation of each n-D point requires half as many points and lines. This article demonstrates the benefits of different General Line Coordinates for real data visual analysis, such as health monitoring and benchmark Iris data classification, compared with results from Parallel Coordinates, Radvis, and Support Vector Machine. The experimental part of the article presents the results of an experiment with about 70 participants on the efficiency of visual pattern discovery using Star Collocated Paired Coordinates, Parallel, and Radial Coordinates. It shows the advantages of visual discovery of n-D patterns using the General Line Coordinates subclass Star Collocated Paired Coordinates with n = 160 dimensions.
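The halving of points and lines can be illustrated with a minimal sketch comparing the vertices drawn for one n-D point in Parallel Coordinates and in Collocated Paired Coordinates. The function names, the sample point, and the handling of odd n (repeating the last coordinate) are assumptions made for this illustration, not details given in the abstract.

```python
def parallel_coordinates(point):
    """One 2-D vertex per dimension: axis i sits at x = i, the value goes on y."""
    return [(i, value) for i, value in enumerate(point)]


def collocated_paired_coordinates(point):
    """One 2-D vertex per pair of dimensions: (x1, x2), (x3, x4), ..."""
    values = list(point)
    if len(values) % 2:
        # Assumption: pad an odd-length point by repeating its last coordinate.
        values.append(values[-1])
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# Hypothetical 6-D point.
point = (0.2, 0.5, 0.9, 0.1, 0.4, 0.7)

pc = parallel_coordinates(point)
cpc = collocated_paired_coordinates(point)

# Consecutive vertices are joined by line segments in both representations.
print(len(pc), "vertices,", len(pc) - 1, "segments")
print(len(cpc), "vertices,", len(cpc) - 1, "segments")
```

For this 6-D point, the paired representation uses 3 vertices and 2 segments, compared with 6 vertices and 5 segments in Parallel Coordinates.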

