WEB APPLICATION FOR LARGE-SCALE MULTIDIMENSIONAL DATA VISUALIZATION

2011 ◽  
Vol 16 (1) ◽  
pp. 273-285 ◽  
Author(s):  
Gintautas Dzemyda ◽  
Virginijus Marcinkevičius ◽  
Viktor Medvedev

In this paper, we present an approach to offering a web application, as a service, for data mining oriented to multidimensional data visualization. The paper focuses on visualization methods as a tool for the visual presentation of large-scale multidimensional data sets. The proposed implementation of such a web application accepts a multidimensional data set and produces a visualization of that data set as a result. It also supports different configuration parameters for the data mining methods used. Parallel computation is used in the proposed implementation to run the algorithms simultaneously on different computers.
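As an illustrative sketch, the "run the algorithms simultaneously" idea can be imitated locally with a thread pool, two toy projection functions standing in for the service's real visualization algorithms (all names below are hypothetical, and the paper distributes work across different computers rather than local threads):

```python
from concurrent.futures import ThreadPoolExecutor

# Two toy "projection algorithms" standing in for real visualization methods.
def project_sum(dataset):
    # collapse each multidimensional point to the sum of its coordinates
    return [sum(point) for point in dataset]

def project_max(dataset):
    # collapse each point to its largest coordinate
    return [max(point) for point in dataset]

data = [(1.0, 2.0, 3.0), (4.0, 0.5, 1.5)]

# Run both algorithms concurrently, as the service runs them on separate machines.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {name: pool.submit(fn, data)
               for name, fn in [("sum", project_sum), ("max", project_max)]}
    results = {name: f.result() for name, f in futures.items()}
```

The client then picks whichever projection finishes first, or compares several side by side.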

Author(s):  
Qusay Abdullah Abed ◽  
Osamah Mohammed Fadhil ◽  
Wathiq Laftah Al-Yaseen

In general, multidimensional data (from a mobile application, for example) contain a large amount of unnecessary information. Web application users find it difficult to get the information they need quickly and effectively because of the sheer volume of data produced every second. Web personalization is one of the effective solutions to this problem. In this paper, we study data mining for web personalization using a blended deep learning model, and we explore how this model helps to analyze and estimate huge numbers of operations. Providing personalized recommendations that improve reliability depends on the web application making use of the useful information it holds. The contribution of this research is the training and testing of large data sets with a blended deep learning model based on a back-propagation neural network. The Hadoop framework was used to perform a number of experiments in different environments with a learning rate between -1 and +1. Several metrics were used to evaluate the model's parameters; for example, true positives, cases that are predicted positive and actually fall into the positive class, were used to evaluate the proposed model.


2002 ◽  
Vol 1 (3-4) ◽  
pp. 194-210 ◽  
Author(s):  
Matthew O Ward

Glyphs are graphical entities that convey one or more data values via attributes such as shape, size, color, and position. They have been widely used in the visualization of data and information, and are especially well suited for displaying complex, multivariate data sets. The placement or layout of glyphs on a display can communicate significant information regarding the data values themselves as well as relationships between data points, and a wide assortment of placement strategies have been developed to date. Methods range from simply using data dimensions as positional attributes to basing placement on implicit or explicit structure within the data set. This paper presents an overview of multivariate glyphs, a list of issues regarding the layout of glyphs, and a comprehensive taxonomy of placement strategies to assist the visualization designer in selecting the technique most suitable to his or her data and task. Examples, strengths, weaknesses, and design considerations are given for each category of technique. We conclude with some general guidelines for selecting a placement strategy, along with a brief description of some of our future research directions.
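A minimal sketch of the simplest placement strategy surveyed here, using two data dimensions directly as positional attributes and mapping a third onto glyph size (the data set, field names, and normalization range below are invented for illustration):

```python
# Data-driven glyph placement: x/y come straight from two data dimensions,
# a third dimension is normalized into a [5, 20] glyph radius.
def layout_glyphs(rows, x_dim, y_dim, size_dim):
    lo = min(r[size_dim] for r in rows)
    hi = max(r[size_dim] for r in rows)
    span = (hi - lo) or 1.0  # avoid division by zero on constant columns
    return [{
        "x": r[x_dim],
        "y": r[y_dim],
        "radius": 5 + 15 * (r[size_dim] - lo) / span,
    } for r in rows]

cars = [
    {"mpg": 30, "hp": 90,  "weight": 2100},
    {"mpg": 15, "hp": 220, "weight": 4000},
]
glyphs = layout_glyphs(cars, "mpg", "hp", "weight")
```

Structure-based strategies from the taxonomy would instead derive `x` and `y` from, e.g., a projection or a space-filling layout rather than raw dimensions.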


2008 ◽  
Vol 7 (1) ◽  
pp. 18-33 ◽  
Author(s):  
Niklas Elmqvist ◽  
John Stasko ◽  
Philippas Tsigas

Supporting visual analytics of multiple large-scale multidimensional data sets requires a high degree of interactivity and user control beyond the conventional challenges of visualizing such data sets. We present the DataMeadow, a visual canvas providing rich interaction for constructing visual queries using graphical set representations called DataRoses. A DataRose is essentially a starplot of selected columns in a data set displayed as multivariate visualizations with dynamic query sliders integrated into each axis. The purpose of the DataMeadow is to allow users to create advanced visual queries by iteratively selecting and filtering into the multidimensional data. Furthermore, the canvas provides a clear history of the analysis that can be annotated to facilitate dissemination of analytical results to stakeholders. A powerful direct manipulation interface allows for selection, filtering, and creation of sets, subsets, and data dependencies. We have evaluated our system using a qualitative expert review involving two visualization researchers. Results from this review are favorable for the new method.
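The dynamic query sliders integrated into each DataRose axis amount to range filters over the selected columns. A hedged sketch of that filtering step, with invented column names (not the DataMeadow implementation itself):

```python
# Dynamic-query filtering: each axis carries a [lo, hi] range, and a row
# survives only if every selected column falls inside its range.
def dynamic_query(rows, ranges):
    def passes(row):
        return all(lo <= row[col] <= hi for col, (lo, hi) in ranges.items())
    return [row for row in rows if passes(row)]

cars = [
    {"mpg": 30, "hp": 90},
    {"mpg": 15, "hp": 220},
    {"mpg": 22, "hp": 150},
]
# Slider positions: mpg axis restricted to [18, 35], hp axis to [0, 200].
selected = dynamic_query(cars, {"mpg": (18, 35), "hp": (0, 200)})
```

Chaining such filters, with each result feeding the next DataRose, gives the iterative select-and-filter workflow the canvas supports.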


2019 ◽  
Vol 109 (4) ◽  
pp. 1451-1468 ◽  
Author(s):  
Clara E. Yoon ◽  
Karianne J. Bergen ◽  
Kexin Rong ◽  
Hashem Elezabi ◽  
William L. Ellsworth ◽  
...  

Abstract Seismology has continuously recorded ground‐motion spanning up to decades. Blind, uninformed search for similar‐signal waveforms within this continuous data can detect small earthquakes missing from earthquake catalogs, yet doing so with naive approaches is computationally infeasible. We present results from an improved version of the Fingerprint And Similarity Thresholding (FAST) algorithm, an unsupervised data‐mining approach to earthquake detection, now available as open‐source software. We use FAST to search for small earthquakes in 6–11 yr of continuous data from 27 channels over an 11‐station local seismic network near the Diablo Canyon nuclear power plant in central California. FAST detected 4554 earthquakes in this data set, with a 7.5% false detection rate: 4134 of the detected events were previously cataloged earthquakes located across California, and 420 were new local earthquake detections with magnitudes −0.3≤ML≤2.4, of which 224 events were located near the seismic network. Although seismicity rates are low, this study confirms that nearby faults are active. This example shows how seismology can leverage recent advances in data‐mining algorithms, along with improved computing power, to extract useful additional earthquake information from long‐duration continuous data sets.
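The core idea, fingerprinting waveform windows and thresholding their similarity, can be sketched as follows. This toy version compares all pairs with Jaccard similarity; the actual FAST algorithm uses spectral-image fingerprints and locality-sensitive hashing precisely to avoid the all-pairs cost (the data and the fingerprint rule below are illustrative only):

```python
# Turn each waveform window into a compact binary fingerprint, then flag
# window pairs whose Jaccard similarity exceeds a threshold.
def fingerprint(window):
    # keep the indices of samples above the window median as a sparse signature
    median = sorted(window)[len(window) // 2]
    return frozenset(i for i, v in enumerate(window) if v > median)

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

def detect_pairs(windows, threshold=0.8):
    prints = [fingerprint(w) for w in windows]
    return [(i, j)
            for i in range(len(prints))
            for j in range(i + 1, len(prints))
            if jaccard(prints[i], prints[j]) >= threshold]

trace = [
    [0, 1, 9, 1, 0, 0],   # an "event" waveform
    [0, 0, 1, 0, 1, 0],   # noise
    [0, 1, 8, 1, 0, 0],   # a repeat of the same event
]
pairs = detect_pairs(trace, threshold=0.8)
```

A repeating earthquake shows up as a high-similarity pair even when neither window matches a template, which is what makes the search unsupervised.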


2006 ◽  
Vol 12 (4) ◽  
pp. 283-288
Author(s):  
Jolita Bernatavičienė ◽  
Gintautas Dzemyda ◽  
Olga Kurasova ◽  
Virginijus Marcinkevičius

In this paper, a method of large multidimensional data visualization that combines multidimensional scaling (MDS) with clustering is modified and investigated. In the original algorithm, the visualization process is divided into three steps: the basis vector set is constructed using the k-means clustering method; this set is projected onto the plane using the MDS algorithm; the remaining data set is visualized using the relative MDS algorithm. We propose a modification which differs from the original algorithm in the strategy for selecting the basis vectors: in our modification, the set of basis vectors consists of vectors selected from the k clusters in a new way. The experimental investigation showed that the modification surpasses the original algorithm both in visualization quality and in computational cost.
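A rough sketch of the basis-selection step, assuming, as one plausible strategy rather than the paper's actual new rule, that each cluster contributes the member nearest its centroid:

```python
import math

# Tiny k-means with fixed iterations (illustrative, not a production clusterer).
def kmeans(points, centroids, iters=10):
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        centroids = [tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else ctr
                     for pts, ctr in zip(clusters, centroids)]
    return centroids, clusters

def basis_vectors(points, seeds):
    centroids, clusters = kmeans(points, seeds)
    # representative of each cluster = the member nearest its centroid
    return [min(pts, key=lambda p: math.dist(p, ctr))
            for pts, ctr in zip(clusters, centroids) if pts]

points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
basis = basis_vectors(points, seeds=[(0.0, 0.0), (5.0, 5.0)])
```

The selected basis vectors would then be projected with MDS, and the remaining points placed by relative MDS against that fixed projection.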


2020 ◽  
Author(s):  
Isha Sood ◽  
Varsha Sharma

Essentially, data mining concerns the processing of data and the identification of patterns and trends in the information, so that we can make decisions or judgments. Data mining concepts have been in use for years, but with the emergence of big data they are even more common. In particular, the scalable mining of such large data sets is a difficult problem that has attracted several recent research efforts. A few of these recent works use the MapReduce methodology to construct data mining models across the data set. In this article, we examine current approaches to large-scale data mining and compare their output to the MapReduce model. Based on our research, a system for data mining that combines MapReduce and sampling is implemented and discussed.
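A minimal sketch of combining sampling with the MapReduce pattern, using local map and reduce functions in place of a real cluster framework and a systematic sample in place of a statistically grounded one (all names below are illustrative):

```python
from collections import Counter
from functools import reduce

def sample(records, step=2):
    # systematic sample: every `step`-th record (a stand-in for random sampling)
    return records[::step]

def map_phase(record):
    # emit (item, count) pairs for one record, as a mapper would
    return Counter(record.split())

def reduce_phase(a, b):
    # merge partial counts, as a reducer would
    return a + b

records = ["buy milk", "buy bread", "buy milk eggs", "sell milk"]
counts = reduce(reduce_phase, map(map_phase, sample(records)), Counter())
```

Mining only the sample trades some accuracy on rare items for a large reduction in the data each mapper must touch.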


Author(s):  
Lior Shamir

Abstract Several recent observations using large data sets of galaxies have shown a non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to interact gravitationally. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by the Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey (SDSS). The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. Both data sets exhibit a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to a cosine dependence yields a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^{\circ},\delta=47^{\circ})$, well within the $1\sigma$ error range of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^{\circ},\delta=61^{\circ})$.
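The cosine fit can be sketched as a one-parameter least-squares problem: for a candidate dipole axis, the expected asymmetry of a sky cell is proportional to the cosine of its angular distance from the axis. The numbers below are invented toy data; the actual analysis scans many candidate axes and converts fit quality into a sigma significance:

```python
# Least-squares amplitude for the dipole model  a_i = A * cos(theta_i),
# where theta_i is the angle between sky cell i and the candidate axis.
def fit_dipole_amplitude(cos_angles, asymmetries):
    num = sum(c * a for c, a in zip(cos_angles, asymmetries))
    den = sum(c * c for c in cos_angles)
    return num / den

# cells along the axis (cos=1), perpendicular to it (cos=0), and opposite (cos=-1)
cos_angles = [1.0, 0.0, -1.0]
asymmetries = [0.10, 0.01, -0.12]
amplitude = fit_dipole_amplitude(cos_angles, asymmetries)
```

Repeating the fit over a grid of candidate $(\alpha, \delta)$ axes and keeping the axis with the strongest, most significant amplitude gives the "most likely dipole axis" reported above.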

