WEB APPLICATION FOR LARGE-SCALE MULTIDIMENSIONAL DATA VISUALIZATION

2011 ◽  
Vol 16 (1) ◽  
pp. 273-285 ◽  
Author(s):  
Gintautas Dzemyda ◽  
Virginijus Marcinkevičius ◽  
Viktor Medvedev

In this paper, we present an approach to offering a web application, as a service, for data mining oriented to multidimensional data visualization. The paper focuses on visualization methods as a tool for the visual presentation of large-scale multidimensional data sets. The proposed implementation of such a web application accepts a multidimensional data set and produces a visualization of that data set as a result. It also supports different configuration parameters for the data mining methods used. Parallel computation is used in the proposed implementation to run the algorithms simultaneously on different computers.
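As an illustrative sketch, the "run the algorithms simultaneously" idea can be imitated locally with a thread pool, two toy projection functions standing in for the service's real visualization algorithms (all names below are hypothetical, and the paper distributes work across different computers rather than local threads):

```python
from concurrent.futures import ThreadPoolExecutor

# Two toy "projection algorithms" standing in for real visualization methods.
def project_sum(dataset):
    # collapse each multidimensional point to the sum of its coordinates
    return [sum(point) for point in dataset]

def project_max(dataset):
    # collapse each point to its largest coordinate
    return [max(point) for point in dataset]

data = [(1.0, 2.0, 3.0), (4.0, 0.5, 1.5)]

# Run both algorithms concurrently, as the service runs them on separate machines.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {name: pool.submit(fn, data)
               for name, fn in [("sum", project_sum), ("max", project_max)]}
    results = {name: f.result() for name, f in futures.items()}
```

The client then picks whichever projection finishes first, or compares several side by side.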

Author(s):  
Qusay Abdullah Abed ◽  
Osamah Mohammed Fadhil ◽  
Wathiq Laftah Al-Yaseen

In general, multidimensional data (from a mobile application, for example) contain a large amount of unnecessary information. Web application users find it difficult to get the information they need quickly and effectively because of the sheer volume of data produced every second. Web personalization is one of the effective solutions to this problem. In this paper, we study data mining for web personalization using a blended deep learning model, and we explore how this model helps to analyze and estimate huge numbers of operations. Providing personalized recommendations that improve reliability depends on the web application making use of the useful information it holds. The contribution of this research is the training and testing of large data sets with a blended deep learning model based on a back-propagation neural network. The Hadoop framework was used to perform a number of experiments in different environments with a learning rate between -1 and +1. Several metrics were used to evaluate the model's parameters; for example, true positives, cases that are predicted positive and actually fall into the positive class, were used to evaluate the proposed model.


2002 ◽  
Vol 1 (3-4) ◽  
pp. 194-210 ◽  
Author(s):  
Matthew O Ward

Glyphs are graphical entities that convey one or more data values via attributes such as shape, size, color, and position. They have been widely used in the visualization of data and information, and are especially well suited for displaying complex, multivariate data sets. The placement or layout of glyphs on a display can communicate significant information regarding the data values themselves as well as relationships between data points, and a wide assortment of placement strategies have been developed to date. Methods range from simply using data dimensions as positional attributes to basing placement on implicit or explicit structure within the data set. This paper presents an overview of multivariate glyphs, a list of issues regarding the layout of glyphs, and a comprehensive taxonomy of placement strategies to assist the visualization designer in selecting the technique most suitable to his or her data and task. Examples, strengths, weaknesses, and design considerations are given for each category of technique. We conclude with some general guidelines for selecting a placement strategy, along with a brief description of some of our future research directions.
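A minimal sketch of the simplest placement strategy surveyed here, using two data dimensions directly as positional attributes and mapping a third onto glyph size (the data set, field names, and normalization range below are invented for illustration):

```python
# Data-driven glyph placement: x/y come straight from two data dimensions,
# a third dimension is normalized into a [5, 20] glyph radius.
def layout_glyphs(rows, x_dim, y_dim, size_dim):
    lo = min(r[size_dim] for r in rows)
    hi = max(r[size_dim] for r in rows)
    span = (hi - lo) or 1.0  # avoid division by zero on constant columns
    return [{
        "x": r[x_dim],
        "y": r[y_dim],
        "radius": 5 + 15 * (r[size_dim] - lo) / span,
    } for r in rows]

cars = [
    {"mpg": 30, "hp": 90,  "weight": 2100},
    {"mpg": 15, "hp": 220, "weight": 4000},
]
glyphs = layout_glyphs(cars, "mpg", "hp", "weight")
```

Structure-based strategies from the taxonomy would instead derive `x` and `y` from, e.g., a projection or a space-filling layout rather than raw dimensions.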


2008 ◽  
Vol 7 (1) ◽  
pp. 18-33 ◽  
Author(s):  
Niklas Elmqvist ◽  
John Stasko ◽  
Philippas Tsigas

Supporting visual analytics of multiple large-scale multidimensional data sets requires a high degree of interactivity and user control beyond the conventional challenges of visualizing such data sets. We present the DataMeadow, a visual canvas providing rich interaction for constructing visual queries using graphical set representations called DataRoses. A DataRose is essentially a starplot of selected columns in a data set displayed as multivariate visualizations with dynamic query sliders integrated into each axis. The purpose of the DataMeadow is to allow users to create advanced visual queries by iteratively selecting and filtering into the multidimensional data. Furthermore, the canvas provides a clear history of the analysis that can be annotated to facilitate dissemination of analytical results to stakeholders. A powerful direct manipulation interface allows for selection, filtering, and creation of sets, subsets, and data dependencies. We have evaluated our system using a qualitative expert review involving two visualization researchers. Results from this review are favorable for the new method.
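The dynamic query sliders integrated into each DataRose axis amount to range filters over the selected columns. A hedged sketch of that filtering step, with invented column names (not the DataMeadow implementation itself):

```python
# Dynamic-query filtering: each axis carries a [lo, hi] range, and a row
# survives only if every selected column falls inside its range.
def dynamic_query(rows, ranges):
    def passes(row):
        return all(lo <= row[col] <= hi for col, (lo, hi) in ranges.items())
    return [row for row in rows if passes(row)]

cars = [
    {"mpg": 30, "hp": 90},
    {"mpg": 15, "hp": 220},
    {"mpg": 22, "hp": 150},
]
# Slider positions: mpg axis restricted to [18, 35], hp axis to [0, 200].
selected = dynamic_query(cars, {"mpg": (18, 35), "hp": (0, 200)})
```

Chaining such filters, with each result feeding the next DataRose, gives the iterative select-and-filter workflow the canvas supports.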


2019 ◽  
Vol 109 (4) ◽  
pp. 1451-1468 ◽  
Author(s):  
Clara E. Yoon ◽  
Karianne J. Bergen ◽  
Kexin Rong ◽  
Hashem Elezabi ◽  
William L. Ellsworth ◽  
...  

Abstract Seismology has continuously recorded ground‐motion spanning up to decades. Blind, uninformed search for similar‐signal waveforms within this continuous data can detect small earthquakes missing from earthquake catalogs, yet doing so with naive approaches is computationally infeasible. We present results from an improved version of the Fingerprint And Similarity Thresholding (FAST) algorithm, an unsupervised data‐mining approach to earthquake detection, now available as open‐source software. We use FAST to search for small earthquakes in 6–11 yr of continuous data from 27 channels over an 11‐station local seismic network near the Diablo Canyon nuclear power plant in central California. FAST detected 4554 earthquakes in this data set, with a 7.5% false detection rate: 4134 of the detected events were previously cataloged earthquakes located across California, and 420 were new local earthquake detections with magnitudes −0.3≤ML≤2.4, of which 224 events were located near the seismic network. Although seismicity rates are low, this study confirms that nearby faults are active. This example shows how seismology can leverage recent advances in data‐mining algorithms, along with improved computing power, to extract useful additional earthquake information from long‐duration continuous data sets.
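The core idea, fingerprinting waveform windows and thresholding their similarity, can be sketched as follows. This toy version compares all pairs with Jaccard similarity; the actual FAST algorithm uses spectral-image fingerprints and locality-sensitive hashing precisely to avoid the all-pairs cost (the data and the fingerprint rule below are illustrative only):

```python
# Turn each waveform window into a compact binary fingerprint, then flag
# window pairs whose Jaccard similarity exceeds a threshold.
def fingerprint(window):
    # keep the indices of samples above the window median as a sparse signature
    median = sorted(window)[len(window) // 2]
    return frozenset(i for i, v in enumerate(window) if v > median)

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

def detect_pairs(windows, threshold=0.8):
    prints = [fingerprint(w) for w in windows]
    return [(i, j)
            for i in range(len(prints))
            for j in range(i + 1, len(prints))
            if jaccard(prints[i], prints[j]) >= threshold]

trace = [
    [0, 1, 9, 1, 0, 0],   # an "event" waveform
    [0, 0, 1, 0, 1, 0],   # noise
    [0, 1, 8, 1, 0, 0],   # a repeat of the same event
]
pairs = detect_pairs(trace, threshold=0.8)
```

A repeating earthquake shows up as a high-similarity pair even when neither window matches a template, which is what makes the search unsupervised.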


2006 ◽  
Vol 12 (4) ◽  
pp. 283-288
Author(s):  
Jolita Bernatavičienė ◽  
Gintautas Dzemyda ◽  
Olga Kurasova ◽  
Virginijus Marcinkevičius

In this paper, a method of large multidimensional data visualization that combines multidimensional scaling (MDS) with clustering is modified and investigated. In the original algorithm, the visualization process is divided into three steps: the basis vector set is constructed using the k-means clustering method; this set is projected onto the plane using the MDS algorithm; the remaining data set is visualized using the relative MDS algorithm. We propose a modification which differs from the original algorithm in the strategy for selecting the basis vectors: in our modification, the set of basis vectors consists of vectors selected from the k clusters in a new way. The experimental investigation showed that the modification surpasses the original algorithm both in visualization quality and in computational cost.
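A rough sketch of the basis-selection step, assuming, as one plausible strategy rather than the paper's actual new rule, that each cluster contributes the member nearest its centroid:

```python
import math

# Tiny k-means with fixed iterations (illustrative, not a production clusterer).
def kmeans(points, centroids, iters=10):
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        centroids = [tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else ctr
                     for pts, ctr in zip(clusters, centroids)]
    return centroids, clusters

def basis_vectors(points, seeds):
    centroids, clusters = kmeans(points, seeds)
    # representative of each cluster = the member nearest its centroid
    return [min(pts, key=lambda p: math.dist(p, ctr))
            for pts, ctr in zip(clusters, centroids) if pts]

points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
basis = basis_vectors(points, seeds=[(0.0, 0.0), (5.0, 5.0)])
```

The selected basis vectors would then be projected with MDS, and the remaining points placed by relative MDS against that fixed projection.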


2020 ◽  
Author(s):  
Isha Sood ◽  
Varsha Sharma

Essentially, data mining concerns the processing of data and the identification of patterns and trends in the information, so that we can make decisions or judgments. Data mining concepts have been in use for years, but with the emergence of big data they are even more common. In particular, the scalable mining of such large data sets is a difficult problem that has attracted several recent research efforts. A few of these recent works use the MapReduce methodology to construct data mining models across the data set. In this article, we examine current approaches to large-scale data mining and compare their output to the MapReduce model. Based on our research, a system for data mining that combines MapReduce and sampling is implemented and discussed.
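A minimal sketch of combining sampling with the MapReduce pattern, using local map and reduce functions in place of a real cluster framework and a systematic sample in place of a statistically grounded one (all names below are illustrative):

```python
from collections import Counter
from functools import reduce

def sample(records, step=2):
    # systematic sample: every `step`-th record (a stand-in for random sampling)
    return records[::step]

def map_phase(record):
    # emit (item, count) pairs for one record, as a mapper would
    return Counter(record.split())

def reduce_phase(a, b):
    # merge partial counts, as a reducer would
    return a + b

records = ["buy milk", "buy bread", "buy milk eggs", "sell milk"]
counts = reduce(reduce_phase, map(map_phase, sample(records)), Counter())
```

Mining only the sample trades some accuracy on rare items for a large reduction in the data each mapper must touch.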


Author(s):  
Lior Shamir

Abstract Several recent observations using large data sets of galaxies have shown a non-random distribution of the spin directions of spiral galaxies, even when the galaxies are too far from each other to interact gravitationally. Here, a data set of $\sim8.7\cdot10^3$ spiral galaxies imaged by the Hubble Space Telescope (HST) is used to test and profile a possible asymmetry between galaxy spin directions. The asymmetry between galaxies with opposite spin directions is compared to the asymmetry of galaxies from the Sloan Digital Sky Survey (SDSS). The two data sets contain different galaxies at different redshift ranges, and each data set was annotated using a different annotation method. Both data sets exhibit a similar asymmetry in the COSMOS field, which is covered by both telescopes. Fitting the asymmetry of the galaxies to a cosine dependence yields a dipole axis with probabilities of $\sim2.8\sigma$ and $\sim7.38\sigma$ in HST and SDSS, respectively. The most likely dipole axis identified in the HST galaxies is at $(\alpha=78^{\circ},\delta=47^{\circ})$, well within the $1\sigma$ error range of the most likely dipole axis in the SDSS galaxies with $z>0.15$, identified at $(\alpha=71^{\circ},\delta=61^{\circ})$.
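The cosine fit can be sketched as a one-parameter least-squares problem: for a candidate dipole axis, the expected asymmetry of a sky cell is proportional to the cosine of its angular distance from the axis. The numbers below are invented toy data; the actual analysis scans many candidate axes and converts fit quality into a sigma significance:

```python
# Least-squares amplitude for the dipole model  a_i = A * cos(theta_i),
# where theta_i is the angle between sky cell i and the candidate axis.
def fit_dipole_amplitude(cos_angles, asymmetries):
    num = sum(c * a for c, a in zip(cos_angles, asymmetries))
    den = sum(c * c for c in cos_angles)
    return num / den

# cells along the axis (cos=1), perpendicular to it (cos=0), and opposite (cos=-1)
cos_angles = [1.0, 0.0, -1.0]
asymmetries = [0.10, 0.01, -0.12]
amplitude = fit_dipole_amplitude(cos_angles, asymmetries)
```

Repeating the fit over a grid of candidate $(\alpha, \delta)$ axes and keeping the axis with the strongest, most significant amplitude gives the "most likely dipole axis" reported above.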

