Comprehensible Visualization of Multidimensional Data: Sum of Ranking Differences-Based Parallel Coordinates

Mathematics, 2021, Vol 9 (24), pp. 3203
Author(s): Ádám Ipkovich, Károly Héberger, János Abonyi

A novel visualization technique based on parallel coordinates is proposed for the sum of ranking differences (SRD) method. An axis is defined for each variable, on which the data are depicted row-wise. When the data points of a row are connected across the axes, the resulting lines may intersect. The fewer intersections there are between two variables, the more similar the variables are and the clearer the figure becomes. The visualization therefore depends on the technique used to order the variables. The key idea is to employ the SRD method to measure the degree of similarity of the variables, establishing a distance-based order. The distances between the axes are not uniform in the proposed visualization; their closeness reflects similarity, according to their SRD value. The proposed algorithm identifies false similarities through an iterative approach, in which the angles between the SRD values determine on which side a variable is plotted. The visualization algorithm is provided as MATLAB/Octave source code. The proposed tool is applied to study how the sources of greenhouse gas emissions can be grouped based on the statistical data of the countries. A comparison to multidimensional scaling (MDS)-based ordering is also given. The use case demonstrates the applicability of the method and the synergies of incorporating the SRD method into parallel coordinates.
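The link between rank similarity and line intersections can be illustrated with a minimal Python sketch. It is not the authors' MATLAB/Octave code: the function names (`average_ranks`, `srd`, `pairwise_srd`, `crossings`) and the toy data are assumptions made for illustration, the SRD values are left unnormalized, and the iterative detection of false similarities is not reproduced. Each variable is simply ranked against another variable taken as the reference.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def srd(column, reference):
    """Sum of absolute rank differences between a variable and a reference."""
    return sum(abs(a - b) for a, b in zip(average_ranks(column), average_ranks(reference)))


def pairwise_srd(variables):
    """Unnormalized SRD for every pair of variables (hypothetical helper)."""
    return {(a, b): srd(variables[a], variables[b]) for a, b in combinations(variables, 2)}


def crossings(left, right):
    """Count line intersections between two adjacent parallel-coordinate axes."""
    return sum(
        1
        for i, j in combinations(range(len(left)), 2)
        if (left[i] - left[j]) * (right[i] - right[j]) < 0
    )


# Toy data (made up): "b" ranks the rows like "a", "c" reverses the ranking.
variables = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, 20.0, 30.0, 40.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
distances = pairwise_srd(variables)
```

In this toy data the pair with an SRD of zero ("a" and "b") produces no intersections, while the reversed pair ("a" and "c") produces the maximum number, which is why placing low-SRD variables on neighboring axes tends to give a clearer figure.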

Author(s): Никита Сергеевич Олейник, Владислав Юрьевич Щеколдин

The paper considers the problem of detecting anomalous observations in high-dimensional data by means of multidimensional scaling, with a view to producing a high-quality data visualization. It proposes a modified version of Torgerson's principal projections method in which the projection subspace for the source data is built by changing how the matrix of scalar products is factorized, using cumulative curve analysis. The empirical distribution of the F-measure is constructed and analyzed for different projections of the source data.
Purpose. The paper aims to develop methods of presenting multidimensional data for classification problems, based on cumulative curve analysis. It considers outlier detection in high-dimensional data by multidimensional scaling, with the goal of constructing a high-quality data visualization. An abnormal observation (outlier), according to D. Hawkins, is an observation that differs so much from the others that it can be assumed to have entered the sample in a fundamentally different way.
Methods. One conceptual approach to classifying sample observations is multidimensional scaling, represented by the classical Orlochi method, Torgerson's principal projections, and others. The Torgerson method places the origin at the center of gravity of the analyzed data, computes the matrix of scalar products of the vectors measured from that origin, selects the two largest eigenvalues with their eigenvectors, and evaluates the projection matrix. The method also assumes that regular and anomalous observations are linearly separable, which is rarely the case. It is therefore logical to choose, among the possible projection axes, those that give better results for outlier detection. A modified CC-ABOD (Cumulative Curves for Angle-Based Outlier Detection) procedure is applied to estimate visualization quality. It is based on estimating the variance of the angles formed by a given observation and the remaining observations in multidimensional space. Cumulative curve analysis is then applied to separate groups of closely located observations (according to the chosen metric) and to form classes of regular, intermediate, and anomalous observations.
Results. A modification of the Torgerson method is developed. The distribution of the F1-measure is constructed and analyzed for different projections of the source data. Analysis of the empirical distribution shows that in a number of cases the best axes correspond to the second, third, or even fourth largest eigenvalues.
Findings. Multidimensional scaling methods for visualizing multidimensional data and detecting outliers are considered. The choice of projection turns out to be an ambiguous problem.
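The classical Torgerson steps described under Methods (centering, matrix of scalar products, leading eigenpairs, projection) can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' modified method: the function names are hypothetical, the eigenpairs are obtained by power iteration with deflation (a choice made here to avoid external libraries, valid under the assumption that the scalar-product matrix is positive semidefinite, as it is for Euclidean input), and the cumulative-curve and CC-ABOD parts are not reproduced. The `axes` parameter only shows where a choice of eigenvalues other than the two largest would enter.

```python
import math
import random


def double_center(sq_dist):
    """Turn squared distances into scalar products about the center of gravity."""
    n = len(sq_dist)
    row_mean = [sum(row) / n for row in sq_dist]
    total_mean = sum(row_mean) / n
    return [
        [-0.5 * (sq_dist[i][j] - row_mean[i] - row_mean[j] + total_mean) for j in range(n)]
        for i in range(n)
    ]


def top_eigenpairs(matrix, count, iterations=500, seed=0):
    """Leading eigenpairs of a symmetric PSD matrix via power iteration and deflation."""
    n = len(matrix)
    rng = random.Random(seed)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(count):
        vec = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:  # remaining matrix is numerically zero
                break
            vec = [x / norm for x in nxt]
        value = sum(vec[i] * sum(work[i][j] * vec[j] for j in range(n)) for i in range(n))
        pairs.append((value, vec))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs


def torgerson_projection(points, axes=(0, 1)):
    """Project points onto the eigen-axes listed in `axes` (0 = largest eigenvalue)."""
    sq_dist = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    gram = double_center(sq_dist)
    pairs = top_eigenpairs(gram, max(axes) + 1)
    return [
        [math.sqrt(max(pairs[k][0], 0.0)) * pairs[k][1][i] for k in axes]
        for i in range(len(points))
    ]
```

Calling `torgerson_projection(points, axes=(1, 2))`, for example, would project onto the second and third eigen-axes instead of the first two, which is the kind of alternative the Results paragraph reports as sometimes preferable.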


2015, Vol 7 (3), pp. 275-279
Author(s): Agnė Dzidolikaitė

The paper analyzes the global optimization problem. To solve it, a multidimensional scaling algorithm is combined with a genetic algorithm. Multidimensional scaling searches for projections of multidimensional data in a lower-dimensional space while trying to preserve the dissimilarities of the analyzed set. A genetic algorithm yields not a single local solution but a whole population of optimal points. Different optimal points give different images. By looking at several images of the same multidimensional data, an expert can notice properties of the data. In the paper a genetic algorithm is applied to multidimensional scaling, glass data are visualized, and certain properties are noted.
The global optimization problem is analyzed. It is defined as the optimization of a nonlinear objective function of continuous variables over a feasible region. Various algorithms are used for optimization. Exact algorithms usually find an exact solution, but this can take a very long time. Often a good solution is wanted within an acceptable time. In that case other, heuristic algorithms, also called heuristics, can be used. One such heuristic is the genetic algorithm, which imitates evolution as it occurs in nature. The algorithms are built from evolutionary operators: inheritance, mutation, selection, and recombination. Genetic algorithms can find sufficiently good solutions to problems for which no exact algorithms exist. They are also applicable to visualizing data by multidimensional scaling. Multidimensional scaling searches for projections of multidimensional data in a space of fewer dimensions, aiming to preserve the similarities or dissimilarities of the analyzed set. Genetic algorithms yield not a single local solution but a whole population of optima. Different optima correspond to different images. Seeing several variants of the multidimensional data, an expert can discern more of its properties. In the article a genetic algorithm is applied to multidimensional scaling. It is shown that the multidimensional scaling algorithm can be combined with a genetic algorithm and used to visualize multidimensional data.
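How a genetic algorithm can drive multidimensional scaling is sketched below in plain Python. The abstract does not specify the operators or parameters used in the paper, so everything here is an assumption chosen for illustration: raw stress as the objective, truncation selection, one-point crossover, Gaussian mutation, and the hypothetical names `stress` and `genetic_mds`. The toy dissimilarities are those of a unit square, not the glass data.

```python
import math
import random


def stress(genes, dissim, dim=2):
    """Raw stress: squared mismatch between embedded distances and dissimilarities."""
    n = len(dissim)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.sqrt(
                sum((genes[i * dim + k] - genes[j * dim + k]) ** 2 for k in range(dim))
            )
            total += (dist - dissim[i][j]) ** 2
    return total


def genetic_mds(dissim, dim=2, pop_size=40, generations=150, mutation_scale=0.1, seed=0):
    """Evolve a population of embeddings; each individual is a flat coordinate list."""
    rng = random.Random(seed)
    n_genes = len(dissim) * dim

    def fitness(genes):
        return stress(genes, dissim, dim)

    population = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness)
        best_per_generation.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # selection: keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)  # recombination: one-point crossover
            child = mother[:cut] + father[cut:]
            children.append([g + rng.gauss(0.0, mutation_scale) for g in child])  # mutation
        population = parents + children
    population.sort(key=fitness)
    return population, best_per_generation


# Toy dissimilarities of a unit square (made-up example data).
root2 = math.sqrt(2.0)
square = [
    [0.0, 1.0, root2, 1.0],
    [1.0, 0.0, 1.0, root2],
    [root2, 1.0, 0.0, 1.0],
    [1.0, root2, 1.0, 0.0],
]
population, history = genetic_mds(square)
```

The function returns the whole sorted population rather than only the best individual, so several low-stress embeddings can be plotted side by side, in the spirit of the population of optima described above.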


2018, Vol 224, pp. 02071
Author(s): Dmitrii Voronin, Victoria Shevchenko, Olga Chengar

Scientific problems related to the classification, assessment, visualization, and management of risks in cloud environments are considered. State-of-the-art methods proposed for these problems are analyzed, taking into account the specifics of cloud infrastructure oriented toward large-scale task processing in distributed production infrastructures. Little objective scientific research has focused on developing effective approaches to cloud risk visualization that provide the information needed to support decision-making in distributed production infrastructures. To fill this research gap, the study proposes a risk visualization technique based on radar charts for multidimensional data visualization.


2009, Vol 14 (2), pp. 259-270
Author(s): Julius Žilinskas

Multidimensional scaling is a technique for exploratory analysis of multidimensional data. The essential part of the technique is minimization of a multimodal function with unfavorable properties like invariants and non‐differentiability. In this paper a two‐level optimization based on combinatorial optimization and systems of linear equations is proposed exploiting piecewise quadratic structure of the objective function with city‐block distances. The approach is tested experimentally and improvement directions are identified.
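The piecewise quadratic structure mentioned above can be made concrete with a short Python sketch. The names `cityblock_stress` and `sign_pattern` and the unweighted form of the stress are assumptions for illustration; the two-level algorithm of the paper is not reproduced. With city-block distances, each distance is linear in the coordinates as long as the signs of all coordinate differences stay fixed, so the stress is quadratic inside each such region and loses differentiability where a sign changes.

```python
def cityblock_stress(points, dissim):
    """Unweighted stress using city-block (L1) distances between embedded points."""
    total = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            dist = sum(abs(a - b) for a, b in zip(points[i], points[j]))
            total += (dist - dissim[i][j]) ** 2
    return total


def sign_pattern(points):
    """Signs of all pairwise coordinate differences.

    While this tuple stays the same, every city-block distance is a linear
    function of the coordinates, so the stress is a quadratic function.
    """
    return tuple(
        (a > b) - (a < b)
        for i, p in enumerate(points)
        for q in points[i + 1:]
        for a, b in zip(p, q)
    )
```

A combinatorial search over sign patterns combined with a quadratic subproblem inside each region is one reading of the two-level idea in the abstract; the sketch only exposes the quantities such a scheme would work with.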


2021, Vol 35 (2), pp. 115-122
Author(s): Mohan Mahanty, K. Swathi, K. Sasi Teja, P. Hemanth Kumar, A. Sravani

The COVID-19 pandemic has shaken the whole world, and its spread is still rising daily, causing many nations to suffer seriously. This paper presents a medical stance on research studies of COVID-19: we estimate a statistical time-series model using Prophet to understand the trend of the pandemic after July 29, 2020, using global data. Prophet is an open-source framework developed by the Data Science team at Facebook for forecasting. It helps automate the process of producing accurate forecasts and can be customized to the use case at hand. The Prophet model is easy to work with: its official repository is public on GitHub and open to contributions, and the model can be fitted with little effort. The data used in the paper are the officially reported numbers of daily confirmed cases for the period January 22, 2020, to July 29, 2020. The estimates produced by the forecast models can then be used by governments and medical care departments of various countries to manage the existing situation and try to flatten the curve, as we believe there is minimal time to do this. The inferences made using the model can be understood without much effort. Furthermore, the model gives an understanding of past, present, and future trends through graphical forecasts and statistics. Compared to other models, Prophet stands out because it is fully automated and generates quick and precise forecasts that can be further tuned.


Author(s): Olga Blazekova, Maria Vojtekova

The airspace domain may be represented by a time-space consisting of a three-dimensional Cartesian coordinate system with time as the fourth dimension. A coordinate system provides a scheme for locating points given their coordinates and vice versa. The choice of coordinate system is important, as it transforms data into a geometric representation. Data with three or more dimensions are usually visualized on a two-dimensional surface, such as a computer monitor, by projection, which often restricts the amount of information presented at a time. The parallel coordinate system is one way to present multidimensional data. The aim of this article is to describe the basics of the parallel coordinate system and to investigate lines and their characteristics in time-space.
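A basic property of lines in parallel coordinates can be checked with a small Python sketch. It covers only the two-dimensional case with two vertical axes placed `spacing` apart, not the four-dimensional time-space treated in the article, and the function names are hypothetical. A point (x1, x2) becomes a segment joining height x1 on the first axis to height x2 on the second; for points on the line x2 = slope * x1 + intercept with slope different from 1, all these segments (extended if necessary) pass through one common point.

```python
def polyline_segment(x1, x2, spacing=1.0):
    """Segment representing the 2D point (x1, x2): axes sit at 0 and `spacing`."""
    return (0.0, x1), (spacing, x2)


def height_at(segment, position):
    """Height of the (extended) segment at a horizontal position."""
    (t0, y0), (t1, y1) = segment
    return y0 + (position - t0) * (y1 - y0) / (t1 - t0)


def dual_point(slope, intercept, spacing=1.0):
    """Common point of the segments of all points on x2 = slope * x1 + intercept."""
    if slope == 1:
        raise ValueError("slope 1 gives parallel segments with no finite common point")
    return spacing / (1 - slope), intercept / (1 - slope)
```

For a negative slope the common point lies between the two axes, which is why negatively related variables show up as a bundle of crossing lines in a parallel coordinate plot.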


2014, Vol 59 (2), pp. 413-425
Author(s): Dariusz Jamróz

Visualization of multidimensional data is a new approach to statistical analysis that belongs to the so-called statistical graphical methods. These methods make it possible to classify analyzed objects while taking their various features into account. For grained materials such as coal or ores, many characteristics influence the quality of the product. In the case of coal, many features must be taken into consideration to determine the quality of the material. Apart from the most obvious characteristics, such as particle size, particle density, or ash content, there are many others that cause significant differences between the considered types of material. The paper presents an application of the Multidimensional Scaling Method, one of the multidimensional data visualization techniques. For this purpose, three types of coal were sampled: 31, 34.2, and 35 (according to the Polish classification of coal types). First, the material was screened on sieves and then divided into density fractions. Next, the obtained fractions of the investigated coal were analyzed chemically. Then, the Multidimensional Scaling Method was applied to visualize the investigated data set. The applied methodology was shown to identify certain coal types efficiently and can be used as a qualitative criterion for grained materials. However, such identification could not be achieved when all three types of coal were compared together. The Multidimensional Scaling Method is a new technique of data analysis in broadly understood mineral processing.


1976, Vol 43 (2), pp. 575-584
Author(s): James R. Ullrich, Maureen F. Ullrich

To investigate the usefulness of multidimensional scaling analysis, 24 canoeists and fishermen were asked to judge the degree of similarity between all possible pairings of 12 river sections in Western Montana. Multidimensional scaling showed that perception of the river was based on a size dimension (physical breadth of the river) and an altered versus natural dimension (pristine versus developed).

