Hierarchical Exploration of Large Multivariate Data Sets

2003 ◽  
pp. 201-212 ◽  
Author(s):  
Jing Yang ◽  
Matthew O. Ward ◽  
Elke A. Rundensteiner
Keyword(s):  
2019 ◽  
Vol 19 (1) ◽  
pp. 3-23
Author(s):  
Aurea Soriano-Vargas ◽  
Bernd Hamann ◽  
Maria Cristina F de Oliveira

We present an integrated interactive framework for the visual analysis of time-varying multivariate data sets. As part of our research, we performed in-depth studies concerning the applicability of visualization techniques to obtain valuable insights. We consolidated the considered analysis and visualization methods in one framework, called TV-MV Analytics. TV-MV Analytics combines visualization and data mining algorithms, providing the following capabilities: (1) visual exploration of multivariate data at different temporal scales, and (2) a hierarchical small multiples visualization combined with interactive clustering and multidimensional projection to detect temporal relationships in the data. We demonstrate the value of our framework for specific scenarios by studying three use cases that were validated and discussed with domain experts.


Author(s):  
Gudmund Kleiven

The Empirical Orthogonal Functions (EOF) technique has been widely used by oceanographers and meteorologists, while Singular Value Decomposition (SVD), a related technique, is frequently used in the statistics community. Another related technique, Principal Component Analysis (PCA), is used, for instance, in pattern recognition. The predominant application of these techniques is the compression of multivariate data sets, which also facilitates subsequent statistical analysis of such data sets. Within ocean engineering the EOF technique is not yet widely used, although there are several areas where multivariate data sets occur and where the EOF technique could serve as a supplementary analysis technique. Examples are oceanographic data, in particular current data. Data sets of model- or full-scale loads and responses of slender bodies, such as pipelines and risers, are further relevant examples. One attractive property of the EOF technique is that it does not require any a priori information on the physical system that generates the data. The present paper gives a description of the EOF technique and then presents an example of its use: the analysis of response data from a model test of a pipeline in a long free span exposed to current. The model test program was carried out to identify the occurrence of multi-mode vibrations and the vibration mode amplitudes. In this example the EOF technique proves capable of identifying the predominant vibration modes of in-line as well as cross-flow vibrations. Vibration mode shapes, together with mode amplitudes and frequencies, are also estimated. Although this example is not sufficient to draw conclusions about the applicability of the EOF technique in general, its results demonstrate some of the potential of the technique.
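
To make the relationship between EOF analysis and SVD concrete, the following is a minimal sketch, not the paper's implementation. It assumes the data are arranged as a matrix with one row per time sample and one column per measurement location, and it uses NumPy's `numpy.linalg.svd`. The function name `eof_analysis`, the synthetic two-mode signal, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def eof_analysis(data):
    """EOF decomposition of a (n_times, n_locations) array via SVD.

    Returns the spatial patterns (one EOF per row), the time amplitudes
    (one principal component per column), and the fraction of variance
    explained by each mode.
    """
    # Remove the time mean at every location so the modes describe variability.
    anomalies = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                       # rows: orthonormal spatial patterns
    pcs = u * s                     # columns: time series of mode amplitudes
    variance_fraction = s**2 / np.sum(s**2)
    return eofs, pcs, variance_fraction


# Hypothetical test signal: two standing-wave shapes along a span,
# each oscillating at its own frequency, plus weak measurement noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 400)     # time samples (assumed units: s)
x = np.linspace(0.0, 1.0, 50)       # normalized position along the span
mode1 = np.sin(np.pi * x)
mode2 = np.sin(2.0 * np.pi * x)
signal = (2.0 * np.outer(np.sin(2.0 * np.pi * 1.0 * t), mode1)
          + 0.5 * np.outer(np.sin(2.0 * np.pi * 3.0 * t), mode2))
data = signal + 0.01 * rng.standard_normal(signal.shape)

eofs, pcs, var_frac = eof_analysis(data)
print("variance explained by the first two modes:", var_frac[:2].sum())
```

In this sketch no knowledge of the mode shapes enters the decomposition; they are recovered from the data alone, which mirrors the property noted in the abstract that no a priori information on the physical system is required. Each EOF is determined only up to sign.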


2013 ◽  
Vol 19 (12) ◽  
pp. 2683-2692 ◽  
Author(s):  
Ayan Biswas ◽  
Soumya Dutta ◽  
Han-Wei Shen ◽  
Jonathan Woodring

Sensors ◽  
2019 ◽  
Vol 19 (1) ◽  
pp. 166 ◽  
Author(s):  
Rahim Khan ◽  
Ihsan Ali ◽  
Saleh M. Altowaijri ◽  
Muhammad Zakarya ◽  
Atiq Ur Rahman ◽  
...  

Multivariate data sets are common in various application areas, such as wireless sensor networks (WSNs) and DNA analysis. A robust mechanism is required to compute their similarity indexes regardless of the environment and problem domain. This study describes the usefulness of a non-metric-based approach (i.e., longest common subsequence) in computing similarity indexes. Several non-metric-based algorithms are available in the literature; the most robust and reliable one is the dynamic programming-based technique. However, dynamic programming-based techniques are considered inefficient, particularly in the context of multivariate data sets. Furthermore, the classical approaches are not powerful enough in scenarios that involve multivariate data sets or sensor data, or in which the similarity indexes are extremely high or low. To address this issue, we propose an efficient algorithm to measure the similarity indexes of multivariate data sets using a non-metric-based methodology. The proposed algorithm performs exceptionally well on numerous multivariate data sets compared with the classical dynamic programming-based algorithms. The performance of the algorithms is evaluated on the basis of several benchmark data sets and a dynamic multivariate data set, which is obtained from a WSN deployed in the Ghulam Ishaq Khan (GIK) Institute of Engineering Sciences and Technology. Our evaluation suggests that the proposed algorithm can be approximately 39.9% more efficient than its counterparts for various data sets in terms of computational time.
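
For orientation, the classical dynamic programming baseline mentioned in the abstract can be sketched as follows. This is the textbook longest common subsequence (LCS) recurrence, not the efficient algorithm the paper proposes, whose details the abstract does not give. The names `lcs_length` and `lcs_similarity`, the normalization by the longer sequence, and the sample readings are assumptions made for illustration; each multivariate sample is treated as a tuple that must match exactly.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Classical dynamic programming with O(len(a) * len(b)) time, keeping
    only the previous table row to limit memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching items extend the subsequence found so far.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """Similarity index in [0, 1]: LCS length over the longer length.

    The normalization is an assumption for this sketch; other choices
    (for example, dividing by the shorter length) are equally possible.
    """
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


# Hypothetical sensor readings: each sample is (temperature, humidity).
series_a = [(21, 40), (22, 41), (23, 43), (24, 45)]
series_b = [(21, 40), (23, 43), (22, 41), (24, 45)]
print(lcs_similarity(series_a, series_b))
```

The quadratic cost of filling the table is the inefficiency the abstract refers to; it grows quickly for long sensor streams.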


2020 ◽  
Vol 34 (04) ◽  
pp. 6786-6794
Author(s):  
Lifeng Zhang

Detecting relationships among multivariate data is often of great importance in the analysis of high-dimensional data sets, and has received growing attention for decades from both academic and industrial fields. In this study, we propose a statistical tool named the neighbor correlation coefficient (nCor), which is based on a new idea that measures the local continuity of the reordered data points to quantify the strength of the global association between variables. With sufficient sample size, the new method is able to capture a wide range of functional relationships, whether linear or nonlinear, bivariate or multivariate, main effect or interaction. The nCor score roughly approximates the coefficient of determination (R²) of the data, that is, the proportion of variance in one variable that is predictable from one or more other variables. On this basis, three nCor-based statistics are also proposed here to further characterize the intra and inter structures of the associations from the aspects of nonlinearity, interaction effect, and variable redundancy. The mechanisms of these measures are proved in theory and demonstrated with numerical analyses.


2021 ◽  
pp. 147387162110506
Author(s):  
Kenan Koc ◽  
Andrew Stephen McGough ◽  
Sara Johansson Fernstad

For many data analysis tasks, such as the formation of well-balanced groups for a fair race or collaboration in learning settings, the balance between data attributes is at least as important as the actual values of items. At the same time, comparison of values is implicitly desired for these tasks. Even with statistical methods available to measure the level of balance, human judgment and domain expertise play an important role in judging the level of balance and whether the level of unbalance is acceptable. Accordingly, there is a need for techniques that improve decision-making in the context of group formation and that can be used as a visual complement to statistical analysis. This paper introduces a novel glyph-based visualization, PeaGlyph, which aims to support the understanding of balanced and unbalanced data structures, for instance by using a frequency format through countable marks and salient shape characteristics. The glyph was designed particularly for tasks relevant to investigating the properties of balanced and unbalanced groups, such as looking up and comparing values. Glyph-based visualization methods provide flexible and useful abstractions for exploring and analyzing multivariate data sets. The PeaGlyph design was based on an initial study that compared four glyph visualization methods, comprising two base glyphs and their variations. The performance of the novel PeaGlyph was then evaluated against the best performers of the first study. The initial results from the study are encouraging, and the proposed design may be a good alternative to the traditional glyphs for depicting multivariate data and allowing viewers to form an intuitive impression of how balanced or unbalanced a set of objects is. Furthermore, a set of design considerations is discussed in the context of the design of the glyphs.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 21759-21769
Author(s):  
Rahim Khan ◽  
Muhammad Zakarya ◽  
Ayaz Ali Khan ◽  
Izaz Ur Rahman ◽  
Mohd Amiruddin Abd Rahman ◽  
...  
