A Novel Approach to Visualization of High-Dimensional Data by Pairwise Fusion Matrices Using t-SNE

Author(s):  
Mujtaba Husnain ◽  
Malik Muhammad Saad Missen ◽  
Shahzad Mumtaz ◽  
Muhammad Muzzamil Luqman ◽  
Mickaël Coustaty ◽  
...  
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Olivier B. Simon ◽  
Isabelle Buard ◽  
Donald C. Rojas ◽  
Samantha K. Holden ◽  
Benzi M. Kluger ◽  
...  

Graph theory-based approaches are efficient tools for detecting clustering and group-wise differences in high-dimensional data across a wide range of fields, such as gene expression analysis and neural connectivity. Here, we examine data from a cross-sectional, resting-state magnetoencephalography study of 89 Parkinson’s disease patients, and use minimum-spanning tree (MST) methods to relate severity of Parkinsonian cognitive impairment to neural connectivity changes. In particular, we implement the two-sample multivariate-runs test of Friedman and Rafsky (Ann Stat 7(4):697–717, 1979) and find it to be a powerful paradigm for distinguishing highly significant deviations from the null distribution in high-dimensional data. We also generalize this test for use with more than two classes, and show its ability to localize significance to particular sub-classes. We observe multiple indications of altered connectivity in Parkinsonian dementia that may be of future use in diagnosis and prediction.
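The Friedman–Rafsky statistic described above can be sketched briefly: pool both samples, build an MST on the pooled points, and count the edges that join points from different samples; far fewer cross-sample edges than expected under the null indicates the two distributions differ. The following is a minimal sketch of that count (the significance calibration against the permutation null is omitted, and the function name is ours):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def friedman_rafsky_runs(X, labels):
    """Count MST edges joining points with different sample labels.

    Under the null (both samples from one distribution) this count is
    large; well-separated samples yield very few cross-sample edges.
    """
    D = squareform(pdist(X))              # pairwise Euclidean distances
    mst = minimum_spanning_tree(D).tocoo()  # MST on the pooled sample
    labels = np.asarray(labels)
    # edges whose two endpoints carry different labels
    return int(np.sum(labels[mst.row] != labels[mst.col]))
```

For two well-separated point clouds the MST bridges them exactly once, so the count drops to 1, whereas two halves of one distribution share many cross edges.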


Symmetry ◽  
2019 ◽  
Vol 11 (1) ◽  
pp. 107 ◽  
Author(s):  
Mujtaba Husnain ◽  
Malik Missen ◽  
Shahzad Mumtaz ◽  
Muhammad Luqman ◽  
Mickaël Coustaty ◽  
...  

We applied t-distributed stochastic neighbor embedding (t-SNE) to visualize Urdu handwritten numerals (or digits). The data set used consists of 28 × 28 images of handwritten Urdu numerals. The data set was created by inviting writers from different categories of native Urdu speakers. One of the challenging and critical issues for the correct visualization of Urdu numerals is the shape similarity between some of the digits. This issue was resolved using t-SNE, by exploiting local and global structures of the large data set at different scales. The global structure consists of geometrical features, and the local structure is the pixel-based information for each class of Urdu digits. We introduce a novel approach that allows the fusion of these two independent spaces using Euclidean pairwise distances in a highly organized and principled way. The fusion matrix embedded with t-SNE helps to locate each data point in a two- (or three-) dimensional map in a very different way. Furthermore, our proposed approach focuses on preserving the local structure of the high-dimensional data while mapping to a low-dimensional plane. The visualizations produced by t-SNE outperformed other classical techniques like principal component analysis (PCA) and auto-encoders (AE) on our handwritten Urdu numeral dataset.
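The pairwise-fusion idea can be sketched as follows: each feature space (global geometric, local pixel-based) contributes a Euclidean distance matrix, the two matrices are blended into one fusion matrix, and t-SNE is run on it with a precomputed metric. The blending weight `alpha` and the min–max scaling below are our assumptions, not the paper's exact construction:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import TSNE

def fused_tsne(features_global, features_local, alpha=0.5, random_state=0):
    """Embed points by fusing two pairwise-distance matrices, then t-SNE.

    Each feature space contributes a Euclidean distance matrix; both are
    scaled to [0, 1] and blended before a precomputed-metric t-SNE run.
    (alpha and the scaling are illustrative assumptions.)
    """
    Dg = squareform(pdist(features_global))  # global-feature distances
    Dl = squareform(pdist(features_local))   # local pixel-space distances
    Dg = Dg / Dg.max()                       # put both on a common scale
    Dl = Dl / Dl.max()
    D = alpha * Dg + (1 - alpha) * Dl        # pairwise fusion matrix
    tsne = TSNE(n_components=2, metric="precomputed", init="random",
                perplexity=5, random_state=random_state)
    return tsne.fit_transform(D)
```

Note that scikit-learn's `TSNE` requires `init="random"` when the metric is precomputed, since PCA initialization needs raw feature vectors.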


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Jan Kalina ◽  
Anna Schlenker

The Minimum Redundancy Maximum Relevance (MRMR) approach to supervised variable selection represents a successful methodology for dimensionality reduction, which is suitable for high-dimensional data observed in two or more different groups. Various available versions of the MRMR approach have been designed to search for variables with the largest relevance for a classification task while controlling for redundancy of the selected set of variables. However, the usual relevance and redundancy criteria have the disadvantages of being too sensitive to the presence of outlying measurements and/or being inefficient. We propose a novel approach called Minimum Regularized Redundancy Maximum Robust Relevance (MRRMRR), suitable for noisy high-dimensional data observed in two groups. It combines principles of regularization and robust statistics. In particular, redundancy is measured by a new regularized version of the coefficient of multiple correlation, and relevance is measured by a highly robust correlation coefficient based on the least weighted squares regression with data-adaptive weights. We compare various dimensionality reduction methods on three real data sets. To investigate the influence of noise or outliers on the data, we also perform the computations for data artificially contaminated by severe noise of various forms. The experimental results confirm the robustness of the method with respect to outliers.
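For orientation, the baseline MRMR criterion that MRRMRR robustifies can be sketched as a greedy search: pick the variable most correlated with the response, then repeatedly add the variable maximizing relevance minus its mean correlation with the already-selected set. This sketch uses plain Pearson correlation for both terms; the paper replaces them with regularized and robust estimators:

```python
import numpy as np

def mrmr_select(X, y, k):
    """Greedy baseline MRMR: maximize relevance minus redundancy.

    Relevance: |Pearson correlation| with the response; redundancy:
    mean |correlation| with already-selected variables. MRRMRR swaps
    both for regularized/robust counterparts.
    """
    n, p = X.shape
    rel = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
    selected = [int(np.argmax(rel))]       # start with the most relevant
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(p):
            if j in selected:
                continue
            red = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                           for s in selected])
            score = rel[j] - red           # relevance penalized by redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

The redundancy penalty is what keeps a near-duplicate of an already-selected variable from being chosen ahead of an independent, moderately relevant one.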


2011 ◽  
Vol 366 ◽  
pp. 456-459 ◽  
Author(s):  
Jun Yang ◽  
Ying Long Wang

Detecting outliers in a large set of data objects is a major data mining task, aiming at finding the different mechanisms responsible for different groups of objects in a data set. In high-dimensional data, such approaches are bound to deteriorate due to the notorious “curse of dimensionality”. In this paper, we propose a novel approach named ODMC (Outlier Detection Based on Markov Chain), in which the effects of the “curse of dimensionality” are alleviated compared with purely distance-based approaches. A main advantage of the new approach is that it uses a key feature of an undirected weighted graph to calculate the outlier degree of each node. In a thorough experimental evaluation, we compare ODMC with ABOD and FindFPOF on various artificial and real data sets and show that ODMC performs especially well on high-dimensional data.
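The abstract does not spell out the exact ODMC construction, but the general idea of scoring outliers through a Markov chain on a weighted graph can be sketched generically: connect points by Gaussian similarities, run a random walk to its stationary distribution, and flag nodes carrying little stationary mass. The kernel bandwidth and scoring below are our assumptions, not the paper's method:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def markov_outlier_scores(X, sigma=1.0):
    """Outlier degree from a random walk on a similarity graph.

    Edges carry Gaussian similarities; the walk's stationary
    distribution concentrates on densely connected nodes, so low
    stationary mass marks outliers. (A generic sketch only; the exact
    ODMC graph construction is not reproduced here.)
    """
    W = np.exp(-squareform(pdist(X)) ** 2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)               # no self-loops
    P = W / W.sum(axis=1, keepdims=True)   # row-stochastic transitions
    pi = np.full(len(X), 1.0 / len(X))     # start from the uniform law
    for _ in range(200):                   # power iteration to stationarity
        pi = pi @ P
    return 1.0 - pi / pi.max()             # higher score = more outlying
```

On an undirected weighted graph the stationary probability of a node is proportional to its total edge weight, so isolated points receive the highest scores.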


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Tuba Koç

High-dimensional data sets frequently occur in several scientific areas, and special techniques are required to analyze these types of data sets. In particular, it becomes important to apply a suitable model in classification problems. In this study, a novel approach is proposed to estimate a statistical model for high-dimensional data sets. The proposed method uses the analytic hierarchy process (AHP) and information criteria to determine the optimal principal components (PCs) for the classification model. The high-dimensional “colon” and “gravier” datasets were used in the evaluation. Application results demonstrate that the proposed approach can be successfully used for modeling purposes.
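The information-criterion half of this procedure can be sketched as follows: fit a classifier on the first m principal components for each candidate m and keep the m minimizing BIC. The AHP weighting of candidate models is omitted here, and the logistic model and BIC form are our assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def select_pcs_by_bic(X, y, max_pcs=10):
    """Pick the number of PCs for a classifier via an information criterion.

    Fits a logistic model on the first m principal components for each m
    and keeps the m minimizing BIC. (The paper additionally combines
    criteria with AHP; that step is omitted in this sketch.)
    """
    n = len(y)
    Z = PCA(n_components=min(max_pcs, X.shape[1], n)).fit_transform(X)
    best_m, best_bic = 1, np.inf
    for m in range(1, Z.shape[1] + 1):
        clf = LogisticRegression(max_iter=1000).fit(Z[:, :m], y)
        # log-likelihood of each observed label under the fitted model
        p = clf.predict_proba(Z[:, :m])[np.arange(n), y]
        loglik = np.sum(np.log(p))
        bic = -2 * loglik + (m + 1) * np.log(n)  # m slopes + intercept
        if bic < best_bic:
            best_m, best_bic = m, bic
    return best_m
```

BIC's log(n) penalty per extra component keeps uninformative trailing PCs out of the final model, which matters when p far exceeds n as in the colon and gravier data.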

