H-tSNE: Hierarchical Nonlinear Dimensionality Reduction

2020 ◽  
Author(s):  
Kevin C. VanHorn ◽  
Murat Can Çobanoğlu

Abstract
Dimensionality reduction (DR) is often integral when analyzing high-dimensional data across scientific, economic, and social networking applications. For data with a high order of complexity, nonlinear approaches are often needed to identify and represent the most important components. We propose a novel DR approach that can incorporate a known underlying hierarchy. Specifically, we extend the widely used t-Distributed Stochastic Neighbor Embedding technique (t-SNE) to include hierarchical information and demonstrate its use with known or unknown class labels. We term this approach “H-tSNE.” Such a strategy can aid in discovering and understanding underlying patterns of a dataset that is heavily influenced by parent-child relationships. Without integrating information that is known a priori, we suggest that DR cannot function as effectively. In this regard, we argue for a DR approach that enables the user to incorporate known, relevant relationships even if their representation is weakly expressed in the dataset.
Availability
github.com/Cobanoglu-Lab/h-tSNE
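H-tSNE itself is available at the repository above; as a point of reference, the baseline it extends can be sketched with scikit-learn's standard t-SNE on synthetic data that has the kind of two-level parent-child structure the abstract describes. The cluster layout and all parameter values here are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic data with a two-level hierarchy: two parent classes,
# each containing two child clusters in 10-D space.
blocks = []
for parent in range(2):
    for child in range(2):
        center = rng.normal(loc=4.0 * parent, scale=0.5, size=10) + 1.5 * child
        blocks.append(center + 0.3 * rng.standard_normal((50, 10)))
X = np.vstack(blocks)  # shape (200, 10)

# Plain t-SNE baseline; H-tSNE would additionally bias the embedding
# using the known parent-child relationships.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (200, 2)
```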

2010 ◽  
Vol 7 (1) ◽  
pp. 127-138 ◽  
Author(s):  
Zhao Zhang ◽  
Ye Ning

Dimensionality reduction is an important preprocessing step in high-dimensional data analysis that reduces dimensionality without losing intrinsic information. We consider a semi-supervised nonlinear dimensionality reduction method, called KNDR, for wood defect recognition. In this setting, domain knowledge in the form of pairwise constraints specifies whether pairs of instances belong to the same class or to different classes. KNDR projects the data onto a set of 'useful' features, preserving the structure of labeled and unlabeled data as well as the constraints in the embedding space, under which the projections of the original data can be effectively partitioned. We demonstrate the practical usefulness of KNDR for data visualization and wood defect recognition through extensive experiments. The results show that it achieves performance similar to or higher than several existing methods.
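The pairwise constraints this abstract relies on can be sketched as must-link and cannot-link pairs derived from a few labeled instances. The defect labels below are hypothetical, and KNDR's kernel projection itself is not reproduced; only the constraint-construction step is shown.

```python
import itertools

# Hypothetical labels for a handful of wood samples; in the
# semi-supervised setting most instances would be unlabeled.
labels = {0: "sound", 1: "sound", 2: "knot", 3: "crack", 4: "knot"}

# Same label -> must-link constraint; different labels -> cannot-link.
must_link, cannot_link = [], []
for (i, yi), (j, yj) in itertools.combinations(labels.items(), 2):
    (must_link if yi == yj else cannot_link).append((i, j))

print(must_link)         # [(0, 1), (2, 4)]
print(len(cannot_link))  # 8
```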


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Sai Kiranmayee Samudrala ◽  
Jaroslaw Zola ◽  
Srinivas Aluru ◽  
Baskar Ganapathysubramanian

Dimensionality reduction refers to a set of mathematical techniques used to reduce complexity of the original high-dimensional data, while preserving its selected properties. Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data. However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions. In this paper, we propose a parallel framework for dimensionality reduction of large-scale data. We identify key components underlying the spectral dimensionality reduction techniques, and propose their efficient parallel implementation. We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods. To further demonstrate applicability of our framework we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.
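The "key components underlying the spectral dimensionality reduction techniques" mentioned above can be sketched serially with one representative method, Laplacian eigenmaps: neighborhood-graph construction, the graph Laplacian, and an eigendecomposition. This is a minimal single-machine sketch for orientation only; the paper's contribution, the parallel implementation of these stages, is not shown.

```python
import numpy as np

def laplacian_eigenmaps(X, n_neighbors=10, n_components=2):
    """Serial sketch of the stages common to spectral DR methods:
    kNN graph -> graph Laplacian -> bottom eigenvectors."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]      # nearest neighbors, skip self
        W[i, nbrs] = np.exp(-d2[i, nbrs] / d2[i, nbrs].mean())
    W = np.maximum(W, W.T)                               # symmetrize the graph
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt      # normalized Laplacian
    _, vecs = np.linalg.eigh(L_sym)                      # eigenvalues ascending
    return vecs[:, 1:n_components + 1]                   # drop the trivial eigenvector

X = np.random.default_rng(0).standard_normal((200, 50))
Y = laplacian_eigenmaps(X)
print(Y.shape)  # (200, 2)
```

Each stage here (all-pairs distances, graph assembly, eigensolve) is exactly the kind of step that dominates cost at scale and motivates the parallel framework.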


2020 ◽  
Vol 49 (3) ◽  
pp. 421-437
Author(s):  
Genggeng Liu ◽  
Lin Xie ◽  
Chi-Hua Chen

Dimensionality reduction plays an important role in data processing for machine learning and data mining, making the processing of high-dimensional data more efficient. Dimensionality reduction extracts a low-dimensional feature representation of high-dimensional data; an effective method not only retains most of the useful information in the original data, but also removes useless noise. Dimensionality reduction methods can be applied to all types of data, especially image data. Although supervised learning methods have achieved good results in dimensionality reduction, their performance depends on the number of labeled training samples. As the amount of information on the Internet grows, labeling data requires more resources and becomes more difficult. Therefore, using unsupervised learning to learn data features has great research value. In this paper, an unsupervised multilayered variational auto-encoder model is studied on text data, so that the mapping from high-dimensional to low-dimensional features is efficient and the low-dimensional features retain as much of the main information as possible. Low-dimensional features obtained by different dimensionality reduction methods are compared with the dimensionality reduction results of the variational auto-encoder (VAE), and the method shows significant improvements over the other comparison methods.
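The two VAE ingredients that make this kind of dimensionality reduction work, a probabilistic encoder and the reparameterization trick, can be sketched with a toy single-layer linear encoder in NumPy. The shapes, weights, and "documents" below are illustrative assumptions; the paper's multilayered model and its training loop are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    # Toy linear encoder: maps each input to the mean and log-variance
    # of a diagonal Gaussian over the latent code.
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar):
    # z = mu + sigma * eps: the reparameterization trick, which lets
    # gradients flow through the stochastic latent layer during training.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_divergence(mu, logvar):
    # Closed-form KL(q(z|x) || N(0, I)), averaged over the batch; this is
    # the regularizer added to the reconstruction loss in the VAE objective.
    return -0.5 * np.mean(np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1))

x = rng.standard_normal((8, 1000))             # 8 high-dimensional "documents"
W_mu = 0.01 * rng.standard_normal((1000, 16))  # 16-D latent (reduced) space
W_logvar = 0.01 * rng.standard_normal((1000, 16))

mu, logvar = encode(x, W_mu, W_logvar)
z = reparameterize(mu, logvar)                 # the low-dimensional features
print(z.shape)  # (8, 16)
```

After training, it is the latent mean `mu` (or a sample `z`) that serves as the low-dimensional feature representation compared against other methods.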


2019 ◽  
Vol 283 ◽  
pp. 07009
Author(s):  
Xinyao Zhang ◽  
Pengyu Wang ◽  
Ning Wang

Dimensionality reduction is one of the central problems in machine learning and pattern recognition, which aims to develop a compact representation for complex data from high-dimensional observations. Here, we apply a nonlinear manifold learning algorithm, called local tangent space alignment (LTSA) algorithm, to high-dimensional acoustic observations and achieve nonlinear dimensionality reduction for the acoustic field measured by a linear sensor array. By dimensionality reduction, the underlying physical degrees of freedom of acoustic field, such as the variations of sound source location and sound speed profiles, can be discovered. Two simulations are presented to verify the validity of the approach.
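LTSA is available in scikit-learn as a variant of locally linear embedding, which gives a quick way to see the algorithm recover a low-dimensional manifold. The synthetic S-curve below is a stand-in assumption; the acoustic array measurements from the paper are not available here.

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import LocallyLinearEmbedding

# A 3-D point cloud lying on a 2-D manifold, standing in for the
# high-dimensional acoustic observations.
X, _ = make_s_curve(n_samples=500, random_state=0)

# LTSA via scikit-learn's LLE solver with method="ltsa".
ltsa = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method="ltsa")
Y = ltsa.fit_transform(X)
print(Y.shape)  # (500, 2)
```

The two embedding coordinates play the role of the discovered physical degrees of freedom (e.g., source location and sound speed profile variations) in the paper's simulations.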


2020 ◽  
Vol 26 (4) ◽  
pp. 1661-1671 ◽  
Author(s):  
Dietrich Kammer ◽  
Mandy Keck ◽  
Thomas Grunder ◽  
Alexander Maasch ◽  
Thomas Thom ◽  
...  

2015 ◽  
Vol 150 ◽  
pp. 570-582 ◽  
Author(s):  
Hannah Kim ◽  
Jaegul Choo ◽  
Chandan K. Reddy ◽  
Haesun Park
