H-tSNE: Hierarchical Nonlinear Dimensionality Reduction

2020 ◽  
Author(s):  
Kevin C. VanHorn ◽  
Murat Can Çobanoğlu

Abstract
Dimensionality reduction (DR) is often integral when analyzing high-dimensional data across scientific, economic, and social networking applications. For data with a high order of complexity, nonlinear approaches are often needed to identify and represent the most important components. We propose a novel DR approach that can incorporate a known underlying hierarchy. Specifically, we extend the widely used t-Distributed Stochastic Neighbor Embedding technique (t-SNE) to include hierarchical information and demonstrate its use with known or unknown class labels. We term this approach “H-tSNE.” Such a strategy can aid in discovering and understanding underlying patterns of a dataset that is heavily influenced by parent-child relationships. Without integrating information that is known a priori, we suggest that DR cannot function as effectively. In this regard, we argue for a DR approach that enables the user to incorporate known, relevant relationships even if their representation is weakly expressed in the dataset.
Availability
github.com/Cobanoglu-Lab/h-tSNE
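H-tSNE itself is available at the repository above; as a point of reference, the baseline it extends can be sketched with scikit-learn's standard t-SNE on synthetic data that has the kind of two-level parent-child structure the abstract describes. The cluster layout and all parameter values here are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic data with a two-level hierarchy: two parent classes,
# each containing two child clusters in 10-D space.
blocks = []
for parent in range(2):
    for child in range(2):
        center = rng.normal(loc=4.0 * parent, scale=0.5, size=10) + 1.5 * child
        blocks.append(center + 0.3 * rng.standard_normal((50, 10)))
X = np.vstack(blocks)  # shape (200, 10)

# Plain t-SNE baseline; H-tSNE would additionally bias the embedding
# using the known parent-child relationships.
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(emb.shape)  # (200, 2)
```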

2010 ◽  
Vol 7 (1) ◽  
pp. 127-138 ◽  
Author(s):  
Zhao Zhang ◽  
Ye Ning

Dimensionality reduction is an important preprocessing step in high-dimensional data analysis that reduces dimensionality without losing intrinsic information. We consider a semi-supervised nonlinear dimensionality reduction method, called KNDR, for wood defect recognition. In this setting, domain knowledge in the form of pairwise constraints specifies whether pairs of instances belong to the same class or to different classes. KNDR projects the data onto a set of 'useful' features, preserving the structure of labeled and unlabeled data as well as the constraints in the embedding space, under which the projections of the original data can be effectively partitioned. We demonstrate the practical usefulness of KNDR for data visualization and wood defect recognition through extensive experiments. The results show that it achieves performance similar to or higher than several existing methods.
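The pairwise constraints this abstract relies on can be sketched as must-link and cannot-link pairs derived from a few labeled instances. The defect labels below are hypothetical, and KNDR's kernel projection itself is not reproduced; only the constraint-construction step is shown.

```python
import itertools

# Hypothetical labels for a handful of wood samples; in the
# semi-supervised setting most instances would be unlabeled.
labels = {0: "sound", 1: "sound", 2: "knot", 3: "crack", 4: "knot"}

# Same label -> must-link constraint; different labels -> cannot-link.
must_link, cannot_link = [], []
for (i, yi), (j, yj) in itertools.combinations(labels.items(), 2):
    (must_link if yi == yj else cannot_link).append((i, j))

print(must_link)         # [(0, 1), (2, 4)]
print(len(cannot_link))  # 8
```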


2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Sai Kiranmayee Samudrala ◽  
Jaroslaw Zola ◽  
Srinivas Aluru ◽  
Baskar Ganapathysubramanian

Dimensionality reduction refers to a set of mathematical techniques used to reduce complexity of the original high-dimensional data, while preserving its selected properties. Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data. However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions. In this paper, we propose a parallel framework for dimensionality reduction of large-scale data. We identify key components underlying the spectral dimensionality reduction techniques, and propose their efficient parallel implementation. We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods. To further demonstrate applicability of our framework we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.
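The "key components underlying the spectral dimensionality reduction techniques" mentioned above can be sketched serially with one representative method, Laplacian eigenmaps: neighborhood-graph construction, the graph Laplacian, and an eigendecomposition. This is a minimal single-machine sketch for orientation only; the paper's contribution, the parallel implementation of these stages, is not shown.

```python
import numpy as np

def laplacian_eigenmaps(X, n_neighbors=10, n_components=2):
    """Serial sketch of the stages common to spectral DR methods:
    kNN graph -> graph Laplacian -> bottom eigenvectors."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]      # nearest neighbors, skip self
        W[i, nbrs] = np.exp(-d2[i, nbrs] / d2[i, nbrs].mean())
    W = np.maximum(W, W.T)                               # symmetrize the graph
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt      # normalized Laplacian
    _, vecs = np.linalg.eigh(L_sym)                      # eigenvalues ascending
    return vecs[:, 1:n_components + 1]                   # drop the trivial eigenvector

X = np.random.default_rng(0).standard_normal((200, 50))
Y = laplacian_eigenmaps(X)
print(Y.shape)  # (200, 2)
```

Each stage here (all-pairs distances, graph assembly, eigensolve) is exactly the kind of step that dominates cost at scale and motivates the parallel framework.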


2020 ◽  
Vol 49 (3) ◽  
pp. 421-437
Author(s):  
Genggeng Liu ◽  
Lin Xie ◽  
Chi-Hua Chen

Dimensionality reduction plays an important role in data processing for machine learning and data mining, making the processing of high-dimensional data more efficient. Dimensionality reduction extracts a low-dimensional feature representation of high-dimensional data; an effective method not only retains most of the useful information in the original data, but also removes useless noise. Dimensionality reduction methods can be applied to all types of data, especially image data. Although supervised learning methods have achieved good results in dimensionality reduction, their performance depends on the number of labeled training samples. As the amount of information on the Internet grows, labeling data requires more resources and becomes more difficult. Therefore, using unsupervised learning to learn data features has great research value. In this paper, an unsupervised multilayered variational auto-encoder model is studied on text data, so that the mapping from high-dimensional to low-dimensional features is efficient and the low-dimensional features retain as much of the main information as possible. Low-dimensional features obtained by different dimensionality reduction methods are compared with the dimensionality reduction results of the variational auto-encoder (VAE), and the method shows significant improvements over the other comparison methods.
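The two VAE ingredients that make this kind of dimensionality reduction work, a probabilistic encoder and the reparameterization trick, can be sketched with a toy single-layer linear encoder in NumPy. The shapes, weights, and "documents" below are illustrative assumptions; the paper's multilayered model and its training loop are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    # Toy linear encoder: maps each input to the mean and log-variance
    # of a diagonal Gaussian over the latent code.
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar):
    # z = mu + sigma * eps: the reparameterization trick, which lets
    # gradients flow through the stochastic latent layer during training.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_divergence(mu, logvar):
    # Closed-form KL(q(z|x) || N(0, I)), averaged over the batch; this is
    # the regularizer added to the reconstruction loss in the VAE objective.
    return -0.5 * np.mean(np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1))

x = rng.standard_normal((8, 1000))             # 8 high-dimensional "documents"
W_mu = 0.01 * rng.standard_normal((1000, 16))  # 16-D latent (reduced) space
W_logvar = 0.01 * rng.standard_normal((1000, 16))

mu, logvar = encode(x, W_mu, W_logvar)
z = reparameterize(mu, logvar)                 # the low-dimensional features
print(z.shape)  # (8, 16)
```

After training, it is the latent mean `mu` (or a sample `z`) that serves as the low-dimensional feature representation compared against other methods.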


2019 ◽  
Vol 283 ◽  
pp. 07009
Author(s):  
Xinyao Zhang ◽  
Pengyu Wang ◽  
Ning Wang

Dimensionality reduction is one of the central problems in machine learning and pattern recognition, which aims to develop a compact representation for complex data from high-dimensional observations. Here, we apply a nonlinear manifold learning algorithm, called local tangent space alignment (LTSA) algorithm, to high-dimensional acoustic observations and achieve nonlinear dimensionality reduction for the acoustic field measured by a linear sensor array. By dimensionality reduction, the underlying physical degrees of freedom of acoustic field, such as the variations of sound source location and sound speed profiles, can be discovered. Two simulations are presented to verify the validity of the approach.
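LTSA is available in scikit-learn as a variant of locally linear embedding, which gives a quick way to see the algorithm recover a low-dimensional manifold. The synthetic S-curve below is a stand-in assumption; the acoustic array measurements from the paper are not available here.

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import LocallyLinearEmbedding

# A 3-D point cloud lying on a 2-D manifold, standing in for the
# high-dimensional acoustic observations.
X, _ = make_s_curve(n_samples=500, random_state=0)

# LTSA via scikit-learn's LLE solver with method="ltsa".
ltsa = LocallyLinearEmbedding(n_neighbors=12, n_components=2, method="ltsa")
Y = ltsa.fit_transform(X)
print(Y.shape)  # (500, 2)
```

The two embedding coordinates play the role of the discovered physical degrees of freedom (e.g., source location and sound speed profile variations) in the paper's simulations.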


2020 ◽  
Vol 26 (4) ◽  
pp. 1661-1671 ◽  
Author(s):  
Dietrich Kammer ◽  
Mandy Keck ◽  
Thomas Grunder ◽  
Alexander Maasch ◽  
Thomas Thom ◽  
...  

2015 ◽  
Vol 150 ◽  
pp. 570-582 ◽  
Author(s):  
Hannah Kim ◽  
Jaegul Choo ◽  
Chandan K. Reddy ◽  
Haesun Park
