Dimensionality reduction for tensor data based on projection distance minimization and Hilbert-Schmidt independence criterion maximization

Tensor data are becoming increasingly common in machine learning. Compared with vector data, tensor data suffer more severely from the curse of dimensionality. The motivation of this paper is to combine the Hilbert-Schmidt Independence Criterion (HSIC) and tensor algebra to create a new dimensionality reduction algorithm for tensor data. There are three contributions in this paper. (1) An HSIC-based algorithm is proposed in which the dimension-reduced tensor is determined by maximizing HSIC between the dimension-reduced and high-dimensional tensors. (2) A tensor algebra-based algorithm is proposed, in which the high-dimensional tensors are projected onto a subspace and the projection coordinates are taken as the dimension-reduced tensors. The subspace is determined by minimizing the distance between the high-dimensional tensor data and their projections in the subspace. (3) By combining the above two algorithms, a new dimensionality reduction algorithm, called PDMHSIC, is proposed, in which the dimensionality reduction must satisfy two criteria at the same time: HSIC maximization and subspace projection distance minimization. The proposed algorithm is a new attempt to combine HSIC with other algorithms and achieves better experimental results on 8 commonly used datasets than 7 other well-known algorithms.
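The HSIC quantity maximized in contribution (1) can be illustrated with the standard biased empirical estimator, tr(KHLH)/(n-1)^2, where K and L are kernel matrices of the two sample sets and H is the centering matrix. The sketch below is not the paper's PDMHSIC algorithm: it operates on vectorized samples, and the Gaussian kernel, the bandwidths, and the function names `gaussian_kernel` and `hsic` are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between paired samples X (n x p) and Y (n x q)."""
    n = X.shape[0]
    K = gaussian_kernel(X, sigma_x)
    L = gaussian_kernel(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))          # stand-in for vectorized high-dimensional data
    Y_dependent = X @ np.array([[1.0], [0.5]])  # a linear function of X
    Y_independent = rng.normal(size=(200, 1))   # unrelated to X
    print("HSIC, dependent pair:  ", hsic(X, Y_dependent))
    print("HSIC, independent pair:", hsic(X, Y_independent))
```

A larger value indicates stronger statistical dependence between the two sample sets, which is why a reduced representation that keeps HSIC high retains information about the original data.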

An adaptive and efficient dimensionality reduction algorithm for high-dimensional indexing

Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405) ◽

10.1109/icde.2003.1260784 ◽

2004 ◽

Cited By ~ 22

Author(s):

H. Jin ◽

B.C. Ooi ◽

H.T. Shen ◽

C. Yu ◽

Ao Ying Zhou

Keyword(s):

Dimensionality Reduction ◽

High Dimensional ◽

Reduction Algorithm ◽

High Dimensional Indexing

Dimensionality reduction of tensor data based on local linear embedding and mode product

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202588 ◽

2021 ◽

pp. 1-18

Author(s):

Ting Gao ◽

Zhengming Ma ◽

Wenxu Gao ◽

Shuyu Liu

Keyword(s):

Dimensionality Reduction ◽

Learning Algorithm ◽

Vector Data ◽

Global Features ◽

Global Mode ◽

Local Linear Embedding ◽

Local Linear ◽

Clustering And Classification ◽

Linear Embedding ◽

Tensor Data

There are three contributions in this paper. (1) A tensor version of LLE (short for Local Linear Embedding) is derived and presented. LLE is one of the best-known manifold learning algorithms, and improvements to it have appeared continually since its proposal. However, these improvements apply only to vector data, not to tensor data. The proposed tensor LLE can also serve as a bridge for transferring such improvements from vector data to tensor data. (2) A framework of tensor dimensionality reduction based on the tensor mode product is proposed, in which the mode matrices can be determined according to specific criteria. (3) A novel dimensionality reduction algorithm for tensor data based on LLE and mode product (LLEMP-TDR) is proposed, in which LLE is used as the criterion to determine the mode matrices. Benefiting from local LLE and global mode product, the proposed LLEMP-TDR can preserve both local and global features of high-dimensional tensor data during dimensionality reduction. The experimental results on data clustering and classification tasks demonstrate that our method performs better than 5 other related algorithms published recently in top academic journals.
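The mode-product framework in contribution (2) can be sketched as follows: a tensor is reduced by multiplying each of its modes by a matrix with fewer rows than that mode has entries. This is only a minimal illustration of the mode-n product itself. The function names `mode_product` and `reduce_tensor` are hypothetical, and the mode matrices here are random placeholders, whereas the paper determines them with an LLE-based criterion.

```python
import numpy as np

def mode_product(T, U, mode):
    """Mode-n product: apply U (J x I_mode) along axis `mode` of tensor T."""
    # tensordot contracts U's columns with T's chosen axis; the new axis of
    # size J comes out first, so move it back to position `mode`.
    out = np.tensordot(U, T, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

def reduce_tensor(T, mode_matrices):
    """Apply one matrix per mode, in order, to shrink every mode of T."""
    for mode, U in enumerate(mode_matrices):
        T = mode_product(T, U, mode)
    return T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = rng.normal(size=(8, 9, 10))        # one high-dimensional tensor sample
    mats = [rng.normal(size=(2, 8)),       # placeholder mode matrices,
            rng.normal(size=(3, 9)),       # not learned from data
            rng.normal(size=(4, 10))]
    print(reduce_tensor(T, mats).shape)
```

Because each mode is handled by its own small matrix, the number of parameters grows with the sum of the mode sizes rather than their product, which is the usual argument for working on tensors directly instead of vectorizing them.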

SCDRHA: A scRNA-Seq Data Dimensionality Reduction Algorithm Based on Hierarchical Autoencoder

Frontiers in Genetics ◽

10.3389/fgene.2021.733906 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jianping Zhao ◽

Na Wang ◽

Haiyun Wang ◽

Chunhou Zheng ◽

Yansen Su

Keyword(s):

Dimensionality Reduction ◽

Data Visualization ◽

State Of The Art ◽

Dimensional Space ◽

High Dimensional ◽

Reduction Algorithm ◽

Cell Clustering ◽

Data Dimensionality Reduction ◽

Single Cell Rna Sequencing ◽

Low Dimensional

Dimensionality reduction of high-dimensional data is crucial for single-cell RNA sequencing (scRNA-seq) visualization and clustering. One prominent challenge in scRNA-seq studies comes from the dropout events, which lead to zero-inflated data. To address this issue, in this paper, we propose a scRNA-seq data dimensionality reduction algorithm based on a hierarchical autoencoder, termed SCDRHA. The proposed SCDRHA consists of two core modules, where the first module is a deep count autoencoder (DCA) that is used to denoise data, and the second module is a graph autoencoder that projects the data into a low-dimensional space. Experimental results demonstrate that SCDRHA has better performance than existing state-of-the-art algorithms on dimension reduction and noise reduction in five real scRNA-seq datasets. Besides, SCDRHA can also dramatically improve the performance of data visualization and cell clustering.

Dimensionality Reduction of Tensors Based on Local Homeomorphism and Global Subspace Projection Distance Minimum

IEEE Access ◽

10.1109/access.2020.2997997 ◽

2020 ◽

Vol 8 ◽

pp. 116064-116077

Author(s):

Guokai Zhang ◽

Zhengming Ma ◽

Haidong Huang

Keyword(s):

Dimensionality Reduction ◽

Subspace Projection ◽

Local Homeomorphism ◽

Projection Distance

Dimensionality reduction based on multi-local linear regression and global subspace projection distance minimum

Pattern Analysis and Applications ◽

10.1007/s10044-021-01022-7 ◽

2021 ◽

Author(s):

Haidong Huang ◽

Zhengming Ma ◽

Guokai Zhang ◽

Huibin Wu

Keyword(s):

Linear Regression ◽

Dimensionality Reduction ◽

Local Linear Regression ◽

Subspace Projection ◽

Local Linear ◽

Projection Distance

RHDSI: A Novel Dimensionality Reduction Based Algorithm on High Dimensional Feature Selection with Interactions

Information Sciences ◽

10.1016/j.ins.2021.06.096 ◽

2021 ◽

Author(s):

Rahi Jain ◽

Wei Xu

Keyword(s):

Feature Selection ◽

Dimensionality Reduction ◽

High Dimensional

A generalization of t-SNE and UMAP to single-cell multimodal omics

Genome Biology ◽

10.1186/s13059-021-02356-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Van Hoan Do ◽

Stefan Canzar

Keyword(s):

Dimensionality Reduction ◽

Single Cell ◽

Cell Types ◽

High Dimensional ◽

Omics Data ◽

Relative Contribution ◽

Reduction Techniques ◽

Dimensionality Reduction Techniques ◽

Concise Representation ◽

Cellular Identity

Emerging single-cell technologies profile multiple types of molecules within individual cells. A fundamental step in the analysis of the produced high-dimensional data is their visualization using dimensionality reduction techniques such as t-SNE and UMAP. We introduce j-SNE and j-UMAP as their natural generalizations to the joint visualization of multimodal omics data. Our approach automatically learns the relative contribution of each modality to a concise representation of cellular identity that promotes discriminative features but suppresses noise. On eight datasets, j-SNE and j-UMAP produce unified embeddings that better agree with known cell types and that harmonize RNA and protein velocity landscapes.

Parallel Framework for Dimensionality Reduction of Large-Scale Datasets

Scientific Programming ◽

10.1155/2015/180214 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12 ◽

Cited By ~ 3

Author(s):

Sai Kiranmayee Samudrala ◽

Jaroslaw Zola ◽

Srinivas Aluru ◽

Baskar Ganapathysubramanian

Keyword(s):

Dimensionality Reduction ◽

Organic Solar Cells ◽

Large Scale ◽

Parallel Implementation ◽

High Dimensional Data ◽

Real Life ◽

Processing Parameters ◽

High Dimensional ◽

Morphology Evolution ◽

Reduction Techniques

Dimensionality reduction refers to a set of mathematical techniques used to reduce the complexity of the original high-dimensional data while preserving selected properties of the data. Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data. However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions. In this paper, we propose a parallel framework for dimensionality reduction of large-scale data. We identify key components underlying the spectral dimensionality reduction techniques and propose their efficient parallel implementation. We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods. To further demonstrate the applicability of our framework, we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.
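Spectral dimensionality reduction techniques generally share two steps: build a matrix from the data, then take its leading eigenvectors. As a rough serial illustration of those two steps, the sketch below uses PCA, the simplest spectral method. It is not the parallel framework described in the abstract, the function name `pca_embed` is hypothetical, and forming the full covariance matrix as done here is only practical for modest dimensions.

```python
import numpy as np

def pca_embed(X, k):
    """Project the rows of X (n x d) onto the top-k principal directions."""
    Xc = X - X.mean(axis=0)                 # step 0: center the data
    C = Xc.T @ Xc / (X.shape[0] - 1)        # step 1: build the d x d covariance matrix
    evals, evecs = np.linalg.eigh(C)        # step 2: eigendecomposition (ascending order)
    order = np.argsort(evals)[::-1][:k]     # indices of the k largest eigenvalues
    W = evecs[:, order]                     # d x k orthonormal basis
    return Xc @ W, W                        # low-dimensional coordinates and basis

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(500, 2))                  # hidden 2-D structure
    X = latent @ rng.normal(size=(2, 20))               # embedded in 20 dimensions
    X += 0.01 * rng.normal(size=X.shape)                # small noise
    Y, W = pca_embed(X, 2)
    print(Y.shape, W.shape)
```

In this sketch the two costly operations are the matrix construction and the eigendecomposition, which are the kinds of components a large-scale framework would need to distribute across cores.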

Dimensionality Reduction for Tensor Data Based on Local Decision Margin Maximization

IEEE Transactions on Image Processing ◽

10.1109/tip.2020.3034498 ◽

2021 ◽

Vol 30 ◽

pp. 234-248

Author(s):

Shujie Zhang ◽

Zhengming Ma ◽

Weichao Gan

Keyword(s):

Dimensionality Reduction ◽

Local Decision ◽

Margin Maximization ◽

Tensor Data

Multi-Instance Dimensionality Reduction via Sparsity and Orthogonality

Neural Computation ◽

10.1162/neco_a_01140 ◽

2018 ◽

Vol 30 (12) ◽

pp. 3281-3308

Author(s):

Hong Zhu ◽

Li-Zhi Liao ◽

Michael K. Ng

Keyword(s):

Dimensionality Reduction ◽

Optimization Problem ◽

Augmented Lagrangian ◽

Main Idea ◽

Real Data ◽

Learning Performance ◽

High Dimensional ◽

Data Sets ◽

Outer Loop ◽

Orthogonality Constraints

We study a multi-instance (MI) learning dimensionality-reduction algorithm through sparsity and orthogonality, which is especially useful for high-dimensional MI data sets. We develop a novel algorithm to handle both sparsity and orthogonality constraints that existing methods do not handle well simultaneously. Our main idea is to formulate an optimization problem where the sparse term appears in the objective function and the orthogonality term is formed as a constraint. The resulting optimization problem can be solved by using approximate augmented Lagrangian iterations as the outer loop and inertial proximal alternating linearized minimization (iPALM) iterations as the inner loop. The main advantage of this method is that both sparsity and orthogonality can be satisfied in the proposed algorithm. We show the global convergence of the proposed iterative algorithm. We also demonstrate that the proposed algorithm can achieve high sparsity and orthogonality requirements, which are very important for dimensionality reduction. Experimental results on both synthetic and real data sets show that the proposed algorithm can obtain learning performance comparable to that of other tested MI learning algorithms.
