A generalization of t-SNE and UMAP to single-cell multimodal omics

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Van Hoan Do ◽  
Stefan Canzar

Abstract: Emerging single-cell technologies profile multiple types of molecules within individual cells. A fundamental step in the analysis of the produced high-dimensional data is their visualization using dimensionality reduction techniques such as t-SNE and UMAP. We introduce j-SNE and j-UMAP as their natural generalizations to the joint visualization of multimodal omics data. Our approach automatically learns the relative contribution of each modality to a concise representation of cellular identity that promotes discriminative features but suppresses noise. On eight datasets, j-SNE and j-UMAP produce unified embeddings that better agree with known cell types and that harmonize RNA and protein velocity landscapes.
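This is not the authors' j-SNE/j-UMAP, which learns each modality's weight automatically; the sketch below only illustrates the fixed-weight baseline such methods generalize: z-score each modality, concatenate, and embed once with t-SNE. All data shapes are made up for illustration.

```python
# Fixed-weight joint-embedding baseline (NOT j-SNE itself): z-score each
# modality so neither dominates, concatenate, then run a single t-SNE.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
rna = rng.normal(size=(300, 50))      # hypothetical cells x genes matrix
protein = rng.normal(size=(300, 10))  # hypothetical cells x surface markers

def zscore(x):
    # per-feature standardization; epsilon guards constant columns
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)

joint = np.hstack([zscore(rna), zscore(protein)])
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(joint)
print(emb.shape)  # one 2-D embedding for both modalities
```

The weakness of this baseline, and the motivation for learning modality weights, is that z-scoring treats every feature equally regardless of how informative its modality is.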

2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Akram Vasighizaker ◽  
Saiteja Danda ◽  
Luis Rueda

Abstract: Identifying relevant disease modules such as target cell types is a significant step in studying diseases. High-throughput single-cell RNA-Seq (scRNA-seq) technologies have advanced in recent years, enabling researchers to investigate cells individually and understand their biological mechanisms. Computational techniques such as clustering are the most suitable approach in scRNA-seq data analysis when the cell types have not been well characterized. These techniques can be used to identify a group of genes that belong to a specific cell type based on their similar gene expression patterns. However, due to the sparsity and high dimensionality of scRNA-seq data, classical clustering methods are not efficient. Therefore, the use of non-linear dimensionality reduction techniques to improve clustering results is crucial. We introduce a method that identifies representative clusters of different cell types by combining non-linear dimensionality reduction techniques and clustering algorithms. We assess the impact of different dimensionality reduction techniques combined with clustering on thirteen publicly available scRNA-seq datasets of different tissues, sizes, and technologies. We further performed gene set enrichment analysis to evaluate the proposed method's performance. Our results show that modified locally linear embedding combined with independent component analysis yields the best overall performance relative to existing unsupervised methods across different datasets.
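A minimal sketch of the kind of pipeline the abstract describes: modified locally linear embedding combined with independent component analysis, followed by clustering. The abstract does not specify the exact composition or parameters, so the ordering (MLLE, then ICA, then k-means) and the toy digits dataset here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative MLLE + ICA + k-means pipeline on scikit-learn's digits data
# (standing in for an scRNA-seq expression matrix: samples x features).
from sklearn.datasets import load_digits
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.decomposition import FastICA
from sklearn.cluster import KMeans

X, y = load_digits(return_X_y=True)  # 1797 samples, 64 features

# non-linear reduction: modified LLE needs n_neighbors >= n_components
mlle = LocallyLinearEmbedding(n_neighbors=30, n_components=10,
                              method="modified", random_state=0)
Z = mlle.fit_transform(X)

# unmix the embedding into statistically independent components
S = FastICA(n_components=10, max_iter=1000, random_state=0).fit_transform(Z)

# cluster in the reduced space
labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(S)
print(Z.shape, len(set(labels)))
```

In the paper's setting the cluster labels would then feed gene set enrichment analysis to characterize each putative cell type.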


2017 ◽  
Vol 10 (13) ◽  
pp. 355 ◽  
Author(s):  
Reshma Remesh ◽  
Pattabiraman. V

Dimensionality reduction techniques are used to reduce the complexity of analyzing high-dimensional data sets. The raw input data set may have many dimensions, and analysis may be slow and lead to wrong predictions if unnecessary data attributes are considered. Using dimensionality reduction techniques, one can reduce the dimensions of the input data toward accurate prediction at lower cost. In this paper, the different machine learning approaches used for dimensionality reduction, such as PCA, SVD, LDA, kernel principal component analysis, and artificial neural networks, have been studied.
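The techniques surveyed above (minus the neural approach) can be run side by side in a few lines with scikit-learn; the digits dataset below is only a stand-in for the kind of high-dimensional input the paper discusses.

```python
# Compare four of the surveyed reduction methods on the same data.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA, TruncatedSVD, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_digits(return_X_y=True)  # 1797 samples, 64 features

Xp = PCA(n_components=2).fit_transform(X)            # linear, variance-maximizing
Xs = TruncatedSVD(n_components=2).fit_transform(X)   # SVD without mean-centering
Xk = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)  # non-linear, kernel trick
Xl = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)  # supervised: uses labels

for name, Z in [("PCA", Xp), ("SVD", Xs), ("KPCA", Xk), ("LDA", Xl)]:
    print(name, Z.shape)
```

Note the split the paper's list implies: PCA, SVD, and kernel PCA are unsupervised, while LDA requires class labels to find its discriminative directions.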


2019 ◽  
Vol 8 (S3) ◽  
pp. 66-71
Author(s):  
T. Sudha ◽  
P. Nagendra Kumar

Data mining is one of the major areas of research, and clustering is one of its main functionalities. High dimensionality is one of the main issues in clustering, and dimensionality reduction can be used as a solution to this problem. The present work makes a comparative study of two dimensionality reduction techniques, t-distributed stochastic neighbour embedding (t-SNE) and probabilistic principal component analysis (PPCA), in the context of clustering. High-dimensional data have been reduced to low-dimensional data using each technique. Cluster analysis has been performed on the high-dimensional data as well as on the low-dimensional data sets obtained through t-SNE and PPCA, with varying numbers of clusters. Mean squared error, time, and space have been considered as parameters for comparison. The results show that the time taken to convert high-dimensional data into low-dimensional data using PPCA is higher than the time taken using t-SNE, while the storage space required by the data set reduced through PPCA is less than that required by the data set reduced through t-SNE.
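The study's comparison can be sketched as follows. Plain scikit-learn PCA stands in for PPCA here (its model follows Tipping and Bishop's probabilistic PCA), the dataset is a toy stand-in, and the measured times on this toy data need not reproduce the paper's findings.

```python
# Time two reduction methods, then score k-means MSE in each reduced space.
import time
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

X, _ = load_digits(return_X_y=True)
X = X[:500]  # subsample to keep t-SNE quick

t0 = time.perf_counter()
Z_ppca = PCA(n_components=2).fit_transform(X)
t_ppca = time.perf_counter() - t0

t0 = time.perf_counter()
Z_tsne = TSNE(n_components=2, random_state=0).fit_transform(X)
t_tsne = time.perf_counter() - t0

for name, Z, t in [("PPCA", Z_ppca, t_ppca), ("t-SNE", Z_tsne, t_tsne)]:
    km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(Z)
    mse = km.inertia_ / len(Z)  # mean squared distance to assigned centers
    print(f"{name}: {t:.3f}s, k-means MSE {mse:.2f}")
```

Space, the study's third parameter, would be compared via the memory footprint of each reduced data set (both are n-by-2 here, so the difference comes from the methods' intermediate storage).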


2009 ◽  
Vol 6 (2) ◽  
pp. 217-227 ◽  
Author(s):  
Aswani Kumar

Domains such as text and images contain large amounts of redundancy and ambiguity among their attributes, which result in considerable noise effects (i.e., the data is high-dimensional). Retrieving data from high-dimensional datasets is a big challenge. Dimensionality reduction techniques have been a successful avenue for automatically extracting latent concepts by removing the noise and reducing the complexity of processing high-dimensional data. In this paper we conduct a systematic study comparing unsupervised dimensionality reduction techniques for the text retrieval task. We analyze these techniques in terms of complexity, approximation error, and retrieval quality, with experiments on four test document collections.
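One classic member of this family, latent semantic analysis (truncated SVD on a TF-IDF matrix), illustrates the latent-concept retrieval idea; the tiny corpus and query below are made up for illustration, not from the paper's collections.

```python
# LSA-style retrieval: project documents and query into a latent-concept
# space, then rank documents by cosine similarity to the query.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "dimensionality reduction for text retrieval",
    "noise and redundancy in high dimensional data",
    "cooking recipes for pasta and sauce",
    "latent concepts extracted from documents",
]
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)                  # sparse term-document matrix

svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)                       # documents in concept space

q = svd.transform(tfidf.transform(["reducing noise in high dimensional text"]))
scores = cosine_similarity(q, Z)[0]            # one score per document
print(scores.round(2))
```

The approximation error the paper measures corresponds to how much of the TF-IDF matrix's variance the truncated SVD discards at a given rank.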


Author(s):  
Xuan Huang ◽  
Lei Wu ◽  
Yinsong Ye

High-dimensional data is ubiquitous in scientific research and industrial production. It carries a great deal of information, but at the same time its sparsity and redundancy pose great challenges to data mining and pattern recognition. Dimensionality reduction can reduce redundancy and noise, lower the complexity of learning algorithms, and improve classification accuracy; it is a key step in a pattern recognition system. In this paper, we give an overview of the classical techniques for dimensionality reduction, review their properties, and categorize them according to their implementation process. We derive each algorithm in detail and intuitively show its underlying mathematical principles, with a focus on uncovering the optimization process of each technique. We compare the characteristics and limitations of each technique, summarize its scope of application, and discuss a number of open problems and perspectives on future research trends.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Xiaoxiao Sun ◽  
Yiwen Liu ◽  
Lingling An

Abstract: Single-cell RNA sequencing (scRNA-seq) technologies allow researchers to uncover the biological states of a single cell at high resolution. For computational efficiency and easy visualization, dimensionality reduction is necessary to capture gene expression patterns in low-dimensional space. Here we propose an ensemble method for simultaneous dimensionality reduction and feature gene extraction (EDGE) of scRNA-seq data. Different from existing dimensionality reduction techniques, the proposed method implements an ensemble learning scheme that utilizes massive weak learners for an accurate similarity search. Based on the similarity matrix constructed by those weak learners, the low-dimensional embedding of the data is estimated and optimized through spectral embedding and stochastic gradient descent. Comprehensive simulation and empirical studies show that EDGE is well suited for searching for meaningful organization of cells, detecting rare cell types, and identifying essential feature genes associated with certain cell types.
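The ensemble idea can be sketched as follows. This is not the authors' EDGE implementation: the weak learner used here (a coarse k-means on a random feature subset), the co-assignment affinity, and all parameters are illustrative assumptions, and the final SGD optimization step is omitted.

```python
# Ensemble similarity sketch: many weak learners vote on which cells belong
# together; the averaged co-assignment matrix feeds spectral embedding.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)
X, _ = make_blobs(n_samples=200, n_features=50, centers=4, random_state=0)

n_learners = 30
affinity = np.zeros((len(X), len(X)))
for seed in range(n_learners):
    # weak learner: look at a few random features, cluster coarsely
    feats = rng.choice(X.shape[1], size=5, replace=False)
    labels = KMeans(n_clusters=8, n_init=1, random_state=seed).fit_predict(X[:, feats])
    affinity += labels[:, None] == labels[None, :]  # co-assignment vote
affinity /= n_learners  # fraction of learners agreeing, in [0, 1]

Z = SpectralEmbedding(n_components=2, affinity="precomputed").fit_transform(affinity)
print(Z.shape)
```

Because each learner sees only a random feature subset, features that consistently drive co-assignment across many learners are natural candidates for the "essential feature genes" the abstract mentions.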

