supervised dimension reduction
Recently Published Documents

Total documents: 47 (five years: 15)
H-index: 8 (five years: 1)

2021
Author(s): Tara Chari, Joeyta Banerjee, Lior Pachter

Dimensionality reduction is standard practice for filtering noise and identifying relevant dimensions in large-scale data analyses. In biology, single-cell expression studies almost always begin with reduction to two or three dimensions to produce 'all-in-one' visuals of the data that are amenable to the human eye, and these are subsequently used for qualitative and quantitative analysis of cell relationships. However, there is little theoretical support for this practice. We examine the theoretical and practical implications of low-dimensional embedding of single-cell data, and find extensive distortions incurred on the global and local properties of biological patterns relative to the high-dimensional, ambient space. In lieu of this, we propose semi-supervised dimension reduction to higher dimension, and show that such targeted reduction guided by the metadata associated with single-cell experiments provides useful latent space representations for hypothesis-driven biological discovery.
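The targeted reduction the authors describe is guided by labels, not variance alone. As a minimal numpy sketch of that idea (the data, cell-type labels, and chosen dimension below are illustrative, not from the paper), a Fisher-discriminant projection uses label metadata to pick separating directions while keeping more than two latent dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "expression" matrix: 300 cells x 50 genes, 5 labeled cell types.
n_per, n_types, n_genes = 60, 5, 50
labels = np.repeat(np.arange(n_types), n_per)
centers = rng.normal(0, 3, size=(n_types, n_genes))
X = centers[labels] + rng.normal(size=(n_per * n_types, n_genes))

# Supervised linear reduction (Fisher discriminant): project onto the
# directions that separate the labeled groups, keeping d > 2 dimensions.
mu = X.mean(axis=0)
Sw = np.zeros((n_genes, n_genes))   # within-class scatter
Sb = np.zeros((n_genes, n_genes))   # between-class scatter
for c in range(n_types):
    Xc = X[labels == c]
    mc = Xc.mean(axis=0)
    Sw += (Xc - mc).T @ (Xc - mc)
    Sb += len(Xc) * np.outer(mc - mu, mc - mu)

# Leading eigenvectors of Sw^{-1} Sb (small ridge keeps Sw invertible).
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(n_genes), Sb))
order = np.argsort(eigvals.real)[::-1]
W = eigvecs[:, order[:4]].real      # keep 4 latent dimensions, not 2
Z = X @ W
print(Z.shape)  # (300, 4)
```

The point of the sketch is the shape of the output: a label-guided latent space with d greater than 2, rather than an all-in-one 2-D picture.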


2021
Vol 23 (1)
Author(s): Marianne A. Messelink, Nadia M. T. Roodenrijs, Bram van Es, Cornelia A. R. Hulsbergen-Veelken, Sebastiaan Jong, ...

Abstract Background The new concept of difficult-to-treat rheumatoid arthritis (D2T RA) refers to RA patients who remain symptomatic after several lines of treatment, resulting in a high patient and economic burden. During a hackathon, we aimed to identify and predict D2T RA patients in structured and unstructured routine care data. Methods Routine care data of 1873 RA patients were extracted from the Utrecht Patient Oriented Database. Data from a previous cross-sectional study, in which 152 RA patients were clinically classified as either D2T or non-D2T, served as a validation set. Machine learning techniques, text mining, and feature importance analyses were performed to identify and predict D2T RA patients based on structured and unstructured routine care data. Results We identified 123 potentially new D2T RA patients by applying the D2T RA definition to structured and unstructured routine care data. Additionally, we developed a D2T RA identification model derived from a feature importance analysis of all available structured data (AUC-ROC 0.88 (95% CI 0.82–0.94)), and we demonstrated the potential of longitudinal hematological data to differentiate D2T from non-D2T RA patients using supervised dimension reduction. Lastly, using data up to the time of starting the first biological treatment, we predicted future development of D2T RA (AUC-ROC 0.73 (95% CI 0.71–0.75)). Conclusions During this hackathon, we demonstrated the potential of different techniques for the identification and prediction of D2T RA patients in structured as well as unstructured routine care data. The results are promising and should be optimized and validated in future research.
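Both models above are reported via AUC-ROC. As a small self-contained sketch (the labels and scores below are simulated stand-ins, not the study's data or model), the AUC can be computed directly from ranks via the Mann-Whitney formulation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the task: y = 1 marks a (hypothetical) D2T patient,
# score = a classifier's output; here a noisy but informative signal.
y = rng.integers(0, 2, size=500)
score = y * 1.0 + rng.normal(0, 1.2, size=500)

# AUC-ROC via ranks (Mann-Whitney U): the probability that a randomly
# chosen positive case outranks a randomly chosen negative case.
order = np.argsort(score)
ranks = np.empty(len(score), dtype=float)
ranks[order] = np.arange(1, len(score) + 1)
n_pos = y.sum()
n_neg = len(y) - n_pos
auc = (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
print(round(auc, 2))
```

An AUC near 0.5 would mean the score is uninformative; values like the study's 0.88 and 0.73 indicate progressively weaker but still useful discrimination.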


Biometrika
2021
Author(s): Junlong Zhao, Xiumin Liu, Hansheng Wang, Chenlei Leng

Summary A problem of major interest in network data analysis is to explain the strength of connections using context information. To achieve this, we introduce a novel approach named network-supervised dimension reduction, which projects covariates onto low-dimensional spaces to reveal the linkage pattern, without assuming a model. We propose a new loss function for estimating the parameters of the resulting linear projection, based on the notion that closer proximity in the low-dimensional projection corresponds to stronger connections. Interestingly, the convergence rate of our estimator is shown to depend on a network effect factor, the smallest number of groups that can partition a graph in a way similar to the graph colouring problem. Our methodology has interesting connections to principal component analysis and linear discriminant analysis, which we exploit for clustering and community detection. The methodology developed is further illustrated by numerical experiments and the analysis of a pulsar candidate data set in astronomy.
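One concrete (and deliberately simplified) instance of "closer projections correspond to stronger connections" is a Locality-Preserving-Projection-style generalized eigenproblem: choose the linear map of the covariates so that connected nodes land near each other. This numpy sketch is an illustration of the idea, not the paper's estimator or loss; the network and covariates are simulated:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy network: 100 nodes with 5-dim covariates; nodes connect more often
# when their (hidden) first two covariates are close. Illustrative only.
n, p, k = 100, 5, 2
X = rng.normal(size=(n, p))
d2 = ((X[:, None, :2] - X[None, :, :2]) ** 2).sum(-1)
A = (rng.random((n, n)) < np.exp(-d2)).astype(float)
A = np.triu(A, 1)
A = A + A.T                                 # symmetric adjacency, no loops

# Pick projection B so connected pairs are close after projection:
# minimize sum_ij A_ij ||B'x_i - B'x_j||^2, i.e. a Laplacian quadratic
# form over the projected covariates, with a degree-based normalization.
D = np.diag(A.sum(1))
L = D - A                                   # graph Laplacian
M1 = X.T @ L @ X
M2 = X.T @ D @ X + 1e-6 * np.eye(p)         # small ridge for stability
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(M2, M1))
order = np.argsort(eigvals.real)
B = eigvecs[:, order[:k]].real              # p x k projection matrix
Z = X @ B
print(Z.shape)  # (100, 2)
```

The smallest-eigenvalue directions are the ones along which edges pull projected nodes together, which is the geometric intuition behind the paper's supervision-by-network idea.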


Author(s): Yichen Cheng, Xinlei Wang, Yusen Xia

We propose a novel supervised dimension-reduction method called supervised t-distributed stochastic neighbor embedding (St-SNE) that achieves dimension reduction by preserving the similarities of data points in both feature and outcome spaces. The proposed method can be used for both prediction and visualization tasks with the ability to handle high-dimensional data. We show through a variety of data sets that when compared with a comprehensive list of existing methods, St-SNE has superior prediction performance in the ultrahigh-dimensional setting in which the number of features p exceeds the sample size n and has competitive performance in the p ≤ n setting. We also show that St-SNE is a competitive visualization tool that is capable of capturing within-cluster variations. In addition, we propose a penalized Kullback–Leibler divergence criterion to automatically select the reduced-dimension size k for St-SNE.

Summary of Contribution: With the fast development of data collection and data processing technologies, high-dimensional data have now become ubiquitous. Examples of such data include those collected from environmental sensors, personal mobile devices, and wearable electronics. High-dimensionality poses great challenges for data analytics routines, both methodologically and computationally. Many machine learning algorithms may fail to work for ultrahigh-dimensional data, where the number of the features p is (much) larger than the sample size n. We propose a novel method for dimension reduction that can (i) aid the understanding of high-dimensional data through visualization and (ii) create a small set of good predictors, which is especially useful for prediction using ultrahigh-dimensional data.
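The core move in St-SNE is to build pairwise similarities in the feature space and in the outcome space and blend them before embedding. This numpy sketch shows only that similarity-construction step under assumed choices (Gaussian kernels, equal blending weights, simulated data); the actual kernels, weights, and the downstream t-SNE optimization follow the paper, not this sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

# Ultrahigh-dimensional toy setting: p >> n, with a continuous outcome y.
n, p = 50, 200
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

def gauss_sim(D2, sigma2):
    """Gaussian similarities from squared distances, normalized to sum 1."""
    P = np.exp(-D2 / (2 * sigma2))
    np.fill_diagonal(P, 0.0)          # no self-similarity
    return P / P.sum()

# Pairwise squared distances in feature space and in outcome space.
Dx = ((X[:, None] - X[None, :]) ** 2).sum(-1)
Dy = (y[:, None] - y[None, :]) ** 2

# Blend the two similarity matrices (equal weights assumed here);
# a t-SNE-style embedding would then match low-dimensional similarities
# to this joint P via a KL-divergence objective.
P = 0.5 * gauss_sim(Dx, Dx.mean()) + 0.5 * gauss_sim(Dy, Dy.mean())
print(P.shape, np.isclose(P.sum(), 1.0))
```

Because the outcome enters P directly, two points with similar outcomes attract each other in the embedding even when their features alone would not place them together, which is what makes the reduction supervised.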


2019
Vol 13 (4)
pp. 334-347
Author(s): Liyan Zhao, Huan Wang, Jing Wang

Background: Subspace-learning-based dimensionality reduction algorithms are important and widely applied in data mining, pattern recognition, and computer vision. They reduce dimension successfully when data points are evenly distributed in the high-dimensional space; however, some distort the local geometric structure of the original dataset and produce a poor low-dimensional embedding when the samples are unevenly distributed in the original space. Methods: In this paper, we propose a supervised dimension reduction method based on local neighborhood optimization to handle unevenly distributed high-dimensional data. It extends the widely used Locally Linear Embedding (LLE) framework and is named LNOLLE. The method uses the class labels of the data to optimize each local neighborhood, achieving better inter-class separation in the low-dimensional space by avoiding the mixing of samples from different classes when mapping unevenly distributed data. This effectively preserves the geometric and topological structure of the original data points. Results: We apply the LNOLLE method to image classification and face recognition, where it achieves good classification results and higher face recognition accuracy than existing manifold learning methods, including popular supervised algorithms. In addition, we use the reconstruction step of the method for noise suppression in seismic images. To the best of our knowledge, this is the first manifold learning approach applied to noise suppression in high-dimensional nonlinear seismic data. Conclusion: Experimental results on a forward model and real seismic data show that LNOLLE improves the signal-to-noise ratio of seismic images compared with the widely used Singular Value Decomposition (SVD) filtering method.
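The label-aware neighborhood step can be sketched concretely: restrict each point's k nearest neighbors to its own class, then solve the standard LLE reconstruction weights on that neighborhood. This numpy sketch illustrates that step only (data, class structure, and regularization constant are assumed for the example, and the subsequent embedding step of LLE is omitted):

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy labeled data: 3 classes of 40 points in 10 dimensions.
n_per, n_cls, p, k = 40, 3, 10, 5
y = np.repeat(np.arange(n_cls), n_per)
X = rng.normal(size=(n_per * n_cls, p)) + 4 * y[:, None]

def lle_weights_same_class(X, y, k):
    """LLE reconstruction weights with neighbors restricted by class label."""
    n = len(X)
    W = np.zeros((n, n))
    for i in range(n):
        # Candidate neighbors: same class only, excluding the point itself.
        same = np.where((y == y[i]) & (np.arange(n) != i))[0]
        d = np.linalg.norm(X[same] - X[i], axis=1)
        nb = same[np.argsort(d)[:k]]
        # Solve for weights reconstructing x_i from its neighbors.
        G = (X[nb] - X[i]) @ (X[nb] - X[i]).T   # local Gram matrix
        G += 1e-3 * np.trace(G) * np.eye(k)     # regularize for stability
        w = np.linalg.solve(G, np.ones(k))
        W[i, nb] = w / w.sum()                  # weights sum to 1
    return W

W = lle_weights_same_class(X, y, k)
print(np.allclose(W.sum(1), 1.0))  # True: each row reconstructs its point
```

Because every nonzero weight links a point only to same-class neighbors, points from different classes cannot be "held together" by the reconstruction, which is the separability property the abstract describes.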


2019
Vol 11 (24)
pp. 2892
Author(s): Ying-Nong Chen

In this study, a novel multiple kernel FLE (MKFLE) based on general nearest feature line embedding (FLE) transformation is proposed and applied to hyperspectral image (HSI) classification, taking advantage of multiple kernel learning. FLE has successfully shown its discriminative capability in many applications. However, since the conventional linear principal component analysis (PCA) pre-processing step in FLE cannot effectively extract nonlinear information, a multiple kernel PCA (MKPCA) based on the proposed multiple kernel method is introduced to alleviate this problem. The proposed MKFLE dimension reduction framework proceeds in two stages. In the first (MKPCA) stage, a multiple kernel learning method based on between-class distance and a support vector machine (SVM) is used to find the kernel weights. Based on these weights, a new weighted kernel function is constructed as a linear combination of valid kernels. In the second (FLE) stage, the FLE method, which preserves the nonlinear manifold structure, performs supervised dimension reduction using the kernel obtained in the first stage. The effectiveness of the proposed MKFLE algorithm is assessed by comparison with various previous state-of-the-art works on three benchmark data sets. The experimental results show that MKFLE outperforms the other methods, achieving accuracies of 83.58%, 91.61%, and 97.68% on the Indian Pines, Pavia University, and Pavia City datasets, respectively.
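The first stage above combines kernels linearly and then applies kernel PCA. As a minimal numpy sketch of that combined-kernel step (the two kernels and the fixed weights below are assumed for illustration; the paper learns the weights from between-class distance and an SVM criterion, and the data here is simulated):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy stand-in for spectral feature vectors: 80 samples, 12 bands.
n, p = 80, 12
X = rng.normal(size=(n, p))

# Two valid kernels combined with (assumed, not learned) weights.
D2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
K_rbf = np.exp(-D2 / (2 * D2.mean()))       # RBF kernel
K_lin = X @ X.T                             # linear kernel
w = np.array([0.7, 0.3])
K = w[0] * K_rbf + w[1] * K_lin             # weighted combined kernel

# Kernel PCA on the combined kernel: double-centre, then take the
# leading eigenvectors scaled by the square roots of the eigenvalues.
H = np.eye(n) - np.ones((n, n)) / n         # centring matrix
Kc = H @ K @ H
eigvals, eigvecs = np.linalg.eigh(Kc)
top = eigvals[::-1][:3]
Z = eigvecs[:, ::-1][:, :3] * np.sqrt(np.maximum(top, 0))
print(Z.shape)  # (80, 3)
```

A convex combination of positive semi-definite kernels is itself a valid kernel, which is why the weighted sum can be fed directly into the centring-and-eigendecomposition routine of kernel PCA before the FLE stage.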

