Semi-supervised Multi-label Dimensionality Reduction via Low Rank Representation

Author(s):  
Yezi Liu


2016 ◽  
Vol 2016 ◽  
pp. 1-9 ◽  
Author(s):  
Baokai Zu ◽  
Kewen Xia ◽  
Shuidong Dai ◽  
Nelofar Aslam

Semisupervised Discriminant Analysis (SDA) aims at dimensionality reduction with both limited labeled data and copious unlabeled data, but it may fail to discover the intrinsic geometric structure when the data manifold is highly nonlinear. The kernel trick is widely used to map the original nonlinearly separable problem into a higher-dimensional space in which the classes become linearly separable. Inspired by low-rank representation (LRR), we propose a novel kernel SDA method called low-rank kernel-based SDA (LRKSDA), in which the LRR is used as the kernel representation. Since LRR captures the global data structure and yields the lowest-rank representation in a parameter-free way, the low-rank kernel method is highly effective and robust for various kinds of data. Extensive experiments on public databases show that the proposed LRKSDA dimensionality reduction algorithm achieves better performance than other related kernel SDA methods.
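For the noiseless LRR model min ||Z||_* s.t. X = XZ, the minimizer has the known closed form Z* = VVᵀ, where V holds the right singular vectors of X; this is what makes the representation parameter-free. Below is a minimal sketch of computing such a low-rank kernel matrix, assuming that closed form; the helper name lrr_kernel is hypothetical, and the paper's full kernel SDA pipeline is not reproduced here.

```python
import numpy as np

def lrr_kernel(X, tol=1e-10):
    """Parameter-free low-rank representation for the noiseless model
    min ||Z||_*  s.t.  X = X Z,
    whose minimizer is the shape interaction matrix Z* = V V^T,
    with V the right singular vectors of X (skinny SVD).
    X: (d, n) data matrix with samples as columns."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = np.sum(s > tol * s[0])        # numerical rank of X
    V = Vt[:r].T                      # (n, r) right singular vectors
    return V @ V.T                    # (n, n) low-rank "kernel" matrix

# usage: treat K as the kernel/affinity matrix fed to a kernel SDA solver
X = np.random.randn(50, 200)          # 50-dim features, 200 samples
K = lrr_kernel(X)
assert np.allclose(K, K.T)            # symmetric by construction
```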


2019 ◽  
Vol 11 (12) ◽  
pp. 1485 ◽  
Author(s):  
Jinliang An ◽  
Jinhui Lei ◽  
Yuzhen Song ◽  
Xiangrong Zhang ◽  
Jinmei Guo

Dimensionality reduction is an essential issue in hyperspectral image processing. With the advantages of preserving spatial neighborhood information and global structure information, tensor analysis and low-rank representation have been widely studied in this field and have yielded satisfactory performance. In existing tensor- and low-rank-based methods, however, how to construct appropriate tensor samples and how to determine the optimal rank of a hyperspectral image along each mode remain challenging issues. To address these drawbacks, this paper proposes an unsupervised tensor-based multiscale low-rank decomposition (T-MLRD) method for hyperspectral image dimensionality reduction. By regarding the raw hyperspectral cube as the only tensor sample, T-MLRD needs no labeled samples and avoids constructing tensor samples altogether. In addition, a novel multiscale low-rank estimation scheme is proposed to obtain the optimal rank along each mode of the hyperspectral image, avoiding complicated rank computation. Finally, the multiscale low-rank feature representations are fused to achieve dimensionality reduction. Experimental results on real hyperspectral datasets demonstrate the superiority of the proposed method over several state-of-the-art approaches.
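As an illustration of mode-wise low-rank decomposition of a hyperspectral cube, here is a truncated HOSVD (Tucker-style) sketch that treats the whole cube as a single third-order tensor. It is a simplified stand-in for T-MLRD: the per-mode ranks are fixed inputs here, whereas T-MLRD estimates them via its multiscale scheme.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a 3-way tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_truncate(T, ranks):
    """Truncated HOSVD: a mode-wise low-rank (Tucker) approximation.
    T: (H, W, B) hyperspectral cube; ranks: (r1, r2, r3) per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])          # leading mode-n singular vectors
    G = T
    for U in factors:
        # contract the current leading mode with its factor; the
        # contracted mode rotates to the end, so modes cycle correctly
        G = np.tensordot(G, U, axes=(0, 0))
    return G, factors                     # G: (r1, r2, r3) reduced core

cube = np.random.rand(64, 64, 100)        # toy HSI: 64x64 pixels, 100 bands
core, facs = hosvd_truncate(cube, (16, 16, 10))
print(core.shape)                          # (16, 16, 10)
```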


2020 ◽  
Vol 10 ◽  
Author(s):  
Conghai Lu ◽  
Juan Wang ◽  
Jinxing Liu ◽  
Chunhou Zheng ◽  
Xiangzhen Kong ◽  
...  

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Joshua T. Vogelstein ◽  
Eric W. Bridgeford ◽  
Minh Tang ◽  
Da Zheng ◽  
Christopher Douville ◽  
...  

To solve key biomedical problems, experimentalists now routinely measure millions or billions of features (dimensions) per sample, with the hope that data science techniques will be able to build accurate data-driven inferences. Because sample sizes are typically orders of magnitude smaller than the dimensionality of these data, valid inferences require finding a low-dimensional representation that preserves the discriminating information (e.g., whether the individual suffers from a particular disease). There is a lack of interpretable supervised dimensionality reduction methods that scale to millions of dimensions with strong statistical theoretical guarantees. We introduce an approach that extends principal components analysis by incorporating class-conditional moment estimates into the low-dimensional projection. The simplest version, Linear Optimal Low-Rank Projection (LOL), incorporates the class-conditional means. We prove, and substantiate with both synthetic and real data benchmarks, that LOL and its generalizations lead to improved data representations for subsequent classification, while maintaining computational efficiency and scalability. Using multiple brain imaging datasets consisting of more than 150 million features, and several genomics datasets with more than 500,000 features, LOL outperforms other scalable linear dimensionality reduction techniques in terms of accuracy, while requiring only a few minutes on a standard desktop computer.
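A minimal two-class sketch of the idea: augment the top principal directions of the within-class-centered data with the class-mean difference, then orthonormalize to obtain the projection. This follows the abstract's description of LOL at a high level only; the function name lol_project and the details (multi-class handling, scaling) are simplifying assumptions.

```python
import numpy as np

def lol_project(X, y, r):
    """LOL-style projection (two-class sketch).
    X: (n, d) samples; y: (n,) binary labels; r: target dimension.
    Combines the class-mean difference with the leading principal
    directions of the within-class-centered data, then orthonormalizes."""
    mu0, mu1 = X[y == 0].mean(0), X[y == 1].mean(0)
    delta = (mu1 - mu0)[:, None]            # (d, 1) mean-difference direction
    Xc = X.copy()
    Xc[y == 0] -= mu0                       # center each class at its own mean
    Xc[y == 1] -= mu1
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    A = np.hstack([delta, Vt[: r - 1].T])   # (d, r) candidate directions
    Q, _ = np.linalg.qr(A)                  # orthonormalize the projection
    return X @ Q                            # (n, r) embedded data

# usage on toy data: two Gaussian classes in 1000 dimensions
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1, (100, 1000)),
               rng.normal(0.5, 1, (100, 1000))])
y = np.r_[np.zeros(100), np.ones(100)].astype(int)
Z = lol_project(X, y, r=5)
print(Z.shape)                              # (200, 5)
```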


2018 ◽  
Vol 27 (07) ◽  
pp. 1860013 ◽  
Author(s):  
Swair Shah ◽  
Baokun He ◽  
Crystal Maung ◽  
Haim Schweitzer

Principal Component Analysis (PCA) is a classical dimensionality reduction technique that computes a low-rank representation of the data. Recent studies have shown how to compute this low-rank representation from most of the data, excluding a small amount of outlier data. We show how to convert this problem into a graph search and describe an algorithm that solves it optimally by applying a variant of the A* algorithm to search for the outliers. The results obtained by our algorithm are optimal in terms of accuracy and are more accurate than those of the current state-of-the-art algorithms, which are shown not to be optimal. This comes at the cost of running time, which is typically slower than the current state of the art. We also describe a related variant of the A* algorithm that runs much faster than the optimal variant and produces a solution that is guaranteed to be near-optimal. This variant is shown experimentally to be more accurate than the current state of the art, with comparable running time.
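The search formulation can be sketched as a best-first (A*-style) search over which rows to keep: because the rank-r reconstruction residual can only grow as rows are added, the residual of a partial keep-set is an admissible lower bound for any of its completions, so the first complete keep-set popped from the queue is optimal. This is an illustrative sketch under that assumption, not the authors' algorithm or their heuristic, and it is practical only for toy sizes.

```python
import heapq
import numpy as np

def pca_residual(X, r):
    """Rank-r PCA reconstruction error of the rows of X
    (sum of squared singular values beyond the top r)."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s[r:] ** 2))

def astar_outlier_pca(X, r, n_outliers):
    """Best-first search for the rows to KEEP. The residual of a partial
    keep-set lower-bounds every completion (adding rows never shrinks the
    residual), so the first full keep-set popped is optimal. Worst case
    exponential; suitable only for small toy problems."""
    n, keep_target = len(X), len(X) - n_outliers
    heap = [(0.0, ())]                  # (bound, kept indices, increasing)
    while heap:
        bound, kept = heapq.heappop(heap)
        if len(kept) == keep_target:
            outliers = sorted(set(range(n)) - set(kept))
            return outliers, bound
        start = kept[-1] + 1 if kept else 0
        # leave enough remaining indices to still reach keep_target
        for i in range(start, n - (keep_target - len(kept)) + 1):
            new = kept + (i,)
            heapq.heappush(heap, (pca_residual(X[list(new)], r), new))

# toy usage: 20 inlier points near a line plus 2 gross outliers
rng = np.random.default_rng(1)
t = rng.normal(size=(20, 1))
X = np.vstack([np.hstack([t, 2 * t]) + 0.01 * rng.normal(size=(20, 2)),
               [[8.0, -5.0], [-6.0, 7.0]]])
print(astar_outlier_pca(X, r=1, n_outliers=2))   # expect outliers [20, 21]
```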

