effective dimension reduction
Recently Published Documents


TOTAL DOCUMENTS

29
(FIVE YEARS 10)

H-INDEX

7
(FIVE YEARS 2)

2021 ◽  
Author(s):  
Lauren L Hsu ◽  
Aedin C Culhane

Effective dimension reduction is an essential step in the analysis of single cell RNA-seq (scRNAseq) count data, which are high-dimensional, sparse, and noisy. Principal component analysis (PCA) is widely used in analytical pipelines, and since PCA requires continuous data, it is often coupled with log-transformation in scRNAseq applications. However, log-transformation of scRNAseq counts distorts the data and can obscure meaningful variation. We describe correspondence analysis (CA) for dimension reduction of scRNAseq data, which is a performant alternative to PCA. Designed for use with counts, CA is based on decomposition of a chi-squared residual matrix and does not require log-transformation of scRNAseq counts. We extend beyond standard CA (decomposition of Pearson residuals computed on the contingency table) and propose variations of CA, including an alternative chi-squared statistic, that address overdispersion and high sparsity in scRNAseq data. The performance of five variations of CA and of standard CA is benchmarked on 10 datasets and compared to glmPCA. The CA variations are fast and scalable, and they outperform standard CA and glmPCA, computing embeddings with more performant or comparable clustering accuracy in 8 out of 9 datasets. Of the variations we considered, CA using the Freeman-Tukey chi-squared residual was the most performant overall in scRNAseq data. Our analyses also showed that variance stabilizing transformations applied in conjunction with standard CA (using Pearson residuals) and the use of power deflation smoothing both improve performance in downstream clustering tasks, as compared to standard CA alone. CA has advantages including visual illustration of associations between genes and cell populations in a 'CA biplot' and easy extension to multi-table analysis, enabling integrative dimension reduction.
We introduce corralm, a CA-based method for multi-table batch integration of scRNAseq data in shared latent space, and we propose a new approach for assessing batch integration. We implement CA for scRNAseq in the corral R/Bioconductor package (https://www.bioconductor.org/packages/corral) that interfaces directly with widely used single cell classes in Bioconductor, allowing for easy integration into scRNAseq pipelines.
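To make the idea of "decomposition of a chi-squared residual matrix" concrete, the following is a minimal NumPy sketch, not the corral implementation. It computes residuals on the count scale, using the textbook Pearson form (O - E) / sqrt(E) and the textbook Freeman-Tukey form sqrt(O) + sqrt(O + 1) - sqrt(4E + 1), and then takes a truncated SVD. The function names `chi2_residuals` and `ca_embed`, the toy Poisson table, and the choice to return singular vectors scaled by singular values are all assumptions made for illustration. The Pearson form assumes no row or column of the table sums to zero.

```python
import numpy as np

def chi2_residuals(counts, kind="pearson"):
    """Chi-squared residuals of a count table against the independence model."""
    X = np.asarray(counts, dtype=float)
    n = X.sum()
    row = X.sum(axis=1, keepdims=True)   # row totals
    col = X.sum(axis=0, keepdims=True)   # column totals
    E = row @ col / n                    # expected counts under independence
    if kind == "pearson":
        return (X - E) / np.sqrt(E)
    if kind == "freeman-tukey":
        return np.sqrt(X) + np.sqrt(X + 1.0) - np.sqrt(4.0 * E + 1.0)
    raise ValueError(f"unknown residual kind: {kind}")

def ca_embed(counts, n_components=2, kind="pearson"):
    """SVD of the residual matrix; returns row scores, column scores, singular values."""
    S = chi2_residuals(counts, kind)
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    k = n_components
    return U[:, :k] * s[:k], Vt[:k].T * s[:k], s[:k]

# Toy table (hypothetical data): 50 "genes" by 20 "cells"; +1 avoids empty rows.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(50, 20)) + 1
row_scores, col_scores, sing_vals = ca_embed(counts, n_components=2, kind="freeman-tukey")
print(row_scores.shape, col_scores.shape)
```

No log-transformation is applied at any point; the residual step plays the role of the normalization. Standard CA is usually stated on proportions rather than counts, which for the Pearson form rescales the residuals by a constant factor and leaves the singular vectors unchanged.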


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hamideh Soltani ◽  
Zahra Einalou ◽  
Mehrdad Dadgostar ◽  
Keivan Maghooli

Abstract Brain-computer interface (BCI) systems have been regarded as a new way of communication for humans. In this research, common methods such as the wavelet transform are applied to extract features. A genetic algorithm (GA), an evolutionary method, is then used to select features. Finally, classification was done using two approaches: support vector machine (SVM) and the Bayesian method. Five features were selected, and the accuracy of Bayesian classification was measured to be 80% with dimension reduction. Ultimately, the classification accuracy reached 90.4% using the SVM classifier. The results of the study indicate a better feature selection and an effective dimension reduction of these features, as well as a higher classification accuracy in comparison with other studies.


2021 ◽  
Vol 2021 ◽  
pp. 1-21
Author(s):  
Xin Wang ◽  
Guoqiang Wang

Band selection is a direct and effective dimension reduction method and is one of the hotspots in hyperspectral remote sensing research. However, most methods ignore the orderliness and correlation of the selected bands and construct band subsets only according to the number of clustering centers desired by band sequencing. To address this issue, this article proposes a band selection method based on adaptive neighborhood grouping and local structure correlation (ANG-LSC). An adaptive subspace method is adopted to segment hyperspectral image cubes in space to avoid obtaining highly correlated subsets. Then, the product of local density and distance factor is used to sort the bands and select the desired number of cluster centers. Finally, through information entropy and correlation analysis of bands in different clusters, the most representative bands are selected from each cluster. To evaluate the effectiveness of the proposed method, comparative experiments with state-of-the-art methods are conducted on three public hyperspectral datasets. Experimental results demonstrate the superiority and robustness of ANG-LSC.
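The ranking step described above (sorting bands by the product of local density and a distance factor) resembles the generic density-peak ranking, which can be sketched as follows. This is not ANG-LSC: the adaptive grouping, entropy, and correlation steps are omitted, and the Gaussian density kernel, the `cutoff_quantile` parameter, the function name `rank_bands`, and the random toy cube are assumptions chosen for illustration.

```python
import numpy as np

def rank_bands(cube, cutoff_quantile=0.2):
    """Rank bands of a (rows, cols, bands) cube by local density times distance."""
    bands = cube.reshape(-1, cube.shape[-1]).T.astype(float)  # (n_bands, n_pixels)
    n_bands = bands.shape[0]

    # Pairwise Euclidean distances between bands via the Gram matrix.
    sq = (bands ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * bands @ bands.T
    np.fill_diagonal(d2, 0.0)
    dist = np.sqrt(np.maximum(d2, 0.0))

    # Cutoff distance: a quantile of the off-diagonal distances (assumed heuristic).
    d_c = np.quantile(dist[np.triu_indices(n_bands, k=1)], cutoff_quantile)

    # Local density with a Gaussian kernel, excluding the band itself.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0

    # Distance factor: distance to the nearest band of higher density;
    # the densest band gets its largest distance instead.
    delta = np.empty(n_bands)
    for i in range(n_bands):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()

    score = (rho / rho.max()) * (delta / delta.max())
    return np.argsort(-score), score

# Toy cube (hypothetical data): 8 x 8 pixels, 12 bands.
rng = np.random.default_rng(0)
cube = rng.random((8, 8, 12))
order, score = rank_bands(cube)
print(order[:4])  # indices of the four highest-scoring candidate bands
```

Bands that are both dense (many similar neighbors) and far from any denser band score highest, which is why the product is a natural way to pick cluster centers.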


2020 ◽  
Author(s):  
Hamideh Soltani ◽  
Zahra Einalou ◽  
Keivan Maghooli

Abstract In recent years, brain-computer communication systems have been regarded as a new way of communication for humans. One of the applications of brain-computer communication is the development of systems which facilitate communication. To this end, it is necessary to extract the visually evoked signals from the EEG signal and classify them. In this research, common methods such as the wavelet transform are applied to extract features. A genetic algorithm, an evolutionary method, is then used to select features. Finally, after selecting features, the classification was done using two approaches: support vector machine and the Bayesian method. Five features were selected, and the accuracy of Bayesian classification was measured to be 80% with dimension reduction and 78% without dimension reduction. Ultimately, the classification accuracy reached 90.4% using the SVM classifier. The results of the study indicate a better feature selection and an effective dimension reduction of these features, as well as a higher classification accuracy in comparison with other studies.
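A genetic algorithm for feature selection can be sketched as a search over binary masks. The example below is an illustration under stated assumptions, not the authors' pipeline: the wavelet features are replaced by synthetic Gaussian features, the SVM and Bayesian classifiers are swapped for a simple nearest-centroid classifier, and the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the `penalty` per selected feature, and all parameter values are hypothetical choices.

```python
import numpy as np

def centroid_accuracy(X_tr, y_tr, X_va, y_va):
    """Accuracy of a nearest-centroid classifier (stand-in for SVM or Bayes)."""
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    d = ((X_va[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return float((classes[d.argmin(axis=1)] == y_va).mean())

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=30, n_gen=40,
              p_mut=0.05, penalty=0.01, seed=0):
    """Return a boolean feature mask found by a simple genetic algorithm."""
    rng = np.random.default_rng(seed)
    n_feat = X_tr.shape[1]

    def fit(mask):
        if not mask.any():
            return 0.0
        acc = centroid_accuracy(X_tr[:, mask], y_tr, X_va[:, mask], y_va)
        return acc - penalty * mask.sum()   # favor small feature subsets

    pop = rng.random((pop_size, n_feat)) < 0.5
    for _ in range(n_gen):
        scores = np.array([fit(m) for m in pop])
        elite = pop[scores.argmax()].copy()
        # Tournament selection: keep the better of two random individuals.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = pop[np.where(scores[a] >= scores[b], a, b)]
        # One-point crossover between consecutive parents.
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n_feat)
            children[i, cut:] = parents[i + 1, cut:]
            children[i + 1, cut:] = parents[i, cut:]
        # Bit-flip mutation, then elitism: the best mask always survives.
        children ^= rng.random(children.shape) < p_mut
        children[0] = elite
        pop = children
    scores = np.array([fit(m) for m in pop])
    return pop[scores.argmax()]

# Toy data (hypothetical): 12 features, only the first 3 carry class signal.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 12))
X[:, :3] += 1.5 * y[:, None]
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]

best = ga_select(X_tr, y_tr, X_va, y_va)
acc = centroid_accuracy(X_tr[:, best], y_tr, X_va[:, best], y_va)
print(np.flatnonzero(best), acc)
```

Because the fitness is measured on the same validation split that is reported, the printed accuracy is optimistic; a separate held-out test set would be needed for a fair estimate.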


2019 ◽  
Vol 17 (05) ◽  
pp. 715-736
Author(s):  
Ning Zhang ◽  
Zhou Yu ◽  
Qiang Wu

Sliced inverse regression (SIR) is a pioneering tool for supervised dimension reduction. It identifies the effective dimension reduction space, the subspace of significant factors with intrinsic lower dimensionality. In this paper, we propose to refine the SIR algorithm through an overlapping slicing scheme. The new algorithm, called overlapping SIR (OSIR), is able to estimate the effective dimension reduction space and determine the number of effective factors more accurately. We show that such an overlapping procedure has the potential to identify the information contained in the derivatives of the inverse regression curve, which helps to explain the superiority of OSIR. We also prove that the OSIR algorithm is [Formula: see text]-consistent and verify its effectiveness by simulations and real applications.
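For readers unfamiliar with the baseline, classical SIR can be sketched in a few lines: whiten the predictors, slice the sorted response, average the whitened predictors within each slice, and take the leading eigenvectors of the weighted covariance of the slice means. The sketch below implements only this classical, non-overlapping version; the overlapping slicing scheme of OSIR is not reproduced here. The function name `sir_directions`, the slice count, and the simulated single-index model are assumptions for illustration.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Classical sliced inverse regression with non-overlapping slices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Whiten the predictors: Z = (X - mean) @ Sigma^{-1/2}.
    mu = X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ inv_sqrt

    # Slice on the sorted response and accumulate weighted slice means.
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original scale.
    w, v = np.linalg.eigh(M)            # eigenvalues in ascending order
    top = v[:, ::-1][:, :n_dirs]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0), w[::-1][:n_dirs]

# Simulated single-index model (hypothetical): y depends on X only through X @ beta.
rng = np.random.default_rng(0)
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
X = rng.normal(size=(2000, 5))
y = (X @ beta) ** 3 + 0.1 * rng.normal(size=2000)

B, eigvals = sir_directions(X, y, n_slices=10, n_dirs=1)
cosine = abs(float(B[:, 0] @ beta))
print(cosine)  # close to 1 when the direction is recovered
```

The number of eigenvalues of the slice-mean matrix that are clearly separated from zero is the usual informal guide to the number of effective factors.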

