Generalized kernel-based inverse regression methods for sufficient dimension reduction

2020 ◽ Vol 150 ◽ pp. 106995
Author(s): Chuanlong Xie ◽ Lixing Zhu
Biostatistics ◽ 2019
Author(s): Diego Tomassi ◽ Liliana Forzani ◽ Sabrina Duarte ◽ Ruth M Pfeiffer

Summary Recent efforts to characterize the human microbiome and its relation to chronic diseases have led to a surge in statistical development for compositional data. We develop likelihood-based sufficient dimension reduction (SDR) methods to find linear combinations that contain all the information in the compositional data on an outcome variable, i.e., that are sufficient for modeling and prediction of the outcome. We consider several models for the inverse regression of the compositional vector, or transformations of it, as a function of the outcome. They include normal, multinomial, and Poisson graphical models that allow for complex dependencies among the observed counts. These methods yield efficient estimators of the reduction and can be applied to continuous or categorical outcomes. We incorporate variable selection into the estimation via penalties and address important invariance issues arising from the compositional nature of the data. We illustrate and compare our methods and some established methods for analyzing microbiome data in simulations and using data from the Human Microbiome Project. Displaying the data in the coordinate system of the SDR linear combinations allows visual inspection and facilitates comparisons across studies.
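Below is a minimal sketch of the inverse-regression idea behind such methods: a centered log-ratio (clr) transform of the compositional counts followed by a sliced-inverse-regression step. It is not the authors' likelihood-based estimator (which fits normal, multinomial, or Poisson graphical inverse models with penalties); the function names, the eps guard, and the ridge term handling the clr-induced singularity are illustrative assumptions.

```python
import numpy as np

def clr(counts, eps=1e-6):
    """Centered log-ratio transform for compositional rows (illustrative)."""
    Z = np.log(counts + eps)                  # eps guards against zero counts
    return Z - Z.mean(axis=1, keepdims=True)  # remove row log-geometric mean

def inverse_regression_directions(Z, y, n_slices=5, d=2, ridge=1e-8):
    """SIR-style estimate of a sufficient reduction on clr-scale data."""
    n, p = Z.shape
    Zc = Z - Z.mean(axis=0)
    # ridge keeps Sigma invertible: clr vectors lie in a (p-1)-dim subspace
    Sigma = Zc.T @ Zc / n + ridge * np.eye(p)
    order = np.argsort(y)
    M = np.zeros((p, p))
    for s in np.array_split(order, n_slices):  # slice on the outcome
        mu = Zc[s].mean(axis=0)
        M += (len(s) / n) * np.outer(mu, mu)   # between-slice covariance
    w, V = np.linalg.eigh(Sigma)
    S = V @ np.diag(w ** -0.5) @ V.T           # Sigma^{-1/2}
    evals, evecs = np.linalg.eigh(S @ M @ S)   # symmetric eigenproblem
    return S @ evecs[:, np.argsort(evals)[::-1][:d]]  # top-d directions
```

Projecting the clr-transformed data onto the returned directions gives the low-dimensional coordinate system in which the data can be displayed and compared across studies.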


2019 ◽ Vol 9 (1)
Author(s): Jae Keun Yoo

Abstract Sufficient dimension reduction (SDR) for a regression pursues replacing the original p-dimensional predictors with a lower-dimensional linear projection of them. The so-called sliced inverse regression (SIR; [5]) arguably has the longest history among SDR methodologies, yet it remains one of the most popular. SIR is known to be sensitive to the choice of the number of slices, which is one of its critical deficits. Recently, a fused approach to SIR was proposed to relieve this weakness; it fuses the kernel matrices computed by applying SIR with various numbers of slices. In this paper, the fused SIR is applied to a large-p-small-n regression of high-dimensional, right-censored microarray data to show its practical advantage over the usual SIR. Through model validation, it is confirmed that the fused SIR outperforms SIR with any single number of slices under consideration.
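A minimal sketch of the fusing idea follows, assuming the fused kernel is simply the sum of the SIR candidate matrices over a grid of slice counts; the slice grid, the ridge term (useful in the large-p-small-n setting), and the function names are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def sir_kernel(X, y, n_slices):
    """SIR candidate matrix: weighted outer products of slice means."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    order = np.argsort(y)                      # slice on the sorted response
    M = np.zeros((p, p))
    for s in np.array_split(order, n_slices):
        m = Xc[s].mean(axis=0)
        M += (len(s) / n) * np.outer(m, m)
    return M

def fused_sir(X, y, slice_grid=(2, 3, 4, 5, 10), d=1, ridge=1e-8):
    """Fuse SIR kernels over several slice counts, then eigen-decompose."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # ridge keeps the covariance invertible when p is close to (or above) n
    Sigma = Xc.T @ Xc / n + ridge * np.eye(p)
    M = sum(sir_kernel(X, y, H) for H in slice_grid)   # the fusing step
    w, V = np.linalg.eigh(Sigma)
    S = V @ np.diag(w ** -0.5) @ V.T                   # Sigma^{-1/2}
    evals, evecs = np.linalg.eigh(S @ M @ S)           # symmetric problem
    return S @ evecs[:, np.argsort(evals)[::-1][:d]]   # top-d directions
```

Summing the kernels pools estimating information across slice choices, so no single number of slices has to be picked in advance.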


2013 ◽ Vol 45 (3) ◽ pp. 626-644
Author(s): Ondřej Šedivý ◽ Jakub Stanek ◽ Blažena Kratochvílová ◽ Viktor Beneš

Dimension reduction of multivariate data was developed by Y. Guan for point processes with Gaussian random fields as covariates. The generalization to fibre and surface processes is straightforward. For inverse regression methods, we suggest slicing based on geometrical marks. An investigation of the properties of this method is presented in simulation studies of random marked sets. In a refined model for dimension reduction, the second-order central subspace is analyzed in detail. A real data pattern is tested for independence from a covariate.
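As a rough illustration of mark-based slicing, the sketch below assigns the points of a marked pattern to slices by quantiles of a scalar geometrical mark (e.g., fibre length) and computes within-slice means of the covariate values observed at the points; the function names and the quantile rule are assumptions for illustration, not the authors' procedure.

```python
import numpy as np

def mark_slices(marks, n_slices=4):
    """Assign each point of a marked pattern to a slice by mark quantiles."""
    cuts = np.quantile(marks, np.linspace(0, 1, n_slices + 1)[1:-1])
    return np.searchsorted(cuts, marks)          # labels in {0,...,n_slices-1}

def sliced_covariate_means(covariates, marks, n_slices=4):
    """Within-slice means of the covariate field observed at the points."""
    labels = mark_slices(marks, n_slices)
    return np.array([covariates[labels == h].mean(axis=0)
                     for h in range(n_slices)])  # assumes non-empty slices
```

Here `covariates` is an (n, q) array of covariate values at the n points and `marks` is the corresponding vector of scalar marks; the slice means then feed an inverse-regression step as in SIR.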


Stats ◽ 2021 ◽ Vol 4 (1) ◽ pp. 138-145
Author(s): Stephen Babos ◽ Andreas Artemiou

In this paper, we present the Cumulative Median Estimation (CUMed) algorithm for robust sufficient dimension reduction. Compared with non-robust competitors, this algorithm performs better when outliers are present in the data and comparably when they are not. This is demonstrated in simulated and real data experiments.
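Below is a minimal sketch of one plausible reading of the cumulative-median idea, assuming it follows cumulative slicing estimation but replaces within-slice means with coordinatewise medians for robustness; this is an assumption for illustration, not the authors' exact algorithm.

```python
import numpy as np

def cumed_directions(X, y, d=1, ridge=1e-8):
    """Cumulative-median sketch: coordinatewise medians of standardized
    predictors over the nested slices {y <= y_(i)}, accumulated as outer
    products, then eigen-decomposed."""
    n, p = X.shape
    Sigma = np.cov(X, rowvar=False) + ridge * np.eye(p)
    w, V = np.linalg.eigh(Sigma)
    S = V @ np.diag(w ** -0.5) @ V.T             # Sigma^{-1/2}
    Z = (X - np.median(X, axis=0)) @ S           # robustly centered, whitened
    order = np.argsort(y)
    M = np.zeros((p, p))
    for i in range(1, n):                        # nested (cumulative) slices
        med = np.median(Z[order[:i + 1]], axis=0)
        M += np.outer(med, med) / n
    evals, evecs = np.linalg.eigh(M)
    return S @ evecs[:, np.argsort(evals)[::-1][:d]]   # top-d directions
```

Because medians, unlike means, are insensitive to a few extreme observations, each cumulative slice statistic stays stable under contamination, which is the source of the robustness claimed for this family of estimators.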

