A Survey of Tuning Parameter Selection for High-Dimensional Regression

Annual Review of Statistics and Its Application ◽  
2020 ◽  
Vol 7 (1) ◽  
pp. 209-226 ◽  
Author(s):  
Yunan Wu ◽  
Lan Wang

Penalized (or regularized) regression, as represented by the lasso and its variants, has become a standard technique for analyzing high-dimensional data when the number of variables substantially exceeds the sample size. The performance of penalized regression relies crucially on the choice of the tuning parameter, which determines the amount of regularization and hence the sparsity level of the fitted model. The optimal choice of tuning parameter depends on both the structure of the design matrix and the unknown random error distribution (variance, tail behavior, etc.). This article reviews the current literature on tuning parameter selection for high-dimensional regression from both theoretical and practical perspectives. We discuss various strategies that choose the tuning parameter to achieve prediction accuracy or support recovery. We also review several recently proposed methods for tuning-free high-dimensional regression.
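As a concrete illustration of the most common practical strategy covered by such surveys, the sketch below selects the lasso tuning parameter by K-fold cross-validation on simulated high-dimensional data. This is a generic example, not the survey's own procedure; the dimensions, sparsity level, and use of scikit-learn's LassoCV are assumptions made here for illustration.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Simulated high-dimensional setting: p substantially exceeds n.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0                      # sparse truth: only 5 active covariates
y = X @ beta + rng.standard_normal(n)

# LassoCV fits the regularization path and picks the tuning parameter
# (alpha) that minimizes 5-fold cross-validated prediction error.
fit = LassoCV(cv=5, n_alphas=100).fit(X, y)
print(f"selected tuning parameter: {fit.alpha_:.4f}")
print(f"nonzero coefficients: {np.count_nonzero(fit.coef_)} of {p}")
```

Note that cross-validation targets prediction accuracy; tuning parameters chosen this way tend to be smaller, and the fitted models less sparse, than those chosen by criteria aimed at support recovery, which is one reason the two goals are treated separately in the literature.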

Author(s):  
Jaeeun Lee ◽  
Jie Chen

Abstract Modeling high-throughput next-generation sequencing (NGS) data, resulting from experiments that profile tumor and control samples for the study of DNA copy number variants (CNVs), remains challenging in several ways. In this application work, we provide an efficient method for detecting multiple CNVs using NGS reads-ratio data. The method is based on a multiple statistical change-point model combined with a penalized regression approach, the one-dimensional (1d) fused LASSO, which is designed for ordered data. In addition, since the path algorithm traces the solution as a function of the tuning parameter, the number and locations of potential CNV region boundaries can be estimated simultaneously and efficiently. For tuning parameter selection, we propose a new modified Bayesian information criterion, called JMIC, and compare it with three other Bayesian information criteria used in the literature. Simulation results show that JMIC outperforms the other three criteria for tuning parameter selection. We applied our approach to reads-ratio sequencing data between the breast tumor cell line HCC1954 and its matched normal cell line BL 1954, and the results are in line with those reported in the literature.
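A minimal sketch of the idea, not the authors' implementation: solve the 1d fused LASSO over a grid of tuning parameters and pick the value minimizing an information-criterion score. The paper's JMIC is not reproduced here; the generic modified BIC below, the CVXPY solver, and the toy signal are all stand-ins assumed for illustration (the paper's path algorithm traces the exact solution path in λ rather than re-solving on a grid).

```python
import numpy as np
import cvxpy as cp

def fused_lasso_1d(y, lam):
    """Fit the 1d fused LASSO: min 0.5*||y - b||^2 + lam * sum_i |b_{i+1} - b_i|."""
    b = cp.Variable(len(y))
    obj = cp.Minimize(0.5 * cp.sum_squares(y - b) + lam * cp.norm1(cp.diff(b)))
    cp.Problem(obj).solve()
    return b.value

def bic_score(y, fit, tol=1e-4):
    """Generic modified-BIC stand-in (NOT the paper's JMIC): penalize the
    number of estimated segments, i.e. change-points in the fitted signal."""
    n = len(y)
    rss = np.sum((y - fit) ** 2)
    k = int(np.sum(np.abs(np.diff(fit)) > tol)) + 1   # number of segments
    return n * np.log(rss / n) + k * np.log(n)

# Toy reads-ratio signal with two copy-number change-points.
rng = np.random.default_rng(1)
signal = np.r_[np.ones(60), 1.8 * np.ones(40), np.ones(60)]
y = signal + 0.2 * rng.standard_normal(len(signal))

# Trace solutions over a grid of tuning parameters; select by the score.
lams = np.geomspace(0.1, 20, 15)
fits = [fused_lasso_1d(y, lam) for lam in lams]
best = min(range(len(lams)), key=lambda i: bic_score(y, fits[i]))
change_points = np.flatnonzero(np.abs(np.diff(fits[best])) > 1e-4)
print(f"selected lambda: {lams[best]:.3f}, change-points near: {change_points}")
```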


Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 324
Author(s):  
S. Ejaz Ahmed ◽  
Saeid Amiri ◽  
Kjell Doksum

Regression models provide prediction frameworks for multivariate mutual information analysis, which uses information concepts when choosing covariates (also called features) that are important for analysis and prediction. We consider a high-dimensional regression framework where the number of covariates (p) exceeds the sample size (n). Recent work in high-dimensional regression analysis has embraced an ensemble subspace approach that consists of selecting random subsets of fewer than p covariates, carrying out the statistical analysis on each subset, and then merging the results across subsets. We examine conditions under which penalty methods such as the Lasso perform better when used in the ensemble approach, by computing mean squared prediction errors in simulations and a real data example. Linear models with both random and fixed designs are considered. We examine two versions of the penalty methods: one where the tuning parameter is selected by cross-validation, and one where the final predictor is a trimmed average of the individual predictors corresponding to a set of fixed tuning parameters. We find that the ensemble approach improves on penalty methods in several important real data and model scenarios. The improvement occurs when covariates are strongly associated with the response and when the complexity of the model is high. In such cases, the trimmed-average version of ensemble Lasso is often the best predictor.
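A sketch of the ensemble subspace idea with the trimmed-average variant described above. This is an illustrative reading of the procedure, not the authors' code: the number of subspaces, subset size, λ grid, and trimming fraction are all assumptions.

```python
import numpy as np
from scipy.stats import trim_mean
from sklearn.linear_model import Lasso

def ensemble_lasso_predict(X_tr, y_tr, X_te, n_subsets=50, subset_size=None,
                           lam_grid=(0.01, 0.05, 0.1, 0.5), trim=0.1, seed=0):
    """Fit Lasso on random covariate subspaces (each with fewer than p
    covariates) over a set of fixed tuning parameters, then merge the
    individual predictions by a trimmed average."""
    rng = np.random.default_rng(seed)
    n, p = X_tr.shape
    subset_size = subset_size or max(2, min(n // 2, p // 4))
    preds = []
    for _ in range(n_subsets):
        cols = rng.choice(p, size=subset_size, replace=False)
        for lam in lam_grid:   # one predictor per (subspace, tuning parameter)
            fit = Lasso(alpha=lam, max_iter=5000).fit(X_tr[:, cols], y_tr)
            preds.append(fit.predict(X_te[:, cols]))
    # Trimmed average discards extreme predictions from poorly tuned members.
    return trim_mean(np.vstack(preds), proportiontocut=trim, axis=0)

# Toy p > n example with strong signal covariates.
rng = np.random.default_rng(42)
n, p = 80, 300
X = rng.standard_normal((n + 20, p))
beta = np.zeros(p); beta[:10] = 1.5
y = X @ beta + rng.standard_normal(n + 20)
y_hat = ensemble_lasso_predict(X[:n], y[:n], X[n:])
print("test MSPE:", np.mean((y[n:] - y_hat) ** 2))
```

The cross-validated variant mentioned in the abstract would replace the fixed λ grid with per-subspace cross-validation (e.g. scikit-learn's LassoCV) and average the resulting predictors instead.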


Author(s):  
Sveinn E. Armannsson ◽  
Jakob Sigurdsson ◽  
Johannes R. Sveinsson ◽  
Magnus O. Ulfarsson
