Non-asymptotic oracle inequalities for the Lasso and Group Lasso in high dimensional logistic model

2016, Vol 20, pp. 309-331
Author(s): Marius Kwemou

2020, Vol 2020 (1)
Author(s): Yijun Xiao, Ting Yan, Huiming Zhang, Yuanyuan Zhang

Abstract: We study the nonasymptotic properties of a general norm-penalized estimator, which includes the Lasso, weighted Lasso, and group Lasso as special cases, for sparse high-dimensional misspecified Cox models with time-dependent covariates. Under suitable conditions on the true regression coefficients and the random covariates, we provide oracle inequalities for the prediction and estimation error based on the group sparsity of the true coefficient vector. The nonasymptotic oracle inequalities show that the penalized estimator gives a good sparse approximation of the true model and makes it possible to select a few meaningful structural variables from the full set of features.
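For orientation, a weighted group Lasso estimator for the Cox partial likelihood with time-dependent covariates is typically written as follows (a sketch in standard counting-process notation; the weights $w_g$ and this exact form are assumptions, not taken verbatim from the paper):

$$
\hat{\beta} \in \arg\min_{\beta}\ \left\{ -\frac{1}{n}\sum_{i=1}^{n}\int_{0}^{\tau}\left[\beta^{\top} Z_i(t) - \log\left(\frac{1}{n}\sum_{j=1}^{n} Y_j(t)\, e^{\beta^{\top} Z_j(t)}\right)\right] \mathrm{d}N_i(t) \;+\; \lambda \sum_{g=1}^{G} w_g \lVert \beta_g \rVert_2 \right\},
$$

where $N_i$ is the counting process, $Y_j$ the at-risk indicator, and $Z_i(t)$ the time-dependent covariate vector of subject $i$. Setting $w_g \equiv 1$ recovers the group Lasso, while singleton groups with per-coordinate weights recover the weighted Lasso and the plain Lasso.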


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Jin-Jia Wang ◽  
Fang Xue ◽  
Hui Li

Feature extraction and classification of EEG signals are core parts of brain-computer interfaces (BCIs). Due to the high dimension of the EEG feature vector, an effective feature selection algorithm has become an integral part of research in this area. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. High-dimensional fused features are first obtained; these include the power spectrum, time-domain statistics, AR model, and wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses a logistic regression model with a Sparse Group Lasso penalty. The model is fitted on the training data, and parameter estimates are obtained by a modified blockwise coordinate descent and coordinate gradient descent method. The best parameters and feature subset are selected using 10-fold cross-validation. Finally, the test data are classified using the trained model. Compared with existing channel and feature selection methods, the results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can achieve channel and feature selection simultaneously with a lower error rate. The test accuracy on data from the international BCI Competition IV reached 84.72%.
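The fitting step can be illustrated with a short sketch. The paper uses a modified blockwise coordinate descent and coordinate gradient descent; the proximal-gradient version below solves the same Sparse Group Lasso penalized logistic regression and is only an illustration, with all names and the (lam, alpha) parameterization chosen here rather than taken from the paper:

```python
import numpy as np

def sgl_logistic(X, y, groups, lam=0.1, alpha=0.5, n_iter=500):
    """Sparse Group Lasso penalized logistic regression via proximal
    gradient descent (illustrative sketch, not the paper's solver).

    X      : (n, p) feature matrix
    y      : (n,) labels in {0, 1}
    groups : list of index arrays, one per channel/feature group
    lam    : overall penalty strength
    alpha  : mix between lasso (alpha=1) and group lasso (alpha=0)
    """
    n, p = X.shape
    beta = np.zeros(p)
    # Lipschitz constant of the logistic-loss gradient: ||X||_2^2 / (4n)
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / (4 * n))
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))   # sigmoid(X beta)
        grad = X.T @ (mu - y) / n              # logistic-loss gradient
        z = beta - step * grad
        # Proximal step: elementwise soft-threshold (lasso part) ...
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam * alpha, 0.0)
        # ... then groupwise soft-threshold (group-lasso part).
        for g in groups:
            norm_g = np.linalg.norm(z[g])
            thr = np.sqrt(len(g)) * (1 - alpha) * lam * step
            z[g] = 0.0 if norm_g <= thr else (1 - thr / norm_g) * z[g]
        beta = z
    return beta
```

Channels whose group of coefficients stays at zero are dropped entirely, which is how channel selection and within-channel feature selection happen in a single fit.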


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Jan Klosa ◽  
Noah Simon ◽  
Pål Olof Westermark ◽  
Volkmar Liebscher ◽  
Dörte Wittenburg

Abstract
Background: Statistical analyses of biological problems in the life sciences often lead to high-dimensional linear models. To solve the corresponding systems of equations, penalization approaches are often the method of choice. They are especially useful in the case of multicollinearity, which arises when the number of explanatory variables exceeds the number of observations or for some other biological reason. The model's goodness of fit is then penalized by a suitable function of the coefficients. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.
Results: Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R² > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature.
Conclusions: The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
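For reference, the sparse-group lasso operator that seagull and SGL both solve has the standard form (common notation; seagull's exact parameterization of the weights may differ):

$$
\hat{\beta} = \arg\min_{\beta}\ \frac{1}{2n}\lVert y - X\beta \rVert_2^2 + \alpha\lambda\lVert \beta \rVert_1 + (1-\alpha)\lambda \sum_{g=1}^{G} \omega_g \lVert \beta_g \rVert_2,
$$

where $\alpha = 1$ recovers the lasso and $\alpha = 0$ the group lasso; the per-feature weights mentioned above enter through factors such as $\omega_g$.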


Bernoulli, 2019, Vol 25 (2), pp. 1225-1255
Author(s): Johannes Lederer, Lu Yu, Irina Gaynanova

Entropy, 2020, Vol 22 (5), pp. 543
Author(s): Konrad Furmańczyk, Wojciech Rejchel

In this paper, we consider prediction and variable selection in misspecified binary classification models under the high-dimensional scenario. We focus on two computationally efficient approaches to classification, both of which lead to model misspecification. The first is to apply penalized logistic regression to classification data that possibly do not follow the logistic model. The second method is even more radical: we treat the class labels of objects as if they were numbers and apply penalized linear regression. We investigate these two approaches thoroughly and provide conditions that guarantee their success in prediction and variable selection. Our results hold even if the number of predictors is much larger than the sample size. The paper concludes with experimental results.
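The two approaches are easy to state concretely. Below is a minimal sketch using scikit-learn; the data-generating model and all tuning values are illustrative assumptions, not the paper's experiments:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Lasso

rng = np.random.default_rng(0)
n, p = 200, 1000                          # high-dimensional: p >> n
X = rng.standard_normal((n, p))
# Labels from a thresholded noisy linear score (not a logistic model),
# so both working models below are misspecified, as in the paper.
y = (X[:, 0] - X[:, 1] + 0.5 * rng.standard_normal(n) > 0).astype(int)

# Approach 1: l1-penalized logistic regression on the class labels.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)
support_logistic = np.flatnonzero(clf.coef_.ravel())

# Approach 2: treat the 0/1 labels as numbers and run penalized linear
# regression (the Lasso); classify by thresholding fitted values at 1/2.
reg = Lasso(alpha=0.05)
reg.fit(X, y)
support_linear = np.flatnonzero(reg.coef_)
labels_linear = (reg.predict(X) > 0.5).astype(int)
```

Both fits are fast even with p much larger than n; the paper's results give conditions under which the selected supports and the resulting classifiers remain reliable despite the misspecification.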

