Sparse group lasso and high dimensional multinomial classification

Feature extraction and classification of EEG signals are core parts of brain computer interfaces (BCIs). Due to the high dimension of the EEG feature vector, an effective feature selection algorithm has become an integral part of research studies. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. The high-dimensional fused features are firstly obtained, which include the power spectrum, time-domain statistics, AR model, and the wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses the logistical regression model with Sparse Group Lasso penalized function. The model is fitted on the training data, and parameter estimation is obtained by modified blockwise coordinate descent and coordinate gradient descent method. The best parameters and feature subset are selected by using a 10-fold cross-validation. Finally, the test data is classified using the trained model. Compared with existing channel and feature selection methods, results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can simultaneously achieve channel and feature selection with a lower error rate. The test accuracy on the data used from international BCI Competition IV reached 84.72%.

Download Full-text

Seagull: lasso, group lasso and sparse-group lasso regularization for linear regression models via proximal gradient descent

BMC Bioinformatics ◽

10.1186/s12859-020-03725-w ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Jan Klosa ◽

Noah Simon ◽

Pål Olof Westermark ◽

Volkmar Liebscher ◽

Dörte Wittenburg

Keyword(s):

Linear Regression ◽

Regression Models ◽

Gradient Descent ◽

Methylation Status ◽

R Package ◽

Group Lasso ◽

High Dimensional ◽

Linear Regression Models ◽

Sparse Group Lasso ◽

Proximal Gradient Descent

Abstract Background Statistical analyses of biological problems in life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. Then, the model goodness of fit is penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull -the R package presented here- produces complete regularization paths. Results Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R2 > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature. Conclusions The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.

Download Full-text

A Novel Convex Clustering Method for High-Dimensional Data Using Semiproximal ADMM

Mathematical Problems in Engineering ◽

10.1155/2020/9216351 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Huangyue Chen ◽

Lingchen Kong ◽

Yan Li

Keyword(s):

High Dimensional Data ◽

Group Lasso ◽

High Dimensional ◽

Clustering Methods ◽

Finite Sample ◽

Clustering Method ◽

Sparse Group Lasso ◽

Clustering Model ◽

Sample Error ◽

Convex Clustering

Clustering is an important ingredient of unsupervised learning; classical clustering methods include K-means clustering and hierarchical clustering. These methods may suffer from instability because of their tendency prone to sink into the local optimal solutions of the nonconvex optimization model. In this paper, we propose a new convex clustering method for high-dimensional data based on the sparse group lasso penalty, which can simultaneously group observations and eliminate noninformative features. In this method, the number of clusters can be learned from the data instead of being given in advance as a parameter. We theoretically prove that the proposed method has desirable statistical properties, including a finite sample error bound and feature screening consistency. Furthermore, the semiproximal alternating direction method of multipliers is designed to solve the sparse group lasso convex clustering model, and its convergence analysis is established without any conditions. Finally, the effectiveness of the proposed method is thoroughly demonstrated through simulated experiments and real applications.

Download Full-text

The sparse group lasso for high-dimensional integrative linear discriminant analysis with application to alzheimer's disease prediction

Journal of Statistical Computation and Simulation ◽

10.1080/00949655.2020.1800011 ◽

2020 ◽

Vol 90 (17) ◽

pp. 3218-3231

Author(s):

Hao Chen ◽

Yong He ◽

Jiadong Ji ◽

Yufeng Shi

Keyword(s):

Alzheimer’S Disease ◽

Alzheimer's Disease ◽

Discriminant Analysis ◽

Linear Discriminant Analysis ◽

Group Lasso ◽

High Dimensional ◽

Disease Prediction ◽

Linear Discriminant ◽

Sparse Group Lasso

Download Full-text

Change-point estimation in high dimensional linear regression models via sparse group Lasso

2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton) ◽

10.1109/allerton.2015.7447090 ◽

2015 ◽

Cited By ~ 2

Author(s):

Bingwen Zhang ◽

Jun Geng ◽

Lifeng Lai

Keyword(s):

Linear Regression ◽

Change Point ◽

Regression Models ◽

Group Lasso ◽

Point Estimation ◽

High Dimensional ◽

Linear Regression Models ◽

Sparse Group Lasso ◽

Change Point Estimation

Download Full-text

Inverse Sparse Group Lasso Model for Robust Object Tracking

IEEE Transactions on Multimedia ◽

10.1109/tmm.2017.2689918 ◽

2017 ◽

Vol 19 (8) ◽

pp. 1798-1810 ◽

Cited By ~ 20

Author(s):

Yun Zhou ◽

Jianghong Han ◽

Xiaohui Yuan ◽

Zhenchun Wei ◽

Richang Hong

Keyword(s):

Object Tracking ◽

Group Lasso ◽

Sparse Group Lasso

Download Full-text

Adaptive logistic group Lasso method for predicting the no-reflow among the multiple types of high-dimensional variables with missing data

2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS) ◽

10.1109/icsess.2016.7883254 ◽

2016 ◽

Author(s):

Xianglin Yang ◽

Yunhai Tong ◽

Xiangfeng Meng ◽

Shuai Zhao ◽

Zhi Xu ◽

...

Keyword(s):

Missing Data ◽

Group Lasso ◽

High Dimensional ◽

No Reflow ◽

Lasso Method

Download Full-text

Video-to-shot tag allocation by weighted sparse group lasso

Proceedings of the 19th ACM international conference on Multimedia - MM '11 ◽

10.1145/2072298.2072050 ◽

2011 ◽

Cited By ~ 5

Author(s):

Xiaofeng Zhu ◽

Zi Huang ◽

Heng Tao Shen

Keyword(s):

Group Lasso ◽

Sparse Group Lasso

Download Full-text

A Novel Method for Enhancing Bearing Fault Feature Based on Sparse Group Lasso

10.1109/sdpc52933.2021.9563554 ◽

2021 ◽

Author(s):

Changkun Han ◽

Wei Lu ◽

Pengxin Wang ◽

Liuyang Song ◽

Huaqing Wang

Keyword(s):

Group Lasso ◽

Bearing Fault ◽

Sparse Group Lasso ◽

Feature Based ◽

Novel Method

Download Full-text

Rule Extraction from Decision Trees Ensembles: New Algorithms Based on Heuristic Search and Sparse Group Lasso Methods

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622017500055 ◽

2017 ◽

Vol 16 (06) ◽

pp. 1707-1727 ◽

Cited By ~ 9

Author(s):

Morteza Mashayekhi ◽

Robin Gras

Keyword(s):

Decision Trees ◽

Predictive Accuracy ◽

Weight Vector ◽

Rule Extraction ◽

Group Lasso ◽

Hill Climbing ◽

Data Sets ◽

Sparse Group Lasso ◽

Rule Set ◽

Interpretable Models

Decision trees are examples of easily interpretable models whose predictive accuracy is normally low. In comparison, decision tree ensembles (DTEs) such as random forest (RF) exhibit high predictive accuracy while being regarded as black-box models. We propose three new rule extraction algorithms from DTEs. The RF[Formula: see text]DHC method, a hill climbing method with downhill moves (DHC), is used to search for a rule set that decreases the number of rules dramatically. In the RF[Formula: see text]SGL and RF[Formula: see text]MSGL methods, the sparse group lasso (SGL) method, and the multiclass SGL (MSGL) method are employed respectively to find a sparse weight vector corresponding to the rules generated by RF. Experimental results with 24 data sets show that the proposed methods outperform similar state-of-the-art methods, in terms of human comprehensibility, by greatly reducing the number of rules and limiting the number of antecedents in the retained rules, while preserving the same level of accuracy.

Download Full-text