Accelerated Proximal Gradient Descent in Metric Learning for Kernel Regression

Author(s):  
Hector Gonzalez ◽  
Carlos Morell ◽  
Francesc J. Ferri
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Jan Klosa ◽  
Noah Simon ◽  
Pål Olof Westermark ◽  
Volkmar Liebscher ◽  
Dörte Wittenburg

Abstract
Background: Statistical analyses of biological problems in life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. Then, the model goodness of fit is penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.
Results: Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. But even though the results of seagull and SGL were very similar (R² > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature.
Conclusions: The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
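The core machinery this abstract describes, a proximal gradient step with backtracking line search wrapped in a warm-started grid over the penalty parameter, can be sketched for the plain lasso case as follows. This is a minimal NumPy illustration of the general scheme, not the seagull implementation; the function names, tolerances and default settings are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm (lasso penalty)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_pgd(X, y, lam, beta0, max_iter=200, tol=1e-8):
    """Proximal gradient descent for 0.5/n * ||y - X b||^2 + lam * ||b||_1,
    with backtracking line search on the step size."""
    n = len(y)
    beta = beta0.copy()
    step = 1.0
    for _ in range(max_iter):
        grad = X.T @ (X @ beta - y) / n
        f_beta = 0.5 * np.sum((y - X @ beta) ** 2) / n
        # Backtracking: shrink the step until the quadratic upper bound holds.
        while True:
            beta_new = soft_threshold(beta - step * grad, step * lam)
            diff = beta_new - beta
            f_new = 0.5 * np.sum((y - X @ beta_new) ** 2) / n
            if f_new <= f_beta + grad @ diff + np.sum(diff ** 2) / (2 * step):
                break
            step *= 0.5
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

def regularization_path(X, y, lambdas):
    """Warm starts: solve over a decreasing grid of penalties, reusing the
    previous solution as the starting point for the next one."""
    beta = np.zeros(X.shape[1])
    path = []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_pgd(X, y, lam, beta)
        path.append((lam, beta.copy()))
    return path
```

Swapping in the group or sparse-group penalty only changes the proximal operator applied after the gradient step; the warm-start loop and line search stay the same.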


2020 ◽  
Author(s):  
Jan Klosa ◽  
Noah Simon ◽  
Pål O. Westermark ◽  
Volkmar Liebscher ◽  
Dörte Wittenburg

Summary: Statistical analyses of biological problems in life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalisation approaches are often the methods of choice. They are especially useful in case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. Then, the model goodness of fit is penalised by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realised by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, the package produces complete regularisation paths.
Availability and implementation: seagull is an R package that is freely available on the Comprehensive R Archive Network (CRAN; https://CRAN.R-project.org/package=seagull; vignette included). The source code is available on https://github.com/jklosa/
Contact: [email protected]
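The lasso, group lasso and sparse-group lasso named above differ only in the proximal operator used inside the gradient descent loop. Below is a sketch of the sparse-group lasso proximal operator under its standard definition (element-wise soft-thresholding followed by group-wise shrinkage); the variable names and the sqrt(p_g) group-weight convention are assumptions for illustration, not seagull's internal code.

```python
import numpy as np

def prox_sparse_group_lasso(v, groups, lam, alpha, step):
    """Proximal operator of the sparse-group lasso penalty
    lam * (alpha * ||b||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||b_g||_2).
    alpha = 1 recovers the lasso, alpha = 0 the group lasso."""
    out = np.zeros_like(v)
    for g in np.unique(groups):
        idx = np.where(groups == g)[0]
        # L1 part: element-wise soft-thresholding within the group.
        u = np.sign(v[idx]) * np.maximum(np.abs(v[idx]) - step * lam * alpha, 0.0)
        # Group part: shrink the whole block toward zero by its Euclidean norm.
        t = step * lam * (1 - alpha) * np.sqrt(len(idx))
        norm_u = np.linalg.norm(u)
        if norm_u > t:
            out[idx] = (1 - t / norm_u) * u
        # otherwise the entire group is set to zero
    return out
```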


2021 ◽  
Author(s):  
D N S Ravi Kumar ◽  
G T Sundarrajan ◽  
S D Sundarsingh Jebaseelan ◽  
M. Pushpavalli ◽  
A Rameshbabu ◽  
...  

2020 ◽  
Vol 34 (04) ◽  
pp. 4012-4019
Author(s):  
Xiuwen Gong ◽  
Dong Yuan ◽  
Wei Bao

Existing research into online multi-label classification, such as the online sequential multi-label extreme learning machine (OSML-ELM) and stochastic gradient descent (SGD), has achieved promising performance. However, these works lack an analysis of the loss function and do not consider label dependency. Accordingly, to fill the current research gap, we propose a novel online metric learning paradigm for multi-label classification. More specifically, we first project instances and labels into a lower dimension for comparison, then leverage the large margin principle to learn a metric with an efficient optimization algorithm. Moreover, we provide theoretical analysis on the upper bound of the cumulative loss for our method. Comprehensive experiments on a number of benchmark multi-label datasets validate our theoretical approach and illustrate that our proposed online metric learning (OML) algorithm outperforms state-of-the-art methods.
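To give intuition for the kind of update such an online scheme performs, the following sketch projects an instance and its labels into a shared low-dimensional space and takes one SGD step on a pairwise large-margin hinge loss. It is a hypothetical simplification, not the OML algorithm from the paper; the matrices P and Q, the margin and the learning rate are illustrative choices.

```python
import numpy as np

def online_metric_update(P, Q, x, y, margin=1.0, lr=0.01):
    """One online update of a large-margin joint embedding for multi-label data.
    P (k x d) projects instances, Q (k x L) embeds labels. Every (relevant,
    irrelevant) label pair that violates the margin contributes a hinge-loss
    gradient; a single SGD step is then taken for this sample."""
    z = P @ x                               # instance in the shared low-dimensional space
    grad_z = np.zeros_like(z)
    grad_Q = np.zeros_like(Q)
    for p in np.where(y == 1)[0]:           # relevant labels
        for n in np.where(y == 0)[0]:       # irrelevant labels
            d_pos, d_neg = z - Q[:, p], z - Q[:, n]
            # violation: ||z - q_pos||^2 + margin > ||z - q_neg||^2
            if d_pos @ d_pos + margin > d_neg @ d_neg:
                grad_z += 2 * (d_pos - d_neg)
                grad_Q[:, p] += -2 * d_pos
                grad_Q[:, n] += 2 * d_neg
    P -= lr * np.outer(grad_z, x)           # chain rule through z = P @ x
    Q -= lr * grad_Q
    return P, Q
```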

