Proximal Gradient Methods with Adaptive Subspace Sampling

Author(s):  
Dmitry Grishchenko ◽  
Franck Iutzeler ◽  
Jérôme Malick

Many applications in machine learning and signal processing involve nonsmooth optimization problems, and this nonsmoothness induces a low-dimensional structure in the optimal solutions. In this paper, we propose a randomized proximal gradient method harnessing this underlying structure. We introduce two key components: (i) a random subspace proximal gradient algorithm; and (ii) an identification-based sampling of the subspaces. Their interplay brings a significant performance improvement on typical learning problems in terms of dimensions explored.
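To make the idea concrete, here is a minimal Python sketch of a proximal gradient iteration in which each step updates only a randomly sampled coordinate subspace. This is not the authors' exact method: the ℓ1 regularizer, the step size, and the uniform sampling are illustrative assumptions; the paper's identification-based sampling would instead bias the sampled coordinates toward the learned support.

```python
# Minimal sketch of a random-subspace proximal gradient method for
# min f(x) + lam * ||x||_1.  Uniform coordinate sampling is an assumption,
# not the paper's adaptive identification-based sampling.
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def subspace_prox_grad(grad_f, x0, lam, step, n_iters=1000, frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    d = x.size
    k = max(1, int(frac * d))                        # subspace dimension
    for _ in range(n_iters):
        idx = rng.choice(d, size=k, replace=False)   # sampled coordinates
        # gradient step and prox restricted to the sampled subspace
        x[idx] = soft_threshold(x[idx] - step * grad_f(x)[idx], step * lam)
    return x
```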

2021 ◽  
Vol 78 (3) ◽  
pp. 705-740
Author(s):  
Caroline Geiersbach ◽  
Teresa Scarinci

For finite-dimensional problems, stochastic approximation methods have long been used to solve stochastic optimization problems. Their application to infinite-dimensional problems is less understood, particularly for nonconvex objectives. This paper presents convergence results for the stochastic proximal gradient method in Hilbert spaces, motivated by optimization problems with partial differential equation (PDE) constraints with random inputs and coefficients. We study stochastic algorithms for nonconvex and nonsmooth problems, where the nonsmooth part is convex and the nonconvex part is an expectation, which is assumed to have a Lipschitz continuous gradient. The optimization variable is an element of a Hilbert space. We show almost sure convergence of strong limit points of the random sequence generated by the algorithm to stationary points. We demonstrate the stochastic proximal gradient algorithm on a tracking-type functional with an $L^1$-penalty term constrained by a semilinear PDE and box constraints, where input terms and coefficients are subject to uncertainty. We verify conditions for ensuring convergence of the algorithm and show a simulation.
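As a finite-dimensional illustration (the paper works in a Hilbert space with PDE constraints), one iteration of a stochastic proximal gradient step with an $L^1$ penalty and box constraints might look as follows; the function names are assumptions, and the composition of soft-thresholding with clipping is exact because the prox separates per coordinate.

```python
import numpy as np

def stoch_prox_grad_step(x, stoch_grad, lam, step, lo, hi):
    """One sketched iteration for min E[f(x, xi)] + lam*||x||_1 over a box.

    `stoch_grad` is a single-sample estimate of the gradient of the smooth
    expectation part.  Because the prox of lam*|.| plus a box indicator
    separates per coordinate, soft-thresholding followed by clipping is exact.
    """
    y = x - step * stoch_grad                                 # stochastic gradient step
    y = np.sign(y) * np.maximum(np.abs(y) - step * lam, 0.0)  # soft-threshold
    return np.clip(y, lo, hi)                                 # project onto the box
```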


2020 ◽  
Vol 2020 (1) ◽  
Author(s):  
Peichao Duan ◽  
Yiqun Zhang ◽  
Qinxiong Bu

The proximal gradient method is a powerful tool for solving composite convex optimization problems. In this paper, we first propose inexact inertial acceleration methods based on the viscosity approximation and the proximal scaled gradient algorithm to accelerate convergence. Under reasonable parameter choices, we prove that our algorithms converge strongly to a solution of the problem, which is also the unique solution of a variational inequality problem. Second, we propose an inexact alternated inertial proximal point algorithm and prove a weak convergence theorem under suitable conditions. Finally, numerical results illustrate the performance of our algorithms and present a comparison with related algorithms. Our results improve and extend corresponding results reported recently by many authors.
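A generic inertial proximal gradient iteration, sketched below, conveys the basic acceleration mechanism; the paper's viscosity approximation, scaling, and inexactness model are not reproduced, and the fixed inertia weight `theta` is an illustrative assumption.

```python
import numpy as np

def inertial_prox_grad(grad_f, prox_g, x0, step, theta=0.9, n_iters=500):
    # Sketch: extrapolate with the previous iterate, then take a
    # forward-backward (gradient + prox) step from the extrapolated point.
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(n_iters):
        y = x + theta * (x - x_prev)              # inertial extrapolation
        x_prev, x = x, prox_g(y - step * grad_f(y), step)
    return x
```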


2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude on smooth convex objectives, has led to tremendous success in training machine learning models. In this paper, we theoretically study its strength for the convergence of individual iterates in general nonsmooth convex optimization, which we name individual convergence. We prove that Nesterov's extrapolation makes the individual convergence of projected gradient methods optimal for general convex problems, a challenging open problem in the machine learning community. In light of this, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step towards the open question about SGD posed by Shamir (2012). Furthermore, the derived algorithms are extended to solve regularized nonsmooth learning problems in stochastic settings. They can serve as an alternative to basic SGD, especially for machine learning problems where an individual output is needed to preserve the regularization structure while keeping an optimal rate of convergence. In particular, our method is an efficient tool for solving large-scale ℓ1-regularized hinge-loss learning problems. Several experiments on real data demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
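The scheme at issue can be sketched as a projected subgradient method with Nesterov-style extrapolation that returns the last ("individual") iterate rather than an average; the step-size and extrapolation schedules below are illustrative choices, not the schedules analyzed in the paper.

```python
import numpy as np

def nesterov_projected_subgrad(subgrad, project, x0, n_iters=1000, c=1.0):
    x_prev, x = x0.copy(), x0.copy()
    for k in range(1, n_iters + 1):
        beta = (k - 1) / (k + 2)                 # extrapolation weight (assumed)
        y = x + beta * (x - x_prev)
        step = c / np.sqrt(k)                    # O(1/sqrt(k)) step for nonsmooth f
        x_prev, x = x, project(y - step * subgrad(y))
    return x                                     # the individual (last) iterate
```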


2019 ◽  
Vol 35 (3) ◽  
pp. 371-378
Author(s):  
Porntip Promsinchai ◽  
Narin Petrot

In this paper, we consider convex constrained optimization problems with composite objective functions over the set of minimizers of another function. Our main aim is to numerically test a new algorithm, a stochastic block coordinate proximal-gradient algorithm with penalization, by comparing both the number of iterations and the CPU time against well-known block coordinate descent algorithms on randomly generated optimization problems with a regularization term.
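A minimal sketch of how such a step might look, under the assumption that the inner function h is smooth and enters through a fixed penalty weight (the paper may vary this weight along the iterations):

```python
import numpy as np

def block_prox_grad_penalized(grad_f, grad_h, x0, lam, step, beta,
                              n_blocks=10, n_iters=1000, seed=0):
    # Sketch for min f(x) + lam*||x||_1 over argmin h: the inner function h
    # enters through the penalty weight beta (an assumption; the paper may
    # vary it), and each iteration updates one uniformly sampled block.
    rng = np.random.default_rng(seed)
    x = x0.copy()
    blocks = np.array_split(np.arange(x.size), n_blocks)
    for _ in range(n_iters):
        b = blocks[rng.integers(n_blocks)]       # sampled coordinate block
        z = x[b] - step * (grad_f(x)[b] + beta * grad_h(x)[b])
        x[b] = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return x
```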


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Zhan Wang ◽  
Pengyuan Li ◽  
Xiangrong Li ◽  
Hongtruong Pham

Conjugate gradient methods are well-known methods widely applied in many practical fields, and the CD conjugate gradient method is one of the classical variants. In this paper, a modified three-term CD conjugate gradient algorithm is proposed with the following features: (i) a modified three-term CD conjugate gradient formula is presented; (ii) the algorithm possesses the sufficient descent property and the trust region property; (iii) the algorithm is globally convergent for general functions under the modified weak Wolfe–Powell (MWWP) line search technique and a projection technique. The new algorithm performs well in numerical experiments, showing that the modified three-term CD conjugate gradient method is more competitive than the classical CD conjugate gradient method.
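For reference, the classical CD (Conjugate Descent) direction update is sketched below; the paper's modified three-term formula adds a further correction term that is not reproduced here.

```python
import numpy as np

def cd_direction(g_new, g_old, d_old):
    # Classical CD update: d = -g + beta_CD * d_old with
    # beta_CD = ||g_new||^2 / (-d_old . g_old).  The paper's modified
    # three-term formula adds a third correction term not shown here.
    beta_cd = (g_new @ g_new) / (-(d_old @ g_old))
    return -g_new + beta_cd * d_old
```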


2019 ◽  
Vol 13 (04) ◽  
pp. 2050081
Author(s):  
Badreddine Sellami ◽  
Mohamed Chiheb Eddine Sellami

In this paper, we are concerned with conjugate gradient methods for solving unconstrained optimization problems. We propose a modified Fletcher–Reeves (FR) [Function minimization by conjugate gradients, Comput. J. 7 (1964) 149–154] conjugate gradient algorithm satisfying a parametrized sufficient descent condition with a parameter [Formula: see text]. The parameter [Formula: see text] is computed by means of the conjugacy condition, yielding an algorithm that is a positive multiplicative modification of the Hestenes–Stiefel (HS) [Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Standards Sec. B 48 (1952) 409–436] algorithm and that produces a descent search direction at every iteration at which the line search satisfies the Wolfe conditions. Under appropriate conditions, we show that the modified FR method with the strong Wolfe line search is globally convergent for uniformly convex functions. We also present extensive preliminary numerical experiments showing the efficiency of the proposed method.
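The classical FR iteration that the paper modifies is sketched below with SciPy's Wolfe line search; the parametrized sufficient-descent modification and the HS-type coefficient of the paper are not reproduced.

```python
import numpy as np
from scipy.optimize import line_search

def fr_conjugate_gradient(f, grad, x0, n_iters=200, tol=1e-8):
    # Sketch of the classical Fletcher-Reeves method with a (strong) Wolfe
    # line search; the paper's parametrized modification is not reproduced.
    x, g = x0.copy(), grad(x0)
    d = -g
    for _ in range(n_iters):
        if np.linalg.norm(g) < tol:
            break
        alpha = line_search(f, grad, x, d)[0]    # SciPy Wolfe line search
        if alpha is None:                        # line search failed: restart
            d, alpha = -g, 1e-4
        x = x + alpha * d
        g_new = grad(x)
        beta_fr = (g_new @ g_new) / (g @ g)      # Fletcher-Reeves coefficient
        d = -g_new + beta_fr * d
        g = g_new
    return x
```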


Author(s):  
A. V. Luita ◽  
S. O. Zhilina ◽  
V. V. Semenov

In this paper, problems of bi-level convex minimization in a Hilbert space are considered. The bi-level convex minimization problem is to minimize a first convex function over the set of minima of a second convex function. This setting has many applications, but the implicit constraints generated by the inner problem make it difficult to obtain optimality conditions and construct algorithms. Multilevel optimization problems are formulated in a similar way; they originate in operations research (optimization according to sequentially specified criteria, i.e., lexicographic optimization). Attention is focused on solving the problem using two proximal methods, and the main theoretical results are convergence theorems for these methods in various situations. The first method combines the penalty function method with the proximal method. Strong convergence is proved when the function of the outer problem is strongly convex; in the general case, only weak convergence is proved. The second, so-called proximal-gradient method combines a variant of the fast proximal-gradient algorithm with the method of penalty functions. Convergence rates and weak convergence of the proximal-gradient method are proved.
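A rough sketch of the penalty-plus-proximal idea, assuming the inner function h is smooth and the outer function f is handled through its prox; the growing penalty schedule is an illustrative assumption, not the schedule analyzed in the paper.

```python
import numpy as np

def bilevel_penalty_prox_grad(prox_f, grad_h, x0, step, n_iters=2000):
    # Sketch for min f(x) subject to x in argmin h: gradient steps on the
    # (smooth) inner function h, prox steps on the outer f, with a growing
    # penalty weight beta_k (the sqrt(k) schedule is assumed, not the paper's).
    x = x0.copy()
    for k in range(1, n_iters + 1):
        beta = np.sqrt(k)                        # increasing penalty weight
        t = step / beta                          # shrink prox step as beta grows
        x = prox_f(x - t * beta * grad_h(x), t)
    return x
```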



2020 ◽  
Vol 25 (4) ◽  
pp. 66
Author(s):  
Seifu Endris Yimer ◽  
Poom Kumam ◽  
Anteneh Getachew Gebrie

In this paper, we consider a bilevel optimization problem: finding an optimum of the upper-level problem subject to the solution set of a split feasibility problem defined by fixed-point problems and optimization problems. Based on proximal and gradient methods, we propose a strongly convergent iterative algorithm with an inertial effect for solving this bilevel optimization problem. Furthermore, we present a numerical example to illustrate the applicability of our algorithm.
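One iteration of such a scheme might be sketched as follows, where the operator T, the step size, and the inertia weight are assumptions standing in for the paper's exact construction:

```python
import numpy as np

def inertial_bilevel_step(x, x_prev, grad_F, T, step, theta):
    # One sketched iteration: inertial extrapolation, a gradient step on the
    # upper-level objective F, then an application of an operator T whose
    # fixed points encode the lower-level split feasibility constraints
    # (T, step, and theta are assumptions, not the paper's exact scheme).
    y = x + theta * (x - x_prev)
    return T(y - step * grad_F(y))
```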

