Asynchronous Delay-Aware Accelerated Proximal Coordinate Descent for Nonconvex Nonsmooth Problems

Author(s):  
Ehsan Kazemi ◽  
Liqiang Wang

Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for nonconvex and nonsmooth optimization problems with performance guarantees remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is still very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention as a means of solving large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this paper, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point computed with a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that every limit point is a critical point for both bounded and unbounded delays. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical speed of our algorithm.
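
A minimal, synchronous sketch of one idea from the abstract follows: at each coordinate update, compare the objective at the plain proximal point and at a momentum-extrapolated point, and keep whichever value is smaller. This is not the authors' asynchronous AAPCD; the toy l1-regularized least-squares problem, the fixed momentum `beta`, and the step sizes are illustrative assumptions.

```python
# Simplified illustration of accelerated proximal coordinate descent with an
# extrapolation/descent check; assumptions noted in the lead-in above.
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def toy_objective(A, b, lam, x):
    """Smooth least-squares term plus nonsmooth l1 term."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def accelerated_pcd(A, b, lam=0.1, iters=500, beta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    x_prev = np.zeros(n)
    L = np.sum(A ** 2, axis=0) + 1e-12       # per-coordinate Lipschitz constants
    for _ in range(iters):
        i = rng.integers(n)                   # random coordinate
        grad_i = A[:, i] @ (A @ x - b)
        # plain proximal coordinate step
        x_plain = x.copy()
        x_plain[i] = soft_threshold(x[i] - grad_i / L[i], lam / L[i])
        # extrapolated ("momentum") candidate built from the previous iterate
        y = x.copy()
        y[i] = x[i] + beta * (x[i] - x_prev[i])
        grad_y_i = A[:, i] @ (A @ y - b)
        x_accel = x.copy()
        x_accel[i] = soft_threshold(y[i] - grad_y_i / L[i], lam / L[i])
        # keep the candidate with the smaller objective (a crude stand-in for
        # the paper's delay-aware sufficient-descent comparison)
        x_prev = x
        if toy_objective(A, b, lam, x_accel) <= toy_objective(A, b, lam, x_plain):
            x = x_accel
        else:
            x = x_plain
    return x
```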

Author(s):  
O.B. Akinduko

In this paper, by linearly combining the numerator and denominator terms of the Dai-Liao (DL) and Bamigbola-Ali-Nwaeze (BAN) conjugate gradient methods (CGMs), a general form of the DL-BAN method is proposed. From this general form, a new hybrid CGM, which was found to possess the sufficient descent property, is generated. Numerical experiments were carried out on the new CGM in comparison with four existing CGMs, using a set of large-scale unconstrained optimization problems. The results showed superior performance of the new method over the majority of the existing methods.
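
The following sketch shows only the *structure* described above: a hybrid CG parameter built by linearly combining the numerator and denominator terms of two base formulas inside a standard CG loop. The DL terms are standard; the BAN terms are not reproduced here, so a well-known stand-in (PRP-style numerator, FR-style denominator) keeps the skeleton runnable. The mixing weight `mu`, the DL parameter `t`, the Armijo constants, and the test function are illustrative assumptions.

```python
# Hybrid-beta conjugate gradient skeleton; see the assumptions listed above.
import numpy as np

def hybrid_beta(g_new, g_old, d_old, s_old, mu=0.5, t=0.1):
    y = g_new - g_old
    # Dai-Liao numerator / denominator
    num_dl = g_new @ y - t * (g_new @ s_old)
    den_dl = d_old @ y
    # Stand-in second formula (replace with the BAN terms to get the actual method)
    num_2 = g_new @ y
    den_2 = g_old @ g_old
    num = mu * num_dl + (1.0 - mu) * num_2
    den = mu * den_dl + (1.0 - mu) * den_2
    return num / den if abs(den) > 1e-12 else 0.0

def cg_descent(f, grad, x0, iters=200, alpha0=1.0, c1=1e-4, rho=0.5):
    x = x0.astype(float)
    g = grad(x)
    d = -g
    for _ in range(iters):
        # backtracking Armijo line search
        alpha = alpha0
        while f(x + alpha * d) > f(x) + c1 * alpha * (g @ d) and alpha > 1e-12:
            alpha *= rho
        s = alpha * d
        x_new = x + s
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < 1e-8:
            return x_new
        beta = hybrid_beta(g_new, g, d, s)
        d = -g_new + beta * d
        if g_new @ d >= 0:           # safeguard: fall back to steepest descent
            d = -g_new
        x, g = x_new, g_new
    return x

# usage on a small smooth test function
f = lambda x: 0.5 * x @ x + 0.25 * np.sum(x ** 4)
grad = lambda x: x + x ** 3
x_star = cg_descent(f, grad, np.ones(5))
```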


2021 ◽  
Vol 6 (10) ◽  
pp. 10742-10764
Author(s):  
Ibtisam A. Masmali ◽  
◽  
Zabidin Salleh ◽  
Ahmad Alhawarat ◽  
◽  
...  

The conjugate gradient (CG) method is a method for solving unconstrained optimization problems. Moreover, CG methods can be applied in medical science, industry, neural networks, and many other fields. In this paper, a new three-term CG method is proposed. The new CG formula is constructed from the DL and WYL CG formulas so that it is non-negative and inherits the properties of the HS formula. The new modification satisfies the convergence properties and the sufficient descent property. The numerical results show that the new modification is more efficient than the DL, WYL, and CG-Descent formulas. We use more than 200 functions from the CUTEst library to compare these methods in terms of the number of iterations, function evaluations, gradient evaluations, and CPU time.
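
For readers unfamiliar with the three-term structure mentioned above, here is a minimal sketch of the classical three-term HS construction. It is a generic scaffold, not the paper's new formula (whose exact coefficients are not reproduced here); it is included because this construction makes the sufficient descent property hold by design.

```python
# Classical three-term HS-type search direction; a generic scaffold only.
import numpy as np

def three_term_direction(g_new, g_old, d_old):
    y = g_new - g_old
    den = d_old @ y
    if abs(den) < 1e-12:
        return -g_new                          # restart with steepest descent
    beta = (g_new @ y) / den                   # HS parameter
    theta = (g_new @ d_old) / den              # third-term coefficient
    d = -g_new + beta * d_old - theta * y
    # by construction g_new @ d == -||g_new||^2, i.e. sufficient descent holds
    return d
```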


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 146
Author(s):  
Aleksei Vakhnin ◽  
Evgenii Sopov

Modern real-valued optimization problems are complex and high-dimensional; they are known as “large-scale global optimization (LSGO)” problems. Classic evolutionary algorithms (EAs) perform poorly on this class of problems because of the curse of dimensionality. Cooperative Coevolution (CC) is a high-performing framework for decomposing large-scale problems into smaller and easier subproblems by grouping the objective variables. The efficiency of CC strongly depends on the group size and the grouping approach. In this study, an improved CC (iCC) approach for solving LSGO problems is proposed and investigated. iCC changes the number of variables in the subcomponents dynamically during the optimization process. The SHADE algorithm is used as the subcomponent optimizer. We have investigated the performance of iCC-SHADE and CC-SHADE on fifteen problems from the LSGO CEC'13 benchmark set provided by the IEEE Congress on Evolutionary Computation. The results of numerical experiments show that iCC-SHADE outperforms, on average, CC-SHADE with a fixed number of subcomponents. We have also compared iCC-SHADE with some state-of-the-art LSGO metaheuristics. The experimental results show that the proposed algorithm is competitive with other efficient metaheuristics.
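
A minimal sketch of the cooperative-coevolution loop described above: the decision variables are split into groups, and each group is optimized in turn inside a shared "context" vector. Simple random search stands in for the SHADE optimizer, and the dynamic regrouping rule (halving the group size each cycle) is an assumption for illustration, not the iCC rule from the paper.

```python
# Cooperative coevolution skeleton with a stand-in subcomponent optimizer.
import numpy as np

def cc_optimize(f, dim, group_size=100, cycles=20, evals_per_group=200, seed=0):
    rng = np.random.default_rng(seed)
    context = rng.uniform(-5, 5, dim)            # best-known full solution
    best = f(context)
    for _ in range(cycles):
        groups = np.array_split(rng.permutation(dim), max(dim // group_size, 1))
        for idx in groups:
            for _ in range(evals_per_group):
                trial = context.copy()
                # perturb only this subcomponent (random search as a stand-in for SHADE)
                trial[idx] += rng.normal(0, 0.5, idx.size)
                val = f(trial)
                if val < best:
                    best, context = val, trial
        group_size = max(group_size // 2, 10)     # assumed dynamic grouping rule
    return context, best

# usage: 1000-dimensional sphere function
sol, val = cc_optimize(lambda x: float(np.sum(x ** 2)), dim=1000)
```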


2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Min Sun ◽  
Jing Liu

Recently, Zhang et al. proposed a sufficient descent Polak-Ribière-Polyak (SDPRP) conjugate gradient method for large-scale unconstrained optimization problems and proved its global convergence in the sense that $\liminf_{k\to\infty}\|\nabla f(x_k)\|=0$ when an Armijo-type line search is used. In this paper, motivated by the line searches proposed by Shi et al. and Zhang et al., we propose two new Armijo-type line searches and show that the SDPRP method has strong convergence in the sense that $\lim_{k\to\infty}\|\nabla f(x_k)\|=0$ under the two new line searches. Numerical results are reported to show the efficiency of SDPRP with the new Armijo-type line searches in practical computation.
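
A hedged sketch of an Armijo-type backtracking rule of the kind discussed above: the usual Armijo decrease condition strengthened with a quadratic term in the step length. The constants and the extra quadratic term are illustrative assumptions, not the specific line searches proposed in the paper.

```python
# One representative Armijo-type condition with an extra quadratic penalty term.
import numpy as np

def armijo_type_step(f, x, d, g, alpha0=1.0, rho=0.5, delta=1e-4, sigma=1e-3,
                     max_backtracks=50):
    """Return alpha satisfying
       f(x + alpha d) <= f(x) + delta*alpha*(g.d) - sigma*alpha**2*||d||^2."""
    fx = f(x)
    gd = g @ d          # must be negative, i.e. d is a descent direction
    dd = d @ d
    alpha = alpha0
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + delta * alpha * gd - sigma * alpha ** 2 * dd:
            return alpha
        alpha *= rho
    return alpha
```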


2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude when dealing with smooth convex objectives, has led to tremendous success in training machine learning models. In this paper, we theoretically study its strength for the convergence of individual iterates in general non-smooth convex optimization problems, which we name \textit{individual convergence}. We prove that Nesterov's extrapolation is capable of making the individual convergence of projected gradient methods optimal for general convex problems, which is currently a challenging problem in the machine learning community. In light of this consideration, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step towards the open question about SGD posed by Shamir \cite{shamir2012open}. Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD, especially for machine learning problems where an individual output is needed to preserve the regularization structure while keeping an optimal rate of convergence. In particular, our method is applicable as an efficient tool for solving large-scale $l_1$-regularized hinge-loss learning problems. Several real experiments demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
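
A minimal sketch of the kind of method discussed above: a projected (sub)gradient step combined with a Nesterov-style extrapolation, returning the last iterate rather than an average. The feasible set, step-size schedule, and momentum weights below are common textbook choices, not the ones analyzed in the paper.

```python
# Projected subgradient method with Nesterov-style extrapolation (sketch).
import numpy as np

def project_l2_ball(z, radius=10.0):
    nrm = np.linalg.norm(z)
    return z if nrm <= radius else z * (radius / nrm)

def nesterov_projected_subgradient(subgrad, x0, iters=1000):
    x = x_prev = np.asarray(x0, dtype=float)
    for k in range(1, iters + 1):
        momentum = (k - 1) / (k + 2)                  # Nesterov-style weight
        y = x + momentum * (x - x_prev)               # extrapolated point
        step = 1.0 / np.sqrt(k)                       # diminishing step size
        x_prev = x
        x = project_l2_ball(y - step * subgrad(y))    # projected subgradient step
    return x                                          # individual (last) iterate

# usage: minimize the nonsmooth function ||x||_1 over an l2 ball
sol = nesterov_projected_subgradient(np.sign, x0=np.full(5, 3.0))
```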


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 758
Author(s):  
Andrea Ferigo ◽  
Giovanni Iacca

The ever-increasing complexity of industrial and engineering problems nowadays poses a number of optimization problems characterized by thousands, if not millions, of variables. For instance, very large-scale problems can be found in chemical and material engineering, networked systems, logistics, and scheduling. Recently, Deb and Myburgh proposed an evolutionary algorithm capable of handling a scheduling optimization problem with a staggering number of variables: one billion. However, one important limitation of this algorithm is its memory consumption, which is on the order of 120 GB. Here, we follow up on this research by applying to the same problem a GPU-enabled “compact” Genetic Algorithm, i.e., an Estimation of Distribution Algorithm that, instead of using an actual population of candidate solutions, only requires and adapts a probabilistic model of their distribution in the search space. We also introduce a smart initialization technique and custom operators to guide the search towards feasible solutions. Leveraging the compact optimization concept, we show how such an algorithm can efficiently optimize very large-scale problems with millions of variables, with limited memory and processing power. To complete our analysis, we report the results of the algorithm on very large-scale instances of the OneMax problem.
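
A minimal CPU sketch of the "compact" Genetic Algorithm idea described above: instead of storing a population, keep one probability per bit and shift it towards the winner of a two-individual tournament. The virtual population size and the OneMax objective follow the usual cGA description; the GPU kernels, smart initialization, and custom operators from the paper are not reproduced.

```python
# Compact GA (cGA) on OneMax: a probability vector replaces the population.
import numpy as np

def compact_ga_onemax(n_bits=1000, virtual_pop=50, max_iters=200_000, seed=0):
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                       # probabilistic model of the population
    for _ in range(max_iters):
        a = (rng.random(n_bits) < p).astype(int)   # sample two candidates
        b = (rng.random(n_bits) < p).astype(int)
        if a.sum() < b.sum():                      # OneMax: more ones is better
            a, b = b, a
        # move the model towards the winner where the two candidates disagree
        p += (a - b) / virtual_pop
        np.clip(p, 0.0, 1.0, out=p)
        if p.min() > 0.99:                         # model has converged to all ones
            break
    return (p > 0.5).astype(int)

best = compact_ga_onemax()
```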


Author(s):  
Ion Necoara ◽  
Martin Takáč

In this paper we consider large-scale smooth optimization problems with multiple linear coupled constraints. Due to the non-separability of the constraints, arbitrary random sketching is not guaranteed to work. Thus, we first investigate necessary and sufficient conditions on the sketch sampling for the algorithms to be well defined. Based on these sampling conditions we develop new sketch descent methods for solving general smooth linearly constrained problems, in particular, random sketch descent (RSD) and accelerated random sketch descent (A-RSD) methods. To our knowledge, this is the first convergence analysis of RSD algorithms for optimization problems with multiple non-separable linear constraints. For the general case, when the objective function is smooth and non-convex, we prove a sublinear rate in expectation for the non-accelerated variant with respect to an appropriate optimality measure. In the smooth convex case, we derive sublinear convergence rates in the expected values of the objective function for both algorithms, RSD and A-RSD. Additionally, if the objective function satisfies a strong-convexity-type condition, both algorithms converge linearly in expectation. In special cases where complexity bounds are known for particular sketching algorithms, such as coordinate descent methods for optimization problems with a single linear coupled constraint, our theory recovers the best known bounds. Finally, we present several numerical examples to illustrate the performance of the new algorithms.
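
A hedged illustration of the simplest special case mentioned above: coordinate descent for a smooth objective under a single coupled constraint $\sum_i x_i = b$. Updating one coordinate alone would break the constraint, so each step picks a pair of coordinates and moves them in opposite directions along the constraint's null space. The objective, the fixed step size, and the pairwise sampling are illustrative assumptions; the paper's general sketching operators are not reproduced.

```python
# Pairwise coordinate descent preserving a single linear coupled constraint.
import numpy as np

def pairwise_cd(grad, x0, iters=5000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    for _ in range(iters):
        i, j = rng.choice(n, size=2, replace=False)
        g = grad(x)
        # direction e_i - e_j lies in the null space of the constraint sum(x) = b
        delta = step * (g[i] - g[j])
        x[i] -= delta
        x[j] += delta
    return x

# usage: minimize 0.5*||x - c||^2 subject to sum(x) = sum(x0) = 0
c = np.array([1.0, 2.0, 3.0, 4.0])
x = pairwise_cd(lambda x: x - c, x0=np.zeros(4))
# x approximates the projection of c onto the hyperplane sum(x) = 0
```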


2014 ◽  
Vol 19 (4) ◽  
pp. 469-490 ◽  
Author(s):  
Hamid Esmaeili ◽  
Morteza Kimiaei

In this study, we propose a trust-region-based procedure for solving unconstrained optimization problems that takes advantage of the nonmonotone technique to introduce an efficient adaptive radius strategy. In our approach, the adaptive technique decreases the total number of iterations, while the structure of the nonmonotone formula helps us handle large-scale problems. The new algorithm preserves global convergence and has quadratic convergence under suitable conditions. Preliminary numerical experiments on standard test problems indicate the efficiency and robustness of the proposed approach for solving unconstrained optimization problems.
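
A compact sketch of a nonmonotone trust-region loop in the spirit described above: the acceptance ratio compares the predicted reduction against the maximum of the last few function values instead of only the current one, and the radius is adapted from that ratio. The Cauchy-point subproblem solver, the memory length, and the radius-update constants are generic textbook choices, not the adaptive rule proposed in the paper.

```python
# Nonmonotone trust-region loop with a Cauchy-point subproblem solver (sketch).
import numpy as np

def cauchy_point(g, B, radius):
    """Minimize the quadratic model along -g inside the trust region."""
    gBg = g @ B @ g
    gnorm = np.linalg.norm(g)
    tau = 1.0 if gBg <= 0 else min(gnorm ** 3 / (radius * gBg), 1.0)
    return -tau * (radius / gnorm) * g

def nonmonotone_trust_region(f, grad, hess, x0, radius=1.0, memory=5,
                             iters=100, eta=0.1, tol=1e-8):
    x = np.asarray(x0, dtype=float)
    history = [f(x)]                               # recent accepted function values
    for _ in range(iters):
        g, B = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        s = cauchy_point(g, B, radius)
        f_ref = max(history[-memory:])             # nonmonotone reference value
        pred = -(g @ s + 0.5 * s @ B @ s)          # predicted reduction
        ared = f_ref - f(x + s)                    # nonmonotone actual reduction
        rho = ared / pred if pred > 0 else -1.0
        if rho >= eta:                             # accept the step
            x = x + s
            history.append(f(x))
        radius *= 2.0 if rho > 0.75 else (0.5 if rho < 0.25 else 1.0)
    return x

# usage: 2D Rosenbrock function
f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                           200 * (x[1] - x[0] ** 2)])
hess = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0] ** 2), -400 * x[0]],
                           [-400 * x[0], 200.0]])
x_star = nonmonotone_trust_region(f, grad, hess, np.array([-1.2, 1.0]), iters=500)
```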

