Asynchronous Delay-Aware Accelerated Proximal Coordinate Descent for Nonconvex Nonsmooth Problems

Author(s):  
Ehsan Kazemi ◽  
Liqiang Wang

Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for nonconvex and nonsmooth optimization problems with performance guarantees remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is still very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention as a means of solving large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this paper, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point computed with a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that every limit point is a critical point for both bounded and unbounded delays. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical speed of our algorithm.
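
A minimal, synchronous sketch of one idea from the abstract follows: at each coordinate update, compare the objective at the plain proximal point and at a momentum-extrapolated point, and keep whichever value is smaller. This is not the authors' asynchronous AAPCD; the toy l1-regularized least-squares problem, the fixed momentum `beta`, and the step sizes are illustrative assumptions.

```python
# Simplified illustration of accelerated proximal coordinate descent with an
# extrapolation/descent check; assumptions noted in the lead-in above.
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def toy_objective(A, b, lam, x):
    """Smooth least-squares term plus nonsmooth l1 term."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def accelerated_pcd(A, b, lam=0.1, iters=500, beta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    x_prev = np.zeros(n)
    L = np.sum(A ** 2, axis=0) + 1e-12       # per-coordinate Lipschitz constants
    for _ in range(iters):
        i = rng.integers(n)                   # random coordinate
        grad_i = A[:, i] @ (A @ x - b)
        # plain proximal coordinate step
        x_plain = x.copy()
        x_plain[i] = soft_threshold(x[i] - grad_i / L[i], lam / L[i])
        # extrapolated ("momentum") candidate built from the previous iterate
        y = x.copy()
        y[i] = x[i] + beta * (x[i] - x_prev[i])
        grad_y_i = A[:, i] @ (A @ y - b)
        x_accel = x.copy()
        x_accel[i] = soft_threshold(y[i] - grad_y_i / L[i], lam / L[i])
        # keep the candidate with the smaller objective (a crude stand-in for
        # the paper's delay-aware sufficient-descent comparison)
        x_prev = x
        if toy_objective(A, b, lam, x_accel) <= toy_objective(A, b, lam, x_plain):
            x = x_accel
        else:
            x = x_plain
    return x
```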

Author(s):  
O.B. Akinduko

In this paper, by linearly combining the numerator and denominator terms of the Dai-Liao (DL) and Bamigbola-Ali-Nwaeze (BAN) conjugate gradient methods (CGMs), a general form of the DL-BAN method is proposed. From this general form, a new hybrid CGM, which was found to possess the sufficient descent property, is generated. Numerical experiments were carried out on the new CGM in comparison with four existing CGMs, using a set of large-scale unconstrained optimization problems. The results showed superior performance of the new method over the majority of the existing methods.
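
The following sketch shows only the *structure* described above: a hybrid CG parameter built by linearly combining the numerator and denominator terms of two base formulas inside a standard CG loop. The DL terms are standard; the BAN terms are not reproduced here, so a well-known stand-in (PRP-style numerator, FR-style denominator) keeps the skeleton runnable. The mixing weight `mu`, the DL parameter `t`, the Armijo constants, and the test function are illustrative assumptions.

```python
# Hybrid-beta conjugate gradient skeleton; see the assumptions listed above.
import numpy as np

def hybrid_beta(g_new, g_old, d_old, s_old, mu=0.5, t=0.1):
    y = g_new - g_old
    # Dai-Liao numerator / denominator
    num_dl = g_new @ y - t * (g_new @ s_old)
    den_dl = d_old @ y
    # Stand-in second formula (replace with the BAN terms to get the actual method)
    num_2 = g_new @ y
    den_2 = g_old @ g_old
    num = mu * num_dl + (1.0 - mu) * num_2
    den = mu * den_dl + (1.0 - mu) * den_2
    return num / den if abs(den) > 1e-12 else 0.0

def cg_descent(f, grad, x0, iters=200, alpha0=1.0, c1=1e-4, rho=0.5):
    x = x0.astype(float)
    g = grad(x)
    d = -g
    for _ in range(iters):
        # backtracking Armijo line search
        alpha = alpha0
        while f(x + alpha * d) > f(x) + c1 * alpha * (g @ d) and alpha > 1e-12:
            alpha *= rho
        s = alpha * d
        x_new = x + s
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < 1e-8:
            return x_new
        beta = hybrid_beta(g_new, g, d, s)
        d = -g_new + beta * d
        if g_new @ d >= 0:           # safeguard: fall back to steepest descent
            d = -g_new
        x, g = x_new, g_new
    return x

# usage on a small smooth test function
f = lambda x: 0.5 * x @ x + 0.25 * np.sum(x ** 4)
grad = lambda x: x + x ** 3
x_star = cg_descent(f, grad, np.ones(5))
```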


2021 ◽  
Vol 6 (10) ◽  
pp. 10742-10764
Author(s):  
Ibtisam A. Masmali ◽  
◽  
Zabidin Salleh ◽  
Ahmad Alhawarat ◽  
◽  
...  

The conjugate gradient (CG) method is a method for solving unconstrained optimization problems. Moreover, CG methods can be applied in medical science, industry, neural networks, and many other fields. In this paper, a new three-term CG method is proposed. The new CG formula is constructed from the DL and WYL CG formulas so that it is non-negative and inherits the properties of the HS formula. The new modification satisfies the convergence properties and the sufficient descent property. The numerical results show that the new modification is more efficient than the DL, WYL, and CG-Descent formulas. We use more than 200 functions from the CUTEst library to compare these methods in terms of the number of iterations, function evaluations, gradient evaluations, and CPU time.
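
For readers unfamiliar with the three-term structure mentioned above, here is a minimal sketch of the classical three-term HS construction. It is a generic scaffold, not the paper's new formula (whose exact coefficients are not reproduced here); it is included because this construction makes the sufficient descent property hold by design.

```python
# Classical three-term HS-type search direction; a generic scaffold only.
import numpy as np

def three_term_direction(g_new, g_old, d_old):
    y = g_new - g_old
    den = d_old @ y
    if abs(den) < 1e-12:
        return -g_new                          # restart with steepest descent
    beta = (g_new @ y) / den                   # HS parameter
    theta = (g_new @ d_old) / den              # third-term coefficient
    d = -g_new + beta * d_old - theta * y
    # by construction g_new @ d == -||g_new||^2, i.e. sufficient descent holds
    return d
```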


Algorithms ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 146
Author(s):  
Aleksei Vakhnin ◽  
Evgenii Sopov

Modern real-valued optimization problems are complex and high-dimensional; they are known as “large-scale global optimization (LSGO)” problems. Classic evolutionary algorithms (EAs) perform poorly on this class of problems because of the curse of dimensionality. Cooperative Coevolution (CC) is a high-performing framework for decomposing large-scale problems into smaller and easier subproblems by grouping the objective variables. The efficiency of CC strongly depends on the group size and the grouping approach. In this study, an improved CC (iCC) approach for solving LSGO problems is proposed and investigated. iCC changes the number of variables in the subcomponents dynamically during the optimization process. The SHADE algorithm is used as the subcomponent optimizer. We have investigated the performance of iCC-SHADE and CC-SHADE on fifteen problems from the LSGO CEC'13 benchmark set provided by the IEEE Congress on Evolutionary Computation. The results of numerical experiments show that iCC-SHADE outperforms, on average, CC-SHADE with a fixed number of subcomponents. We have also compared iCC-SHADE with some state-of-the-art LSGO metaheuristics. The experimental results show that the proposed algorithm is competitive with other efficient metaheuristics.
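
A minimal sketch of the cooperative-coevolution loop described above: the decision variables are split into groups, and each group is optimized in turn inside a shared "context" vector. Simple random search stands in for the SHADE optimizer, and the dynamic regrouping rule (halving the group size each cycle) is an assumption for illustration, not the iCC rule from the paper.

```python
# Cooperative coevolution skeleton with a stand-in subcomponent optimizer.
import numpy as np

def cc_optimize(f, dim, group_size=100, cycles=20, evals_per_group=200, seed=0):
    rng = np.random.default_rng(seed)
    context = rng.uniform(-5, 5, dim)            # best-known full solution
    best = f(context)
    for _ in range(cycles):
        groups = np.array_split(rng.permutation(dim), max(dim // group_size, 1))
        for idx in groups:
            for _ in range(evals_per_group):
                trial = context.copy()
                # perturb only this subcomponent (random search as a stand-in for SHADE)
                trial[idx] += rng.normal(0, 0.5, idx.size)
                val = f(trial)
                if val < best:
                    best, context = val, trial
        group_size = max(group_size // 2, 10)     # assumed dynamic grouping rule
    return context, best

# usage: 1000-dimensional sphere function
sol, val = cc_optimize(lambda x: float(np.sum(x ** 2)), dim=1000)
```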


2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Min Sun ◽  
Jing Liu

Recently, Zhang et al. proposed a sufficient descent Polak-Ribière-Polyak (SDPRP) conjugate gradient method for large-scale unconstrained optimization problems and proved its global convergence in the sense that $\liminf_{k\to\infty}\|\nabla f(x_k)\|=0$ when an Armijo-type line search is used. In this paper, motivated by the line searches proposed by Shi et al. and Zhang et al., we propose two new Armijo-type line searches and show that the SDPRP method has strong convergence in the sense that $\lim_{k\to\infty}\|\nabla f(x_k)\|=0$ under the two new line searches. Numerical results are reported to show the efficiency of SDPRP with the new Armijo-type line searches in practical computation.
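
A hedged sketch of an Armijo-type backtracking rule of the kind discussed above: the usual Armijo decrease condition strengthened with a quadratic term in the step length. The constants and the extra quadratic term are illustrative assumptions, not the specific line searches proposed in the paper.

```python
# One representative Armijo-type condition with an extra quadratic penalty term.
import numpy as np

def armijo_type_step(f, x, d, g, alpha0=1.0, rho=0.5, delta=1e-4, sigma=1e-3,
                     max_backtracks=50):
    """Return alpha satisfying
       f(x + alpha d) <= f(x) + delta*alpha*(g.d) - sigma*alpha**2*||d||^2."""
    fx = f(x)
    gd = g @ d          # must be negative, i.e. d is a descent direction
    dd = d @ d
    alpha = alpha0
    for _ in range(max_backtracks):
        if f(x + alpha * d) <= fx + delta * alpha * gd - sigma * alpha ** 2 * dd:
            return alpha
        alpha *= rho
    return alpha
```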


2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude when dealing with smooth convex objectives, has led to tremendous success in training machine learning models. In this paper, we theoretically study its strength for the convergence of individual iterates in general non-smooth convex optimization problems, which we name \textit{individual convergence}. We prove that Nesterov's extrapolation is capable of making the individual convergence of projected gradient methods optimal for general convex problems, which is currently a challenging problem in the machine learning community. In light of this consideration, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step towards the open question about SGD posed by Shamir \cite{shamir2012open}. Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD, especially for machine learning problems where an individual output is needed to preserve the regularization structure while keeping an optimal rate of convergence. In particular, our method is applicable as an efficient tool for solving large-scale $l_1$-regularized hinge-loss learning problems. Several real experiments demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
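
A minimal sketch of the kind of method discussed above: a projected (sub)gradient step combined with a Nesterov-style extrapolation, returning the last iterate rather than an average. The feasible set, step-size schedule, and momentum weights below are common textbook choices, not the ones analyzed in the paper.

```python
# Projected subgradient method with Nesterov-style extrapolation (sketch).
import numpy as np

def project_l2_ball(z, radius=10.0):
    nrm = np.linalg.norm(z)
    return z if nrm <= radius else z * (radius / nrm)

def nesterov_projected_subgradient(subgrad, x0, iters=1000):
    x = x_prev = np.asarray(x0, dtype=float)
    for k in range(1, iters + 1):
        momentum = (k - 1) / (k + 2)                  # Nesterov-style weight
        y = x + momentum * (x - x_prev)               # extrapolated point
        step = 1.0 / np.sqrt(k)                       # diminishing step size
        x_prev = x
        x = project_l2_ball(y - step * subgrad(y))    # projected subgradient step
    return x                                          # individual (last) iterate

# usage: minimize the nonsmooth function ||x||_1 over an l2 ball
sol = nesterov_projected_subgradient(np.sign, x0=np.full(5, 3.0))
```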


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 758
Author(s):  
Andrea Ferigo ◽  
Giovanni Iacca

The ever-increasing complexity of industrial and engineering problems nowadays poses a number of optimization problems characterized by thousands, if not millions, of variables. For instance, very large-scale problems can be found in chemical and material engineering, networked systems, logistics, and scheduling. Recently, Deb and Myburgh proposed an evolutionary algorithm capable of handling a scheduling optimization problem with a staggering number of variables: one billion. However, one important limitation of this algorithm is its memory consumption, which is on the order of 120 GB. Here, we follow up on this research by applying to the same problem a GPU-enabled “compact” Genetic Algorithm, i.e., an Estimation of Distribution Algorithm that, instead of using an actual population of candidate solutions, only requires and adapts a probabilistic model of their distribution in the search space. We also introduce a smart initialization technique and custom operators to guide the search towards feasible solutions. Leveraging the compact optimization concept, we show how such an algorithm can efficiently optimize very large-scale problems with millions of variables, with limited memory and processing power. To complete our analysis, we report the results of the algorithm on very large-scale instances of the OneMax problem.
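
A minimal CPU sketch of the "compact" Genetic Algorithm idea described above: instead of storing a population, keep one probability per bit and shift it towards the winner of a two-individual tournament. The virtual population size and the OneMax objective follow the usual cGA description; the GPU kernels, smart initialization, and custom operators from the paper are not reproduced.

```python
# Compact GA (cGA) on OneMax: a probability vector replaces the population.
import numpy as np

def compact_ga_onemax(n_bits=1000, virtual_pop=50, max_iters=200_000, seed=0):
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                       # probabilistic model of the population
    for _ in range(max_iters):
        a = (rng.random(n_bits) < p).astype(int)   # sample two candidates
        b = (rng.random(n_bits) < p).astype(int)
        if a.sum() < b.sum():                      # OneMax: more ones is better
            a, b = b, a
        # move the model towards the winner where the two candidates disagree
        p += (a - b) / virtual_pop
        np.clip(p, 0.0, 1.0, out=p)
        if p.min() > 0.99:                         # model has converged to all ones
            break
    return (p > 0.5).astype(int)

best = compact_ga_onemax()
```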


Author(s):  
Ion Necoara ◽  
Martin Takáč

In this paper we consider large-scale smooth optimization problems with multiple linear coupled constraints. Due to the non-separability of the constraints, arbitrary random sketching is not guaranteed to work. Thus, we first investigate necessary and sufficient conditions on the sketch sampling for the algorithms to be well defined. Based on these sampling conditions we develop new sketch descent methods for solving general smooth linearly constrained problems, in particular, random sketch descent (RSD) and accelerated random sketch descent (A-RSD) methods. To our knowledge, this is the first convergence analysis of RSD algorithms for optimization problems with multiple non-separable linear constraints. For the general case, when the objective function is smooth and non-convex, we prove a sublinear rate in expectation for the non-accelerated variant with respect to an appropriate optimality measure. In the smooth convex case, we derive sublinear convergence rates in the expected values of the objective function for both algorithms, RSD and A-RSD. Additionally, if the objective function satisfies a strong-convexity-type condition, both algorithms converge linearly in expectation. In special cases where complexity bounds are known for particular sketching algorithms, such as coordinate descent methods for optimization problems with a single linear coupled constraint, our theory recovers the best known bounds. Finally, we present several numerical examples to illustrate the performance of the new algorithms.
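
A hedged illustration of the simplest special case mentioned above: coordinate descent for a smooth objective under a single coupled constraint $\sum_i x_i = b$. Updating one coordinate alone would break the constraint, so each step picks a pair of coordinates and moves them in opposite directions along the constraint's null space. The objective, the fixed step size, and the pairwise sampling are illustrative assumptions; the paper's general sketching operators are not reproduced.

```python
# Pairwise coordinate descent preserving a single linear coupled constraint.
import numpy as np

def pairwise_cd(grad, x0, iters=5000, step=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    n = x.size
    for _ in range(iters):
        i, j = rng.choice(n, size=2, replace=False)
        g = grad(x)
        # direction e_i - e_j lies in the null space of the constraint sum(x) = b
        delta = step * (g[i] - g[j])
        x[i] -= delta
        x[j] += delta
    return x

# usage: minimize 0.5*||x - c||^2 subject to sum(x) = sum(x0) = 0
c = np.array([1.0, 2.0, 3.0, 4.0])
x = pairwise_cd(lambda x: x - c, x0=np.zeros(4))
# x approximates the projection of c onto the hyperplane sum(x) = 0
```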


2014 ◽  
Vol 19 (4) ◽  
pp. 469-490 ◽  
Author(s):  
Hamid Esmaeili ◽  
Morteza Kimiaei

In this study, we propose a trust-region-based procedure for solving unconstrained optimization problems that takes advantage of the nonmonotone technique to introduce an efficient adaptive radius strategy. In our approach, the adaptive technique decreases the total number of iterations, while the structure of the nonmonotone formula helps us handle large-scale problems. The new algorithm preserves global convergence and has quadratic convergence under suitable conditions. Preliminary numerical experiments on standard test problems indicate the efficiency and robustness of the proposed approach for solving unconstrained optimization problems.
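
A compact sketch of a nonmonotone trust-region loop in the spirit described above: the acceptance ratio compares the predicted reduction against the maximum of the last few function values instead of only the current one, and the radius is adapted from that ratio. The Cauchy-point subproblem solver, the memory length, and the radius-update constants are generic textbook choices, not the adaptive rule proposed in the paper.

```python
# Nonmonotone trust-region loop with a Cauchy-point subproblem solver (sketch).
import numpy as np

def cauchy_point(g, B, radius):
    """Minimize the quadratic model along -g inside the trust region."""
    gBg = g @ B @ g
    gnorm = np.linalg.norm(g)
    tau = 1.0 if gBg <= 0 else min(gnorm ** 3 / (radius * gBg), 1.0)
    return -tau * (radius / gnorm) * g

def nonmonotone_trust_region(f, grad, hess, x0, radius=1.0, memory=5,
                             iters=100, eta=0.1, tol=1e-8):
    x = np.asarray(x0, dtype=float)
    history = [f(x)]                               # recent accepted function values
    for _ in range(iters):
        g, B = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        s = cauchy_point(g, B, radius)
        f_ref = max(history[-memory:])             # nonmonotone reference value
        pred = -(g @ s + 0.5 * s @ B @ s)          # predicted reduction
        ared = f_ref - f(x + s)                    # nonmonotone actual reduction
        rho = ared / pred if pred > 0 else -1.0
        if rho >= eta:                             # accept the step
            x = x + s
            history.append(f(x))
        radius *= 2.0 if rho > 0.75 else (0.5 if rho < 0.25 else 1.0)
    return x

# usage: 2D Rosenbrock function
f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                           200 * (x[1] - x[0] ** 2)])
hess = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0] ** 2), -400 * x[0]],
                           [-400 * x[0], 200.0]])
x_star = nonmonotone_trust_region(f, grad, hess, np.array([-1.2, 1.0]), iters=500)
```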

