Katyusha Acceleration for Convex Finite-Sum Compositional Optimization

Author(s):  
Yibo Xu ◽  
Yangyang Xu

Structured optimization problems arise in many applications. To solve these problems efficiently, it is important to leverage the structural information in the algorithmic design. This paper focuses on convex problems with a finite-sum compositional structure. Finite-sum problems appear as the sample average approximation of a stochastic optimization problem and also arise in machine learning with a huge amount of training data. One popular numerical approach for finite-sum problems is the stochastic gradient method (SGM). However, the additional compositional structure prohibits easy access to an unbiased stochastic approximation of the gradient, so directly applying the SGM to a finite-sum compositional optimization problem (COP) is often inefficient. We design new algorithms for solving strongly convex and also convex two-level finite-sum COPs. Our design incorporates the Katyusha acceleration technique and adopts mini-batch sampling from both the outer-level and inner-level finite sums. We first analyze the algorithm for strongly convex finite-sum COPs. Similar to a few existing works, we obtain a linear convergence rate in terms of the expected objective error; from the convergence rate result, we then establish complexity results for the algorithm to produce an ε-solution. Our complexity results have the same dependence on the number of component functions as existing works. However, because of the use of Katyusha acceleration, our results have better dependence on the condition number κ and improve to [Formula: see text] from the best-known [Formula: see text]. Finally, we analyze the algorithm for convex finite-sum COPs, which uses as a subroutine the algorithm for strongly convex finite-sum COPs. Again, we obtain better complexity results than existing works in terms of the dependence on ε, improving to [Formula: see text] from the best-known [Formula: see text].
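
To make the two-level finite-sum structure and the Katyusha-style variance-reduced update concrete, the following is a minimal Python sketch. The toy inner maps g_j(x) = A_j x + b_j, the outer function f(y) = 0.5||y||^2, the mini-batch estimator, and all step sizes and momentum weights are illustrative assumptions, not the authors' algorithm or tuned parameters.

```python
# Minimal sketch (not the paper's exact algorithm) of a Katyusha-style,
# variance-reduced loop for a two-level finite-sum compositional problem
#   minimize_x  f( (1/n) * sum_j g_j(x) ),  g_j(x) = A[j] @ x + b[j],  f(y) = 0.5*||y||^2
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 20, 5                          # inner components, x-dimension, inner-map dimension
base = rng.normal(size=(p, d))
A = base + 0.1*rng.normal(size=(n, p, d))     # component matrices share a common mean
b = rng.normal(size=(n, p))
A_bar, b_bar = A.mean(axis=0), b.mean(axis=0)

def g(j, x):                                  # inner component map g_j
    return A[j] @ x + b[j]

def grad_f(y):                                # gradient of the outer function f(y) = 0.5*||y||^2
    return y

x_tilde = np.zeros(d)                         # snapshot point
y = x_tilde.copy()
z = x_tilde.copy()
tau1, tau2, eta, alpha, batch = 0.3, 0.3, 0.01, 0.01, 10   # placeholder parameters

for epoch in range(30):
    G_tilde = A_bar @ x_tilde + b_bar         # full inner value at the snapshot
    for _ in range(n // batch):
        x = tau1*z + tau2*x_tilde + (1 - tau1 - tau2)*y     # Katyusha-style coupling
        B = rng.choice(n, size=batch, replace=False)
        # variance-reduced estimate of the inner value (mini-batch correction of the snapshot)
        G_hat = G_tilde + np.mean([g(j, x) - g(j, x_tilde) for j in B], axis=0)
        # for linear g_j the Jacobian estimator reduces to the exact mean Jacobian
        v = A_bar.T @ grad_f(G_hat)           # compositional gradient estimate
        y = x - eta*v                         # "gradient" sequence
        z = z - alpha*v                       # "momentum" (mirror) sequence
    x_tilde = y.copy()                        # refresh the snapshot each epoch

print("objective:", 0.5*np.linalg.norm(A_bar @ x_tilde + b_bar)**2)
```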

2021 ◽  
Vol 12 (4) ◽  
pp. 98-116
Author(s):  
Noureddine Boukhari ◽  
Fatima Debbat ◽  
Nicolas Monmarché ◽  
Mohamed Slimane

Evolution strategies (ES) are a family of robust stochastic methods for global optimization and have proved more capable of avoiding local optima than many other optimization methods. Many researchers have investigated different versions of the original evolution strategy, with good results on a variety of optimization problems. However, the convergence of the algorithm to the global optimum remains only asymptotic. To accelerate convergence, a hybrid approach is proposed that combines the nonlinear simplex method (Nelder-Mead) with an adaptive scheme controlling when the local search is applied, and the authors demonstrate that such a combination yields significantly better convergence. The proposed method has been tested on 15 complex benchmark functions, applied to the bi-objective portfolio optimization problem, and compared with other state-of-the-art techniques. Experimental results show that this hybridization improves performance in terms of both solution quality and convergence speed.
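
As an illustration of the hybrid scheme described above, here is a hedged Python sketch in which a simple (mu, lambda)-ES triggers a Nelder-Mead refinement via scipy.optimize.minimize once a stagnation counter expires; the benchmark function, the stagnation-based trigger, and all parameters are placeholders rather than the paper's exact adaptive scheme.

```python
# Illustrative ES / Nelder-Mead hybrid: local search is applied adaptively,
# here simply after 10 generations without improvement of the best solution.
import numpy as np
from scipy.optimize import minimize

def rastrigin_shifted(x):                     # toy multimodal benchmark (Rastrigin shifted to x = 1)
    return np.sum((x - 1.0)**2) + 10*np.sum(1 - np.cos(2*np.pi*(x - 1.0)))

rng = np.random.default_rng(1)
dim, mu, lam, sigma = 10, 5, 20, 0.5
parents = rng.uniform(-5, 5, size=(mu, dim))
best_x, best_f, stagnation = None, np.inf, 0

for gen in range(200):
    # (mu, lambda)-ES: each offspring mutates a randomly chosen parent
    offspring = parents[rng.integers(mu, size=lam)] + sigma*rng.normal(size=(lam, dim))
    fits = np.array([rastrigin_shifted(x) for x in offspring])
    order = np.argsort(fits)
    parents = offspring[order[:mu]]
    if fits[order[0]] < best_f - 1e-12:
        best_f, best_x, stagnation = fits[order[0]], offspring[order[0]].copy(), 0
    else:
        stagnation += 1
    # adaptive local-search control: call Nelder-Mead only after stagnation
    if stagnation >= 10:
        res = minimize(rastrigin_shifted, best_x, method="Nelder-Mead",
                       options={"maxiter": 200, "xatol": 1e-6, "fatol": 1e-6})
        if res.fun < best_f:
            best_f, best_x = res.fun, res.x
            parents[0] = res.x                # inject the refined point back into the population
        stagnation = 0

print("best objective:", best_f)
```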


2014 ◽  
Vol 56 (2) ◽  
pp. 160-178 ◽  
Author(s):  
JUEYOU LI ◽  
CHANGZHI WU ◽  
ZHIYOU WU ◽  
QIANG LONG ◽  
XIANGYU WANG

Abstract: We consider a distributed optimization problem over a multi-agent network, in which the sum of several local convex objective functions is minimized subject to global convex inequality constraints. We first transform the constrained optimization problem into an unconstrained one using the exact penalty function method. The transformed problem has fewer variables and a simpler structure than the formulations used by existing distributed primal–dual subgradient methods for constrained distributed optimization problems. Exploiting the special structure of this problem, we then propose a distributed proximal-gradient algorithm over a time-changing connectivity network and establish a convergence rate that depends on the number of iterations, the network topology and the number of agents. Although the transformed problem is nonsmooth by nature, our method can still achieve a convergence rate of ${\mathcal{O}}(1/k)$ after $k$ iterations, which is faster than the ${\mathcal{O}}(1/\sqrt{k})$ rate of existing distributed subgradient-based methods. Simulation experiments on a distributed state estimation problem illustrate the excellent performance of our proposed method.
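
As a rough illustration of the penalty-plus-proximal-gradient idea over a time-varying network, the following Python sketch uses quadratic local objectives, a separable bound constraint whose exact-penalty proximal operator has a closed form, and a shifting-ring mixing matrix; all of these are simplifying assumptions, not the paper's general setting.

```python
# Hedged sketch of a distributed proximal-gradient iteration over a time-varying
# network. The exact-penalty term here is the separable constraint x_j <= c,
# penalized by rho*sum_j max(0, x_j - c), whose prox is computable coordinate-wise.
import numpy as np

rng = np.random.default_rng(2)
m, d = 6, 4                                   # agents, variable dimension
targets = rng.normal(size=(m, d))             # local data: f_i(x) = 0.5*||x - t_i||^2
c, rho, gamma = 0.2, 5.0, 0.1                 # bound, penalty weight, step size

def grad_local(i, x):
    return x - targets[i]

def prox_penalty(v, step):
    # prox of step * rho * sum_j max(0, v_j - c), evaluated coordinate-wise
    out = v.copy()
    over = v > c
    out[over] = np.where(v[over] >= c + step*rho, v[over] - step*rho, c)
    return out

X = np.zeros((m, d))                          # one local copy of the variable per agent
for k in range(300):
    # time-varying ring: each agent's neighbour shifts by one position over time
    W = np.zeros((m, m))
    for i in range(m):
        j = (i + 1 + k % (m - 1)) % m         # a changing neighbour
        W[i, i], W[i, j], W[j, i] = 0.5, 0.25, 0.25
    W += np.diag(1.0 - W.sum(axis=1))         # make the symmetric mixing matrix doubly stochastic
    X_mixed = W @ X                           # consensus (mixing) step
    for i in range(m):
        X[i] = prox_penalty(X_mixed[i] - gamma*grad_local(i, X_mixed[i]), gamma)

print("consensus residual:", np.linalg.norm(X - X.mean(axis=0)))
print("average iterate:", X.mean(axis=0))
```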


Author(s):  
Yue Yu ◽  
Longbo Huang

We consider the stochastic composition optimization problem proposed in \cite{wang2017stochastic}, which has applications ranging from estimation to statistical and machine learning. We propose the first ADMM-based algorithm, named com-SVR-ADMM, and show that com-SVR-ADMM converges linearly for strongly convex and Lipschitz smooth objectives, and has a convergence rate of $O(\log S/S)$, which improves upon the $O(S^{-4/9})$ rate in \cite{wang2016accelerating} when the objective is convex and Lipschitz smooth. Moreover, com-SVR-ADMM possesses a rate of $O(1/\sqrt{S})$ when the objective is convex but not Lipschitz smooth. We also conduct experiments and show that it outperforms existing algorithms.
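
To show the kind of operator splitting that com-SVR-ADMM builds on, here is a compact ADMM skeleton with a linearized x-update on a toy l1-regularized least-squares problem; in the actual method the exact gradient of the smooth term would be replaced by a variance-reduced compositional estimate, so the problem, parameters, and updates below are a hedged simplification, not the paper's algorithm.

```python
# ADMM with a linearized x-update for  min_x 0.5*||D x - y||^2 + lam*||z||_1  s.t. x = z
import numpy as np

rng = np.random.default_rng(3)
n, d = 50, 20
D = rng.normal(size=(n, d)) / np.sqrt(n)
w_star = np.zeros(d); w_star[:3] = [2.0, -1.0, 0.5]
y = D @ w_star + 0.01*rng.normal(size=n)
lam, beta, eta = 0.01, 1.0, 0.1               # l1 weight, ADMM penalty, x-step size

x = np.zeros(d)
z = np.zeros(d)
u = np.zeros(d)                               # scaled dual variable
for k in range(500):
    grad = D.T @ (D @ x - y)                  # smooth-part gradient (exact here, stochastic in com-SVR-ADMM)
    x = x - eta*(grad + beta*(x - z + u))     # linearized x-update
    z = np.sign(x + u)*np.maximum(np.abs(x + u) - lam/beta, 0.0)   # soft-threshold z-update
    u = u + x - z                             # scaled dual update

print("recovered weights (first 5):", np.round(z[:5], 3))
```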


Author(s):  
Pengfei Wang ◽  
Risheng Liu ◽  
Nenggan Zheng ◽  
Zhefeng Gong

In machine learning research, many emerging applications can be (re)formulated as composition optimization problems with nonsmooth regularization penalties. To solve such problems, the traditional stochastic gradient descent (SGD) algorithm and its variants either have a low convergence rate or are computationally expensive. Recently, several stochastic composition gradient algorithms have been proposed; however, these methods are still inefficient and do not scale to large composition optimization problem instances. To address these challenges, we propose an asynchronous parallel algorithm, named Async-ProxSCVR, which effectively combines an asynchronous parallel implementation with a variance reduction method. We prove that the algorithm admits the fastest convergence rate for both the strongly convex and general nonconvex cases. Furthermore, we analyze the query complexity of the proposed algorithm and prove that linear speedup is achievable as the number of processors increases. Finally, we evaluate Async-ProxSCVR on two representative composition optimization problems, namely value function evaluation in reinforcement learning and the sparse mean-variance optimization problem. Experimental results show that the algorithm achieves significant speedups and is much faster than existing methods.
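
The following Python sketch only illustrates the asynchronous, lock-free flavour of such a method: several worker threads repeatedly read a shared iterate, compute a gradient of a toy compositional objective, and apply a proximal (soft-thresholding) step. It uses exact gradients rather than the paper's variance-reduced estimators, so it is an illustrative simplification, not Async-ProxSCVR itself.

```python
# Hogwild-style asynchronous proximal updates on a toy composition
#   min_x 0.5*||A x + b||^2 + lam*||x||_1
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(4)
d, lam, eta = 30, 0.05, 0.05
A = rng.normal(size=(d, d)) / np.sqrt(d)
b = rng.normal(size=d)
x_shared = np.zeros(d)                        # shared iterate, updated without locks

def soft_threshold(v, t):
    return np.sign(v)*np.maximum(np.abs(v) - t, 0.0)

def worker(num_updates):
    for _ in range(num_updates):
        x = x_shared.copy()                   # possibly stale read of the shared iterate
        inner = A @ x + b                     # inner map g(x)
        grad = A.T @ inner                    # gradient of the smooth compositional term
        new_x = soft_threshold(x - eta*grad, eta*lam)   # proximal (l1) step
        x_shared[:] = new_x                   # non-atomic write, Hogwild style

with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(4):
        pool.submit(worker, 200)

print("objective:", 0.5*np.linalg.norm(A @ x_shared + b)**2 + lam*np.sum(np.abs(x_shared)))
```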


2009 ◽  
Vol 26 (04) ◽  
pp. 479-502 ◽  
Author(s):  
BIN LIU ◽  
TEQI DUAN ◽  
YONGMING LI

In this paper, a novel genetic algorithm, the dynamic ring-like agent genetic algorithm (RAGA), is proposed for solving global numerical optimization problems. RAGA combines a ring-like agent structure with dynamic neighboring genetic operators to obtain better optimization capability. Each agent in the ring-like structure represents a candidate solution to the optimization problem and evolves by interacting with its neighboring agents. Through the dynamic neighboring genetic operators, agents compete and cooperate with their neighbors, and they can also use knowledge to increase their energies. Global numerical optimization problems are among the most important benchmarks for verifying the performance of evolutionary algorithms, genetic algorithms in particular, and are of great interest to researchers in the field. In the experiments, several complex benchmark functions were used for optimization, and several popular GAs were used for comparison. To better compare the two agent-based GAs (MAGA, the multi-agent genetic algorithm, and RAGA), experiments were carried out over a range of dimensions, from low to high. The experimental results show that RAGA is not only suitable for these optimization problems but also produces more precise and more stable results.
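
A toy sketch of the ring-structured agent idea is given below: each agent competes and recombines only with its two ring neighbours. The benchmark function, the crossover and mutation rules, and all parameters are illustrative choices; the dynamic-neighbourhood and knowledge-based operators of RAGA are not reproduced here.

```python
# Ring-structured agent GA sketch: interaction is restricted to ring neighbours.
import numpy as np

def rastrigin(x):
    return 10*len(x) + np.sum(x**2 - 10*np.cos(2*np.pi*x))

rng = np.random.default_rng(5)
n_agents, dim = 30, 10
agents = rng.uniform(-5.12, 5.12, size=(n_agents, dim))
fits = np.array([rastrigin(a) for a in agents])

for gen in range(300):
    for i in range(n_agents):
        left, right = (i - 1) % n_agents, (i + 1) % n_agents
        # competition: find the best agent in the local ring neighbourhood
        j = min((left, i, right), key=lambda k: fits[k])
        if j != i:
            # cooperation: arithmetic crossover toward the better neighbour, plus mutation
            alpha = rng.uniform(0.0, 1.0, size=dim)
            child = alpha*agents[j] + (1 - alpha)*agents[i]
            child += rng.normal(scale=0.1, size=dim)
            f_child = rastrigin(child)
            if f_child < fits[i]:
                agents[i], fits[i] = child, f_child

print("best objective:", fits.min())
```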


Author(s):  
S Yoo ◽  
C-G Park ◽  
S-H You ◽  
B Lim

This article presents a new methodology for generating optimal trajectories for controlling an automated excavator. By parameterizing all the actuator displacements with B-splines of the same order and with the same number of control points, the coupled actuator limits associated with the maximum pump flowrate are described as a finite-dimensional set of linear constraints for the motion optimization problem. Several weighting functions are introduced on the generalized actuator torque so that the solution to each optimization problem has a clear physical meaning. Numerical results are presented showing that the generated motions of the excavator are fairly smooth and save energy effectively, which can reduce mechanical wear and possibly fuel consumption. A typical operator's manoeuvre recorded in experiments is used as a reference to bring out the distinguishing features of the optimized motion.
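
To illustrate why B-spline parameterization turns the shared pump-flow limit into linear constraints, here is a hedged Python sketch for two actuators: displacements are clamped cubic B-splines in their control points, so velocities are linear in those control points and the combined-flow limit becomes linear inequalities at collocation times. The effort objective, the assumption that both actuators move in the positive direction, and all numerical values are illustrative, not the article's formulation.

```python
# Two actuator displacements as clamped cubic B-splines; the decision vector is
# the stacked control points, and the shared flow limit is linear in it.
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

T, k, n_ctrl = 5.0, 3, 8                           # horizon, spline degree, control points per actuator
knots = np.concatenate(([0.0]*k, np.linspace(0, T, n_ctrl - k + 1), [T]*k))
tau = np.linspace(0.0, T, 50)                      # collocation times

def basis_matrix(deriv):
    cols = []
    for i in range(n_ctrl):
        e = np.zeros(n_ctrl); e[i] = 1.0
        spl = BSpline(knots, e, k)
        cols.append(spl.derivative(deriv)(tau) if deriv else spl(tau))
    return np.column_stack(cols)

Phi, dPhi, ddPhi = basis_matrix(0), basis_matrix(1), basis_matrix(2)
q_max_flow = 0.8                                   # shared pump-flow proxy (velocity units)
starts, goals = [0.0, 0.0], [1.0, 2.0]             # boundary displacements for the two actuators

def unpack(c):
    return c[:n_ctrl], c[n_ctrl:]

def objective(c):                                  # smoothness/effort proxy: squared accelerations
    c1, c2 = unpack(c)
    return np.sum((ddPhi @ c1)**2) + np.sum((ddPhi @ c2)**2)

A_flow = np.hstack([dPhi, dPhi])                   # v1(tau) + v2(tau) <= q_max_flow, linear in control points
A_bc = np.zeros((4, 2*n_ctrl))
A_bc[0, :n_ctrl], A_bc[1, :n_ctrl] = Phi[0], Phi[-1]
A_bc[2, n_ctrl:], A_bc[3, n_ctrl:] = Phi[0], Phi[-1]
bc = np.array([starts[0], goals[0], starts[1], goals[1]])

cons = [{"type": "ineq", "fun": lambda c: q_max_flow - A_flow @ c},
        {"type": "eq",   "fun": lambda c: A_bc @ c - bc}]
c0 = np.concatenate([np.linspace(starts[0], goals[0], n_ctrl),
                     np.linspace(starts[1], goals[1], n_ctrl)])
res = minimize(objective, c0, method="SLSQP", constraints=cons, options={"maxiter": 200})
print("optimal effort proxy:", res.fun)
```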


2021 ◽  
Vol 12 (4) ◽  
pp. 81-100
Author(s):  
Yao Peng ◽  
Zepeng Shen ◽  
Shiqi Wang

Multimodal optimization problems have multiple global and many local optimal solutions. The difficulty in solving these problems lies in finding as many locally optimal peaks as possible while still ensuring the precision of the global optima. This article presents adaptive grouping brainstorm optimization (AGBSO) for solving such problems. An adaptive grouping strategy is proposed that achieves grouping without requiring any prior knowledge from users. To enhance the diversity and accuracy of the algorithm, an elite reservation strategy places central particles into an elite pool, and a peak detection strategy deletes particles in the elite pool that are far from optimal peaks. Finally, the article uses test functions of different dimensions to compare the convergence, accuracy, and diversity of AGBSO with those of BSO. Experiments verify that AGBSO has a strong ability to locate locally optimal solutions while maintaining the accuracy of the globally optimal solutions.
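
The sketch below conveys the grouping idea in a brainstorm-optimization-style loop: the population is clustered, the best member of each cluster serves as an elite centre, and new candidate solutions are generated by perturbing or combining elite centres. The fixed number of clusters, the benchmark function, and the parameters are illustrative; AGBSO's adaptive grouping, elite reservation, and peak detection rules are only hinted at here.

```python
# Grouping-based brainstorm-style optimization sketch (fixed cluster count).
import numpy as np
from scipy.cluster.vq import kmeans2

def griewank(x):
    return 1 + np.sum(x**2)/4000 - np.prod(np.cos(x/np.sqrt(np.arange(1, len(x)+1))))

rng = np.random.default_rng(6)
pop_size, dim, n_groups, sigma = 60, 10, 5, 1.0
pop = rng.uniform(-10, 10, size=(pop_size, dim))

for it in range(200):
    fits = np.array([griewank(p) for p in pop])
    # group the population; the best idea of each group acts as its elite centre
    _, labels = kmeans2(pop, n_groups, minit="points")
    elites = np.array([pop[labels == g][np.argmin(fits[labels == g])]
                       for g in range(n_groups) if np.any(labels == g)])
    for i in range(pop_size):
        if rng.random() < 0.8:                     # perturb one elite centre
            base = elites[rng.integers(len(elites))]
        else:                                      # combine two elite centres
            a, b = elites[rng.integers(len(elites), size=2)]
            w = rng.random()
            base = w*a + (1 - w)*b
        cand = base + sigma*rng.normal(size=dim)
        if griewank(cand) < fits[i]:
            pop[i] = cand
    sigma *= 0.99                                  # slowly cool the perturbation size

print("best objective:", min(griewank(p) for p in pop))
```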


2021 ◽  
Author(s):  
Faruk Alpak ◽  
Yixuan Wang ◽  
Guohua Gao ◽  
Vivek Jain

Abstract: Recently, a novel distributed quasi-Newton (DQN) derivative-free optimization (DFO) method was developed for generic reservoir performance optimization problems, including well-location optimization (WLO) and well-control optimization (WCO). DQN is designed to effectively locate multiple local optima of highly nonlinear optimization problems. However, its performance has neither been validated on realistic applications nor compared to other DFO methods. We have integrated DQN into a versatile field-development optimization platform designed specifically for iterative workflows enabled through distributed-parallel flow simulations. DQN is benchmarked against alternative DFO techniques, namely, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method hybridized with Direct Pattern Search (BFGS-DPS), Mesh Adaptive Direct Search (MADS), Particle Swarm Optimization (PSO), and the Genetic Algorithm (GA). DQN is a multi-thread optimization method that distributes an ensemble of optimization tasks among multiple high-performance-computing nodes. Thus, it can locate multiple optima of the objective function in parallel within a single run. Simulation results computed by one DQN optimization thread are shared with the others by updating a unified set of training data points composed of the responses (implicit variables) of all successful simulation jobs. The sensitivity matrix at the current best solution of each optimization thread is approximated by a linear-interpolation technique using all or a subset of the training-data points. The gradient of the objective function is computed analytically using the estimated sensitivities of the implicit variables with respect to the explicit variables. The Hessian matrix is then updated using the quasi-Newton method. A new search point for each thread is obtained by solving a trust-region subproblem for the next iteration. In contrast, other DFO methods rely on a single-thread optimization paradigm that can only locate a single optimum. For such methods, locating multiple optima requires repeating the same optimization process multiple times starting from different initial guesses. Moreover, simulation results generated by a single-thread optimization task cannot be shared with other tasks. Benchmarking results are presented for synthetic yet challenging WLO and WCO problems. Finally, the DQN method is field-tested on two realistic applications. DQN identifies the global optimum with the fewest simulations and the shortest run time on a synthetic problem with a known solution. On the other benchmarking problems, which have no known solution, DQN identified comparable local optima with reasonably fewer simulations than the alternative techniques. Field-testing results reinforce the favourable computational attributes of DQN. Overall, the results indicate that DQN is a novel and effective parallel algorithm for field-scale development optimization problems.
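
The sketch below caricatures the shared-data, multi-thread quasi-Newton idea: several optimization threads draw on one common pool of (x, f(x)) evaluations, estimate a gradient at their current point from a local least-squares (linear-interpolation) model, update a BFGS Hessian approximation, and take a step limited by a crude trust-region radius. The analytic objective, the thread scheduling, and the handling of implicit variables are simplified assumptions, not the DQN implementation.

```python
# Multi-thread derivative-free quasi-Newton sketch with a shared evaluation pool.
import numpy as np

def objective(x):                                  # stand-in for a reservoir simulation response
    return np.sum((x - 1.0)**2) + 0.5*np.sum(np.sin(3*x)**2)

rng = np.random.default_rng(7)
dim, n_threads, delta = 5, 3, 0.5
threads = [{"x": rng.uniform(-2, 2, size=dim), "B": np.eye(dim)} for _ in range(n_threads)]
pool_x, pool_f = [], []                            # training data shared across all threads

def eval_and_share(x):
    f = objective(x)
    pool_x.append(x.copy()); pool_f.append(f)
    return f

def gradient_from_pool(x):
    # fit f(y) ~ f(x) + g.(y - x) on the nearest shared points (least squares)
    X, F = np.array(pool_x), np.array(pool_f)
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:2*dim]
    D = X[idx] - x
    g, *_ = np.linalg.lstsq(D, F[idx] - objective(x), rcond=None)
    return g

for _ in range(4*dim):                             # seed the shared pool with a small design
    eval_and_share(rng.uniform(-2, 2, size=dim))

for it in range(40):
    for t in threads:                              # each pass mimics one parallel iteration
        g = gradient_from_pool(t["x"])
        step = -np.linalg.solve(t["B"], g)         # quasi-Newton direction
        if np.linalg.norm(step) > delta:           # crude trust-region safeguard
            step *= delta/np.linalg.norm(step)
        x_new = t["x"] + step
        if eval_and_share(x_new) < objective(t["x"]):
            g_new = gradient_from_pool(x_new)
            s, yvec = x_new - t["x"], g_new - g
            if s @ yvec > 1e-10:                   # BFGS update of the Hessian approximation
                Bs = t["B"] @ s
                t["B"] += np.outer(yvec, yvec)/(yvec @ s) - np.outer(Bs, Bs)/(s @ Bs)
            t["x"] = x_new

best = min(threads, key=lambda t: objective(t["x"]))
print("best objective found:", objective(best["x"]))
```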

