Katyusha Acceleration for Convex Finite-Sum Compositional Optimization

Author(s):  
Yibo Xu ◽  
Yangyang Xu

Structured optimization problems arise in many applications. To solve these problems efficiently, it is important to leverage the structural information in the algorithmic design. This paper focuses on convex problems with a finite-sum compositional structure. Finite-sum problems appear as the sample average approximation of a stochastic optimization problem and also arise in machine learning with a huge amount of training data. One popular numerical approach for finite-sum problems is the stochastic gradient method (SGM). However, the additional compositional structure prohibits easy access to an unbiased stochastic approximation of the gradient, so directly applying the SGM to a finite-sum compositional optimization problem (COP) is often inefficient. We design new algorithms for solving strongly convex and also convex two-level finite-sum COPs. Our design incorporates the Katyusha acceleration technique and adopts mini-batch sampling from both the outer-level and inner-level finite sums. We first analyze the algorithm for strongly convex finite-sum COPs. Similar to a few existing works, we obtain a linear convergence rate in terms of the expected objective error; from the convergence rate result, we then establish complexity results for the algorithm to produce an ε-solution. Our complexity results have the same dependence on the number of component functions as existing works. However, because of the use of Katyusha acceleration, our results have better dependence on the condition number κ and improve to [Formula: see text] from the best-known [Formula: see text]. Finally, we analyze the algorithm for convex finite-sum COPs, which uses as a subroutine the algorithm for strongly convex finite-sum COPs. Again, we obtain better complexity results than existing works in terms of the dependence on ε, improving to [Formula: see text] from the best-known [Formula: see text].
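
To make the two-level finite-sum structure and the Katyusha-style variance-reduced update concrete, the following is a minimal Python sketch. The toy inner maps g_j(x) = A_j x + b_j, the outer function f(y) = 0.5||y||^2, the mini-batch estimator, and all step sizes and momentum weights are illustrative assumptions, not the authors' algorithm or tuned parameters.

```python
# Minimal sketch (not the paper's exact algorithm) of a Katyusha-style,
# variance-reduced loop for a two-level finite-sum compositional problem
#   minimize_x  f( (1/n) * sum_j g_j(x) ),  g_j(x) = A[j] @ x + b[j],  f(y) = 0.5*||y||^2
import numpy as np

rng = np.random.default_rng(0)
n, d, p = 200, 20, 5                          # inner components, x-dimension, inner-map dimension
base = rng.normal(size=(p, d))
A = base + 0.1*rng.normal(size=(n, p, d))     # component matrices share a common mean
b = rng.normal(size=(n, p))
A_bar, b_bar = A.mean(axis=0), b.mean(axis=0)

def g(j, x):                                  # inner component map g_j
    return A[j] @ x + b[j]

def grad_f(y):                                # gradient of the outer function f(y) = 0.5*||y||^2
    return y

x_tilde = np.zeros(d)                         # snapshot point
y = x_tilde.copy()
z = x_tilde.copy()
tau1, tau2, eta, alpha, batch = 0.3, 0.3, 0.01, 0.01, 10   # placeholder parameters

for epoch in range(30):
    G_tilde = A_bar @ x_tilde + b_bar         # full inner value at the snapshot
    for _ in range(n // batch):
        x = tau1*z + tau2*x_tilde + (1 - tau1 - tau2)*y     # Katyusha-style coupling
        B = rng.choice(n, size=batch, replace=False)
        # variance-reduced estimate of the inner value (mini-batch correction of the snapshot)
        G_hat = G_tilde + np.mean([g(j, x) - g(j, x_tilde) for j in B], axis=0)
        # for linear g_j the Jacobian estimator reduces to the exact mean Jacobian
        v = A_bar.T @ grad_f(G_hat)           # compositional gradient estimate
        y = x - eta*v                         # "gradient" sequence
        z = z - alpha*v                       # "momentum" (mirror) sequence
    x_tilde = y.copy()                        # refresh the snapshot each epoch

print("objective:", 0.5*np.linalg.norm(A_bar @ x_tilde + b_bar)**2)
```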

2021 ◽  
Vol 12 (4) ◽  
pp. 98-116
Author(s):  
Noureddine Boukhari ◽  
Fatima Debbat ◽  
Nicolas Monmarché ◽  
Mohamed Slimane

Evolution strategies (ES) are a family of robust stochastic methods for global optimization and have proved more capable of avoiding local optima than many other optimization methods. Many researchers have investigated different versions of the original evolution strategy, with good results on a variety of optimization problems. However, the convergence of the algorithm to the global optimum remains only asymptotic. To accelerate convergence, a hybrid approach is proposed that combines the nonlinear simplex method (Nelder-Mead) with an adaptive scheme controlling when the local search is applied, and the authors demonstrate that such a combination yields significantly better convergence. The proposed method has been tested on 15 complex benchmark functions, applied to the bi-objective portfolio optimization problem, and compared with other state-of-the-art techniques. Experimental results show that this hybridization improves performance in terms of both solution quality and convergence speed.
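
As an illustration of the hybrid scheme described above, here is a hedged Python sketch in which a simple (mu, lambda)-ES triggers a Nelder-Mead refinement via scipy.optimize.minimize once a stagnation counter expires; the benchmark function, the stagnation-based trigger, and all parameters are placeholders rather than the paper's exact adaptive scheme.

```python
# Illustrative ES / Nelder-Mead hybrid: local search is applied adaptively,
# here simply after 10 generations without improvement of the best solution.
import numpy as np
from scipy.optimize import minimize

def rastrigin_shifted(x):                     # toy multimodal benchmark (Rastrigin shifted to x = 1)
    return np.sum((x - 1.0)**2) + 10*np.sum(1 - np.cos(2*np.pi*(x - 1.0)))

rng = np.random.default_rng(1)
dim, mu, lam, sigma = 10, 5, 20, 0.5
parents = rng.uniform(-5, 5, size=(mu, dim))
best_x, best_f, stagnation = None, np.inf, 0

for gen in range(200):
    # (mu, lambda)-ES: each offspring mutates a randomly chosen parent
    offspring = parents[rng.integers(mu, size=lam)] + sigma*rng.normal(size=(lam, dim))
    fits = np.array([rastrigin_shifted(x) for x in offspring])
    order = np.argsort(fits)
    parents = offspring[order[:mu]]
    if fits[order[0]] < best_f - 1e-12:
        best_f, best_x, stagnation = fits[order[0]], offspring[order[0]].copy(), 0
    else:
        stagnation += 1
    # adaptive local-search control: call Nelder-Mead only after stagnation
    if stagnation >= 10:
        res = minimize(rastrigin_shifted, best_x, method="Nelder-Mead",
                       options={"maxiter": 200, "xatol": 1e-6, "fatol": 1e-6})
        if res.fun < best_f:
            best_f, best_x = res.fun, res.x
            parents[0] = res.x                # inject the refined point back into the population
        stagnation = 0

print("best objective:", best_f)
```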


2014 ◽  
Vol 56 (2) ◽  
pp. 160-178 ◽  
Author(s):  
JUEYOU LI ◽  
CHANGZHI WU ◽  
ZHIYOU WU ◽  
QIANG LONG ◽  
XIANGYU WANG

Abstract: We consider a distributed optimization problem over a multi-agent network, in which the sum of several local convex objective functions is minimized subject to global convex inequality constraints. We first transform the constrained optimization problem into an unconstrained one using the exact penalty function method. The transformed problem has fewer variables and a simpler structure than the formulations used by existing distributed primal–dual subgradient methods for constrained distributed optimization problems. Exploiting the special structure of this problem, we then propose a distributed proximal-gradient algorithm over a time-changing connectivity network and establish a convergence rate that depends on the number of iterations, the network topology and the number of agents. Although the transformed problem is nonsmooth by nature, our method can still achieve a convergence rate of ${\mathcal{O}}(1/k)$ after $k$ iterations, which is faster than the ${\mathcal{O}}(1/\sqrt{k})$ rate of existing distributed subgradient-based methods. Simulation experiments on a distributed state estimation problem illustrate the excellent performance of our proposed method.
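
As a rough illustration of the penalty-plus-proximal-gradient idea over a time-varying network, the following Python sketch uses quadratic local objectives, a separable bound constraint whose exact-penalty proximal operator has a closed form, and a shifting-ring mixing matrix; all of these are simplifying assumptions, not the paper's general setting.

```python
# Hedged sketch of a distributed proximal-gradient iteration over a time-varying
# network. The exact-penalty term here is the separable constraint x_j <= c,
# penalized by rho*sum_j max(0, x_j - c), whose prox is computable coordinate-wise.
import numpy as np

rng = np.random.default_rng(2)
m, d = 6, 4                                   # agents, variable dimension
targets = rng.normal(size=(m, d))             # local data: f_i(x) = 0.5*||x - t_i||^2
c, rho, gamma = 0.2, 5.0, 0.1                 # bound, penalty weight, step size

def grad_local(i, x):
    return x - targets[i]

def prox_penalty(v, step):
    # prox of step * rho * sum_j max(0, v_j - c), evaluated coordinate-wise
    out = v.copy()
    over = v > c
    out[over] = np.where(v[over] >= c + step*rho, v[over] - step*rho, c)
    return out

X = np.zeros((m, d))                          # one local copy of the variable per agent
for k in range(300):
    # time-varying ring: each agent's neighbour shifts by one position over time
    W = np.zeros((m, m))
    for i in range(m):
        j = (i + 1 + k % (m - 1)) % m         # a changing neighbour
        W[i, i], W[i, j], W[j, i] = 0.5, 0.25, 0.25
    W += np.diag(1.0 - W.sum(axis=1))         # make the symmetric mixing matrix doubly stochastic
    X_mixed = W @ X                           # consensus (mixing) step
    for i in range(m):
        X[i] = prox_penalty(X_mixed[i] - gamma*grad_local(i, X_mixed[i]), gamma)

print("consensus residual:", np.linalg.norm(X - X.mean(axis=0)))
print("average iterate:", X.mean(axis=0))
```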


Author(s):  
Yue Yu ◽  
Longbo Huang

We consider the stochastic composition optimization problem proposed in \cite{wang2017stochastic}, which has applications ranging from estimation to statistical and machine learning. We propose the first ADMM-based algorithm, named com-SVR-ADMM, and show that com-SVR-ADMM converges linearly for strongly convex and Lipschitz smooth objectives, and has a convergence rate of $O(\log S/S)$, which improves upon the $O(S^{-4/9})$ rate in \cite{wang2016accelerating} when the objective is convex and Lipschitz smooth. Moreover, com-SVR-ADMM possesses a rate of $O(1/\sqrt{S})$ when the objective is convex but not Lipschitz smooth. We also conduct experiments and show that it outperforms existing algorithms.
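
To show the kind of operator splitting that com-SVR-ADMM builds on, here is a compact ADMM skeleton with a linearized x-update on a toy l1-regularized least-squares problem; in the actual method the exact gradient of the smooth term would be replaced by a variance-reduced compositional estimate, so the problem, parameters, and updates below are a hedged simplification, not the paper's algorithm.

```python
# ADMM with a linearized x-update for  min_x 0.5*||D x - y||^2 + lam*||z||_1  s.t. x = z
import numpy as np

rng = np.random.default_rng(3)
n, d = 50, 20
D = rng.normal(size=(n, d)) / np.sqrt(n)
w_star = np.zeros(d); w_star[:3] = [2.0, -1.0, 0.5]
y = D @ w_star + 0.01*rng.normal(size=n)
lam, beta, eta = 0.01, 1.0, 0.1               # l1 weight, ADMM penalty, x-step size

x = np.zeros(d)
z = np.zeros(d)
u = np.zeros(d)                               # scaled dual variable
for k in range(500):
    grad = D.T @ (D @ x - y)                  # smooth-part gradient (exact here, stochastic in com-SVR-ADMM)
    x = x - eta*(grad + beta*(x - z + u))     # linearized x-update
    z = np.sign(x + u)*np.maximum(np.abs(x + u) - lam/beta, 0.0)   # soft-threshold z-update
    u = u + x - z                             # scaled dual update

print("recovered weights (first 5):", np.round(z[:5], 3))
```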


Author(s):  
Pengfei Wang ◽  
Risheng Liu ◽  
Nenggan Zheng ◽  
Zhefeng Gong

In machine learning research, many emerging applications can be (re)formulated as composition optimization problems with nonsmooth regularization penalties. To solve such problems, the traditional stochastic gradient descent (SGD) algorithm and its variants either have a low convergence rate or are computationally expensive. Recently, several stochastic composition gradient algorithms have been proposed; however, these methods are still inefficient and do not scale to large composition optimization problem instances. To address these challenges, we propose an asynchronous parallel algorithm, named Async-ProxSCVR, which effectively combines an asynchronous parallel implementation with a variance reduction method. We prove that the algorithm admits the fastest convergence rate for both the strongly convex and general nonconvex cases. Furthermore, we analyze the query complexity of the proposed algorithm and prove that linear speedup is achievable as the number of processors increases. Finally, we evaluate Async-ProxSCVR on two representative composition optimization problems, namely value function evaluation in reinforcement learning and the sparse mean-variance optimization problem. Experimental results show that the algorithm achieves significant speedups and is much faster than existing methods.
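
The following Python sketch only illustrates the asynchronous, lock-free flavour of such a method: several worker threads repeatedly read a shared iterate, compute a gradient of a toy compositional objective, and apply a proximal (soft-thresholding) step. It uses exact gradients rather than the paper's variance-reduced estimators, so it is an illustrative simplification, not Async-ProxSCVR itself.

```python
# Hogwild-style asynchronous proximal updates on a toy composition
#   min_x 0.5*||A x + b||^2 + lam*||x||_1
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(4)
d, lam, eta = 30, 0.05, 0.05
A = rng.normal(size=(d, d)) / np.sqrt(d)
b = rng.normal(size=d)
x_shared = np.zeros(d)                        # shared iterate, updated without locks

def soft_threshold(v, t):
    return np.sign(v)*np.maximum(np.abs(v) - t, 0.0)

def worker(num_updates):
    for _ in range(num_updates):
        x = x_shared.copy()                   # possibly stale read of the shared iterate
        inner = A @ x + b                     # inner map g(x)
        grad = A.T @ inner                    # gradient of the smooth compositional term
        new_x = soft_threshold(x - eta*grad, eta*lam)   # proximal (l1) step
        x_shared[:] = new_x                   # non-atomic write, Hogwild style

with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(4):
        pool.submit(worker, 200)

print("objective:", 0.5*np.linalg.norm(A @ x_shared + b)**2 + lam*np.sum(np.abs(x_shared)))
```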


2009 ◽  
Vol 26 (04) ◽  
pp. 479-502 ◽  
Author(s):  
BIN LIU ◽  
TEQI DUAN ◽  
YONGMING LI

In this paper, a novel genetic algorithm, the dynamic ring-like agent genetic algorithm (RAGA), is proposed for solving global numerical optimization problems. RAGA combines a ring-like agent structure with dynamic neighboring genetic operators to obtain better optimization capability. Each agent in the ring-like structure represents a candidate solution to the optimization problem and evolves by interacting with its neighboring agents. Through the dynamic neighboring genetic operators, agents compete and cooperate with their neighbors, and they can also use knowledge to increase their energies. Global numerical optimization problems are among the most important benchmarks for verifying the performance of evolutionary algorithms, genetic algorithms in particular, and are of great interest to researchers in the field. In the experiments, several complex benchmark functions were used for optimization, and several popular GAs were used for comparison. To better compare the two agent-based GAs (MAGA, the multi-agent genetic algorithm, and RAGA), experiments were carried out over a range of dimensions, from low to high. The experimental results show that RAGA is not only suitable for these optimization problems but also produces more precise and more stable results.
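
A toy sketch of the ring-structured agent idea is given below: each agent competes and recombines only with its two ring neighbours. The benchmark function, the crossover and mutation rules, and all parameters are illustrative choices; the dynamic-neighbourhood and knowledge-based operators of RAGA are not reproduced here.

```python
# Ring-structured agent GA sketch: interaction is restricted to ring neighbours.
import numpy as np

def rastrigin(x):
    return 10*len(x) + np.sum(x**2 - 10*np.cos(2*np.pi*x))

rng = np.random.default_rng(5)
n_agents, dim = 30, 10
agents = rng.uniform(-5.12, 5.12, size=(n_agents, dim))
fits = np.array([rastrigin(a) for a in agents])

for gen in range(300):
    for i in range(n_agents):
        left, right = (i - 1) % n_agents, (i + 1) % n_agents
        # competition: find the best agent in the local ring neighbourhood
        j = min((left, i, right), key=lambda k: fits[k])
        if j != i:
            # cooperation: arithmetic crossover toward the better neighbour, plus mutation
            alpha = rng.uniform(0.0, 1.0, size=dim)
            child = alpha*agents[j] + (1 - alpha)*agents[i]
            child += rng.normal(scale=0.1, size=dim)
            f_child = rastrigin(child)
            if f_child < fits[i]:
                agents[i], fits[i] = child, f_child

print("best objective:", fits.min())
```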


Author(s):  
S Yoo ◽  
C-G Park ◽  
S-H You ◽  
B Lim

This article presents a new methodology for generating optimal trajectories for controlling an automated excavator. By parameterizing all the actuator displacements with B-splines of the same order and with the same number of control points, the coupled actuator limits associated with the maximum pump flowrate are described as a finite-dimensional set of linear constraints for the motion optimization problem. Several weighting functions are introduced on the generalized actuator torque so that the solution to each optimization problem has a clear physical meaning. Numerical results are presented showing that the generated motions of the excavator are fairly smooth and save energy effectively, which can reduce mechanical wear and possibly fuel consumption. A typical operator's manoeuvre recorded in experiments is used as a reference to bring out the distinguishing features of the optimized motion.
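
To illustrate why B-spline parameterization turns the shared pump-flow limit into linear constraints, here is a hedged Python sketch for two actuators: displacements are clamped cubic B-splines in their control points, so velocities are linear in those control points and the combined-flow limit becomes linear inequalities at collocation times. The effort objective, the assumption that both actuators move in the positive direction, and all numerical values are illustrative, not the article's formulation.

```python
# Two actuator displacements as clamped cubic B-splines; the decision vector is
# the stacked control points, and the shared flow limit is linear in it.
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

T, k, n_ctrl = 5.0, 3, 8                           # horizon, spline degree, control points per actuator
knots = np.concatenate(([0.0]*k, np.linspace(0, T, n_ctrl - k + 1), [T]*k))
tau = np.linspace(0.0, T, 50)                      # collocation times

def basis_matrix(deriv):
    cols = []
    for i in range(n_ctrl):
        e = np.zeros(n_ctrl); e[i] = 1.0
        spl = BSpline(knots, e, k)
        cols.append(spl.derivative(deriv)(tau) if deriv else spl(tau))
    return np.column_stack(cols)

Phi, dPhi, ddPhi = basis_matrix(0), basis_matrix(1), basis_matrix(2)
q_max_flow = 0.8                                   # shared pump-flow proxy (velocity units)
starts, goals = [0.0, 0.0], [1.0, 2.0]             # boundary displacements for the two actuators

def unpack(c):
    return c[:n_ctrl], c[n_ctrl:]

def objective(c):                                  # smoothness/effort proxy: squared accelerations
    c1, c2 = unpack(c)
    return np.sum((ddPhi @ c1)**2) + np.sum((ddPhi @ c2)**2)

A_flow = np.hstack([dPhi, dPhi])                   # v1(tau) + v2(tau) <= q_max_flow, linear in control points
A_bc = np.zeros((4, 2*n_ctrl))
A_bc[0, :n_ctrl], A_bc[1, :n_ctrl] = Phi[0], Phi[-1]
A_bc[2, n_ctrl:], A_bc[3, n_ctrl:] = Phi[0], Phi[-1]
bc = np.array([starts[0], goals[0], starts[1], goals[1]])

cons = [{"type": "ineq", "fun": lambda c: q_max_flow - A_flow @ c},
        {"type": "eq",   "fun": lambda c: A_bc @ c - bc}]
c0 = np.concatenate([np.linspace(starts[0], goals[0], n_ctrl),
                     np.linspace(starts[1], goals[1], n_ctrl)])
res = minimize(objective, c0, method="SLSQP", constraints=cons, options={"maxiter": 200})
print("optimal effort proxy:", res.fun)
```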


2021 ◽  
Vol 12 (4) ◽  
pp. 81-100
Author(s):  
Yao Peng ◽  
Zepeng Shen ◽  
Shiqi Wang

Multimodal optimization problems have multiple global and many local optimal solutions. The difficulty in solving these problems lies in finding as many locally optimal peaks as possible while still ensuring the precision of the global optima. This article presents adaptive grouping brainstorm optimization (AGBSO) for solving such problems. An adaptive grouping strategy is proposed that achieves grouping without requiring any prior knowledge from users. To enhance the diversity and accuracy of the algorithm, an elite reservation strategy places central particles into an elite pool, and a peak detection strategy deletes particles in the elite pool that are far from optimal peaks. Finally, the article uses test functions of different dimensions to compare the convergence, accuracy, and diversity of AGBSO with those of BSO. Experiments verify that AGBSO has a strong ability to locate locally optimal solutions while maintaining the accuracy of the globally optimal solutions.
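
The sketch below conveys the grouping idea in a brainstorm-optimization-style loop: the population is clustered, the best member of each cluster serves as an elite centre, and new candidate solutions are generated by perturbing or combining elite centres. The fixed number of clusters, the benchmark function, and the parameters are illustrative; AGBSO's adaptive grouping, elite reservation, and peak detection rules are only hinted at here.

```python
# Grouping-based brainstorm-style optimization sketch (fixed cluster count).
import numpy as np
from scipy.cluster.vq import kmeans2

def griewank(x):
    return 1 + np.sum(x**2)/4000 - np.prod(np.cos(x/np.sqrt(np.arange(1, len(x)+1))))

rng = np.random.default_rng(6)
pop_size, dim, n_groups, sigma = 60, 10, 5, 1.0
pop = rng.uniform(-10, 10, size=(pop_size, dim))

for it in range(200):
    fits = np.array([griewank(p) for p in pop])
    # group the population; the best idea of each group acts as its elite centre
    _, labels = kmeans2(pop, n_groups, minit="points")
    elites = np.array([pop[labels == g][np.argmin(fits[labels == g])]
                       for g in range(n_groups) if np.any(labels == g)])
    for i in range(pop_size):
        if rng.random() < 0.8:                     # perturb one elite centre
            base = elites[rng.integers(len(elites))]
        else:                                      # combine two elite centres
            a, b = elites[rng.integers(len(elites), size=2)]
            w = rng.random()
            base = w*a + (1 - w)*b
        cand = base + sigma*rng.normal(size=dim)
        if griewank(cand) < fits[i]:
            pop[i] = cand
    sigma *= 0.99                                  # slowly cool the perturbation size

print("best objective:", min(griewank(p) for p in pop))
```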


2021 ◽  
Author(s):  
Faruk Alpak ◽  
Yixuan Wang ◽  
Guohua Gao ◽  
Vivek Jain

Abstract: Recently, a novel distributed quasi-Newton (DQN) derivative-free optimization (DFO) method was developed for generic reservoir performance optimization problems, including well-location optimization (WLO) and well-control optimization (WCO). DQN is designed to effectively locate multiple local optima of highly nonlinear optimization problems. However, its performance has neither been validated on realistic applications nor compared to other DFO methods. We have integrated DQN into a versatile field-development optimization platform designed specifically for iterative workflows enabled through distributed-parallel flow simulations. DQN is benchmarked against alternative DFO techniques, namely, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method hybridized with Direct Pattern Search (BFGS-DPS), Mesh Adaptive Direct Search (MADS), Particle Swarm Optimization (PSO), and the Genetic Algorithm (GA). DQN is a multi-thread optimization method that distributes an ensemble of optimization tasks among multiple high-performance-computing nodes. Thus, it can locate multiple optima of the objective function in parallel within a single run. Simulation results computed by one DQN optimization thread are shared with the others by updating a unified set of training data points composed of the responses (implicit variables) of all successful simulation jobs. The sensitivity matrix at the current best solution of each optimization thread is approximated by a linear-interpolation technique using all or a subset of the training-data points. The gradient of the objective function is computed analytically using the estimated sensitivities of the implicit variables with respect to the explicit variables. The Hessian matrix is then updated using the quasi-Newton method. A new search point for each thread is obtained by solving a trust-region subproblem for the next iteration. In contrast, other DFO methods rely on a single-thread optimization paradigm that can only locate a single optimum. For such methods, locating multiple optima requires repeating the same optimization process multiple times starting from different initial guesses. Moreover, simulation results generated by a single-thread optimization task cannot be shared with other tasks. Benchmarking results are presented for synthetic yet challenging WLO and WCO problems. Finally, the DQN method is field-tested on two realistic applications. DQN identifies the global optimum with the fewest simulations and the shortest run time on a synthetic problem with a known solution. On the other benchmarking problems, which have no known solution, DQN identified comparable local optima with reasonably fewer simulations than the alternative techniques. Field-testing results reinforce the favourable computational attributes of DQN. Overall, the results indicate that DQN is a novel and effective parallel algorithm for field-scale development optimization problems.
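
The sketch below caricatures the shared-data, multi-thread quasi-Newton idea: several optimization threads draw on one common pool of (x, f(x)) evaluations, estimate a gradient at their current point from a local least-squares (linear-interpolation) model, update a BFGS Hessian approximation, and take a step limited by a crude trust-region radius. The analytic objective, the thread scheduling, and the handling of implicit variables are simplified assumptions, not the DQN implementation.

```python
# Multi-thread derivative-free quasi-Newton sketch with a shared evaluation pool.
import numpy as np

def objective(x):                                  # stand-in for a reservoir simulation response
    return np.sum((x - 1.0)**2) + 0.5*np.sum(np.sin(3*x)**2)

rng = np.random.default_rng(7)
dim, n_threads, delta = 5, 3, 0.5
threads = [{"x": rng.uniform(-2, 2, size=dim), "B": np.eye(dim)} for _ in range(n_threads)]
pool_x, pool_f = [], []                            # training data shared across all threads

def eval_and_share(x):
    f = objective(x)
    pool_x.append(x.copy()); pool_f.append(f)
    return f

def gradient_from_pool(x):
    # fit f(y) ~ f(x) + g.(y - x) on the nearest shared points (least squares)
    X, F = np.array(pool_x), np.array(pool_f)
    idx = np.argsort(np.linalg.norm(X - x, axis=1))[:2*dim]
    D = X[idx] - x
    g, *_ = np.linalg.lstsq(D, F[idx] - objective(x), rcond=None)
    return g

for _ in range(4*dim):                             # seed the shared pool with a small design
    eval_and_share(rng.uniform(-2, 2, size=dim))

for it in range(40):
    for t in threads:                              # each pass mimics one parallel iteration
        g = gradient_from_pool(t["x"])
        step = -np.linalg.solve(t["B"], g)         # quasi-Newton direction
        if np.linalg.norm(step) > delta:           # crude trust-region safeguard
            step *= delta/np.linalg.norm(step)
        x_new = t["x"] + step
        if eval_and_share(x_new) < objective(t["x"]):
            g_new = gradient_from_pool(x_new)
            s, yvec = x_new - t["x"], g_new - g
            if s @ yvec > 1e-10:                   # BFGS update of the Hessian approximation
                Bs = t["B"] @ s
                t["B"] += np.outer(yvec, yvec)/(yvec @ s) - np.outer(Bs, Bs)/(s @ Bs)
            t["x"] = x_new

best = min(threads, key=lambda t: objective(t["x"]))
print("best objective found:", objective(best["x"]))
```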

