Compactness and convergence rates in the combinatorial integral approximation decomposition

Author(s):  
Christian Kirches ◽  
Paul Manns ◽  
Stefan Ulbrich

Abstract The combinatorial integral approximation decomposition splits the optimization of a discrete-valued control into two steps: solving a continuous relaxation of the discrete control problem, and computing a discrete-valued approximation of the relaxed control. Different algorithms exist for the second step to construct piecewise constant discrete-valued approximants that are defined on given decompositions of the domain. It is known that the resulting discrete controls can be constructed such that they converge to a relaxed control in the weak* topology of $L^\infty$ if the grid constant of this decomposition is driven to zero. We exploit this insight to formulate a general approximation result for optimization problems, which feature discrete and distributed optimization variables, and which are governed by a compact control-to-state operator. We analyze the topology induced by the grid refinements and prove convergence rates of the control vectors for two problem classes. We use a reconstruction problem from signal processing to demonstrate both the applicability of the method outside the scope of differential equations, the predominant case in the literature, and the effectiveness of the approach.
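As an illustration of the rounding step, the sketch below (all names hypothetical) implements classical sum-up rounding for a single binary control on a one-dimensional grid; the paper's decomposition admits other rounding algorithms and multi-dimensional domains, so this is a minimal instance rather than the authors' general method.

import numpy as np

def sum_up_rounding(alpha, dt):
    """Round a relaxed control alpha with values in [0, 1] to binary values so
    that the accumulated integral deviation stays bounded by the grid constant."""
    omega = np.zeros_like(alpha)
    gap = 0.0                          # accumulated integral of (alpha - omega)
    for i in range(len(alpha)):
        gap += alpha[i] * dt[i]
        if gap >= 0.5 * dt[i]:         # switch on once enough relaxed mass has accumulated
            omega[i] = 1.0
            gap -= dt[i]
    return omega

# Refining the grid (smaller dt) drives omega toward alpha in the weak-* sense.
alpha = np.array([0.3, 0.6, 0.2, 0.9, 0.5])
dt = np.full(5, 0.2)
print(sum_up_rounding(alpha, dt))      # [0. 1. 0. 1. 1.]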

2021 ◽  
Vol 11 (8) ◽  
pp. 3430
Author(s):  
Erik Cuevas ◽  
Héctor Becerra ◽  
Héctor Escobar ◽  
Alberto Luque-Chang ◽  
Marco Pérez ◽  
...  

Recently, several new metaheuristic schemes have been introduced in the literature. Although all these approaches consider very different phenomena as metaphors, the search patterns used to explore the search space are very similar. On the other hand, second-order systems are models that present different temporal behaviors depending on the values of their parameters. Such temporal behaviors can be conceived as search patterns with multiple behaviors and simple configurations. In this paper, a set of new search patterns is introduced to explore the search space efficiently. They emulate the response of a second-order system. The proposed set of search patterns has been integrated into a complete search strategy, called the Second-Order Algorithm (SOA), to obtain the global solution of complex optimization problems. To analyze the performance of the proposed scheme, it has been compared on a set of representative optimization problems, including multimodal, unimodal, and hybrid benchmark formulations. Numerical results demonstrate that the proposed SOA method exhibits remarkable performance in terms of accuracy and high convergence rates.
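As a hedged illustration of the idea (not the authors' exact SOA update rules), the toy search below uses the unit-step response of an underdamped second-order system as the trajectory along which candidates move toward the current best solution; the damping ratio controls overshoot (exploration) versus a smooth approach (exploitation). All names and parameter values are hypothetical.

import numpy as np

def second_order_response(t, zeta, omega_n):
    """Unit-step response of an underdamped second-order system (0 < zeta < 1)."""
    omega_d = omega_n * np.sqrt(1.0 - zeta ** 2)
    return 1.0 - np.exp(-zeta * omega_n * t) * (
        np.cos(omega_d * t) + zeta / np.sqrt(1.0 - zeta ** 2) * np.sin(omega_d * t))

def second_order_search(f, lo, hi, pop=20, iters=200, zeta=0.2, omega_n=1.0, seed=0):
    """Toy population search: candidates follow a second-order step response toward
    the best-so-far point; small zeta overshoots (explores), large zeta damps (exploits)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, size=(pop, len(lo)))
    best = X[int(np.argmin([f(x) for x in X]))].copy()
    for k in range(1, iters + 1):
        r = second_order_response(0.05 * k, zeta, omega_n)    # trajectory value, may exceed 1
        X = np.clip(X + r * (best - X) + 0.01 * rng.normal(size=X.shape), lo, hi)
        vals = [f(x) for x in X]
        i = int(np.argmin(vals))
        if vals[i] < f(best):
            best = X[i].copy()
    return best, f(best)

best, val = second_order_search(lambda x: np.sum(x ** 2), np.full(2, -5.0), np.full(2, 5.0))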


2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude when dealing with smooth convex objectives, has led to tremendous success in training machine learning models. In this paper, we theoretically study its strength in the convergence of individual iterates of general non-smooth convex optimization problems, which we name individual convergence. We prove that Nesterov's extrapolation is capable of making the individual convergence of projected gradient methods optimal for general convex problems, which remains a challenging problem in the machine learning community. In light of this consideration, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step towards the open question about SGD posed by Shamir (2012). Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD, especially in coping with machine learning problems where an individual output is needed to guarantee the regularization structure while keeping an optimal rate of convergence. Typically, our method is applicable as an efficient tool for solving large-scale ℓ1-regularized hinge-loss learning problems. Several real experiments demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
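A minimal sketch of the building block discussed here, assuming generic step-size and extrapolation schedules rather than the paper's specific ones: a projected subgradient method with Nesterov-style extrapolation whose output is the last (individual) iterate instead of an average. All names are hypothetical.

import numpy as np

def projected_subgradient_nesterov(subgrad, project, x0, step, momentum, iters=500):
    """Projected subgradient method with Nesterov-style extrapolation; the
    returned value is the last iterate (the 'individual' output), not an average."""
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, iters + 1):
        y = x + momentum(k) * (x - x_prev)            # extrapolation step
        g = subgrad(y)                                # any subgradient at y
        x_prev, x = x, project(y - step(k) * g)       # projected update
    return x

# Toy usage: minimize f(x) = |x - 1| over the box [-2, 2].
x_last = projected_subgradient_nesterov(
    subgrad=lambda x: np.sign(x - 1.0),
    project=lambda x: np.clip(x, -2.0, 2.0),
    x0=np.array([-2.0]),
    step=lambda k: 1.0 / np.sqrt(k),
    momentum=lambda k: (k - 1) / (k + 2))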


Author(s):  
Ion Necoara ◽  
Martin Takáč

Abstract In this paper we consider large-scale smooth optimization problems with multiple linear coupled constraints. Due to the non-separability of the constraints, arbitrary random sketching would not be guaranteed to work. Thus, we first investigate necessary and sufficient conditions for the sketch sampling to have well-defined algorithms. Based on these sampling conditions we develop new sketch descent methods for solving general smooth linearly constrained problems, in particular, random sketch descent (RSD) and accelerated random sketch descent (A-RSD) methods. To our knowledge, this is the first convergence analysis of RSD algorithms for optimization problems with multiple non-separable linear constraints. For the general case, when the objective function is smooth and non-convex, we prove for the non-accelerated variant a sublinear rate in expectation for an appropriate optimality measure. In the smooth convex case, we derive for both algorithms, non-accelerated and A-RSD, sublinear convergence rates in the expected values of the objective function. Additionally, if the objective function satisfies a strong convexity type condition, both algorithms converge linearly in expectation. In special cases, where complexity bounds are known for some particular sketching algorithms, such as coordinate descent methods for optimization problems with a single linear coupled constraint, our theory recovers the best known bounds. Finally, we present several numerical examples to illustrate the performance of our new algorithms.
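To illustrate why the sampling must respect the coupled constraints, the toy below (hypothetical names) performs sketch descent for a single coupling constraint sum(x) = const: sampling a coordinate pair and moving along e_i - e_j stays in the null space of the constraint, so every iterate remains feasible. The paper's RSD and A-RSD methods generalize this to arbitrary admissible sketches and multiple non-separable constraints.

import numpy as np

def random_pair_descent(grad, x0, step=0.1, iters=1000, seed=0):
    """Toy sketch descent for min f(x) subject to sum(x) = const; the update
    direction e_i - e_j preserves the coupling constraint at every iteration."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(iters):
        i, j = rng.choice(len(x), size=2, replace=False)
        g = grad(x)
        d = g[i] - g[j]                 # directional derivative along e_i - e_j
        x[i] -= step * d
        x[j] += step * d
    return x

# Usage: minimize ||x - c||^2 over the affine set {x : sum(x) = 1}.
c = np.array([0.5, 0.2, 0.3, 0.8])
x_star = random_pair_descent(lambda x: 2.0 * (x - c), np.full(4, 0.25))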


Author(s):  
Ehsan Kazemi ◽  
Liqiang Wang

Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for nonconvex and nonsmooth optimization problems with certain performance guarantees remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention as a means of solving large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this paper, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfy the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that every limit point is a critical point for both bounded and unbounded delays. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical efficiency of our algorithm in terms of speed.
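The sketch below is a synchronous, delay-free toy (hypothetical names) that captures the flavor of the acceleration: each proximal coordinate update is computed both from the current iterate and from an extrapolated point, and the candidate with the smaller objective is kept as a simple descent safeguard. The paper's AAPCD additionally handles asynchronous updates with bounded or unbounded delays and a delay-aware momentum value, which are not reproduced here.

import numpy as np

def accelerated_prox_cd(grad_i, prox_i, F, x0, step, beta=0.5, iters=200, seed=0):
    """Toy accelerated proximal coordinate descent with a function-value safeguard:
    keep the proximal update from the extrapolated point only if it does not
    increase the objective compared with the plain proximal update."""
    rng = np.random.default_rng(seed)
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        i = rng.integers(len(x))
        y = x + beta * (x - x_prev)                               # extrapolated point
        cand_x = x.copy()
        cand_x[i] = prox_i(i, x[i] - step * grad_i(i, x), step)   # update from x
        cand_y = x.copy()
        cand_y[i] = prox_i(i, y[i] - step * grad_i(i, y), step)   # update from y
        x_prev = x
        x = cand_y if F(cand_y) <= F(cand_x) else cand_x
    return x

# Usage: min 0.5*||x - c||^2 + 0.1*||x||_1 with a coordinate-wise soft-threshold prox.
c = np.array([1.0, -0.05, 0.5])
grad_i = lambda i, x: x[i] - c[i]
prox_i = lambda i, v, t: np.sign(v) * max(abs(v) - 0.1 * t, 0.0)
F = lambda x: 0.5 * np.sum((x - c) ** 2) + 0.1 * np.sum(np.abs(x))
x_hat = accelerated_prox_cd(grad_i, prox_i, F, np.zeros(3), step=1.0)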


2020 ◽  
Vol 30 (6) ◽  
pp. 1645-1663
Author(s):  
Ömer Deniz Akyildiz ◽  
Dan Crisan ◽  
Joaquín Míguez

Abstract We introduce and analyze a parallel sequential Monte Carlo methodology for the numerical solution of optimization problems that involve the minimization of a cost function that consists of the sum of many individual components. The proposed scheme is a stochastic zeroth-order optimization algorithm which demands only the capability to evaluate small subsets of components of the cost function. It can be depicted as a bank of samplers that generate particle approximations of several sequences of probability measures. These measures are constructed in such a way that they have associated probability density functions whose global maxima coincide with the global minima of the original cost function. The algorithm selects the best performing sampler and uses it to approximate a global minimum of the cost function. We prove analytically that the resulting estimator converges to a global minimum of the cost function almost surely and provide explicit convergence rates in terms of the number of generated Monte Carlo samples and the dimension of the search space. We show, by way of numerical examples, that the algorithm can tackle cost functions with multiple minima or with broad “flat” regions which are hard to minimize using gradient-based techniques.
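A single-sampler, hedged sketch of the mechanism (hypothetical names): particles are reweighted with exp(-beta times a partial cost) evaluated on a random subset of the cost components, then resampled and jittered, so the high-probability regions of the particle approximation track the minima of the full cost. The paper runs a bank of such samplers in parallel and selects the best-performing one; that layer is omitted here.

import numpy as np

def smc_minimize(cost_components, dim, n_particles=500, iters=100,
                 batch=10, beta=1.0, jitter=0.1, seed=0):
    """Toy sequential Monte Carlo minimizer that evaluates only small random
    subsets of the cost components per iteration (a zeroth-order scheme)."""
    rng = np.random.default_rng(seed)
    m = len(cost_components)
    X = rng.normal(size=(n_particles, dim))
    for _ in range(iters):
        idx = rng.choice(m, size=batch, replace=False)
        partial = np.array([sum(cost_components[j](x) for j in idx) for x in X])
        w = np.exp(-beta * (partial - partial.min()))
        w /= w.sum()
        X = X[rng.choice(n_particles, size=n_particles, p=w)]   # resampling
        X = X + jitter * rng.normal(size=X.shape)               # jittering move
    full = np.array([sum(c(x) for c in cost_components) for x in X])
    return X[int(np.argmin(full))]

# Usage: the cost is a sum of 20 quadratic components; its minimum is their mean.
centers = np.random.default_rng(1).normal(size=(20, 3))
components = [lambda x, c=c: float(np.sum((x - c) ** 2)) for c in centers]
x_min = smc_minimize(components, dim=3)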


Author(s):  
Quoc Tran-Dinh ◽  
Ling Liang ◽  
Kim-Chuan Toh

This paper suggests two novel ideas to develop new proximal variable-metric methods for solving a class of composite convex optimization problems. The first idea is to utilize a new parameterization strategy of the optimality condition to design a class of homotopy proximal variable-metric algorithms that can achieve linear convergence and finite global iteration-complexity bounds. We identify at least three subclasses of convex problems to which our approach can be applied to achieve linear convergence rates. The second idea is a new primal-dual-primal framework for implementing proximal Newton methods that has attractive computational features for a subclass of nonsmooth composite convex minimization problems. We specialize the proposed algorithm to solve a covariance estimation problem in order to demonstrate its computational advantages. Numerical experiments on four concrete applications are given to illustrate the theoretical and computational advances of the new methods compared with other state-of-the-art algorithms.
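The sketch below (hypothetical names) isolates only the proximal variable-metric building block for a composite problem f(x) + lam*||x||_1: a gradient step scaled by a diagonal metric H followed by the correspondingly scaled soft-threshold prox. The paper's homotopy parameterization and primal-dual-primal proximal Newton framework are not reproduced here.

import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1, applied coordinate-wise."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_variable_metric(grad_f, x0, lam, H, iters=300):
    """Toy proximal variable-metric iteration: x+ = prox_g^H(x - H^{-1} grad f(x)),
    which for g = lam*||.||_1 and diagonal H is a scaled soft threshold."""
    x = x0.copy()
    for _ in range(iters):
        x = soft_threshold(x - grad_f(x) / H, lam / H)
    return x

# Usage on a small l1-regularized least-squares instance, f(x) = 0.5*||Ax - b||^2,
# with the crude but valid diagonal metric H = ||A||_2^2 * I.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(30, 10)), rng.normal(size=30)
H = np.full(10, np.linalg.norm(A, ord=2) ** 2)
x_hat = prox_variable_metric(lambda x: A.T @ (A @ x - b), np.zeros(10), lam=0.1, H=H)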


Author(s):  
Erick Delage ◽  
Ahmed Saif

Randomized decision making refers to the process of making decisions randomly according to the outcome of an independent randomization device, such as a dice roll or a coin flip. The concept is unconventional, and somewhat counterintuitive, in the domain of mathematical programming, in which deterministic decisions are usually sought even when the problem parameters are uncertain. However, it has recently been shown that using a randomized, rather than a deterministic, strategy in nonconvex distributionally robust optimization (DRO) problems can lead to improvements in their objective values. It is still unknown, though, what magnitude of improvement can be attained through randomization or how to numerically find the optimal randomized strategy. In this paper, we study the value of randomization in mixed-integer DRO problems and show that it is bounded by the improvement achievable through its continuous relaxation. Furthermore, we identify conditions under which the bound is tight. We then develop algorithmic procedures, based on column generation, for solving both single- and two-stage linear DRO problems with randomization that can be used with both moment-based and Wasserstein ambiguity sets. Finally, we apply the proposed algorithm to solve three classical discrete DRO problems: the assignment problem, the uncapacitated facility location problem, and the capacitated facility location problem, and report numerical results that show the quality of our bounds, the computational efficiency of the proposed solution method, and the magnitude of performance improvement achieved by randomized decisions. Summary of Contribution: In this paper, we present both theoretical results and algorithmic tools to identify optimal randomized strategies for discrete distributionally robust optimization (DRO) problems and evaluate the performance improvements that can be achieved when using them rather than classical deterministic strategies. On the theory side, we provide improvement bounds based on continuous relaxation and identify the conditions under which these bounds are tight. On the algorithmic side, we propose a finitely convergent, two-layer, column-generation algorithm that iterates between identifying feasible solutions and finding extreme realizations of the uncertain parameter. The proposed algorithm was implemented to solve distributionally robust stochastic versions of three classical optimization problems, and extensive numerical results are reported. The paper extends a previous, purely theoretical work of the first author on the idea of randomized strategies in nonconvex DRO problems by providing useful bounds and algorithms to solve such problems.
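A tiny worked example of the value of randomization (constructed purely for illustration, with an ambiguity set consisting of all distributions over two scenarios rather than the paper's moment-based or Wasserstein sets): with two discrete decisions whose worst cases are symmetric, every deterministic strategy has a worst-case cost of 2, while the 50/50 randomized strategy attains 1, which in this toy also coincides with the value of the continuous relaxation.

import numpy as np

# Cost of each (decision, scenario) pair.
C = np.array([[0.0, 2.0],    # decision a
              [2.0, 0.0]])   # decision b

# Deterministic strategies: the adversary concentrates on the worst scenario.
det_value = C.max(axis=1).min()                        # 2.0

# Randomized strategies: play decision a with probability p.
ps = np.linspace(0.0, 1.0, 101)
rand_values = [np.max(p * C[0] + (1 - p) * C[1]) for p in ps]
rand_value, best_p = min(rand_values), ps[int(np.argmin(rand_values))]

print(det_value, rand_value, best_p)                   # 2.0 1.0 0.5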


Author(s):  
Yuxin Ding

Traditional Hopfield networks have been widely used to solve combinatorial optimization problems. However, high order Hopfield networks, as an expansion of traditional Hopfield networks, are seldom used to solve combinatorial optimization problems. In theory, compared with low order networks, high order networks have better properties, such as stronger approximation capabilities and faster convergence rates. In this chapter, the authors focus on how to use high order networks to model combinatorial optimization problems. First, the high order discrete Hopfield network is introduced; then the authors discuss how to find the high order inputs of a neuron. Finally, the construction method of the energy function and the neural computing algorithm are presented. In this chapter, the N queens problem and the crossbar switch problem, which are NP-complete problems, are used as examples to illustrate how to model practical problems using high order neural networks. The authors also discuss the performance of high order networks for modeling the two combinatorial optimization problems.
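For concreteness, the sketch below (hypothetical names) writes down a pairwise, i.e. second-order, penalty energy for the N queens problem and a naive asynchronous descent that flips one neuron at a time; it only illustrates the energy-function modeling style, whereas the chapter's point is that higher-order networks can include products of three or more neurons and use a dedicated neural computing algorithm.

import numpy as np

def nqueens_energy(X, A=1.0, B=1.0):
    """Pairwise penalty energy for N queens: one queen per row and column,
    and no two queens sharing a diagonal (X is an n-by-n 0/1 matrix)."""
    e = A * np.sum((X.sum(axis=1) - 1) ** 2) + A * np.sum((X.sum(axis=0) - 1) ** 2)
    n = X.shape[0]
    for d in range(-n + 1, n):
        for diag in (np.diagonal(X, offset=d), np.diagonal(np.fliplr(X), offset=d)):
            k = diag.sum()
            e += B * k * (k - 1) / 2.0         # pairwise conflicts on each diagonal
    return e

def asynchronous_descent(n=8, iters=20000, seed=0):
    """Flip one neuron at a time, keeping the flip only if the energy does not
    increase; a toy solver that can stall in local minima."""
    rng = np.random.default_rng(seed)
    X = (rng.random((n, n)) < 1.0 / n).astype(float)
    for _ in range(iters):
        i, j = rng.integers(n), rng.integers(n)
        Y = X.copy()
        Y[i, j] = 1.0 - Y[i, j]
        if nqueens_energy(Y) <= nqueens_energy(X):
            X = Y
    return X, nqueens_energy(X)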

