Randomized sketch descent methods for non-separable linearly constrained optimization

IMA Journal of Numerical Analysis ◽

10.1093/imanum/draa018 ◽

2020 ◽

Author(s):

Ion Necoara ◽

Martin Takáč

Keyword(s):

Objective Function ◽

Large Scale ◽

Convergence Rates ◽

Optimization Problems ◽

Sufficient Conditions ◽

Linear Constraints ◽

Descent Methods ◽

Constrained Problems ◽

Special Cases ◽

Linearly Constrained

Abstract: In this paper we consider large-scale smooth optimization problems with multiple linear coupled constraints. Due to the non-separability of the constraints, arbitrary random sketching would not be guaranteed to work. Thus, we first investigate necessary and sufficient conditions for the sketch sampling to have well-defined algorithms. Based on these sampling conditions we develop new sketch descent methods for solving general smooth linearly constrained problems, in particular, random sketch descent (RSD) and accelerated random sketch descent (A-RSD) methods. To our knowledge, this is the first convergence analysis of RSD algorithms for optimization problems with multiple non-separable linear constraints. For the general case, when the objective function is smooth and non-convex, we prove a sublinear rate in expectation for the non-accelerated variant with respect to an appropriate optimality measure. In the smooth convex case, we derive sublinear convergence rates in the expected values of the objective function for both algorithms, RSD and A-RSD. Additionally, if the objective function satisfies a strong convexity type condition, both algorithms converge linearly in expectation. In special cases where complexity bounds are known for particular sketching algorithms, such as coordinate descent methods for optimization problems with a single linear coupled constraint, our theory recovers the best known bounds. Finally, we present several numerical examples to illustrate the performance of our new algorithms.


Numerical methods for large-scale nonlinear optimization

Acta Numerica ◽

10.1017/s0962492904000248 ◽

2005 ◽

Vol 14 ◽

pp. 299-361 ◽

Cited By ~ 67

Author(s):

Nick Gould ◽

Dominique Orban ◽

Philippe Toint

Keyword(s):

Numerical Methods ◽

Nonlinear Optimization ◽

Large Scale ◽

Optimization Problems ◽

State Of The Art ◽

Constrained Problems ◽

Recent Developments ◽

Linearly Constrained ◽

Linearly Constrained Problems

Recent developments in numerical methods for solving large differentiable nonlinear optimization problems are reviewed. State-of-the-art algorithms for solving unconstrained, bound-constrained, linearly constrained and non-linearly constrained problems are discussed. As well as important conceptual advances and theoretical aspects, emphasis is also placed on more practical issues, such as software availability.


Nonlinear Optimal Design of Dynamic Mechanical Systems

22nd Biennial Mechanisms Conference: Mechanism Design and Synthesis ◽

10.1115/detc1992-0350 ◽

1992 ◽

Author(s):

T. E. Potter ◽

K. D. Willmert ◽

M. Sathyamoorthy

Keyword(s):

Objective Function ◽

Optimization Problems ◽

Iteration Process ◽

Optimization Method ◽

Linear Constraints ◽

Sum Of Squares ◽

Function Evaluation ◽

Deformation Analysis ◽

Nonlinear Constraints ◽

Quadratic Constraints

Abstract: Mechanism path generation problems which use link deformations to improve the design lead to optimization problems involving a nonlinear sum-of-squares objective function subject to a set of linear and nonlinear constraints. Inclusion of the deformation analysis causes the objective function evaluation to be computationally expensive. An optimization method is presented which requires relatively few objective function evaluations. The algorithm, based on the Gauss method for unconstrained problems, is developed as an extension of the Gauss constrained technique for linear constraints and revises the Gauss nonlinearly constrained method for quadratic constraints. The derivation of the algorithm, using a Lagrange multiplier approach, is based on the Kuhn-Tucker conditions so that when the iteration process terminates, these conditions are automatically satisfied. Although the technique was developed for mechanism problems, it is applicable to any optimization problem having the form of a sum-of-squares objective function subject to nonlinear constraints.


The Strength of Nesterov's Extrapolation

10.36227/techrxiv.11653218.v1 ◽

2020 ◽

Author(s):

Qing Tao

Keyword(s):

Machine Learning ◽

Large Scale ◽

Convergence Rates ◽

Optimization Problems ◽

Gradient Methods ◽

Learning Problems ◽

Smooth Convex ◽

Simple Modification ◽

Convex Problems ◽

Hinge Loss

The extrapolation strategy proposed by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude when dealing with smooth convex objectives, has led to tremendous success in training machine learning tasks. In this paper, we theoretically study its strength in the convergence of individual iterates of general non-smooth convex optimization problems, which we name individual convergence. We prove that Nesterov's extrapolation is capable of making the individual convergence of projected gradient methods optimal for general convex problems, which is now a challenging problem in the machine learning community. In light of this consideration, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step towards the open question about SGD posed by Shamir. Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD, especially in coping with machine learning problems where an individual output is needed to guarantee the regularization structure while keeping an optimal rate of convergence. Typically, our method is applicable as an efficient tool for solving large-scale $l_1$-regularized hinge-loss learning problems. Several real experiments demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
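As a rough illustration of the extrapolation step this abstract refers to, the sketch below applies the textbook Nesterov momentum scheme to a smooth convex quadratic. It is not the paper's individual-convergence algorithm for non-smooth problems; the test function, the step size, the momentum schedule, and the names `nesterov_gd` and `grad` are all assumptions chosen for illustration.

```python
def nesterov_gd(grad, x0, step, iters):
    """Gradient descent with Nesterov extrapolation (textbook smooth convex variant)."""
    x_prev = list(x0)
    x = list(x0)
    for k in range(1, iters + 1):
        beta = (k - 1) / (k + 2)  # a common momentum schedule (assumption)
        # Extrapolated point: y = x + beta * (x - x_prev)
        y = [xi + beta * (xi - pi) for xi, pi in zip(x, x_prev)]
        g = grad(y)
        x_prev = x
        # Plain gradient step taken from the extrapolated point
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x


# Hypothetical test problem: f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2), minimized at (0, 0).
def grad(x):
    return [x[0], 10.0 * x[1]]


# Step size 1/L with L = 10, the largest curvature of the quadratic.
x_final = nesterov_gd(grad, [5.0, -3.0], step=0.1, iters=200)
```

The only difference from plain gradient descent is that the gradient is evaluated at the extrapolated point `y` rather than at the current iterate `x`.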


A Sequential Approximation Method for Structural Optimization Using Logarithmic Barriers

19th Design Automation Conference: Volume 1 — Mechanical System Dynamics; Concurrent and Robust Design; Design for Assembly and Manufacture; Genetic Algorithms in Design and Structural Optimization ◽

10.1115/detc1993-0366 ◽

1993 ◽

Author(s):

Ashok V. Kumar ◽

David C. Gossard

Keyword(s):

Structural Optimization ◽

Objective Function ◽

Line Search ◽

Optimization Problems ◽

Approximation Technique ◽

Barrier Method ◽

Non Linear Programming ◽

Linearly Constrained ◽

Sequential Approximation ◽

And Function

Abstract: A sequential approximation technique for non-linear programming is presented here that is particularly suited for problems in engineering design and structural optimization, where the number of variables is very large and function and sensitivity evaluations are computationally expensive. A sequence of sub-problems is iteratively generated using a linear approximation for the objective function and setting move limits on the variables using a barrier method. These sub-problems are strictly convex. Computation per iteration is significantly reduced by not solving the sub-problems exactly. Instead, at each iteration, a few Newton steps are taken for the sub-problem. A criterion for moving the move limit is described that reduces or eliminates step size reduction during line search. The method was found to perform well for unconstrained and linearly constrained optimization problems. It requires very few function evaluations, does not require the Hessian of the objective function and evaluates its gradient only once per iteration.


Asynchronous Delay-Aware Accelerated Proximal Coordinate Descent for Nonconvex Nonsmooth Problems

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33011528 ◽

2019 ◽

Vol 33 ◽

pp. 1528-1535

Author(s):

Ehsan Kazemi ◽

Liqiang Wang

Keyword(s):

Large Scale ◽

Convergence Rates ◽

Optimization Problems ◽

Coordinate Descent ◽

Performance Guarantee ◽

Nonsmooth Problems ◽

Descent Property ◽

Large Scale Problems ◽

Nonconvex And Nonsmooth Optimization ◽

Sufficient Descent

Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods for nonconvex and nonsmooth optimization problems with performance guarantees remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, asynchronous proximal coordinate descent (APCD) has recently received much attention for solving large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this paper, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that for both bounded and unbounded delays every limit point is a critical point. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical efficiency of our algorithm in speed.


Convergence Analysis on an Accelerated Proximal Point Algorithm for Linearly Constrained Optimization Problems

Mathematical Problems in Engineering ◽

10.1155/2020/8873507 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Sha Lu ◽

Zengxin Wei

Keyword(s):

Machine Learning ◽

Linear Programming ◽

Proximal Point Algorithm ◽

Original Problem ◽

Optimization Problems ◽

Linear Constraints ◽

Proximal Point ◽

Constrained Optimization Problems ◽

Linearly Constrained ◽

Linear Programming Problems

Proximal point algorithm is a type of method widely used in solving optimization problems and some practical problems such as machine learning in recent years. In this paper, a framework of accelerated proximal point algorithm is presented for convex minimization with linear constraints. The algorithm can be seen as an extension to Güler’s methods for unconstrained optimization and linear programming problems. We prove that the sequence generated by the algorithm converges to a KKT solution of the original problem under appropriate conditions with the convergence rate of $O(1/k^2)$.


Novel Interior Point Algorithms for Solving Nonlinear Convex Optimization Problems

Advances in Operations Research ◽

10.1155/2015/487271 ◽

2015 ◽

Vol 2015 ◽

pp. 1-7

Author(s):

Sakineh Tahmasebzadeh ◽

Hamidreza Navidi ◽

Alaeddin Malek

Keyword(s):

Convex Optimization ◽

Objective Function ◽

Interior Point ◽

Optimization Problems ◽

Numerical Algorithms ◽

Optimal Solution ◽

Linear Constraints ◽

Convex Optimization Problems ◽

Novel Algorithms ◽

Interior Point Technique

This paper proposes three numerical algorithms based on Karmarkar’s interior point technique for solving nonlinear convex programming problems subject to linear constraints. The first algorithm uses the Karmarkar idea and linearization of the objective function. The second and third algorithms are modifications of the first algorithm using the Schrijver and Malek-Naseri approaches, respectively. These three novel schemes are tested against the algorithm of Kebiche-Keraghel-Yassine (KKY). It is shown that these three novel algorithms are more efficient and converge to the correct optimal solution, while the KKY algorithm fails in some cases. Numerical results are given to illustrate the performance of the proposed algorithms.


Faster convergence of a randomized coordinate descent method for linearly constrained optimization problems

Analysis and Applications ◽

10.1142/s0219530518500082 ◽

2018 ◽

Vol 16 (05) ◽

pp. 741-755 ◽

Cited By ~ 3

Author(s):

Qin Fang ◽

Min Xu ◽

Yiming Ying

Keyword(s):

Constrained Optimization ◽

Optimization Problems ◽

Descent Method ◽

Coordinate Descent ◽

Descent Methods ◽

Coordinate Descent Method ◽

Linearly Constrained Optimization ◽

Main Challenge ◽

Linearly Constrained ◽

Coupled Constraints

The problem of minimizing a separable convex function under linearly coupled constraints arises from various application domains such as economic systems, distributed control, and network flow. The main challenge in solving this problem is that the size of the data is very large, which makes the usual gradient-based methods infeasible. Recently, Necoara, Nesterov and Glineur [Random block coordinate descent methods for linearly constrained optimization over networks, J. Optim. Theory Appl. 173(1) (2017) 227–254] proposed an efficient randomized coordinate descent method to solve this type of optimization problem and presented an appealing convergence analysis. In this paper, we develop new techniques to analyze the convergence of such algorithms, which greatly improve the results presented in that work. This refined result is achieved by extending Nesterov’s second technique [Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM J. Optim. 22 (2012) 341–362] to general optimization problems with linearly coupled constraints. A novel technique in our analysis is to establish the basis vectors for the subspace of the linear constraints.
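To make the problem class in this abstract concrete, here is a minimal sketch of randomized pair-coordinate descent for a separable quadratic under a single coupling constraint, sum(x) = b. It is not the method of the cited papers: the quadratic objective, the exact pairwise minimization, and the names `random_pair_cd`, `a`, and `c` are assumptions made for illustration. Updating two coordinates by +t and -t keeps the iterate feasible at every step.

```python
import random


def random_pair_cd(a, c, x, iters, seed=0):
    """Minimize sum_i 0.5 * a[i] * (x[i] - c[i])**2 subject to sum(x) staying fixed.

    Each iteration picks a random pair (i, j) and moves x[i] += t, x[j] -= t,
    so the coupling constraint is preserved exactly (up to rounding).
    """
    rng = random.Random(seed)
    x = list(x)
    n = len(x)
    for _ in range(iters):
        i, j = rng.sample(range(n), 2)
        g_i = a[i] * (x[i] - c[i])  # partial derivative w.r.t. x[i]
        g_j = a[j] * (x[j] - c[j])  # partial derivative w.r.t. x[j]
        # Exact minimizer of the two-variable restricted problem
        t = (g_j - g_i) / (a[i] + a[j])
        x[i] += t
        x[j] -= t
    return x


# Hypothetical data; the starting point is feasible for b = 1.
a = [1.0, 2.0, 4.0, 8.0]
c = [1.0, -1.0, 0.5, 2.0]
x0 = [0.25, 0.25, 0.25, 0.25]
x = random_pair_cd(a, c, x0, iters=2000)
grads = [ai * (xi - ci) for ai, xi, ci in zip(a, x, c)]
```

At a constrained minimizer all partial derivatives are equal (their common value is the Lagrange multiplier of the coupling constraint), so the spread of `grads` serves as a simple optimality check.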


J.-L. Lagrange as one of the founders of the theory of extrema of functions of many variables

RADIOELECTRONIC AND COMPUTER SYSTEMS ◽

10.32620/reks.2020.1.10 ◽

2020 ◽

pp. 103-111

Author(s):

Ольга Михайловна Прохорова

Keyword(s):

Mathematical Analysis ◽

Quadratic Forms ◽

Optimization Problems ◽

Sufficient Conditions ◽

Variational Calculus ◽

Physical Content ◽

Least Action ◽

Conditional Extremum ◽

Special Cases ◽

Mathematical Foundations

This article presents a detailed analysis of the results obtained by J.-L. Lagrange in his first printed work. The theory of extrema of functions of many variables, as part of mathematical analysis, belongs to the mathematical foundations of operations research. In turn, many optimization problems are in fact problems of finding a conditional extremum of a function of many variables. The topic is relevant because methods for solving extremum problems for functions of many variables that were obtained between the mid-18th and early 20th centuries are still used to solve modern problems. L. Euler and J.-L. Lagrange occupy a special place here. The aim of the article is to study the conditions for maxima and minima of functions of many variables obtained by J.-L. Lagrange and to compare his results with the presentation of this topic in modern textbooks on higher mathematics and mathematical analysis. It is established that in his first printed work he was the first to formulate and prove sufficient conditions for the existence of an extremum of a function of many variables, in effect establishing a criterion for the positive (negative) definiteness of quadratic forms long before it appeared in the work of J. Sylvester in the mid-19th century. A comparative analysis of the results of L. Euler and J.-L. Lagrange is carried out. It is found that the sufficient conditions for the existence of an extremum of functions of many variables obtained in the first printed work of the young Lagrange are included in all modern textbooks in virtually the same form. The examples shown illustrate his theory; they are problems with geometric and physical content. Special cases are considered in detail: functions of two and three variables. It is noted that this article became programmatic for the young Lagrange, although it remained unnoticed by his contemporaries. Subsequently, based on the method he obtained, he created the variational calculus and, using the principle of least action and the theory of extrema, derived the basic laws of mechanics as well as the rule of multipliers for finding the conditional extremum of functions of many variables, which is named after him.


A Sequential Optimization Algorithm Using Logarithmic Barriers: Applications to Structural Optimization

Journal of Mechanical Design ◽

10.1115/1.1288363 ◽

1999 ◽

Vol 122 (3) ◽

pp. 271-277 ◽

Cited By ~ 7

Author(s):

Ashok V. Kumar

Keyword(s):

Structural Optimization ◽

Objective Function ◽

Line Search ◽

Optimization Problems ◽

Sequential Optimization ◽

Barrier Method ◽

Step Size ◽

Linearly Constrained ◽

Sequential Approximation ◽

And Function

A sequential approximation algorithm is presented here that is particularly suited for problems in engineering design and structural optimization, where the number of variables is very large and function and sensitivity evaluations are computationally expensive. A sequence of sub-problems is generated using a linear approximation for the objective function and setting move limits on the variables using a barrier method. These sub-problems are strictly convex and computation per iteration is significantly reduced by not solving the sub-problems exactly. Instead, a few Newton steps are taken for each sub-problem generated. A criterion for setting the move limit is described that reduces or eliminates step size reduction during line search. The method was found to perform well for unconstrained and linearly constrained optimization problems. It is particularly suitable for application to design of optimal shape and topology of structures by minimizing their compliance since it requires very few function evaluations, does not require the Hessian of the objective function and evaluates its gradient only once for every sub-problem generated. [S1050-0472(00)01603-2]
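The abstract does not give the sub-problem in closed form, so the following is only one plausible reading, sketched for a single variable: a linear model `g * d` of the objective plus a logarithmic barrier that keeps the step `d` inside a move limit `(-delta, delta)`, solved approximately with a few safeguarded Newton steps. The function name `barrier_step`, the barrier weight `mu`, and the step-halving safeguard are assumptions, not details from the paper.

```python
def barrier_step(g, delta, mu, newton_steps=3):
    """Approximately minimize phi(d) = g*d - mu*(log(delta - d) + log(delta + d)).

    phi is strictly convex on (-delta, delta), so Newton's method is well defined.
    Only a few Newton steps are taken; the sub-problem is not solved exactly.
    """
    d = 0.0
    for _ in range(newton_steps):
        first = g + mu / (delta - d) - mu / (delta + d)          # phi'(d)
        second = mu / (delta - d) ** 2 + mu / (delta + d) ** 2   # phi''(d) > 0
        step = -first / second
        # Safeguard (assumption): halve the step until it stays inside the move limit.
        while abs(d + step) >= delta:
            step *= 0.5
        d += step
    return d


# Hypothetical usage: a positive gradient entry yields a negative, bounded step.
d = barrier_step(g=2.0, delta=1.0, mu=0.5)
```

Because the linear model and this barrier are both separable, each variable could be handled independently in this simplified form, and the objective's gradient is needed only once per sub-problem.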
