The SEISCOPE optimization toolbox: A large-scale nonlinear optimization library based on reverse communication

Geophysics ◽  
2016 ◽  
Vol 81 (2) ◽  
pp. F1-F15 ◽  
Author(s):  
Ludovic Métivier ◽  
Romain Brossier

The SEISCOPE optimization toolbox is a set of FORTRAN 90 routines, which implement first-order methods (steepest-descent and nonlinear conjugate gradient) and second-order methods (l-BFGS and truncated Newton) for the solution of large-scale nonlinear optimization problems. An efficient line-search strategy ensures the robustness of these implementations. The routines are provided as black boxes that are easy to interface with any computational code in which such large-scale minimization problems have to be solved. Traveltime tomography, least-squares migration, and full-waveform inversion are examples of such problems in the context of geophysics. Integrating the toolbox to solve this class of problems offers two advantages. First, thanks to the reverse communication protocol, it helps separate the routines that depend on the physics of the problem from those related to the minimization itself. This enhances flexibility in code development and maintenance. Second, it allows us to switch easily between different optimization algorithms. In particular, it reduces the complexity related to the implementation of second-order methods. Because the latter benefit from faster convergence rates than first-order methods, significant savings in computational effort can be expected.
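To illustrate the reverse communication idea, the sketch below shows a minimal Python driver in which the optimizer never calls user code directly: it returns a flag asking the caller to evaluate the misfit and gradient, then is called again with the result. The routine names, flag values, and toy quadratic misfit are hypothetical illustrations of the pattern, not the actual SEISCOPE FORTRAN 90 interface.

```python
import numpy as np

# Minimal sketch of the reverse communication pattern (hypothetical names, not
# the actual SEISCOPE FORTRAN 90 interface): the optimizer returns a flag asking
# the caller to evaluate the misfit and gradient, and is then called again.

def rc_steepest_descent(x, fcost, grad, state):
    """One reverse-communication call of a fixed-step steepest descent.
    Returns (flag, x): 'NEW_GRAD' asks the caller for f and g at x,
    'CONV' signals convergence. 'fcost' would drive a real line search."""
    if np.linalg.norm(grad) < state["tol"]:
        return "CONV", x
    return "NEW_GRAD", x - state["alpha"] * grad

# Caller side: the "physics" (here a toy quadratic misfit) stays outside the optimizer.
def misfit_and_gradient(x):
    return 0.5 * np.dot(x, x), x          # f = ||x||^2 / 2, g = x

x = np.array([3.0, -2.0])
state = {"alpha": 0.5, "tol": 1e-8}
fcost, grad = misfit_and_gradient(x)
flag = "NEW_GRAD"
while flag != "CONV":
    flag, x = rc_steepest_descent(x, fcost, grad, state)
    if flag == "NEW_GRAD":
        fcost, grad = misfit_and_gradient(x)   # user code answers the optimizer's request
print("minimizer:", x)
```

In this arrangement the minimization loop and the physics code exchange only vectors and a flag, which is what makes it easy to swap one optimization algorithm for another.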

Geophysics ◽  
2021 ◽  
pp. 1-147
Author(s):  
Peng Yong ◽  
Romain Brossier ◽  
Ludovic Métivier

In order to exploit Hessian information in full-waveform inversion (FWI), the matrix-free truncated Newton method can be used. In such a method, Hessian-vector product computation is one of the major concerns because of its huge memory requirements and demanding computational cost. Using the adjoint-state method, the Hessian-vector product can be estimated by zero-lag cross-correlation of the first-order/second-order incident wavefields and the second-order/first-order adjoint wavefields. Unlike the frequency-domain FWI implementation, Hessian-vector product construction in the time domain is much more challenging because storing the entire time-dependent wavefields is not affordable. The widely used wavefield recomputation strategy leads to computationally intensive tasks. We present an efficient alternative approach to computing the Hessian-vector product for time-domain FWI. In our method, a discrete Fourier transform is applied to extract frequency-domain components of the involved wavefields, which are used to compute the wavefield cross-correlation in the frequency domain. This makes it possible to avoid reconstructing the first-order and second-order incident wavefields. In addition, a full-scattered-field approximation is proposed to efficiently simplify the computation of the second-order incident and adjoint wavefields, which spares us from repeatedly solving the first-order incident and adjoint equations to (re)compute the second-order incident and adjoint wavefields. With the proposed method, the computational time can be reduced by 70% and 80% in viscous media for Gauss-Newton and full-Newton Hessian-vector product construction, respectively. The effectiveness of our method is also verified in the frame of a 2D multi-parameter inversion, in which the proposed method nearly reaches the same iterative convergence as the conventional time-domain implementation.
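As a rough illustration of the discrete Fourier transform strategy, the Python sketch below accumulates the Fourier components of two wavefields on the fly during the time loop (so the full time histories are never stored) and then forms their zero-lag cross-correlation from the retained frequency components. The grid size, frequencies, and random toy wavefields are illustrative assumptions; a real FWI code would obtain the wavefield snapshots from its wave-equation solver.

```python
import numpy as np

# On-the-fly discrete Fourier transform of two wavefields during the time loop,
# followed by a zero-lag cross-correlation built from the retained frequency
# components. Grid size, frequencies, and the random toy wavefields are
# illustrative; a real FWI code would take the snapshots from its wave solver.

nt, dt = 2000, 1e-3                      # number of time steps, time step (s)
nx = 500                                 # hypothetical number of grid points
freqs = np.array([5.0, 10.0, 15.0])      # frequencies (Hz) kept for the correlation

u_hat = np.zeros((len(freqs), nx), dtype=complex)   # "incident" wavefield spectrum
v_hat = np.zeros((len(freqs), nx), dtype=complex)   # "adjoint" wavefield spectrum

rng = np.random.default_rng(0)
for it in range(nt):
    t = it * dt
    u_t = rng.standard_normal(nx)        # stand-ins for one solver time step
    v_t = rng.standard_normal(nx)
    phase = np.exp(-2j * np.pi * freqs * t)[:, None] * dt
    u_hat += phase * u_t                 # running DFT: no full time history stored
    v_hat += phase * v_t

# Zero-lag cross-correlation approximated from the retained frequencies
# (real part of the conjugate product, summed over frequencies, pointwise in space).
zero_lag = np.real(np.sum(np.conj(u_hat) * v_hat, axis=0))
print(zero_lag.shape)
```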


Author(s):  
Qiushi Cao ◽  
Prakash Krishnaswami

Abstract The vast majority of applied optimization falls into the category of first-order optimization. This paper attempts to make the case for increased use of second-order optimization techniques. Some of the most serious criticisms leveled against second-order methods are discussed and are shown to have lost some of their validity in recent years. In addition, some positive advantages of second-order methods are also presented. These advantages include computational efficiency, compatibility with new advances in hardware, and spill-over benefits in areas such as minimum-sensitivity design. A simple second-order constrained optimization algorithm is developed and several examples are solved using this method. A comparison is made with first-order methods in terms of the number of function evaluations. The results show that the second-order method performs much better than the first-order methods in this regard. The paper also suggests some directions for future research in second-order optimization.
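As a simple illustration of the kind of comparison discussed here (not the authors' constrained algorithm), the Python sketch below runs Newton's method and fixed-step gradient descent on the Rosenbrock function and reports the number of gradient evaluations each needs; the starting point, step size, and tolerances are illustrative choices.

```python
import numpy as np

# Illustrative comparison (not the authors' constrained algorithm): Newton's
# method versus fixed-step gradient descent on the Rosenbrock function,
# counting the gradient evaluations each needs to reach the tolerance.

def grad(x):
    return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                     200 * (x[1] - x[0]**2)])

def hess(x):
    return np.array([[2 - 400 * x[1] + 1200 * x[0]**2, -400 * x[0]],
                     [-400 * x[0], 200.0]])

def newton(x, tol=1e-8, max_iter=100):
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        x = x - np.linalg.solve(hess(x), g)    # second-order (curvature) step
    return x, max_iter

def gradient_descent(x, step=1e-3, tol=1e-8, max_iter=500_000):
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        x = x - step * g                       # first-order step
    return x, max_iter

x0 = np.array([-1.2, 1.0])
print("Newton:          ", newton(x0.copy()))
print("Gradient descent:", gradient_descent(x0.copy()))
```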


Author(s):  
Hao Luo ◽  
Long Chen

Abstract Convergence analysis of accelerated first-order methods for convex optimization problems is developed from the point of view of ordinary differential equation solvers. A new dynamical system, called the Nesterov accelerated gradient (NAG) flow, is derived from the connection between the acceleration mechanism and the A-stability of ODE solvers, and the exponential decay of a tailored Lyapunov function along the solution trajectory is proved. Numerical discretizations of the NAG flow are then considered, and convergence rates are established via a discrete Lyapunov function. The proposed differential-equation-solver approach not only covers existing accelerated methods, such as FISTA, Güler's proximal algorithm, and Nesterov's accelerated gradient method, but also produces new algorithms for composite convex optimization that possess accelerated convergence rates. Both the convex and the strongly convex cases are handled in a unified way in our approach.
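For reference, the sketch below implements the classical Nesterov accelerated gradient iteration, one of the discretizations covered by this ODE-solver viewpoint, on a least-squares toy problem; the data, step size, and momentum schedule are standard textbook choices rather than anything specific to the paper.

```python
import numpy as np

# Classical Nesterov accelerated gradient iteration on a least-squares toy
# problem; data, step size 1/L, and momentum schedule are standard choices.

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 20))
b = rng.standard_normal(50)
L = np.linalg.norm(A, 2)**2              # Lipschitz constant of the gradient

def f(x):
    return 0.5 * np.linalg.norm(A @ x - b)**2

def grad(x):
    return A.T @ (A @ x - b)

x = y = np.zeros(20)
t = 1.0
for k in range(200):
    x_new = y - grad(y) / L                       # gradient step from the extrapolated point
    t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t**2))
    y = x_new + (t - 1.0) / t_new * (x_new - x)   # Nesterov extrapolation
    x, t = x_new, t_new

print("final objective:", f(x))
```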


1999 ◽  
Vol 9 (3) ◽  
pp. 755-778 ◽  
Author(s):  
Paul T. Boggs ◽  
Anthony J. Kearsley ◽  
Jon W. Tolle

2021 ◽  
Vol 11 (8) ◽  
pp. 3430
Author(s):  
Erik Cuevas ◽  
Héctor Becerra ◽  
Héctor Escobar ◽  
Alberto Luque-Chang ◽  
Marco Pérez ◽  
...  

Recently, several new metaheuristic schemes have been introduced in the literature. Although these approaches invoke very different phenomena as metaphors, the search patterns used to explore the search space are quite similar. On the other hand, second-order systems are models that present different temporal behaviors depending on the values of their parameters. Such temporal behaviors can be conceived as search patterns with multiple behaviors and simple configurations. In this paper, a set of new search patterns is introduced to explore the search space efficiently. They emulate the response of a second-order system. The proposed set of search patterns has been integrated as a complete search strategy, called the Second-Order Algorithm (SOA), to obtain the global solution of complex optimization problems. To analyze the performance of the proposed scheme, it has been compared on a set of representative optimization problems, including multimodal, unimodal, and hybrid benchmark formulations. Numerical results demonstrate that the proposed SOA method exhibits remarkable performance in terms of accuracy and convergence rate.
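The sketch below illustrates the underlying idea on a single coordinate: the step response of a damped second-order system traces different trajectories toward a target point (oscillatory, critically damped, or overdamped) depending on the damping ratio, and such trajectories can act as search patterns. The parameters, the integrator, and the target are illustrative assumptions; this is not the SOA algorithm itself.

```python
import numpy as np

# Step response of a damped second-order system used as a search trajectory
# toward a target point. The damping ratio zeta controls whether the approach
# is oscillatory, critically damped, or overdamped. Parameters and the target
# are illustrative; this is not the SOA algorithm itself.

def second_order_trajectory(x0, target, zeta, wn=1.0, dt=0.05, steps=200):
    """Integrate x'' + 2*zeta*wn*x' + wn^2*(x - target) = 0 (semi-implicit Euler)."""
    x, v = float(x0), 0.0
    path = [x]
    for _ in range(steps):
        a = wn**2 * (target - x) - 2.0 * zeta * wn * v
        v += dt * a
        x += dt * v
        path.append(x)
    return np.array(path)

for zeta in (0.2, 1.0, 2.0):             # underdamped, critically damped, overdamped
    path = second_order_trajectory(x0=0.0, target=1.0, zeta=zeta)
    print(f"zeta={zeta}: final position {path[-1]:.3f}, peak {path.max():.3f}")
```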


2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Darae Jeong ◽  
Yibao Li ◽  
Chaeyoung Lee ◽  
Junxiang Yang ◽  
Yongho Choi ◽  
...  

In this paper, we propose a verification method for the convergence rates of numerical solutions of parabolic equations. Specifically, we consider the numerical convergence rates of the heat equation, the Allen–Cahn equation, and the Cahn–Hilliard equation. The convergence tests show that if we refine the spatial and temporal steps at the same time, we obtain the second-order convergence rate for the second-order scheme. However, for a scheme that is first-order in time and second-order in space, we may observe either first-order or second-order convergence rates depending on the starting spatial and temporal step sizes. Therefore, for a rigorous numerical convergence test, we need to perform the spatial and the temporal convergence tests separately.
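A minimal version of such a convergence test is sketched below for the 1D heat equation with a backward-Euler-in-time, centered-second-order-in-space scheme: the spatial and temporal steps are halved together and the observed order is estimated from successive errors against the exact solution. The step sizes and final time are illustrative choices, not those of the paper.

```python
import numpy as np

# Convergence-rate test for the 1D heat equation u_t = u_xx with a backward-
# Euler-in-time, centered-second-order-in-space scheme. The exact solution is
# u = exp(-pi^2 t) sin(pi x) on [0, 1] with homogeneous Dirichlet boundaries.
# Step sizes and the final time are illustrative choices.

def max_error(nx, nt, T=0.1):
    h, dt = 1.0 / nx, T / nt
    x = np.linspace(0.0, 1.0, nx + 1)
    u = np.sin(np.pi * x)                              # initial condition
    # Backward Euler: (I - dt*D2) u^{n+1} = u^n on the interior nodes,
    # with D2 the centered second-difference operator.
    main = (1.0 + 2.0 * dt / h**2) * np.ones(nx - 1)
    off = (-dt / h**2) * np.ones(nx - 2)
    M = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    for _ in range(nt):
        u[1:-1] = np.linalg.solve(M, u[1:-1])
    exact = np.exp(-np.pi**2 * T) * np.sin(np.pi * x)
    return np.max(np.abs(u - exact))

# Halve h and dt together and report the observed order log2(e_coarse / e_fine).
errors = [max_error(n, n) for n in (16, 32, 64, 128)]
for e0, e1 in zip(errors, errors[1:]):
    print(f"observed rate: {np.log2(e0 / e1):.2f}")
```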


2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude when dealing with smooth convex objectives, has led to tremendous success in training machine learning models. In this paper, we theoretically study its strength for the convergence of the individual iterates of general non-smooth convex optimization problems, which we name individual convergence. We prove that Nesterov's extrapolation is capable of making the individual convergence of projected gradient methods optimal for general convex problems, which is currently a challenging problem in the machine learning community. In light of this consideration, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as an interesting step towards the open question about SGD posed by Shamir (2012). Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD, especially for machine learning problems where an individual output is needed to guarantee the regularization structure while keeping an optimal rate of convergence. Typically, our method is applicable as an efficient tool for solving large-scale l1-regularized hinge-loss learning problems. Several real experiments demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also guarantee better sparsity than the averaged solution.
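As a loose illustration (not the authors' algorithm), the sketch below applies a Nesterov-style extrapolation to a subgradient method on an l1-regularized hinge-loss problem and reports how many entries of the last (individual) iterate and of the averaged iterate exceed a small threshold. The synthetic data, step sizes, momentum schedule, and threshold are all assumptions.

```python
import numpy as np

# Illustrative sketch (not the authors' algorithm): a subgradient method with
# Nesterov-style extrapolation on an l1-regularized hinge-loss problem,
# counting how many entries of the last (individual) iterate and of the
# averaged iterate exceed a small threshold. Data and steps are assumptions.

rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0
y = np.sign(X @ w_true + 0.1 * rng.standard_normal(n))
lam = 0.1

def subgrad(w):
    """Subgradient of (1/n) * sum(max(0, 1 - y_i x_i^T w)) + lam * ||w||_1."""
    margins = y * (X @ w)
    g_hinge = -(X * y[:, None])[margins < 1].sum(axis=0) / n
    return g_hinge + lam * np.sign(w)

w = np.zeros(d)
w_prev = np.zeros(d)
w_avg = np.zeros(d)
for t in range(1, 2001):
    v = w + (t - 1) / (t + 2) * (w - w_prev)       # Nesterov-style extrapolation
    w_prev = w
    w = v - subgrad(v) / np.sqrt(t)                # diminishing step size
    w_avg += (w - w_avg) / t                       # running average of the iterates

print("entries above 1e-3, last iterate:    ", int(np.sum(np.abs(w) > 1e-3)))
print("entries above 1e-3, averaged iterate:", int(np.sum(np.abs(w_avg) > 1e-3)))
```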


Author(s):  
Ion Necoara ◽  
Martin Takáč

Abstract In this paper we consider large-scale smooth optimization problems with multiple linear coupled constraints. Due to the non-separability of the constraints, arbitrary random sketching is not guaranteed to work. Thus, we first investigate necessary and sufficient conditions on the sketch sampling for the resulting algorithms to be well defined. Based on these sampling conditions we develop new sketch descent methods for solving general smooth linearly constrained problems, in particular, the random sketch descent (RSD) and the accelerated random sketch descent (A-RSD) methods. To our knowledge, this is the first convergence analysis of RSD algorithms for optimization problems with multiple non-separable linear constraints. For the general case, when the objective function is smooth and non-convex, we prove a sublinear rate in expectation for the non-accelerated variant with respect to an appropriate optimality measure. In the smooth convex case, we derive for both algorithms, RSD and A-RSD, sublinear convergence rates in the expected values of the objective function. Additionally, if the objective function satisfies a strong-convexity-type condition, both algorithms converge linearly in expectation. In special cases where complexity bounds are known for particular sketching algorithms, such as coordinate descent methods for optimization problems with a single linear coupled constraint, our theory recovers the best known bounds. Finally, we present several numerical examples to illustrate the performance of our new algorithms.
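The special case mentioned at the end of the abstract can be sketched compactly: for a smooth problem with the single coupled constraint sum(x) = 1, picking two coordinates at random and moving along e_i - e_j keeps the iterate feasible. The quadratic objective, problem size, and exact line search below are illustrative assumptions, not the RSD/A-RSD methods themselves.

```python
import numpy as np

# Random 2-coordinate descent for a strongly convex quadratic under the single
# coupled constraint sum(x) = 1: moving along e_i - e_j keeps the iterate
# feasible. Objective, size, and exact line search are illustrative; this is
# the simple special case, not the general RSD/A-RSD methods.

rng = np.random.default_rng(0)
d = 30
Q = rng.standard_normal((d, d))
Q = Q @ Q.T + np.eye(d)                      # symmetric positive definite
c = rng.standard_normal(d)

def f(x):
    return 0.5 * x @ Q @ x + c @ x

x = np.ones(d) / d                           # feasible start: sum(x) = 1
for it in range(5000):
    i, j = rng.choice(d, size=2, replace=False)
    g = Q @ x + c                            # full gradient (cheap at this toy size)
    # Exact minimization of f(x + t*(e_i - e_j)) over the scalar t.
    denom = Q[i, i] - 2.0 * Q[i, j] + Q[j, j]
    t = -(g[i] - g[j]) / denom
    x[i] += t
    x[j] -= t

print("objective:", f(x), " constraint residual:", abs(x.sum() - 1.0))
```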


Author(s):  
Ehsan Kazemi ◽  
Liqiang Wang

Nonconvex and nonsmooth problems have recently attracted considerable attention in machine learning. However, developing efficient methods with performance guarantees for nonconvex and nonsmooth optimization problems remains a challenge. Proximal coordinate descent (PCD) has been widely used for solving optimization problems, but knowledge of PCD methods in the nonconvex setting is very limited. On the other hand, asynchronous proximal coordinate descent (APCD) methods have recently received much attention for solving large-scale problems. However, accelerated variants of APCD algorithms are rarely studied. In this paper, we extend the APCD method to an accelerated algorithm (AAPCD) for nonsmooth and nonconvex problems that satisfies the sufficient descent property, by comparing the function values at the proximal update and at a linearly extrapolated point using a delay-aware momentum value. To the best of our knowledge, we are the first to provide stochastic and deterministic accelerated extensions of APCD algorithms for general nonconvex and nonsmooth problems, ensuring that for both bounded and unbounded delays every limit point is a critical point. By leveraging the Kurdyka-Łojasiewicz property, we show linear and sublinear convergence rates for the deterministic AAPCD with bounded delays. Numerical results demonstrate the practical efficiency of our algorithm in terms of speed.
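A much-simplified synchronous sketch of the acceptance idea described above (try the proximal update from a linearly extrapolated point, keep it only if the objective does not increase, and otherwise fall back to a plain proximal coordinate step) is given below for an l1-regularized least-squares toy problem. The data, the fixed momentum value, and the step sizes are assumptions; this is not the asynchronous AAPCD algorithm.

```python
import numpy as np

# Simplified synchronous sketch (not the asynchronous AAPCD algorithm):
# proximal coordinate descent on l1-regularized least squares where each
# update is tried from a linearly extrapolated point and kept only if it does
# not increase the objective; otherwise a plain proximal coordinate step from
# the current iterate is taken. Data, momentum, and steps are assumptions.

rng = np.random.default_rng(0)
n, d = 100, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
lam = 0.5
L = (A**2).sum(axis=0)                       # per-coordinate Lipschitz constants

def F(x):
    return 0.5 * np.linalg.norm(A @ x - b)**2 + lam * np.abs(x).sum()

def soft(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x = np.zeros(d)
x_prev = np.zeros(d)
beta = 0.5                                   # fixed momentum value (illustrative)
for it in range(3000):
    y = x + beta * (x - x_prev)              # linear extrapolation
    i = rng.integers(d)
    cand = y.copy()
    cand[i] = soft(y[i] - A[:, i] @ (A @ y - b) / L[i], lam / L[i])
    if F(cand) <= F(x):                      # accept only if the objective does not increase
        x_prev, x = x, cand
    else:                                    # fall back to a plain proximal step
        plain = x.copy()
        plain[i] = soft(x[i] - A[:, i] @ (A @ x - b) / L[i], lam / L[i])
        x_prev, x = x, plain

print("objective:", F(x), " nonzeros:", int(np.sum(x != 0)))
```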

