stationary point
Recently Published Documents


Total documents: 387 (five years: 53)

H-index: 25 (five years: 3)

Author(s):  
Zhaosong Lu ◽  
Zhe Sun ◽  
Zirui Zhou

In this paper, we consider a class of structured nonsmooth difference-of-convex (DC) constrained DC programs in which the first convex component of the objective and constraints is the sum of a smooth and a nonsmooth function, and their second convex component is the supremum of finitely many convex smooth functions. The existing methods for this problem usually have a weak convergence guarantee or require a feasible initial point. Inspired by the recent work by Pang et al. [Pang J-S, Razaviyayn M, Alvarado A (2017) Computing B-stationary points of nonsmooth DC programs. Math. Oper. Res. 42(1):95–118.], in this paper, we propose two infeasible methods with a strong convergence guarantee for the considered problem. The first one is a penalty method that consists of finding an approximate D-stationary point of a sequence of penalty subproblems. We show that any feasible accumulation point of the solution sequence generated by such a penalty method is a B-stationary point of the problem under a weakest possible assumption that it satisfies a pointwise Slater constraint qualification (PSCQ). The second one is an augmented Lagrangian (AL) method that consists of finding an approximate D-stationary point of a sequence of AL subproblems. Under the same PSCQ condition as for the penalty method, we show that any feasible accumulation point of the solution sequence generated by such an AL method is a B-stationary point of the problem, and moreover, it satisfies a Karush–Kuhn–Tucker type of optimality condition for the problem, together with any accumulation point of the sequence of a set of auxiliary Lagrangian multipliers. We also propose an efficient successive convex approximation method for computing an approximate D-stationary point of the penalty and AL subproblems. Finally, some numerical experiments are conducted to demonstrate the efficiency of our proposed methods.
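
For orientation, the problem class described in this abstract can be written schematically as follows. The symbols ($\phi_i$, $\psi_i$, $f_i$, $p_i$, $g_{i,j}$, $m$, $J_i$) are notation assumed here for illustration and may differ from the paper's own.

```latex
\begin{aligned}
\min_{x}\quad & \phi_0(x) - \psi_0(x) \\
\text{s.t.}\quad & \phi_i(x) - \psi_i(x) \le 0, \qquad i = 1,\dots,m, \\
\text{where}\quad & \phi_i(x) = f_i(x) + p_i(x) \ \text{is convex, with } f_i \text{ smooth and } p_i \text{ nonsmooth}, \\
& \psi_i(x) = \max_{1 \le j \le J_i} g_{i,j}(x), \ \text{with each } g_{i,j} \text{ convex and smooth}.
\end{aligned}
```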


2021 ◽  
Vol 38 (1) ◽  
pp. 015001
Author(s):  
Yanan Zhao ◽  
Chunlin Wu ◽  
Qiaoli Dong ◽  
Yufei Zhao

We consider a wavelet-based image reconstruction model with ℓ_p (0 < p < 1) quasi-norm regularization, which is a non-convex and non-Lipschitz minimization problem. To solve this model, Figueiredo et al (2007 IEEE Trans. Image Process. 16 2980–2991) used the classical majorization-minimization framework and proposed the so-called Isoft algorithm. This algorithm is computationally efficient, but whether it converges has not yet been established. In this paper, we propose a new algorithm that accelerates the Isoft algorithm using Nesterov's extrapolation technique. Furthermore, we establish a complete convergence analysis for the new algorithm. We prove that the whole sequence generated by this algorithm converges to a stationary point of the objective function. This convergence result contains the convergence of the Isoft algorithm as a special case. Numerical experiments demonstrate good performance of our new algorithm.
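
To illustrate the idea of adding an extrapolation step to iterative soft thresholding, here is a minimal sketch on a much simpler convex analogue: an ℓ_1-regularized separable quadratic, not the ℓ_p (0 < p < 1) wavelet model of the paper. The function names, the fixed extrapolation weight `beta`, and the test problem are all assumptions made for this example.

```python
def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to a scalar."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def extrapolated_ista(a, b, lam, beta=0.5, n_iters=500):
    """Minimize sum_i 0.5*a_i*(x_i - b_i)^2 + lam*|x_i| (hypothetical toy problem).

    Each iteration extrapolates from the last two iterates and then takes
    one soft-thresholded gradient step with step size 1/L.
    """
    L = max(a)  # Lipschitz constant of the gradient of the smooth part
    x_prev = [0.0] * len(b)
    x = [0.0] * len(b)
    for _ in range(n_iters):
        # Extrapolation step: y = x + beta * (x - x_prev)
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        # Gradient step on the smooth part, then soft thresholding
        x_new = [
            soft_threshold(yi - ai * (yi - bi) / L, lam / L)
            for yi, ai, bi in zip(y, a, b)
        ]
        x_prev, x = x, x_new
    return x


a = [1.0, 2.0, 4.0]
b = [3.0, -0.2, 1.0]
lam = 1.0
x_hat = extrapolated_ista(a, b, lam)
# For this separable problem the minimizer is known in closed form.
x_closed = [soft_threshold(bi, lam / ai) for ai, bi in zip(a, b)]
```

Because the toy problem is convex and separable, the iterates can be checked against the closed-form minimizer; no such shortcut exists for the non-convex ℓ_p model, which is why the paper's convergence analysis is needed there.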


2021 ◽  
Author(s):  
Edith Gabriel ◽  
Francisco Rodriguez-Cortes ◽  
Jérôme Coville ◽  
Jorge Mateu ◽  
Joël Chadoeuf

Seismic networks provide data that are used as a basis both for public safety decisions and for scientific research. Their configuration affects data completeness, which in turn critically affects several seismological scientific targets (e.g., earthquake prediction, seismic hazard). In this context, a key question is how to map earthquake density in seismogenic areas from censored data, or even in areas that are not covered by the network. We propose to predict the spatial distribution of earthquakes from the knowledge of presence locations and geological relationships, taking into account any interactions between records. Namely, in a more general setting, we aim to estimate the intensity function of a point process, conditional on its censored realization, as in geostatistics for continuous processes. We define a predictor as the best linear unbiased combination of the observed point pattern. We show that the weight function associated with the predictor is the solution of a Fredholm equation of the second kind. Both the kernel and the source term of the Fredholm equation are related to the first- and second-order characteristics of the point process through the intensity and the pair correlation function. Results are presented and illustrated on simulated non-stationary point processes and on real data for mapping Greek Hellenic seismicity in a region with unreliable and incomplete records.
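
As a reminder of the terminology, a Fredholm integral equation of the second kind for an unknown function $w$ has the generic form below. The symbols $g$ (source term), $k$ (kernel) and $D$ (domain) are generic placeholders; the abstract does not give the specific kernel and source term derived in the paper, so none are assumed here.

```latex
w(x) = g(x) + \int_{D} k(x, y)\, w(y)\, \mathrm{d}y, \qquad x \in D .
```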


2021 ◽  
Author(s):  
Xin-long Luo ◽  
Hang Xiao

The global minimum point of an optimization problem is of interest in engineering fields, but it is difficult to find, especially for a nonconvex large-scale optimization problem. In this article, we consider the continuation Newton method with the deflation technique and the quasi-genetic evolution for this problem. First, we use the continuation Newton method with the deflation technique to find as many stationary points as possible from several determined initial points. Then, we use those found stationary points as the initial evolutionary seeds of the quasi-genetic algorithm. After it evolves for several generations, we obtain a suboptimal point of the optimization problem. Next, we use the continuation Newton method with this suboptimal point as the initial point to obtain a stationary point, and output whichever of this final stationary point and the suboptimal point found by the quasi-genetic algorithm has the smaller objective value. Finally, we compare the method with the multi-start method (the built-in subroutine GlobalSearch.m of the MATLAB R2020a environment) and the differential evolution algorithm (the DE method, the subroutine de.m of the MATLAB Central File Exchange 2021). Numerical results show that the proposed method performs well for large-scale global optimization problems, especially those that are difficult to solve with the known global optimization methods.
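
The deflation idea mentioned in this abstract can be sketched in one dimension. The example below uses a plain Newton iteration (not the continuation Newton method of the paper) on the gradient of a hypothetical double-well function, and divides out each stationary point already found so that the same starting point leads to a new one. All function names and the test function are assumptions for illustration.

```python
def f(x):
    """Hypothetical objective: a double well with minima at -1 and 1."""
    return 0.25 * x**4 - 0.5 * x**2


def g(x):
    """Gradient of f; its roots are the stationary points of f."""
    return x**3 - x


def dg(x):
    """Second derivative of f."""
    return 3.0 * x**2 - 1.0


def deflated_newton(x0, known_roots, tol=1e-12, max_iter=100):
    """Newton's method on G(x) = g(x) / prod(x - r) over the known roots r.

    The Newton step for G simplifies to g / (g' - g * sum(1 / (x - r))).
    Returns None if the iteration breaks down or does not converge.
    """
    x = x0
    for _ in range(max_iter):
        gx = g(x)
        shift = sum(1.0 / (x - r) for r in known_roots)
        denom = dg(x) - gx * shift
        if denom == 0.0:
            return None
        step = gx / denom
        x -= step
        if abs(step) < tol:
            return x
    return None


def find_stationary_points(x0, n_points):
    """Repeatedly deflate and restart Newton from the same initial point."""
    found = []
    for _ in range(n_points):
        r = deflated_newton(x0, found)
        if r is None:
            break
        found.append(r)
    return found


# g is a cubic, so there are exactly three stationary points to look for.
points = find_stationary_points(2.0, n_points=3)
best = min(points, key=f)
```

Starting three times from the same point `2.0`, the deflated iteration is steered away from points already found, and `best` picks the stationary point with the smallest objective value among them.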


Author(s):  
Hadi Abbaszadehpeivasti ◽  
Etienne de Klerk ◽  
Moslem Zamani

In this paper, we study the convergence rate of the gradient (or steepest descent) method with fixed step lengths for finding a stationary point of an L-smooth function. We establish a new convergence rate, and show that the bound may be exact in some cases, in particular when all step lengths lie in the interval (0, 1/L]. In addition, we derive an optimal step length with respect to the new bound.
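
As a point of reference for the setting in this abstract, the sketch below runs gradient descent with the fixed step length 1/L on a hypothetical nonconvex L-smooth function and compares the smallest squared gradient norm with the classical textbook bound 2L(f(x_0) - f*)/N. This is the well-known bound that follows from the descent lemma, not the new rate established in the paper; the test function and all names are assumptions.

```python
import math


def f(x):
    """Hypothetical nonconvex objective with global minimum value 0 at x = 0."""
    return x * x + 3.0 * math.sin(x) ** 2


def grad(x):
    """Derivative of f: 2x + 3*sin(2x)."""
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


# |f''(x)| = |2 + 6*cos(2x)| <= 8, so f is L-smooth with L = 8.
L = 8.0
F_STAR = 0.0


def gradient_descent(x0, step, n_iters):
    """Fixed-step gradient descent; records the squared gradient at each iterate."""
    x = x0
    grad_sq = []
    for _ in range(n_iters):
        gx = grad(x)
        grad_sq.append(gx * gx)
        x -= step * gx
    return x, grad_sq


x0 = 3.0
N = 50
x_final, grad_sq = gradient_descent(x0, 1.0 / L, N)

best_grad_sq = min(grad_sq)
classical_bound = 2.0 * L * (f(x0) - F_STAR) / N
```

With step length 1/L, each iteration decreases f by at least the squared gradient norm divided by 2L, which is what makes `best_grad_sq <= classical_bound` hold for any L-smooth function bounded below.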


2021 ◽  
Vol 12 (5) ◽  
pp. 1-26
Author(s):  
Congliang Chen ◽  
Li Shen ◽  
Haozhi Huang ◽  
Wei Liu

In this article, we present a distributed variant of an adaptive stochastic gradient method for training deep neural networks in the parameter-server model. To reduce the communication cost among the workers and server, we incorporate two types of quantization schemes, i.e., gradient quantization and weight quantization, into the proposed distributed Adam. In addition, to reduce the bias introduced by quantization operations, we propose an error-feedback technique to compensate for the quantized gradient. Theoretically, in the stochastic nonconvex setting, we show that the distributed adaptive gradient method with gradient quantization and error feedback converges to the first-order stationary point, and that the distributed adaptive gradient method with weight quantization and error feedback converges to the point related to the quantized level under both the single-worker and multi-worker modes. Last, we apply the proposed distributed adaptive gradient methods to train deep neural networks. Experimental results demonstrate the efficacy of our methods.
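
The error-feedback mechanism described in this abstract can be illustrated on a toy problem. The sketch below applies a scaled-sign quantizer to the gradient of a small quadratic and carries the quantization error over to the next step. It uses plain deterministic gradient descent on a single worker, not the distributed Adam variant of the article, and the quantizer, step size and objective are assumptions chosen for illustration.

```python
def objective(x, a):
    """Hypothetical objective: 0.5 * sum_i a_i * x_i^2."""
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def gradient(x, a):
    return [ai * xi for ai, xi in zip(a, x)]


def scaled_sign(v):
    """One-bit-per-entry quantizer: keeps signs, rescales by the mean magnitude."""
    scale = sum(abs(vi) for vi in v) / len(v)
    return [scale * ((vi > 0) - (vi < 0)) for vi in v]


def quantized_gd_with_error_feedback(x0, a, lr, n_iters):
    x = list(x0)
    err = [0.0] * len(x0)  # accumulated quantization error
    for _ in range(n_iters):
        grad_x = gradient(x, a)
        # Add back the error left over from the previous quantization.
        corrected = [gi + ei for gi, ei in zip(grad_x, err)]
        q = scaled_sign(corrected)
        # Store what the quantizer dropped, to be compensated next step.
        err = [ci - qi for ci, qi in zip(corrected, q)]
        x = [xi - lr * qi for xi, qi in zip(x, q)]
    return x


a = [1.0, 2.0]
x0 = [4.0, -3.0]
x_final = quantized_gd_with_error_feedback(x0, a, lr=0.02, n_iters=2000)
f_start = objective(x0, a)
f_final = objective(x_final, a)
```

Only the quantized vector `q` would need to be communicated in a parameter-server setting; the residual `err` stays local, which is the design choice that lets the dropped information re-enter later updates.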


2021 ◽  
Vol 32 (5) ◽  
pp. 847-864
Author(s):  
A. Budylin

The 2 × 2 matrix conjugacy problem (the Riemann–Hilbert problem) with rapidly oscillating off-diagonal entries and quadratic phase function is considered, specifically, the case when one of the diagonal entries vanishes at a stationary point. For solutions of this problem, the leading term of the asymptotics is found. However, the method allows us to construct complete expansions in power orders. These asymptotics can be used, for example, to construct the asymptotics of solutions of the Cauchy problem for the nonlinear Schrödinger equation for large times in the case of the so-called collisionless shock region.


2021 ◽  
Author(s):  
Alessandra Stella ◽  
Peter Bouss ◽  
Günther Palm ◽  
Sonja Grün

Spatio-temporal spike patterns were suggested as indications of active cell assemblies. We developed the SPADE method to detect significant spatio-temporal patterns (STPs) with ms accuracy. STPs are defined as identically repeating spike patterns across neurons with temporal delays between the spikes. The significance of STPs is derived by comparison to the null-hypothesis of independence implemented by surrogate data. SPADE binarizes the spike trains and examines the data for STPs by counting repeated patterns using frequent itemset mining. The significance of STPs is evaluated by comparison to pattern counts derived from surrogate data, i.e., modifications of the original data with destroyed potential spike correlations but under conservation of the firing rate profiles. To avoid erroneous results, surrogate data are required to retain the statistical properties of the original data as much as possible. A classically chosen surrogate technique is Uniform Dithering (UD), which displaces each spike independently according to a uniform distribution. We find that binarized UD surrogates of our experimental data (motor cortex) contain fewer spikes than the binarized original data. As a consequence, false positives occur. Here, we identify the reason for the spike reduction, which is the lack of conservation of short ISIs. To overcome this problem, we study five alternative surrogate techniques and examine their statistical properties such as spike loss, ISI characteristics, effective movement of spikes, and arising false positives when applied to different ground truth data sets: first, on stationary point process models, and then on non-stationary point processes mimicking statistical properties of experimental data. We conclude that trial-shifting best preserves the features of the original data and has a low expected false-positive rate. Finally, the analysis of the experimental data provides consistent STPs across the alternative surrogates.
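
The spike-loss effect described in this abstract can be reproduced on synthetic data. The sketch below generates a spike train with a hard refractory period longer than the bin width, applies Uniform Dithering, and compares the number of occupied bins before and after. It is not the SPADE implementation; the rate, refractory period, bin width, dither width and function names are all hypothetical values chosen for illustration.

```python
import math
import random


def binarized_count(spikes, bin_width):
    """Number of occupied bins after binarization (several spikes in a bin count once)."""
    return len({math.floor(t / bin_width) for t in spikes})


def uniform_dither(spikes, dither, rng):
    """Displace each spike independently by a uniform offset in [-dither, dither]."""
    return sorted(t + rng.uniform(-dither, dither) for t in spikes)


rng = random.Random(0)
bin_width = 0.005    # seconds
refractory = 0.006   # minimum inter-spike interval, longer than one bin
rate = 50.0          # rate of the exponential part of each interval

# Original train: every inter-spike interval is at least `refractory`,
# so no two spikes can share a bin.
spikes = []
t = 0.0
while len(spikes) < 500:
    t += refractory + rng.expovariate(rate)
    spikes.append(t)

surrogate = uniform_dither(spikes, dither=0.025, rng=rng)

original_count = binarized_count(spikes, bin_width)
surrogate_count = binarized_count(surrogate, bin_width)
```

Dithering ignores the refractory period, so it can create inter-spike intervals shorter than a bin; those spikes collapse into a single entry after binarization, which is why `surrogate_count` cannot exceed `original_count` here and is typically lower.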

