Gradient Estimation
Recently Published Documents


TOTAL DOCUMENTS: 275 (FIVE YEARS: 59)

H-INDEX: 22 (FIVE YEARS: 4)

2022 ◽  
Vol 0 (0) ◽  
pp. 0
Author(s):  
Esmail Abdul Fattah ◽  
Janet Van Niekerk ◽  
Håvard Rue

Computing the gradient of a function provides fundamental information about its behavior. This information is essential for several applications and algorithms across various fields. One common application that requires gradients is optimization, through techniques such as stochastic gradient descent, Newton's method, and trust-region methods. However, these methods usually require a numerical computation of the gradient at every iteration, which is prone to numerical error. We propose a simple limited-memory technique for improving the accuracy of a numerically computed gradient in this gradient-based optimization framework by exploiting (1) a coordinate transformation of the gradient and (2) the history of previously taken descent directions. The method is verified empirically by extensive experimentation on both test functions and real data applications. The proposed method is implemented in the R package smartGrad and in C++.
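The abstract names the two ingredients but not the formulas. Below is a minimal sketch of the idea as stated, assuming central differences and an orthonormal basis built by QR from the most recent descent directions; the function name `smart_gradient` and its signature are illustrative, not the actual smartGrad API.

```python
import numpy as np

def smart_gradient(f, x, history, h=1e-5):
    """Illustrative history-informed central-difference gradient.

    history: list of recent descent directions, most recent first.
    They are orthonormalized (padded with canonical axes) via QR,
    central differences are taken along the resulting basis, and the
    directional derivatives are mapped back to canonical coordinates.
    This mirrors the abstract's two ingredients: a coordinate
    transformation plus a memory of previous descent directions.
    """
    n = x.size
    cols = list(history) + list(np.eye(n))   # fall back on canonical axes
    Q, _ = np.linalg.qr(np.column_stack(cols))
    Q = Q[:, :n]                             # first n orthonormal directions
    dirs = np.array([(f(x + h * Q[:, i]) - f(x - h * Q[:, i])) / (2 * h)
                     for i in range(n)])
    return Q @ dirs                          # grad = sum_i (df/dq_i) q_i
```

With an empty history the basis is the canonical one and this reduces to ordinary central differencing; as descent directions accumulate, the first difference directions align with the recent optimization path.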


2021 ◽  
Vol 13 (22) ◽  
pp. 4564
Author(s):  
Liming Pu ◽  
Xiaoling Zhang ◽  
Zenan Zhou ◽  
Liang Li ◽  
Liming Zhou ◽  
...  

Phase unwrapping is a critical step in synthetic aperture radar interferometry (InSAR) data processing chains. In almost all phase unwrapping methods, estimating the phase gradient according to the phase continuity assumption (PGE-PCA) is an essential step. The phase continuity assumption is not always satisfied, due to the presence of noise and abrupt terrain changes, which makes it difficult to obtain the correct phase gradient. In this paper, we propose a robust least-squares phase unwrapping method for InSAR that works via a phase gradient estimation network based on the encoder–decoder architecture (PGENet). In this method, a deep convolutional neural network learns global phase features and the phase gradient between adjacent pixels from a large number of wrapped phase images with topographic features and different noise levels, and therefore predicts a more accurate and robust phase gradient than PGE-PCA. To obtain the unwrapped phase, we use the traditional least-squares solver to minimize the difference between the gradient obtained by PGENet and the gradient of the unwrapped phase. Experiments on simulated and real InSAR data demonstrate that the proposed method outperforms five other well-established phase unwrapping methods and is robust to noise.
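The paper's contribution is the learned gradient (PGENet); the surrounding pipeline, a PGE-PCA gradient fed to a least-squares solver, is classical. A minimal sketch of that baseline pipeline follows, using the standard DCT-based unweighted least-squares solver of Ghiglia and Romero and assuming scipy is available; PGENet itself is not reproduced, so the gradient here comes from the continuity assumption.

```python
import numpy as np
from scipy.fft import dctn, idctn

def wrap(a):
    """Wrap phase values into (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def unwrap_lsq(psi):
    """Unweighted least-squares phase unwrapping (Ghiglia & Romero).

    The phase gradient is first estimated under the continuity
    assumption by wrapping finite differences of the wrapped phase
    psi (the PGE-PCA step that PGENet replaces with a learned
    estimate). The unwrapped phase minimizing the squared gradient
    mismatch solves a Poisson equation with Neumann boundaries,
    which a type-II DCT diagonalizes.
    """
    M, N = psi.shape
    dx = np.zeros_like(psi)
    dy = np.zeros_like(psi)
    dx[:, :-1] = wrap(np.diff(psi, axis=1))   # horizontal gradient
    dy[:-1, :] = wrap(np.diff(psi, axis=0))   # vertical gradient
    # Divergence of the estimated gradient field.
    rho = dx.copy()
    rho[:, 1:] -= dx[:, :-1]
    rho += dy
    rho[1:, :] -= dy[:-1, :]
    # Solve the discrete Poisson equation in the DCT domain.
    rho_hat = dctn(rho, type=2, norm='ortho')
    i = np.arange(M)[:, None]
    j = np.arange(N)[None, :]
    denom = 2 * (np.cos(np.pi * i / M) + np.cos(np.pi * j / N) - 2)
    rho_hat[0, 0] = 0.0     # the overall constant offset is unconstrained
    denom[0, 0] = 1.0
    return idctn(rho_hat / denom, type=2, norm='ortho')
```

Swapping the wrapped-difference dx, dy for a network prediction reproduces the paper's structure: only the gradient estimate changes, the solver stays the same.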


Author(s):  
Yijie Peng ◽  
Li Xiao ◽  
Bernd Heidergott ◽  
L. Jeff Hong ◽  
Henry Lam

We investigate a new approach to computing the gradients of artificial neural networks (ANNs), based on the so-called push-out likelihood ratio method. Unlike the widely used backpropagation (BP) method, which requires continuity of the loss function and the activation function, our approach bypasses this requirement by injecting artificial noise into the signals passed along the neurons. We show that this approach has a computational complexity similar to that of BP, and moreover offers the advantages of removing the backward recursion and yielding transparent formulas. We also formalize the connection between BP, a pivotal technique for training ANNs, and infinitesimal perturbation analysis, a classic path-wise derivative estimation approach, so that both our newly proposed method and BP can be better understood in the context of stochastic gradient estimation. Our approach allows efficient training of ANNs with more flexibility in the loss and activation functions, and shows empirical improvements in the robustness of ANNs under adversarial attacks and corruption by natural noise. Summary of Contribution: Stochastic gradient estimation has been studied actively in simulation for decades and has become more important in the era of machine learning and artificial intelligence. Stochastic gradient descent is a standard technique for training artificial neural networks (ANNs), a pivotal problem in deep learning. The most popular stochastic gradient estimation technique is the backpropagation method. We find that the backpropagation method lies in the family of infinitesimal perturbation analysis, a path-wise gradient estimation technique in simulation. Moreover, we develop a new likelihood ratio-based method, from another popular family of gradient estimation techniques in simulation, for training more general ANNs, and demonstrate that the new training method can improve the robustness of ANNs.
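The paper's push-out construction covers multi-layer networks with backward-recursion-free formulas; the core mechanism can nonetheless be shown on a single noisy layer. Below is a minimal sketch, assuming Gaussian noise injected into the pre-activation and a plain score-function (likelihood-ratio) estimator with a mean baseline; all names are illustrative, not the paper's notation.

```python
import numpy as np

def lr_gradient(W, x, loss, sigma=0.1, n_samples=4000, seed=0):
    """Likelihood-ratio gradient of E[loss(W @ x + sigma * eps)] w.r.t. W.

    With z = W @ x + sigma * eps and eps ~ N(0, I), the density of z
    yields  d/dW E[loss(z)] = E[loss(z) * eps / sigma] x^T,  so no
    derivative of `loss` is ever taken: it may be discontinuous,
    which is exactly the case backpropagation cannot handle.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((n_samples, W.shape[0]))
    z = W @ x + sigma * eps                  # one sample per row
    L = np.array([loss(zi) for zi in z])
    b = L.mean()                             # baseline for variance reduction;
                                             # unbiased because E[eps] = 0
    g_z = ((L - b)[:, None] * eps).mean(axis=0) / sigma
    return np.outer(g_z, x)

# A discontinuous, step-like loss that BP cannot differentiate:
W = np.array([[1.0, -2.0], [0.5, 0.3]])
x = np.array([1.0, 2.0])
g = lr_gradient(W, x, loss=lambda z: float((z > 0).sum()))
```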


2021 ◽  
Author(s):  
Lars Emil Haslund ◽  
Shamal Surain Kurukuladithya ◽  
Malmindi Ariyasinghe ◽  
Matthias Bo Stuart ◽  
Marie Sand Traberg ◽  
...  

2021 ◽  
pp. 1-36
Author(s):  
Chenze Shao ◽  
Yang Feng ◽  
Jinchao Zhang ◽  
Fandong Meng ◽  
Jie Zhou

Abstract In recent years, Neural Machine Translation (NMT) has achieved notable results in various translation tasks. However, the word-by-word generation manner determined by the autoregressive mechanism leads to high translation latency and restricts the low-latency applications of NMT. Non-Autoregressive Neural Machine Translation (NAT) removes the autoregressive mechanism and achieves significant decoding speedup by generating target words independently and simultaneously. Nevertheless, NAT still takes the word-level cross-entropy loss as its training objective, which is not optimal because the output of NAT cannot be properly evaluated due to the multimodality problem. In this article, we propose using sequence-level training objectives to train NAT models, which evaluate the NAT outputs as a whole and correlate well with real translation quality. First, we propose training NAT models to optimize sequence-level evaluation metrics (e.g., BLEU) based on several novel reinforcement algorithms customized for NAT, which outperform the conventional method by reducing the variance of gradient estimation. Second, we introduce a novel training objective for NAT models that aims to minimize the Bag-of-Ngrams (BoN) difference between the model output and the reference sentence. The BoN training objective is differentiable and can be calculated efficiently without any approximation. Finally, we apply a three-stage training strategy to combine these two methods to train the NAT model. We validate our approach on four translation tasks (WMT14 En↔De, WMT16 En↔Ro), showing that it largely outperforms NAT baselines and achieves remarkable performance on all of them. The source code is available at https://github.com/ictnlp/Seq-NAT.
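The BoN idea can be made concrete without the full training loop. Below is a minimal sketch of an expected bigram overlap under the NAT factorization (independent per-position distributions); it follows the spirit of the paper's BoN construction, but the function and variable names are illustrative, and the paper's actual objective is defined over the full BoN difference rather than this clipped overlap alone.

```python
import numpy as np
from collections import Counter

def expected_bigram_match(probs, ref):
    """Expected clipped bigram overlap between a NAT output and a reference.

    probs: array of shape (T, V) holding one independent categorical
    distribution per output position (the NAT factorization). For a
    reference bigram (w1, w2), the expected number of occurrences in
    the output is sum_t probs[t, w1] * probs[t+1, w2]; clipping at
    the reference count and summing gives a quantity computable in
    closed form, with no sampling or argmax, which is what makes a
    BoN-style objective differentiable.
    """
    ref_bigrams = Counter(zip(ref, ref[1:]))
    match = 0.0
    for (w1, w2), c in ref_bigrams.items():
        exp_count = float(np.sum(probs[:-1, w1] * probs[1:, w2]))
        match += min(exp_count, c)
    return match

# Toy check: uniform distributions over a 5-word vocabulary.
probs = np.full((4, 5), 0.2)
print(expected_bigram_match(probs, ref=[1, 2, 3]))   # 2 bigrams, 0.12 each
```

In a real implementation, probs would be framework tensors so the overlap can be differentiated through during training; numpy here only shows the computation itself.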


2021 ◽  
Vol 105 ◽  
pp. 223-235
Author(s):  
Ixbalank Torres-Zúñiga ◽  
Fernando López-Caamal ◽  
Héctor Hernández-Escoto ◽  
Víctor Alcaraz-González
