Minimizing Average of Loss Functions Using Gradient Descent and Stochastic Gradient Descent

Dhaka Univ. J. Sci., Vol. 64(2), pp. 141-145, 2016 (July)
Author(s): Md Rajib Arefin, M Asadujjaman

This paper deals with minimizing the average of loss functions using Gradient Descent (GD) and Stochastic Gradient Descent (SGD). We present these two algorithms for minimizing the average of a large number of smooth convex functions, discuss their complexity analysis, and illustrate the algorithms geometrically. At the end, we compare their performance through numerical experiments.
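As a rough illustration of the setting described in the abstract (not the authors' implementation), the Python sketch below minimizes an average of smooth convex least-squares losses with both GD and SGD; the data, step sizes, and iteration counts are arbitrary choices for the example.

```python
import numpy as np

# Minimal sketch: minimize f(x) = (1/n) * sum_i (a_i^T x - b_i)^2,
# an average of n smooth convex loss functions, by GD and by SGD.
rng = np.random.default_rng(0)
n, d = 1000, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

def full_gradient(x):
    # Gradient of the average loss: (2/n) * A^T (A x - b)
    return 2.0 / n * A.T @ (A @ x - b)

def gd(steps=500, lr=0.01):
    x = np.zeros(d)
    for _ in range(steps):
        x -= lr * full_gradient(x)          # one pass over all n terms per step
    return x

def sgd(steps=5000, lr=0.01):
    x = np.zeros(d)
    for _ in range(steps):
        i = rng.integers(n)                 # sample one loss term uniformly at random
        g = 2.0 * A[i] * (A[i] @ x - b[i])  # unbiased estimate of the full gradient
        x -= lr * g
    return x

print("GD  error:", np.linalg.norm(gd() - x_true))
print("SGD error:", np.linalg.norm(sgd() - x_true))
```

One GD step touches all n loss terms, whereas each SGD step uses a single randomly sampled term; this per-iteration cost difference is what makes SGD attractive when n is large.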

2020, pp. 1-41
Author(s): Benny Avelin, Kaj Nyström

In this paper, we prove that, in the deep limit, stochastic gradient descent on a ResNet-type deep neural network, where each layer shares the same weight matrix, converges to stochastic gradient descent for a Neural ODE, and that the corresponding value/loss functions converge. Our result gives, in the context of minimization by stochastic gradient descent, a theoretical foundation for considering Neural ODEs as the deep limit of ResNets. Our proof is based on certain decay estimates for associated Fokker–Planck equations.
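As an informal illustration of the deep-limit idea (not the paper's construction or proof), the sketch below compares the forward pass of a ResNet whose N layers share one weight matrix with a fine Euler discretization of the corresponding Neural ODE; the tanh vector field, dimensions, and step counts are assumptions made only for this example.

```python
import numpy as np

# Illustrative sketch: a ResNet whose N layers share one weight matrix W,
#   x_{k+1} = x_k + (1/N) * tanh(W x_k),
# is the forward-Euler discretization, with step 1/N, of the Neural ODE
#   dx/dt = tanh(W x) on [0, 1].  As N grows the two forward passes agree.
rng = np.random.default_rng(1)
d = 4
W = 0.5 * rng.standard_normal((d, d))   # shared weight matrix (assumed tanh activation)
x0 = rng.standard_normal(d)

def resnet_forward(x, W, N):
    # N residual blocks, each reusing the same W, with layer step size 1/N
    for _ in range(N):
        x = x + (1.0 / N) * np.tanh(W @ x)
    return x

def ode_forward(x, W, steps=100_000):
    # Very fine Euler discretization, used here as a stand-in for the exact ODE flow
    h = 1.0 / steps
    for _ in range(steps):
        x = x + h * np.tanh(W @ x)
    return x

x_ode = ode_forward(x0.copy(), W)
for N in (4, 16, 64, 256):
    gap = np.linalg.norm(resnet_forward(x0.copy(), W, N) - x_ode)
    print(f"N = {N:4d}  |ResNet - ODE| = {gap:.2e}")
```

The printed gap shrinks roughly like 1/N, which is the elementary, deterministic analogue of viewing the Neural ODE as the deep limit of such weight-tied ResNets; the paper's contribution concerns the much stronger statement about the stochastic gradient descent dynamics and loss functions.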

