Parameter calibration with stochastic gradient descent for interacting particle systems driven by neural networks

Author(s):  
Simone Göttlich ◽  
Claudia Totzeck

Abstract: We propose a neural network approach to model general interaction dynamics and an adjoint-based stochastic gradient descent algorithm to calibrate its parameters. The parameter calibration problem is treated as an optimal control problem and investigated from both a theoretical and a numerical point of view. We prove the existence of optimal controls, derive the corresponding first-order optimality system, and formulate a stochastic gradient descent algorithm to identify parameters for given data sets. To validate the approach, we fit the parameters to real data sets from traffic and crowd dynamics. The results are compared to forces corresponding to well-known interaction models such as the Lighthill–Whitham–Richards model for traffic and the social force model for crowd motion.
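The abstract does not state the model equations, so the following is only a minimal sketch of what the forward problem might look like under stated assumptions: a one-dimensional, first-order particle system whose pairwise force is a small one-hidden-layer network, integrated with explicit Euler, together with a least-squares misfit against trajectory data. The names `init_params`, `force`, `simulate`, and `cost`, the network size, and the dynamics are all hypothetical and not taken from the paper; the adjoint-based gradient itself is not implemented here.

```python
import numpy as np

def init_params(hidden=8, seed=0):
    """Random weights for a hypothetical one-hidden-layer force network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(hidden, 1)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(scale=0.5, size=(1, hidden)),
    }

def force(params, r):
    """Assumed pairwise force F_theta(r), applied elementwise to signed distances r."""
    h = np.tanh(params["W1"] @ r.reshape(1, -1) + params["b1"][:, None])
    return (params["W2"] @ h).reshape(r.shape)

def simulate(params, x0, dt=0.05, steps=40):
    """Explicit Euler for the assumed dynamics dx_i/dt = (1/N) * sum_j F_theta(x_j - x_i)."""
    x = np.asarray(x0, dtype=float).copy()
    traj = [x.copy()]
    for _ in range(steps):
        diff = x[None, :] - x[:, None]  # diff[i, j] = x_j - x_i
        x = x + dt * force(params, diff).mean(axis=1)
        traj.append(x.copy())
    return np.stack(traj)  # shape (steps + 1, N)

def cost(params, x0, data):
    """Least-squares misfit between simulated and observed trajectories."""
    sim = simulate(params, x0, steps=len(data) - 1)
    return 0.5 * np.mean((sim - data) ** 2)

# Synthetic "data" generated by the model itself, for illustration only.
params = init_params()
x0 = np.linspace(-1.0, 1.0, 5)
traj = simulate(params, x0)
```

In a calibration loop, `cost` would play the role of the objective of the optimal control problem, with the network weights as the controls.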

Author(s):  
Fan Zhou ◽  
Guojing Cong

We adopt and analyze a synchronous K-step averaging stochastic gradient descent algorithm, which we call K-AVG, for solving large-scale machine learning problems. We establish convergence results for K-AVG on nonconvex objectives. Our analysis of K-AVG applies to many existing variants of synchronous SGD. We explain why the K-step delay is necessary and leads to better performance than traditional parallel stochastic gradient descent, which is equivalent to K-AVG with $K=1$. We also show that K-AVG scales better with the number of learners than asynchronous stochastic gradient descent (ASGD). Another advantage of K-AVG over ASGD is that it allows larger step sizes and facilitates faster convergence. On a cluster of $128$ GPUs, K-AVG is faster than ASGD implementations and achieves better accuracy and faster convergence when training on the CIFAR-10 dataset.
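The K-step averaging idea can be illustrated with a single-process simulation: every learner starts from the shared model, takes K local SGD steps on its own data shard, and the resulting models are then averaged synchronously. This is a sketch under assumptions, not the authors' implementation: the function names `k_avg` and `lsq_grad`, the toy least-squares problem, and all hyperparameters are hypothetical, and real learners would run in parallel rather than in a loop.

```python
import numpy as np

def k_avg(grad_fn, w0, shards, k=4, rounds=50, lr=0.05, batch=8, seed=0):
    """Simulated K-step averaging SGD over len(shards) learners."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    for _ in range(rounds):
        local_models = []
        for X, y in shards:  # one shard per simulated learner
            w_local = w.copy()  # every learner starts from the shared model
            for _ in range(k):  # k local mini-batch SGD steps
                idx = rng.integers(0, len(y), size=batch)
                w_local -= lr * grad_fn(w_local, X[idx], y[idx])
            local_models.append(w_local)
        w = np.mean(local_models, axis=0)  # synchronous model averaging
    return w

def lsq_grad(w, X, y):
    """Gradient of the mean squared error of a linear model."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

# Toy noise-free regression problem split across four learners.
rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0, 0.5])
shards = []
for _ in range(4):
    X = rng.normal(size=(64, 3))
    shards.append((X, X @ w_true))

w_hat = k_avg(lsq_grad, np.zeros(3), shards)
```

Setting `k=1` makes the learners average after every step, which corresponds to the traditional parallel SGD that the abstract identifies with $K=1$.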

