Theoretical Analysis of Function of Derivative Term in On-Line Gradient Descent Learning

Author(s):  
Kazuyuki Hara ◽  
Kentaro Katahira ◽  
Kazuo Okanoya ◽  
Masato Okada

2020 ◽ 
Vol 0 (0) ◽  
Author(s):  
Haili Qiao ◽  
Aijie Cheng

Abstract: In this paper, we consider the time-fractional diffusion equation with a Caputo fractional derivative. Due to the singularity of the solution at the initial moment, it is difficult to achieve an ideal convergence rate when the time discretization is performed on uniform meshes. Therefore, in order to improve the convergence order, the Caputo time-fractional derivative term is discretized by the $L2\text{-}1_{\sigma}$ scheme on non-uniform meshes, with $\sigma = 1 - \frac{\alpha}{2}$, while the spatial derivative term is approximated by the classical central difference scheme on uniform meshes. Based on the summation formula for positive-integer $k$-th powers, and considering $k = 3, 4, 5$, we propose three non-uniform meshes for the time discretization. Through theoretical analysis, different temporal convergence orders $O(N^{-\min\{k\alpha,\,2\}})$ are obtained, where $N$ denotes the number of time steps. Finally, the theoretical analysis is verified by several numerical examples.
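The idea behind the non-uniform meshes can be illustrated with a standard graded-mesh construction, $t_n = T(n/N)^k$, which clusters time points near $t = 0$ to compensate for the initial-layer singularity. This is only a sketch under that common assumption; the paper's three meshes are derived from the summation formula for $k$-th powers and may differ in detail:

```python
import numpy as np

def graded_mesh(T, N, k):
    """Graded time mesh t_n = T * (n/N)**k, clustered near t = 0.

    Larger k clusters points more strongly at the origin, where the
    solution of the Caputo time-fractional problem is singular.
    (Illustrative assumption: the paper's exact meshes, built from the
    summation formula for k-th powers, may differ from this form.)
    """
    n = np.arange(N + 1)
    return T * (n / N) ** k

def predicted_order(k, alpha):
    """Temporal convergence order O(N^{-min(k*alpha, 2)}) from the abstract."""
    return min(k * alpha, 2.0)

# For alpha = 0.5, the three proposed values k = 3, 4, 5 give orders
# 1.5, 2.0, 2.0: the rate saturates at the scheme's maximum of 2.
for k in (3, 4, 5):
    mesh = graded_mesh(T=1.0, N=10, k=k)
    print(k, predicted_order(k, alpha=0.5), mesh[:3])
```

Note how the order caps at 2: once $k\alpha \ge 2$, further grading no longer improves the asymptotic rate of the $L2\text{-}1_{\sigma}$ scheme.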


2009 ◽  
Vol 77 (2-3) ◽  
pp. 195-224 ◽  
Author(s):  
Chun-Nan Hsu ◽  
Han-Shen Huang ◽  
Yu-Ming Chang ◽  
Yuh-Jye Lee

1998 ◽  
Vol 81 (24) ◽  
pp. 5461-5464 ◽  
Author(s):  
Magnus Rattray ◽  
David Saad ◽  
Shun-ichi Amari

2000 ◽  
Vol 12 (4) ◽  
pp. 881-901 ◽  
Author(s):  
Tom Heskes

Several studies have shown that natural gradient descent for on-line learning is much more efficient than standard gradient descent. In this article, we derive natural gradients in a slightly different manner and discuss implications for batch-mode learning and pruning, linking them to existing algorithms such as Levenberg-Marquardt optimization and optimal brain surgeon. The Fisher matrix plays an important role in all these algorithms. The second half of the article discusses a layered approximation of the Fisher matrix specific to multilayered perceptrons. Using this approximation rather than the exact Fisher matrix, we arrive at much faster “natural” learning algorithms and more robust pruning procedures.
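The core natural-gradient update described above preconditions the ordinary gradient with the inverse Fisher matrix, $\theta \leftarrow \theta - \eta F^{-1} \nabla L$. A minimal sketch on a toy quadratic loss, using a damping term in the spirit of the Levenberg-Marquardt connection the article draws (the Fisher stand-in and step sizes here are illustrative assumptions, not the article's algorithm):

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.5, damping=1e-4):
    """One natural-gradient update: theta <- theta - lr * F^{-1} grad.

    Damping regularizes the Fisher matrix before solving, analogous to
    the Levenberg-Marquardt modification mentioned in the article.
    (Sketch only: illustrative, not the article's exact procedure.)
    """
    F = fisher + damping * np.eye(len(theta))
    return theta - lr * np.linalg.solve(F, grad)

# Toy quadratic loss L(theta) = 0.5 * theta^T A theta, with A used as a
# stand-in for the Fisher matrix. Plain gradient descent would crawl
# along the well-conditioned direction; the natural gradient rescales
# both directions so all components contract at the same rate.
A = np.diag([100.0, 1.0])
theta = np.array([1.0, 1.0])
for _ in range(20):
    grad = A @ theta
    theta = natural_gradient_step(theta, grad, fisher=A)
print(theta)
```

With the exact Fisher matrix, each step contracts every component by roughly the same factor $1 - \eta$, which is the efficiency gain over standard gradient descent that the cited studies report.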

