Fast gradient-descent methods for temporal-difference learning with linear function approximation

Author(s): Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, ...

2021
Author(s): Jalaj Bhandari, Daniel Russo, Raghav Singal

Temporal-difference learning (TD) is a simple iterative algorithm widely used for policy evaluation in Markov reward processes. Bhandari et al. prove finite-time convergence rates for TD learning with linear function approximation. The analysis rests on a key insight that establishes a rigorous connection between TD updates and those of online gradient descent. In a model where observations are corrupted by i.i.d. noise, convergence results for TD follow by essentially mirroring the analysis for online gradient descent. Using an information-theoretic technique, the authors also provide results for the case in which TD is applied to a single Markovian data stream, where the algorithm's updates can be severely biased. Their analysis extends seamlessly to TD learning with eligibility traces and to Q-learning for high-dimensional optimal stopping problems.
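As a rough illustration of the algorithm the abstract describes, the sketch below runs TD(0) with linear function approximation on a synthetic Markov reward process. The feature map, transition matrix, rewards, step size, and discount factor are all placeholder assumptions for illustration, not taken from either paper; the update line shows why the iteration resembles online gradient descent.

```python
import numpy as np

# Minimal sketch of TD(0) with linear function approximation.
# Environment, features, and hyperparameters are illustrative assumptions.
rng = np.random.default_rng(0)
n_states, d = 5, 3
gamma, alpha = 0.9, 0.1

phi = rng.normal(size=(n_states, d))                  # fixed feature vector per state
P = rng.dirichlet(np.ones(n_states), size=n_states)   # Markov transition matrix
r = rng.normal(size=n_states)                         # expected reward per state

theta = np.zeros(d)                                   # linear value-function weights
s = 0
for t in range(10_000):
    s_next = rng.choice(n_states, p=P[s])
    # TD error, then a semi-gradient step that mirrors online gradient descent
    delta = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
    theta += alpha * delta * phi[s]
    s = s_next

print("learned weights:", theta)
```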

