Reinforcement Learning with Reward Shaping and Mixed Resolution Function Approximation

2009 · Vol. 1(2) · pp. 36-54
Author(s):
Marek Grzes
Daniel Kudenko

A crucial trade-off is involved in the design process when function approximation is used in reinforcement learning. Ideally, the chosen representation should allow the value function to be approximated as closely as possible. However, the more expressive the representation, the more training data is needed, because the space of candidate hypotheses is larger. A less expressive representation has a smaller hypothesis space, so a good candidate can be found faster. The core idea of this work is mixed resolution function approximation: a less expressive function approximation provides useful guidance during learning, while a more expressive function approximation is used to obtain a final result of high quality. A major question is how to combine the two representations. Two approaches are proposed and evaluated empirically: the use of two resolutions in one function approximation, and a more sophisticated algorithm that applies reward shaping.
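As a rough illustration of the idea (not the authors' algorithm), the Python sketch below lets a coarse, less expressive value estimate guide a fine-resolution Q-learner via potential-based reward shaping, using the coarse estimate as the shaping potential Φ. The chain environment, the two discretization levels, and all learning parameters are illustrative assumptions.

```python
import random
from collections import defaultdict

GAMMA = 0.99
ALPHA_FINE = 0.1     # step size for the fine-resolution Q-learner
ALPHA_COARSE = 0.2   # step size for the coarse value estimate used as the potential
N_STATES = 50        # fine-resolution chain 0..49; the goal is state 49
ACTIONS = (-1, +1)

def coarse(s, factor=5):
    """Map a fine state onto the coarse abstraction (5 fine states per bin)."""
    return s // factor

q_fine = defaultdict(float)    # Q(s, a) over fine states
v_coarse = defaultdict(float)  # V(z) over coarse states, used as the potential Phi

def step(s, a):
    """Toy deterministic chain: move left/right, reward 1 at the goal."""
    s2 = max(0, min(N_STATES - 1, s + a))
    reward = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, reward, s2 == N_STATES - 1

def epsilon_greedy(s, eps=0.1):
    """Epsilon-greedy over the fine-resolution Q-values, breaking ties randomly."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    qs = {a: q_fine[(s, a)] for a in ACTIONS}
    best = max(qs.values())
    return random.choice([a for a, q in qs.items() if q == best])

for episode in range(200):
    s, done = 0, False
    while not done:
        a = epsilon_greedy(s)
        s2, r, done = step(s, a)

        # Learn the coarse ("less expressive") value estimate online.
        target_c = r + (0.0 if done else GAMMA * v_coarse[coarse(s2)])
        v_coarse[coarse(s)] += ALPHA_COARSE * (target_c - v_coarse[coarse(s)])

        # Potential-based shaping: F(s, s') = gamma * Phi(s') - Phi(s),
        # with Phi taken from the coarse value estimate.
        shaping = GAMMA * v_coarse[coarse(s2)] - v_coarse[coarse(s)]

        # Q-learning at the fine resolution on the shaped reward.
        best_next = 0.0 if done else max(q_fine[(s2, b)] for b in ACTIONS)
        q_fine[(s, a)] += ALPHA_FINE * (r + shaping + GAMMA * best_next - q_fine[(s, a)])

        s = s2
```

Because the shaping term has the potential-based form γΦ(s') − Φ(s), it changes the reward signal the fine-resolution learner sees without changing which policies are optimal.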


Author(s):  
Dean C. Wardell
Gilbert L. Peterson

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and its ability to continue learning even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation (FSA) as the means of function approximation, combined with the fast policy hill climbing (PHC) methods Win or Learn Fast (WoLF) and policy-dynamics-based WoLF (PD-WoLF). The combination of fast policy hill climbing and fuzzy state aggregation function approximation is tested in two stochastic environments: Tileworld and the simulated robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns more quickly and performs better than fuzzy state aggregation combined with Q-learning alone. Results from the multi-agent RoboCup domain again illustrate that the policy hill climbing algorithms perform better than Q-learning alone in a multi-agent environment. The learning is further enhanced by allowing the agents to share their experience through weighted strategy sharing.
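As a rough sketch of the policy update involved, the code below implements a single-agent version of WoLF policy hill climbing over discrete states; the fuzzy state aggregation step and the multi-agent weighted strategy sharing described above are omitted, and the action set and learning rates are illustrative assumptions rather than the paper's experimental settings.

```python
import random
from collections import defaultdict

GAMMA = 0.9
ALPHA = 0.1        # Q-value step size
DELTA_WIN = 0.01   # slow policy step while "winning"
DELTA_LOSE = 0.04  # faster policy step while "losing" (DELTA_LOSE > DELTA_WIN)
ACTIONS = [0, 1, 2]

Q = defaultdict(lambda: [0.0] * len(ACTIONS))                      # action values
pi = defaultdict(lambda: [1.0 / len(ACTIONS)] * len(ACTIONS))      # current policy
pi_avg = defaultdict(lambda: [1.0 / len(ACTIONS)] * len(ACTIONS))  # average policy
visits = defaultdict(int)

def choose_action(s):
    """Sample an action from the current mixed policy."""
    return random.choices(ACTIONS, weights=pi[s])[0]

def wolf_phc_update(s, a, r, s2, done):
    """One WoLF-PHC update after observing the transition (s, a, r, s')."""
    # Standard Q-learning backup.
    target = r + (0.0 if done else GAMMA * max(Q[s2]))
    Q[s][a] += ALPHA * (target - Q[s][a])

    # Incrementally track the average policy for this state.
    visits[s] += 1
    for i in ACTIONS:
        pi_avg[s][i] += (pi[s][i] - pi_avg[s][i]) / visits[s]

    # Win or Learn Fast: use the small step if the current policy already
    # does at least as well as the average policy under the current Q-values.
    winning = (sum(pi[s][i] * Q[s][i] for i in ACTIONS)
               > sum(pi_avg[s][i] * Q[s][i] for i in ACTIONS))
    delta = DELTA_WIN if winning else DELTA_LOSE

    # Hill-climb: shift probability mass toward the greedy action, then
    # renormalize (a simplification of the constrained update in PHC).
    greedy = max(ACTIONS, key=lambda i: Q[s][i])
    for i in ACTIONS:
        change = delta if i == greedy else -delta / (len(ACTIONS) - 1)
        pi[s][i] = min(1.0, max(0.0, pi[s][i] + change))
    total = sum(pi[s])
    for i in ACTIONS:
        pi[s][i] /= total
```

The defining design choice is the variable learning rate: the policy moves toward the greedy action slowly while it is doing better than its historical average and quickly while it is doing worse, which is what gives WoLF-style learners their improved stability relative to plain policy hill climbing.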

