Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

Mapping Intimacies ◽

10.1109/isrimt53730.2021.9597110 ◽

2021 ◽

Author(s):

Zhaorun Chen ◽

Binhao Chen ◽

Shenghan Xie ◽

Liang Gong ◽

Chengliang Liu ◽

...

Keyword(s):

Reinforcement Learning ◽

Download Full-text

Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/475 ◽

2019 ◽

Author(s):

Wenjie Shi ◽

Shiji Song ◽

Cheng Wu

Keyword(s):

Reinforcement Learning ◽

Maximum Entropy ◽

Bellman Equation ◽

Value Functions ◽

Policy Actor ◽

Policy Gradient ◽

Gradient Based ◽

Continuous Actions ◽

Stable Learning

Maximum entropy deep reinforcement learning (RL) methods have been demonstrated on a range of challenging continuous tasks. However, existing methods either suffer from severe instability when training on large off-policy data or cannot scale to tasks with very high state and action dimensionality such as 3D humanoid locomotion. Besides, the optimality of desired Boltzmann policy set for non-optimal soft value function is not persuasive enough. In this paper, we first derive soft policy gradient based on entropy regularized expected reward objective for RL with continuous actions. Then, we present an off-policy actor-critic, model-free maximum entropy deep RL algorithm called deep soft policy gradient (DSPG) by combining soft policy gradient with soft Bellman equation. To ensure stable learning while eliminating the need of two separate critics for soft value functions, we leverage double sampling approach to making the soft Bellman equation tractable. The experimental results demonstrate that our method outperforms in performance over off-policy prior methods.

Download Full-text

A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning

IFAC-PapersOnLine ◽

10.1016/j.ifacol.2020.12.2021 ◽

2020 ◽

Vol 53 (2) ◽

pp. 1549-1554

Author(s):

Wesley Suttle ◽

Zhuoran Yang ◽

Kaiqing Zhang ◽

Zhaoran Wang ◽

Tamer Başar ◽

...

Keyword(s):

Reinforcement Learning ◽

Policy Actor ◽

Multi Agent ◽

Distributed Reinforcement

Download Full-text

Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus

2019 IEEE 58th Conference on Decision and Control (CDC) ◽

10.1109/cdc40024.2019.9029969 ◽

2019 ◽

Author(s):

Yan Zhang ◽

Michael M. Zavlanos

Keyword(s):

Reinforcement Learning ◽

Download Full-text

Supplemental Material for Reconciling Reinforcement Learning Models With Behavioral Extinction and Renewal: Implications for Addiction, Relapse, and Problem Gambling

Psychological Review ◽

10.1037/0033-295x.114.3.784.supp ◽

2007 ◽

Keyword(s):

Reinforcement Learning ◽

Problem Gambling ◽

Learning Models ◽

Behavioral Extinction ◽

Reinforcement Learning Models

Download Full-text

Bayes factors for reinforcement-learning models of the Iowa gambling task.

Decision ◽

10.1037/dec0000040 ◽

2016 ◽

Vol 3 (2) ◽

pp. 115-131 ◽

Author(s):

Helen Steingroever ◽

Ruud Wetzels ◽

Eric-Jan Wagenmakers

Keyword(s):

Reinforcement Learning ◽

Iowa Gambling Task ◽

Bayes Factors ◽

Gambling Task ◽

Learning Models ◽

Reinforcement Learning Models

Download Full-text

Analogical Reinforcement Learning With Two-Stage Memory Retrieval

PsycEXTRA Dataset ◽

10.1037/e528942014-705 ◽

2014 ◽

Author(s):

James Foster ◽

Matt Jones

Keyword(s):

Reinforcement Learning ◽

Memory Retrieval ◽

Download Full-text

Effects of Working Memory Capacity on the Speed and Accuracy of Learning in Reinforcement Learning Models

PsycEXTRA Dataset ◽

10.1037/e528942014-552 ◽

2014 ◽

Author(s):

Adnane Ez-Zizi ◽

Simon Farrell ◽

David Leslie

Keyword(s):

Working Memory ◽

Reinforcement Learning ◽

Working Memory Capacity ◽

Memory Capacity ◽

Learning Models ◽

Reinforcement Learning Models ◽

Speed And Accuracy

Download Full-text

Supplemental Material for Reinforcement Learning Models of Risky Choice and the Promotion of Risk-Taking by Losses Disguised as Wins in Rats

Journal of Experimental Psychology Animal Learning and Cognition ◽

10.1037/xan0000141.supp ◽

2017 ◽

Keyword(s):

Reinforcement Learning ◽

Risk Taking ◽

Risky Choice ◽

Learning Models ◽

Losses Disguised As Wins ◽

Reinforcement Learning Models

Download Full-text

Reinforcement learning of irrelevant stimulus-response associations modulates cognitive control.

Journal of Experimental Psychology Learning Memory and Cognition ◽

10.1037/xlm0000850 ◽

2020 ◽

Author(s):

Jinglu Chen ◽

Ling Tan ◽

Lu Liu ◽

Ling Wang

Keyword(s):

Reinforcement Learning ◽

Cognitive Control ◽

Irrelevant Stimulus ◽

Stimulus Response

Download Full-text

A Collaborative Scheduling Lane Changing Model for Intelligent Connected Vehicles Based on Deep Reinforcement Learning

10.1061/9780784483053.178 ◽

2020 ◽

Author(s):

Zheyu Cui ◽

Jianming Hu

Keyword(s):

Reinforcement Learning ◽

Connected Vehicles ◽

Lane Changing ◽

Collaborative Scheduling

Download Full-text