Challenges in High-Dimensional Reinforcement Learning with Evolution Strategies

Author(s):  
Nils Müller ◽  
Tobias Glasmachers

2019 ◽  
Vol 27 (4) ◽  
pp. 699-725 ◽  
Author(s):  
Hao Wang ◽  
Michael Emmerich ◽  
Thomas Bäck

The main purpose of the recently proposed mirrored sampling technique for evolution strategies is to generate more evenly distributed samples in high-dimensional search spaces: mirroring enlarges the diversity of the mutation samples and thereby improves the convergence rate. Motivated by this technique, this article introduces a new derandomized sampling technique called mirrored orthogonal sampling. The performance of the new technique is analyzed theoretically and studied empirically on the sphere function. In particular, mirrored orthogonal sampling is applied to the well-known Covariance Matrix Adaptation Evolution Strategy (CMA-ES), and the resulting algorithm is tested experimentally on the well-known Black-Box Optimization Benchmark (BBOB). The benchmark results show that mirrored orthogonal sampling outperforms both the standard CMA-ES and its variant using mirrored sampling.
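As a rough illustration of the sampling scheme described in the abstract, the following minimal sketch draws a batch of Gaussian mutation vectors, orthogonalizes their directions while preserving their lengths, and mirrors each one. The function name and the QR-based orthogonalization are assumptions for illustration, not the paper's reference implementation.

```python
import numpy as np

def mirrored_orthogonal_samples(n_dim, n_pairs, rng=None):
    """Hypothetical sketch: draw n_pairs orthogonal mutation directions
    and mirror each, yielding 2 * n_pairs derandomized samples.
    At most n_dim mutually orthogonal directions exist, so n_pairs <= n_dim."""
    assert n_pairs <= n_dim
    rng = np.random.default_rng() if rng is None else rng
    # Standard Gaussian mutations; keep their (chi-distributed) lengths
    # so the rescaled samples stay Gaussian-like in norm.
    z = rng.standard_normal((n_pairs, n_dim))
    lengths = np.linalg.norm(z, axis=1)
    # Orthonormalize the directions via a QR decomposition
    # (equivalent to Gram-Schmidt), then restore the original lengths.
    q, _ = np.linalg.qr(z.T)          # columns of q are orthonormal
    orth = q.T * lengths[:, None]
    # Mirroring: pair every sample with its negation.
    return np.concatenate([orth, -orth], axis=0)

# Example: 6 samples (3 mirrored pairs) in a 10-dimensional search space.
samples = mirrored_orthogonal_samples(n_dim=10, n_pairs=3)
print(samples.shape)  # (6, 10)
```

In a CMA-ES-style algorithm, such derandomized samples would replace the independent Gaussian mutations before being transformed by the adapted covariance matrix.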


2017 ◽  
Vol 1 (1) ◽  
pp. 98-103 ◽  
Author(s):  
Junfei Xie ◽  
Yan Wan ◽  
Kevin Mills ◽  
James J. Filliben ◽  
F. L. Lewis

Author(s):  
Daoming Lyu ◽  
Fangkai Yang ◽  
Bo Liu ◽  
Daesub Yoon

Deep reinforcement learning (DRL) has achieved great success by learning directly from high-dimensional sensory inputs, yet it is notorious for its lack of interpretability. Interpretability of subtasks is critical in hierarchical decision-making: it increases the transparency of black-box-style DRL approaches and helps RL practitioners better understand the high-level behavior of the system. In this paper, we introduce symbolic planning into DRL and propose a Symbolic Deep Reinforcement Learning (SDRL) framework that can handle both high-dimensional sensory inputs and symbolic planning. Task-level interpretability is enabled by relating symbolic actions to options. The framework features a planner-controller-meta-controller architecture whose three components take charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, combining the long-term planning capability of symbolic knowledge with end-to-end reinforcement learning directly from high-dimensional sensory input. Experimental results validate the interpretability of the subtasks and show improved data efficiency compared with state-of-the-art approaches.
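To make the architecture concrete, here is a minimal, hypothetical skeleton of the planner-controller-meta-controller loop. Every class and method name is illustrative rather than the paper's API; the symbolic planner and the DRL option learner are stubbed out.

```python
import random

class SDRLAgent:
    """Toy sketch of the SDRL loop; names and logic are assumptions."""

    def __init__(self, symbolic_actions):
        self.symbolic_actions = symbolic_actions
        # Meta-controller's running value estimate for each subtask.
        self.values = {a: 0.0 for a in symbolic_actions}

    def plan(self):
        # Planner: schedule subtasks, here greedily by current value
        # (a stand-in for a full symbolic planner).
        return sorted(self.symbolic_actions, key=lambda a: -self.values[a])

    def run_option(self, action):
        # Controller: a DRL sub-policy (option) would learn the subtask from
        # high-dimensional input; stubbed here as a noisy reward signal.
        return random.random()

    def episode(self):
        for action in self.plan():                 # subtask scheduling
            reward = self.run_option(action)       # data-driven subtask learning
            # Meta-controller: evaluate the subtask and feed the estimate
            # back to the planner for the next scheduling round.
            self.values[action] += 0.1 * (reward - self.values[action])

agent = SDRLAgent(["reach_key", "open_door", "get_treasure"])
for _ in range(20):
    agent.episode()
print(agent.values)
```

The feedback loop in `episode` mirrors the cross-fertilization described above: the planner's schedule depends on the meta-controller's evaluations, which in turn depend on what the controllers have learned.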

