Consideration of State Representation for Semi-autonomous Reinforcement Learning of Sailing Within a Navigable Area

Decoupling State Representation Methods from Reinforcement Learning in Car Racing

Proceedings of the 13th International Conference on Agents and Artificial Intelligence ◽

10.5220/0010237507520759 ◽

2021 ◽

Author(s):

Juan Montoya ◽

Imant Daunhawer ◽

Julia Vogt ◽

Marco Wiering

Keyword(s):

Reinforcement Learning ◽

State Representation ◽

Car Racing

Download Full-text

Emergence of Discrete and Abstract State Representation through Reinforcement Learning in a Continuous Input Task

Advances in Intelligent Systems and Computing - Robot Intelligence Technology and Applications 2012 ◽

10.1007/978-3-642-37374-9_2 ◽

2013 ◽

pp. 13-21 ◽

Cited By ~ 2

Author(s):

Yoshito Sawatsubashi ◽

Mohamad Faizal bin Samusudin ◽

Katsunari Shibata

Keyword(s):

Reinforcement Learning ◽

State Representation ◽

Continuous Input

Download Full-text

State Representation Learning For Effective Deep Reinforcement Learning

2020 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme46284.2020.9102924 ◽

2020 ◽

Author(s):

Jian Zhao ◽

Wengang Zhou ◽

Tianyu Zhao ◽

Yun Zhou ◽

Houqiang Li

Keyword(s):

Reinforcement Learning ◽

Representation Learning ◽

State Representation

Download Full-text

Acceleration of Actor-Critic Deep Reinforcement Learning for Visual Grasping by State Representation Learning Based on a Preprocessed Input Image

10.1109/iros51168.2021.9635931 ◽

2021 ◽

Author(s):

Taewon Kim ◽

Yeseong Park ◽

Youngbin Park ◽

Sang Hyoung Lee ◽

Il Hong Suh

Keyword(s):

Reinforcement Learning ◽

Representation Learning ◽

Input Image ◽

State Representation

Download Full-text

RLCard: A Platform for Reinforcement Learning in Card Games

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/764 ◽

2020 ◽

Author(s):

Daochen Zha ◽

Kwei-Herng Lai ◽

Songyi Huang ◽

Yuanpu Cao ◽

Keerthana Reddy ◽

...

Keyword(s):

Reinforcement Learning ◽

Research And Development ◽

Imperfect Information ◽

State Representation ◽

Research Opportunities ◽

Card Games ◽

Imperfect Information Games ◽

Learning Research ◽

Information Games

We present RLCard, a Python platform for reinforcement learning research and development in card games. RLCard supports various card environments and several baseline algorithms with unified easy-to-use interfaces, aiming at bridging reinforcement learning and imperfect information games. The platform provides flexible configurations of state representation, action encoding, and reward design. RLCard also supports visualizations for algorithm debugging. In this demo, we showcase two representative environments and their visualization results. We conclude this demo with challenges and research opportunities brought by RLCard. A video is available on YouTube.

Download Full-text

An Experimental Study on State Representation Extraction for Vision-Based Deep Reinforcement Learning

Applied Sciences ◽

10.3390/app112110337 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10337

Author(s):

Junkai Ren ◽

Yujun Zeng ◽

Sihang Zhou ◽

Yichuan Zhang

Keyword(s):

Experimental Study ◽

Reinforcement Learning ◽

Network Architecture ◽

Representation Learning ◽

Evaluation Metrics ◽

High Dimensional ◽

Regularization Methods ◽

Challenging Problem ◽

State Representation ◽

Sample Quality

Scaling end-to-end learning to control robots with vision inputs is a challenging problem in the field of deep reinforcement learning (DRL). While achieving remarkable success in complex sequential tasks, vision-based DRL remains extremely data-inefficient, especially when dealing with high-dimensional pixels inputs. Many recent studies have tried to leverage state representation learning (SRL) to break through such a barrier. Some of them could even help the agent learn from pixels as efficiently as from states. Reproducing existing work, accurately judging the improvements offered by novel methods, and applying these approaches to new tasks are vital for sustaining this progress. However, the demands of these three aspects are seldom straightforward. Without significant criteria and tighter standardization of experimental reporting, it is difficult to determine whether improvements over the previous methods are meaningful. For this reason, we conducted ablation studies on hyperparameters, embedding network architecture, embedded dimension, regularization methods, sample quality and SRL methods to compare and analyze their effects on representation learning and reinforcement learning systematically. Three evaluation metrics are summarized, including five baseline algorithms (including both value-based and policy-based methods) and eight tasks are adopted to avoid the particularity of each experiment setting. We highlight the variability in reported methods and suggest guidelines to make future results in SRL more reproducible and stable based on a wide number of experimental analyses. We aim to spur discussion about how to assure continued progress in the field by minimizing wasted effort stemming from results that are non-reproducible and easily misinterpreted.

Download Full-text

On Overfitting and Asymptotic Bias in Batch Reinforcement Learning with Partial Observability (Extended Abstract)

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/706 ◽

2020 ◽

Author(s):

Vincent Francois-Lavet ◽

Guillaume Rabusseau ◽

Joelle Pineau ◽

Damien Ernst ◽

Raphael Fonteneau

Keyword(s):

Reinforcement Learning ◽

Theoretical Analysis ◽

Asymptotic Bias ◽

Limited Information ◽

Limited Data ◽

State Representation ◽

Error Sources ◽

Partial Observability ◽

Batch Reinforcement Learning

When an agent has limited information on its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term related to an asymptotic bias (suboptimality with unlimited data) and a term due to overfitting (additional suboptimality due to limited data). In the context of reinforcement learning with partial observability, this paper provides an analysis of the tradeoff between these two error sources. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.

Download Full-text