Incorporating Multi-Context into the Traversability Map for Urban Autonomous Driving using Deep Inverse Reinforcement Learning

Abstract: Inverse reinforcement learning (IRL) has been successfully applied in many robotics and autonomous driving studies without the need for hand-tuning a reward function. However, it suffers from safety issues. Compared to reinforcement learning (RL) algorithms, IRL is even more vulnerable to unsafe situations, as it can only infer the importance of safety from expert demonstrations. In this paper, we propose a safety-aware adversarial inverse reinforcement learning algorithm (S-AIRL). First, a control barrier function (CBF) is used to guide the training of a safety critic, which leverages knowledge of the system dynamics in the sampling process without training an additional guiding policy. The trained safety critic is then integrated into the discriminator to help distinguish generated data from expert demonstrations from the standpoint of safety. Finally, to further improve safety awareness, a regulator is introduced in the loss function of the discriminator training to prevent the recovered reward function from assigning high rewards to risky behaviors. We tested S-AIRL in a highway autonomous driving scenario. Compared to the original AIRL algorithm, at the same level of imitation learning (IL) performance, the proposed S-AIRL can reduce the collision rate by 32.6%.
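The ingredients named in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The discriminator uses the standard AIRL form D = exp(f) / (exp(f) + pi), and the CBF check uses the common discrete-time condition h(x') - h(x) >= -alpha * h(x). Everything else is an assumption made for illustration: the function names, the `risk` score standing in for the output of a trained safety critic, and the particular penalty used as the regulator term.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cbf_violation(h_now, h_next, alpha=0.5):
    """Amount by which the discrete-time CBF condition
    h_next - h_now >= -alpha * h_now is violated (0.0 if satisfied).
    h is a barrier value, e.g. gap to the lead vehicle minus a minimum gap."""
    return max(0.0, -(h_next - h_now + alpha * h_now))


def airl_discriminator(f_value, log_pi):
    """Standard AIRL discriminator: exp(f) / (exp(f) + pi) = sigmoid(f - log pi)."""
    return sigmoid(f_value - log_pi)


def discriminator_loss(expert, generated, reg_weight=1.0):
    """Binary cross-entropy over expert and generated samples, plus a
    hypothetical regulator that penalizes positive reward on risky samples.

    expert, generated: lists of (f_value, log_pi, risk) tuples, where risk >= 0
    is an assumed stand-in for a safety critic's output.
    """
    eps = 1e-12
    bce = 0.0
    for f, log_pi, _ in expert:       # expert samples are labeled 1
        bce -= math.log(airl_discriminator(f, log_pi) + eps)
    for f, log_pi, _ in generated:    # generated samples are labeled 0
        bce -= math.log(1.0 - airl_discriminator(f, log_pi) + eps)
    bce /= len(expert) + len(generated)
    # Assumed regulator: risk-weighted positive part of the learned reward.
    penalty = sum(risk * max(0.0, f) for f, _, risk in generated) / len(generated)
    return bce + reg_weight * penalty
```

In this sketch a generated sample with a nonzero `risk` score raises the loss in proportion to the reward it currently receives, which pushes the learned reward down on behaviors flagged as risky. How the paper actually combines the safety critic with the discriminator is not specified in the abstract.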


Accelerated Inverse Reinforcement Learning with Randomly Pre-sampled Policies for Autonomous Driving Reward Design

2019 IEEE Intelligent Transportation Systems Conference (ITSC) ◽

10.1109/itsc.2019.8916952 ◽

2019 ◽

Cited By ~ 1

Author(s):

Long Xin ◽

Shengbo Eben Li ◽

Pin Wang ◽

Wenhan Cao ◽

Bingbing Nie ◽

...

Keyword(s):

Reinforcement Learning ◽

Autonomous Driving ◽

Inverse Reinforcement Learning


Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning With Application to Autonomous Driving

IEEE Robotics and Automation Letters ◽

10.1109/lra.2020.3005126 ◽

2020 ◽

Vol 5 (4) ◽

pp. 5355-5362 ◽

Cited By ~ 1

Author(s):

Zheng Wu ◽

Liting Sun ◽

Wei Zhan ◽

Chenyu Yang ◽

Masayoshi Tomizuka

Keyword(s):

Reinforcement Learning ◽

Maximum Entropy ◽

Autonomous Driving ◽

Inverse Reinforcement Learning ◽

Efficient Sampling


An Open Framework for Human-Like Autonomous Driving Using Inverse Reinforcement Learning

2014 IEEE Vehicle Power and Propulsion Conference (VPPC) ◽

10.1109/vppc.2014.7007013 ◽

2014 ◽

Author(s):

Dizan Vasquez ◽

Yufeng Yu ◽

Suryansh Kumar ◽

Christian Laugier

Keyword(s):

Reinforcement Learning ◽

Autonomous Driving ◽

Inverse Reinforcement Learning ◽

Open Framework


Decision Making for Autonomous Driving via Augmented Adversarial Inverse Reinforcement Learning

10.1109/icra48506.2021.9560907 ◽

2021 ◽

Author(s):

Pin Wang ◽

Dapeng Liu ◽

Jiayu Chen ◽

Hanhan Li ◽

Ching-Yao Chan

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Autonomous Driving ◽

Inverse Reinforcement Learning


Ensemble Inverse Reinforcement Learning from Semi-Expert Agents

IEEJ Transactions on Electronics Information and Systems ◽

10.1541/ieejeiss.137.667 ◽

2017 ◽

Vol 137 (4) ◽

pp. 667-673

Author(s):

Shinji Tomita ◽

Fumiya Hamatsu ◽

Tomoki Hamagami

Keyword(s):

Reinforcement Learning ◽

Inverse Reinforcement Learning


Teaching AI Agents Ethical Values Using Reinforcement Learning and Policy Orchestration

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/891 ◽

2019 ◽

Author(s):

Ritesh Noothigattu ◽

Djallel Bouneffouf ◽

Nicholas Mattei ◽

Rachita Chandra ◽

Piyush Madan ◽

...

Keyword(s):

Reinforcement Learning ◽

Ethical Values ◽

Large Role ◽

Learning To Learn ◽

Inverse Reinforcement Learning ◽

Time Step ◽

Novel Approach

Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure that they behave in ways aligned with the values of society, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations and reinforcement learning to learn to maximize environmental rewards. A contextual bandit-based orchestrator then picks between the two policies: constraint-based and environment reward-based. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward-maximizing or constrained policy. In addition, the orchestrator is transparent about which policy is being employed at each time step. We test our algorithms using Pac-Man and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.
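The orchestration idea in the abstract above can be sketched as a small contextual bandit that chooses, per context, between a constraint-based policy and a reward-based policy and records each choice. This is an illustrative sketch only. The abstract does not say which bandit algorithm the authors use, so the epsilon-greedy rule, the discrete context labels, and all class and method names here are assumptions.

```python
import random
from collections import defaultdict


class Orchestrator:
    """Epsilon-greedy contextual bandit over two policy 'arms' (assumed design)."""

    ARMS = ("constraint", "reward")

    def __init__(self, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, arm) -> number of updates
        self.values = defaultdict(float)  # (context, arm) -> running mean payoff
        self.log = []                     # (context, arm) chosen at each step

    def choose(self, context):
        """Pick a policy for this context and record the choice."""
        if self.rng.random() < self.epsilon:
            arm = self.rng.choice(self.ARMS)  # explore
        else:
            # Exploit; ties resolve to the first arm in ARMS.
            arm = max(self.ARMS, key=lambda a: self.values[(context, a)])
        self.log.append((context, arm))
        return arm

    def update(self, context, arm, payoff):
        """Incrementally update the running mean payoff for (context, arm)."""
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (payoff - self.values[key]) / self.counts[key]


# Hypothetical usage with made-up context labels and payoffs.
orch = Orchestrator(epsilon=0.0)
orch.update("ghost_near", "constraint", 1.0)
orch.update("ghost_near", "reward", 0.0)
orch.update("open_maze", "constraint", 0.0)
orch.update("open_maze", "reward", 1.0)
chosen = [orch.choose("ghost_near"), orch.choose("open_maze")]
```

The `log` list plays the role of the transparency property described in the abstract: at every time step it shows which of the two policies was selected and in which context.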
