Interaction-Aware Behavior Planning for Autonomous Vehicles Validated With Real Traffic Data

Author(s):  
Jinning Li ◽  
Liting Sun ◽  
Wei Zhan ◽  
Masayoshi Tomizuka

Abstract Autonomous vehicles (AVs) need to interact with other traffic participants who can be either cooperative or aggressive, attentive or inattentive. Such different characteristics can lead to quite different interactive behaviors. Hence, to achieve safe and efficient autonomous driving, AVs need to be aware of such uncertainties when planning their own behaviors. In this paper, we formulate the behavior planning problem as a partially observable Markov decision process (POMDP) in which the cooperativeness of other traffic participants is treated as an unobservable state. For different cooperativeness levels, we learn human behavior models from real traffic data via the principle of maximum likelihood. Based on these models, the POMDP problem is solved by Monte-Carlo tree search. We verify the proposed algorithm on a lane change scenario in both simulation and real traffic data, and the results show that it can successfully complete lane changes without collisions.
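A minimal sketch of the kind of formulation this abstract describes: the other driver's cooperativeness is a latent state, a belief over it is updated by Bayes' rule, and candidate maneuvers are scored by Monte-Carlo rollouts. All names, probabilities, and rewards below are illustrative assumptions, not the authors' learned models.

```python
import random

# Hypothetical latent driver style: the other vehicle is either
# "cooperative" or "aggressive"; the AV never observes this directly.
STYLES = ["cooperative", "aggressive"]

def yield_probability(style):
    # Assumed behavior model; in the paper such models are learned from
    # real traffic data via maximum likelihood.
    return 0.9 if style == "cooperative" else 0.2

def update_belief(belief, observed_yield):
    # Bayes' rule over the latent style given one yield/no-yield observation.
    posterior = {}
    for s in STYLES:
        p = yield_probability(s)
        posterior[s] = (p if observed_yield else 1.0 - p) * belief[s]
    z = sum(posterior.values())
    return {s: v / z for s, v in posterior.items()}

def rollout_return(action, style):
    # Toy reward: a lane change succeeds when the other car yields and
    # collides otherwise; waiting costs a small time penalty.
    if action == "wait":
        return -1.0
    return 10.0 if random.random() < yield_probability(style) else -100.0

def monte_carlo_choice(belief, n_samples=2000):
    # Flat Monte-Carlo stand-in for the tree search: sample the latent
    # style from the belief and average rollout returns per action.
    values = {}
    for action in ("change_lane", "wait"):
        total = 0.0
        for _ in range(n_samples):
            style = random.choices(STYLES, weights=[belief[s] for s in STYLES])[0]
            total += rollout_return(action, style)
        values[action] = total / n_samples
    return max(values, key=values.get)

belief = {"cooperative": 0.5, "aggressive": 0.5}
belief = update_belief(belief, observed_yield=True)  # the other car slowed down
print(belief, monte_carlo_choice(belief))
```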

2021 ◽  
Vol 143 (7) ◽  
Author(s):  
Icaro Bezerra Viana ◽  
Husain Kanchwala ◽  
Kenan Ahiska ◽  
Nabil Aouf

Abstract This work considers the cooperative trajectory-planning problem along a double lane change scenario for autonomous driving. In this paper, we develop two frameworks to solve this problem based on distributed model predictive control (MPC). The first approach solves a single nonlinear MPC problem. The general idea is to introduce a collision cost function in the optimization problem at the planning task, achieving a smooth and bounded collision cost and thus avoiding the need for tight hard constraints. The second method uses a hierarchical scheme with two main units: a trajectory-planning layer based on a mixed-integer quadratic program (MIQP) computes an online collision-free trajectory using simplified motion dynamics, and a tracking-controller unit follows the trajectory from the higher level using the nonlinear vehicle model. Connected and automated vehicles (CAVs) sharing their planned trajectories lay the foundation of the cooperative behavior. The proposed methodologies are tested and evaluated using MATLAB-CarSim co-simulation, where CarSim provides the high-fidelity multibody vehicle dynamics model. The joint simulation experiments compare both approaches for a cooperative double lane change maneuver of two vehicles moving along a one-way three-lane road with obstacles.
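The first framework's key idea, a smooth and bounded collision penalty in place of hard constraints, can be sketched as below. The sigmoid cost shape, weights, and toy trajectories are assumptions for illustration; the paper's actual cost and vehicle model differ.

```python
import numpy as np

def collision_cost(p_ego, p_other, d_safe=5.0, alpha=1.0, w=50.0):
    # Smooth, bounded penalty on inter-vehicle distance: close to w when
    # the gap is far below d_safe, close to 0 well beyond it. The sigmoid
    # form is an illustrative choice, not the paper's exact cost.
    d = np.linalg.norm(p_ego - p_other)
    return w / (1.0 + np.exp(alpha * (d - d_safe)))

def mpc_objective(ego_traj, other_traj, ref_traj, q_track=1.0):
    # Stage cost = tracking error + collision penalty, summed over the
    # horizon; a real distributed MPC would also include input costs and
    # vehicle dynamics constraints.
    cost = 0.0
    for p_e, p_o, p_r in zip(ego_traj, other_traj, ref_traj):
        cost += q_track * np.sum((p_e - p_r) ** 2)
        cost += collision_cost(p_e, p_o)
    return cost

# Toy horizon: ego drifts one lane left while the other car holds its lane.
t = np.arange(10)
ego = np.stack([t * 2.0, np.minimum(t * 0.5, 3.5)], axis=1)
other = np.stack([5.0 + t * 1.8, np.zeros_like(t, dtype=float)], axis=1)
ref = np.stack([t * 2.0, np.full_like(t, 3.5, dtype=float)], axis=1)
print(mpc_objective(ego, other, ref))
```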


2018 ◽  
Vol 38 (2-3) ◽  
pp. 162-181 ◽  
Author(s):  
Yuanfu Luo ◽  
Haoyu Bai ◽  
David Hsu ◽  
Wee Sun Lee

The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, including real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling into DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves DESPOT's performance when there are critical but rare events that are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks. We also present a general method for learning the importance sampling distribution.
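The core idea behind introducing importance sampling, reweighting samples drawn from a proposal that deliberately oversamples rare critical events, can be illustrated generically. The event probabilities and costs below are invented for the example and are unrelated to the paper's benchmarks.

```python
import random

# Generic importance-sampled value estimate: a rare "critical" event with
# true probability p_true dominates the cost, so we oversample it with a
# proposal probability p_prop and reweight to keep the estimate unbiased.
def is_value_estimate(p_true, p_prop, cost_critical, cost_normal, n=10000):
    total = 0.0
    for _ in range(n):
        critical = random.random() < p_prop          # sample from the proposal
        weight = p_true / p_prop if critical else (1 - p_true) / (1 - p_prop)
        cost = cost_critical if critical else cost_normal
        total += weight * cost                       # reweighted contribution
    return total / n

# A 0.1% crash event that costs -1000: naive sampling rarely sees it,
# while importance sampling with p_prop = 0.5 estimates it reliably
# (true expected cost is 0.001 * -1000 + 0.999 * -1 = -1.999).
print(is_value_estimate(0.001, 0.5, -1000.0, -1.0))
```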


2013 ◽  
Vol 846-847 ◽  
pp. 1388-1391
Author(s):  
Bo Wu ◽  
Yan Peng Feng ◽  
Hong Yan Zheng

Online planning and learning in partially observable Markov decision processes are often intractable because the belief state space suffers from two curses: dimensionality and history. To address this problem, this paper proposes a point-based Monte Carlo online planning approach for POMDPs. The approach performs value backups at specific reachable belief points, rather than over the entire belief simplex, to speed up computation. A Monte Carlo tree search algorithm is then exploited to share the value of actions across each subtree of the search tree so as to minimise the mean squared error. The experimental results show that the proposed algorithm is effective in real-time systems.
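A toy version of a point-based backup, restricted to a handful of sampled belief points rather than the whole simplex, might look as follows. The two-state POMDP model and the nearest-neighbor value lookup are simplifying assumptions; the paper's algorithm combines such backups with Monte Carlo tree search.

```python
import numpy as np

# Minimal point-based backup for a 2-state, 2-action, 2-observation toy
# POMDP. Values are maintained only at sampled belief points B, not over
# the whole belief simplex. All model numbers are illustrative.
T = np.array([[[0.9, 0.1], [0.1, 0.9]],      # T[a][s][s']
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[0.8, 0.2], [0.3, 0.7]])        # Z[s'][o]
R = np.array([[1.0, -1.0], [0.0, 0.0]])       # R[a][s]
GAMMA = 0.95

def belief_update(b, a, o):
    bp = Z[:, o] * (T[a].T @ b)
    return bp / bp.sum()

def backup(B, V):
    # One sweep of Bellman backups restricted to the belief points in B;
    # V[i] is the current value estimate at belief point B[i].
    newV = np.zeros_like(V)
    for i, b in enumerate(B):
        q = []
        for a in range(2):
            r = b @ R[a]
            for o in range(2):
                p_o = (T[a].T @ b) @ Z[:, o]
                if p_o > 1e-9:
                    bp = belief_update(b, a, o)
                    # the nearest stored belief point stands in for V(bp)
                    j = np.argmin([np.abs(bp - bb).sum() for bb in B])
                    r += GAMMA * p_o * V[j]
            q.append(r)
        newV[i] = max(q)
    return newV

B = [np.array([p, 1 - p]) for p in (0.0, 0.25, 0.5, 0.75, 1.0)]
V = np.zeros(len(B))
for _ in range(50):
    V = backup(B, V)
print(dict(zip([b[0] for b in B], np.round(V, 2))))
```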


2022 ◽  
Vol 2022 ◽  
pp. 1-11
Author(s):  
Ying Zhuo ◽  
Lan Yan ◽  
Wenbo Zheng ◽  
Yutian Zhang ◽  
Chao Gou

Autonomous driving has become a prevalent research topic in recent years, attracting the attention of many universities and commercial companies. As human drivers rely on visual information to discern road conditions and make driving decisions, autonomous driving calls for vision systems such as vehicle detection models. These vision models require a large amount of labeled data, while collecting and annotating real traffic data is time-consuming and costly. Therefore, we present a novel vehicle detection framework based on parallel vision to tackle this issue, using specially designed virtual data to help train the vehicle detection model. We also propose a method to construct large-scale artificial scenes and generate the virtual data for vision-based autonomous driving schemes. Experimental results verify the effectiveness of our proposed framework, demonstrating that the combination of virtual and real data performs better for training the vehicle detection model than real data alone.
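The data-mixing idea can be sketched with a toy batch sampler that blends virtual and real samples at a fixed ratio. The datasets and the ratio below are placeholders, not the paper's pipeline.

```python
import random

# Toy stand-in for training on a virtual + real mixture: each "sample" is
# (features, source) and a fixed ratio controls how many virtual frames
# enter every batch. All dataset contents are synthetic placeholders.
def make_dataset(n, source):
    return [((random.random(), random.random()), source) for _ in range(n)]

real = make_dataset(200, "real")        # scarce, expensive annotations
virtual = make_dataset(800, "virtual")  # cheap, generated at scale

def mixed_batches(real, virtual, batch_size=16, virtual_ratio=0.5):
    n_virtual = int(batch_size * virtual_ratio)
    while True:
        batch = random.sample(virtual, n_virtual) + \
                random.sample(real, batch_size - n_virtual)
        random.shuffle(batch)
        yield batch   # feed each batch to the detector's training step

batches = mixed_batches(real, virtual)
print([source for _, source in next(batches)])
```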


Author(s):  
Heungseok Chae ◽  
Yonghwan Jeong ◽  
Hojun Lee ◽  
Jongcherl Park ◽  
Kyongsu Yi

This article describes the design, implementation, and evaluation of an active lane change control algorithm for autonomous vehicles with human factor considerations. Lane changes need to be performed considering both driver acceptance and safety with respect to surrounding vehicles. Therefore, autonomous driving systems need to be designed based on an analysis of human driving behavior. In this article, manual driving characteristics are investigated using real-world driving test data, focusing on interactions with surrounding vehicles in lane change situations, and safety indices are developed through kinematic analysis. A safety-indices-based lane change decision and control algorithm has been developed. To improve safety, stochastic predictions of both the ego vehicle and surrounding vehicles are conducted with consideration of sensor noise and model uncertainties. The desired driving mode is selected to cope with all lane change situations on the highway. To obtain the desired reference and constraints, motion planning for lane changes has been designed taking stochastic-prediction-based safety indices into account. A stochastic model predictive control with constraints is adopted to determine the vehicle control inputs: the steering angle and the longitudinal acceleration. The proposed active lane change algorithm has been successfully implemented on an autonomous vehicle and evaluated via real-world driving tests. Safe and comfortable lane changes in high-speed driving on highways have been demonstrated using our autonomous test vehicle.
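A rough sketch of a kinematic safety index with a stochastic margin, in the spirit of the indices described here: the required clearance grows with closing speed, and the margin is inflated by predicted uncertainty. The exact indices, gains, and thresholds in the article differ; everything below is illustrative.

```python
# Illustrative kinematic safety index for the gaps ahead of and behind the
# ego vehicle during a lane change. A nonnegative index means the gap is
# acceptable; k_sigma * sigma is a stochastic margin for prediction and
# sensor uncertainty. All parameter values are assumptions.
def safety_index(gap, v_follower, v_leader, t_react=1.0, a_max=4.0,
                 sigma=0.0, k_sigma=2.0):
    v_close = max(v_follower - v_leader, 0.0)          # closing speed
    d_required = v_close * t_react + v_close ** 2 / (2 * a_max)
    return gap - (d_required + k_sigma * sigma)

def lane_change_feasible(front_gap, rear_gap, v_ego, v_front, v_rear,
                         sigma=1.5):
    ok_front = safety_index(front_gap, v_ego, v_front, sigma=sigma) >= 0
    # for the rear gap the roles swap: the rear vehicle closes on the ego car
    ok_rear = safety_index(rear_gap, v_rear, v_ego, sigma=sigma) >= 0
    return ok_front and ok_rear

print(lane_change_feasible(front_gap=30.0, rear_gap=25.0,
                           v_ego=27.0, v_front=25.0, v_rear=30.0))
```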


Author(s):  
Tuan Dam ◽  
Pascal Klink ◽  
Carlo D'Eramo ◽  
Jan Peters ◽  
Joni Pajarinen

We consider Monte-Carlo Tree Search (MCTS) applied to Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs), and the well-known Upper Confidence bound for Trees (UCT) algorithm. In UCT, a tree with nodes (states) and edges (actions) is incrementally built by the expansion of nodes, and the values of nodes are updated through a backup strategy based on the average value of child nodes. However, it has been shown that, with enough samples, the maximum operator yields more accurate node value estimates than averaging. Instead of settling for one of these value estimates, we go a step further and propose a novel backup strategy based on the power mean operator, which computes a value between the average and the maximum. We call our new approach Power-UCT and argue that the power mean operator helps speed up learning in MCTS. We theoretically analyze our method, providing guarantees of convergence to the optimum. Finally, we empirically demonstrate the effectiveness of our method on well-known MDP and POMDP benchmarks, showing significant improvements in performance and convergence speed with respect to state-of-the-art algorithms.
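The power mean itself is compact enough to show directly: at p = 1 it recovers the average backup, and as p grows it approaches the maximum. The child values and visit counts below are made up for illustration.

```python
import numpy as np

def power_mean(values, weights, p):
    # Weighted power mean: p = 1 gives the weighted average, and the result
    # approaches max(values) as p grows, which is exactly the interpolation
    # a Power-UCT-style backup exploits. Assumes nonnegative values.
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return (weights @ values ** p) ** (1.0 / p)

# Child-node value estimates weighted by visit counts, as in a UCT backup.
q_children = [1.0, 3.0, 8.0]
visits = [10, 5, 2]
for p in (1, 2, 4, 16):
    print(p, round(power_mean(q_children, visits, p), 3))
# p = 1 prints the mean backup; larger p moves the estimate toward the max.
```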


2017 ◽  
Vol 58 ◽  
pp. 231-266 ◽  
Author(s):  
Nan Ye ◽  
Adhiraj Somani ◽  
David Hsu ◽  
Wee Sun Lee

The partially observable Markov decision process (POMDP) provides a principled general framework for planning under uncertainty, but solving POMDPs optimally is computationally intractable, due to the "curse of dimensionality" and the "curse of history". To overcome these challenges, we introduce the Determinized Sparse Partially Observable Tree (DESPOT), a sparse approximation of the standard belief tree, for online planning under uncertainty. A DESPOT focuses online planning on a set of randomly sampled scenarios and compactly captures the "execution" of all policies under these scenarios. We show that the best policy obtained from a DESPOT is near-optimal, with a regret bound that depends on the representation size of the optimal policy. Leveraging this result, we give an anytime online planning algorithm, which searches a DESPOT for a policy that optimizes a regularized objective function. Regularization balances the estimated value of a policy under the sampled scenarios and the policy size, thus avoiding overfitting. The algorithm demonstrates strong experimental results, compared with some of the best online POMDP algorithms available. It has also been incorporated into an autonomous driving system for real-time vehicle control. The source code for the algorithm is available online.
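DESPOT's regularized objective, the empirical value of a policy under the sampled scenarios minus a penalty on policy size, can be sketched as follows. The toy policies and scenario model are placeholders, not the published implementation.

```python
import random

# Sketch of a regularized objective in the DESPOT spirit: score a candidate
# policy by its average return over K sampled scenarios minus lambda times
# the policy size. Policies and the scenario model are toy placeholders.
def empirical_value(policy, scenarios, simulate):
    return sum(simulate(policy, sc) for sc in scenarios) / len(scenarios)

def regularized_score(policy, scenarios, simulate, lam=0.1):
    return empirical_value(policy, scenarios, simulate) - lam * policy["size"]

def simulate(policy, scenario):
    # toy return model: extra policy size buys only a sliver of scenario-
    # dependent value, so regularization should reject the larger policy
    return policy["base_value"] + 0.01 * policy["size"] * scenario

random.seed(0)
scenarios = [random.gauss(0.0, 1.0) for _ in range(500)]
small = {"size": 5, "base_value": 1.0}
large = {"size": 200, "base_value": 1.0}
for pol in (small, large):
    print(pol["size"], round(regularized_score(pol, scenarios, simulate), 3))
```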


Author(s):  
Kyle Hollins Wray ◽  
Stefan J. Witwicki ◽  
Shlomo Zilberstein

We present a general formal model called MODIA that can tackle a central challenge for autonomous vehicles (AVs), namely the ability to interact with an unspecified, large number of world entities. In MODIA, a collection of possible decision-problems (DPs), known a priori, are instantiated online and executed as decision-components (DCs), unknown a priori. To combine the individual action recommendations of the DCs into a single action, we propose the lexicographic executor action function (LEAF) mechanism. We analyze the complexity of MODIA and establish LEAF's relation to regret minimization. Finally, we implement MODIA and LEAF using collections of partially observable Markov decision process (POMDP) DPs, and use them for complex AV intersection decision-making. We evaluate the approach in six scenarios within an industry-standard vehicle simulator, and present its use on an AV prototype.
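One plausible reading of a LEAF-style executor is a lexicographic pick over the actions recommended by the active DCs, preferring the most conservative. The preference order and DC outputs below are illustrative assumptions, not the paper's exact mechanism.

```python
# Sketch of a LEAF-style executor: each active decision-component (DC)
# recommends one action, and the executor takes the recommendation that
# ranks earliest in a fixed conservativeness ordering. The ordering and
# the DC outputs are illustrative assumptions.
PREFERENCE = ["stop", "edge_forward", "go"]   # most to least conservative

def leaf(recommendations):
    # pick the recommended action that ranks earliest in PREFERENCE
    return min(recommendations, key=PREFERENCE.index)

# Three DCs instantiated online, one per tracked entity at an intersection:
dc_outputs = {"pedestrian_dc": "stop", "cross_traffic_dc": "go",
              "oncoming_dc": "edge_forward"}
print(leaf(dc_outputs.values()))   # -> "stop"
```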


Author(s):  
Shuo Chen ◽  
Ewa Andrejczuk ◽  
Athirai A. Irissappane ◽  
Jie Zhang

In an ad hoc teamwork setting, a team needs to coordinate its activities to perform a task without prior agreement on how to achieve it. The ad hoc agent cannot communicate with its teammates, but it can observe their behaviour and plan accordingly. To do so, existing approaches rely on models of the teammates' behaviour. However, these models may not be accurate, which can compromise teamwork. For this reason, we present the Ad Hoc Teamwork by Sub-task Inference and Selection (ATSIS) algorithm, which uses sub-task inference without relying on teammates' models. First, the ad hoc agent observes its teammates to infer which sub-tasks they are handling. Based on that, it selects its own sub-task using a partially observable Markov decision process that handles the uncertainty of the sub-task inference. Last, the ad hoc agent uses Monte Carlo tree search to find a set of actions that perform the sub-task. Our experiments show the benefits of ATSIS for robust teamwork.
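The inference step can be sketched as a Bayesian posterior over each teammate's sub-task given the likelihood of their observed actions, after which the agent takes the least-covered sub-task. Sub-task names, likelihoods, and histories below are invented for illustration; the paper additionally plans the chosen sub-task's actions with MCTS.

```python
# Sketch of the inference step in an ATSIS-like agent: maintain a posterior
# over which sub-task each teammate is handling, from the likelihood of
# their observed actions, then take the least-covered sub-task.
SUBTASKS = ["collect", "defend", "scout"]

# Assumed P(observed action | teammate's sub-task):
LIKELIHOOD = {
    "move_to_resource": {"collect": 0.7, "defend": 0.1, "scout": 0.2},
    "hold_position":    {"collect": 0.1, "defend": 0.8, "scout": 0.1},
    "explore":          {"collect": 0.2, "defend": 0.1, "scout": 0.7},
}

def infer_subtask(actions):
    # Bayesian posterior over one teammate's sub-task (uniform prior).
    post = {t: 1.0 for t in SUBTASKS}
    for a in actions:
        for t in SUBTASKS:
            post[t] *= LIKELIHOOD[a][t]
    z = sum(post.values())
    return {t: v / z for t, v in post.items()}

def select_own_subtask(teammate_action_histories):
    # Expected number of teammates on each sub-task, from the posteriors;
    # the ad hoc agent takes the least-covered one.
    load = {t: 0.0 for t in SUBTASKS}
    for history in teammate_action_histories:
        for t, p in infer_subtask(history).items():
            load[t] += p
    return min(load, key=load.get)

histories = [["move_to_resource", "move_to_resource"], ["hold_position"]]
print(select_own_subtask(histories))   # likely -> "scout"
```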


2020 ◽  
Vol 34 (06) ◽  
pp. 10061-10068
Author(s):  
Maxime Bouton ◽  
Jana Tumova ◽  
Mykel J. Kochenderfer

Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP). By formulating a planning problem, we show how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy. We demonstrate that our method scales to large POMDP domains and provides strong bounds on the performance of the resulting policy.
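For a reachability-style formula (eventually reach a goal region while avoiding a bad one), the maximum satisfaction probability obeys a Bellman equation with value 1 on accepting states and 0 on violating ones. The sketch below shows this on a small fully observable chain for clarity, with invented transition probabilities; the paper performs the analogous computation over belief states with point-based value iteration.

```python
import numpy as np

# Maximum probability of satisfying "eventually goal, always not bad" on a
# toy 5-state chain: state 4 is the goal (accepting), state 0 is bad
# (violating), and both are absorbing. Transition numbers are illustrative.
N = 5
P = {                       # P[a][s] -> list of (next_state, probability)
    "fwd":  {s: [(min(s + 1, N - 1), 0.8), (max(s - 1, 0), 0.2)] for s in range(N)},
    "stay": {s: [(s, 0.9), (max(s - 1, 0), 0.1)] for s in range(N)},
}

V = np.zeros(N)
V[N - 1] = 1.0              # accepting state: formula already satisfied
for _ in range(100):        # value iteration on satisfaction probability
    for s in range(1, N - 1):
        V[s] = max(sum(p * V[sp] for sp, p in P[a][s]) for a in P)
print(np.round(V, 3))       # max satisfaction probability from each state
```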

