HIGH-LEVEL CONTROL OF AUTONOMOUS ROBOTS USING A BEHAVIOR-BASED SCHEME AND REINFORCEMENT LEARNING

2002 ◽  
Vol 35 (1) ◽  
pp. 469-474 ◽  
Author(s):  
M. Carreras ◽  
J. Yuh ◽  
J. Batlle


Author(s): 
Rey Pocius ◽  
Lawrence Neal ◽  
Alan Fern

Commonly used sequential decision-making tasks such as the games in the Arcade Learning Environment (ALE) provide rich observation spaces suitable for deep reinforcement learning. However, they consist mostly of low-level control tasks whose fine temporal resolution makes them of limited use for the development of explainable artificial intelligence (XAI). Many of these domains also lack built-in high-level abstractions and symbols. Existing tasks that provide both strategic decision-making and rich observation spaces are either difficult to simulate or intractable. We provide a set of new strategic decision-making tasks specialized for the development and evaluation of explainable AI methods, built as constrained mini-games within the StarCraft II Learning Environment.
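
The abstract does not describe an interface, but a minimal sketch of the underlying idea, exposing symbolic strategic actions over a low-level environment so that each agent decision is human-readable, might look as follows; the `MacroActionEnv` class, its action names, and the assumed `reset()`/`step()` API of the wrapped environment are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: wrapping a low-level environment behind a small
# set of symbolic macro-actions, so each agent decision is a strategic,
# explainable choice rather than a frame-by-frame control. Names are
# illustrative assumptions.

class MacroActionEnv:
    """Exposes symbolic macro-actions over a low-level environment."""

    MACROS = ["attack_nearest", "retreat", "hold_position"]

    def __init__(self, low_level_env):
        self.env = low_level_env  # assumed to offer reset()/step(action)

    def reset(self):
        return self.env.reset()

    def step(self, macro_name):
        # Expand one strategic decision into a sequence of low-level
        # actions, accumulating reward along the way.
        total_reward, done, obs = 0.0, False, None
        for low_level_action in self.expand(macro_name):
            obs, reward, done, info = self.env.step(low_level_action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, {"macro": macro_name}

    def expand(self, macro_name):
        # Placeholder expansion; a real wrapper would compute this from
        # the current game state (e.g., unit positions in SC2LE).
        return {"attack_nearest": [0, 1],
                "retreat": [2],
                "hold_position": [3]}[macro_name]
```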


2021 ◽  
Vol 8 ◽  
Author(s):  
Reeve Lambert ◽  
Jianwen Li ◽  
Li-Fan Wu ◽  
Nina Mahmoudian

This paper presents a framework to alleviate the training-data sparsity problem that Deep Reinforcement Learning (DRL) faces in challenging domains by introducing a combined agent-training and vehicle-integration methodology. The methodology leverages accessible domains to train an agent to solve navigational problems such as obstacle avoidance, and allows the agent to generalize with minimal further training to challenging and inaccessible domains such as marine environments. This is done by integrating the DRL agent at a high level of vehicle control and leveraging existing path-planning and proven low-level control methodologies that are used across domains. An autonomy package with a tertiary multilevel controller is developed to let the DRL agent interface at the prescribed high control level, insulating it from vehicle dynamics and environmental constraints. An example Deep Q-Network (DQN) employing this methodology for obstacle avoidance is trained in a simulated ground environment, and its ability to generalize across domains is then validated experimentally in a simulated water-surface environment and in real-world deployments of ground and water robotic platforms. The results show that it is possible to leverage accessible, data-rich domains, such as ground environments, to develop effective marine DRL agents for Autonomous Surface Vehicle (ASV) navigation. This enables rapid, iterative agent development without the risk of ASV loss or the cost and logistical overhead of marine deployment, and allows landlocked institutions to develop agents for marine applications.
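
As a rough illustration of placing a learned policy above existing low-level control, the sketch below has a DQN choose among discrete heading setpoints while a conventional controller tracks them; the three-action discretization, the stub policy, and the proportional heading controller are assumptions for illustration, not the paper's actual autonomy package.

```python
# Hypothetical high-level/low-level split: the learned policy picks a
# coarse heading command; a conventional controller turns it into an
# actuator-level rate, keeping the agent independent of vehicle dynamics.

HIGH_LEVEL_ACTIONS = [-15.0, 0.0, 15.0]  # heading offsets in degrees (assumed)

def dqn_policy(observation):
    # Stand-in for a trained DQN's argmax over Q-values.
    return 1  # "hold heading"

def low_level_heading_controller(current_heading, target_heading, k_p=0.8):
    # Simple proportional controller; a fielded system would reuse the
    # vehicle's proven low-level stack instead.
    error = (target_heading - current_heading + 180.0) % 360.0 - 180.0
    return k_p * error  # commanded turn rate (deg/s)

def control_step(observation, current_heading):
    offset = HIGH_LEVEL_ACTIONS[dqn_policy(observation)]
    target = (current_heading + offset) % 360.0
    return low_level_heading_controller(current_heading, target)

print(control_step(observation=None, current_heading=90.0))
```

Because the agent only ever sees and emits quantities at this abstract level, the same policy can be dropped onto a ground robot or an ASV whose low-level controllers differ.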


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Lennart Karstensen ◽  
Tobias Behr ◽  
Tim Philipp Pusch ◽  
Franziska Mathis-Ullrich ◽  
Jan Stallkamp

The treatment of cerebro- and cardiovascular diseases requires complex and challenging catheter navigation. Previous attempts to automate catheter navigation lack generalizability. Deep Reinforcement Learning methods show promising results and may be the key to automating catheter navigation through the tortuous vascular tree. This work investigates Deep Reinforcement Learning for guidewire manipulation in a complex, rigid 2D vascular model. A neural network trained by Deep Deterministic Policy Gradients with Hindsight Experience Replay performs well on the low-level control task; however, the high-level path-planning control must be improved further.
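
The core trick of Hindsight Experience Replay is to relabel failed episodes with goals that were actually achieved, so sparse-reward navigation failures still produce useful training data. Below is a minimal sketch of the "final" relabeling strategy; the transition layout and the sparse reward function are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np

# Minimal Hindsight Experience Replay sketch ("final" strategy): each
# transition is stored twice, once with the original goal and once
# relabeled with the goal the episode actually reached.

def sparse_reward(achieved, goal, tol=1e-3):
    return 0.0 if np.linalg.norm(achieved - goal) < tol else -1.0

def her_relabel(episode, goal):
    """episode: list of (state, action, next_state, achieved_position),
    where achieved_position is e.g. the guidewire tip location."""
    achieved_final = episode[-1][3]  # where the tip actually ended up
    buffer = []
    for state, action, next_state, achieved in episode:
        buffer.append((state, action, next_state, goal,
                       sparse_reward(achieved, goal)))
        buffer.append((state, action, next_state, achieved_final,
                       sparse_reward(achieved, achieved_final)))
    return buffer
```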


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6520
Author(s):  
Raquel Fuentetaja ◽  
Angel García-Olaya ◽  
Javier García ◽  
José Carlos González ◽  
Fernando Fernández

Using Automated Planning for the high-level control of robotic architectures is becoming very popular, mainly thanks to its capability to define the tasks to perform declaratively. However, classical planning tasks, even in the basic standard Planning Domain Definition Language (PDDL) format, are still very hard for non-expert engineers to formalize when the use case to model is complex. Human-Robot Interaction (HRI) is one such complex environment. This manuscript describes the rationale followed to design a planning model able to control social autonomous robots interacting with humans. It is the result of the authors' experience modeling use cases for Social Assistive Robotics (SAR) in two healthcare-related areas: Comprehensive Geriatric Assessment (CGA) and non-contact rehabilitation therapies for patients with physical impairments. This work proposes a general definition of these two use cases in a single planning domain, which simplifies management and integration with the robotic software architecture, as well as the addition of new use cases. Results show that the model captures all the relevant aspects of the human-robot interaction in those scenarios, allowing the robot to perform the tasks autonomously using a standard planning-execution architecture.
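
To make the declarative modeling style concrete, here is a toy PDDL fragment embedded in a Python string so it can be written out and fed to any off-the-shelf planner; the predicates and the single greet action are invented for illustration and are far simpler than the paper's CGA and rehabilitation domain.

```python
# Toy PDDL domain sketch for one step of a social-robot interaction.
# Predicate and action names are illustrative, not the paper's model.
HRI_DOMAIN = """
(define (domain toy-hri)
  (:requirements :strips :typing)
  (:types patient)
  (:predicates (greeted ?p - patient)
               (attention-of ?p - patient))
  (:action greet
    :parameters (?p - patient)
    :precondition (attention-of ?p)
    :effect (greeted ?p)))
"""

with open("toy_hri_domain.pddl", "w") as f:
    f.write(HRI_DOMAIN)  # ready for a standard planner, e.g. Fast Downward
```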


2021 ◽  
Vol 11 (3) ◽  
pp. 1291
Author(s):  
Bonwoo Gu ◽  
Yunsick Sung

Gomoku is a two-player board game that originated in ancient China. Gomoku AI has been developed with various artificial intelligence techniques, such as genetic algorithms and tree-search algorithms. Alpha-Gomoku, a Gomoku AI built with AlphaGo's algorithm, defines all possible situations on the Gomoku board using Monte-Carlo tree search (MCTS) and minimizes the probability of learning other correct answers in duplicated Gomoku board situations. However, the accuracy of the tree-search algorithm drops because its classification criteria are set manually. In this paper, we propose an improved reinforcement learning-based high-level decision approach using convolutional neural networks (CNNs). The proposed algorithm expresses each state as a one-hot encoded vector and determines the state of the Gomoku board by combining similar one-hot encoded state vectors. For cases where the stone position determined by the CNN is already occupied or cannot be placed, we suggest a method for selecting an alternative move. We verify the proposed Gomoku AI in GuPyEngine, a Python-based 3D simulation platform.
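
The two mechanics the abstract mentions, one-hot encoding of board states and falling back to an alternative when the network's top choice is illegal, can be sketched as below; the 15x15 board size, the channel layout, and the stub scoring function are assumptions, not the paper's network.

```python
import numpy as np

BOARD = 15  # standard Gomoku board size (assumed here)

def one_hot(board):
    """board: (15, 15) ints with 0=empty, 1=black, 2=white.
    Returns a (3, 15, 15) one-hot tensor suitable as CNN input."""
    return np.stack([(board == v).astype(np.float32) for v in (0, 1, 2)])

def cnn_scores(encoded):
    # Stand-in for the trained CNN's per-cell move scores.
    rng = np.random.default_rng(0)
    return rng.random((BOARD, BOARD))

def select_move(board):
    # Mask occupied cells so that, if the top-scoring cell is already
    # taken, the next-best legal cell is selected as the alternative.
    scores = cnn_scores(one_hot(board))
    scores[board != 0] = -np.inf
    return np.unravel_index(np.argmax(scores), scores.shape)

board = np.zeros((BOARD, BOARD), dtype=int)
board[7, 7] = 1  # occupied center
print(select_move(board))
```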


2021 ◽  
Vol 31 (3) ◽  
pp. 1-26
Author(s):  
Aravind Balakrishnan ◽  
Jaeyoung Lee ◽  
Ashish Gaurav ◽  
Krzysztof Czarnecki ◽  
Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy, having learned that its measurement is unreliable.
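
Domain randomization over perception errors can be as simple as resampling an observation-noise model at every episode so the policy cannot overfit to one error profile. The wrapper below sketches this for a gym-style environment; the Gaussian-plus-dropout noise family and the parameter ranges are illustrative assumptions, not the paper's error model.

```python
import numpy as np

# Sketch of domain-randomized perception noise: at each reset, draw a
# fresh noise scale and dropout rate, then corrupt every observation.

class RandomizedPerceptionWrapper:
    def __init__(self, env, sigma_range=(0.0, 0.2), drop_range=(0.0, 0.1)):
        self.env = env  # assumed gym-style: reset() -> obs, step(a) -> (obs, r, done, info)
        self.sigma_range, self.drop_range = sigma_range, drop_range
        self.sigma, self.drop = 0.0, 0.0
        self.rng = np.random.default_rng()

    def _corrupt(self, obs):
        obs = obs + self.rng.normal(0.0, self.sigma, size=obs.shape)
        mask = self.rng.random(obs.shape) < self.drop  # simulated sensor dropouts
        obs[mask] = 0.0
        return obs

    def reset(self):
        # Re-randomize the perception-error parameters per episode.
        self.sigma = self.rng.uniform(*self.sigma_range)
        self.drop = self.rng.uniform(*self.drop_range)
        return self._corrupt(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._corrupt(obs), reward, done, info
```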


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2534
Author(s):  
Oualid Doukhi ◽  
Deok-Jin Lee

Autonomous navigation and collision-avoidance missions represent a significant challenge for robotic systems, as they generally operate in dynamic environments that require a high level of autonomy and flexible decision-making capabilities. The challenge is even greater for micro aerial vehicles (MAVs) due to their limited size and computational power. This paper presents a novel approach for enabling a micro aerial vehicle equipped with a laser range finder to autonomously navigate among obstacles and reach a user-specified goal location in a GPS-denied environment, without the need for mapping or path planning. The proposed system uses an actor-critic reinforcement learning technique to train the aerial robot in a Gazebo simulator to perform a point-goal navigation task by directly mapping the MAV's noisy state and laser scan measurements to continuous motion control. Although trained entirely in a 3D simulator, the obtained policy performs collision-free flight in the real world. Intensive simulations and real-time experiments were conducted and compared with a nonlinear model predictive control technique to show generalization to new, unseen environments and robustness against localization noise. The results demonstrate the system's effectiveness in flying safely and reaching the desired points by generating smooth forward linear velocities and heading rates.
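
A minimal PyTorch sketch of the kind of actor such a system might use, mapping a laser scan plus vehicle state to a bounded forward velocity and heading rate, is shown below; the layer sizes, input dimensions, and output scaling are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Sketch of a continuous-control actor: laser scan + MAV state in,
# bounded [forward velocity, heading rate] out. Dimensions are assumed.

SCAN_DIM, STATE_DIM = 36, 4       # downsampled scan beams; (x, y, yaw, v)
V_MAX, YAW_RATE_MAX = 1.0, 0.5    # m/s and rad/s output bounds (assumed)

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SCAN_DIM + STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 2), nn.Tanh(),  # squash both outputs to [-1, 1]
        )
        # Per-output scaling to physical units.
        self.register_buffer("scale", torch.tensor([V_MAX, YAW_RATE_MAX]))

    def forward(self, scan, state):
        raw = self.net(torch.cat([scan, state], dim=-1)) * self.scale
        v = torch.clamp(raw[..., :1], min=0.0)  # no reverse flight
        return torch.cat([v, raw[..., 1:]], dim=-1)

actor = Actor()
cmd = actor(torch.zeros(1, SCAN_DIM), torch.zeros(1, STATE_DIM))
print(cmd)  # tensor of [forward velocity, heading rate]
```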

