HIGH-LEVEL CONTROL OF AUTONOMOUS ROBOTS USING A BEHAVIOR-BASED SCHEME AND REINFORCEMENT LEARNING

2002 ◽  
Vol 35 (1) ◽  
pp. 469-474 ◽  
Author(s):  
M. Carreras ◽  
J. Yuh ◽  
J. Batlle


Author(s): 
Rey Pocius ◽  
Lawrence Neal ◽  
Alan Fern

Commonly used sequential decision-making tasks such as the games in the Arcade Learning Environment (ALE) provide rich observation spaces suitable for deep reinforcement learning. However, they consist mostly of low-level control tasks whose fine temporal resolution makes them of limited use for the development of explainable artificial intelligence (XAI). Many of these domains also lack built-in high-level abstractions and symbols. Existing tasks that provide both strategic decision-making and rich observation spaces are either difficult to simulate or intractable. We provide a set of new strategic decision-making tasks specialized for the development and evaluation of explainable AI methods, built as constrained mini-games within the StarCraft II Learning Environment.
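
The abstract does not describe an interface, but a minimal sketch of the underlying idea, exposing symbolic strategic actions over a low-level environment so that each agent decision is human-readable, might look as follows; the `MacroActionEnv` class, its action names, and the assumed `reset()`/`step()` API of the wrapped environment are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: wrapping a low-level environment behind a small
# set of symbolic macro-actions, so each agent decision is a strategic,
# explainable choice rather than a frame-by-frame control. Names are
# illustrative assumptions.

class MacroActionEnv:
    """Exposes symbolic macro-actions over a low-level environment."""

    MACROS = ["attack_nearest", "retreat", "hold_position"]

    def __init__(self, low_level_env):
        self.env = low_level_env  # assumed to offer reset()/step(action)

    def reset(self):
        return self.env.reset()

    def step(self, macro_name):
        # Expand one strategic decision into a sequence of low-level
        # actions, accumulating reward along the way.
        total_reward, done, obs = 0.0, False, None
        for low_level_action in self.expand(macro_name):
            obs, reward, done, info = self.env.step(low_level_action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done, {"macro": macro_name}

    def expand(self, macro_name):
        # Placeholder expansion; a real wrapper would compute this from
        # the current game state (e.g., unit positions in SC2LE).
        return {"attack_nearest": [0, 1],
                "retreat": [2],
                "hold_position": [3]}[macro_name]
```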


2021 ◽  
Vol 8 ◽  
Author(s):  
Reeve Lambert ◽  
Jianwen Li ◽  
Li-Fan Wu ◽  
Nina Mahmoudian

This paper presents a framework to alleviate the training-data sparsity problem that Deep Reinforcement Learning (DRL) faces in challenging domains by introducing a combined agent-training and vehicle-integration methodology. The methodology leverages accessible domains to train an agent to solve navigational problems such as obstacle avoidance, and allows the agent to generalize with minimal further training to challenging and inaccessible domains such as marine environments. This is done by integrating the DRL agent at a high level of vehicle control and leveraging existing path-planning and proven low-level control methodologies that are used across domains. An autonomy package with a tertiary multilevel controller is developed to let the DRL agent interface at the prescribed high control level, insulating it from vehicle dynamics and environmental constraints. An example Deep Q-Network (DQN) employing this methodology for obstacle avoidance is trained in a simulated ground environment, and its ability to generalize across domains is then validated experimentally in a simulated water-surface environment and in real-world deployments of ground and water robotic platforms. The results show that it is possible to leverage accessible, data-rich domains, such as ground environments, to develop effective marine DRL agents for Autonomous Surface Vehicle (ASV) navigation. This enables rapid, iterative agent development without the risk of ASV loss or the cost and logistical overhead of marine deployment, and allows landlocked institutions to develop agents for marine applications.
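
As a rough illustration of placing a learned policy above existing low-level control, the sketch below has a DQN choose among discrete heading setpoints while a conventional controller tracks them; the three-action discretization, the stub policy, and the proportional heading controller are assumptions for illustration, not the paper's actual autonomy package.

```python
# Hypothetical high-level/low-level split: the learned policy picks a
# coarse heading command; a conventional controller turns it into an
# actuator-level rate, keeping the agent independent of vehicle dynamics.

HIGH_LEVEL_ACTIONS = [-15.0, 0.0, 15.0]  # heading offsets in degrees (assumed)

def dqn_policy(observation):
    # Stand-in for a trained DQN's argmax over Q-values.
    return 1  # "hold heading"

def low_level_heading_controller(current_heading, target_heading, k_p=0.8):
    # Simple proportional controller; a fielded system would reuse the
    # vehicle's proven low-level stack instead.
    error = (target_heading - current_heading + 180.0) % 360.0 - 180.0
    return k_p * error  # commanded turn rate (deg/s)

def control_step(observation, current_heading):
    offset = HIGH_LEVEL_ACTIONS[dqn_policy(observation)]
    target = (current_heading + offset) % 360.0
    return low_level_heading_controller(current_heading, target)

print(control_step(observation=None, current_heading=90.0))
```

Because the agent only ever sees and emits quantities at this abstract level, the same policy can be dropped onto a ground robot or an ASV whose low-level controllers differ.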


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Lennart Karstensen ◽  
Tobias Behr ◽  
Tim Philipp Pusch ◽  
Franziska Mathis-Ullrich ◽  
Jan Stallkamp

The treatment of cerebro- and cardiovascular diseases requires complex and challenging catheter navigation. Previous attempts to automate catheter navigation lack generalizability. Deep Reinforcement Learning methods show promising results and may be the key to automating catheter navigation through the tortuous vascular tree. This work investigates Deep Reinforcement Learning for guidewire manipulation in a complex, rigid 2D vascular model. A neural network trained by Deep Deterministic Policy Gradients with Hindsight Experience Replay performs well on the low-level control task; however, the high-level path-planning control must be improved further.
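
The core trick of Hindsight Experience Replay is to relabel failed episodes with goals that were actually achieved, so sparse-reward navigation failures still produce useful training data. Below is a minimal sketch of the "final" relabeling strategy; the transition layout and the sparse reward function are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np

# Minimal Hindsight Experience Replay sketch ("final" strategy): each
# transition is stored twice, once with the original goal and once
# relabeled with the goal the episode actually reached.

def sparse_reward(achieved, goal, tol=1e-3):
    return 0.0 if np.linalg.norm(achieved - goal) < tol else -1.0

def her_relabel(episode, goal):
    """episode: list of (state, action, next_state, achieved_position),
    where achieved_position is e.g. the guidewire tip location."""
    achieved_final = episode[-1][3]  # where the tip actually ended up
    buffer = []
    for state, action, next_state, achieved in episode:
        buffer.append((state, action, next_state, goal,
                       sparse_reward(achieved, goal)))
        buffer.append((state, action, next_state, achieved_final,
                       sparse_reward(achieved, achieved_final)))
    return buffer
```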


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6520
Author(s):  
Raquel Fuentetaja ◽  
Angel García-Olaya ◽  
Javier García ◽  
José Carlos González ◽  
Fernando Fernández

Using Automated Planning for the high-level control of robotic architectures is becoming very popular, mainly thanks to its capability to define the tasks to perform declaratively. However, classical planning tasks, even in the basic standard Planning Domain Definition Language (PDDL) format, are still very hard for non-expert engineers to formalize when the use case to model is complex. Human-Robot Interaction (HRI) is one such complex environment. This manuscript describes the rationale followed to design a planning model able to control social autonomous robots interacting with humans. It is the result of the authors' experience modeling use cases for Social Assistive Robotics (SAR) in two healthcare-related areas: Comprehensive Geriatric Assessment (CGA) and non-contact rehabilitation therapies for patients with physical impairments. This work proposes a general definition of these two use cases in a single planning domain, which simplifies management and integration with the robotic software architecture, as well as the addition of new use cases. Results show that the model captures all the relevant aspects of the human-robot interaction in those scenarios, allowing the robot to perform the tasks autonomously using a standard planning-execution architecture.
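
To make the declarative modeling style concrete, here is a toy PDDL fragment embedded in a Python string so it can be written out and fed to any off-the-shelf planner; the predicates and the single greet action are invented for illustration and are far simpler than the paper's CGA and rehabilitation domain.

```python
# Toy PDDL domain sketch for one step of a social-robot interaction.
# Predicate and action names are illustrative, not the paper's model.
HRI_DOMAIN = """
(define (domain toy-hri)
  (:requirements :strips :typing)
  (:types patient)
  (:predicates (greeted ?p - patient)
               (attention-of ?p - patient))
  (:action greet
    :parameters (?p - patient)
    :precondition (attention-of ?p)
    :effect (greeted ?p)))
"""

with open("toy_hri_domain.pddl", "w") as f:
    f.write(HRI_DOMAIN)  # ready for a standard planner, e.g. Fast Downward
```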


2021 ◽  
Vol 11 (3) ◽  
pp. 1291
Author(s):  
Bonwoo Gu ◽  
Yunsick Sung

Gomoku is a two-player board game that originated in ancient China. Gomoku AI has been developed with various artificial intelligence techniques, such as genetic algorithms and tree-search algorithms. Alpha-Gomoku, a Gomoku AI built with AlphaGo's algorithm, defines all possible situations on the Gomoku board using Monte-Carlo tree search (MCTS) and minimizes the probability of learning other correct answers in duplicated Gomoku board situations. However, the accuracy of the tree-search algorithm drops because its classification criteria are set manually. In this paper, we propose an improved reinforcement learning-based high-level decision approach using convolutional neural networks (CNNs). The proposed algorithm expresses each state as a one-hot encoded vector and determines the state of the Gomoku board by combining similar one-hot encoded state vectors. For cases where the stone position determined by the CNN is already occupied or cannot be placed, we suggest a method for selecting an alternative move. We verify the proposed Gomoku AI in GuPyEngine, a Python-based 3D simulation platform.
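
The two mechanics the abstract mentions, one-hot encoding of board states and falling back to an alternative when the network's top choice is illegal, can be sketched as below; the 15x15 board size, the channel layout, and the stub scoring function are assumptions, not the paper's network.

```python
import numpy as np

BOARD = 15  # standard Gomoku board size (assumed here)

def one_hot(board):
    """board: (15, 15) ints with 0=empty, 1=black, 2=white.
    Returns a (3, 15, 15) one-hot tensor suitable as CNN input."""
    return np.stack([(board == v).astype(np.float32) for v in (0, 1, 2)])

def cnn_scores(encoded):
    # Stand-in for the trained CNN's per-cell move scores.
    rng = np.random.default_rng(0)
    return rng.random((BOARD, BOARD))

def select_move(board):
    # Mask occupied cells so that, if the top-scoring cell is already
    # taken, the next-best legal cell is selected as the alternative.
    scores = cnn_scores(one_hot(board))
    scores[board != 0] = -np.inf
    return np.unravel_index(np.argmax(scores), scores.shape)

board = np.zeros((BOARD, BOARD), dtype=int)
board[7, 7] = 1  # occupied center
print(select_move(board))
```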


2021 ◽  
Vol 31 (3) ◽  
pp. 1-26
Author(s):  
Aravind Balakrishnan ◽  
Jaeyoung Lee ◽  
Ashish Gaurav ◽  
Krzysztof Czarnecki ◽  
Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy, having learned that its measurement is unreliable.
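
Domain randomization over perception errors can be as simple as resampling an observation-noise model at every episode so the policy cannot overfit to one error profile. The wrapper below sketches this for a gym-style environment; the Gaussian-plus-dropout noise family and the parameter ranges are illustrative assumptions, not the paper's error model.

```python
import numpy as np

# Sketch of domain-randomized perception noise: at each reset, draw a
# fresh noise scale and dropout rate, then corrupt every observation.

class RandomizedPerceptionWrapper:
    def __init__(self, env, sigma_range=(0.0, 0.2), drop_range=(0.0, 0.1)):
        self.env = env  # assumed gym-style: reset() -> obs, step(a) -> (obs, r, done, info)
        self.sigma_range, self.drop_range = sigma_range, drop_range
        self.sigma, self.drop = 0.0, 0.0
        self.rng = np.random.default_rng()

    def _corrupt(self, obs):
        obs = obs + self.rng.normal(0.0, self.sigma, size=obs.shape)
        mask = self.rng.random(obs.shape) < self.drop  # simulated sensor dropouts
        obs[mask] = 0.0
        return obs

    def reset(self):
        # Re-randomize the perception-error parameters per episode.
        self.sigma = self.rng.uniform(*self.sigma_range)
        self.drop = self.rng.uniform(*self.drop_range)
        return self._corrupt(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._corrupt(obs), reward, done, info
```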


Sensors ◽  
2021 ◽  
Vol 21 (7) ◽  
pp. 2534
Author(s):  
Oualid Doukhi ◽  
Deok-Jin Lee

Autonomous navigation and collision-avoidance missions represent a significant challenge for robotic systems, as they generally operate in dynamic environments that require a high level of autonomy and flexible decision-making capabilities. The challenge is even greater for micro aerial vehicles (MAVs) due to their limited size and computational power. This paper presents a novel approach for enabling a micro aerial vehicle equipped with a laser range finder to autonomously navigate among obstacles and reach a user-specified goal location in a GPS-denied environment, without the need for mapping or path planning. The proposed system uses an actor-critic reinforcement learning technique to train the aerial robot in a Gazebo simulator to perform a point-goal navigation task by directly mapping the MAV's noisy state and laser scan measurements to continuous motion control. Although trained entirely in a 3D simulator, the obtained policy performs collision-free flight in the real world. Intensive simulations and real-time experiments were conducted and compared with a nonlinear model predictive control technique to show generalization to new, unseen environments and robustness against localization noise. The results demonstrate the system's effectiveness in flying safely and reaching the desired points by generating smooth forward linear velocities and heading rates.
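
A minimal PyTorch sketch of the kind of actor such a system might use, mapping a laser scan plus vehicle state to a bounded forward velocity and heading rate, is shown below; the layer sizes, input dimensions, and output scaling are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Sketch of a continuous-control actor: laser scan + MAV state in,
# bounded [forward velocity, heading rate] out. Dimensions are assumed.

SCAN_DIM, STATE_DIM = 36, 4       # downsampled scan beams; (x, y, yaw, v)
V_MAX, YAW_RATE_MAX = 1.0, 0.5    # m/s and rad/s output bounds (assumed)

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SCAN_DIM + STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 2), nn.Tanh(),  # squash both outputs to [-1, 1]
        )
        # Per-output scaling to physical units.
        self.register_buffer("scale", torch.tensor([V_MAX, YAW_RATE_MAX]))

    def forward(self, scan, state):
        raw = self.net(torch.cat([scan, state], dim=-1)) * self.scale
        v = torch.clamp(raw[..., :1], min=0.0)  # no reverse flight
        return torch.cat([v, raw[..., 1:]], dim=-1)

actor = Actor()
cmd = actor(torch.zeros(1, SCAN_DIM), torch.zeros(1, STATE_DIM))
print(cmd)  # tensor of [forward velocity, heading rate]
```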

