Search-Based Planning and Reinforcement Learning for Autonomous Systems and Robotics

2021 ◽  
pp. 481-501
Author(s):  
Than Le ◽  
Bui Thanh Hung ◽  
Pham Van Huy
2020 ◽  
Author(s):  
Than Le

<p>In this chapter, we address how competent autonomous vehicles should be able to analyze structured and unstructured environments and then localize themselves relative to surrounding objects in situations where GPS, RFID, or other similar means cannot provide enough information about their location. Reliable SLAM is the most basic prerequisite for any further artificial-intelligence tasks of an autonomous mobile robot. The goal of this chapter is to simulate a SLAM process on an advanced software platform: the model represents the system itself, whereas the simulation represents the operation of the system over time, and the software architecture lets us concentrate on the essential work with the least trivial effort. The platform is an open-source meta-operating system that provides extensive tools for robotics-related problems.</p> <p>Specifically, we address how advanced vehicles should be able to analyze structured and unstructured environments by solving the search-based planning problem, and we then discuss reinforcement-learning-based models for optimizing trajectories in autonomous systems.</p>
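The search-based planning referred to here is commonly realized as graph search over an occupancy grid. As a hedged illustration (a minimal sketch, not the chapter's own code), here is an A* planner on a 4-connected 2D grid with a Manhattan-distance heuristic; the grid representation and function signature are assumptions made for this example:

```python
import heapq

def a_star(grid, start, goal):
    """Minimal A* search on a 4-connected occupancy grid.
    grid[r][c] == 0 means free, 1 means an obstacle."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan distance: admissible on a 4-connected grid
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start, None)]  # (f = g + h, g, node, parent)
    parents, best_g = {}, {start: 0}
    while frontier:
        _, g, node, parent = heapq.heappop(frontier)
        if node in parents:        # already expanded with a better cost
            continue
        parents[node] = parent
        if node == goal:           # walk parents back to recover the path
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt, node))
    return None  # goal unreachable
```

On a real vehicle the grid would come from the SLAM map, and variants such as weighted A* or anytime planners trade path optimality for planning speed.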




Author(s):  
Luca Tonin ◽  
José del R. Millán

The last decade has seen a flowering of applications driven by brain–machine interfaces (BMIs), particularly brain-actuated robotic devices designed to restore the independence of people suffering from severe motor disabilities. This review provides an overview of the state of the art of noninvasive BMI-driven devices based on 86 studies published in the last 15 years, with an emphasis on the interactions among the user, the BMI system, and the robot. We found that BMIs are used mostly to drive devices for navigation (e.g., telepresence mobile robots), with BMI paradigms based mainly on exogenous stimulation, and the majority of brain-actuated robots adopt a discrete control strategy. Most critically, in only a few works have disabled people evaluated a brain-actuated robot. The review highlights the most urgent challenges in the field, from the integration between BMI and robotics to the need for a user-centered design to boost the translational impact of BMIs. Expected final online publication date for the Annual Review of Control, Robotics, and Autonomous Systems, Volume 4 is May 3, 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.


2019 ◽  
Vol 15 (3) ◽  
pp. 283-293 ◽  
Author(s):  
Yohann Rioual ◽  
Johann Laurent ◽  
Jean-Philippe Diguet

IoT and autonomous systems are in charge of an increasing number of sensing, processing, and communication tasks. These systems may be equipped with energy-harvesting devices; nevertheless, the harvested energy is uncertain and variable, which makes it difficult to manage the energy in these systems. Reinforcement learning algorithms can handle such uncertainties; however, selecting the appropriate algorithm is a difficult problem. Many algorithms are available, and each has its own advantages and drawbacks. In this paper, we provide an overview of different approaches to help designers determine the most appropriate algorithm for their application and system. We focus on Q-learning, a popular reinforcement learning algorithm, and several of its variants. Q-learning is based on a look-up table, although some variants use a neural network instead. We compare different variants of Q-learning for the energy management of a sensor node and show that, depending on the desired performance and the constraints inherent in the node's application, the appropriate approach changes.
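For readers unfamiliar with the baseline, a minimal look-up-table Q-learning loop can be sketched as below. The 5-state chain environment, reward, and hyperparameters are illustrative assumptions, not the paper's actual sensor-node setup:

```python
import random
from collections import defaultdict

def step(state, action):
    """Toy 5-state chain: action 1 moves right, 0 moves left.
    Reaching state 4 yields reward 1 and ends the episode."""
    nxt = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == 4 else 0.0), nxt == 4

def greedy(Q, s, rng):
    """Greedy action with random tie-breaking."""
    q0, q1 = Q[(s, 0)], Q[(s, 1)]
    return rng.randrange(2) if q0 == q1 else (0 if q0 > q1 else 1)

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # the look-up table: (state, action) -> value
    for _ in range(episodes):
        s = 0
        for _ in range(200):  # cap episode length
            # epsilon-greedy exploration
            a = rng.randrange(2) if rng.random() < epsilon else greedy(Q, s, rng)
            s2, r, done = step(s, a)
            # Q-learning update: bootstrap on the best next-state value
            best_next = 0.0 if done else max(Q[(s2, 0)], Q[(s2, 1)])
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
            if done:
                break
    return Q
```

After training, the greedy policy moves right from every state. Variants differ mainly in how `best_next` is estimated: SARSA bootstraps on the action actually taken, while DQN replaces the table with a neural network.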


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4410 ◽  
Author(s):  
Seunghwan Jeong ◽  
Gwangpyo Yoo ◽  
Minjong Yoo ◽  
Ikjun Yeom ◽  
Honguk Woo

Hyperconnectivity via modern Internet of Things (IoT) technologies has recently driven us to envision the “digital twin”, in which physical attributes are all embedded and their latest updates are synchronized on digital spaces in a timely fashion. From the point of view of cyberphysical system (CPS) architectures, the goals of digital twin include providing common programming abstraction on the same level as databases, thereby facilitating seamless integration of real-world physical objects and digital assets at several different system layers. However, the inherent limitations of sampling and observing physical attributes often pose issues related to data uncertainty in practice. In this paper, we propose a learning-based data management scheme whose implementation is layered between the sensors attached to physical attributes and domain-specific applications, thereby mitigating the data uncertainty between them. To do so, we present a sensor data management framework, namely D2WIN, which adopts reinforcement learning (RL) techniques to manage data quality for CPS applications and autonomous systems. To deal with the scale issue incurred by many physical attributes and sensor streams when adopting RL, we propose an action embedding strategy that exploits their distance-based similarity in physical space. We introduce two embedding methods, i.e., a user-defined function and a generative model, for different conditions. Through experiments, we demonstrate that the D2WIN framework with action embedding outperforms several known heuristics in terms of achievable data quality under certain resource restrictions. We also test the framework with an autonomous driving simulator, clearly showing its benefit. For example, with only 30% of updates selectively applied by the learned policy, the driving agent maintains about 96.2% of its performance compared to the ideal condition with full updates.
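The distance-based action embedding can be illustrated with a toy sketch. The sensor names, coordinates, and user-defined embedding below are hypothetical, not D2WIN's actual interface: a continuous policy output is resolved to the nearest concrete sensor-update actions in the embedding space.

```python
import math

def embed(sensor_pos):
    """User-defined embedding: map each sensor (action) to its
    physical coordinates."""
    return sensor_pos

def nearest_actions(query, sensors, k=2):
    """Resolve a continuous policy output to the k nearest concrete
    sensor-update actions by Euclidean distance in the embedding space."""
    ranked = sorted(sensors.items(),
                    key=lambda item: math.dist(query, embed(item[1])))
    return [sensor_id for sensor_id, _ in ranked[:k]]

# Hypothetical sensors placed at physical coordinates
sensors = {"s1": (0.0, 0.0), "s2": (1.0, 0.0), "s3": (5.0, 5.0)}
```

Nearby sensors map to nearby embeddings, so the policy can generalize across many sensor streams without enumerating every discrete action; the paper's generative-model variant learns such a mapping instead of defining it by hand.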


Author(s):  
Axel Barrau ◽  
Silvère Bonnabel

The Kalman filter—or, more precisely, the extended Kalman filter (EKF)—is a fundamental engineering tool that is pervasively used in control and robotics and for various estimation tasks in autonomous systems. The recently developed field of invariant extended Kalman filtering uses the geometric structure of the state space and the dynamics to improve the EKF, notably in terms of mathematical guarantees. The methodology essentially applies in the fields of localization, navigation, and simultaneous localization and mapping (SLAM). Although it was created only recently, its remarkable robustness properties have already motivated a real industrial implementation in the aerospace field. This review aims to provide an accessible introduction to the methodology of invariant Kalman filtering and to allow readers to gain insight into the relevance of the method as well as its important differences with the conventional EKF. This should be of interest to readers intrigued by the practical application of mathematical theories and those interested in finding robust, simple-to-implement filters for localization, navigation, and SLAM, notably for autonomous vehicle guidance.
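For context, one cycle of the conventional EKF that invariant filtering builds on can be sketched as follows. This is the generic textbook form, not the invariant-EKF update itself, which instead defines the estimation error through the group structure of the state space (e.g., on SE(2) or SE(3)) rather than by vector subtraction:

```python
import numpy as np

def ekf_step(x, P, u, z, f, F, h, H, Q, R):
    """One predict/update cycle of the conventional EKF.
    f, h: nonlinear motion and measurement models; F, H: their Jacobians
    evaluated at the current estimate; Q, R: process/measurement noise."""
    # Predict: propagate the state, inflate covariance with process noise
    x_pred = f(x, u)
    P_pred = F @ P @ F.T + Q
    # Update: fuse the measurement through the Kalman gain
    y = z - h(x_pred)                    # innovation
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

The vector-space correction `x_pred + K @ y` is precisely the step the invariant EKF replaces with a group operation, which is what yields its stronger consistency guarantees in localization and SLAM.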


Entropy ◽  
2021 ◽  
Vol 23 (4) ◽  
pp. 461
Author(s):  
Jeongho Park ◽  
Juwon Lee ◽  
Taehwan Kim ◽  
Inkyung Ahn ◽  
Jooyoung Park

The problem of finding adequate population models in ecology is important for understanding essential aspects of their dynamic nature. Since analyzing and accurately predicting the intelligent adaptation of multiple species is difficult due to their complex interactions, the study of population dynamics remains a challenging task in computational biology. In this paper, we use a modern deep reinforcement learning (RL) approach to explore a new avenue for understanding predator-prey ecosystems. Recently, reinforcement learning methods have achieved impressive results in areas such as games and robotics. RL agents generally focus on building strategies for taking actions in an environment in order to maximize their expected returns. Here we frame the co-evolution of predators and prey in an ecosystem as a multi-agent reinforcement learning problem, allowing agents to learn and evolve toward better policies. Recent significant advances in reinforcement learning allow for new perspectives on these types of ecological issues. Our simulation results show that, throughout the scenarios with RL agents, predators can achieve a reasonable level of sustainability along with their prey.


Robotics ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 46
Author(s):  
Jonathan Fugal ◽  
Jihye Bae ◽  
Hasan A. Poonawala

Advances in machine learning technologies in recent years have facilitated developments in autonomous robotic systems. Designing these autonomous systems typically requires manually specified models of the robotic system and world when using classical control-based strategies, or time-consuming and computationally expensive data-driven training when using learning-based strategies. Combining classical control with learning-based strategies may mitigate both requirements. However, the performance of the combined control system is not obvious given that there are two separate controllers. This paper focuses on one such combination, which uses gravity compensation together with reinforcement learning (RL). We present a study of the effects of gravity compensation on the performance of two reinforcement learning algorithms when solving reaching tasks using a simulated seven-degree-of-freedom robotic arm. The results of our study demonstrate that gravity compensation coupled with RL can reduce the training required in reaching tasks involving elevated target locations, but not all target locations.
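The control combination studied can be sketched in a few lines. The single-link parameters below are illustrative, not the paper's seven-degree-of-freedom arm: the commanded torque is the sum of a model-based gravity-compensation term and the RL policy's action, so the policy only has to learn the residual, gravity-free dynamics.

```python
import math

def gravity_torque(q, m=1.0, l=0.5, g=9.81):
    """Gravity torque on a single link of mass m (kg) and length l (m)
    at joint angle q (rad, measured from vertical) -- a stand-in for
    the full arm's g(q) vector; parameters are made up for the sketch."""
    return m * g * l * math.sin(q)

def combined_command(q, rl_action):
    """Torque sent to the joint: gravity compensation plus the RL action."""
    return gravity_torque(q) + rl_action
```

With the link hanging straight down (`q = 0`), the compensation term vanishes and the RL action passes through unchanged; near horizontal, the model-based term carries most of the load, which is consistent with the paper's finding that the benefit concentrates on elevated targets.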


Author(s):  
Juan Marcelo Parra-Ullauri ◽  
Antonio García-Domínguez ◽  
Nelly Bencomo ◽  
Changgang Zheng ◽  
Chen Zhen ◽  
...  

Modern software systems are increasingly expected to show higher degrees of autonomy and self-management to cope with uncertain and diverse situations. As a consequence, autonomous systems can exhibit unexpected and surprising behaviours. This is exacerbated by the ubiquity and complexity of Artificial Intelligence (AI)-based systems. This is the case for Reinforcement Learning (RL), where autonomous agents learn through trial and error how to find good solutions to a problem. Thus, the underlying decision-making criteria may become opaque to the users who interact with the system and who may require explanations about the system’s reasoning. Available work on eXplainable Reinforcement Learning (XRL) offers different trade-offs: e.g., for runtime explanations, the approaches are model-specific or can only analyse results after the fact. In contrast to these approaches, this paper aims to provide an online, model-agnostic approach for XRL towards trustworthy and understandable AI. We present ETeMoX, an architecture based on temporal models to keep track of the decision-making processes of RL systems. In cases where resources are limited (e.g., storage capacity or time to respond), the architecture also integrates complex event processing, an event-driven approach, to detect matches to event patterns that need to be stored, instead of keeping the entire history. The approach is applied to a mobile communications case study that uses RL for its decision-making. To test the generalisability of our approach, three variants of the underlying RL algorithms are used: Q-Learning, SARSA and DQN. The encouraging results show that, using the proposed configurable architecture, RL developers are able to obtain explanations about the evolution of a metric and the relationships between metrics, and to track situations of interest happening over time windows.
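The complex-event-processing idea can be sketched minimally. The event shape and the "reward drop" pattern below are illustrative assumptions, not ETeMoX's actual event model: rather than storing the entire decision history, only matches to a pattern of interest are kept.

```python
def detect_drops(stream, threshold=0.5):
    """Scan a stream of (step, reward) events and retain only pattern
    matches -- here, a reward drop larger than `threshold` between
    consecutive steps -- instead of the whole history."""
    matches = []
    for (t0, r0), (t1, r1) in zip(stream, stream[1:]):
        if r0 - r1 > threshold:
            matches.append({"at": t1, "drop": r0 - r1})
    return matches
```

Under a storage budget, only the matched events need to be persisted for later explanation; everything else in the stream can be discarded after the sliding comparison window moves past it.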

