scholarly journals PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making

Author(s):  
Fangkai Yang ◽  
Daoming Lyu ◽  
Bo Liu ◽  
Steven Gustafson

Reinforcement learning and symbolic planning have both been used to build intelligent autonomous agents. Reinforcement learning relies on learning from interactions with real world, which often requires an unfeasibly large amount of experience. Symbolic planning relies on manually crafted symbolic knowledge, which may not be robust to domain uncertainties and changes. In this paper we present a unified framework PEORL that integrates symbolic planning with hierarchical reinforcement learning (HRL) to cope with decision-making in dynamic environment with uncertainties. Symbolic plans are used to guide the agent's task execution and learning, and the learned experience is fed back to symbolic knowledge to improve planning. This method leads to rapid policy search and robust symbolic plans in complex domains. The framework is tested on benchmark domains of HRL.

2021 ◽  
Author(s):  
Daoming Lyu ◽  
Fangkai Yang ◽  
Hugh Kwon ◽  
Bo Liu ◽  
Wen Dong ◽  
...  

Human-robot interactive decision-making is increasingly becoming ubiquitous, and explainability is an influential factor in determining the reliance on autonomy. However, it is not reasonable to trust systems beyond our comprehension, and typical machine learning and data-driven decision-making are black-box paradigms that impede explainability. Therefore, it is critical to establish computational efficient decision-making mechanisms enhanced by explainability-aware strategies. To this end, we propose the Trustworthy Decision-Making (TDM), which is an explainable neuro-symbolic approach by integrating symbolic planning into hierarchical reinforcement learning. The framework of TDM enables the subtask-level explainability from the causal relational and understandable subtasks. Besides, TDM also demonstrates the advantage of the integration between symbolic planning and reinforcement learning, reaping the benefits of both worlds. Experimental results validate the effectiveness of proposed method while improving the explainability in the process of decision-making.


Author(s):  
Daoming Lyu ◽  
Fangkai Yang ◽  
Bo Liu ◽  
Daesub Yoon

Deep reinforcement learning (DRL) has gained great success by learning directly from high-dimensional sensory inputs, yet is notorious for the lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making as it increases the transparency of black-box-style DRL approach and helps the RL practitioners to understand the high-level behavior of the system better. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. The task-level interpretability is enabled by relating symbolic actions to options. This framework features a planner – controller – meta-controller architecture, which takes charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, bringing together the advantages of long-term planning capability with symbolic knowledge and end-to-end reinforcement learning directly from a high-dimensional sensory input. Experimental results validate the interpretability of subtasks, along with improved data efficiency compared with state-of-the-art approaches.


Author(s):  
Juan Marcelo Parra-Ullauri ◽  
Antonio García-Domínguez ◽  
Nelly Bencomo ◽  
Changgang Zheng ◽  
Chen Zhen ◽  
...  

AbstractModern software systems are increasingly expected to show higher degrees of autonomy and self-management to cope with uncertain and diverse situations. As a consequence, autonomous systems can exhibit unexpected and surprising behaviours. This is exacerbated due to the ubiquity and complexity of Artificial Intelligence (AI)-based systems. This is the case of Reinforcement Learning (RL), where autonomous agents learn through trial-and-error how to find good solutions to a problem. Thus, the underlying decision-making criteria may become opaque to users that interact with the system and who may require explanations about the system’s reasoning. Available work for eXplainable Reinforcement Learning (XRL) offers different trade-offs: e.g. for runtime explanations, the approaches are model-specific or can only analyse results after-the-fact. Different from these approaches, this paper aims to provide an online model-agnostic approach for XRL towards trustworthy and understandable AI. We present ETeMoX, an architecture based on temporal models to keep track of the decision-making processes of RL systems. In cases where the resources are limited (e.g. storage capacity or time to response), the architecture also integrates complex event processing, an event-driven approach, for detecting matches to event patterns that need to be stored, instead of keeping the entire history. The approach is applied to a mobile communications case study that uses RL for its decision-making. In order to test the generalisability of our approach, three variants of the underlying RL algorithms are used: Q-Learning, SARSA and DQN. The encouraging results show that using the proposed configurable architecture, RL developers are able to obtain explanations about the evolution of a metric, relationships between metrics, and were able to track situations of interest happening over time windows.


Author(s):  
Syed Ihtesham Hussain Shah ◽  
Giuseppe De Pietro

In decision-making problems reward function plays an important role in finding the best policy. Reinforcement Learning (RL) provides a solution for decision-making problems under uncertainty in an Intelligent Environment (IE). However, it is difficult to specify the reward function for RL agents in large and complex problems. To counter these problems an extension of RL problem named Inverse Reinforcement Learning (IRL) is introduced, where reward function is learned from expert demonstrations. IRL is appealing for its potential use to build autonomous agents, capable of modeling others, deprived of compromising in performance of the task. This approach of learning by demonstrations relies on the framework of Markov Decision Process (MDP). This article elaborates original IRL algorithms along with their close variants to mitigate challenges. The purpose of this paper is to highlight an overview and theoretical background of IRL in the field of Machine Learning (ML) and Artificial Intelligence (AI). We presented a brief comparison between different variants of IRL in this article.


2014 ◽  
Vol 369 (1655) ◽  
pp. 20130480 ◽  
Author(s):  
Matthew Botvinick ◽  
Ari Weinstein

Recent work has reawakened interest in goal-directed or ‘model-based’ choice, where decisions are based on prospective evaluation of potential action outcomes. Concurrently, there has been growing attention to the role of hierarchy in decision-making and action control. We focus here on the intersection between these two areas of interest, considering the topic of hierarchical model-based control. To characterize this form of action control, we draw on the computational framework of hierarchical reinforcement learning, using this to interpret recent empirical findings. The resulting picture reveals how hierarchical model-based mechanisms might play a special and pivotal role in human decision-making, dramatically extending the scope and complexity of human behaviour.


2021 ◽  
Vol 7 ◽  
Author(s):  
Simen Theie Havenstrøm ◽  
Adil Rasheed ◽  
Omer San

Control theory provides engineers with a multitude of tools to design controllers that manipulate the closed-loop behavior and stability of dynamical systems. These methods rely heavily on insights into the mathematical model governing the physical system. However, in complex systems, such as autonomous underwater vehicles performing the dual objective of path following and collision avoidance, decision making becomes nontrivial. We propose a solution using state-of-the-art Deep Reinforcement Learning (DRL) techniques to develop autonomous agents capable of achieving this hybrid objective without having a priori knowledge about the goal or the environment. Our results demonstrate the viability of DRL in path following and avoiding collisions towards achieving human-level decision making in autonomous vehicle systems within extreme obstacle configurations.


Sign in / Sign up

Export Citation Format

Share Document