SkinnerDB: Regret-bounded Query Evaluation via Reinforcement Learning

SkinnerDB uses reinforcement learning for reliable join ordering, exploiting an adaptive processing engine with specialized join algorithms and data structures. It maintains no data statistics and uses no cost or cardinality models. Also, it uses no training workloads nor does it try to link the current query to seemingly similar queries in the past. Instead, it uses reinforcement learning to learn optimal join orders from scratch during the execution of the current query. To that purpose, it divides the execution of a query into many small time slices. Different join orders are tried in different time slices. SkinnerDB merges result tuples generated according to different join orders until a complete query result is obtained. By measuring execution progress per time slice, it identifies promising join orders as execution proceeds. Along with SkinnerDB, we introduce a new quality criterion for query execution strategies. We upper-bound expected execution cost regret, i.e., the expected amount of execution cost wasted due to sub-optimal join order choices. SkinnerDB features multiple execution strategies that are optimized for that criterion. Some of them can be executed on top of existing database systems. For maximal performance, we introduce a customized execution engine, facilitating fast join order switching via specialized multi-way join algorithms and tuple representations. We experimentally compare SkinnerDB’s performance against various baselines, including MonetDB, Postgres, and adaptive processing methods. We consider various benchmarks, including the join order benchmark, TPC-H, and JCC-H, as well as benchmark variants with user-defined functions. Overall, the overheads of reliable join ordering are negligible compared to the performance impact of the occasional, catastrophic join order choice.
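The time-slicing idea in this abstract can be illustrated with a small sketch. The abstract does not say which learning algorithm is used, so the code below substitutes a plain UCB1 multi-armed bandit as a stand-in, treats each candidate join order as one arm, and uses the progress measured in a time slice as the reward. The function names, the join order labels, the progress rates, and the assumption that progress simply adds up across join orders are all hypothetical simplifications, not SkinnerDB's engine.

```python
import math
import random


def ucb1_pick(counts, rewards, t, c=math.sqrt(2)):
    """Return the index of the arm with the highest UCB1 score."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # try every join order once before comparing scores
    return max(
        range(len(counts)),
        key=lambda a: rewards[a] / counts[a]
        + c * math.sqrt(math.log(t) / counts[a]),
    )


def run_time_sliced(join_orders, progress_fn, total_work, max_slices=100_000):
    """Run one join order per time slice until total_work units are done.

    progress_fn(order) is a hypothetical callback returning the progress
    (a reward in [0, 1]) achieved by running `order` for one slice.
    """
    counts = [0] * len(join_orders)
    rewards = [0.0] * len(join_orders)
    done, slices = 0.0, 0
    while done < total_work and slices < max_slices:
        slices += 1
        arm = ucb1_pick(counts, rewards, slices)
        progress = progress_fn(join_orders[arm])
        counts[arm] += 1
        rewards[arm] += progress
        done += progress  # simplification: progress from all orders adds up
    return counts, slices


# Simulated progress per slice for three made-up join orders.
rates = {"A-B-C": 0.1, "B-C-A": 0.9, "C-A-B": 0.3}
rng = random.Random(0)


def simulated_progress(order):
    return rates[order] * rng.uniform(0.9, 1.1)


counts, slices = run_time_sliced(list(rates), simulated_progress, total_work=200.0)
print(dict(zip(rates, counts)), "slices:", slices)
```

In this simulation most slices go to the join order with the highest progress rate, while the slow orders receive only a few exploratory slices. That is the behavior a regret bound is meant to capture.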

Teaching AI Agents Ethical Values Using Reinforcement Learning and Policy Orchestration

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/891 ◽

2019 ◽

Author(s):

Ritesh Noothigattu ◽

Djallel Bouneffouf ◽

Nicholas Mattei ◽

Rachita Chandra ◽

Piyush Madan ◽

...

Keyword(s):

Reinforcement Learning ◽

Ethical Values ◽

Large Role ◽

Learning To Learn ◽

Inverse Reinforcement Learning ◽

Time Step ◽

Novel Approach

Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure that they behave in ways aligned with the values of society, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations and reinforcement learning to learn to maximize environmental rewards. A contextual bandit-based orchestrator then picks between the two policies: constraint-based and environment reward-based. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward-maximizing or constrained policy. In addition, the orchestrator is transparent on which policy is being employed at each time step. We test our algorithms using Pac-Man and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.
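The orchestration step described above can be sketched as a two-armed contextual bandit. The abstract does not name the bandit algorithm, so this sketch uses an epsilon-greedy bandit with one linear reward model per policy as a swapped-in technique. The class name, the feature vector (a bias term plus a "ghost nearby" flag), and the reward rule are hypothetical and chosen only to make the example self-contained.

```python
import random


class EpsilonGreedyOrchestrator:
    """Picks one of several policies per time step from a context vector."""

    def __init__(self, n_features, n_policies=2, epsilon=0.1, lr=0.1, seed=0):
        self.weights = [[0.0] * n_features for _ in range(n_policies)]
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def _predict(self, arm, context):
        # Linear estimate of the reward for following policy `arm`.
        return sum(w * x for w, x in zip(self.weights[arm], context))

    def choose(self, context):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weights))  # explore
        return max(range(len(self.weights)),
                   key=lambda a: self._predict(a, context))  # exploit

    def update(self, arm, context, reward):
        # One stochastic gradient step on the squared prediction error.
        error = reward - self._predict(arm, context)
        self.weights[arm] = [w + self.lr * error * x
                             for w, x in zip(self.weights[arm], context)]


CONSTRAINED, REWARD_MAX = 0, 1  # hypothetical policy indices

orchestrator = EpsilonGreedyOrchestrator(n_features=2)
env_rng = random.Random(1)
for _ in range(2000):
    ghost_nearby = env_rng.choice([0.0, 1.0])
    context = [1.0, ghost_nearby]  # bias term + made-up feature
    arm = orchestrator.choose(context)
    # Made-up reward: the constrained policy pays off near a ghost,
    # the reward-maximizing policy pays off otherwise.
    best = CONSTRAINED if ghost_nearby else REWARD_MAX
    orchestrator.update(arm, context, 1.0 if arm == best else 0.0)

orchestrator.epsilon = 0.0  # evaluate greedily
print(orchestrator.choose([1.0, 1.0]), orchestrator.choose([1.0, 0.0]))
```

Because `choose` returns the index of the selected policy, the orchestrator's decision at each time step is directly inspectable, which mirrors the transparency property mentioned in the abstract.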

ANEGMA: an automated negotiation model for e-markets

Autonomous Agents and Multi-Agent Systems ◽

10.1007/s10458-021-09513-x ◽

2021 ◽

Vol 35 (2) ◽

Author(s):

Pallavi Bagga ◽

Nicola Paoletti ◽

Bedour Alrayes ◽

Kostas Stathis

Keyword(s):

Reinforcement Learning ◽

Deep Neural Network ◽

Automated Negotiation ◽

Learning To Learn ◽

Negotiation Model ◽

Negotiation Strategies ◽

Exploration Time ◽

Model Free ◽

Bilateral Negotiations ◽

Time Required

We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.

Reinforcement Learning Interfaces for Biomedical Database Systems

2006 International Conference of the IEEE Engineering in Medicine and Biology Society ◽

10.1109/iembs.2006.4398892 ◽

2006 ◽

Author(s):

I. Rudowsky ◽

O. Kulyba ◽

M. Kunin ◽

S. Parsons ◽

T. Raphan

Keyword(s):

Reinforcement Learning ◽

Database Systems

Collection-intersect join algorithms for parallel object-oriented database systems

Euro-Par’98 Parallel Processing - Lecture Notes in Computer Science ◽

10.1007/bfb0057894 ◽

1998 ◽

pp. 505-512 ◽

Cited By ~ 3

Author(s):

David Taniar ◽

J. Wenny Rahayu

Keyword(s):

Object Oriented ◽

Database Systems ◽

Join Algorithms ◽

Object Oriented Database

Using Deep Reinforcement Learning to Learn High-Level Policies on the ATRIAS Biped

2019 International Conference on Robotics and Automation (ICRA) ◽

10.1109/icra.2019.8793864 ◽

2019 ◽

Cited By ~ 6

Author(s):

Tianyu Li ◽

Hartmut Geyer ◽

Christopher G. Atkeson ◽

Akshara Rai

Keyword(s):

Reinforcement Learning ◽

Learning To Learn ◽

High Level

Action learning and grounding in simulated human–robot interactions

The Knowledge Engineering Review ◽

10.1017/s0269888919000079 ◽

2019 ◽

Vol 34 ◽

Author(s):

Oliver Roesler ◽

Ann Nowé

Keyword(s):

Reinforcement Learning ◽

Natural Language ◽

Action Learning ◽

Learning To Learn ◽

Human Tutor ◽

Small Set ◽

Interaction Experiment ◽

Object Shapes ◽

Situational Learning ◽

Natural Way

In order to enable robots to interact with humans in a natural way, they need to be able to autonomously learn new tasks. The most natural way for humans to tell another agent, which can be a human or robot, to perform a task is via natural language. Thus, natural human–robot interactions also require robots to understand natural language, i.e. extract the meaning of words and phrases. To do this, words and phrases need to be linked to their corresponding percepts through grounding. Afterward, agents can learn the optimal micro-action patterns to reach the goal states of the desired tasks. Most previous studies investigated only learning of actions or grounding of words, but not both. Additionally, they often used only a small set of tasks as well as very short and unnaturally simplified utterances. In this paper, we introduce a framework that uses reinforcement learning to learn actions for several tasks and cross-situational learning to ground actions, object shapes and colors, and prepositions. The proposed framework is evaluated through a simulated interaction experiment between a human tutor and a robot. The results show that the employed framework can be used for both action learning and grounding.

Meta Reinforcement Learning with Task Embedding and Shared Policy

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/387 ◽

2019 ◽

Cited By ~ 2

Author(s):

Lin Lan ◽

Zhenguo Li ◽

Xiaohong Guan ◽

Pinghui Wang

Keyword(s):

Reinforcement Learning ◽

The Other ◽

Specific Information ◽

Significant Progress ◽

Learning To Learn ◽

Learning Capacity ◽

Shared Information ◽

Meta Learning ◽

The One ◽

High Level

Despite significant progress, deep reinforcement learning (RL) suffers from data-inefficiency and limited generalization. Recent efforts apply meta-learning to learn a meta-learner from a set of RL tasks such that a novel but related task could be solved quickly. Though specific in some ways, different tasks in meta-RL are generally similar at a high level. However, most meta-RL methods do not explicitly and adequately model the specific and shared information among different tasks, which limits their ability to learn training tasks and to generalize to novel tasks. In this paper, we propose to capture the shared information on the one hand and meta-learn how to quickly abstract the specific information about a task on the other hand. Methodologically, we train an SGD meta-learner to quickly optimize a task encoder for each task, which generates a task embedding based on past experience. Meanwhile, we learn a policy which is shared across all tasks and conditioned on task embeddings. Empirical results on four simulated tasks demonstrate that our method has better learning capacity on both training and novel tasks and attains up to 3 to 4 times higher returns compared to baselines.

Integrating an Observer in Interactive Reinforcement Learning to Learn Legible Trajectories

2020 29th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) ◽

10.1109/ro-man47096.2020.9223338 ◽

2020 ◽

Author(s):

Manuel Bied ◽

Mohamed Chetouani

Keyword(s):

Reinforcement Learning ◽

Learning To Learn

Efficient hindsight reinforcement learning using demonstrations for robotic tasks with sparse rewards

International Journal of Advanced Robotic Systems ◽

10.1177/1729881419898342 ◽

2020 ◽

Vol 17 (1) ◽

pp. 172988141989834

Author(s):

Guoyu Zuo ◽

Qishen Zhao ◽

Jiahao Lu ◽

Jiangeng Li

Keyword(s):

Reinforcement Learning ◽

Gradient Algorithm ◽

Learning To Learn ◽

Model Free ◽

Learning Speed ◽

Policy Gradient ◽

Experience Replay ◽

Speed Up ◽

Reward Functions ◽

Robotic Tasks

The goal of reinforcement learning is to enable an agent to learn by using rewards. However, some robotic tasks are naturally specified with sparse rewards, and manually shaping reward functions is difficult. In this article, we propose a general and model-free approach for reinforcement learning to learn robotic tasks with sparse rewards. First, a variant of Hindsight Experience Replay, Curious and Aggressive Hindsight Experience Replay, is proposed to improve the sample efficiency of reinforcement learning methods and avoid the need for complicated reward engineering. Second, based on the Twin Delayed Deep Deterministic policy gradient algorithm, demonstrations are leveraged to overcome the exploration problem and speed up the policy training process. Finally, an action loss is added to the loss function in order to minimize the vibration of the output action while maximizing the value of the action. Experiments on simulated robotic tasks are performed with different hyperparameters to verify the effectiveness of our method. Results show that our method can effectively solve the sparse reward problem and obtain a high learning speed.
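For readers unfamiliar with the base technique, the sketch below shows goal relabeling as done in standard Hindsight Experience Replay with the "final" strategy. It does not implement the Curious and Aggressive variant proposed in this article, whose details the abstract does not give. The `Transition` type, the reward convention (0 on success, -1 otherwise), and the toy 1-D episode are assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    state: Tuple[int, ...]
    action: int
    next_state: Tuple[int, ...]
    goal: Tuple[int, ...]
    reward: float


def sparse_reward(achieved: Tuple[int, ...], goal: Tuple[int, ...]) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending its final state was the goal all along."""
    if not episode:
        return []
    final = episode[-1].next_state
    return [
        t._replace(goal=final, reward=sparse_reward(t.next_state, final))
        for t in episode
    ]


# Toy 1-D episode: the agent walks from 0 to 3 but the goal was 5.
goal = (5,)
episode = [
    Transition((i,), +1, (i + 1,), goal, sparse_reward((i + 1,), goal))
    for i in range(3)
]
relabeled = relabel_with_final_goal(episode)

# The replay buffer stores both the original and the relabeled transitions.
replay_buffer = episode + relabeled
print([t.reward for t in episode], [t.reward for t in relabeled])
```

The original episode carries no success signal, while the relabeled copy ends with a successful transition. This is how hindsight relabeling gives a learner something to learn from under sparse rewards.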

Benchmarking the Performance Impact of Transport Layer Security in Cloud Database Systems

2014 IEEE International Conference on Cloud Engineering ◽

10.1109/ic2e.2014.48 ◽

2014 ◽

Cited By ~ 11

Author(s):

Steffen Muller ◽

David Bermbach ◽

Stefan Tai ◽

Frank Pallas

Keyword(s):

Database Systems ◽

Transport Layer ◽

Performance Impact ◽

Transport Layer Security ◽

Cloud Database
