A Deep Reinforcement Learning Approach to Concurrent Bilateral Negotiation
2021, Vol 35 (2)

Author(s): Pallavi Bagga, Nicola Paoletti, Bedour Alrayes, Kostas Stathis

We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning-based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.
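As an illustration of the supervised pre-training step described above, the sketch below regresses an actor network onto offers from a teacher strategy before any reinforcement learning takes place. It is a minimal sketch only: the network shape, the 8-dimensional market state, and the concession-heuristic teacher are invented for illustration and are not the authors' actual architecture or data.

```python
import torch
import torch.nn as nn

class ActorNet(nn.Module):
    """Maps a market observation to an offer, here a target utility in [0, 1]."""
    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, state):
        return self.net(state)

def pretrain_actor(actor, states, teacher_offers, epochs=50, lr=1e-3):
    """Supervised pre-training on synthetic market data: regress the actor's
    output onto a teacher strategy's offers before RL fine-tuning, cutting
    down the exploration needed once live negotiation starts."""
    opt = torch.optim.Adam(actor.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(actor(states), teacher_offers)
        loss.backward()
        opt.step()
    return actor

# Hypothetical synthetic data: 1024 market states of dimension 8; the teacher
# is a stand-in time-dependent concession heuristic, not the paper's.
states = torch.rand(1024, 8)
teacher_offers = 1.0 - 0.5 * states[:, :1]   # concede as "time" (feature 0) grows
actor = pretrain_actor(ActorNet(state_dim=8), states, teacher_offers)
```

Once pre-trained, such an actor would be plugged into the actor-critic loop and fine-tuned with model-free reinforcement learning against live opponents.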



Author(s): Francesco M. Solinas, Andrea Bellagarda, Enrico Macii, Edoardo Patti, Lorenzo Bottaccioli

2021
Author(s): Andre Menezes, Pedro Vicente, Alexandre Bernardino, Rodrigo Ventura

2021, Vol 4
Author(s): Marina Dorokhova, Christophe Ballif, Nicolas Wyrsch

In the past few years, the importance of electric mobility has increased in response to growing concerns about climate change. However, limited cruising range and sparse charging infrastructure could hinder the large-scale deployment of electric vehicles (EVs). To mitigate this problem, optimal route-planning algorithms are needed. In this paper, we propose a mathematical formulation of the EV-specific routing problem in a graph-theoretical context that incorporates the ability of EVs to recuperate energy. Furthermore, we consider the possibility of recharging en route at intermediary charging stations. As a solution method, we present an off-policy, model-free reinforcement learning approach that aims to generate energy-feasible paths for an EV from source to target. The algorithm was implemented and tested in a case study of a road network in Switzerland. The training procedure has low computing and memory requirements and is suitable for online applications. The results demonstrate the algorithm's capability to make recharging decisions and produce the desired energy-feasible paths.
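The following toy sketch illustrates the kind of off-policy, model-free approach described: tabular Q-learning over an energy-weighted graph, where negative edge costs model recuperation and energy-infeasible moves are penalised. The graph, costs, battery capacity, and hyperparameters are illustrative assumptions; intermediary charging stations and the Swiss road-network case study are omitted for brevity.

```python
import random

# Toy energy-weighted road graph: (u, v) -> energy cost in kWh.
# A negative cost models recuperation (e.g. a downhill edge).
edges = {
    ("A", "B"): 2.0, ("B", "C"): -0.5, ("C", "T"): 1.5,
    ("A", "D"): 1.0, ("D", "T"): 4.0,
}
neighbors = {}
for (u, v) in edges:
    neighbors.setdefault(u, []).append(v)

BATTERY = 5.0   # battery capacity of the toy EV, in kWh
Q = {}          # tabular action values: (node, next_node) -> value

def q_learn(source="A", target="T", episodes=2000,
            alpha=0.1, gamma=0.95, eps=0.2):
    """Off-policy, model-free Q-learning over the graph. Moves that drain
    the battery below zero receive a large penalty and end the episode."""
    for _ in range(episodes):
        node, charge = source, BATTERY
        while node != target and node in neighbors:
            acts = neighbors[node]
            if random.random() < eps:
                a = random.choice(acts)                              # explore
            else:
                a = max(acts, key=lambda v: Q.get((node, v), 0.0))   # exploit
            charge = min(BATTERY, charge - edges[(node, a)])  # recuperation capped
            feasible = charge >= 0.0
            reward = -edges[(node, a)] if feasible else -100.0
            best_next = max((Q.get((a, v), 0.0) for v in neighbors.get(a, [])),
                            default=0.0)
            old = Q.get((node, a), 0.0)
            Q[(node, a)] = old + alpha * (reward + gamma * best_next - old)
            if not feasible:
                break
            node = a

def greedy_path(source="A", target="T"):
    """Extract the learned route by following the highest-valued edges."""
    path, node = [source], source
    while node != target and node in neighbors:
        node = max(neighbors[node], key=lambda v: Q.get((node, v), 0.0))
        path.append(node)
    return path

q_learn()
print(greedy_path())   # e.g. ['A', 'B', 'C', 'T']
```

On this toy graph the agent learns to prefer the route through the recuperating edge (A-B-C-T, net 3.0 kWh) over the shorter but costlier alternative (A-D-T, 5.0 kWh), mirroring the energy-feasibility objective described in the abstract.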


2020, Vol 17 (1), pp. 172988141989834
Author(s): Guoyu Zuo, Qishen Zhao, Jiahao Lu, Jiangeng Li

The goal of reinforcement learning is to enable an agent to learn by using rewards. However, some robotic tasks are naturally specified with sparse rewards, and manually shaping reward functions is difficult. In this article, we propose a general, model-free approach for reinforcement learning to learn robotic tasks with sparse rewards. First, a variant of Hindsight Experience Replay, Curious and Aggressive Hindsight Experience Replay, is proposed to improve the sample efficiency of reinforcement learning methods and avoid the need for complicated reward engineering. Second, based on the Twin Delayed Deep Deterministic policy gradient (TD3) algorithm, demonstrations are leveraged to overcome the exploration problem and speed up policy training. Finally, an action loss is added to the loss function to minimize oscillation of the output actions while maximizing the value of the action. Experiments on simulated robotic tasks are performed with different hyperparameters to verify the effectiveness of our method. The results show that our method effectively solves the sparse-reward problem and achieves a high learning speed.
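Two of the ingredients above lend themselves to a short sketch: hindsight goal relabelling (here the standard "future" strategy, as a stand-in for the proposed Curious and Aggressive variant) and a TD3-style actor loss with an added action-magnitude penalty. The replay format, penalty weight, and reward function are assumptions for illustration, not the authors' implementation.

```python
import random
import torch
import torch.nn as nn

def her_relabel(episode, reward_fn, k=4):
    """Hindsight relabelling ('future' strategy): each transition gains up
    to k extra copies whose goal is replaced by a state achieved later in
    the same episode, turning failed episodes into useful learning signal."""
    out = []
    for t, (s, a, goal, s_next) in enumerate(episode):
        out.append((s, a, goal, s_next, reward_fn(s_next, goal)))
        for _ in range(k):
            _, _, _, achieved = random.choice(episode[t:])
            out.append((s, a, achieved, s_next, reward_fn(s_next, achieved)))
    return out

def actor_loss_with_action_penalty(critic, actor, states, weight=1e-2):
    """TD3-style actor objective: maximise Q(s, pi(s)) while penalising
    large actions to damp oscillation of the output actions."""
    actions = actor(states)
    q = critic(states, actions)
    penalty = weight * actions.pow(2).sum(dim=-1, keepdim=True)
    return (penalty - q).mean()

# Tiny usage with stand-in components (shapes and reward are arbitrary).
episode = [((i,), 0, (9,), (i + 1,)) for i in range(5)]
batch = her_relabel(episode, lambda s_next, g: 0.0 if s_next == g else -1.0, k=2)

actor = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 2), nn.Tanh())
critic = lambda s, a: torch.cat([s, a], dim=-1).sum(dim=-1, keepdim=True)
loss = actor_loss_with_action_penalty(critic, actor, torch.rand(16, 4))
loss.backward()
```

In a full training loop the relabelled transitions would be pushed into the TD3 replay buffer alongside demonstration data, and the penalised actor loss would replace the standard one during policy updates.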


Author(s): Wenxing Liu, Hanlin Niu, Muhammad Nasiruddin Mahyuddin, Guido Herrmann, Joaquin Carrasco
