Batch mode reinforcement learning based on the synthesis of artificial trajectories

2012 ◽  
Vol 208 (1) ◽  
pp. 383-416 ◽  
Author(s):  
Raphael Fonteneau ◽  
Susan A. Murphy ◽  
Louis Wehenkel ◽  
Damien Ernst
Author(s):  
Gargya Gokhale ◽  
Bert Claessens ◽  
Chris Develder

We consider the problem of coordinating the charging of an entire fleet of electric vehicles (EVs) using a model-free approach, i.e., purely data-driven reinforcement learning (RL). The objective of the RL-based control is to optimize charging actions while fulfilling all EV charging constraints (e.g., timely completion of the charging). In particular, we focus on batch-mode learning and adopt fitted Q-iteration (FQI). A core component of FQI is approximating the Q-function using a regression technique; the policy is then derived from the approximated Q-function. Recently, a dueling neural network architecture was proposed and shown to lead to better policy evaluation in the presence of many similar-valued actions, as applied in a computer game context. The main research contributions of the current paper are that (i) we develop a dueling neural network approach for the setting of joint coordination of an entire EV fleet, and (ii) we evaluate its performance and compare it to an all-knowing benchmark and to an FQI approach using the EXTRA trees regression technique, a popular approach in current EV-related works. We present a case study in which RL agents are trained with an epsilon-greedy approach for different objectives: (a) cost minimization and (b) maximization of self-consumption of local renewable energy sources. Our results indicate that RL agents achieve significant cost reductions (70–80%) compared to a business-as-usual scenario without smart charging. Comparing the dueling neural network regression to EXTRA trees indicates that, for our case study's EV fleet parameters and training scenario, the EXTRA trees-based agents achieve higher performance in terms of both lower costs (or higher self-consumption) and stronger robustness, i.e., less variation among trained agents. This suggests that adopting dueling neural networks in this EV setting is not particularly beneficial, in contrast to the Atari game context where the idea originated.
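As a rough illustration of the batch-mode FQI loop the abstract refers to, the sketch below repeatedly regresses Q-targets on a fixed batch of transitions and then derives a greedy policy. It is not the authors' implementation: the regressor is a trivial lookup table standing in for EXTRA trees or a dueling network, and the two-state toy problem, the function names (`fit_table`, `fitted_q_iteration`, `greedy_policy`), and the discount factor are all assumptions made for the example.

```python
from collections import defaultdict

def fit_table(inputs, targets):
    """Stand-in 'regressor' (assumption): average the targets per (state, action)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, y in zip(inputs, targets):
        sums[key] += y
        counts[key] += 1
    table = {k: sums[k] / counts[k] for k in sums}
    # Unseen (state, action) pairs default to a Q-value of 0.0.
    return lambda s, a: table.get((s, a), 0.0)

def fitted_q_iteration(batch, actions, gamma=0.9, n_iterations=20):
    """batch: list of (state, action, reward, next_state, done) transitions."""
    q = lambda s, a: 0.0  # Q_0 is identically zero
    inputs = [(s, a) for s, a, _, _, _ in batch]
    for _ in range(n_iterations):
        # Build regression targets from the previous Q estimate.
        targets = [
            r if done else r + gamma * max(q(s2, b) for b in actions)
            for _, _, r, s2, done in batch
        ]
        # Refit the regressor on the whole batch.
        q = fit_table(inputs, targets)
    return q

def greedy_policy(q, actions):
    """Derive the policy from the fitted Q-function."""
    return lambda s: max(actions, key=lambda a: q(s, a))

# Hypothetical toy problem: action 1 moves from state 0 to state 1,
# and taking action 1 in state 1 ends the episode with reward 1.
batch = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 1, False),
    (1, 1, 1.0, 1, True),
]
q = fitted_q_iteration(batch, actions=[0, 1])
policy = greedy_policy(q, [0, 1])
print(policy(0), policy(1))
```

Swapping `fit_table` for a real regressor changes only the fitting step; the target construction and the greedy policy extraction stay the same.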


Author(s):  
Jiahang Liu ◽  
Lei Zuo ◽  
Xin Xu ◽  
Xinglong Zhang ◽  
Junkai Ren ◽  
...  

2013 ◽  
Vol 51 (5) ◽  
pp. 3355-3385 ◽  
Author(s):  
R. Fonteneau ◽  
D. Ernst ◽  
B. Boigelot ◽  
Q. Louveaux

Decision ◽  
2016 ◽  
Vol 3 (2) ◽  
pp. 115-131 ◽  
Author(s):  
Helen Steingroever ◽  
Ruud Wetzels ◽  
Eric-Jan Wagenmakers
