Reinforcement Learning for Electric Vehicle Charging using Dueling Neural Networks

We consider the problem of coordinating the charging of an entire fleet of electric vehicles (EVs), using a model-free approach, i.e. purely data-driven reinforcement learning (RL). The objective of the RL-based control is to optimize charging actions while fulfilling all EV charging constraints (e.g. timely completion of the charging). In particular, we focus on batch-mode learning and adopt fitted Q-iteration (FQI). A core component of FQI is approximating the Q-function using a regression technique, from which the policy is derived. Recently, a dueling neural networks architecture was proposed and shown to lead to better policy evaluation in the presence of many similar-valued actions, as applied in a computer game context. The main research contributions of the current paper are that (i) we develop a dueling neural networks approach for the setting of joint coordination of an entire EV fleet, and (ii) we evaluate its performance and compare it to an all-knowing benchmark and to an FQI approach using the EXTRA trees regression technique, a popular approach currently discussed in EV-related works. We present a case study where RL agents are trained with an epsilon-greedy approach for different objectives: (a) cost minimization, and (b) maximization of self-consumption of local renewable energy sources. Our results indicate that RL agents achieve significant cost reductions (70–80%) compared to a business-as-usual scenario without smart charging. Comparing the dueling neural networks regression to EXTRA trees indicates that, for our case study's EV fleet parameters and training scenario, the EXTRA trees-based agents achieve higher performance in terms of both lower costs (or higher self-consumption) and stronger robustness, i.e. less variation among trained agents. This suggests that adopting dueling neural networks in this EV setting is not particularly beneficial, as opposed to the Atari game context from which this idea originated.
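For readers unfamiliar with the dueling architecture mentioned in the abstract, the following is a minimal sketch of its aggregation step only, in which a scalar state value V(s) and per-action advantages A(s, a) are combined as Q(s, a) = V(s) + A(s, a) - mean(A(s, ·)). The function name `dueling_q_values` and the toy numbers are illustrative assumptions; they are not taken from the paper, whose network layout and training details are not given here.

```python
import numpy as np

def dueling_q_values(state_value, advantages):
    """Combine a state value and per-action advantages into Q-values.

    Subtracting the mean advantage keeps the split between the value
    stream and the advantage stream identifiable.
    """
    advantages = np.asarray(advantages, dtype=float)
    return state_value + advantages - advantages.mean()

# Toy example (hypothetical numbers): one state, three charging actions.
q = dueling_q_values(2.0, [0.5, -0.5, 0.0])
print(q)  # Q-values 2.5, 1.5 and 2.0
```

Because the mean advantage is removed, the average of the resulting Q-values equals the state value, which is what lets the value stream be learned independently of which action is taken.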

Shaping Model-Free Reinforcement-Learning with Model-Based Pseudorewards

10.32470/ccn.2018.1191-0 ◽

2018 ◽

Author(s):

Paul Krueger ◽

Thomas Griffiths

Keyword(s):

Reinforcement Learning ◽

Model Based ◽

Model Free

An Efficient Hybrid Forecasting Approach for Wind Speed Time Series

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i9.404 ◽

2017 ◽

Vol 7 (9) ◽

pp. 13

Author(s):

Bhargavi Munnaluri ◽

K. Ganesh Reddy

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Wind Speed ◽

Fossil Fuels ◽

Renewable Energy Sources ◽

Support Vector ◽

Hybrid Forecasting ◽

Artificial Neural ◽

Future Utilization

Wind forecasting is one of the most efficient ways to deal with the challenges of wind power generation. Due to the depletion of fossil fuels, renewable energy sources play a major role in power generation. For future management and utilization of power, we need to predict the wind speed. In this paper, an efficient hybrid forecasting approach combining Support Vector Machine (SVM) and Artificial Neural Networks (ANN) is proposed to improve the quality of wind speed prediction. Because wind has many varying parameters, it is difficult to predict the wind speed accurately. The proposed hybrid forecasting model is examined on hourly wind speed data from past years, and it reduces the prediction error, measured by Mean Square Error, by 0.019. The result obtained from the Artificial Neural Networks improves the forecasting quality.

Model-Based and Model-Free Social Cognition

10.31234/osf.io/ue6j2 ◽

2019 ◽

Author(s):

Leor M Hackel ◽

Jeffrey Jordan Berg ◽

Björn Lindström ◽

David Amodio

Keyword(s):

Reinforcement Learning ◽

Social Cognition ◽

Learning Strategies ◽

Memory Systems ◽

Learning Task ◽

Financial Advisors ◽

Model Based ◽

Model Free ◽

Systems Model ◽

Task Assessment

Do habits play a role in our social impressions? To investigate the contribution of habits to the formation of social attitudes, we examined the roles of model-free and model-based reinforcement learning in social interactions—computations linked in past work to habit and planning, respectively. Participants in this study learned about novel individuals in a sequential reinforcement learning paradigm, choosing financial advisors who led them to high- or low-paying stocks. Results indicated that participants relied on both model-based and model-free learning, such that each independently predicted choice during the learning task and self-reported liking in a post-task assessment. Specifically, participants liked advisors who could provide large future rewards as well as advisors who had provided them with large rewards in the past. Moreover, participants varied in their use of model-based and model-free learning strategies, and this individual difference influenced the way in which learning related to self-reported attitudes: among participants who relied more on model-free learning, model-free social learning related more to post-task attitudes. We discuss implications for attitudes, trait impressions, and social behavior, as well as the role of habits in a memory systems model of social cognition.

Faculty Opinions recommendation of States versus rewards: dissociable neural prediction error signals underlying model-based and model-free reinforcement learning.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.4125957.4076054 ◽

2010 ◽

Author(s):

Susan Courtney

Keyword(s):

Reinforcement Learning ◽

Prediction Error ◽

Model Based ◽

Model Free

Batch mode reinforcement learning based on the synthesis of artificial trajectories

Annals of Operations Research ◽

10.1007/s10479-012-1248-5 ◽

2012 ◽

Vol 208 (1) ◽

pp. 383-416 ◽

Cited By ~ 16

Author(s):

Raphael Fonteneau ◽

Susan A. Murphy ◽

Louis Wehenkel ◽

Damien Ernst

Keyword(s):

Reinforcement Learning ◽

Batch Mode

Model-Free Event-Triggered Optimal Consensus Control of Multiple Euler-Lagrange Systems via Reinforcement Learning

IEEE Transactions on Network Science and Engineering ◽

10.1109/tnse.2020.3036604 ◽

2020 ◽

pp. 1-1

Author(s):

Saiwei Wang ◽

Xin Jin ◽

Shuai Mao ◽

Athanasios V. Vasilakos ◽

Yang Tang

Keyword(s):

Reinforcement Learning ◽

Consensus Control ◽

Model Free ◽

Event Triggered

Location- and Person-Independent Activity Recognition with WiFi, Deep Neural Networks, and Reinforcement Learning

ACM Transactions on Internet of Things ◽

10.1145/3424739 ◽

2021 ◽

Vol 2 (1) ◽

pp. 1-25

Author(s):

Yongsen Ma ◽

Sheheryar Arshad ◽

Swetha Muniraju ◽

Eric Torkildson ◽

Enrico Rantala ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Reinforcement Learning ◽

Activity Recognition ◽

Deep Neural Networks ◽

State Machine ◽

Recognition Algorithm ◽

The State ◽

Neural Architecture ◽

Learning Agent

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from historical classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design has 97% average accuracy when testing devices and persons are not seen during training. The proposed design is also evaluated on two public datasets, with accuracies of 80% and 83%. The proposed design needs very little human effort for ground truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.

Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems

Machine Learning ◽

10.1007/s10994-020-05939-8 ◽

2021 ◽

Author(s):

Amarildo Likmeta ◽

Alberto Maria Metelli ◽

Giorgia Ramponi ◽

Andrea Tirinzoni ◽

Matteo Giuliani ◽

...

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Real Life ◽

User Preferences ◽

Inverse Reinforcement Learning ◽

Water Release ◽

Reward Function ◽

Model Free ◽

Conflicting Objectives ◽

Multiple Experts

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understand how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and we present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring the user preferences in a social network (Twitter), and (3) the management of the water release in the Como Lake. For each of these scenarios, we provide formalization, experiments and a discussion to interpret the obtained results.

Home Energy Management for Community Microgrids Using Optimal Power Sharing Algorithm

Energies ◽

10.3390/en14041060 ◽

2021 ◽

Vol 14 (4) ◽

pp. 1060

Author(s):

Md Mamun Ur Rashid ◽

Majed A. Alotaibi ◽

Abdul Hasib Chowdhury ◽

Muaz Rahman ◽

Md. Shafiul Alam ◽

...

Keyword(s):

Energy Storage ◽

Real Time ◽

Energy Management ◽

Cost Minimization ◽

Renewable Energy Sources ◽

Storage System ◽

Power Sharing ◽

Optimal Power ◽

Time Operation ◽

Home Energy Management

From a residential point of view, home energy management (HEM) is an essential requirement in order to diminish peak demand and utility tariffs. The integration of renewable energy sources (RESs) together with battery energy storage systems (BESSs) and central battery storage systems (CBSSs) may promote energy and cost minimization. However, proper home appliance scheduling along with energy storage options is essential to significantly decrease the energy consumption profile and overall expenditure in real-time operation. This paper proposes a cost-effective HEM scheme in the microgrid framework to promote curtailing of energy usage and the relevant utility tariff, considering the integration of both energy storage and renewable sources. Usually, household appliances have different runtime preferences and durations of operation based on user demand. This work considers a simulator designed in the C++ platform to address the domestic customer’s HEM issue based on usage priorities. The positive aspects of merging RESs, BESSs, and CBSSs with the proposed optimal power sharing algorithm (OPSA) are evaluated by considering three distinct case scenarios. Comprehensive analysis of each scenario considering the real-time scheduling of home appliances is conducted to substantiate the efficacy of the outlined energy and cost mitigation schemes. The results obtained demonstrate the effectiveness of the proposed algorithm in enabling energy and cost savings of up to 37.5% and 45% in comparison to the prevailing methodology.

Optimal Energy Management of a Grid-Tied Solar PV-Battery Microgrid: A Reinforcement Learning Approach

Energies ◽

10.3390/en14092700 ◽

2021 ◽

Vol 14 (9) ◽

pp. 2700

Author(s):

Grace Muriithi ◽

Sunetra Chowdhury

Keyword(s):

Renewable Energy ◽

Reinforcement Learning ◽

Energy Management ◽

Renewable Energy Sources ◽

Critical Role ◽

Winter Season ◽

Learning Approach ◽

Solar Pv ◽

Optimal Energy ◽

Optimal Energy Management

In the near future, microgrids will become more prevalent as they play a critical role in integrating distributed renewable energy resources into the main grid. Nevertheless, renewable energy sources, such as solar and wind energy, can be extremely volatile as they are weather dependent. These resources coupled with demand can lead to random variations on both the generation and load sides, thus complicating optimal energy management. In this article, a reinforcement learning approach has been proposed to deal with this non-stationary scenario, in which the energy management system (EMS) is modelled as a Markov decision process (MDP). A novel modification of the control problem has been presented that improves the use of energy stored in the battery such that the dynamic demand is not subjected to future high grid tariffs. A comprehensive reward function has also been developed which decreases infeasible action explorations, thus improving the performance of the data-driven technique. A Q-learning algorithm is then proposed to minimize the operational cost of the microgrid under unknown future information. To assess the performance of the proposed EMS, a comparison study between a trading EMS model and a non-trading case is performed using a typical commercial load curve and PV profile over a 24-h horizon. Numerical simulation results indicate that the agent learns to select an optimized energy schedule that minimizes energy cost (cost of power purchased from the utility and battery wear cost) in all the studied cases. However, comparing the operational costs of the non-trading EMS to those of the trading EMS model, the latter was found to decrease costs by 4.033% in the summer season and 2.199% in the winter season.
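As a rough illustration of the kind of Q-learning update the abstract refers to, the sketch below applies the standard tabular rule Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)] to a toy table. The state and action counts, the reward of -1.0 (standing in for an energy cost), and the values of `alpha` and `gamma` are all assumptions made for illustration; the paper's actual state space, reward function and hyperparameters are not given in this abstract.

```python
import numpy as np

def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the table."""
    # Bootstrapped target: immediate reward plus discounted best next value.
    td_target = reward + gamma * np.max(q_table[next_state])
    # Move the current estimate a fraction alpha towards the target.
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table

# Toy setup (hypothetical): 3 states, 2 actions, all estimates start at zero.
q_table = np.zeros((3, 2))
q_table = q_learning_update(q_table, state=0, action=1,
                            reward=-1.0, next_state=1)
print(q_table[0, 1])  # estimate moves from 0 towards the negative reward
```

Framing cost as a negative reward means that maximizing return corresponds to minimizing operational cost, which matches the objective described in the abstract.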
