Reinforcement Learning for Electric Vehicle Charging using Dueling Neural Networks

Author(s):  
Gargya Gokhale ◽  
Bert Claessens ◽  
Chris Develder

We consider the problem of coordinating the charging of an entire fleet of electric vehicles (EVs), using a model-free approach, i.e. purely data-driven reinforcement learning (RL). The objective of the RL-based control is to optimize charging actions, while fulfilling all EV charging constraints (e.g. timely completion of the charging). In particular, we focus on batch-mode learning and adopt fitted Q-iteration (FQI). A core component in FQI is approximating the Q-function using a regression technique, from which the policy is derived. Recently, a dueling neural networks architecture was proposed and shown to lead to better policy evaluation in the presence of many similar-valued actions, as applied in a computer game context. The main research contributions of the current paper are that (i) we develop a dueling neural networks approach for the setting of joint coordination of an entire EV fleet, and (ii) we evaluate its performance and compare it to an all-knowing benchmark and an FQI approach using the EXTRA trees regression technique, a popular approach currently discussed in EV-related works. We present a case study where RL agents are trained with an epsilon-greedy approach for different objectives: (a) cost minimization, and (b) maximization of self-consumption of local renewable energy sources. Our results indicate that RL agents achieve significant cost reductions (70-80%) compared to a business-as-usual scenario without smart charging. Comparing the dueling neural networks regression to EXTRA trees indicates that for our case study's EV fleet parameters and training scenario, the EXTRA trees-based agents achieve higher performance in terms of both lower costs (or higher self-consumption) and stronger robustness, i.e. less variation among trained agents. This suggests that adopting dueling neural networks in this EV setting is not particularly beneficial, as opposed to the Atari game context where this idea originated.
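To make the dueling idea mentioned in this abstract concrete, here is a minimal sketch of how a dueling head is commonly combined into Q-values: a scalar state value V(s) plus per-action advantages A(s, a), with the mean advantage subtracted. The abstract does not state which aggregation the authors use, so the mean-subtracted form and the function name `dueling_q_values` are assumptions for illustration only.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) for every action.

    Assumed aggregation (not taken from the abstract):
        Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a')
    """
    mean_advantage = fmean(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


# Hypothetical numbers: one state value and three candidate charging actions.
q_values = dueling_q_values(2.0, [0.5, -0.5, 0.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Subtracting the mean keeps the split between value and advantage identifiable: shifting all advantages by a constant leaves the resulting Q-values unchanged.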

Author(s):  
Bhargavi Munnaluri ◽  
K. Ganesh Reddy

Wind forecasting is one of the most efficient ways to deal with the challenges of wind power generation. Due to the depletion of fossil fuels, renewable energy sources play a major role in power generation. For future management and utilization of power, we need to predict the wind speed. In this paper, an efficient hybrid forecasting approach combining a Support Vector Machine (SVM) and Artificial Neural Networks (ANN) is proposed to improve the quality of wind speed prediction. Because wind depends on many different parameters, it is difficult to predict the wind speed accurately. The proposed hybrid forecasting model is examined using hourly wind speed data from past years, reducing the prediction error, measured as Mean Square Error, by 0.019. The result obtained from the Artificial Neural Networks improves the forecasting quality.
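For reference, the Mean Square Error named in this abstract can be computed as below. This is only a sketch of the metric itself; the wind speed values are made up for illustration and are not the paper's data, and the function name is hypothetical.

```python
def mean_squared_error(observed, predicted):
    """Average of squared differences between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical hourly wind speeds (m/s) and forecasts.
error = mean_squared_error([3.0, 5.0], [2.0, 7.0])
```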


2019 ◽  
Author(s):  
Leor M Hackel ◽  
Jeffrey Jordan Berg ◽  
Björn Lindström ◽  
David Amodio

Do habits play a role in our social impressions? To investigate the contribution of habits to the formation of social attitudes, we examined the roles of model-free and model-based reinforcement learning in social interactions—computations linked in past work to habit and planning, respectively. Participants in this study learned about novel individuals in a sequential reinforcement learning paradigm, choosing financial advisors who led them to high- or low-paying stocks. Results indicated that participants relied on both model-based and model-free learning, such that each independently predicted choice during the learning task and self-reported liking in a post-task assessment. Specifically, participants liked advisors who could provide large future rewards as well as advisors who had provided them with large rewards in the past. Moreover, participants varied in their use of model-based and model-free learning strategies, and this individual difference influenced the way in which learning related to self-reported attitudes: among participants who relied more on model-free learning, model-free social learning related more to post-task attitudes. We discuss implications for attitudes, trait impressions, and social behavior, as well as the role of habits in a memory systems model of social cognition.


2012 ◽  
Vol 208 (1) ◽  
pp. 383-416 ◽  
Author(s):  
Raphael Fonteneau ◽  
Susan A. Murphy ◽  
Louis Wehenkel ◽  
Damien Ernst

2021 ◽  
Vol 2 (1) ◽  
pp. 1-25
Author(s):  
Yongsen Ma ◽  
Sheheryar Arshad ◽  
Swetha Muniraju ◽  
Eric Torkildson ◽  
Enrico Rantala ◽  
...  

In recent years, Channel State Information (CSI) measured by WiFi has been widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from historical classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design has 97% average accuracy when testing devices and persons are not seen during training. The proposed design is also evaluated on two public datasets, with accuracies of 80% and 83%. The proposed design needs very little human effort for ground truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.


2021 ◽  
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Giorgia Ramponi ◽  
Andrea Tirinzoni ◽  
Matteo Giuliani ◽  
...  

In real-world applications, inferring the intentions of expert agents (e.g., human operators) can be fundamental to understanding how possibly conflicting objectives are managed, helping to interpret the demonstrated behavior. In this paper, we discuss how inverse reinforcement learning (IRL) can be employed to retrieve the reward function implicitly optimized by expert agents acting in real applications. Scaling IRL to real-world cases has proved challenging, as typically only a fixed dataset of demonstrations is available and further interactions with the environment are not allowed. For this reason, we resort to a class of truly batch model-free IRL algorithms and we present three application scenarios: (1) the high-level decision-making problem in the highway driving scenario, (2) inferring the user preferences in a social network (Twitter), and (3) the management of the water release in the Como Lake. For each of these scenarios, we provide formalization, experiments and a discussion to interpret the obtained results.


Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1060
Author(s):  
Md Mamun Ur Rashid ◽  
Majed A. Alotaibi ◽  
Abdul Hasib Chowdhury ◽  
Muaz Rahman ◽  
Md. Shafiul Alam ◽  
...  

From a residential point of view, home energy management (HEM) is an essential requirement in order to diminish peak demand and utility tariffs. The integration of renewable energy sources (RESs) together with battery energy storage systems (BESSs) and central battery storage systems (CBSSs) may promote energy and cost minimization. However, proper home appliance scheduling along with energy storage options is essential to significantly decrease the energy consumption profile and overall expenditure in real-time operation. This paper proposes a cost-effective HEM scheme in the microgrid framework to promote curtailing of energy usage and the relevant utility tariff, considering both energy storage and renewable source integration. Usually, household appliances have different runtime preferences and durations of operation based on user demand. This work considers a simulator designed in the C++ platform to address the domestic customer's HEM issue based on usage priorities. The positive aspects of merging RESs, BESSs, and CBSSs with the proposed optimal power sharing algorithm (OPSA) are evaluated by considering three distinct case scenarios. Comprehensive analysis of each scenario considering the real-time scheduling of home appliances is conducted to substantiate the efficacy of the outlined energy and cost mitigation schemes. The results obtained demonstrate the effectiveness of the proposed algorithm in enabling energy and cost savings of up to 37.5% and 45%, respectively, in comparison to the prevailing methodology.


Energies ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 2700
Author(s):  
Grace Muriithi ◽  
Sunetra Chowdhury

In the near future, microgrids will become more prevalent as they play a critical role in integrating distributed renewable energy resources into the main grid. Nevertheless, renewable energy sources, such as solar and wind energy, can be extremely volatile as they are weather dependent. These resources coupled with demand can lead to random variations on both the generation and load sides, thus complicating optimal energy management. In this article, a reinforcement learning approach is proposed to deal with this non-stationary scenario, in which the energy management system (EMS) is modelled as a Markov decision process (MDP). A novel modification of the control problem is presented that improves the use of energy stored in the battery such that the dynamic demand is not subjected to future high grid tariffs. A comprehensive reward function has also been developed which decreases infeasible action explorations, thus improving the performance of the data-driven technique. A Q-learning algorithm is then proposed to minimize the operational cost of the microgrid under unknown future information. To assess the performance of the proposed EMS, a comparison study between a trading EMS model and a non-trading case is performed using a typical commercial load curve and PV profile over a 24-h horizon. Numerical simulation results indicate that the agent learns to select an optimized energy schedule that minimizes energy cost (cost of power purchased from the utility and battery wear cost) in all the studied cases. However, comparing the operational costs of the non-trading EMS to those of the trading EMS model, the latter was found to decrease costs by 4.033% in the summer season and 2.199% in the winter season.
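As a rough illustration of the Q-learning step this abstract refers to, the sketch below shows the standard tabular update rule. The state and action labels, the learning rate `alpha`, the discount `gamma`, and the choice of reward as negative cost are all assumptions for illustration; the abstract does not give the authors' state space, reward function, or hyperparameters.

```python
def q_learning_update(q_table, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update and return the new Q(state, action).

    Standard rule:
        Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    Unseen (state, action) pairs default to 0.0.
    """
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old_value = q_table.get((state, action), 0.0)
    new_value = old_value + alpha * (reward + gamma * best_next - old_value)
    q_table[(state, action)] = new_value
    return new_value


# Hypothetical microgrid step: buying from the grid at hour 0 costs 1.0,
# so the reward is taken as the negative cost (an assumption).
actions = ["charge", "discharge", "idle"]
q_table = {}
updated = q_learning_update(q_table, 0, "charge", -1.0, 1, actions,
                            alpha=0.5, gamma=0.9)
```

Treating reward as negative cost means maximizing Q-values corresponds to minimizing operational cost, which matches the objective described in the abstract.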

