Optimal dispatch of PV inverters in unbalanced distribution systems using Reinforcement Learning

Author(s):  
Pedro P. Vergara ◽  
Mauricio Salazar ◽  
Juan S. Giraldo ◽  
Peter Palensky
2021 ◽  
Vol 12 (1) ◽  
pp. 361-371
Author(s):  
Ying Zhang ◽  
Xinan Wang ◽  
Jianhui Wang ◽  
Yingchen Zhang

2021 ◽  
Vol 5 (4) ◽  
pp. 1-24
Author(s):  
Jianguo Chen ◽  
Kenli Li ◽  
Keqin Li ◽  
Philip S. Yu ◽  
Zeng Zeng

As a new generation of Public Bicycle-sharing Systems (PBS), the Dockless PBS (DL-PBS) is an important application of cyber-physical systems and intelligent transportation. How to use artificial intelligence to provide efficient bicycle dispatching solutions based on dynamic rental demand is an essential issue for DL-PBS. In this article, we propose MORL-BD, a dynamic bicycle dispatching algorithm based on multi-objective reinforcement learning, to provide the optimal bicycle dispatching solution for DL-PBS. We model the DL-PBS system from the perspective of cyber-physical systems and use deep learning to predict the layout of bicycle parking spots and the dynamic demand for bicycle dispatching. We formulate the multi-route bicycle dispatching problem as a multi-objective optimization problem whose objectives are the dispatching costs, the dispatch trucks' initial loads, the workload balance among the trucks, and the dynamic balance of bicycle supply and demand. On this basis, the collaborative multi-route dispatching problem among multiple dispatch trucks is modeled as a multi-agent, multi-objective reinforcement learning model. All dispatch paths between parking spots are defined as the state space, and the reciprocal of the dispatching cost is defined as the reward. Each dispatch truck is equipped with an agent that learns the optimal dispatch path in the dynamic DL-PBS network. An elite list stores the Pareto-optimal dispatch solutions found at each action, from which the Pareto frontier is finally obtained. Experimental results on an actual DL-PBS show that, compared with existing methods, MORL-BD can find a higher-quality Pareto frontier in less execution time.
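The elite-list mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows the standard Pareto-dominance update that such an elite list performs, assuming each dispatch solution is scored by a vector of objectives to be minimized (e.g., cost, load imbalance).

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_elite(elite, candidate):
    """Insert a candidate solution, keeping only non-dominated solutions.

    `elite` and `candidate` are tuples of objective values; the returned
    list is the current approximation of the Pareto frontier.
    """
    if any(dominates(kept, candidate) for kept in elite):
        return list(elite)  # candidate is dominated; discard it
    # drop any stored solutions that the candidate dominates
    return [kept for kept in elite if not dominates(candidate, kept)] + [candidate]
```

Repeatedly applying `update_elite` to the solutions found during each action leaves exactly the non-dominated set, which is returned as the Pareto frontier at the end of training.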


Energies ◽  
2021 ◽  
Vol 14 (12) ◽  
pp. 3540
Author(s):  
Jing Zhang ◽  
Yiqi Li ◽  
Zhi Wu ◽  
Chunyan Rong ◽  
Tao Wang ◽  
...  

Because of the high penetration of renewable energy and the installation of new control devices, modern distribution networks face voltage regulation challenges. Recently, the rapid development of artificial intelligence has introduced new solutions for high-dimensional, dynamic optimal control problems. In this paper, a deep reinforcement learning method is proposed to solve the two-timescale optimal voltage control problem. All control variables are assigned to different agents: discrete variables are handled by a deep Q-network (DQN) agent, while continuous variables are handled by a deep deterministic policy gradient (DDPG) agent. All agents are trained simultaneously with a specially designed reward aimed at minimizing the long-term average voltage deviation. A case study on a modified IEEE 123-bus system demonstrates that the proposed algorithm performs similarly to, or even better than, a model-based optimal control scheme, with high computational efficiency and competitive potential for online application.
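A reward "aimed at minimizing long-term average voltage deviation" is typically built from per-node deviations from the nominal voltage. The following is a hedged sketch, not the paper's reward: the function name, the 1.0 p.u. reference, and the deadband width are illustrative assumptions.

```python
def voltage_reward(v_nodes, v_ref=1.0, deadband=0.05):
    """Negative sum of per-node voltage deviations outside a deadband (per unit).

    Nodes within [v_ref - deadband, v_ref + deadband] contribute zero, so the
    agent is only penalized for voltages outside the acceptable band.
    """
    return -sum(max(abs(v - v_ref) - deadband, 0.0) for v in v_nodes)
```

Both the DQN and DDPG agents can share such a reward at each control step; maximizing it over time drives the average voltage deviation toward zero.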


2022 ◽  
Author(s):  
Yufan Zhang ◽  
Honglin Wen ◽  
Qiuwei Wu ◽  
Qian Ai

Prediction intervals (PIs) offer an effective tool for quantifying the uncertainty of loads in distribution systems. Traditional central PIs cannot adapt well to skewed distributions, and their offline training is vulnerable to unforeseen changes in future load patterns. We therefore propose an optimal PI estimation approach that is online and adaptive to different data distributions, adaptively determining symmetric or asymmetric probability-proportion pairs for the quantiles of the PI bounds. It relies on the online learning ability of reinforcement learning (RL) to integrate the two online tasks, i.e., the adaptive selection of probability-proportion pairs and quantile prediction, both of which are modeled by neural networks. As such, the quality of the quantile-formed PI guides the selection of optimal probability-proportion pairs, forming a closed loop that improves PI quality. Furthermore, to improve the learning efficiency of quantile forecasts, a prioritized experience replay (PER) strategy is proposed for the online quantile regression processes. Case studies on both load and net load demonstrate that the proposed method adapts to the data distribution better than the online central-PI method. Compared with offline-trained methods, it obtains higher-quality PIs and is more robust against concept drift.
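The quantile predictions that form the PI bounds are conventionally trained with the pinball (quantile) loss, and an asymmetric probability-proportion pair simply means the lower and upper quantile levels need not be placed symmetrically around the median. The sketch below illustrates these two standard ingredients under that assumption; it is not the authors' network or selection policy.

```python
def pinball_loss(y, q_pred, tau):
    """Pinball loss at quantile level tau in (0, 1).

    Its expectation is minimized when q_pred equals the tau-quantile of y,
    penalizing under-prediction with weight tau and over-prediction with 1 - tau.
    """
    diff = y - q_pred
    return max(tau * diff, (tau - 1.0) * diff)

def proportion_pair(coverage, skew=0.0):
    """Quantile levels (tau_lo, tau_hi) for a PI with nominal `coverage`.

    skew = 0.0 gives the symmetric central pair; shifting skew moves both
    bounds while keeping tau_hi - tau_lo == coverage, which is how an
    asymmetric pair can track a skewed load distribution.
    """
    tau_lo = (1.0 - coverage) / 2.0 + skew
    return tau_lo, tau_lo + coverage
```

For example, a 90% PI uses the symmetric pair (0.05, 0.95), while a positive skew such as 0.03 yields the asymmetric pair (0.08, 0.98) with the same nominal coverage.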

