Approximate Value Iteration in the Reinforcement Learning Context. Application to Electrical Power System Control.

Author(s):  
Damien Ernst ◽  
Mevludin Glavic ◽  
Pierre Geurts ◽  
Louis Wehenkel

In this paper we explain how to design intelligent agents able to process the information acquired from interaction with a system in order to learn a good control policy, and we show how the methodology can be applied to control devices aimed at damping electrical power oscillations. The control problem is formalized as a discrete-time optimal control problem, and the information acquired from interaction with the system is a set of samples, where each sample is composed of four elements: a state, the action taken in this state, the instantaneous reward observed, and the successor state of the system. To process this information we consider reinforcement learning algorithms that determine an approximation of the so-called Q-function by mimicking the behavior of the value iteration algorithm. Simulations are first carried out on a benchmark power system modeled with two state variables. Then we present a more complex case study on a four-machine power system where the reinforcement learning algorithm controls a Thyristor Controlled Series Capacitor (TCSC) aimed at damping power system oscillations.
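The idea of approximating the Q-function from four-tuples by mimicking value iteration can be illustrated with a fitted-Q-iteration style sketch. The tree-based regressor, discount factor, and discrete action set below are illustrative assumptions, not the authors' exact experimental setup.

```python
# Minimal sketch of fitted Q iteration on a batch of (s, a, r, s') samples.
# The regressor choice, gamma and the action set are assumptions.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

def fitted_q_iteration(samples, actions, gamma=0.95, n_iterations=50):
    """samples: list of (state, action, reward, next_state) tuples."""
    S = np.array([np.append(s, a) for s, a, r, s_next in samples])
    R = np.array([r for s, a, r, s_next in samples])
    next_states = [s_next for s, a, r, s_next in samples]

    q_model = None
    for _ in range(n_iterations):
        if q_model is None:
            targets = R                        # Q_1(s, a) = r
        else:
            # Q_{k+1}(s, a) = r + gamma * max_a' Q_k(s', a')
            q_next = np.array([
                max(q_model.predict(np.append(s_next, a).reshape(1, -1))[0]
                    for a in actions)
                for s_next in next_states
            ])
            targets = R + gamma * q_next
        q_model = ExtraTreesRegressor(n_estimators=50).fit(S, targets)
    return q_model

def greedy_action(q_model, state, actions):
    # Control policy: pick the action with the highest approximate Q-value.
    return max(actions,
               key=lambda a: q_model.predict(np.append(state, a).reshape(1, -1))[0])
```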

2014 ◽  
Vol 513-517 ◽  
pp. 1092-1095
Author(s):  
Bo Wu ◽  
Yan Peng Feng ◽  
Hong Yan Zheng

Bayesian reinforcement learning has turned out to be an effective solution to the optimal tradeoff between exploration and exploitation. In practical applications, however, the exponential growth in the number of learning parameters is the main impediment to online planning and learning. To overcome this problem, we bring factored representations, model-based learning, and Bayesian reinforcement learning together in a new approach. First, we exploit a factored representation of the states to reduce the number of learning parameters, and adopt a Bayesian inference method to learn the unknown structure and parameters simultaneously. Then, we use an online point-based value iteration algorithm to plan and learn. The experimental results show that the proposed approach effectively improves learning efficiency in large-scale state spaces.
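A minimal, hypothetical sketch of the factored Bayesian model-learning step: each state factor keeps a Dirichlet posterior over its next value, conditioned on its parent factors. The fixed parent structure and uniform priors are assumptions for illustration; the approach described above additionally learns the structure online and plans with point-based value iteration.

```python
# Illustrative Dirichlet-multinomial posterior for a factored transition model.
import numpy as np
from collections import defaultdict

class FactoredBayesModel:
    def __init__(self, parents, n_values, prior=1.0):
        # parents[f] lists the indices of the (assumed known) parent factors of factor f.
        self.parents = parents
        self.counts = defaultdict(lambda: np.full(n_values, prior))

    def update(self, state, action, next_state):
        # Bayesian update: add one count per factor for the observed transition.
        for f, pa in enumerate(self.parents):
            key = (f, action, tuple(state[i] for i in pa))
            self.counts[key][next_state[f]] += 1

    def sample_transition_probs(self, state, action):
        # Draw one plausible factored transition model from the posterior,
        # e.g. to be used inside an online planning step.
        return [np.random.dirichlet(self.counts[(f, action, tuple(state[i] for i in pa))])
                for f, pa in enumerate(self.parents)]
```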


Author(s):  
Pasala Gopi ◽  
P. Linga Reddy

The load frequency control problem in a multi-area interconnected electrical power system becomes much more complex with increasing size, changing structure and increasing load. This paper deals with load frequency control of a three-area interconnected power system incorporating reheat, non-reheat and reheat turbines in the three areas, respectively. The response of the load frequency control problem is improved by designing PID controllers using different tuning techniques, and it is shown that the PID controller designed with the Simulink Design Optimization (SDO) software gives superior performance compared with the other controllers for step perturbations. Finally, the robustness of the controller is verified against system parameter variations.
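For reference, a minimal discrete-time PID control law of the kind tuned for each area is sketched below; the gains and sampling time are placeholders, not the values obtained with Simulink Design Optimization.

```python
# Minimal discrete-time PID controller sketch (placeholder gains).
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        # error: e.g. the area control error (ACE) of one area at this sample
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```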


2016 ◽  
Vol 15 (14) ◽  
pp. 7416-7422
Author(s):  
M.Kamel EL-Sayed

In this paper, we introduce an approach for analyzing information concerning an electrical power system. The suggested method results from hybridizing rough set concepts with a nano topology constructed on the set of all data, using the boundary of uncertain decision sets and their lower approximations. Bases of nano topologies are used as indicators for selecting effective features in the information system of a power control. The method is applied to experimental data, which keeps the suggested model close to real-life information.
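The rough-set quantities the method builds on can be sketched as follows; the attribute-based indiscernibility partition and the toy data layout are illustrative assumptions only.

```python
# Illustrative lower approximation and boundary region of a decision set
# with respect to an indiscernibility partition over chosen attributes.
from collections import defaultdict

def partition(objects, attributes):
    """Group objects that are indiscernible on the chosen attributes."""
    blocks = defaultdict(list)
    for name, row in objects.items():
        blocks[tuple(row[a] for a in attributes)].append(name)
    return list(blocks.values())

def lower_approx(blocks, decision_set):
    return {x for b in blocks if set(b) <= decision_set for x in b}

def upper_approx(blocks, decision_set):
    return {x for b in blocks if set(b) & decision_set for x in b}

def boundary(blocks, decision_set):
    # Boundary region = upper approximation minus lower approximation;
    # a basis of the associated nano topology can be read off from the
    # universe, the lower approximation and this boundary.
    return upper_approx(blocks, decision_set) - lower_approx(blocks, decision_set)
```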


1985 ◽  
Vol 1 (CONFERENCE) ◽  
pp. 1-10
Author(s):  
F. Bendary ◽  
M. Drouin ◽  
M. El-Metwally

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Rui Wang ◽  
Xianghua Gan ◽  
Qing Li ◽  
Xiao Yan

We study a joint pricing and inventory control problem for perishables with positive lead time in a finite-horizon periodic-review system. Unlike most studies, which consider a continuous demand density, in our paper the customer demand depends on the price of the current period and arrives according to a homogeneous Poisson process. We consider both backlogging and lost-sales cases, and our goal is to find a simultaneous ordering and pricing policy that maximizes the expected discounted profit over the planning horizon. When there is no fixed ordering cost involved, we design a deep reinforcement learning algorithm to obtain a near-optimal ordering policy and show that the learned policy exhibits some monotonicity properties. We also show that our deep reinforcement learning algorithm achieves better performance than tabular Q-learning algorithms. When a fixed ordering cost is involved, we show that our deep reinforcement learning algorithm is effective and efficient, and that it circumvents the "curse of dimensionality".
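One period of such an inventory and pricing environment might be sketched as follows, with Poisson demand whose rate depends on the posted price. The linear demand-rate model, the cost parameters, and the simplified treatment of lead time are illustrative assumptions, not the paper's exact formulation.

```python
# Hypothetical one-period transition for the backlogging case:
# negative inventory carries over as backorders.
import numpy as np

def demand_rate(price, a=20.0, b=2.0):
    # Assumed linear price response; demand ~ Poisson(a - b * price).
    return max(a - b * price, 0.0)

def step(inventory, order_qty, price, holding_cost=1.0, penalty=5.0,
         unit_cost=3.0, rng=None):
    rng = rng or np.random.default_rng()
    demand = rng.poisson(demand_rate(price))
    sales = demand                             # backlogged demand is eventually served
    next_inventory = inventory + order_qty - demand   # lead time ignored in this sketch
    reward = (price * sales
              - unit_cost * order_qty
              - holding_cost * max(next_inventory, 0)
              - penalty * max(-next_inventory, 0))
    return next_inventory, reward
```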


Author(s):  
Aviv Tamar ◽  
Yi Wu ◽  
Garrett Thomas ◽  
Sergey Levine ◽  
Pieter Abbeel

We introduce the value iteration network (VIN): a fully differentiable neural network with a `planning module' embedded within. VINs can learn to plan, and are suitable for predicting outcomes that involve planning-based reasoning, such as policies for reinforcement learning. Key to our approach is a novel differentiable approximation of the value-iteration algorithm, which can be represented as a convolutional neural network, and trained end-to-end using standard backpropagation. We evaluate VIN-based policies on discrete and continuous path-planning domains, and on a natural-language based search task. We show that by learning an explicit planning computation, VIN policies generalize better to new, unseen domains. This paper is a significantly abridged, IJCAI-audience-targeted version of the original NIPS 2016 paper with the same title, available here: https://arxiv.org/abs/1602.02867
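The differentiable value-iteration module can be sketched in PyTorch as a convolution over the stacked reward and value maps followed by a max over the action channels; the layer sizes and number of iterations below are assumptions, not the paper's exact architecture.

```python
# Sketch of a value-iteration module: Q = Conv([R; V]), V = max_a Q, repeated k times.
import torch
import torch.nn as nn

class VIModule(nn.Module):
    def __init__(self, n_actions=8, k=20):
        super().__init__()
        self.k = k                                # number of value-iteration steps
        # Input channels: reward map + current value map; one Q channel per action.
        self.q_conv = nn.Conv2d(2, n_actions, kernel_size=3, padding=1, bias=False)

    def forward(self, reward_map):
        # reward_map: (batch, 1, H, W)
        value = torch.zeros_like(reward_map)
        for _ in range(self.k):
            q = self.q_conv(torch.cat([reward_map, value], dim=1))
            value, _ = q.max(dim=1, keepdim=True)  # V(s) = max_a Q(s, a)
        return value
```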

