Aggregation–Decomposition-Based Multi-Agent Reinforcement Learning for Multi-Reservoir Operations Optimization

Water ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 2688 ◽  
Author(s):  
Milad Hooshyar ◽  
S. Jamshid Mousavi ◽  
Masoud Mahootchi ◽  
Kumaraswamy Ponnambalam

Stochastic dynamic programming (SDP) is a widely used method for reservoir operations optimization under uncertainty but suffers from the dual curses of dimensionality and modeling. Reinforcement learning (RL), a simulation-based stochastic optimization approach, can nullify the curse of modeling that arises from the need for calculating a very large transition probability matrix. RL mitigates the curse of dimensionality but cannot eliminate it, as it remains computationally intensive in complex multi-reservoir systems. This paper presents a multi-agent RL approach combined with an aggregation/decomposition (AD-RL) method for reducing the curse of dimensionality in multi-reservoir operation optimization problems. In this model, each reservoir is individually managed by a specific operator (agent) while co-operating with other agents systematically on finding a near-optimal operating policy for the whole system. Each agent makes a decision (release) based on its current state and the feedback it receives from the states of all upstream and downstream reservoirs. The method, along with an efficient artificial neural network-based robust procedure for the task of tuning Q-learning parameters, has been applied to a real-world five-reservoir problem, i.e., the Parambikulam–Aliyar Project (PAP) in India. We demonstrate that the proposed AD-RL approach helps to derive, with less computational burden, operating policies that are better than or comparable with those obtained by other stochastic optimization methods.
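
To illustrate how RL sidesteps the transition probability matrix, the following is a minimal sketch of plain tabular Q-learning on a toy single reservoir. It is not the AD-RL method of the paper: there is one agent and no aggregation/decomposition. The storage levels, release options, inflow distribution, reward, and the function name `q_learning_reservoir` are all hypothetical choices made for illustration.

```python
import random


def q_learning_reservoir(episodes=2000, horizon=12, alpha=0.1, gamma=0.95,
                         epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy single reservoir (all numbers hypothetical)."""
    rng = random.Random(seed)
    capacity = 4            # discretized storage levels 0..4
    releases = [0, 1, 2]    # candidate release decisions
    demand = 1              # target release per period
    inflows = [0, 1, 2]     # equally likely random inflows

    # q[storage][release_index], initialized to zero
    q = [[0.0 for _ in releases] for _ in range(capacity + 1)]

    for _ in range(episodes):
        storage = rng.randint(0, capacity)
        for _ in range(horizon):
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(len(releases))
            else:
                a = max(range(len(releases)), key=lambda i: q[storage][i])
            # the agent cannot release more water than is stored
            release = min(releases[a], storage)
            # the inflow is sampled by simulation; no transition matrix is built
            inflow = rng.choice(inflows)
            # water above capacity is spilled
            next_storage = min(capacity, storage - release + inflow)
            # penalize squared deviation of the release from the demand
            reward = -(release - demand) ** 2
            # standard Q-learning update
            target = reward + gamma * max(q[next_storage])
            q[storage][a] += alpha * (target - q[storage][a])
            storage = next_storage
    return q
```

The learned table can be read as an operating policy by taking, for each storage level, the release with the largest Q-value. In a multi-reservoir system the joint state space grows exponentially with the number of reservoirs, which is the difficulty that the abstract's aggregation/decomposition step targets.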

Author(s):  
M. Hoffhues ◽  
W. Römisch ◽  
T. M. Surowiec

The vast majority of stochastic optimization problems require the approximation of the underlying probability measure, e.g., by sampling or using observations. It is therefore crucial to understand the dependence of the optimal value and optimal solutions on these approximations as the sample size increases or more data becomes available. Due to the weak convergence properties of sequences of probability measures, there is no guarantee that these quantities will exhibit favorable asymptotic properties. We consider a class of infinite-dimensional stochastic optimization problems inspired by recent work on PDE-constrained optimization as well as functional data analysis. For this class of problems, we provide both qualitative and quantitative stability results on the optimal value and optimal solutions. In both cases, we make use of the method of probability metrics. The optimal values are shown to be Lipschitz continuous with respect to a minimal information metric and consequently, under further regularity assumptions, with respect to certain Fortet-Mourier and Wasserstein metrics. We prove that even in the most favorable setting, the solutions are at best Hölder continuous with respect to changes in the underlying measure. The theoretical results are tested in the context of Monte Carlo approximation for a numerical example involving PDE-constrained optimization under uncertainty.


Author(s):  
Saurabh Deshpande ◽  
Jonathan Cagan

Many optimization problems, such as manufacturing process planning optimization, are difficult problems due to the large number of potential configurations (process sequences) and associated (process) parameters. In addition, the search space is highly discontinuous and multi-modal. This paper introduces an agent-based optimization algorithm that combines stochastic optimization techniques with knowledge-based search. The motivation is that such a merging takes advantage of the benefits of stochastic optimization and accelerates the search process using domain knowledge. The result of applying this algorithm to computerized manufacturing process models is presented.


2006 ◽  
Vol 21 (3) ◽  
pp. 231-238 ◽  
Author(s):  
JIM DOWLING ◽  
RAYMOND CUNNINGHAM ◽  
EOIN CURRAN ◽  
VINNY CAHILL

This paper presents Collaborative Reinforcement Learning (CRL), a coordination model for online system optimization in decentralized multi-agent systems. In CRL, system optimization problems are represented as a set of discrete optimization problems, each of whose solution cost is minimized by model-based reinforcement learning agents collaborating on their solution. CRL systems can be built to provide autonomic behaviours such as optimizing system performance in an unpredictable environment and adapting to partial failures. We evaluate CRL using an ad hoc routing protocol that optimizes system routing performance in an unpredictable network environment.


2016 ◽  
Vol 138 (11) ◽  
Author(s):  
Piyush Pandita ◽  
Ilias Bilionis ◽  
Jitesh Panchal

Design optimization under uncertainty is notoriously difficult when the objective function is expensive to evaluate. State-of-the-art techniques, e.g., stochastic optimization or sample average approximation, fail to learn exploitable patterns from collected data and require a lot of objective function evaluations. There is a need for techniques that alleviate the high cost of information acquisition and select sequential simulations optimally. In the field of deterministic single-objective unconstrained global optimization, the Bayesian global optimization (BGO) approach has been relatively successful in addressing the information acquisition problem. BGO builds a probabilistic surrogate of the expensive objective function and uses it to define an information acquisition function (IAF) that quantifies the merit of making new objective evaluations. In this work, we reformulate the expected improvement (EI) IAF to filter out parametric and measurement uncertainties. We bypass the curse of dimensionality, since the method does not require learning the response surface as a function of the stochastic parameters. To increase the method's robustness, we employ a fully Bayesian interpretation of Gaussian processes (GPs) by constructing a particle approximation of the posterior of its hyperparameters using adaptive Markov chain Monte Carlo (MCMC). Also, our approach quantifies the epistemic uncertainty on the location of the optimum and the optimal value as induced by the limited number of objective evaluations used in obtaining it. We verify and validate our approach by solving two synthetic optimization problems under uncertainty and demonstrate it by solving the oil-well placement problem (OWPP) with uncertainties in the permeability field and the oil price time series.


Author(s):  
Joshua T. Bryson ◽  
Sunil K. Agrawal

Cable-driven robots have advantages which make them attractive solutions for a variety of tasks; however, the unidirectional nature of cable actuators complicates the design and often results in multiply redundant cable architectures which increase cost and robot complexity. This paper presents a stochastic optimization approach to the problem of designing a cable routing for a cable-driven manipulator to provide the desired robot workspace while minimizing the cable tensions required to perform a desired task. Two cable routing design variants are developed for a robot leg through the application of a stochastic optimization methodology called Particle Swarm Optimization (PSO). The PSO methodology is summarized, followed by a description of its specific application to the problem of optimizing the cable routing of a robot leg. An objective function is developed to capture all pertinent design criteria in a quantitative evaluation of each particular set of cable parameters. Finally, the PSO execution is described, and the results of the two optimization problems are presented and discussed.
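
For readers unfamiliar with Particle Swarm Optimization, the sketch below shows a common textbook form of the algorithm. It is not the authors' implementation: the function names `pso_minimize` and `sphere`, the inertia and acceleration coefficients, and the sphere test objective are assumptions for illustration. The paper's objective would instead score a set of cable routing parameters.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    # random initial positions inside the box, zero initial velocities
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # personal bests start at the initial positions
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia term + pull toward personal best + pull toward global best
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                # move and clamp to the search box
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Hypothetical test objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)
```

Calling `pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best position found and its objective value. Because the method only needs objective evaluations, it suits design problems where the objective is a quantitative score without usable gradients.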


2004 ◽  
Vol 126 (1) ◽  
pp. 46-55 ◽  
Author(s):  
Saurabh Deshpande ◽  
Jonathan Cagan

Many optimization problems, such as manufacturing process planning optimization, are difficult problems due to the large number of potential configurations (process sequences) and associated (process) parameters. In addition, the search space is highly discontinuous and multi-modal. This paper introduces an agent-based optimization algorithm that combines stochastic optimization techniques with knowledge-based search. The motivation is that such a merging takes advantage of the benefits of stochastic optimization and accelerates the search process using domain knowledge. The result of applying this algorithm to computerized manufacturing process models is presented.


Author(s):  
Piyush Pandita ◽  
Ilias Bilionis ◽  
Jitesh Panchal

Design optimization under uncertainty is notoriously difficult when the objective function is expensive to evaluate. State-of-the-art techniques, e.g., stochastic optimization or sample average approximation, fail to learn exploitable patterns from collected data and, as a result, they tend to require an excessive number of objective function evaluations. There is a need for techniques that alleviate the high cost of information acquisition and select sequential simulations in an optimal way. In the field of deterministic single-objective unconstrained global optimization, the Bayesian global optimization (BGO) approach has been relatively successful in addressing the information acquisition problem. BGO builds a probabilistic surrogate of the expensive objective function and uses it to define an information acquisition function (IAF) whose role is to quantify the merit of making new objective evaluations. Specifically, BGO iterates between making the observations with the largest expected IAF and rebuilding the probabilistic surrogate, until a convergence criterion is met. In this work, we extend the expected improvement (EI) IAF to the case of design optimization under uncertainty. This involves a reformulation of the EI policy that is able to filter out parametric and measurement uncertainties. We bypass the curse of dimensionality, since the method does not require learning the response surface as a function of the stochastic parameters. To increase the robustness of our approach in the low sample regime, we employ a fully Bayesian interpretation of Gaussian processes by constructing a particle approximation of the posterior of its hyperparameters using adaptive Markov chain Monte Carlo. An additional benefit of our approach is that it can quantify the epistemic uncertainty on the location of the optimum and the optimal value as induced by the limited number of objective evaluations used in obtaining it. We verify and validate our approach by solving two synthetic optimization problems under uncertainty. We demonstrate our approach by solving a challenging engineering problem: the oil-well-placement problem with uncertainties in the permeability field and the oil price time series.
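
As background for the EI acquisition function mentioned above, the sketch below evaluates the standard closed-form expected improvement for minimization at a single candidate point, assuming the surrogate's prediction there is Gaussian with mean `mu` and standard deviation `sigma`. This is the classical noise-free EI, not the reformulated version the abstract describes. The function name and argument names are hypothetical.

```python
import math


def expected_improvement(mu, sigma, best):
    """Classical EI for minimization under a Gaussian prediction N(mu, sigma^2).

    `best` is the lowest objective value observed so far.
    """
    if sigma <= 0.0:
        # no predictive uncertainty: improvement is deterministic
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (best - mu) * cdf + sigma * pdf
```

The first term rewards candidates whose predicted mean lies below the current best, and the second rewards candidates with high predictive uncertainty. A BGO loop would evaluate this at many candidates, query the objective at the maximizer, and refit the surrogate.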


2016 ◽  
Vol 19 (1) ◽  
pp. 47-61 ◽  
Author(s):  
Blagoj Delipetrev ◽  
Andreja Jonoski ◽  
Dimitri P. Solomatine

In this article we present two novel multipurpose reservoir optimization algorithms named nested stochastic dynamic programming (nSDP) and nested reinforcement learning (nRL). Both algorithms are built as a combination of two algorithms: in the nSDP case, (1) stochastic dynamic programming (SDP) and (2) a nested optimal allocation algorithm (nOAA), and in the nRL case, (1) reinforcement learning (RL) and (2) nOAA. The nOAA is implemented with linear and non-linear optimization. The main novel idea is to include a nOAA at each SDP and RL state transition, which decreases the starting problem dimension and alleviates the curse of dimensionality. Both nSDP and nRL can solve multi-objective optimization problems without significant computational expense or algorithm complexity and can handle dense and irregular variable discretization. The two algorithms were coded in Java as a prototype application and tested on the Knezevo reservoir, located in the Republic of Macedonia. The nSDP and nRL optimal reservoir policies were compared with nested dynamic programming policies, and the overall conclusion is that nRL is more powerful, but significantly more complex, than nSDP.


2021 ◽  
Author(s):  
Matthew R Walker ◽  
Mehrdad Malekmohammadi ◽  
Catherine Coolens ◽  
Normand Laperriere ◽  
Robert Heaton ◽  
...  

Gamma knife (GK) radiosurgery is a non-invasive treatment modality which allows single fraction delivery of focused radiation to one or more brain targets. Treatment planning mostly involves manual placement and shaping of shots to conform the prescribed dose to a surgical target. This process can be time consuming and labour intensive. An automated method is needed to determine the optimum combination of treatment parameters to decrease planning time and chance for operator-related error. Recent advancements in hardware platforms which employ parallel computational methods with stochastic optimization schemes are well suited to solving such combinatorial optimization problems efficiently. We present a method of generating optimized GK radiosurgery treatment plans using these techniques, which we name ROCKET (Radiosurgical Optimization Configuration Kit for Enhanced Treatments). Our approach consists of two phases in which shot isocenter positions are generated based on target geometry, followed by optimization of sector collimator parameters. Using this method, complex treatment plans can be generated, on average, in less than one minute, a substantial decrease relative to manual planning. Our results also demonstrate improved selectivity and treatment safety through decreased exposure to nearby organs-at-risk (OARs), compared to manual reference plans with matched coverage. Stochastic optimization is therefore shown to be a robust and efficient clinical tool for the automatic generation of GK radiosurgery treatment plans.

