Bellman equation

Author(s):  
Yu.V. Averboukh

The paper is concerned with approximating the value function of a zero-sum differential game with minimal cost, i.e., a differential game whose payoff functional is the minimum of some quantity along the trajectory. The approximation uses solutions of continuous-time stochastic games in which stopping is governed by one player. The value function of this auxiliary continuous-time stochastic game is described by the Isaacs–Bellman equation with additional inequality constraints. The Isaacs–Bellman equation is a parabolic PDE in the case of a stochastic differential game, and it takes the form of a system of ODEs in the case of a continuous-time Markov game. The approximation developed in the paper is based on the concept of the stochastic guide first proposed by Krasovskii and Kotelnikova.


2021
Author(s):
Wei Liao, Taotao Liang, Xiaohui Wei, Jizhou Lai, Qiaozhi Yin

This paper proposes a novel method for computing reachable sets. A Hamilton-Jacobi-Bellman equation with a running cost function is solved numerically, and the reachable sets for different time horizons are characterized by a family of non-zero level sets of its solution. Beyond the classical reachable set, choosing different running cost functions and terminal conditions for the Hamilton-Jacobi-Bellman equation lets the method compute more general reachable sets, referred to as cost-limited reachable sets. Because the solution is discontinuous, the Hamilton-Jacobi-Bellman equation is difficult to solve directly; the paper overcomes this with a method based on recursion and grid interpolation. Examples at the end of the paper illustrate the validity and generality of the proposed method.
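The abstract does not give the authors' algorithm, so the following is only a generic sketch of the idea of "recursion plus grid interpolation", not the paper's method. It assumes a hypothetical 1D system x' = u with u in {-1, +1}, a unit running cost, and a target interval around the origin; the function name `cost_to_reach`, the cap value, and all grid parameters are illustrative choices. The value function is propagated backward one time step at a time, and the cost-limited reachable set is read off as the set of grid points whose minimum cost stays within a budget.

```python
import numpy as np

def cost_to_reach(xs, target_mask, controls, dt, n_steps, running_cost=1.0, cap=1e3):
    """Minimum accumulated cost to enter the target within n_steps * dt.

    Recursion: V_{k+1}(x) = min(V_k(x), min_u [running_cost * dt + V_k(x + u * dt)]),
    where V_k is evaluated off-grid by linear interpolation.
    Points that cannot reach the target in the horizon keep a value near `cap`.
    """
    value = np.where(target_mask, 0.0, cap)
    for _ in range(n_steps):
        candidates = [value]  # keeping V_k makes the recursion monotone
        for u in controls:
            # Hypothetical dynamics x' = u, one explicit Euler step, clipped to the grid.
            x_next = np.clip(xs + u * dt, xs[0], xs[-1])
            candidates.append(running_cost * dt + np.interp(x_next, xs, value))
        value = np.minimum.reduce(candidates)
    return value

xs = np.linspace(-2.0, 2.0, 81)           # grid spacing 0.05
target = np.abs(xs) <= 0.125              # target set: nodes with |x| <= 0.1
value = cost_to_reach(xs, target, controls=(-1.0, 1.0), dt=0.05, n_steps=30)

budget = 1.0
reachable = xs[value <= budget + 1e-9]    # cost-limited reachable set for this budget
print(reachable.min(), reachable.max())
```

Changing `running_cost` or the initial condition changes which set the level sets describe, which mirrors the role the abstract assigns to the running cost function and terminal condition.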


Author(s):
Shuang Wu, Jingyu Zhao, Guangjian Tian, Jun Wang

The restless multi-armed bandit (RMAB) problem is a generalization of the multi-armed bandit with non-stationary rewards. Its optimal solution is intractable because the state and action spaces grow exponentially with the number of arms. Existing approximation approaches, e.g., Whittle's index policy, have difficulty capturing temporal or spatial factors such as the impact of other arms. We propose considering both factors using the attention mechanism, which has achieved great success in deep learning. Our state-aware value function approximation solution comprises an attention-based value function approximator and a Bellman equation solver. The attention-based coordination module captures both spatial and temporal factors for arm coordination. The Bellman equation solver exploits the decoupling structure of RMABs to obtain solutions with significantly reduced computational overhead. In particular, the time complexity of our approximation is linear in the number of arms. Finally, we illustrate the effectiveness and investigate the properties of our proposed method with numerical experiments.
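The abstract does not describe the solver itself. As a rough illustration of why decoupling gives cost linear in the number of arms, the sketch below solves one small Bellman equation per arm using the standard Lagrangian (subsidy-for-passivity) relaxation familiar from Whittle-index analysis. This is an assumed stand-in, not the authors' attention-based method; the function name, the two-state transition matrices, the rewards, and the subsidy value are all hypothetical.

```python
import numpy as np

def arm_value_iteration(P, R, subsidy, gamma=0.9, tol=1e-10, max_iter=10_000):
    """Solve the single-arm Bellman equation by value iteration.

    P has shape (2, S, S): P[a] is the transition matrix under action a
    (0 = passive, 1 = active). R has shape (2, S): R[a][s] is the reward.
    `subsidy` is added to the passive action, which decouples the arms.
    """
    value = np.zeros(R.shape[1])
    for _ in range(max_iter):
        q_passive = R[0] + subsidy + gamma * (P[0] @ value)
        q_active = R[1] + gamma * (P[1] @ value)
        new_value = np.maximum(q_passive, q_active)
        if np.max(np.abs(new_value - value)) < tol:
            return new_value
        value = new_value
    return value

# Hypothetical two-state arm: rows are current states, columns next states.
P = np.array([[[0.9, 0.1], [0.4, 0.6]],     # passive transitions
              [[0.2, 0.8], [0.1, 0.9]]])    # active transitions
R = np.array([[0.0, 0.0],                   # passive rewards
              [0.0, 1.0]])                  # active rewards

# Each arm is solved on its own, so total work grows linearly with the number of arms.
arms = [(P, R), (P, R)]
values = [arm_value_iteration(p, r, subsidy=0.5) for p, r in arms]
print(values)
```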


2021, pp. 1-14
Author(s):
Daniel Saranovic, Martin Pavlovski, William Power, Ivan Stojkovic, Zoran Obradovic

As the prevalence of drones increases, understanding and preparing for possible adversarial uses of drones and drone swarms is of paramount importance. Correspondingly, developing defensive mechanisms in which swarms can be used to protect against adversarial Unmanned Aerial Vehicles (UAVs) is a problem that requires further attention. Prior work on intercepting UAVs relies mostly on additional sensors or on the Hamilton-Jacobi-Bellman equation, for which strong conditions must be met to guarantee the existence of a saddle-point solution. To that end, this work proposes a novel interception method that uses the swarm’s onboard PID controllers to set the drones’ states during interception. The drones’ states are constrained only by their physical limitations, and only partial feedback of the adversarial drone’s positions is assumed. The new framework is evaluated in a virtual environment under different environmental and model settings, using random simulations of more than 165,000 swarm flights. For certain environmental settings, our results indicate that the interception performance of larger swarms under partial observation is comparable to that of a one-drone swarm under full observation of the adversarial drone.


2021
Author(s):
Natalia Gorbunova, Oleg Mirkushov

The main aim of this work was to identify an optimal route design for delivering coal from open pit mines to enrichment plants, using the example of the Sibathractic Group’s mine “Kolyvansk” and its enrichment plants “Listvjanskaja-1” and “Listvjanskaja-2”. The new route would allow the corporation to increase production rates and to reduce the risks associated with the use of motor vehicles. Dynamic programming, chiefly the Bellman equation, was chosen as the algorithm for finding the optimal route from the Kolyvansky mine to “Listvjanskaja-1” and “Listvjanskaja-2”. In the course of the study, two possible routes were selected. Further comparison indicated that only one of them, the “2nd railway track”, met all the recommended parameters for the delivery route. All the findings are presented in detail in the study.
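The abstract gives no network data, so the following is only a minimal sketch of how the Bellman equation selects a cheapest route, V(u) = min over edges (u, v) of [cost(u, v) + V(v)] with V(target) = 0. The node names and edge costs below are hypothetical and are not taken from the study; positive edge costs are assumed.

```python
import math

def bellman_shortest_route(graph, source, target):
    """Return (cost, route) from source to target via Bellman value iteration.

    graph maps each node to a list of (next_node, cost) pairs with positive costs.
    """
    nodes = set(graph) | {v for edges in graph.values() for v, _ in edges}
    value = {n: math.inf for n in nodes}
    value[target] = 0.0
    # With positive costs, at most len(nodes) - 1 sweeps are needed.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, edges in graph.items():
            if u == target:
                continue
            best = min((c + value[v] for v, c in edges), default=math.inf)
            if best < value[u]:
                value[u] = best
                changed = True
        if not changed:
            break
    if math.isinf(value[source]):
        return math.inf, []
    # Recover the route by following the minimizing edge at every node.
    route, node = [source], source
    while node != target:
        node = min(graph[node], key=lambda edge: edge[1] + value[edge[0]])[0]
        route.append(node)
    return value[source], route

# Hypothetical network: "mine" and "plant" are placeholders, "A" and "B" are junctions.
graph = {
    "mine": [("A", 4.0), ("B", 2.0)],
    "A": [("plant", 5.0)],
    "B": [("A", 1.0), ("plant", 8.0)],
}
cost, route = bellman_shortest_route(graph, "mine", "plant")
print(cost, route)
```

In a real study the edge costs would encode whatever criteria the route must satisfy, such as distance, throughput, or risk; here they are arbitrary numbers.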

