Construction of the value function in a pursuit-evasion game with three pursuers and one evader

1995 ◽  
Vol 59 (6) ◽  
pp. 941-949 ◽  
Author(s):  
A. G. Pashkov ◽  
A. V. Sinitsyn

Author(s):  
János Szőts ◽  
Andrey V. Savkin ◽  
István Harmati

Abstract We consider the game of a holonomic evader passing between two holonomic pursuers. The optimal trajectories of this game are known. We give a detailed explanation of the solution of the game of kind and present a computationally efficient way to obtain trajectories numerically by integrating the retrograde path equations. Additionally, we propose a method for calculating the partial derivatives of the Value function in the game of degree. This latter result applies to differential games with a homogeneous Value function.
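A minimal sketch of retrograde path integration for simple-motion (holonomic) players is given below. The dynamics, speeds, terminal condition, and sign conventions are illustrative assumptions chosen only to show the backward-time integration pattern; they are not the authors' model or equations.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Minimal sketch: retrograde (backward-time) integration of the path equations
# for a simple-motion pursuer-evader pair in the plane.  All dynamics, speeds,
# and the terminal condition are illustrative assumptions, not the paper's model.

V_P, V_E = 1.0, 0.8          # assumed pursuer / evader speeds

def retro_rhs(tau, y):
    """Retrograde path equations, integrated backward from the terminal surface.

    y = (xP, yP, xE, yE, lxP, lyP, lxE, lyE), where the l* components are the
    adjoints (costates).  With simple motion the optimal headings are aligned
    with the respective adjoint directions, and the adjoints are constant along
    a characteristic, so their retrograde derivatives vanish.
    """
    xP, yP, xE, yE, lxP, lyP, lxE, lyE = y
    nP = np.hypot(lxP, lyP) or 1.0
    nE = np.hypot(lxE, lyE) or 1.0
    # Forward dynamics: the pursuer minimises and the evader maximises, giving
    # headings along -adjoint and +adjoint respectively.
    dxP, dyP = -V_P * lxP / nP, -V_P * lyP / nP
    dxE, dyE = V_E * lxE / nE, V_E * lyE / nE
    # Retrograde equations: negate the forward derivatives; adjoints constant.
    return [-dxP, -dyP, -dxE, -dyE, 0.0, 0.0, 0.0, 0.0]

# Illustrative terminal condition: capture at the origin, adjoint taken from an
# assumed gradient of the terminal payoff.
y_final = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
sol = solve_ivp(retro_rhs, (0.0, 2.0), y_final, max_step=0.01)
trajectory = sol.y[:4].T      # retrograde trajectories of both players
```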


2011 ◽  
Author(s):  
Anouk Festjens ◽  
Siegfried Dewitte ◽  
Enrico Diecidue ◽  
Sabrina Bruyneel

2020 ◽  
Vol 53 (2) ◽  
pp. 14882-14887
Author(s):  
Yuan Chai ◽  
Jianjun Luo ◽  
Mingming Wang ◽  
Min Yu

2021 ◽  
Vol 14 (3) ◽  
pp. 130
Author(s):  
Jonas Al-Hadad ◽  
Zbigniew Palmowski

The main objective of this paper is to present an algorithm for pricing perpetual American put options with asset-dependent discounting. The value function of such an instrument can be described as
$$V^{\omega}_{\mathrm{APut}}(s)=\sup_{\tau\in\mathcal{T}}\mathbb{E}_s\!\left[e^{-\int_0^{\tau}\omega(S_w)\,\mathrm{d}w}\,(K-S_{\tau})^{+}\right],$$
where $\mathcal{T}$ is a family of stopping times, ω is a discount function and $\mathbb{E}$ is an expectation taken with respect to a martingale measure. Moreover, we assume that the asset price process $S_t$ is a geometric Lévy process with negative exponential jumps, i.e., $S_t=s\,e^{\zeta t+\sigma B_t-\sum_{i=1}^{N_t}Y_i}$. The asset-dependent discounting is reflected in the ω function, so this approach is a generalisation of the classic case when ω is constant. It turns out that under certain conditions on the ω function, the value function $V^{\omega}_{\mathrm{APut}}(s)$ is convex and can be represented in closed form. We provide an option pricing algorithm in this scenario and present exact calculations for particular choices of ω for which $V^{\omega}_{\mathrm{APut}}(s)$ takes a simplified form.
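As a rough numerical illustration of the quantity defined above, one can estimate the value of a threshold exercise rule $\tau_b=\inf\{t: S_t\le b\}$ by Monte Carlo and optimise over the threshold $b$. The drift, volatility, jump parameters, the sample ω, and the horizon truncation in the sketch below are illustrative assumptions, not the paper's calibration or its closed-form algorithm.

```python
import numpy as np

# Rough Monte Carlo sketch: value the threshold rule tau_b = inf{t : S_t <= b}
# and optimise over b.  All model parameters, the discount function omega, and
# the horizon truncation T are illustrative assumptions, not the paper's.

rng = np.random.default_rng(0)
zeta, sigma = 0.02, 0.25            # assumed drift and volatility of the exponent
lam, jump_mean = 0.5, 0.3           # assumed Poisson intensity and exp. jump mean
K, s0 = 1.0, 1.0
T, dt = 20.0, 0.01                  # perpetual horizon truncated at T for simulation
omega = lambda s: 0.03 + 0.02 * s   # an example state-dependent discount rate

def threshold_value(b, n_paths=10_000):
    """Estimate E_s[ e^{-int_0^tau omega(S_w) dw} (K - S_tau)^+ ] for tau = tau_b."""
    X = np.full(n_paths, np.log(s0))        # log-prices
    disc = np.zeros(n_paths)                # accumulated discount integral
    payoff = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)    # paths that have not stopped yet
    for _ in range(int(T / dt)):
        idx = np.flatnonzero(alive)
        if idx.size == 0:
            break
        S = np.exp(X[idx])
        hit = S <= b                        # threshold exercise rule
        payoff[idx[hit]] = np.exp(-disc[idx[hit]]) * np.maximum(K - S[hit], 0.0)
        alive[idx[hit]] = False
        live = idx[~hit]
        disc[live] += omega(S[~hit]) * dt
        # Euler step; at most one negative exponential jump per step for small dt.
        X[live] += zeta * dt + sigma * np.sqrt(dt) * rng.standard_normal(live.size)
        jump = rng.random(live.size) < lam * dt
        X[live] -= jump * rng.exponential(jump_mean, live.size)
    return payoff.mean()                    # paths never hitting b contribute 0

# Illustrative usage: crude optimisation of the exercise threshold.
best_b = max(np.linspace(0.5, 0.95, 10), key=threshold_value)
print(f"best threshold ~ {best_b:.2f}, value ~ {threshold_value(best_b):.4f}")
```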


Author(s):  
Humoud Alsabah ◽  
Agostino Capponi ◽  
Octavio Ruiz Lacedelli ◽  
Matt Stern

Abstract We introduce a reinforcement learning framework for retail robo-advising. The robo-advisor does not know the investor's risk preference but learns it over time by observing her portfolio choices in different market environments. We develop an exploration–exploitation algorithm that trades off costly solicitations of portfolio choices by the investor against autonomous trading decisions based on stale estimates of the investor's risk aversion. We show that the approximate value function constructed by the algorithm converges to the value function of an omniscient robo-advisor over a number of periods that is polynomial in the size of the state and action spaces. By correcting for the investor's mistakes, the robo-advisor may outperform a stand-alone investor, regardless of the investor's opportunity cost for making portfolio decisions.
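The solicitation-vs-autonomy trade-off described above can be illustrated with a toy simulation. The risk-aversion update rule, the solicitation cost, and the one-asset mean-variance market below are assumptions made for illustration only; they are not the algorithm analysed in the paper.

```python
import numpy as np

# Toy sketch of the solicitation-vs-autonomy trade-off only.  The update rule,
# solicitation cost, and one-asset mean-variance market are illustrative
# assumptions, not the paper's algorithm.

rng = np.random.default_rng(1)
TRUE_GAMMA = 4.0                  # investor's true (unknown) risk aversion
MU, SIGMA = 0.06, 0.20            # assumed market excess return and volatility
SOLICIT_COST = 0.002              # assumed opportunity cost of asking the investor

def merton_weight(gamma):
    """Mean-variance optimal risky-asset weight for risk aversion gamma."""
    return MU / (gamma * SIGMA ** 2)

gamma_hat, staleness, log_wealth = 2.0, 0, 0.0
for t in range(200):
    # Solicit with a probability that grows as the current estimate becomes stale.
    if rng.random() < staleness / (staleness + 10):
        # The investor reveals a (noisy) portfolio choice; invert it to update gamma.
        observed_w = merton_weight(TRUE_GAMMA) + 0.05 * rng.standard_normal()
        gamma_hat = MU / (max(observed_w, 1e-3) * SIGMA ** 2)
        staleness = 0
        log_wealth -= SOLICIT_COST            # exploration is costly
    else:
        staleness += 1                        # trade autonomously on a stale estimate
    w = merton_weight(gamma_hat)              # robo-advisor's allocation
    r = MU + SIGMA * rng.standard_normal()    # realised risky return
    log_wealth += np.log1p(w * r)

print(f"estimated risk aversion = {gamma_hat:.2f}, log-wealth = {log_wealth:.3f}")
```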

