the value function Latest Research Papers

Abstract: We consider the optimal control problem for stochastic differential equations (SDEs) with random coefficients under a recursive-type objective functional captured by a backward SDE (BSDE). Due to the random coefficients, the associated Hamilton–Jacobi–Bellman (HJB) equation is a class of second-order stochastic PDEs (SPDEs) driven by Brownian motion, which we call the stochastic HJB (SHJB) equation. In addition, as we adopt the recursive-type objective functional, the drift term of the SHJB equation depends on the second component of its solution. These two generalizations cause several technical intricacies that do not appear in the existing literature. We prove the dynamic programming principle (DPP) for the value function, for which, unlike the existing literature, we have to use the backward semigroup associated with the recursive-type objective functional. By the DPP, we are able to show the continuity of the value function. Using the Itô–Kunita formula, we prove the verification theorem, which constitutes a sufficient condition for optimality and characterizes the value function, provided that a smooth (classical) solution of the SHJB equation exists. In general, a smooth solution of the SHJB equation may not exist. Hence, we study the existence and uniqueness of the solution to the SHJB equation under two different weak solution concepts. First, we show, under appropriate assumptions, the existence and uniqueness of the weak solution via the Sobolev space technique, which requires converting the SHJB equation to a class of backward stochastic evolution equations. The second result is obtained under the notion of viscosity solutions, which extends the classical notion to SPDEs. Using the DPP and the estimates of BSDEs, we prove that the value function is the viscosity solution to the SHJB equation.
For applications, we consider the linear-quadratic problem, the utility maximization problem, and the European option pricing problem. Specifically, different from the existing literature, each problem is formulated by the generalized recursive-type objective functional and is subject to random coefficients. By applying the theoretical results of this paper, we obtain the explicit optimal solution for each problem in terms of the solution of the corresponding SHJB equation.

An Offloading Algorithm based on Markov Decision Process in Mobile Edge Computing System

International Journal of Circuits, Systems and Signal Processing ◽

10.46300/9106.2022.16.15 ◽

2022 ◽

Vol 16 ◽

pp. 115-121

Author(s):

Bingxin Yao ◽

Bin Wu ◽

Siyun Wu ◽

Yin Ji ◽

Danggui Chen ◽

...

Keyword(s):

Energy Consumption ◽

Markov Decision Process ◽

Decision Process ◽

Value Function ◽

Wireless Channel ◽

Edge Computing ◽

Iteration Algorithm ◽

Mobile Edge Computing ◽

Markov Decision ◽

The Value Function

In this paper, an offloading algorithm based on a Markov Decision Process (MDP) is proposed to solve the multi-objective offloading decision problem in a Mobile Edge Computing (MEC) system. The distinguishing feature of the algorithm is that the MDP is used to make the offloading decision. The state space of the MDP model takes into account the number of tasks in the task queue, the number of accessible edge clouds, and the signal-to-noise ratio (SNR) of the wireless channel. The offloading delay and energy consumption define the value function of the MDP model, i.e., the objective function. To maximize the value function, the value iteration algorithm is used to obtain the optimal offloading policy. According to this policy, tasks of mobile terminals (MTs) are offloaded to the edge cloud or the central cloud, or executed locally. The simulation results show that the proposed algorithm can effectively reduce the offloading delay and energy consumption.
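As a rough illustration of how value iteration produces an offloading policy, the sketch below solves a deliberately tiny MDP. It is not the model from the paper: the two-state channel (poor or good SNR), the three actions, the transition matrix, the reward numbers (negative delay-plus-energy costs), and the discount factor of 0.9 are all hypothetical values chosen for illustration, and the queue length and number of edge clouds are left out of the state.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Generic value iteration for a finite MDP.

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a)
    R: array of shape (A, S), expected immediate reward for action a in state s
    Returns the value function V (shape (S,)) and a greedy policy (shape (S,)).
    """
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)          # Bellman backup, shape (A, S)
        V_new = Q.max(axis=0)            # best action value per state
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    policy = (R + gamma * (P @ V)).argmax(axis=0)
    return V, policy


# Hypothetical toy model (illustrative numbers only).
# States: 0 = poor SNR, 1 = good SNR.
# Actions: 0 = execute locally, 1 = offload to edge cloud, 2 = offload to central cloud.
channel = np.array([[0.7, 0.3],
                    [0.4, 0.6]])         # channel evolves independently of the action
P = np.stack([channel, channel, channel])

# Reward = -(delay + energy cost); offloading is expensive when the SNR is poor.
R = np.array([[-3.0, -3.0],              # local
              [-5.0, -1.0],              # edge cloud
              [-6.0, -2.0]])             # central cloud

V, policy = value_iteration(P, R, gamma=0.9)
print("value function:", V)
print("policy (action index per state):", policy)
```

With these made-up numbers the resulting policy executes tasks locally under poor SNR and offloads to the edge cloud under good SNR. A model closer to the one described above would enlarge the state to include queue length and the number of accessible edge clouds, which changes only the sizes of `P` and `R`.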

A variational inequality arising from optimal surrender of variable annuity with lookback benefit

Journal of Inequalities and Applications ◽

10.1186/s13660-021-02743-3 ◽

2022 ◽

Vol 2022 (1) ◽

Author(s):

Junkee Jeon ◽

Minsuk Kwak

Keyword(s):

Variational Inequality ◽

Value Function ◽

Numerical Solutions ◽

Integration Method ◽

Comparative Statics ◽

One Dimensional ◽

Math Econ ◽

The One ◽

Recursive Integration ◽

The Value Function

Abstract: We introduce a variable annuity (VA) contract with a surrender option and lookback benefit, that is, the benefit of the VA contract is linked to the maximum process of the policyholder’s account value. In contrast to the constant guarantee model provided in Bernard et al. (Insur. Math. Econ. 55:116–128, 2014), it is optimal for the policyholder of the VA contract with lookback benefit to surrender the contract when the policyholder’s account value is below or equal to the optimal surrender boundary. Thus, from the perspective of an insurer constructing a portfolio of VA contracts, combining VA contracts with lookback benefit and VA contracts with constant guarantee diversifies early surrenders. The valuation of this contract can be described as a two-dimensional parabolic variational inequality. By converting this into a one-dimensional problem, we obtain the integral equations for the value function and the free boundary. The recursive integration method is applied to obtain the numerical solutions. We also provide comparative statics of the optimal surrender boundaries with respect to various parameters.

The value function of a transportation problem

Operations Research Letters ◽

10.1016/j.orl.2021.12.003 ◽

2022 ◽

Author(s):

Victor Domansky ◽

Victoria Kreps

Keyword(s):

Transportation Problem ◽

Value Function ◽

The Value Function

Statistical inference of the value function for reinforcement learning in infinite‐horizon settings

Journal of the Royal Statistical Society Series B (Statistical Methodology) ◽

10.1111/rssb.12465 ◽

2021 ◽

Author(s):

Chengchun Shi ◽

Sheng Zhang ◽

Wenbin Lu ◽

Rui Song

Keyword(s):

Reinforcement Learning ◽

Statistical Inference ◽

Value Function ◽

Infinite Horizon ◽

The Value Function

HOW TO NOT MIX WORDS AND OBJECTS

Логико-философские штудии ◽

10.52119/lphs.2021.55.75.008 ◽

2021 ◽

pp. 291-303

Author(s):

Иван Борисович Микиртумов

Keyword(s):

Value Function ◽

World Line ◽

Intensional Logic ◽

Main Task ◽

Object Language ◽

Rigid Designator ◽

Possible World ◽

Rigid Designators ◽

The Value Function ◽

The Way

In this article, I present my comments on the article by Evgeny Borisov, which is included in this issue of the journal. Along the way, I set out my vision of the problems of cross-world predication and cross-identification. I believe that cross-world identity is impossible and that the main task is to provide identification. To do this, one can use either the method of maintaining cognitive contact or the method of counterparts identified by a set of essential features, which is defined pragmatically. The method of rigid designators also leads to intensional logic, since the object language must contain object names that are relativized to worlds. Borisov tries to build the logic of cross-world predication on several bases at once, which are poorly compatible with each other. He quantifies over possible individuals, but at the same time tries to rely on the metalinguistic names of individuals as a basis for cross-identification: the metalinguistic name of an individual becomes an argument of the value function, although it is not a rigid designator. The key operation of Borisov’s system, the assignment of a counterpart in a possible world, is hidden behind the function f, which acts as a condition for identification, that is, it draws a cross-world line. In my opinion, the system has potential, but it needs to be thought through and refined.

Approximation of value function of differential game with minimal cost

Vestnik Udmurtskogo Universiteta. Matematika. Mekhanika. Komp'yuternye Nauki ◽

10.35634/vm210402 ◽

2021 ◽

Vol 31 (4) ◽

pp. 536-561

Author(s):

Yu.V. Averboukh

Keyword(s):

Differential Game ◽

Continuous Time ◽

Bellman Equation ◽

Value Function ◽

Stochastic Game ◽

Inequality Constraints ◽

Minimal Cost ◽

Parabolic Pde ◽

Zero Sum ◽

The Value Function

The paper is concerned with approximating the value function of a zero-sum differential game with minimal cost, i.e., a differential game whose payoff functional is determined by minimizing some quantity along the trajectory. The approximation uses the solutions of continuous-time stochastic games in which stopping is governed by one player. The value function of the auxiliary continuous-time stochastic game is described by the Isaacs–Bellman equation with additional inequality constraints. The Isaacs–Bellman equation is a parabolic PDE in the case of a stochastic differential game and takes the form of a system of ODEs in the case of a continuous-time Markov game. The approximation developed in the paper is based on the concept of the stochastic guide first proposed by Krasovskii and Kotelnikova.

On properties of one functional used in software constructions for solving differential games

Vestnik Udmurtskogo Universiteta. Matematika. Mekhanika. Komp'yuternye Nauki ◽

10.35634/vm210410 ◽

2021 ◽

Vol 31 (4) ◽

pp. 668-696

Author(s):

A.G. Chentsov

Keyword(s):

Value Function ◽

Sufficient Conditions ◽

Game Problem ◽

Limit Function ◽

Phase Constraints ◽

Nonlinear Differential Game ◽

Target Set ◽

Iteration Number ◽

Game Position ◽

The Value Function

A nonlinear differential game (DG) is investigated, together with relaxations of the game problem of guidance. A variant of the program iterations method is considered that is realized in the space of position functions and yields, in the limit, the value function of the minimax-maximin DG for special functionals of a trajectory. For every game position, this limit function gives the smallest size of the target set neighborhood for which, under proportional weakening of the phase constraints, the player interested in guidance still guarantees its realization. Properties of these functionals and of the limit function are investigated. In particular, sufficient conditions are obtained for the values of the limit function to be attained after a finite number of iterations.

Continuity of the value function for deterministic optimal impulse control with terminal state constraint

ESAIM Control Optimisation and Calculus of Variations ◽

10.1051/cocv/2021101 ◽

2021 ◽

Author(s):

Yue Zhou ◽

Xinwei Feng ◽

Jiongmin Yong

Keyword(s):

Viscosity Solution ◽

Impulse Control ◽

Value Function ◽

State Constraint ◽

Terminal State ◽

Dynamic Programming Principle ◽

Hamilton Jacobi Bellman ◽

Optimal Impulse ◽

The Value Function ◽

Terminal State Constraint

A deterministic optimal impulse control problem with terminal state constraint is considered. Because of the terminal state constraint, the value function might be discontinuous in general. The main contribution of this paper is the introduction of an intrinsic condition under which the value function is proved to be continuous. Then, by a Bellman dynamic programming principle, the corresponding Hamilton-Jacobi-Bellman type quasi-variational inequality (QVI, for short) is derived. The value function is proved to be a viscosity solution to this QVI. Whether the value function is characterized as the unique viscosity solution to this QVI is carefully addressed and left as a challenging open question.

On the forward algorithm for stopping problems on continuous-time Markov chains

Journal of Applied Probability ◽

10.1017/jpr.2021.11 ◽

2021 ◽

Vol 58 (4) ◽

pp. 1043-1063

Author(s):

Laurent Miclo ◽

Stéphane Villeneuve

Keyword(s):

Markov Chains ◽

Optimal Stopping ◽

Continuous Time ◽

Value Function ◽

Constructive Method ◽

Forward Algorithm ◽

Continuous Time Markov Chains ◽

Stopping Set ◽

Optimal Stopping Problems ◽

The Value Function

Abstract: We revisit the forward algorithm, developed by Irle, to characterize both the value function and the stopping set for a large class of optimal stopping problems on continuous-time Markov chains. Our objective is to renew interest in this constructive method by showing its usefulness in solving some constrained optimal stopping problems that have emerged recently.

the value function
Recently Published Documents

Stochastic optimal control with random coefficients and associated stochastic Hamilton–Jacobi–Bellman equations

An Offloading Algorithm based on Markov Decision Process in Mobile Edge Computing System

A variational inequality arising from optimal surrender of variable annuity with lookback benefit

The value function of a transportation problem

Statistical inference of the value function for reinforcement learning in infinite‐horizon settings

HOW TO NOT MIX WORDS AND OBJECTS

Approximation of value function of differential game with minimal cost

On properties of one functional used in software constructions for solving differential games

Continuity of the value function for deterministic optimal impulse control with terminal state constraint

On the forward algorithm for stopping problems on continuous-time Markov chains
