CertRL: formalizing convergence proofs for value and policy iteration in Coq

Author(s): Koundinya Vajjha, Avraham Shinnar, Barry Trager, Vasily Pestun, Nathan Fulton

2021, Vol. 11 (5), pp. 2312

Author(s): Dengguo Xu, Qinglin Wang, Yuan Li

In this study, based on policy iteration (PI) in reinforcement learning (RL), an optimal adaptive control approach is established to solve robust control problems for nonlinear systems with internal and input uncertainties. First, the robust control problem is converted into an optimal control problem for a nominal or auxiliary system with a predefined performance index. It is demonstrated that the optimal control law renders the considered system globally asymptotically stable for all admissible uncertainties. Second, based on the Bellman optimality principle, online PI algorithms are proposed to compute robust controllers for both the matched and the mismatched uncertain systems. An approximate structure of the robust control law is obtained by approximating the optimal cost function with a neural network within the PI algorithms. Finally, numerical examples are provided to illustrate the effectiveness of the proposed algorithm and the theoretical results.
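To make the evaluate/improve loop that the abstract builds on concrete, the following is a minimal sketch of generic policy iteration on a small, made-up discounted MDP. The transition matrices, rewards, and discount factor are arbitrary placeholders; this is not the authors' continuous-time, neural-network-based algorithm, only the underlying PI scheme.

```python
import numpy as np

# Hypothetical MDP: P[a, s, s'] is the transition matrix under action a,
# R[s, a] the immediate reward, gamma the discount factor.
n_states, n_actions, gamma = 4, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # shape (a, s, s')
R = rng.standard_normal((n_states, n_actions))

policy = np.zeros(n_states, dtype=int)          # start from an arbitrary policy
for _ in range(100):
    # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
    P_pi = P[policy, np.arange(n_states)]       # row s is P[policy[s], s, :]
    R_pi = R[np.arange(n_states), policy]
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)

    # Policy improvement: act greedily with respect to the one-step Bellman backup.
    Q = R + gamma * np.einsum('ast,t->sa', P, V)
    new_policy = Q.argmax(axis=1)
    if np.array_equal(new_policy, policy):
        break                                   # greedy policy is stable, hence optimal
    policy = new_policy

print("greedy policy:", policy, "values:", np.round(V, 3))
```

The same two-step structure (evaluate the current policy, then improve it via the Bellman operator) is what the online PI algorithms in the paper carry over to the continuous-time, uncertain setting, with the exact linear solve replaced by neural-network approximation of the cost function.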


Author(s):  
Sudeep Kundu ◽  
Karl Kunisch

Abstract: Policy iteration is a widely used technique for solving the Hamilton–Jacobi–Bellman (HJB) equation, which arises in nonlinear optimal feedback control theory. Its convergence analysis has attracted much attention in the unconstrained case. Here we analyze the case with control constraints, for the HJB equations that arise in both the deterministic and the stochastic control settings. The linear equations in each iteration step are solved by an implicit upwind scheme. Numerical examples are presented for the HJB equation with control constraints, and comparisons are made with the unconstrained cases.
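As a rough illustration of the procedure the abstract describes, the following is a minimal sketch of policy iteration with an upwind discretization and a box control constraint for a toy one-dimensional deterministic HJB equation. The dynamics, running cost, discount rate, grid, and boundary treatment are all illustrative assumptions, not the scheme used in the paper.

```python
import numpy as np

# Toy problem (assumed for illustration): dynamics x' = a, cost l(x, a) = x^2 + a^2,
# discount rho, constraint |a| <= a_max.  The HJB equation is
#   rho V(x) = min_{|a| <= a_max} [ l(x, a) + a V'(x) ].
rho, a_max, h = 0.5, 0.5, 0.02
x = np.arange(-1.0, 1.0 + h, h)            # spatial grid on [-1, 1]
N = len(x)
controls = np.linspace(-a_max, a_max, 41)  # discretized admissible control set

def upwind_derivative(V, i, drift):
    """One-sided (upwind) difference of V at index i, chosen by the drift sign.
    Falls back to the inward difference at the boundary (crude treatment)."""
    if drift >= 0:
        return (V[i + 1] - V[i]) / h if i + 1 < N else (V[i] - V[i - 1]) / h
    return (V[i] - V[i - 1]) / h if i - 1 >= 0 else (V[i + 1] - V[i]) / h

def evaluate_policy(policy):
    """Policy evaluation: solve the linear upwind system
       rho V_i = l(x_i, a_i) + a_i * (upwind difference of V at i)."""
    A = np.zeros((N, N))
    b = np.zeros(N)
    for i, a in enumerate(policy):
        A[i, i] = rho
        b[i] = x[i] ** 2 + a ** 2
        if a >= 0 and i + 1 < N:        # forward difference
            A[i, i] += a / h
            A[i, i + 1] -= a / h
        elif a < 0 and i - 1 >= 0:      # backward difference
            A[i, i] -= a / h
            A[i, i - 1] += a / h
        # at an outflow boundary the transport term is simply dropped
    return np.linalg.solve(A, b)

def improve_policy(V):
    """Policy improvement: minimize the Hamiltonian over the constrained control set."""
    new_policy = np.empty(N)
    for i in range(N):
        hams = [x[i] ** 2 + a ** 2 + a * upwind_derivative(V, i, a) for a in controls]
        new_policy[i] = controls[int(np.argmin(hams))]
    return new_policy

policy = np.zeros(N)                      # initial guess: zero control
for k in range(50):
    V = evaluate_policy(policy)
    new_policy = improve_policy(V)
    if np.max(np.abs(new_policy - policy)) < 1e-12:
        break
    policy = new_policy

print(f"stopped after {k + 1} iterations, V(0) ~ {V[N // 2]:.4f}")
```

Each iteration alternates a linear solve for the value of the current feedback law with a pointwise constrained minimization of the Hamiltonian, which is the structure the paper analyzes; the stochastic case adds a diffusion term to the linear equations.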

