state and control constraints
Recently Published Documents

TOTAL DOCUMENTS: 113 (FIVE YEARS: 13)
H-INDEX: 20 (FIVE YEARS: 0)
2021
Author(s): Xinglong Zhang, Yaoqian Peng, Biao Luo, Wei Pan, Xin Xu, ...

Recently, barrier function-based safe reinforcement learning (RL) with an actor-critic structure for continuous control tasks has received increasing attention. It remains challenging to learn a near-optimal control policy with safety and convergence guarantees, and few works have addressed safe RL algorithm design under time-varying safety constraints. This paper proposes a model-based safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. In the proposed approach, we construct a novel barrier-based control policy structure that guarantees control safety. A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and to guide the policy to update safely. Theoretical results on stability and robustness are proven, and the convergence of the actor-critic learning algorithm is analyzed. The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. Furthermore, the approach is applied to the integrated path-following and collision-avoidance problem for two real-world intelligent vehicles: a differential-drive vehicle and an Ackermann-drive vehicle are used to verify the offline deployment performance and the online learning performance, respectively. Our approach shows impressive sim-to-real transfer capability and satisfactory online control performance in the experiments.
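The abstract describes the method only at a high level. As a rough, hypothetical sketch (plain NumPy; the function names, toy dynamics, and constraint below are illustrative assumptions, not the authors' implementation), the following shows how a multi-step evaluation of a policy's predicted safety risk under a time-varying constraint h(x, t) >= 0 could be organized with a log-barrier penalty:

```python
import numpy as np

# Hypothetical sketch (not the authors' code): a log-barrier safety risk for a
# policy under a time-varying state constraint h(x, t) >= 0.

def h(x, t):
    """Time-varying constraint: stay outside a circular obstacle that drifts."""
    center = np.array([np.cos(0.5 * t), np.sin(0.5 * t)])
    return np.linalg.norm(x - center) - 0.3   # signed distance margin

def barrier(x, t, eps=1e-6):
    """Log-barrier value; blows up as the state approaches the boundary."""
    return -np.log(max(h(x, t), eps))

def multi_step_safety_risk(x0, policy, dynamics, t0=0.0, horizon=10, dt=0.1):
    """Roll the model forward under the policy and accumulate barrier
    penalties -- a proxy for the predicted 'safety risk' that a safe RL
    scheme could use to guide policy updates."""
    x, t, risk = np.array(x0, dtype=float), t0, 0.0
    for _ in range(horizon):
        u = policy(x, t)
        x = x + dt * dynamics(x, u)   # one-step model prediction
        t += dt
        risk += dt * barrier(x, t)
    return risk

# Toy single-integrator dynamics with a saturated go-to-goal policy.
goal = np.array([1.5, 0.0])
dynamics = lambda x, u: u
policy = lambda x, t: np.clip(goal - x, -1.0, 1.0)

print(multi_step_safety_risk([-1.0, 0.0], policy, dynamics))
```

A safe RL scheme of the kind the abstract describes would then penalize or reject policy updates that increase this predicted risk, rather than reacting only after a violation occurs.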


Author(s): Kaiwen Liu, Nan Li, Ilya Kolmanovsky, Denise Rizzo, Anouck Girard

Abstract This paper proposes a learning reference governor (LRG) approach to enforce state and control constraints in systems for which an accurate model is unavailable. The approach enables the reference governor to gradually improve command tracking performance through learning while enforcing the constraints both during and after learning. The learning can be performed either on a black-box-type model of the system or directly on the hardware. After introducing the LRG algorithm and outlining its theoretical properties, the paper investigates the application of LRG to fuel truck (tank truck) rollover avoidance. Through simulations based on a fuel truck model that accounts for liquid fuel sloshing effects, we show that the proposed LRG can effectively protect fuel trucks from rollover accidents under various operating conditions.
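As a minimal illustration of the reference-governor idea (not the paper's LRG; the prediction model, the scalar-margin learning rule, and all names below are assumptions), the governor advances the applied reference toward the command by the largest step whose predicted constrained output stays within a learned safety margin:

```python
import numpy as np

# Hypothetical reference-governor sketch (not the paper's LRG): the governor
# moves the applied reference v toward the command r by the largest admissible
# step, judged against a learned margin on a constrained output y <= y_max
# (think: a rollover index for the fuel truck).

def governor_step(v, r, x, predict_peak, y_max, margin, n_grid=50):
    """Largest kappa in [0, 1] such that the predicted peak output stays
    below y_max - margin; holds v if no forward step is admissible."""
    best = v
    for kappa in np.linspace(0.0, 1.0, n_grid):
        cand = v + kappa * (r - v)
        if predict_peak(x, cand) <= y_max - margin:
            best = cand
        else:
            break
    return best

def update_margin(margin, observed_peak, y_max, lr=0.05, target_slack=0.1):
    """Crude learning rule standing in for the LRG's data-driven refinement:
    widen the margin after near-violations, shrink it slowly when safe."""
    slack = y_max - observed_peak
    return max(0.0, margin + lr * (target_slack - slack))

# Toy closed loop: the state lags the applied reference; the predicted peak
# is proportional to the commanded step (a stand-in prediction model).
predict_peak = lambda x, v_cmd: abs(v_cmd - x)
x, v, r, margin, y_max = 0.0, 0.0, 1.0, 0.05, 0.5
for _ in range(30):
    v = governor_step(v, r, x, predict_peak, y_max, margin)
    x += 0.5 * (v - x)                       # plant response
    margin = update_margin(margin, abs(v - x), y_max)
print(round(v, 3), round(margin, 3))
```

In the paper's setting the prediction would come from a black-box model or the hardware itself, and the learning would refine the constraint-admissible set rather than a single scalar margin.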


Author(s): Martin Gugat

Abstract In this paper the turnpike phenomenon is studied for optimal control problems in which both pointwise-in-time state and control constraints can appear. We assume that the objective function contains a tracking term, given as an integral over the time interval $[0, T]$, that measures the distance to a desired stationary state. In the optimal control problem, both the initial state and the desired terminal state are prescribed. We assume that the system is exactly controllable in an abstract sense if the time horizon is long enough. We show that the corresponding optimal control problems on the time intervals $[0, T]$ give rise to a turnpike structure in the sense that, for natural numbers $n$, if $T$ is sufficiently large, the contribution to the objective function from subintervals of $[0, T]$ of the form $[t - t/2^n,\; t + (T-t)/2^n]$ is of the order $1/\min\{t^n, (T-t)^n\}$. We also show that a similar result holds for $\epsilon$-optimal solutions of the optimal control problems if $\epsilon > 0$ is chosen sufficiently small. At the end of the paper we present both systems governed by ordinary differential equations and systems governed by partial differential equations where the results can be applied.
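Restated compactly in LaTeX (the notation is ours, not necessarily the paper's: $\ell$ denotes the tracking integrand and $\bar{x}$ the desired stationary state):

```latex
% Turnpike-type estimate paraphrased from the abstract, in assumed notation:
% near an interior time t, the objective contribution is polynomially small.
\[
  \int_{t - t/2^{n}}^{\,t + (T-t)/2^{n}} \ell\bigl(x(s) - \bar{x}\bigr)\,\mathrm{d}s
  \;=\; \mathcal{O}\!\left(\frac{1}{\min\{t^{n},\,(T-t)^{n}\}}\right),
  \qquad n \in \mathbb{N},\ T \text{ sufficiently large.}
\]
```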


2020 · Vol 65 (10) · pp. 4288-4294
Author(s): Dominic Liao-McPherson, Marco M. Nicotra, Asen L. Dontchev, Ilya V. Kolmanovsky, Vladimir M. Veliov
