state and control constraints
Recently Published Documents

TOTAL DOCUMENTS: 113 (FIVE YEARS: 13)
H-INDEX: 20 (FIVE YEARS: 0)
2021
Author(s): Xinglong Zhang, Yaoqian Peng, Biao Luo, Wei Pan, Xin Xu, ...

Recently, barrier function-based safe reinforcement learning (RL) with an actor-critic structure for continuous control tasks has received increasing attention. It remains challenging to learn a near-optimal control policy with safety and convergence guarantees, and few works have addressed safe RL algorithm design under time-varying safety constraints. This paper proposes a model-based safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. In the proposed approach, we construct a novel barrier-based control policy structure that guarantees control safety. A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and to guide the policy to update safely. Theoretical results on stability and robustness are proven, and the convergence of the actor-critic learning algorithm is analyzed. The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. Furthermore, the approach is applied to the integrated path-following and collision-avoidance problem for two real-world intelligent vehicles: a differential-drive vehicle and an Ackermann-drive vehicle are used to verify the offline deployment performance and the online learning performance, respectively. Our approach shows impressive sim-to-real transfer capability and satisfactory online control performance in the experiments.
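The abstract describes the method only at a high level. As a rough, hypothetical sketch (plain NumPy; the function names, toy dynamics, and constraint below are illustrative assumptions, not the authors' implementation), the following shows how a multi-step evaluation of a policy's predicted safety risk under a time-varying constraint h(x, t) >= 0 could be organized with a log-barrier penalty:

```python
import numpy as np

# Hypothetical sketch (not the authors' code): a log-barrier safety risk for a
# policy under a time-varying state constraint h(x, t) >= 0.

def h(x, t):
    """Time-varying constraint: stay outside a circular obstacle that drifts."""
    center = np.array([np.cos(0.5 * t), np.sin(0.5 * t)])
    return np.linalg.norm(x - center) - 0.3   # signed distance margin

def barrier(x, t, eps=1e-6):
    """Log-barrier value; blows up as the state approaches the boundary."""
    return -np.log(max(h(x, t), eps))

def multi_step_safety_risk(x0, policy, dynamics, t0=0.0, horizon=10, dt=0.1):
    """Roll the model forward under the policy and accumulate barrier
    penalties -- a proxy for the predicted 'safety risk' that a safe RL
    scheme could use to guide policy updates."""
    x, t, risk = np.array(x0, dtype=float), t0, 0.0
    for _ in range(horizon):
        u = policy(x, t)
        x = x + dt * dynamics(x, u)   # one-step model prediction
        t += dt
        risk += dt * barrier(x, t)
    return risk

# Toy single-integrator dynamics with a saturated go-to-goal policy.
goal = np.array([1.5, 0.0])
dynamics = lambda x, u: u
policy = lambda x, t: np.clip(goal - x, -1.0, 1.0)

print(multi_step_safety_risk([-1.0, 0.0], policy, dynamics))
```

A safe RL scheme of the kind the abstract describes would then penalize or reject policy updates that increase this predicted risk, rather than reacting only after a violation occurs.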


Author(s): Kaiwen Liu, Nan Li, Ilya Kolmanovsky, Denise Rizzo, Anouck Girard

Abstract This paper proposes a learning reference governor (LRG) approach to enforce state and control constraints in systems for which an accurate model is unavailable. The approach enables the reference governor to gradually improve command tracking performance through learning while enforcing the constraints both during and after learning. The learning can be performed either on a black-box-type model of the system or directly on the hardware. After introducing the LRG algorithm and outlining its theoretical properties, the paper investigates the application of LRG to fuel truck (tank truck) rollover avoidance. Through simulations based on a fuel truck model that accounts for liquid fuel sloshing effects, we show that the proposed LRG can effectively protect fuel trucks from rollover accidents under various operating conditions.
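As a minimal illustration of the reference-governor idea (not the paper's LRG; the prediction model, the scalar-margin learning rule, and all names below are assumptions), the governor advances the applied reference toward the command by the largest step whose predicted constrained output stays within a learned safety margin:

```python
import numpy as np

# Hypothetical reference-governor sketch (not the paper's LRG): the governor
# moves the applied reference v toward the command r by the largest admissible
# step, judged against a learned margin on a constrained output y <= y_max
# (think: a rollover index for the fuel truck).

def governor_step(v, r, x, predict_peak, y_max, margin, n_grid=50):
    """Largest kappa in [0, 1] such that the predicted peak output stays
    below y_max - margin; holds v if no forward step is admissible."""
    best = v
    for kappa in np.linspace(0.0, 1.0, n_grid):
        cand = v + kappa * (r - v)
        if predict_peak(x, cand) <= y_max - margin:
            best = cand
        else:
            break
    return best

def update_margin(margin, observed_peak, y_max, lr=0.05, target_slack=0.1):
    """Crude learning rule standing in for the LRG's data-driven refinement:
    widen the margin after near-violations, shrink it slowly when safe."""
    slack = y_max - observed_peak
    return max(0.0, margin + lr * (target_slack - slack))

# Toy closed loop: the state lags the applied reference; the predicted peak
# is proportional to the commanded step (a stand-in prediction model).
predict_peak = lambda x, v_cmd: abs(v_cmd - x)
x, v, r, margin, y_max = 0.0, 0.0, 1.0, 0.05, 0.5
for _ in range(30):
    v = governor_step(v, r, x, predict_peak, y_max, margin)
    x += 0.5 * (v - x)                       # plant response
    margin = update_margin(margin, abs(v - x), y_max)
print(round(v, 3), round(margin, 3))
```

In the paper's setting the prediction would come from a black-box model or the hardware itself, and the learning would refine the constraint-admissible set rather than a single scalar margin.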


Author(s): Martin Gugat

Abstract In this paper the turnpike phenomenon is studied for optimal control problems in which both pointwise-in-time state and control constraints can appear. We assume that the objective function contains a tracking term, given as an integral over the time interval $[0, T]$, that measures the distance to a desired stationary state. In the optimal control problem, both the initial state and the desired terminal state are prescribed. We assume that the system is exactly controllable in an abstract sense if the time horizon is long enough. We show that the corresponding optimal control problems on the time intervals $[0, T]$ give rise to a turnpike structure in the sense that, for natural numbers $n$, if $T$ is sufficiently large, the contribution to the objective function from subintervals of $[0, T]$ of the form $[t - t/2^n,\; t + (T-t)/2^n]$ is of the order $1/\min\{t^n, (T-t)^n\}$. We also show that a similar result holds for $\epsilon$-optimal solutions of the optimal control problems if $\epsilon > 0$ is chosen sufficiently small. At the end of the paper we present both systems governed by ordinary differential equations and systems governed by partial differential equations where the results can be applied.
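Restated compactly in LaTeX (the notation is ours, not necessarily the paper's: $\ell$ denotes the tracking integrand and $\bar{x}$ the desired stationary state):

```latex
% Turnpike-type estimate paraphrased from the abstract, in assumed notation:
% near an interior time t, the objective contribution is polynomially small.
\[
  \int_{t - t/2^{n}}^{\,t + (T-t)/2^{n}} \ell\bigl(x(s) - \bar{x}\bigr)\,\mathrm{d}s
  \;=\; \mathcal{O}\!\left(\frac{1}{\min\{t^{n},\,(T-t)^{n}\}}\right),
  \qquad n \in \mathbb{N},\ T \text{ sufficiently large.}
\]
```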


2020 · Vol 65 (10) · pp. 4288-4294
Author(s): Dominic Liao-McPherson, Marco M. Nicotra, Asen L. Dontchev, Ilya V. Kolmanovsky, Vladimir M. Veliov
