Efficient approximate dynamic programming based on design and analysis of computer experiments for infinite-horizon optimization

2020 ◽ Vol 124 ◽ pp. 105032
Author(s):  
Ying Chen ◽  
Feng Liu ◽  
Jay M. Rosenberger ◽  
Victoria C.P. Chen ◽  
Asama Kulvanitchaiyanunt ◽  
...  


Author(s):  
Tohid Sardarmehni ◽  
Ali Heydari

Approximate dynamic programming, also known as reinforcement learning, is applied for optimal control of antilock brake systems (ABS) in ground vehicles. A quarter-vehicle model with a hydraulic brake system is selected as an accurate, control-oriented model of the brake system. Owing to the switching nature of the hydraulic brake system in ABS, an optimal switching solution is generated by minimizing a performance index that penalizes the braking distance and drives the vehicle velocity to zero while preventing wheel lock-up. Toward this objective, a value iteration algorithm is selected for 'learning' the infinite-horizon solution. Artificial neural networks, as powerful function approximators, are used to approximate the value function; the training is conducted offline using least squares. Once trained, the converged neural network determines optimal decisions for the actuators on the fly. Numerical simulations show that this approach is very promising while carrying a low real-time computational burden, and it therefore outperforms many existing solutions in the literature.
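
As an illustration of the kind of offline value iteration with a least-squares-trained, linear-in-parameters value function described above, the following sketch runs fitted value iteration on a toy switched system. The two-mode dynamics, stage cost, polynomial features, and discount factor are hypothetical placeholders; this is not the paper's quarter-vehicle ABS model.

```python
import numpy as np

# Sketch: fitted value iteration with a linear-in-parameters value function
# trained by least squares, on a hypothetical two-mode switched system.
# (Illustrative only -- not the paper's quarter-vehicle ABS model.)

def features(x):
    # Polynomial basis standing in for the approximator's hidden layer.
    return np.array([1.0, x, x**2, x**3])

def step(x, u):
    # Hypothetical switched dynamics: mode u = 0 coasts, u = 1 brakes harder.
    return (0.95 if u == 0 else 0.7) * x

def cost(x, u):
    # Stage cost penalizing the state (a stand-in for braking distance)
    # plus a small penalty for engaging the actuator.
    return x**2 + 0.05 * u

gamma, actions = 0.95, (0, 1)
w = np.zeros(4)                                    # value-function weights
samples = np.random.uniform(-1.0, 1.0, size=200)   # offline training states

for _ in range(100):                               # value iteration sweeps
    Phi = np.array([features(x) for x in samples])
    targets = np.array([
        min(cost(x, u) + gamma * features(step(x, u)) @ w for u in actions)
        for x in samples
    ])
    w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)   # least-squares fit

def policy(x):
    # Online decision: pick the mode minimizing the estimated cost-to-go.
    return min(actions, key=lambda u: cost(x, u) + gamma * features(step(x, u)) @ w)

print("action at x=0.5:", policy(0.5))
```

Once the weights converge, the learned value function is queried online, as in the `policy` function above, so the real-time burden reduces to evaluating a few basis functions per candidate action.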


Author(s):  
Phillip R. Jenkins ◽  
Matthew J. Robbins ◽  
Brian J. Lunday

Military medical planners must consider how aerial medical evacuation (MEDEVAC) assets will be dispatched when preparing for and supporting high-intensity combat operations. The dispatching authority seeks to dispatch MEDEVAC assets to prioritized requests for service, such that battlefield casualties are effectively and efficiently transported to nearby medical-treatment facilities. We formulate and solve a discounted, infinite-horizon Markov decision process (MDP) model of the MEDEVAC dispatching problem. Because the high dimensionality and uncountable state space of our MDP model render classical dynamic programming solution methods intractable, we instead apply approximate dynamic programming (ADP) solution methods to produce high-quality dispatching policies relative to the currently practiced closest-available dispatching policy. We develop, test, and compare two distinct ADP solution techniques, both of which utilize an approximate policy iteration (API) algorithmic framework. The first algorithm uses least-squares temporal differences (LSTD) learning for policy evaluation, whereas the second algorithm uses neural network (NN) learning. We construct a notional, yet representative planning scenario based on high-intensity combat operations in southern Azerbaijan to demonstrate the applicability of our MDP model and to compare the efficacies of our proposed ADP solution techniques. We generate 30 problem instances via a designed experiment to examine how selected problem features and algorithmic features affect the quality of solutions attained by our ADP policies. Results show that the respective policies determined by the NN-API and LSTD-API algorithms significantly outperform the closest-available benchmark policies in 27 (90%) and 24 (80%) of the problem instances examined. Moreover, the NN-API policies significantly outperform the LSTD-API policies in each of the problem instances examined. Compared with the closest-available policy for the baseline problem instance, the NN-API policy decreases the average response time of important urgent (i.e., life-threatening) requests by 39 minutes. These research models, methodologies, and results inform the implementation and modification of current and future MEDEVAC tactics, techniques, and procedures, as well as the design and purchase of future aerial MEDEVAC assets.
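
As a rough illustration of the LSTD-based approximate policy iteration (API) machinery mentioned above, the sketch below alternates least-squares temporal-difference policy evaluation with greedy policy improvement on a small synthetic MDP. The transition and reward tables, one-hot features, trajectory length, and regularization are assumptions made for the example and bear no relation to the MEDEVAC dispatching model.

```python
import numpy as np

# Sketch: approximate policy iteration (API) with least-squares temporal
# differences (LSTD) policy evaluation on a small synthetic MDP.
# (Illustrative only -- not the MEDEVAC dispatching model.)

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 20, 3, 0.9
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # rewards

def phi(s):
    # One-hot state features; a real application would use a compact basis.
    v = np.zeros(n_states)
    v[s] = 1.0
    return v

policy = rng.integers(n_actions, size=n_states)    # arbitrary initial policy

for _ in range(10):                                # API: evaluate, then improve
    # LSTD policy evaluation from a simulated trajectory under `policy`.
    A = np.zeros((n_states, n_states))
    b = np.zeros(n_states)
    s = 0
    for _ in range(5000):
        a = policy[s]
        s_next = rng.choice(n_states, p=P[s, a])
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)
        b += f * R[s, a]
        s = s_next
    w = np.linalg.solve(A + 1e-6 * np.eye(n_states), b)   # value weights
    # Greedy policy improvement against the fitted value function.
    Q = R + gamma * np.einsum('sat,t->sa', P, w)
    policy = Q.argmax(axis=1)

print("greedy policy:", policy)
```

In a large-scale application such as the one described above, the one-hot features would be replaced by a compact basis (or by a neural network, as in the NN-API variant), since enumerating the state space is exactly what the approximation is meant to avoid.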


Author(s):  
Hossein Nejatbakhsh Esfahani ◽  
Rafal Szlapczynski

This paper proposes a hybrid robust-adaptive learning-based control scheme based on Approximate Dynamic Programming (ADP) for the tracking control of autonomous ship maneuvering. We adopt a Time-Delay Control (TDC) approach, known as a simple, practical, model-free, and fairly robust strategy, combined with an Actor-Critic Approximate Dynamic Programming (ACADP) algorithm as the adaptive part of the proposed hybrid control algorithm. Based on this integration, Actor-Critic Time-Delay Control (AC-TDC) is proposed, offering a high-performance robust-adaptive control approach for path following of autonomous ships under deterministic and stochastic disturbances induced by winds, waves, and ocean currents. Computer simulations have been conducted under both deterministic and stochastic disturbances, and all results indicate acceptable path-tracking performance for the proposed control algorithm in comparison with the conventional TDC approach.
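
The sketch below illustrates a generic actor-critic ADP loop of the kind referenced above, applied to a scalar tracking task with linear-in-features actor and critic. The toy plant, reference signal, features, exploration noise, and learning rates are hypothetical; the paper's AC-TDC controller and ship dynamics are not reproduced here.

```python
import numpy as np

# Sketch: a generic actor-critic ADP loop for a scalar tracking task with
# linear-in-features actor and critic and Gaussian exploration.
# (Illustrative only -- not the paper's AC-TDC ship controller.)

rng = np.random.default_rng(1)
gamma, alpha_c, alpha_a, sigma = 0.95, 0.05, 0.01, 0.05

def feat(e):
    # Features of the tracking error shared by the critic and the actor.
    return np.array([e, e**2, 1.0])

def reference(t):
    return np.sin(0.05 * t)          # hypothetical reference path

w_c = np.zeros(3)   # critic weights: estimated cost-to-go of the error
w_a = np.zeros(3)   # actor weights: mean control as a feature combination

x = 0.0
for t in range(2000):
    e = x - reference(t)
    noise = rng.normal(scale=sigma)
    u = np.clip(float(w_a @ feat(e)) + noise, -2.0, 2.0)   # exploratory action
    x_next = 0.95 * x + 0.1 * u + rng.normal(scale=0.01)   # toy disturbed plant
    e_next = x_next - reference(t + 1)
    cost = e**2 + 0.01 * u**2        # penalize error and control effort

    # Critic: temporal-difference update of the value estimate.
    delta = cost + gamma * (w_c @ feat(e_next)) - (w_c @ feat(e))
    w_c += alpha_c * delta * feat(e)

    # Actor: policy-gradient-style step using the TD error and the
    # exploration noise (moves against directions that raised the cost).
    w_a -= alpha_a * delta * noise * feat(e)
    x = x_next

print("actor weights:", w_a)
```

The critic learns a value of the tracking error from temporal-difference errors, while the actor is nudged in the direction that reduces that value; in a hybrid scheme like the one described above, such an adaptive component would be layered on top of the baseline TDC law.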

