A Stochastic Optimal Control Policy for A Manufacturing System on A Finite Time Horizon

Author(s):  
Eugene Khmelnitsky ◽  
Gonen Singer
Author(s):  
Han Zhang ◽  
Yibei Li ◽  
Xiaoming Hu

Abstract: In this paper, the problem of inverse quadratic optimal control over a finite time horizon for discrete-time linear systems is considered. Our goal is to recover the corresponding quadratic objective function from noisy observations. First, the identifiability of the model structure for the inverse optimal control problem is analyzed under a relative degree assumption, and we show that the model structure is strictly globally identifiable. Next, we study the inverse optimal control problem in which the initial state distribution and the observation noise distribution are unknown, yet exact observations of the initial states are available. We formulate the problem as a risk minimization problem and approximate it using the empirical average. It is further shown that the solution to the approximated problem is statistically consistent under the relative degree assumption. We then study the case where exact observations of the initial states are not available, yet the observation noise is known to be white Gaussian and the distribution of the initial state is also Gaussian (with unknown mean and covariance). The EM algorithm is used to estimate the parameters in the objective function. The effectiveness of our results is demonstrated by numerical examples.
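For orientation, the forward problem behind this abstract (finite-horizon quadratic optimal control of a discrete-time linear system) can be sketched in the scalar case with the standard backward Riccati recursion. This is an illustrative sketch only, not the authors' inverse method: the function names `finite_horizon_lqr` and `simulate`, the scalar model x[t+1] = a*x[t] + b*u[t], and all numerical values are assumptions made for the example. The inverse problem in the abstract would start from noisy trajectories like the one simulated here and try to recover the cost weights.

```python
def finite_horizon_lqr(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for the (assumed) scalar system
    x[t+1] = a*x[t] + b*u[t] with cost
    sum_t (q*x[t]^2 + r*u[t]^2) + q_final*x[horizon]^2.
    Returns the feedback gains k[t] (u[t] = -k[t]*x[t]) and the
    cost-to-go coefficients p[t] (optimal cost from x at time t is p[t]*x^2).
    """
    p = q_final
    gains = [0.0] * horizon
    cost_to_go = [0.0] * (horizon + 1)
    cost_to_go[horizon] = p
    for t in range(horizon - 1, -1, -1):
        k = a * b * p / (r + b * b * p)      # optimal gain at time t
        p = q + a * a * p - a * b * p * k    # Riccati update
        gains[t] = k
        cost_to_go[t] = p
    return gains, cost_to_go


def simulate(a, b, q, r, q_final, gains, x0):
    """Roll the closed loop forward and accumulate the realized cost."""
    x = x0
    total = 0.0
    trajectory = [x]
    for k in gains:
        u = -k * x
        total += q * x * x + r * u * u
        x = a * x + b * u
        trajectory.append(x)
    total += q_final * x * x
    return trajectory, total


# Hypothetical parameter values, chosen only for illustration.
gains, cost_to_go = finite_horizon_lqr(a=1.0, b=0.5, q=1.0, r=0.1,
                                       q_final=1.0, horizon=10)
trajectory, total_cost = simulate(1.0, 0.5, 1.0, 0.1, 1.0, gains, x0=2.0)
```

A useful sanity check on such a sketch is that the simulated cost matches the value predicted by the recursion, `cost_to_go[0] * x0**2`, up to floating-point error.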


Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1466
Author(s):  
Beatris Adriana Escobedo-Trujillo ◽  
José Daniel López-Barrientos ◽  
Javier Garrido-Meléndez

This work presents a study of a finite-time horizon stochastic control problem with restrictions on both the reward and the cost functions. To this end, it uses standard dynamic programming techniques and an extension of the classic Lagrange multipliers approach. The coefficients considered here are assumed to be unbounded, and the obtained strategies are of non-stationary closed-loop type. The driving thread of the paper is a sequence of examples on a pollution accumulation model, which is used to present three algorithms for replicating the results. There, the reader can find a result on the interchangeability of limits in a Dirichlet problem.

