Controlling a random population

2021 ◽  
Vol 17, Issue 4 ◽  
Author(s):  
Thomas Colcombet ◽  
Nathanaël Fijalkow ◽  
Pierre Ohlmann

Bertrand et al. introduced a model of parameterised systems, where each agent is represented by a finite state system, and studied the following control problem: for any number of agents, does there exist a controller able to bring all agents to a target state? They showed that the problem is decidable and EXPTIME-complete in the adversarial setting, and left open the stochastic setting, where the agent is represented by a Markov decision process. In this paper, we show that the stochastic control problem is decidable. Our solution makes significant use of well-quasi-orders, the max-flow min-cut theorem, and the theory of regular cost functions. We introduce an intermediate problem of independent interest, called the sequential flow problem, and study its complexity.

Author(s):  
Thomas Colcombet ◽  
Nathanaël Fijalkow ◽  
Pierre Ohlmann

Bertrand et al. introduced a model of parameterised systems, where each agent is represented by a finite state system, and studied the following control problem: for any number of agents, does there exist a controller able to bring all agents to a target state? They showed that the problem is decidable and EXPTIME-complete in the adversarial setting, and left open the stochastic setting, where the agent is represented by a Markov decision process. In this paper, we show that the stochastic control problem is decidable. Our solution makes significant use of well-quasi-orders, the max-flow min-cut theorem, and the theory of regular cost functions.


2020 ◽  
Vol 17 (6) ◽  
pp. 172988142096908
Author(s):  
Mao Zheng ◽  
Shuo Xie ◽  
Xiumin Chu ◽  
Tianquan Zhu ◽  
Guohao Tian

To learn the optimal collision avoidance policy of merchant ships controlled by human experts, a finite-state Markov decision process model for ship collision avoidance is proposed based on an analysis of the collision avoidance mechanism, and an inverse reinforcement learning (IRL) method based on cross entropy and projection is proposed to obtain the optimal policy from experts' demonstrations. Collision avoidance simulations in different ship encounters are conducted, and the results show that the policy obtained by the proposed IRL method has a good inversion effect on two kinds of human experts, which indicates that the proposed method can effectively learn the policy of human experts for ship collision avoidance.


2012 ◽  
Author(s):  
Krishnamoorthy Kalyanam ◽  
Swaroop Darbha ◽  
Myoungkuk Park ◽  
Meir Pachter ◽  
Phil Chandler ◽  
...  
