Markov decision processes for integrating life cycle dynamics into fab-level decision making

1999 ◽ Vol 32 (2) ◽ pp. 4852-4857
Author(s): Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus, Ying He

2018
Author(s): Koosha Khalvati, Seongmin A. Park, Saghar Mirbagheri, Remi Philippe, Mariateresa Sestito, ...

Abstract: To make decisions in a social context, humans have to predict the behavior of others, an ability that is thought to rely on having a model of other minds known as theory of mind. Such a model becomes especially complex when the number of people one simultaneously interacts with is large and the actions are anonymous. Here, we show that in order to make decisions within a large group, humans employ Bayesian inference to model the “mind of the group,” making predictions of others’ decisions while also considering the effects of their own actions on the group as a whole. We present results from a group decision-making task known as the Volunteer’s Dilemma and demonstrate that a Bayesian model based on partially observable Markov decision processes outperforms existing models in quantitatively explaining human behavior. Our results suggest that in group decision making, rather than acting based solely on the rewards received thus far, humans maintain a model of the group and simulate the group’s dynamics into the future in order to choose an action as a member of the group.
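As an illustration of the kind of Bayesian belief updating such a model involves, the following is a minimal sketch, not the paper's model: a single agent in a Volunteer's Dilemma keeps a Beta belief over how likely the other group members are to volunteer, updates it after each round, and picks the action with the higher one-step expected payoff. The group size, payoff values, and prior are assumptions made purely for the example.

```python
N_OTHERS = 4          # other group members (assumed)
V, C = 1.0, 0.4       # value of the public good and cost of volunteering, V > C (assumed)

# Beta belief over the others' (shared) probability of volunteering.
alpha, beta = 1.0, 1.0

def expected_payoffs(alpha, beta):
    """One-step expected payoff of (waiting, volunteering) under the current belief."""
    p = alpha / (alpha + beta)            # posterior mean volunteering rate
    p_nobody = (1.0 - p) ** N_OTHERS      # chance that no one else volunteers
    ev_wait = (1.0 - p_nobody) * V        # receive V only if someone else volunteers
    ev_volunteer = V - C                  # the public good is then produced for sure
    return ev_wait, ev_volunteer

def choose_action(alpha, beta):
    ev_wait, ev_volunteer = expected_payoffs(alpha, beta)
    return int(ev_volunteer > ev_wait)    # 1 = volunteer, 0 = wait

def update_belief(alpha, beta, n_other_volunteers):
    # Conjugate Beta-Binomial update from the observed round outcome.
    return alpha + n_other_volunteers, beta + (N_OTHERS - n_other_volunteers)

# One example round: decide, then observe that one other person volunteered.
print("action:", choose_action(alpha, beta))
alpha, beta = update_belief(alpha, beta, n_other_volunteers=1)
print("updated belief mean:", alpha / (alpha + beta))
```

The paper's POMDP model goes further than this myopic rule: it simulates the group's dynamics over future rounds, including the effect of the agent's own action on the group, before committing to a choice.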


Author(s):  
Alvaro Velasquez

In this paper, we introduce the Steady-State Policy Synthesis (SSPS) problem, which consists of finding a stochastic decision-making policy that maximizes expected rewards while satisfying a set of asymptotic behavioral specifications. These specifications are determined by the steady-state probability distribution resulting from the Markov chain induced by a given policy. Since such distributions necessitate recurrence, we propose a solution that finds policies inducing recurrent Markov chains within possibly non-recurrent Markov decision processes (MDPs). The SSPS problem is a generalization of steady-state control, which has been shown to be in PSPACE. We improve upon this result by showing that SSPS is in P via linear programming. Our results are validated using CPLEX simulations on MDPs with over 10,000 states. We also prove that the deterministic variant of SSPS is NP-hard.
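A minimal sketch of the linear-programming idea, assuming a generic occupation-measure LP for an average-reward MDP: scipy's linprog stands in for CPLEX, and the two-state MDP and the "at least 50% of time in state 0" specification are invented for illustration; this is not the paper's exact formulation.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny hand-crafted MDP: P[s, a, s'] transition probabilities, R[s, a] expected rewards.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],   # from state 0
              [[0.1, 0.9], [0.9, 0.1]]])  # from state 1
R = np.array([[1.0, 2.0],
              [0.0, 0.5]])
S, A = R.shape

# Variables: occupation measure x[s, a] (long-run fraction of time in s taking a).
# Objective: maximize sum_{s,a} x[s,a] * R[s,a]; linprog minimizes, so negate.
c = -R.flatten()

# Stationarity: for each s', sum_{s,a} x[s,a] * P[s,a,s'] - sum_a x[s',a] = 0,
# plus the normalization sum_{s,a} x[s,a] = 1.
A_eq = np.zeros((S + 1, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = P[s, a, sp] - (1.0 if s == sp else 0.0)
A_eq[S, :] = 1.0
b_eq = np.zeros(S + 1)
b_eq[S] = 1.0

# Assumed steady-state specification: spend at least 50% of the time in state 0,
# encoded as -sum_a x[0, a] <= -0.5.
A_ub = np.zeros((1, S * A))
A_ub[0, :A] = -1.0
b_ub = np.array([-0.5])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)  # x >= 0 by default
x = res.x.reshape(S, A)

# Recover the stochastic policy pi(a | s) = x[s, a] / sum_a' x[s, a'].
pi = x / x.sum(axis=1, keepdims=True)
print("steady-state state mass:", x.sum(axis=1))
print("policy:\n", pi)
```

Working in the occupation-measure space is what keeps everything linear: both the reward objective and the asymptotic specifications are linear in x, and a stochastic policy is read off by normalizing x per state.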


Author(s):  
Matthew Hoffman ◽  
Nando de Freitas

Semi-Markov decision processes are used to formulate many control problems and also play a key role in hierarchical reinforcement learning. In this chapter we show how to translate the decision-making problem into a form that can instead be solved by inference and learning techniques. In particular, we establish a formal connection between planning in semi-Markov decision processes and inference in probabilistic graphical models, and then build on this connection to develop an expectation-maximization (EM) algorithm for policy optimization in these models.
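A much-simplified sketch of the flavor of such an EM scheme, assuming a plain discounted MDP rather than the chapter's semi-Markov setting: the E-step evaluates the current policy, and the M-step reweights each action's probability by its value (a reward-weighted update). The transitions, rewards, and discount below are toy values, and nonnegative rewards are assumed.

```python
import numpy as np

def em_policy_optimization(P, R, gamma=0.9, iters=50):
    """P[s, a, s'] transition probabilities, R[s, a] >= 0 expected rewards."""
    S, A = R.shape
    pi = np.full((S, A), 1.0 / A)                      # start from the uniform policy
    for _ in range(iters):
        # E-step: evaluate the current policy (closed-form policy evaluation).
        P_pi = np.einsum('saj,sa->sj', P, pi)          # state transitions under pi
        r_pi = np.einsum('sa,sa->s', R, pi)            # expected reward under pi
        V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        Q = R + gamma * np.einsum('saj,j->sa', P, V)
        # M-step: reweight the policy by its action values and renormalize.
        pi = pi * Q
        pi /= pi.sum(axis=1, keepdims=True)
    return pi

# Toy two-state, two-action MDP.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.1, 0.9], [0.9, 0.1]]])
R = np.array([[1.0, 0.2],
              [0.0, 0.5]])
print(em_policy_optimization(P, R))
```

In the graphical-model view, reward plays the role of the likelihood of an "optimality" variable, so policy evaluation corresponds to the E-step and the reward-weighted renormalization of the policy corresponds to the M-step.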

