Construction of Semi-Markov Decision Process Models of Continuous State Space Environments Using Growing Cell Structures and Multiagent k-Certainty Exploration Method

Author(s): Takeshi Tateyama, Seiichi Kawata, Yoshiki Shimomura, ...

The k-certainty exploration method, an efficient reinforcement learning algorithm, cannot be applied directly to environments with continuous state spaces, because the continuous state space must first be converted into a discrete one. Our purpose is to construct discrete semi-Markov decision process (SMDP) models of such environments by using growing cell structures to autonomously partition the continuous state space and then applying the k-certainty exploration method to construct the SMDP models. The multiagent k-certainty exploration method is then used to improve exploration efficiency. Mobile robot simulations demonstrated our proposal's usefulness and efficiency.
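To make the two ingredients of the abstract concrete, the following Python sketch combines a nearest-centre quantizer standing in for a trained growing cell structures (GCS) network with the core of k-certainty exploration: keep trying actions in a discrete state until each has been observed at least k times. The class and function names, the cell layout, and the action-selection details are illustrative assumptions, not the authors' implementation.

import numpy as np

class CellQuantizer:
    """Maps a continuous observation to the index of the nearest cell centre."""
    def __init__(self, centres):
        self.centres = np.asarray(centres)          # shape (n_cells, state_dim)

    def state(self, x):
        d = np.linalg.norm(self.centres - np.asarray(x), axis=1)
        return int(np.argmin(d))

def k_certainty_action(counts, state, n_actions, k, rng):
    """Prefer an action whose trial count in this state is still below k."""
    uncertain = [a for a in range(n_actions) if counts[state, a] < k]
    if uncertain:                                   # still exploring this state
        return int(rng.choice(uncertain))
    return int(rng.integers(n_actions))             # all actions here are k-certain

rng = np.random.default_rng(0)
quantizer = CellQuantizer(rng.uniform(-1.0, 1.0, size=(20, 2)))   # stand-in for GCS output
counts = np.zeros((20, 4), dtype=int)               # visit counts per (cell, action)

s = quantizer.state([0.3, -0.1])
a = k_certainty_action(counts, s, n_actions=4, k=3, rng=rng)
counts[s, a] += 1                                    # statistics accumulated for the SMDP model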

1985, Vol. 17 (2), pp. 424-442
Author(s): A. Federgruen, P. Zipkin

Special algorithms have been developed to compute an optimal (s, S) policy for an inventory model with discrete demand and under standard assumptions (stationary data, a well-behaved one-period cost function, full backlogging and the average cost criterion). We present here an iterative algorithm for continuous demand distributions which avoids any form of prior discretization. The method can be viewed as a modified form of policy iteration applied to a Markov decision process with continuous state space. For phase-type distributions, the calculations can be done in closed form.
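As a rough illustration of the model being optimized (not the authors' algorithm, which is a modified policy iteration with closed-form calculations for phase-type demand), this sketch estimates the long-run average cost of a fixed (s, S) policy under exponentially distributed demand, the simplest phase-type case. All parameter names and values are assumptions for illustration.

import random

def average_cost_sS(s, S, lam=1.0, K=10.0, h=1.0, b=5.0, periods=200_000, seed=1):
    """Monte Carlo estimate of average cost per period for an (s, S) policy."""
    rng = random.Random(seed)
    x, total = S, 0.0                          # start at the order-up-to level S
    for _ in range(periods):
        if x <= s:                             # review: order up to S when at or below s
            total += K                         # fixed ordering cost
            x = S
        x -= rng.expovariate(lam)              # continuous (exponential) demand draw
        total += h * max(x, 0) + b * max(-x, 0)   # holding cost or backlog penalty
    return total / periods

print(average_cost_sS(s=1.0, S=4.0))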


1974, Vol. 11 (4), pp. 669-677
Author(s): D. R. Grey

Results on the behaviour of Markov branching processes as time goes to infinity, hitherto obtained for models which assume a discrete state-space or discrete time or both, are here generalised to a model with both state-space and time continuous. The results are similar but the methods not always so.


1996, Vol. 6 (12a), pp. 2375-2388
Author(s): Markus Lohmann, Jan Wenzelburger

This paper introduces a statistical method for detecting cycles in discrete time dynamical systems. The continuous state space is replaced by a discrete one consisting of cells. Hashing is used to represent the cells in the computer’s memory. An algorithm for a two-parameter bifurcation analysis is presented which uses the statistical method to detect cycles in the discrete state space. The output of this analysis is a colored cartogram where parameter regions are marked according to the long-term behavior of the system. Moreover, the algorithm allows the computation of basins of attraction of cycles.
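A minimal sketch of the cell-and-hash idea for a one-dimensional map, using a Python dict as the hash table: iterate past the transient, assign each state to a cell, and report a cycle when the trajectory returns to a previously visited cell. The example map, cell size, and step limits are illustrative assumptions, not the paper's algorithm.

def detect_cycle(f, x0, cell_size=1e-4, transient=1000, max_steps=10_000):
    x = x0
    for _ in range(transient):              # discard transient behaviour
        x = f(x)
    seen = {}                               # hash table: cell index -> step of first visit
    for step in range(max_steps):
        cell = round(x / cell_size)         # hashable cell index for the 1-D case
        if cell in seen:
            return step - seen[cell]        # estimated cycle length
        seen[cell] = step
        x = f(x)
    return None                             # no cycle detected within max_steps

# Usage with the logistic map: r = 3.5 has a stable period-4 cycle.
logistic = lambda x, r=3.5: r * x * (1 - x)
print(detect_cycle(logistic, x0=0.4))       # expected output: 4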


Author(s): Hyungseok Song, Hyeryung Jang, Hai H. Tran, Se-eun Yoon, Kyunghwan Son, ...

We consider the Markov Decision Process (MDP) of selecting a subset of items at each step, termed the Select-MDP (S-MDP). The large state and action spaces of S-MDPs make them intractable to solve with typical reinforcement learning (RL) algorithms, especially when the number of items is huge. In this paper, we present a deep RL algorithm to solve this issue by adopting the following key ideas. First, we convert the original S-MDP into an Iterative Select-MDP (IS-MDP), which is equivalent to the S-MDP in terms of optimal actions. IS-MDP decomposes the joint action of selecting K items simultaneously into K iterative selections, reducing the number of actions at the expense of an exponential increase in the number of states. Second, we overcome this state space explosion by exploiting a special symmetry in IS-MDPs with novel weight-shared Q-networks, which provably maintain sufficient expressive power. Various experiments demonstrate that our approach works well even when the item space is large and that it scales to environments with item spaces different from those used in training.
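A toy sketch of the iterative-selection idea: rather than scoring all C(N, K) joint subsets, one scoring function with shared weights is applied to every remaining item, and K items are picked one at a time (here greedily, with a simple linear scorer standing in for the paper's weight-shared Q-networks). The feature layout, context encoding, and weights are illustrative assumptions.

import numpy as np

def iterative_select(features, K, w):
    """Select K item indices; the same weights w score every item at every step."""
    n, d = features.shape
    selected, remaining = [], set(range(n))
    for _ in range(K):                       # K iterative selections instead of one joint action
        context = features[selected].sum(axis=0) if selected else np.zeros(d)
        # Shared weights: w is reused across all items and all K selection steps.
        scores = {i: w @ np.concatenate([features[i], context]) for i in remaining}
        best = max(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 3))              # 8 candidate items, 3 features each
print(iterative_select(feats, K=3, w=rng.normal(size=6)))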

