Strong consistency of a modified maximum likelihood estimator for controlled Markov chains

1980 ◽  
Vol 17 (3) ◽ 
pp. 726-734 ◽  
Author(s):  
Bharat Doshi ◽  
Steven E. Shreve

A controlled Markov chain with finite state space has transition probabilities which depend on an unknown parameter α lying in a known finite set A. For each α, a stationary control law ϕα is given. This paper develops a control scheme whereby at each stage t a parameter αt is chosen at random from among those parameters which nearly maximize the log likelihood function, and the control ut is chosen according to the control law ϕαt. It is proved that this algorithm leads to identification of the true α under conditions weaker than any previously considered.
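A minimal sketch of this style of scheme, assuming a hypothetical two-parameter, two-state, two-control chain; all probabilities, the near-maximizer margin delta, and the control laws below are made up for illustration and are not the authors' construction:

```python
import random
import math

# Hypothetical setup: two candidate parameters, two states, two controls.
# p[alpha][u][i][j] = probability of moving i -> j under control u when
# the parameter is alpha.  All numbers are invented for illustration.
p = {
    "a": [[[0.8, 0.2], [0.3, 0.7]],   # control u = 0
          [[0.6, 0.4], [0.5, 0.5]]],  # control u = 1
    "b": [[[0.4, 0.6], [0.7, 0.3]],
          [[0.2, 0.8], [0.9, 0.1]]],
}
phi = {"a": lambda x: 0, "b": lambda x: 1}  # stationary control laws

def run(true_alpha, steps=2000, delta=1.0, seed=0):
    rng = random.Random(seed)
    loglik = {a: 0.0 for a in p}  # log likelihood of each candidate
    x = 0
    for _ in range(steps):
        best = max(loglik.values())
        # pick at random among parameters that nearly maximize the likelihood
        near = [a for a in p if loglik[a] >= best - delta]
        alpha_t = rng.choice(near)
        u = phi[alpha_t](x)
        x_next = rng.choices([0, 1], weights=p[true_alpha][u][x])[0]
        for a in p:  # update every candidate's log likelihood
            loglik[a] += math.log(p[a][u][x][x_next])
        x = x_next
    return max(loglik, key=loglik.get)

print(run("a"))  # the estimate typically settles on the true parameter
```

Randomizing among near-maximizers, rather than always taking the single argmax, is what drives identification in schemes of this kind.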



1973 ◽  
Vol 5 (2) ◽ 
pp. 328-339 ◽  
Author(s):  
John Bather

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a prescribed set depending on the state occupied at any time. Given the immediate cost for each choice, it is required to minimise the expected cost over an infinite future, without discounting. Various techniques are reviewed for the case when there is a finite set of possible transition matrices and an example is given to illustrate the unpredictable behaviour of policy sequences derived by backward induction. Further examples show that the existing methods may break down when there is an infinite family of transition matrices. A new approach is suggested, based on the idea of classifying the states according to their accessibility from one another.
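As an illustration of the backward-induction technique reviewed here, a toy undiscounted problem with two states and two candidate transition matrices per state (all numbers made up): the n-stage minimal expected cost divided by n approaches the optimal average cost per stage.

```python
# Toy average-cost control problem: two states, two actions, made-up
# immediate costs.  Backward induction computes the n-stage minimal
# expected cost V_n; V_n / n approaches the optimal average cost.
P = [  # P[a][i][j]: probability of moving i -> j under action a
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.1, 0.9], [0.8, 0.2]],
]
cost = [[1.0, 2.0], [0.0, 3.0]]  # cost[a][i]: immediate cost of a in state i

def backward_induction(n):
    V = [0.0, 0.0]          # terminal cost
    policy = None
    for _ in range(n):      # one backward step per stage
        newV, policy = [], []
        for i in range(2):
            q = [cost[a][i] + sum(P[a][i][j] * V[j] for j in range(2))
                 for a in range(2)]
            policy.append(min(range(2), key=q.__getitem__))
            newV.append(min(q))
        V = newV
    return V, policy

V, policy = backward_induction(500)
print([v / 500 for v in V], policy)  # per-stage cost and a minimizing policy
```

For this particular example the per-stage cost settles near 1.2 from either starting state; the paper's point is that the policy sequence produced along the way can behave unpredictably.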


1994 ◽  
Vol 8 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Madhav Desai ◽  
Sunil Kumar ◽  
P. R. Kumar

We consider time-inhomogeneous Markov chains on a finite state space, whose transition probabilities pij(t) = cij ε(t)^Vij are proportional to powers of a vanishingly small parameter ε(t). We determine the precise relationship between this chain and the corresponding time-homogeneous chains pij = cij ε^Vij, as ε ↘ 0. Let {πi(ε)} be the steady-state distribution of this time-homogeneous chain. We characterize the orders ηi in πi(ε) = Θ(ε^ηi). We show that if ε(t) ↘ 0 slowly enough, then the timewise occupation measures βi := sup{q > 0 | Σt ε(t)^q Prob(x(t) = i) = +∞}, called the recurrence orders, satisfy βi − βj = ηj − ηi. Moreover, if S0 := {i | ηi = minj ηj} is the set of ground states of the time-homogeneous chain, then x(t) → S0 in an appropriate sense whenever ε(t) is "cooled" slowly enough. We also show that there exists a critical rate ρ* such that x(t) → S0 if and only if Σt ε(t)^ρ* = +∞, and we characterize ρ* by a max-min expression over the transition graph. Finally, we provide a graph algorithm for determining the orders {ηi}, {βi} and the critical rate ρ*.
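The orders ηi can be observed numerically on a toy time-homogeneous chain (constants cij = 1 and exponents Vij invented for this sketch) by computing the stationary distribution at two small values of ε and comparing logarithms:

```python
import numpy as np

# Illustrative 3-state chain with p_ij proportional to eps**V[i][j] for
# i != j (constants c_ij = 1, exponents V made up); the leftover mass
# sits on the diagonal.  Stationary probabilities scale as Theta(eps**eta_i).
V = [[0, 1, 2],
     [1, 0, 2],
     [1, 2, 0]]

def stationary(eps):
    P = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            if i != j:
                P[i, j] = eps ** V[i][j]
        P[i, i] = 1.0 - P[i].sum()
    # solve pi P = pi together with sum(pi) = 1
    A = np.vstack([P.T - np.eye(3), np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# estimate the orders eta_i from two small values of eps
e1, e2 = 1e-3, 1e-4
eta = np.log(stationary(e1) / stationary(e2)) / np.log(e1 / e2)
print(np.round(eta, 2))  # approximately [0, 0, 1] for this example
```

For this choice of exponents, states 0 and 1 are ground states (η = 0) while state 2 has stationary mass of order ε.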


1973 ◽  
Vol 5 (3) ◽  
pp. 521-540 ◽  
Author(s):  
John Bather

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a given convex family of distributions depending on the present state. The immediate cost is prescribed for each choice and it is required to minimise the average expected cost over an infinite future. The paper considers a special case of this general problem and provides the foundation for a general solution. The main result is that an optimal policy exists if each state of the system can be reached with positive probability from any other state by choosing a suitable policy.


2017 ◽  
Vol 50 (3) ◽  
pp. 1535
Author(s):  
C. Panorias ◽  
A. Papadopoulou ◽  
T. Tsapanos

In the present paper, the earthquake occurrences in the area of Japan are studied by a semi-Markov model which is considered homogeneous in time. The data refer to earthquakes of large magnitude (Mw > 6.0) during the period 1900-2012. We consider 9 seismic zones derived from the typical 11 zones for the area of Japan, due to the lack of data for 3 zones (the 9th, 10th and 11th). Also, we define 3 groups for the magnitudes, corresponding to 6.0-7.0, 7.1-8.0 and M > 8.0. Thus, we consider for our semi-Markov model a finite state space S = {(Zi, Rj) | i = 1,...,9, j = 1,2,3}, where Zi denotes the i-th seismic zone and Rj the j-th magnitude class. We apply the data to describe the interval transition probabilities for the states and the model's limiting behaviour, for which an interval of seven years is sufficient. The time unit of the model is considered to be one day. Some interesting results, concerning the interval transition probabilities and the limiting state vector, are derived.
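A sketch of the state-space bookkeeping with a synthetic stand-in for the earthquake catalogue; the real model is semi-Markov (it also tracks holding times), and only the embedded-chain transition counts are illustrated here:

```python
import random
from collections import defaultdict

# States are pairs (Z_i, R_j): seismic zone i in 1..9, magnitude class
# j in 1..3, giving 27 states.  The event sequence below is synthetic,
# a stand-in for an ordered earthquake catalogue.
rng = random.Random(0)
states = [(i, j) for i in range(1, 10) for j in range(1, 4)]
events = [rng.choice(states) for _ in range(5000)]

# count transitions between successive events
counts = defaultdict(lambda: defaultdict(int))
for s, t in zip(events, events[1:]):
    counts[s][t] += 1

# normalize counts into empirical transition probabilities
P = {}
for s, row in counts.items():
    total = sum(row.values())
    P[s] = {t: c / total for t, c in row.items()}

# every estimated row is a probability distribution over successor states
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in P.values())
```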


1970 ◽  
Vol 7 (3) ◽  
pp. 771-775
Author(s):  
I. V. Basawa

Let {Xk}, k = 1, 2, ··· be a sequence of random variables forming a homogeneous Markov chain on a finite state-space, S = {1, 2, ···, s}. Xk could be thought of as the state at time k of some physical system for which pij are the (one-step) transition probabilities. It is assumed that all the states are inter-communicating, so that the transition matrix P = ((pij)) is irreducible.
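Irreducibility of a finite transition matrix can be checked by a reachability search from every state; a small made-up example:

```python
# Check that a finite transition matrix is irreducible, i.e. every state
# can reach every other state along positive-probability edges.
def irreducible(P):
    n = len(P)
    for start in range(n):
        seen, stack = {start}, [start]
        while stack:  # depth-first search over positive-probability edges
            i = stack.pop()
            for j in range(n):
                if P[i][j] > 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if len(seen) < n:  # some state is unreachable from `start`
            return False
    return True

P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
print(irreducible(P))  # True: all three states inter-communicate
```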


1994 ◽  
Vol 8 (1) ◽  
pp. 51-68
Author(s):  
Masaaki Kijima

This article considers separation for a birth-death process on a finite state space S = {1, 2, …, N}. Separation is defined by si(t) = 1 − minj∈S Pij(t)/πj, as in Fill [5,6], where Pij(t) denotes the transition probabilities of the birth-death process and πj the stationary probabilities. Separation is a measure of nonstationarity of Markov chains and provides an upper bound on the variation distance. Easily computable upper bounds for si(t) are given, which consist of simple exponential functions whose parameters are the eigenvalues of the infinitesimal generator of the birth-death process or of its submatrices.
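The separation quantity can be computed directly for a small birth-death chain (birth and death rates below are made up), using the spectral decomposition of the generator to form Pij(t):

```python
import numpy as np

# Separation s_i(t) = 1 - min_j P_ij(t) / pi_j for a 3-state birth-death
# chain with made-up rates; P(t) = exp(Qt) is computed by diagonalizing
# the generator Q (birth-death generators have real spectra).
birth = [1.0, 2.0, 0.0]   # birth rate in states 0, 1, 2
death = [0.0, 1.0, 3.0]   # death rate in states 0, 1, 2
n = 3
Q = np.zeros((n, n))
for i in range(n):
    if i + 1 < n:
        Q[i, i + 1] = birth[i]
    if i > 0:
        Q[i, i - 1] = death[i]
    Q[i, i] = -Q[i].sum()   # rows of a generator sum to zero

# stationary distribution via detailed balance: pi_{i+1}/pi_i = b_i/d_{i+1}
w = np.cumprod([1.0] + [birth[i] / death[i + 1] for i in range(n - 1)])
pi = w / w.sum()

evals, V = np.linalg.eig(Q)
Vinv = np.linalg.inv(V)

def separation(t):
    Pt = ((V * np.exp(evals * t)) @ Vinv).real  # matrix exponential exp(Qt)
    return 1.0 - (Pt / pi).min(axis=1)          # s_i(t) for every state i

print(separation(0.1), separation(5.0))  # separation decays toward 0
```

The decay rate visible here is governed by the largest nonzero eigenvalue of Q, which is exactly the kind of eigenvalue bound the article makes precise.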


