Bias Optimality in Controlled Queueing Systems

Moshe Haviv; Martin L. Puterman

doi:10.1239/jap/1032192558

Bias Optimality in Controlled Queueing Systems

Journal of Applied Probability ◽

10.1017/s0021900200014741 ◽

1998 ◽

Vol 35 (01) ◽

pp. 136-150 ◽

Cited By ~ 3

Author(s):

Moshe Haviv ◽

Martin L. Puterman

Keyword(s):

Queueing System ◽

Queueing Systems ◽

Control Limit ◽

Structure Of Solutions ◽

Stationary Policy ◽

Optimality Equation ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bias Optimality ◽

Control Limits

This paper studies an admission control M/M/1 queueing system. It shows that the only gain (average) optimal stationary policies with gain and bias which satisfy the optimality equation are of control limit type, that there are at most two and, if there are two, they occur consecutively. Conditions are provided which ensure the existence of two gain optimal control limit policies and are illustrated with an example. The main result is that bias optimality distinguishes these two gain optimal policies and that the larger of the two control limits is the unique bias optimal stationary policy. Consequently it is also Blackwell optimal. This result is established by appealing to the third optimality equation of the Markov decision process and some observations concerning the structure of solutions of the second optimality equation.
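The gain and bias of a control limit policy can be illustrated numerically. The following is a minimal sketch, not the authors' method: it assumes a uniformized discrete-time chain with `lam + mu == 1`, a lump reward collected on each admitted arrival, and a holding cost per step, and it evaluates the policy only on its recurrent states `0..limit`. The function name `evaluate_control_limit` and all parameter names are hypothetical.

```python
def evaluate_control_limit(limit, lam, mu, reward, holding_cost):
    """Gain and bias of the policy 'admit iff s < limit' on states 0..limit.

    Assumption (illustrative only): uniformized chain with lam + mu == 1,
    so lam and mu act as one-step arrival and departure probabilities.
    """
    states = range(limit + 1)
    # Expected one-step reward: collect `reward` on an admitted arrival,
    # pay the holding cost for the current number of customers.
    r = [(lam * reward if s < limit else 0.0) - holding_cost(s) for s in states]

    # Birth-death stationary distribution: pi(s) proportional to (lam/mu)**s.
    weights = [(lam / mu) ** s for s in states]
    total = sum(weights)
    pi = [w / total for w in weights]
    gain = sum(p * x for p, x in zip(pi, r))

    # Evaluation equation g + h(s) = r(s) + lam*h(s+1) + mu*h(s-1) yields
    # lam*d(s) = g - r(s) + mu*d(s-1), where d(s) = h(s+1) - h(s), d(-1) = 0.
    h = [0.0]
    d_prev = 0.0
    for s in range(limit):
        d = (gain - r[s] + mu * d_prev) / lam
        h.append(h[-1] + d)
        d_prev = d

    # Shift h so that pi . h = 0; this normalisation singles out the bias.
    shift = sum(p * x for p, x in zip(pi, h))
    bias = [x - shift for x in h]
    return gain, bias
```

Calling the function for successive values of `limit` with the same parameters gives one way to compare candidate control limits by gain first and, when gains tie, by bias on the states they share.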


Average Cost Semi-Markov Decision Processes and the Control of Queueing Systems

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800001121 ◽

1989 ◽

Vol 3 (2) ◽

pp. 247-272 ◽

Cited By ~ 47

Author(s):

Linn I. Sennott

Keyword(s):

Markov Decision Processes ◽

Average Cost ◽

Queueing Systems ◽

Decision Processes ◽

Single Server ◽

Stationary Policy ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Poisson Arrivals ◽

Action Spaces

Semi-Markov decision processes underlie the control of many queueing systems. In this paper, we deal with infinite state semi-Markov decision processes with nonnegative, unbounded costs and finite action sets. Axioms for the existence of an expected average cost optimal stationary policy are presented. These conditions generalize the work in Sennott [22] for Markov decision processes. Verifiable conditions for the axioms to hold are obtained. The theory is applied to control of the M/G/1 queue with variable service parameter, with on-off server, and with batch processing, and to control of the G/M/m queue with variable arrival parameter and customer rejection. It is applied to a timesharing network of queues with a single server and finally to optimal routing of Poisson arrivals to parallel exponential servers. The final section extends the existence result to compact action spaces.


The Average Cost Optimality Equation and Critical Number Policies

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800002783 ◽

1993 ◽

Vol 7 (1) ◽

pp. 47-67 ◽

Cited By ~ 13

Author(s):

Linn I. Sennott

Keyword(s):

Average Cost ◽

Critical Number ◽

Stationary Policy ◽

Optimality Equation ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Optimality Inequality ◽

Positive Recurrent ◽

Average Cost Optimality Equation ◽

Cost Optimality

We consider a Markov decision chain with countable state space, finite action sets, and nonnegative costs. Conditions for the average cost optimality inequality to be an equality are derived. This extends work of Cavazos-Cadena [8]. It is shown that an optimal stationary policy must satisfy the optimality equation at all positive recurrent states. Structural results on the chain induced by an optimal stationary policy are derived. The results are employed in two examples to prove that any optimal stationary policy must be of critical number form.


EXISTENCE OF OPTIMAL STATIONARY POLICIES IN FINITE DYNAMIC PROGRAMS WITH NONNEGATIVE REWARDS

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964801154082 ◽

2001 ◽

Vol 15 (4) ◽

pp. 557-564 ◽

Cited By ~ 1

Author(s):

Rolando Cavazos-Cadena ◽

Raúl Montes-de-Oca

Keyword(s):

Control Policy ◽

Stationary Policy ◽

Reward Function ◽

Total Reward ◽

Dynamic Programs ◽

Finite State ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Action Spaces ◽

Discounted Criterion

This article concerns Markov decision chains with finite state and action spaces, and a control policy is graded via the expected total-reward criterion associated to a nonnegative reward function. Within this framework, a classical theorem guarantees the existence of an optimal stationary policy whenever the optimal value function is finite, a result that is obtained via a limit process using the discounted criterion. The objective of this article is to present an alternative approach, based entirely on the properties of the expected total-reward index, to establish such an existence result.


Sample-Path Optimal Stationary Policies in Stable Markov Decision Chains with the Average Reward Criterion

Journal of Applied Probability ◽

10.1239/jap/1437658607 ◽

2015 ◽

Vol 52 (2) ◽

pp. 419-440

Author(s):

Rolando Cavazos-Cadena ◽

Raúl Montes-de-Oca ◽

Karel Sladký

Keyword(s):

Sample Path ◽

Point Of View ◽

Average Reward ◽

Stationary Policy ◽

Optimality Equation ◽

Markov Decision ◽

Average Reward Criterion ◽

Compact Action Sets ◽

Path Point ◽

Reward Criterion

This paper concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that if the expected average reward associated to ℓ² is finite under any policy then a stationary policy obtained from the optimality equation in the standard way is sample-path average optimal in a strong sense.


New discount and average optimality conditions for continuous-time Markov decision processes

Advances in Applied Probability ◽

10.1017/s000186780000447x ◽

2010 ◽

Vol 42 (04) ◽

pp. 953-985 ◽

Cited By ~ 2

Author(s):

Xianping Guo ◽

Liuer Ye

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Nonnegative Solution ◽

Decision Processes ◽

Stationary Policy ◽

Discounted Cost ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bounded Below

This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under the semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new average cost, minimum nonnegative solution method, we prove the existence of an average cost optimal stationary policy under some reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Except that the costs are assumed to be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.


A note on bias optimality in controlled queueing systems

Journal of Applied Probability ◽

10.1239/jap/1014842288 ◽

2000 ◽

Vol 37 (1) ◽

pp. 300-305 ◽

Cited By ~ 12

Author(s):

Mark E. Lewis ◽

Martin L. Puterman

Keyword(s):

Optimal Policy ◽

Queueing Systems ◽

Control Limit ◽

Lower Control Limit ◽

Optimal Policies ◽

Holding Cost ◽

Bias Optimality ◽

Number Of Customers

The use of bias optimality to distinguish among gain optimal policies was recently studied by Haviv and Puterman [1] and extended in Lewis et al. [2]. In [1], upon arrival to an M/M/1 queue, customers offer the gatekeeper a reward R. If accepted, the gatekeeper immediately receives the reward, but is charged a holding cost, c(s), depending on the number of customers in the system. The gatekeeper, whose objective is to ‘maximize’ rewards, must decide whether to admit the customer. If the customer is accepted, the customer joins the queue and awaits service. Haviv and Puterman [1] showed there can be only two Markovian, stationary, deterministic gain optimal policies and that only the policy which uses the larger control limit is bias optimal. This showed the usefulness of bias optimality to distinguish between gain optimal policies. In the same paper, they conjectured that if the gatekeeper receives the reward upon completion of a job instead of upon entry, the bias optimal policy will be the lower control limit. This note confirms that conjecture.


New discount and average optimality conditions for continuous-time Markov decision processes

Advances in Applied Probability ◽

10.1239/aap/1293113146 ◽

2010 ◽

Vol 42 (4) ◽

pp. 953-985 ◽

Cited By ~ 9

Author(s):

Xianping Guo ◽

Liuer Ye

Keyword(s):

Markov Decision Processes ◽

Continuous Time ◽

Average Cost ◽

Nonnegative Solution ◽

Decision Processes ◽

Stationary Policy ◽

Discounted Cost ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Bounded Below

This paper deals with continuous-time Markov decision processes in Polish spaces, under the discounted and average cost criteria. All underlying Markov processes are determined by given transition rates which are allowed to be unbounded, and the costs are assumed to be bounded below. By introducing an occupation measure of a randomized Markov policy and analyzing properties of occupation measures, we first show that the family of all randomized stationary policies is ‘sufficient’ within the class of all randomized Markov policies. Then, under the semicontinuity and compactness conditions, we prove the existence of a discounted cost optimal stationary policy by providing a value iteration technique. Moreover, by developing a new average cost, minimum nonnegative solution method, we prove the existence of an average cost optimal stationary policy under some reasonably mild conditions. Finally, we use some examples to illustrate applications of our results. Except that the costs are assumed to be bounded below, the conditions for the existence of discounted cost (or average cost) optimal policies are much weaker than those in the previous literature, and the minimum nonnegative solution approach is new.


Average optimality for Markov decision processes in Borel spaces: a new condition and approach

Journal of Applied Probability ◽

10.1017/s0021900200001662 ◽

2006 ◽

Vol 43 (02) ◽

pp. 318-334

Author(s):

Xianping Guo ◽

Quanxin Zhu

Keyword(s):

Markov Decision Processes ◽

Discrete Time ◽

Existence Of Solutions ◽

Sufficient Conditions ◽

Decision Processes ◽

Stationary Policy ◽

Markov Decision ◽

Optimal Stationary Policy ◽

Optimality Inequality ◽

Action Spaces

In this paper we study discrete-time Markov decision processes with Borel state and action spaces. The criterion is to minimize average expected costs, and the costs may have neither upper nor lower bounds. We first provide two average optimality inequalities of opposing directions and give conditions for the existence of solutions to them. Then, using the two inequalities, we ensure the existence of an average optimal (deterministic) stationary policy under additional continuity-compactness assumptions. Our conditions are slightly weaker than those in the previous literature. Also, some new sufficient conditions for the existence of an average optimal stationary policy are imposed on the primitive data of the model. Moreover, our approach is slightly different from the well-known ‘optimality inequality approach’ widely used in Markov decision processes. Finally, we illustrate our results in two examples.


The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms

Journal of Applied Probability ◽

10.2307/3213407 ◽

1978 ◽

Vol 15 (2) ◽

pp. 356-373 ◽

Cited By ~ 34

Author(s):

A. Federgruen ◽

H. C. Tijms

Keyword(s):

Iteration Method ◽

Average Cost ◽

Bounded Solution ◽

Transition Probability ◽

Value Iteration ◽

Stationary Policy ◽

Optimality Equation ◽

Markov Decision ◽

Markov Decision Model ◽

Probability Matrices

This paper is concerned with the optimality equation for the average costs in a denumerable state semi-Markov decision model. It will be shown that under each of a number of recurrency conditions on the transition probability matrices associated with the stationary policies, the optimality equation has a bounded solution. This solution indeed yields a stationary policy which is optimal for a strong version of the average cost optimality criterion. Besides the existence of a bounded solution to the optimality equation, we will show that both the value-iteration method and the policy-iteration method can be used to determine such a solution. For the latter method we will prove that the average costs and the relative cost functions of the policies generated converge to a solution of the optimality equation.
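To make the value-iteration idea concrete, here is a minimal sketch of relative value iteration for the average cost criterion. It is an illustration under simplifying assumptions rather than the algorithm analysed in the paper: the model is a finite-state, discrete-time Markov decision process (not a denumerable-state semi-Markov one), the chain is assumed unichain and aperiodic, and the function name `relative_value_iteration` and the data layout are hypothetical.

```python
def relative_value_iteration(costs, transitions, ref_state=0,
                             tol=1e-10, max_iter=100_000):
    """Approximate a solution (g, h) of the average cost optimality equation.

    costs[s][a]       : one-step cost of action a in state s
    transitions[s][a] : list of (next_state, probability) pairs
    Assumes a finite, unichain, aperiodic model (illustrative assumption).
    """
    n = len(costs)
    h = [0.0] * n  # relative cost function, kept at 0 in ref_state
    for _ in range(max_iter):
        # One-step lookahead value of every state-action pair.
        q = [[costs[s][a] + sum(p * h[t] for t, p in transitions[s][a])
              for a in range(len(costs[s]))]
             for s in range(n)]
        th = [min(row) for row in q]
        g = th[ref_state]            # current estimate of the average cost
        new_h = [v - g for v in th]  # renormalise so new_h[ref_state] == 0
        if max(abs(x - y) for x, y in zip(new_h, h)) < tol:
            policy = [row.index(min(row)) for row in q]
            return g, new_h, policy
        h = new_h
    raise RuntimeError("relative value iteration did not converge")
```

On return, `g` estimates the minimal average cost, `h` the relative cost function normalised to zero at `ref_state`, and `policy` lists a minimising action index for each state.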
