Computing Optimal Policies for Controlled Tandem Queueing Systems

1987 ◽  
Vol 35 (1) ◽  
pp. 121-126 ◽  
Author(s):  
Katsuhisa Ohno ◽  
Kuniyoshi Ichiki

1997 ◽  
Vol 29 (1) ◽  
pp. 114-137
Author(s):  
Linn I. Sennott

This paper studies the expected average cost control problem for discrete-time Markov decision processes with denumerably infinite state spaces. A sequence of finite state space truncations is defined such that the average costs and average optimal policies in the sequence converge to the optimal average cost and an optimal policy in the original process. The theory is illustrated with several examples from the control of discrete-time queueing systems. Numerical results are discussed.
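The truncation scheme described in this abstract can be sketched numerically. The snippet below runs relative value iteration on a truncated discrete-time single-server queue with a two-speed service control; the model, costs, and all parameter values are illustrative assumptions, not the paper's examples. As the truncation level N grows, the computed average cost settles down, mirroring the convergence result.

```python
import numpy as np

def truncated_avg_cost(N, p=0.3, q=0.5, iters=5000):
    """Relative value iteration on states 0..N of a discrete-time queue.
    Per slot, at most one event occurs: an arrival (prob p) or a service
    completion (prob mu, action-dependent). Actions: slow service q/2 at
    no extra cost, or fast service q at extra cost 1; holding cost = s.
    All of these modeling choices are hypothetical."""
    h = np.zeros(N + 1)   # relative value function
    g = 0.0               # average-cost estimate
    for _ in range(iters):
        Th = np.empty(N + 1)
        for s in range(N + 1):
            best = np.inf
            for mu, extra in ((q / 2, 0.0), (q, 1.0)):
                mu_eff = mu if s > 0 else 0.0
                up = min(s + 1, N)     # arrival (blocked at N)
                down = max(s - 1, 0)   # service completion
                ev = p * h[up] + mu_eff * h[down] + (1 - p - mu_eff) * h[s]
                best = min(best, s + extra + ev)
            Th[s] = best
        g = Th[0]          # normalize the relative values at state 0
        h = Th - g
    return g

# The average cost stabilizes as the truncation level N increases.
for N in (5, 10, 20, 40):
    print(N, round(float(truncated_avg_cost(N)), 4))
```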


2000 ◽  
Vol 37 (1) ◽  
pp. 300-305 ◽  
Author(s):  
Mark E. Lewis ◽  
Martin L. Puterman

The use of bias optimality to distinguish among gain optimal policies was recently studied by Haviv and Puterman [1] and extended in Lewis et al. [2]. In [1], upon arrival to an M/M/1 queue, customers offer the gatekeeper a reward R. If accepted, the gatekeeper immediately receives the reward, but is charged a holding cost, c(s), depending on the number of customers in the system. The gatekeeper, whose objective is to ‘maximize’ rewards, must decide whether to admit the customer. If the customer is accepted, the customer joins the queue and awaits service. Haviv and Puterman [1] showed there can be only two Markovian, stationary, deterministic gain optimal policies and that only the policy which uses the larger control limit is bias optimal. This showed the usefulness of bias optimality to distinguish between gain optimal policies. In the same paper, they conjectured that if the gatekeeper receives the reward upon completion of a job instead of upon entry, the bias optimal policy will be the lower control limit. This note confirms that conjecture.
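The gain of each control-limit policy in this setting can be evaluated in closed form from the stationary distribution of the resulting M/M/1/L queue. The sketch below does exactly that for a linear holding cost; the arrival rate, service rate, reward R, and cost rate are made-up values, not the paper's.

```python
import numpy as np

def gain(L, lam=1.0, mu=2.0, R=3.0, c=1.0):
    """Long-run average reward of the control-limit-L policy: admit an
    arriving customer iff fewer than L are present. Reward R per admitted
    customer, holding cost c per customer per unit time. Parameter values
    are illustrative assumptions."""
    rho = lam / mu
    # stationary distribution of the M/M/1/L birth-death chain
    pi = np.array([rho ** n for n in range(L + 1)])
    pi /= pi.sum()
    admit_rate = lam * (1 - pi[L])                  # arrivals that find room
    holding = c * np.dot(np.arange(L + 1), pi)      # mean holding cost rate
    return R * admit_rate - holding

# Gain rises and then falls as the control limit L grows.
for L in range(1, 7):
    print(L, round(float(gain(L)), 4))
```

With these particular numbers a single control limit (L = 3) is gain optimal; bias optimality only comes into play when two adjacent limits tie in gain, which is the degenerate case the paper analyzes.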


1994 ◽  
Vol 26 (1) ◽  
pp. 155-171 ◽  
Author(s):  
Panayotis D. Sparaggis ◽  
Don Towsley ◽  
Christos G. Cassandras

We present two forms of weak majorization, namely, very weak majorization and p-weak majorization that can be used as sample path criteria in the analysis of queueing systems. We demonstrate how these two criteria can be used in making comparisons among the joint queue lengths of queueing systems with blocking and/or multiple classes, by capturing an interesting interaction between state and performance descriptors. As a result, stochastic orderings on performance measures such as the cumulative number of losses can be derived. We describe applications that involve the determination of optimal policies in the context of load-balancing and scheduling.
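The baseline notion these criteria weaken is standard weak (sub)majorization, which is easy to check componentwise on sorted partial sums. The helper below is a minimal sketch of that standard definition, for orientation; the paper's 'very weak' and 'p-weak' variants relax it further and are not implemented here.

```python
def weakly_majorizes(x, y):
    """True if x weakly submajorizes y: sorting both vectors in
    decreasing order, every partial sum of x is at least the
    corresponding partial sum of y. Standard textbook definition."""
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    sx = sy = 0
    for a, b in zip(xs, ys):
        sx += a
        sy += b
        if sx < sy:
            return False
    return True

print(weakly_majorizes([4, 1, 1], [2, 2, 2]))  # True
print(weakly_majorizes([2, 2, 2], [4, 1, 1]))  # False
```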


2016 ◽  
Vol 67 (4) ◽  
pp. 629-643 ◽  
Author(s):  
Rob Shone ◽  
Vincent A Knight ◽  
Paul R Harper ◽  
Janet E Williams ◽  
John Minty

1988 ◽  
Vol 20 (2) ◽  
pp. 447-472 ◽  
Author(s):  
Tze Leung Lai ◽  
Zhiliang Ying

Asymptotic approximations are developed herein for the optimal policies in discounted multi-armed bandit problems in which new projects are continually appearing, commonly known as ‘open bandit problems’ or ‘arm-acquiring bandits’. It is shown that under certain stability assumptions the open bandit problem is asymptotically equivalent to a closed bandit problem in which there is no arrival of new projects, as the discount factor approaches 1. Applications of these results to optimal scheduling of queueing networks are given. In particular, Klimov's priority indices for scheduling queueing networks are shown to be limits of the Gittins indices for the associated closed bandit problem, and extensions of Klimov's results to preemptive policies and to unstable queueing systems are given.
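In the simplest special case, with linear holding costs and no feedback routing, Klimov's priority indices reduce to the classic c·μ rule: serve the class with the largest product of holding-cost rate and service rate. The helper below sketches only that special case; the class costs and service rates are made-up numbers.

```python
def cmu_priorities(costs, rates):
    """c*mu priority indices for multiclass single-server scheduling
    (linear holding costs, no feedback), the simplest instance of
    Klimov's index rule. Returns the indices and the class ordering
    from highest to lowest priority (ties broken by class number)."""
    idx = [c * m for c, m in zip(costs, rates)]
    order = sorted(range(len(idx)), key=lambda k: -idx[k])
    return idx, order

# Hypothetical classes: costs 4, 1, 2 and service rates 0.5, 3, 1.
indices, order = cmu_priorities([4.0, 1.0, 2.0], [0.5, 3.0, 1.0])
print(indices)  # [2.0, 3.0, 2.0]
print(order)    # class 1 first; classes 0 and 2 tie
```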


