Bias optimal admission control policies for a multiclass nonstationary queueing system

2002 ◽ Vol 39 (1) ◽ pp. 20-37 ◽ Author(s): Mark E. Lewis, Hayriye Ayhan, Robert D. Foley

We consider a finite-capacity queueing system where arriving customers offer rewards which are paid upon acceptance into the system. The gatekeeper, whose objective is to ‘maximize’ rewards, decides whether the offered reward is sufficient and accepts or rejects the arriving customer accordingly. Suppose the arrival rates, service rates, and system capacity change over time in a known manner. We show that all bias optimal (a refinement of long-run average reward optimal) policies are of threshold form. Furthermore, we give sufficient conditions for the bias optimal policy to be monotonic in time. We show, via a counterexample, that if these conditions are violated, the optimal policy may not be monotonic in time or of threshold form.
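
To make the threshold structure concrete, here is a minimal sketch of the stationary special case (fixed rates and capacity, a finite set of offered rewards): relative value iteration on the uniformized M/M/1/N chain. All parameters (lam, mu, N, rewards, probs) are invented for illustration and are not taken from the paper, which treats time-varying rates and capacity.

```python
import numpy as np

lam, mu, N = 1.0, 1.2, 10             # arrival rate, service rate, capacity (invented)
rewards = np.array([1.0, 2.0, 4.0])   # possible offered rewards (invented)
probs = np.array([0.5, 0.3, 0.2])     # probability of each offer
C = lam + mu                          # uniformization constant

h = np.zeros(N + 1)                   # relative value function over occupancy
for _ in range(5000):
    def arrival_value(x):
        # accept an offer R iff R + h[x+1] >= h[x]; a full system must reject
        if x == N:
            return h[x]
        return probs @ np.maximum(rewards + h[x + 1], h[x])
    new = np.array([(lam * arrival_value(x) + mu * h[max(x - 1, 0)]) / C
                    for x in range(N + 1)])
    new -= new[0]                     # relative value iteration normalization
    if np.max(np.abs(new - h)) < 1e-10:
        break
    h = new

for x in range(N):                    # smallest acceptable reward per state:
    ok = rewards + h[x + 1] >= h[x]   # typically monotone in the occupancy x
    print(x, rewards[ok][0] if ok.any() else "reject all")
```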


1999 ◽ Vol 13 (3) ◽ pp. 309-327 ◽ Author(s): Mark E. Lewis, Hayriye Ayhan, Robert D. Foley

We consider a finite-capacity queueing system in which each arriving customer offers a reward. A gatekeeper decides, based on the reward offered and the space remaining, whether each arriving customer should be accepted or rejected. The gatekeeper receives the offered reward only if the customer is accepted. A traditional objective function is to maximize the gain, that is, the long-run average reward. It is quite possible, however, to have several gain optimal policies that behave quite differently. Bias and Blackwell optimality are more refined objective functions that can distinguish among multiple stationary, deterministic gain optimal policies. This paper focuses on describing the structure of stationary, deterministic, optimal policies and extending this optimality criterion to distinguish among multiple gain optimal policies. We show that these policies are of trunk reservation form and must occur consecutively. We then prove that we can distinguish among these gain optimal policies using the bias or transient reward, and we extend the results to Blackwell optimality.
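
As a concrete illustration of a trunk reservation rule (not the paper's algorithm), the sketch below simulates a two-class M/M/1/N system in which class j is admitted only while the occupancy is below a class-specific threshold t[j], so some capacity is held back for the more profitable class. All rates, rewards, and thresholds are invented.

```python
import random

random.seed(0)
N = 10                                # capacity (invented)
lam = [0.6, 0.6]                      # class arrival rates (invented)
mu = 1.0                              # service rate (invented)
reward = [5.0, 1.0]                   # class 0 pays more than class 1
t = [N, 6]                            # thresholds: N - 6 slots reserved for class 0

x, clock, earned = 0, 0.0, 0.0
for _ in range(200_000):
    total = lam[0] + lam[1] + (mu if x > 0 else 0.0)
    clock += random.expovariate(total)
    u = random.random() * total
    if x > 0 and u < mu:              # service completion
        x -= 1
    else:                             # arrival: classify it, then admit/reject
        j = 0 if u < mu * (x > 0) + lam[0] else 1
        if x < t[j]:                  # trunk reservation admission rule
            x += 1
            earned += reward[j]

print("long-run average reward per unit time ~", earned / clock)
```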


1994 ◽ Vol 8 (4) ◽ pp. 463-489 ◽ Author(s): Eugene A. Feinberg, Martin I. Reiman

We consider a controlled queueing system that is a generalization of the M/M/c/W queue. There are m types of customers that arrive according to independent Poisson processes. Service times are exponential and independent and do not depend on the customer type. There is room in the system for a total of N customers; if there are N customers in the system, new arrivals are lost. Type j customers are more profitable than type (j + 1) customers, j = 2, …, m − 1, and type 1 customers are at least as profitable as type 2 customers. The allowed control is to accept or reject customers at arrival. No preemption of customers in service is allowed. The goal is to maximize the average reward per unit of time subject to a constraint that the blocking probability of type 1 customers is no greater than a given level. For an M/M/c/c system without a constraint, Miller [12] proved that an optimal policy has a simple threshold structure. We show that, for the constrained problem described above, an optimal policy has a similar structure, but one of the thresholds might have to be randomized. We also derive an algorithm that constructs an optimal policy and describe other forms of optimal policies.
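
To see the shape of such a policy, the following sketch (an invented two-type birth-death instance, not the paper's construction) computes the type-1 blocking probability when type-2 arrivals are admitted below a threshold n0, admitted with probability q exactly at n0, and rejected above it; varying q is what lets a randomized threshold meet the constraint exactly.

```python
import numpy as np

c, N = 3, 8                           # servers and capacity (invented)
lam1, lam2, mu = 1.0, 2.0, 1.0        # rates (invented)
n0 = 5                                # type-2 admission threshold (invented)

def blocking_type1(q):
    """Type-1 blocking probability when type 2 is admitted w.p. q at n0."""
    def birth(n):                     # total admission rate at occupancy n
        if n < n0:
            return lam1 + lam2
        if n == n0:
            return lam1 + q * lam2
        return lam1
    pi = [1.0]                        # unnormalized birth-death balance
    for n in range(N):
        pi.append(pi[-1] * birth(n) / (mu * min(n + 1, c)))
    pi = np.array(pi)
    return pi[N] / pi.sum()           # type 1 is blocked only when full

for q in (0.0, 0.5, 1.0):             # admitting more type 2 raises blocking
    print(q, blocking_type1(q))
```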


2001 ◽ Vol 38 (2) ◽ pp. 369-385 ◽ Author(s): Mark E. Lewis

We consider a controlled M/M/1 queueing system where customers may be subject to two potential rejections. The first occurs upon arrival and depends on the number of customers in the queue and the service rate of the customer currently in service. The second, which need not occur at all, takes place immediately before the customer receives service. That is, after each service completion the customer at the front of the queue is assessed and the service rate of that customer is revealed. If the second decision-maker recommends rejection, the customer is denied service with a fixed probability. We show the existence of long-run average optimal monotone switching-curve policies. Further, we show that the average reward is increasing in the probability that the second decision-maker's recommendation of rejection is honored. Applications include call centers with delayed classifications and manufacturing systems in which the server is responsible for multiple tasks.
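
A minimal sketch of what such a two-stage policy looks like. The curve values, rates, and honoring probability below are invented; the paper proves existence and monotonicity of such policies, it does not prescribe these numbers.

```python
import random

rates = [0.5, 1.0, 2.0]               # possible service rates (invented)
curve = {0.5: 2, 1.0: 4, 2.0: 7}      # admission thresholds, nondecreasing in rate

def admit_on_arrival(queue_len, rate_in_service):
    # First decision: accept iff the queue is below the switching curve
    return queue_len < curve[rate_in_service]

def reject_before_service(revealed_rate, honored_prob=0.8):
    # Second decision: recommend rejecting slow customers; the
    # recommendation is honored only with a fixed probability
    recommend_reject = revealed_rate < 1.0
    return recommend_reject and random.random() < honored_prob

print(admit_on_arrival(3, 0.5), admit_on_arrival(3, 2.0))   # False True
```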


1983 ◽ Vol 15 (2) ◽ pp. 274-303 ◽ Author(s): Arie Hordijk, Frank A. Van Der Duyn Schouten

Recently the authors introduced the concept of Markov decision drift processes. A Markov decision drift process can be seen as a straightforward generalization of a Markov decision process with continuous time parameter. In this paper we investigate the existence of stationary average optimal policies for Markov decision drift processes. Using a well-known Abelian theorem, we derive sufficient conditions which guarantee that a ‘limit point’ of a sequence of discounted optimal policies, with the discount factor approaching 1, is an average optimal policy. An alternative set of sufficient conditions is obtained for the case in which the discounted optimal policies generate regenerative stochastic processes. The latter set of conditions is easier to verify in several applications. The results of this paper are also applicable to Markov decision processes with discrete or continuous time parameter and to semi-Markov decision processes. In this sense they generalize some well-known results for Markov decision processes with finite or compact action space. Applications to an M/M/1 queueing model and a maintenance replacement model are given. It is shown that, under certain conditions on the model parameters, the average optimal policy for the M/M/1 queueing model prescribes a service intensity that is monotone non-decreasing and an arrival intensity that is monotone non-increasing in the number of waiting customers. For the maintenance replacement model we prove the average optimality of a bang-bang type policy. Special attention is paid to the computation of the optimal control parameters.
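
The monotonicity claim for the M/M/1 model can be checked numerically. Below is a hedged sketch of a cost-minimization analogue (truncated state space, invented holding and effort costs, uniformized relative value iteration); it is not the authors' computational method, just one standard way to expose the monotone structure.

```python
import numpy as np

lam, N = 1.0, 30                      # arrival rate, truncation level (invented)
mus = np.array([0.0, 0.8, 1.6, 2.4])  # selectable service intensities (invented)
cost = np.array([0.0, 0.5, 1.5, 3.0]) # effort cost per intensity (invented)
hold = 1.0                            # holding cost per customer per unit time
C = lam + mus.max()                   # uniformization constant

def q_values(x, h):
    # one-step cost plus uniformized expected next value, per action
    up = h[min(x + 1, N)]
    down = h[max(x - 1, 0)]
    return (hold * x + cost + lam * up + (x > 0) * mus * down
            + (C - lam - (x > 0) * mus) * h[x]) / C

h = np.zeros(N + 1)
for _ in range(20000):                # relative value iteration
    new = np.array([q_values(x, h).min() for x in range(N + 1)])
    new -= new[0]
    if np.max(np.abs(new - h)) < 1e-9:
        break
    h = new

policy = [mus[int(q_values(x, h).argmin())] for x in range(N + 1)]
print(policy)                         # expected: nondecreasing in queue length
```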


2003 ◽ Vol 17 (2) ◽ pp. 251-265 ◽ Author(s): I.J.B.F. Adan, J.A.C. Resing, V.G. Kulkarni

Stochastic discretization is a technique of representing a continuous random variable as a random sum of i.i.d. exponential random variables. In this article, we apply this technique to study the limiting behavior of a stochastic fluid model. Specifically, we consider an infinite-capacity fluid buffer, where the net input of fluid is regulated by a finite-state irreducible continuous-time Markov chain. Most long-run performance characteristics for such a fluid system can be expressed as the long-run average reward for a suitably chosen reward structure. In this article, we use stochastic discretization of the fluid content process to efficiently determine the long-run average reward. This method transforms the continuous-state Markov process describing the fluid model into a discrete-state quasi-birth–death process. Hence, standard tools, such as the matrix-geometric approach, become available for the analysis of the fluid buffer. To demonstrate this approach, we analyze the output of a buffer processing fluid from K sources on a first-come first-served basis.
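
For the matrix-geometric step the article relies on, here is a self-contained sketch with an invented two-phase QBD: the rate matrix R solves A0 + R A1 + R^2 A2 = 0 and gives the level probabilities geometrically. The block matrices are made up for the example; only the fixed-point iteration itself is standard.

```python
import numpy as np

A0 = np.array([[1.0, 0.0], [0.0, 2.0]])   # up-one-level rates (invented)
A2 = np.array([[3.0, 0.0], [0.0, 1.0]])   # down-one-level rates (invented)
A1 = np.array([[-5.0, 1.0], [1.0, -4.0]]) # local rates (rows of A0+A1+A2 sum to 0)

R = np.zeros((2, 2))
for _ in range(1000):                      # standard fixed-point iteration
    R_next = -(A0 + R @ R @ A2) @ np.linalg.inv(A1)
    if np.max(np.abs(R_next - R)) < 1e-12:
        break
    R = R_next

print(R)                                   # spectral radius < 1 when stable
print(A0 + R @ A1 + R @ R @ A2)            # ~ 0: R solves the QBD equation
```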


1982 ◽ Vol 19 (2) ◽ pp. 301-309 ◽ Author(s): Zvi Rosberg

A semi-Markov decision process, with a denumerable multidimensional state space, is considered. At any given state only a finite number of actions can be taken to control the process. The immediate reward earned in one transition period is merely assumed to be bounded by a polynomial and a bound is imposed on a weighted moment of the next state reached in one transition. It is shown that under an ergodicity assumption there is a stationary optimal policy for the long-run average reward criterion. A queueing network scheduling problem, for which previous criteria are inapplicable, is given as an application.


2017 ◽ Vol 32 (2) ◽ pp. 163-178 ◽ Author(s): Kenneth C. Chong, Shane G. Henderson, Mark E. Lewis

We consider the problem of routing and admission control in a loss system featuring two classes of arriving jobs (high-priority and low-priority jobs) and two types of servers, in which decision-making for high-priority jobs is forced, and rewards influence the desirability of each of the four possible routing decisions. We seek a policy that maximizes expected long-run reward, under both the discounted reward and long-run average reward criteria, and formulate the problem as a Markov decision process. When the reward structure favors high-priority jobs, we demonstrate that there exists an optimal monotone switching curve policy with slope of at least −1. When the reward structure favors low-priority jobs, we demonstrate that the value function, in general, lacks structure, which complicates the search for structure in optimal policies. However, we identify conditions under which optimal policies can be characterized in greater detail. We also examine the performance of heuristic policies in a brief numerical study.
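
One plausible way to picture the switching-curve result: with x busy type-1 servers and y busy type-2 servers, a low-priority job is routed to the type-1 pool only on one side of a curve f(x) whose slope is at least −1, i.e., f(x + 1) >= f(x) − 1. The pool sizes, curve values, and routing rule below are invented for the sketch; the paper establishes the slope property, not these numbers.

```python
C1, C2 = 5, 5                         # server pool sizes (invented)
f = [4, 3, 3, 2, 2, 2]                # curve values with f[x+1] >= f[x] - 1

def route_low_priority(x, y):
    """Return the pool a low-priority job is sent to, or None (reject)."""
    if x < C1 and y >= f[x]:
        return "type-1"               # on this side of the curve: use pool 1
    if y < C2:
        return "type-2"
    return None                       # both options blocked: reject

# the slope of the switching curve is at least -1
assert all(f[i + 1] >= f[i] - 1 for i in range(len(f) - 1))
print(route_low_priority(2, 4), route_low_priority(2, 1))
```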

