BIAS OPTIMALITY IN A QUEUE WITH ADMISSION CONTROL

We consider a finite-capacity queueing system where arriving customers offer rewards which are paid upon acceptance into the system. The gatekeeper, whose objective is to ‘maximize’ rewards, decides if the reward offered is sufficient to accept or reject the arriving customer. Suppose the arrival rates, service rates, and system capacity are changing over time in a known manner. We show that all bias optimal (a refinement of long-run average reward optimal) policies are of threshold form. Furthermore, we give sufficient conditions for the bias optimal policy to be monotonic in time. We show, via a counterexample, that if these conditions are violated, the optimal policy may not be monotonic in time or of threshold form.

Bias optimal admission control policies for a multiclass nonstationary queueing system

Journal of Applied Probability ◽

10.1239/jap/1019737985 ◽

2002 ◽

Vol 39 (1) ◽

pp. 20-37 ◽

Cited By ~ 6

Author(s):

Mark E. Lewis ◽

Hayriye Ayhan ◽

Robert D. Foley

Keyword(s):

Optimal Policy ◽

Queueing System ◽

Sufficient Conditions ◽

System Capacity ◽

Average Reward ◽

Finite Capacity ◽

Long Run ◽

Optimal Policies ◽

Service Rates ◽

Long Run Average Reward

We consider a finite-capacity queueing system where arriving customers offer rewards which are paid upon acceptance into the system. The gatekeeper, whose objective is to ‘maximize’ rewards, decides if the reward offered is sufficient to accept or reject the arriving customer. Suppose the arrival rates, service rates, and system capacity are changing over time in a known manner. We show that all bias optimal (a refinement of long-run average reward optimal) policies are of threshold form. Furthermore, we give sufficient conditions for the bias optimal policy to be monotonic in time. We show, via a counterexample, that if these conditions are violated, the optimal policy may not be monotonic in time or of threshold form.

Average optimal policies in a controlled queueing system with dual admission control

Journal of Applied Probability ◽

10.1239/jap/996986750 ◽

2001 ◽

Vol 38 (2) ◽

pp. 369-385 ◽

Cited By ~ 7

Author(s):

Mark E. Lewis

Keyword(s):

Admission Control ◽

Manufacturing Systems ◽

Queueing System ◽

Service Rate ◽

Average Reward ◽

Long Run ◽

Switching Curve ◽

Multiple Tasks ◽

Optimal Policies ◽

Number Of Customers

We consider a controlled M/M/1 queueing system where customers may be subject to two potential rejections. The first occurs upon arrival and is dependent on the number of customers in the queue and the service rate of the customer currently in service. The second, which may or may not occur, occurs immediately prior to the customer receiving service. That is, after each service completion the customer in the front of the queue is assessed and the service rate of that customer is revealed. If the second decision-maker recommends rejection, the customer is denied service with a fixed probability. We show the existence of long-run average optimal monotone switching-curve policies. Further, we show that the average reward is increasing in the probability that the second decision-maker's recommendation of rejection is honored. Applications include call centers with delayed classifications and manufacturing systems when the server is responsible for multiple tasks.

TWO-CLASS ROUTING WITH ADMISSION CONTROL AND STRICT PRIORITIES

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964817000195 ◽

2017 ◽

Vol 32 (2) ◽

pp. 163-178 ◽

Cited By ~ 1

Author(s):

Kenneth C. Chong ◽

Shane G. Henderson ◽

Mark E. Lewis

Keyword(s):

Admission Control ◽

Numerical Study ◽

Reward Structure ◽

Long Run ◽

Switching Curve ◽

Optimal Policies ◽

Markov Decision ◽

Average Reward Criteria ◽

The Value Function ◽

Long Run Average Reward

We consider the problem of routing and admission control in a loss system featuring two classes of arriving jobs (high-priority and low-priority jobs) and two types of servers, in which decision-making for high-priority jobs is forced, and rewards influence the desirability of each of the four possible routing decisions. We seek a policy that maximizes expected long-run reward, under both the discounted reward and long-run average reward criteria, and formulate the problem as a Markov decision process. When the reward structure favors high-priority jobs, we demonstrate that there exists an optimal monotone switching curve policy with slope of at least −1. When the reward structure favors low-priority jobs, we demonstrate that the value function, in general, lacks structure, which complicates the search for structure in optimal policies. However, we identify conditions under which optimal policies can be characterized in greater detail. We also examine the performance of heuristic policies in a brief numerical study.

A simulation-based learning automata framework for solving semi-Markov decision problems under long-run average reward

IIE Transactions ◽

10.1080/07408170490438672 ◽

2004 ◽

Vol 36 (6) ◽

pp. 557-567 ◽

Cited By ~ 14

Author(s):

Abhijit Gosavi ◽

Tapas K. Das ◽

Sudeep Sarkar

Keyword(s):

Learning Automata ◽

Decision Problems ◽

Average Reward ◽

Markov Decision Problems ◽

Long Run ◽

Simulation Based ◽

Markov Decision ◽

Long Run Average Reward

Optimality of Randomized Trunk Reservation

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964800003570 ◽

1994 ◽

Vol 8 (4) ◽

pp. 463-489 ◽

Cited By ~ 30

Author(s):

Eugene A. Feinberg ◽

Martin I. Reiman

Keyword(s):

Optimal Policy ◽

Queueing System ◽

Blocking Probability ◽

Average Reward ◽

Constrained Problem ◽

Optimal Policies ◽

Customer Type ◽

Time Subject

We consider a controlled queueing system that is a generalization of the M/M/c/W queue. There are m types of customers that arrive according to independent Poisson processes. Service times are exponential and independent and do not depend on the customer type. There is room in the system for a total of N customers; if there are N customers in the system, new arrivals are lost. Type j customers are more profitable than type (j + 1) customers, j = 2, …, m − 1, and type 1 customers are at least as profitable as type 2 customers. The allowed control is to accept or reject customers at arrival. No preemption of customers in service is allowed. The goal is to maximize the average reward per unit of time subject to a constraint that the blocking probability of type 1 customers is no greater than a given level. For an M/M/c/c system without a constraint, Miller [12] proved that an optimal policy has a simple threshold structure. We show that, for the constrained problem described above, an optimal policy has a similar structure, but one of the thresholds might have to be randomized. We also derive an algorithm that constructs an optimal policy and describe other forms of optimal policies.
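
The threshold structure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it only evaluates one fixed policy, assuming just two customer types (m = 2), rewards r1 and r2 collected at admission, and a single possibly randomized threshold k with randomization probability q. The function name and all parameter names are hypothetical.

```python
def trunk_reservation(lam1, lam2, mu, c, N, k, q=0.0, r1=1.0, r2=1.0):
    """Evaluate a (possibly randomized) threshold policy for two customer types.

    Assumed policy: type 1 is admitted whenever there is room (n < N).
    Type 2 is admitted with probability 1 if n < k, with probability q
    if n == k, and never if n > k.
    Returns (average reward per unit time, type-1 blocking probability).
    """
    def accept2(n):
        if n < k:
            return 1.0
        return q if n == k else 0.0

    # Unnormalized stationary weights of the birth-death chain on 0..N.
    weights = [1.0]
    for n in range(N):
        birth = lam1 + lam2 * accept2(n)   # admitted arrival rate in state n
        death = min(n + 1, c) * mu         # busy servers in state n + 1
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    pi = [w / total for w in weights]

    # Rewards accrue at admission epochs; nothing is admitted in state N.
    reward = sum(pi[n] * (lam1 * r1 + lam2 * accept2(n) * r2) for n in range(N))
    # Poisson arrivals see time averages, so type 1 is blocked with probability pi[N].
    return reward, pi[N]
```

Sweeping k and q and keeping the best reward among the settings whose blocking probability stays below the required level gives a rough way to explore the constrained problem numerically for small systems.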

STOCHASTIC DISCRETIZATION FOR THE LONG-RUN AVERAGE REWARD IN FLUID MODELS

Probability in the Engineering and Informational Sciences ◽

10.1017/s0269964803172075 ◽

2003 ◽

Vol 17 (2) ◽

pp. 251-265 ◽

Cited By ~ 3

Author(s):

I.J.B.F. Adan ◽

J.A.C. Resing ◽

V.G. Kulkarni

Keyword(s):

Fluid Model ◽

Geometric Approach ◽

Random Variable ◽

Death Process ◽

Average Reward ◽

Discrete State ◽

Limiting Behavior ◽

Long Run ◽

Continuous State ◽

Long Run Average Reward

Stochastic discretization is a technique of representing a continuous random variable as a random sum of i.i.d. exponential random variables. In this article, we apply this technique to study the limiting behavior of a stochastic fluid model. Specifically, we consider an infinite-capacity fluid buffer, where the net input of fluid is regulated by a finite-state irreducible continuous-time Markov chain. Most long-run performance characteristics for such a fluid system can be expressed as the long-run average reward for a suitably chosen reward structure. In this article, we use stochastic discretization of the fluid content process to efficiently determine the long-run average reward. This method transforms the continuous-state Markov process describing the fluid model into a discrete-state quasi-birth–death process. Hence, standard tools, such as the matrix-geometric approach, become available for the analysis of the fluid buffer. To demonstrate this approach, we analyze the output of a buffer processing fluid from K sources on a first-come first-served basis.
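
As a small illustration of representing a continuous random variable as a random sum of i.i.d. exponentials, the Python sketch below checks a textbook identity: a geometric number of independent Exp(lam) variables sums to an Exp(p * lam) variable. This is only a familiar special case, not the construction used in the article, and the function names and the truncation level are assumptions made for the example.

```python
def geometric_sum_lst(s, lam, p, terms=2000):
    """Laplace-Stieltjes transform of S = X_1 + ... + X_N at the point s.

    X_i are i.i.d. Exp(lam); N is geometric on {1, 2, ...} with success
    probability p, independent of the X_i. The series is truncated after
    `terms` summands (an assumption; the tail is geometrically small).
    """
    phi = lam / (lam + s)  # transform of a single Exp(lam) variable
    return sum(p * (1.0 - p) ** (n - 1) * phi ** n for n in range(1, terms + 1))


def exponential_lst(s, rate):
    """Laplace-Stieltjes transform of an Exp(rate) variable at the point s."""
    return rate / (rate + s)


# The two transforms agree, so the random sum is distributed as Exp(p * lam).
gap = abs(geometric_sum_lst(0.7, 2.0, 0.3) - exponential_lst(0.7, 0.3 * 2.0))
```

Because a Laplace-Stieltjes transform determines the distribution, agreement of the two functions for all s > 0 is what justifies replacing the continuous variable by a discrete count of exponential phases.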

Semi-Markov decision processes with polynomial reward

Journal of Applied Probability ◽

10.2307/3213482 ◽

1982 ◽

Vol 19 (2) ◽

pp. 301-309 ◽

Cited By ~ 6

Author(s):

Zvi Rosberg

Keyword(s):

Transition Period ◽

Queueing Network ◽

Decision Processes ◽

Average Reward ◽

Network Scheduling ◽

Long Run ◽

Markov Decision ◽

Average Reward Criterion ◽

Long Run Average Reward ◽

Reward Criterion

A semi-Markov decision process, with a denumerable multidimensional state space, is considered. At any given state only a finite number of actions can be taken to control the process. The immediate reward earned in one transition period is merely assumed to be bounded by a polynomial and a bound is imposed on a weighted moment of the next state reached in one transition. It is shown that under an ergodicity assumption there is a stationary optimal policy for the long-run average reward criterion. A queueing network scheduling problem, for which previous criteria are inapplicable, is given as an application.

Some memoryless bandit policies

Journal of Applied Probability ◽

10.1239/jap/1044476838 ◽

2003 ◽

Vol 40 (1) ◽

pp. 250-256 ◽

Cited By ~ 1

Author(s):

Erol A. Peköz

Keyword(s):

Average Reward ◽

Bandit Problem ◽

Unknown Distribution ◽

Long Run ◽

Long Run Average Reward ◽

Multiarmed Bandit ◽

Independent And Identically Distributed

We consider a multiarmed bandit problem, where each arm when pulled generates independent and identically distributed nonnegative rewards according to some unknown distribution. The goal is to maximize the long-run average reward per pull with the restriction that any previously learned information is forgotten whenever a switch between arms is made. We present several policies and a peculiarity surrounding them.
