OPTIMAL ADMISSION AND ROUTING WITH CONGESTION-SENSITIVE CUSTOMER CLASSES

Author(s): Ayse Aslan

This paper considers optimal admission and routing control in multi-class service systems in which customers can either receive high-quality regular service, which is subject to congestion, or congestion-free but less desirable service at an alternative station, which we call the self-service station. We formulate the problem within the Markov decision process framework and focus on characterizing the structure of dynamic optimal policies that maximize expected long-run rewards, using value function and sample path arguments. The congestion sensitivity of customers is modeled with class-independent holding costs at the regular service station. The results show how the admission rewards of customer classes affect their priorities at the regular and self-service stations. We find that the priority for regular service may depend not only on the regular service admission rewards of classes but also on the difference between regular and self-service admission rewards. We show that optimal policies have monotonicity properties with respect to the optimal decisions for individual customer classes, dividing the state space into three connected regions per class.
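
As a rough illustration of the kind of model this abstract describes, the sketch below sets up a single-class version: a congested regular station with a holding cost versus a congestion-free self-service option, solved by value iteration on a uniformized discounted MDP. All parameter values and the truncation level are illustrative assumptions, not taken from the paper.

```python
# Minimal single-class sketch of the admission/routing trade-off described
# above: route each arrival to the congested regular station (reward R_reg,
# holding cost h per customer per unit time) or to the congestion-free
# self-service station (reward R_self). All numbers are illustrative.
import numpy as np

lam, mu = 1.0, 1.2          # arrival and regular-service rates
h = 0.5                     # class-independent holding cost rate
R_reg, R_self = 5.0, 2.0    # admission rewards
N = 50                      # queue truncation at the regular station
beta = 0.05                 # continuous-time discount rate

Lam = lam + mu              # uniformization constant
V = np.zeros(N + 1)
for _ in range(100_000):
    V_new = np.empty_like(V)
    for x in range(N + 1):
        go_reg = R_reg + V[min(x + 1, N)]   # admit to regular service
        go_self = R_self + V[x]             # divert to self-service
        arrive = max(go_reg, go_self) if x < N else go_self
        serve = V[max(x - 1, 0)]            # fictitious event at x = 0
        V_new[x] = (-h * x + lam * arrive + mu * serve) / (Lam + beta)
    done = np.max(np.abs(V_new - V)) < 1e-9
    V = V_new
    if done:
        break

# First state at which diverting to self-service becomes strictly preferred.
switch = next((x for x in range(N) if R_reg + V[x + 1] < R_self + V[x]), N)
print("first state where self-service is strictly preferred:", switch)
```

In this single-class sketch the optimal policy reduces to a threshold; the multi-class structure established in the paper is richer, with three connected regions per class.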

1984, Vol. 16 (1), pp. 8-8
Author(s): J. S. Baras, A. J. Dorsey, A. M. Makowski

A state-space model is presented for a queueing system in which two classes of customers compete in discrete time for the service attention of a single server with infinite buffer capacity. The arrivals are modelled by an independent, identically distributed random sequence of a general type, while the service completions are generated by independent Bernoulli streams; the allocation of service attention is governed by feedback policies based on past decisions and buffer content histories. The cost of operation per unit time is a linear function of the queue sizes. Under the model assumptions, a fixed prioritization scheme, known as the μc-rule, is shown to be optimal under both the expected long-run average criterion and the expected discounted criterion, over finite and infinite horizons. This static prioritization of the two classes of customers is done solely on the basis of service and cost parameters. The analysis is based on the dynamic programming methodology for Markov decision processes and takes advantage of the sample-path properties of the adopted state-space model.
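
For concreteness, here is what the μc-rule amounts to as a per-slot scheduling decision (a generic sketch; the rates and costs are made up):

```python
# The mu-c rule as a scheduling decision: serve the non-empty queue with the
# largest product of service rate mu_i and holding cost c_i. Rates and costs
# below are illustrative.
def mu_c_rule(queues, mu, c):
    """Index of the queue to serve under the mu-c rule, or None if all empty."""
    busy = [i for i, q in enumerate(queues) if q > 0]
    return max(busy, key=lambda i: mu[i] * c[i]) if busy else None

# Class 0: mu=0.6, c=2.0 (index 1.2); class 1: mu=0.9, c=1.0 (index 0.9).
print(mu_c_rule([3, 5], mu=[0.6, 0.9], c=[2.0, 1.0]))  # -> 0
```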


1979, Vol. 36 (8), pp. 939-947
Author(s): Roy Mendelssohn

Conditions are given that imply the existence of policies that "minimize risk" of undesirable events for stochastic harvesting models. It is shown that for many problems, either such a policy will not exist, or else it is an "extreme" policy that is equally undesirable. Techniques are given to systematically trade off decreases in the long-run expected return against decreases in the long-run risk. Several numerical examples are given for models of salmon runs, in which both population-based risks and harvest-based risks are considered. Key words: Markov decision processes, risk, salmon management, Pareto optimal policies, trade-off curves, linear programming
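
One standard way to make such return-risk trade-offs computable (a textbook formulation, not necessarily the paper's exact one) is the occupation-measure linear program for average-reward MDPs with a risk constraint:

```latex
% Occupation-measure LP: x(s,a) is the long-run fraction of time the chain
% is in state s and takes action a; rho(s,a) flags an undesirable event.
\begin{align*}
\max_{x \ge 0} \quad & \sum_{s,a} r(s,a)\, x(s,a) \\
\text{s.t.} \quad & \sum_{a} x(s',a) = \sum_{s,a} P(s' \mid s,a)\, x(s,a)
    \quad \forall s', \\
& \sum_{s,a} x(s,a) = 1, \qquad \sum_{s,a} \rho(s,a)\, x(s,a) \le \alpha .
\end{align*}
```

Sweeping the risk bound α and re-solving traces out a trade-off curve between long-run expected return and long-run risk.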


2017, Vol. 32 (2), pp. 163-178
Author(s): Kenneth C. Chong, Shane G. Henderson, Mark E. Lewis

We consider the problem of routing and admission control in a loss system featuring two classes of arriving jobs (high-priority and low-priority jobs) and two types of servers, in which decision-making for high-priority jobs is forced, and rewards influence the desirability of each of the four possible routing decisions. We seek a policy that maximizes expected long-run reward, under both the discounted reward and long-run average reward criteria, and formulate the problem as a Markov decision process. When the reward structure favors high-priority jobs, we demonstrate that there exists an optimal monotone switching curve policy with slope of at least −1. When the reward structure favors low-priority jobs, we demonstrate that the value function, in general, lacks structure, which complicates the search for structure in optimal policies. However, we identify conditions under which optimal policies can be characterized in greater detail. We also examine the performance of heuristic policies in a brief numerical study.
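
To fix ideas, a monotone switching-curve policy of the kind shown to be optimal can be represented as below; the curve itself is hypothetical, not the paper's computed policy:

```python
# A monotone switching-curve policy: take the low-priority routing action at
# state (n1, n2) only while the state lies strictly below a non-increasing
# curve whose slope is at least -1. The curve here is hypothetical.
curve = {0: 4, 1: 3, 2: 3, 3: 2, 4: 1}  # s(n1); successive drops are at most 1

def admit_low_priority(n1, n2, s=curve):
    return n2 < s.get(n1, 0)

print(admit_low_priority(1, 2))  # True: (1, 2) is below the curve
print(admit_low_priority(3, 2))  # False: (3, 2) is on or above the curve
```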


1985, Vol. 17 (1), pp. 186-209
Author(s): J. S. Baras, A. J. Dorsey, A. M. Makowski

A discrete-time model is presented for a system of two queues competing for the service attention of a single server with infinite buffer capacity. The service requirements are geometrically distributed and independent from customer to customer as well as from the arrivals. The allocation of service attention is governed by feedback policies which are based on past decisions and buffer content histories. The cost of operation per unit time is a linear function of the queue sizes. Under the model assumptions, a fixed prioritization scheme, known as the μc-rule, is shown to be optimal for the expected long-run average criterion and for the expected discounted criterion, over both finite and infinite horizons. Two different approaches are proposed for solving these problems. One is based on the dynamic programming methodology for Markov decision processes, and assumes the arrivals to be i.i.d. The other is valid under no additional assumption on the arrival stream and uses direct comparison arguments. In both cases, the sample path properties of the adopted state-space model are exploited.
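
The flavor of the direct comparison approach can be seen in a small coupled simulation: both priority orders are run on one common stream of arrivals and service coins, and the μc ordering should come out no worse in long-run average holding cost. The rates and costs are illustrative.

```python
# Coupled sample-path comparison of two fixed priority orders on the same
# arrivals and the same per-slot service coin. Parameters are illustrative.
import random

def avg_cost(priority, T=20_000, lam=(0.25, 0.25), mu=(0.9, 0.5), c=(1.0, 2.0)):
    rng = random.Random(42)                      # common randomness
    arrivals = [(rng.random() < lam[0], rng.random() < lam[1]) for _ in range(T)]
    coins = [rng.random() for _ in range(T)]     # one service coin per slot
    q, cost = [0, 0], 0.0
    for t in range(T):
        q[0] += arrivals[t][0]
        q[1] += arrivals[t][1]
        for i in priority:                       # serve first non-empty queue
            if q[i] > 0:
                if coins[t] < mu[i]:             # geometric service completion
                    q[i] -= 1
                break
        cost += c[0] * q[0] + c[1] * q[1]
    return cost / T

# mu*c indices: class 1 has 0.5 * 2.0 = 1.0 > class 0's 0.9 * 1.0 = 0.9.
print(avg_cost(priority=(1, 0)), avg_cost(priority=(0, 1)))  # mu-c order first
```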


1994, Vol. 8 (3), pp. 419-429
Author(s): A. Hordijk, G. M. Koole, J. A. Loeve

In this paper we analyze a queueing network consisting of parallel queues and arriving customers that have to be assigned to one of the queues. The assignment rule may not depend on the numbers of customers in the queues. Our goal is to find a policy that is optimal with respect to the long-run average cost. We will consider two cases: holding costs and waiting times. A recently developed algorithm for Markov decision chains with partial state information is applied. It turns out that the periodic policies found by this algorithm are close, if not equal, to the optimal ones.
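
A quick way to see what such queue-blind policies look like: the sketch below routes arrivals to two parallel queues by a fixed periodic word and estimates the long-run average holding cost by simulation. The words and rates are illustrative, and the algorithm referenced in the abstract is not reproduced here.

```python
# Open-loop (queue-blind) periodic routing to two parallel queues: the n-th
# arriving customer is assigned by a fixed cyclic word, independently of the
# queue lengths. Rates and words are illustrative.
from itertools import cycle
import random

def avg_holding_cost(word, T=100_000, lam=0.7, mu=(0.6, 0.4), seed=3):
    rng = random.Random(seed)
    q, cost, route = [0, 0], 0.0, cycle(word)
    for _ in range(T):
        if rng.random() < lam:                   # arrival routed by the word
            q[next(route)] += 1
        for i in (0, 1):                         # each queue serves on its own
            if q[i] > 0 and rng.random() < mu[i]:
                q[i] -= 1
        cost += q[0] + q[1]
    return cost / T

# (0, 0, 1) sends 2/3 of traffic to the faster queue; (0, 1) splits evenly.
print(avg_holding_cost((0, 0, 1)), avg_holding_cost((0, 1)))
```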


Author(s): Ming-Sheng Ying, Yuan Feng, Sheng-Gang Ying

Markov decision processes (MDPs) offer a general framework for modelling sequential decision making where outcomes are random. In particular, they serve as a mathematical framework for reinforcement learning. This paper introduces an extension of MDPs, namely quantum MDPs (qMDPs), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide useful mathematical tools for reinforcement learning techniques applied to the quantum world.
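
The classical special case of the finite-horizon recursion that the paper extends is plain backward induction; the sketch below computes optimal values over a toy two-state, two-action MDP (all numbers invented):

```python
# Classical finite-horizon backward induction, the recursion that qMDPs
# generalize to quantum state spaces. States, actions, and numbers are toy.
import numpy as np

P = {0: np.array([[0.9, 0.1], [0.2, 0.8]]),   # P[a][s, s'] transition matrices
     1: np.array([[0.5, 0.5], [0.6, 0.4]])}
r = {0: np.array([1.0, 0.0]),                 # r[a][s] immediate rewards
     1: np.array([0.0, 2.0])}
H = 5                                         # horizon

V = np.zeros(2)                               # terminal values V_H = 0
for _ in range(H):                            # backward induction
    Q = np.stack([r[a] + P[a] @ V for a in (0, 1)])
    V = Q.max(axis=0)
print(V)  # optimal expected H-step reward from each state
```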


2018, Vol. 5 (1), pp. 151-172
Author(s): Andrew Lister

Jason Brennan and John Tomasi have argued that if we focus on income alone, the Difference Principle supports welfare-state capitalism over property-owning democracy, because capitalism maximizes long-run income growth for the worst off. If so, the defense of property-owning democracy rests on the priority of equal opportunity for political influence and social advancement over raising the income of the worst off, or on integrating workplace control into the Difference Principle's index of advantage. The thesis of this paper is that even based on income alone, the Difference Principle is not as hostile to property-owning democracy as it may seem, because the Difference Principle should not be interpreted to require maximizing long-run income growth. The main idea is that it is unfair to make the present worst off accept inequality that does not benefit them, for the sake of benefiting the future worst off, if the future worst off will be better off than they are anyway.

