INDEXABILITY OF BANDIT PROBLEMS WITH RESPONSE DELAYS

2010 ◽  
Vol 24 (3) ◽  
pp. 349-374 ◽  
Author(s):  
Felipe Caro ◽  
Onesun Steve Yoo

This article considers an important class of discrete-time restless bandits: the discounted multiarmed bandit problems with response delays. The delays in each period are independent random variables, and the delayed responses do not cross over. For a bandit arm in this class, we use a coupling argument to show that in each state there is a unique subsidy that equates the pulling and nonpulling actions (i.e., the bandit satisfies the indexability criterion introduced by Whittle (1988)). The result allows for infinite or finite horizon and holds for arbitrary delay lengths and infinite state spaces. We compute the resulting marginal productivity indexes (MPI) for the Beta-Bernoulli Bayesian learning model, formulate and compute a tractable upper bound, and compare the suboptimality gap of the MPI policy to those of other heuristics derived from different closed-form indexes. The MPI policy performs near optimally and provides a theoretical justification for the use of the other heuristics.
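The state dynamics of a single Beta-Bernoulli arm with delayed responses can be illustrated with a minimal sketch. It is not the article's method and does not compute the MPI: it only tracks the posterior parameters and the queue of outstanding responses. The class name `DelayedBetaBernoulliArm`, its parameters, and the use of a fixed delay (a simple special case in which responses cannot cross over) are assumptions made for illustration.

```python
import random
from collections import deque


class DelayedBetaBernoulliArm:
    """Hypothetical helper: a Bernoulli arm with a Beta prior whose
    responses become visible only after a fixed number of periods."""

    def __init__(self, true_p, delay, alpha=1.0, beta=1.0, seed=0):
        self.true_p = true_p      # unknown success probability (simulation only)
        self.delay = delay        # fixed delay, so responses arrive in FIFO order
        self.alpha = alpha        # Beta prior parameter: pseudo-count of successes
        self.beta = beta          # Beta prior parameter: pseudo-count of failures
        self.pending = deque()    # outstanding (arrival_time, outcome) pairs
        self.rng = random.Random(seed)

    def pull(self, t):
        """Pull the arm at time t; the outcome is queued, not yet observed."""
        outcome = 1 if self.rng.random() < self.true_p else 0
        self.pending.append((t + self.delay, outcome))

    def observe(self, t):
        """Fold every response that has arrived by time t into the posterior."""
        while self.pending and self.pending[0][0] <= t:
            _, outcome = self.pending.popleft()
            self.alpha += outcome
            self.beta += 1 - outcome

    def posterior_mean(self):
        """Mean of the current Beta(alpha, beta) posterior."""
        return self.alpha / (self.alpha + self.beta)
```

In this sketch the information state of the arm at any time is the pair (alpha, beta) together with the pending queue; an index policy would map that state to a number and pull the arm with the largest value.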

1997 ◽  
Vol 29 (1) ◽  
pp. 114-137
Author(s):  
Linn I. Sennott

This paper studies the expected average cost control problem for discrete-time Markov decision processes with denumerably infinite state spaces. A sequence of finite state space truncations is defined such that the average costs and average optimal policies in the sequence converge to the optimal average cost and an optimal policy in the original process. The theory is illustrated with several examples from the control of discrete-time queueing systems. Numerical results are discussed.


Symmetry ◽  
2021 ◽  
Vol 13 (7) ◽  
pp. 1134
Author(s):  
Kenta Higuchi ◽  
Takashi Komatsu ◽  
Norio Konno ◽  
Hisashi Morioka ◽  
Etsuo Segawa

We consider the discrete-time quantum walk whose local dynamics is given by a common unitary matrix C in the perturbed region {0,1,⋯,M−1} and is free at the other positions. We obtain the stationary state with a bounded initial state. The initial state is set so that the perturbed region receives the inflow ω^n at time n (|ω|=1). From this expression, we compute the scattering on the surface of −1 and M and also compute how much the quantum walker accumulates in the perturbed region, namely the energy of the quantum walk, in the long-time limit. The influence of the frequency of the initial state on the energy is symmetric on the unit circle in the complex plane. We find a discontinuity of the energy with respect to the frequency of the inflow.


1992 ◽  
Vol 96 (1) ◽  
pp. 157-174 ◽  
Author(s):  
Julian Bradfield ◽  
Colin Stirling

2021 ◽  
Vol 1 (4) ◽  
pp. 234-237
Author(s):  
Hamza Khalifa Ibrahim ◽  
Abdulfatah Saed ◽  
Naser Ramdan R. Amaizah ◽  
Aejeeliyah Yousuf ◽  
Malak Abdalh Akim Esdera

The efficacy profile of lidocaine as a local anesthetic is characterized by a rapid onset of action and an intermediate duration of efficacy. Therefore, lidocaine is suitable for infiltration, block, and surface anesthesia. Longer-acting substances such as bupivacaine are sometimes preferred for spinal and peridural anesthesia; lidocaine, on the other hand, has the advantage of a rapid onset of action. Adrenaline supplements could delay resorption and double the duration of efficacy. Lidocaine is the most important class 1B antiarrhythmic drug: it is used intravenously for the treatment of ventricular arrhythmias (for acute myocardial infarction, digitalis poisoning, cardioversion, or cardiac catheterization). However, routine prophylactic administration is no longer recommended for acute cardiac infarction, because the overall benefit of this measure is not convincing. Lidocaine has also been efficient in refractory cases of status epilepticus.


2019 ◽  
Vol 66 ◽  
pp. 197-223
Author(s):  
Michal Jozef Knapik ◽  
Etienne Andre ◽  
Laure Petrucci ◽  
Wojciech Jamroga ◽  
Wojciech Penczek

In this paper we investigate the Timed Alternating-Time Temporal Logic (TATL), a discrete-time extension of ATL. In particular, we propose, systematize, and further study semantic variants of TATL, based on different notions of a strategy. The notions are derived from different assumptions about the agents’ memory and observational capabilities, and range from timed perfect recall to untimed memoryless plans. We also introduce a new semantics based on counting the number of visits to locations during the play. We show that all the semantics, except for the untimed memoryless one, are equivalent when punctuality constraints are not allowed in the formulae. In fact, abilities in all those notions of a strategy collapse to the “counting” semantics with only two actions allowed per location. On the other hand, this simple pattern does not extend to the full TATL. As a consequence, we establish a hierarchy of TATL semantics, based on the expressivity of the underlying strategies, and we show when some of the semantics coincide. In particular, we prove that more compact representations are possible for a reasonable subset of TATL specifications, which should improve the efficiency of model checking and strategy synthesis.


2019 ◽  
Vol 489 (3) ◽  
pp. 227-231
Author(s):  
G. M. Feldman

According to the Heyde theorem, the Gaussian distribution on the real line is characterized by the symmetry of the conditional distribution of one linear form of independent random variables given the other. We prove an analogue of this theorem for linear forms of two independent random variables taking values in an a-adic solenoid containing no elements of order 2. Coefficients of the linear forms are topological automorphisms of the a-adic solenoid.


1981 ◽  
Vol 18 (3) ◽  
pp. 652-659 ◽  
Author(s):  
M. J. Phillips

The negative exponential distribution is characterized in terms of two independent random variables. Only one of the random variables has a negative exponential distribution whilst the other can belong to a wide class of distributions. This result is then applied to two models for the reliability of a system of two modules subject to revealed and unrevealed faults to show when the models are equivalent. It is also shown, under certain conditions, that the system availability is only independent of the distribution of revealed failure times in one module when unrevealed failure times in the other module have a negative exponential distribution.


Author(s):  
Andrew C. Willford ◽  
S. Nagarajan

This chapter focuses on the professionals of the Tamil population. A cultural displacement, as experienced by the Indian middle class, has produced its own narrative that was subsequently hijacked by Malay “extremists.” This sense of betrayal among the Indian middle class is important because their narrative of victimization takes cohesive ideological shape in a form that disseminates to the working class through the work of activists, politicians, writers, NGOs, and lawyers. Through this, one sees an important class dialectic within the Indian community that is divisive, as well as signs that recent legal decisions and events have exacerbated a sense of insecurity. Ultimately, a deep sense of political betrayal within this elite class is producing nostalgia for a nonracialized Malaysia on the one hand, and a consolidation of Indianness on the other.


2019 ◽  
Vol 9 (16) ◽  
pp. 3220 ◽  
Author(s):  
Ryo Kurokawa ◽  
Takao Sato ◽  
Ramon Vilanova ◽  
Yasuo Konishi

The present study proposes a novel proportional-integral-derivative (PID) control design method in discrete time. In the proposed method, a PID controller is designed for first-order plus dead-time (FOPDT) systems so that the prescribed robust stability is accomplished. Furthermore, the servo performance and the regulator performance are in a trade-off relationship and hence cannot be optimized simultaneously. Therefore, the proposed method provides an optimal design of the PID parameters for optimizing the reference tracking and disturbance rejection performances, respectively. Even though such a trade-off design method is being actively researched for continuous time, few studies have examined such a method for discrete time. In conventional discrete-time methods, either the robust stability is not directly prescribed or the applicable systems are restricted to those for which the dead time in the continuous-time model is an integer multiple of the sampling interval. In the proposed method, by contrast, the optimal PID parameters are obtained even when a discrete-time zero is included in the controlled plant. In the present study, a zero in the FOPDT system is newly normalized along with the other plant parameters, and a universal design method is thus obtained for the FOPDT system with the zero. Finally, the effectiveness of the proposed method is demonstrated through numerical examples.
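For readers unfamiliar with the setting, the following minimal sketch simulates a discrete-time PID loop around an FOPDT plant K·e^(−Ls)/(Ts+1). It is not the design method of the paper: the gains are arbitrary illustrative values, the plant has no zero, and the dead time is assumed to be an integer multiple of the sampling interval (the restricted case the abstract attributes to conventional methods). The function name `simulate_fopdt_pid` and all parameter names are assumptions.

```python
import math
from collections import deque


def simulate_fopdt_pid(K=1.0, T=1.0, L=0.5, h=0.1,
                       kp=0.5, ki=0.5, kd=0.05,
                       setpoint=1.0, steps=300):
    """Closed-loop step response of a zero-order-hold FOPDT model under PID.

    Assumes L is an integer multiple of h, so the sampled plant is
        y[k+1] = a*y[k] + K*(1 - a)*u[k-d],  a = exp(-h/T),  d = L/h.
    """
    d = round(L / h)                 # dead time in samples (assumed integer)
    a = math.exp(-h / T)             # discrete-time pole
    b = K * (1.0 - a)                # input gain of the sampled first-order lag
    u_hist = deque([0.0] * d)        # holds u[k-d], ..., u[k-1]; zero before t=0
    y = 0.0
    integral = 0.0
    prev_error = setpoint - y        # avoids a derivative kick on the first step
    ys = []
    for _ in range(steps):
        error = setpoint - y
        integral += error * h                    # rectangular integration
        derivative = (error - prev_error) / h    # backward difference
        u = kp * error + ki * integral + kd * derivative
        prev_error = error
        u_hist.append(u)
        delayed_u = u_hist.popleft()             # this is u[k-d]
        y = a * y + b * delayed_u
        ys.append(y)
    return ys
```

Calling `simulate_fopdt_pid()` returns the sampled output sequence; changing `kp`, `ki`, and `kd` gives a rough feel for the tracking behavior that a systematic design method would tune.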

