The best choice problem with random arrivals: How to beat the 1/e-strategy

Author(s):  
Alexander Gnedin
2016 ◽  
Vol 48 (3) ◽  
pp. 726-743 ◽  
Author(s):  
Mitsushi Tamaki

The best-choice problem and the duration problem, known as versions of the secretary problem, are concerned with choosing an object from those that appear sequentially. Let (B,p) denote the best-choice problem and (D,p) the duration problem when the total number N of objects is a bounded random variable with prior p=(p1, p2,...,pn) for a known upper bound n. Gnedin (2005) discovered a correspondence between these two quite different optimal stopping problems: for any given prior p, there exists another prior q such that (D,p) is equivalent to (B,q). In this paper, motivated by his discovery, we attempt to find the alternate correspondence {p(m), m≥0}, i.e. an infinite sequence of priors such that (D,p(m-1)) is equivalent to (B,p(m)) for all m≥1, starting with p(0)=(0,...,0,1). To be more precise, the duration problem is split into (D1,p) and (D2,p), referred to as model 1 and model 2, depending on whether the planning horizon is N or n. The aforementioned problem is model 1. For model 2 as well, we can find a similar alternate correspondence {p[m], m≥0}. We treat both the no-information model and the full-information model and examine the limiting behaviors of their optimal rules and optimal values related to the alternate correspondences as n→∞. A generalization of the no-information model is given. It is worth mentioning that the alternate correspondences for model 1 and model 2 are respectively related to the urn sampling models without replacement and with replacement.


2004 ◽  
Vol 36 (2) ◽  
pp. 398-416 ◽  
Author(s):  
Stephen M. Samuels

The full-information best-choice problem, as posed by Gilbert and Mosteller in 1966, asks us to find a stopping rule which maximizes the probability of selecting the largest of a sequence of n i.i.d. standard uniform random variables. Porosiński, in 1987, replaced a fixed n by a random N, uniform on {1,2,…,n} and independent of the observations. A partial-information problem, imbedded in a 1980 paper of Petruccelli, keeps n fixed but allows us to observe only the sequence of ranges (max - min), as well as whether or not the current observation is largest so far. Recently, Porosiński compared the solutions to his and Petruccelli's problems and found that the two problems have identical optimal rules as well as risks that are asymptotically equal. His discovery prompts the question: why? This paper gives a good explanation of the equivalence of the optimal rules. But even under the lens of a planar Poisson process model, it leaves the equivalence of the asymptotic risks as somewhat of a mystery. Meanwhile, two other problems have been shown to have the same limiting risks: the full-information problem with the (suboptimal) Porosiński-Petruccelli stopping rule, and the full-information ‘duration of holding the best’ problem of Ferguson, Hardwick and Tamaki, which turns out to be nothing but the Porosiński problem in disguise.
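
For orientation, the fixed-n full-information problem mentioned at the start of this abstract can be sketched numerically. The abstract does not state the optimal rule; the sketch below assumes the standard decision-number form usually attributed to Gilbert and Mosteller, in which a candidate of value x seen with m observations still to come is accepted when x ≥ b_m, where b_m solves sum_{j=1..m} (b^(-j) - 1)/j = 1. The function name `decision_number` is hypothetical, and the bisection bracket [0.5, 1] relies on b_1 = 1/2 with thresholds rising in m.

```python
def decision_number(m, tol=1e-12):
    """Threshold b_m for a candidate seen with m observations still to come.

    Assumed equation (not quoted from the abstract):
        sum_{j=1}^{m} (b**(-j) - 1) / j = 1
    """
    if m == 0:
        return 0.0  # the last observation is accepted whenever it is a candidate

    def excess(b):
        # Left-hand side minus 1; strictly decreasing in b on (0, 1].
        return sum((b ** -j - 1.0) / j for j in range(1, m + 1)) - 1.0

    lo, hi = 0.5, 1.0  # excess(0.5) >= 0 and excess(1.0) = -1 < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for m in range(0, 6):
        print(m, round(decision_number(m), 4))
```

For m = 2 the equation reduces to a quadratic in 1/b, which gives a closed form to compare against the bisection result.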


1973 ◽  
Vol 17 (4) ◽  
pp. 657-668 ◽  
Author(s):  
E. L. Presman ◽  
I. M. Sonin

1983 ◽  
Vol 20 (1) ◽  
pp. 165-171 ◽  
Author(s):  
Joseph D. Petruccelli

We consider the problem of maximizing the probability of choosing the largest from a sequence of N observations when N is a bounded random variable. The present paper gives a necessary and sufficient condition, based on the distribution of N, for the optimal stopping rule to have a particularly simple form: what Rasmussen and Robbins (1975) call an s(r) rule. A second result indicates that optimal stopping rules for this problem can, with one restriction, take virtually any form.
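
The setting described here can be explored with a small exact dynamic program. This is a sketch under stated assumptions, not the paper's method: N is taken to be independent of the arrival order, the state is "the k-th object is a candidate and N ≥ k", and the function name `best_choice_random_n` is hypothetical. The returned stop set shows whether the optimal rule for a given prior has the simple threshold shape or something more irregular.

```python
from fractions import Fraction


def best_choice_random_n(p):
    """Optimal win probability when P(N = j) = p[j-1], j = 1..n.

    Returns (value, stop_set), where stop_set lists the positions k at
    which stopping on a candidate is at least as good as continuing.
    """
    n = len(p)
    p = [Fraction(x) for x in p]
    # tail[k] = P(N >= k) for k = 1..n, with tail[n+1] = 0
    tail = [Fraction(0)] * (n + 2)
    for k in range(n, 0, -1):
        tail[k] = tail[k + 1] + p[k - 1]

    value = [Fraction(0)] * (n + 1)
    stop_set = []
    for k in range(n, 0, -1):
        if tail[k] == 0:
            continue  # position k can never be reached
        # Stop on a candidate at k: it is the overall best w.p. k/j when N = j.
        stop = sum(p[j - 1] * Fraction(k, j) for j in range(k, n + 1)) / tail[k]
        # Continue: the next candidate is at m with prob. k/(m(m-1)), given N >= m.
        cont = sum(
            tail[m] / tail[k] * Fraction(k, m * (m - 1)) * value[m]
            for m in range(k + 1, n + 1)
        )
        value[k] = max(stop, cont)
        if stop >= cont:
            stop_set.append(k)
    return value[1], sorted(stop_set)


if __name__ == "__main__":
    n = 10
    uniform = [Fraction(1, n)] * n
    v, s = best_choice_random_n(uniform)
    print(float(v), s)
```

Passing a prior concentrated on n recovers the classical fixed-n problem, which is a convenient sanity check.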


1988 ◽  
Vol 25 (3) ◽  
pp. 544-552 ◽  
Author(s):  
Masami Yasuda

This paper treats stopping problems on Markov chains in which the OLA (one-step look ahead) policy is optimal. Its associated optimal value can be explicitly expressed by a potential for a charge function of the difference between the immediate reward and the one-step-after reward. As an application to the best choice problem, we shall obtain the value of three problems: the classical secretary problem, a problem with a refusal probability and a problem with a random number of objects.
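
For the classical secretary problem, the one-step look-ahead idea can be illustrated directly. The sketch below is not taken from the paper: it compares stopping on a candidate at position k (win probability k/n) with waiting for the next candidate and taking it (win probability (k/n) times sum_{j=k..n-1} 1/j), and checks the resulting threshold against a brute-force search over cutoff rules. The function names are hypothetical.

```python
from fractions import Fraction


def ola_threshold(n):
    """Smallest k at which stopping on a candidate is at least as good
    as waiting for the next candidate and taking that one."""
    for k in range(1, n + 1):
        # stop now: k/n; look one step ahead: (k/n) * sum_{j=k}^{n-1} 1/j
        if sum(Fraction(1, j) for j in range(k, n)) <= 1:
            return k


def cutoff_win_probability(n, r):
    """Win probability when the first r-1 objects are skipped and the
    first candidate from position r onward is selected."""
    if r == 1:
        return Fraction(1, n)
    return Fraction(r - 1, n) * sum(Fraction(1, k - 1) for k in range(r, n + 1))


if __name__ == "__main__":
    n = 10
    r = ola_threshold(n)
    best = max(range(1, n + 1), key=lambda c: cutoff_win_probability(n, c))
    print(r, best, float(cutoff_win_probability(n, r)))
```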


2015 ◽  
Vol 52 (4) ◽  
pp. 926-940 ◽  
Author(s):  
Mitsushi Tamaki

As a class of optimal stopping problems with monotone thresholds, we define the candidate-choice problem (CCP) and derive two formulae for calculating its expected payoff. We apply the first formula to a particular CCP, i.e. the best-choice duration problem treated by Ferguson et al. (1992). The recall case is also examined as a comparison. We also derive the distribution of the stopping time of the CCP and find, as a by-product, that the best-choice problem has a remarkable feature in that the optimal probability of choosing the best is just the expected value of the (proportional) stopping time. The similarity between the best-choice duration problem and the best-choice problem with uniform freeze studied by Samuel-Cahn (1996) is recognized.
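
The identity between the win probability and the expected proportional stopping time can be checked by brute force in the classical no-information problem. The sketch below is one possible reading, not the paper's derivation: it assumes a cutoff rule (skip r-1 objects, then take the first candidate) and assumes that a run ending without any selection contributes zero to the expected time; the abstract does not state the paper's convention. Under these assumptions the two quantities agree for every cutoff r, and so in particular for the optimal one. The function name `enumerate_cutoff_rule` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def enumerate_cutoff_rule(n, r):
    """Enumerate all n! arrival orders (rank n is the best object).

    Returns (P(select the best), E[tau / n]), where tau is the selection
    time and runs with no selection contribute 0 (an assumption).
    """
    wins = 0
    time_total = 0
    count = 0
    for order in permutations(range(1, n + 1)):
        count += 1
        best_seen = max(order[: r - 1], default=0)  # best among skipped objects
        for pos in range(r, n + 1):
            x = order[pos - 1]
            if x > best_seen:
                # first candidate at or after position r: select it
                time_total += pos
                wins += (x == n)
                break
    return Fraction(wins, count), Fraction(time_total, count * n)


if __name__ == "__main__":
    n = 6
    for r in range(1, n + 1):
        win, time = enumerate_cutoff_rule(n, r)
        print(r, win, time)
```

The agreement follows because a candidate selected at position k is the overall best with probability k/n, independently of the event that the rule stops at k.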

