The optimal value of Markov stopping problems with one-step look ahead policy

1988 ◽  
Vol 25 (3) ◽  
pp. 544-552 ◽  
Author(s):  
Masami Yasuda

This paper treats stopping problems on Markov chains in which the OLA (one-step look ahead) policy is optimal. Its associated optimal value can be explicitly expressed by a potential for a charge function of the difference between the immediate reward and the one-step-after reward. As an application to the best choice problem, we shall obtain the value of three problems: the classical secretary problem, a problem with a refusal probability and a problem with a random number of objects.
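
For the classical secretary problem mentioned above, the OLA rule can be sketched numerically. The sketch below uses the standard textbook formulas for the no-information problem with n objects, not the paper's potential-based expression, and the function names `ola_threshold` and `win_probability` are hypothetical.

```python
from fractions import Fraction


def ola_threshold(n):
    """Smallest position k with sum_{j=k}^{n-1} 1/j <= 1.

    At a candidate in position k, stopping wins with probability k/n,
    while waiting for the next candidate and stopping there wins with
    probability (k/n) * sum_{j=k}^{n-1} 1/j. The OLA rule stops as soon
    as the first quantity is at least the second.
    """
    tail = Fraction(0)  # tail = sum_{j=k}^{n-1} 1/j for the current k
    k = n
    while k > 1 and tail + Fraction(1, k - 1) <= 1:
        k -= 1
        tail += Fraction(1, k)
    return k


def win_probability(n, r):
    """Probability of selecting the best of n objects when the first
    r - 1 objects are skipped and the first later candidate is accepted."""
    if r == 1:
        return Fraction(1, n)
    return Fraction(r - 1, n) * sum(Fraction(1, k) for k in range(r - 1, n))


if __name__ == "__main__":
    n = 4
    r = ola_threshold(n)
    print(r, win_probability(n, r))  # prints: 2 11/24
```

Exact fractions are used so that the threshold comparison is not affected by floating-point rounding.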


2016 ◽  
Vol 48 (3) ◽  
pp. 726-743 ◽  
Author(s):  
Mitsushi Tamaki

The best-choice problem and the duration problem, known as versions of the secretary problem, are concerned with choosing an object from those that appear sequentially. Let (B,p) denote the best-choice problem and (D,p) the duration problem when the total number N of objects is a bounded random variable with prior p=(p1, p2,...,pn) for a known upper bound n. Gnedin (2005) discovered a correspondence between these two quite different optimal stopping problems: for any given prior p, there exists another prior q such that (D,p) is equivalent to (B,q). In this paper, motivated by his discovery, we attempt to find the alternate correspondence {p(m),m≥0}, i.e. an infinite sequence of priors such that (D,p(m-1)) is equivalent to (B,p(m)) for all m≥1, starting with p(0)=(0,...,0,1). To be more precise, the duration problem is classified as (D1,p) or (D2,p), referred to as model 1 or model 2, depending on whether the planning horizon is N or n. The aforementioned problem is model 1. For model 2 as well, we can find a similar alternate correspondence {p[m],m≥0}. We treat both the no-information model and the full-information model and examine the limiting behavior of their optimal rules and optimal values related to the alternate correspondences as n→∞. A generalization of the no-information model is given. It is worth mentioning that the alternate correspondences for model 1 and model 2 are related to urn sampling models without replacement and with replacement, respectively.


1973 ◽  
Vol 17 (4) ◽  
pp. 657-668 ◽  
Author(s):  
E. L. Presman ◽  
I. M. Sonin

1977 ◽  
Vol 14 (1) ◽  
pp. 162-169 ◽  
Author(s):  
M. Abdel-Hameed

The optimality of the one-step look-ahead stopping rule is shown to hold under conditions different from those discussed by Chow, Robbins and Siegmund [5]. These results are corollaries of the following theorem: Let {X_n, n = 0, 1, …}, X_0 = x, be a discrete-time homogeneous Markov process with state space (E, ℬ). For any ℬ-measurable function g and α in (0, 1], define A_α g(x) = α E_x g(X_1) − g(x) to be the infinitesimal generator of g. If τ is any stopping time satisfying the conditions: E_x[α^N g(X_N) I(τ > N)] → 0 as N → ∞, then […]. Applications of the results are considered.
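
The generator defined in the abstract above can be evaluated directly on a finite-state chain. The following is only a toy sketch: the three-state transition matrix `P`, the reward `g`, the discount `alpha = 0.9`, and the function names are all assumptions made for illustration, and the sketch does not check the conditions under which the one-step look-ahead rule is actually optimal.

```python
def generator(P, g, alpha):
    """Return A_alpha g(x) = alpha * E_x[g(X_1)] - g(x) for every state x.

    P is a row-stochastic transition matrix given as a list of lists,
    g is a list of rewards indexed by state.
    """
    n = len(g)
    return [alpha * sum(P[x][y] * g[y] for y in range(n)) - g[x]
            for x in range(n)]


def ola_stopping_set(P, g, alpha):
    """States where stopping now is at least as good as stopping one
    step later, i.e. where A_alpha g(x) <= 0."""
    return [x for x, a in enumerate(generator(P, g, alpha)) if a <= 0]


# Hypothetical example: the chain drifts upward and state 2 is absorbing.
P = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]
g = [0.0, 1.0, 3.0]
alpha = 0.9

if __name__ == "__main__":
    print(generator(P, g, alpha))
    print(ola_stopping_set(P, g, alpha))
```

In this toy chain the one-step look-ahead rule stops only in state 2, which is absorbing, so the stopping set cannot be left once entered.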


2010 ◽  
Vol 3 (6) ◽  
pp. 5645-5670
Author(s):  
M. Antón ◽  
J. E. Gil ◽  
A. Cazorla ◽  
J. M. Vilaplana ◽  
F. J. Olmo ◽  
...  

The ultraviolet (UV) index is the variable most commonly used to inform the general public about the levels and potential harmful effects of UV radiation incident at Earth's surface. This variable is derived from the output signal of UV radiometers by applying conversion factors obtained by calibration methods. This paper focuses on the influence of two of these methods (called the one-step and two-step methods) on the resulting experimental UV Index (UVI) as measured by a YES UVB-1 radiometer located at a midlatitude station, Granada (Spain), for the period 2006–2009. The difference from the UVI values obtained when the calibration factors provided by the manufacturer are used is also analyzed. For this purpose, the detailed characterization of the UVB-1 radiometer obtained in the first Spanish calibration campaign of broadband UV radiometers at the "El Arenosillo" INTA station in 2007 is used. In addition, modeled UVI data derived from the LibRadtran/UVSPEC radiative transfer code are compared with the experimental values recorded at Granada for cloud-free conditions. The absolute mean differences between the measured and modeled UVI data at Granada are around 5% using the one-step and two-step calibration methods. This result indicates the excellent performance of these two techniques for obtaining UVI data from the UVB-1 radiometer. In contrast, the application of the calibration factor supplied by the manufacturer produces a large overestimation (~14%) of the UVI values. As a result, the manufacturer's factor yields unreliable, alarmingly high UVI data in summer. Thus, days with an extreme erythemal risk (UVI higher than 10) rise to 46% of all cases measured between May and September at Granada when the manufacturer's factor is applied. This percentage is reduced to a more reliable value of 3% when the conversion factors obtained with the two-step calibration method are used. All these results underline the need for a sound calibration of broadband UV instruments in order to obtain reliable measurements.


1984 ◽  
Vol 21 (3) ◽  
pp. 521-536 ◽  
Author(s):  
Masami Yasuda

This paper considers the best-choice problem with a random number of objects having a known distribution. The optimality equation of the problem reduces to an integral equation by a scaling limit. The equation is explicitly solved under conditions on the distribution, which relate to the condition for an OLA policy to be optimal in Markov decision processes. This technique is then applied to three different versions of the problem and an exact value for the asymptotic optimal strategy is found.


2015 ◽  
Vol 713-715 ◽  
pp. 760-763
Author(s):  
Jia Lei Zhang ◽  
Zhen Lin Jin ◽  
Dong Mei Zhao

In our previous work, we analyzed some reliability problems of the 2UPS+UP mechanism using a continuous-time Markov repairable model. Because the check and repair of the robot are periodic, a discrete-time Markov repairable model should be more appropriate. First, we built the discrete-time repairable model and obtained the one-step transition probability matrix. Second, we solved the steady-state equations to obtain the steady-state availability of the mechanical leg, and obtained the reliability and the mean time to first failure by solving the difference equations. Finally, we compared the reliability indexes with those of the continuous model.
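
The quantities named in this abstract can be illustrated on the simplest possible discrete-time repairable model. The sketch below is not the paper's 2UPS+UP model: it assumes a hypothetical two-state chain (state 0 = working, state 1 = failed) with made-up per-period failure probability `p_fail` and repair probability `p_repair`, and all function names are illustrative.

```python
def transition_matrix(p_fail, p_repair):
    """One-step transition matrix of the two-state chain (0 = up, 1 = down)."""
    return [[1.0 - p_fail, p_fail],
            [p_repair, 1.0 - p_repair]]


def steady_state_availability(p_fail, p_repair):
    """Solve pi = pi * P with pi_0 + pi_1 = 1; the up-state share is
    p_repair / (p_fail + p_repair)."""
    return p_repair / (p_fail + p_repair)


def availability_by_iteration(P, steps=200):
    """Numerical cross-check: push the distribution through P repeatedly,
    starting from the working state."""
    dist = [1.0, 0.0]
    for _ in range(steps):
        dist = [sum(dist[i] * P[i][j] for i in range(2)) for j in range(2)]
    return dist[0]


def reliability(p_fail, n):
    """Probability of no failure in the first n periods; it solves the
    difference equation R(n + 1) = (1 - p_fail) * R(n) with R(0) = 1."""
    return (1.0 - p_fail) ** n


def mean_time_to_first_failure(p_fail):
    """Expected number of periods up to and including the first failure
    (mean of a geometric distribution)."""
    return 1.0 / p_fail


if __name__ == "__main__":
    P = transition_matrix(0.1, 0.5)
    print(steady_state_availability(0.1, 0.5), availability_by_iteration(P))
    print(reliability(0.1, 3), mean_time_to_first_failure(0.1))
```

A model with more states would replace the closed-form expressions with a linear solve of the steady-state equations, but the structure of the calculation stays the same.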

