maximum regret
Recently Published Documents



2021 ◽  
pp. 1-16
Author(s):  
Pegah Alizadeh ◽  
Emiliano Traversi ◽  
Aomar Osmani

Markov decision processes (MDPs) are a powerful tool for planning and sequential decision-making problems. In this work we deal with MDPs with imprecise rewards, which are often used when the data is uncertain. In this context, we provide algorithms for finding the policy that minimizes the maximum regret. To the best of our knowledge, all the regret-based methods proposed in the literature focus on providing an optimal stochastic policy. We introduce, for the first time, a method to calculate an optimal deterministic policy using optimization approaches. Deterministic policies are easily interpretable for users because they provide a unique choice for a given state. To better motivate the use of an exact procedure for finding a deterministic policy, we show some theoretical and experimental cases where the intuitive idea of “determinizing” the optimal stochastic policy leads to a policy far from the exact deterministic policy.
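To make the objective concrete, here is a minimal sketch of minimax regret over deterministic policies. It is not the authors' optimization method: it brute-forces a tiny, made-up MDP and assumes the imprecise rewards are given as a finite list of scenarios. All names (`SCENARIOS`, `policy_value`, `max_regret`) and all numbers are hypothetical.

```python
from itertools import product

GAMMA = 0.9          # discount factor (assumed)
STATES = (0, 1)
ACTIONS = (0, 1)
START = 0            # value is measured from this start state

# Toy deterministic dynamics: taking action a moves the agent to state a.
NEXT_STATE = {(s, a): a for s in STATES for a in ACTIONS}

# Hypothetical reward scenarios, each mapping (state, action) -> reward.
# They differ only in how rewarding it is to stay in state 1.
SCENARIOS = [
    {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0},
    {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0},
]

# A deterministic policy is a tuple: policy[s] is the action taken in state s.
POLICIES = list(product(ACTIONS, repeat=len(STATES)))


def policy_value(policy, reward, sweeps=500):
    """Discounted value of `policy` from START, by iterative policy evaluation."""
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        v = [
            reward[(s, policy[s])] + GAMMA * v[NEXT_STATE[(s, policy[s])]]
            for s in STATES
        ]
    return v[START]


def max_regret(policy):
    """Worst-case gap to the best policy, taken over all reward scenarios."""
    worst = 0.0
    for reward in SCENARIOS:
        optimal = max(policy_value(p, reward) for p in POLICIES)
        worst = max(worst, optimal - policy_value(policy, reward))
    return worst


# Deterministic policy with the smallest maximum regret (first one on ties).
best_policy = min(POLICIES, key=max_regret)
```

Enumeration grows exponentially with the number of states, so it only works on toy instances like this one.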


Author(s):  
Valentyn Litvin ◽  
Charles F. Manski

In this article, we present the wald_tc command, which computes the maximum regret (MR) of a user-specified statistical treatment rule that uses sample data on realized treatment response (and optionally an instrumental variable) to determine a treatment choice for a population. Because the outcomes of counterfactual treatments are not observed and treatment selection in the study population may not be random, decision makers may be able only to partially identify average treatment effects. wald_tc allows users to compute the MR of a proposed statistical treatment rule under a flexible specification of the data-generating process and determines the state that generates MR.
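The idea of a rule's maximum regret can be sketched in a much simpler setting than the one wald_tc handles. The Python below is not the wald_tc command and does not reproduce its interface. It assumes a status-quo treatment with known success probability `p0`, a new treatment with unknown success probability `p`, and a rule that picks the new treatment when at least `threshold` of `n` trial outcomes are successes. All function names are hypothetical.

```python
from math import comb


def prob_choose_new(p, n, threshold):
    """Probability that a Binomial(n, p) sample has at least `threshold` successes."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold, n + 1)
    )


def regret(p, p0, n, threshold):
    """Expected welfare loss of the rule in state p, relative to the better treatment."""
    q = prob_choose_new(p, n, threshold)
    if p > p0:
        return (p - p0) * (1 - q)   # loss from wrongly keeping the status quo
    return (p0 - p) * q             # loss from wrongly adopting the new treatment


def maximum_regret(p0, n, threshold, grid_size=1001):
    """Return (maximum regret, state attaining it) over an evenly spaced grid of p."""
    states = [i / (grid_size - 1) for i in range(grid_size)]
    worst_state = max(states, key=lambda p: regret(p, p0, n, threshold))
    return regret(worst_state, p0, n, threshold), worst_state
```

For example, `maximum_regret(0.5, 10, 6)` evaluates the rule "adopt the new treatment if at least 6 of 10 trial outcomes are successes" and also reports the grid state in which its regret is largest.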


Econometrica ◽  
2021 ◽  
Vol 89 (2) ◽  
pp. 825-848
Author(s):  
Eric Mbakop ◽  
Max Tabord-Meehan

This paper studies a penalized statistical decision rule for the treatment assignment problem. Consider the setting of a utilitarian policy maker who must use sample data to allocate a binary treatment to members of a population, based on their observable characteristics. We model this problem as a statistical decision problem where the policy maker must choose a subset of the covariate space to assign to treatment, out of a class of potential subsets. We focus on settings in which the policy maker may want to select amongst a collection of constrained subset classes: examples include choosing the number of covariates over which to perform best‐subset selection, and model selection when approximating a complicated class via a sieve. We adapt and extend results from statistical learning to develop the Penalized Welfare Maximization (PWM) rule. We establish an oracle inequality for the regret of the PWM rule which shows that it is able to perform model selection over the collection of available classes. We then use this oracle inequality to derive relevant bounds on maximum regret for PWM. An important consequence of our results is that we are able to formalize model‐selection using a “holdout” procedure, where the policy maker would first estimate various policies using half of the data, and then select the policy which performs the best when evaluated on the other half of the data.
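The holdout procedure described at the end of the abstract can be sketched as follows. This is an illustration under strong assumptions, not the paper's PWM rule. The data are synthetic and come from a simulated randomized experiment with known propensity 0.5. Policies are threshold rules on a single covariate, and each class is a grid of thresholds. All names and numbers are hypothetical.

```python
import random


def empirical_welfare(threshold, data, propensity=0.5):
    """Inverse-propensity estimate of mean outcome under 'treat iff x >= threshold'."""
    total = 0.0
    for x, d, y in data:
        assigned = 1 if x >= threshold else 0
        if d == assigned:
            prob = propensity if d == 1 else 1 - propensity
            total += y / prob
    return total / len(data)


def best_in_class(thresholds, data):
    """Empirical welfare maximizer within one class of threshold rules."""
    return max(thresholds, key=lambda t: empirical_welfare(t, data))


def holdout_select(classes, data):
    """Fit one policy per class on the first half, pick the best on the second half."""
    half = len(data) // 2
    train, holdout = data[:half], data[half:]
    candidates = [best_in_class(c, train) for c in classes]
    selected = max(candidates, key=lambda t: empirical_welfare(t, holdout))
    return selected, candidates


# Synthetic experiment: in this toy model, treatment helps only when x > 0.5.
rng = random.Random(0)
data = []
for _ in range(400):
    x = rng.random()
    d = rng.randint(0, 1)
    y = (x - 0.5) * d + rng.gauss(0, 0.1)
    data.append((x, d, y))

# Nested classes of increasing richness: coarse to fine threshold grids.
classes = [
    [0.0, 1.0],
    [i / 4 for i in range(5)],
    [i / 20 for i in range(21)],
]
selected, candidates = holdout_select(classes, data)
```

Splitting the sample keeps the selection step from rewarding a rich class merely for overfitting the estimation half.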


Author(s):  
Charles F. Manski

This chapter considers reasonable decision making with sample data from randomized trials. It continues the discussion of reasonable patient care under uncertainty. Because of its centrality to evidence-based medicine, the chapter focuses on the use of sample trial data in treatment choice. Moreover, having already addressed identification, the chapter considers only statistical imprecision, as has been the case in the statistical literature on trials. The Wald (1950) development of statistical decision theory provides a coherent framework for the use of sample data to make decisions. A body of recent research applies statistical decision theory to determine treatment choices that achieve adequate performance in all states of nature, in the sense of maximum regret. This chapter describes the basic ideas and findings, which provide an appealing practical alternative to the use of hypothesis tests.


Author(s):  
Sabine Storandt ◽  
Stefan Funke

In this paper, we study a problem from the realm of multicriteria decision making in which the goal is to select from a given set S of d-dimensional objects a minimum-sized subset S0 with bounded regret. Here, regret measures the unhappiness of users who would like to select their favorite object from the set S but can now only select their favorite object from the subset S0. Previous work focused on bounding the maximum regret, which is determined by the most unhappy user. We propose to consider the average regret instead, which is determined by the sum of the (un)happiness of all possible users. We show that this regret measure comes with desirable properties such as supermodularity, which allows the construction of approximation algorithms. Furthermore, we introduce the regret minimizing permutation problem and discuss extensions of our algorithms to the recently proposed k-regret measure. Our theoretical results are accompanied by experiments on a variety of inputs with d up to 7.
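The difference between the two regret measures can be shown with a small sketch. The abstract does not fix a utility model, so the code assumes linear utilities, a finite sample of user weight vectors standing in for "all possible users", and regret measured as an absolute utility gap. Function names and data are hypothetical.

```python
def utility(weights, obj):
    """Linear utility of an object for a user with the given weight vector."""
    return sum(w * v for w, v in zip(weights, obj))


def best_utility(weights, objects):
    """Utility of the user's favorite object within `objects`."""
    return max(utility(weights, obj) for obj in objects)


def regrets(full_set, subset, users):
    """Per-user loss from choosing within `subset` instead of `full_set`."""
    return [best_utility(u, full_set) - best_utility(u, subset) for u in users]


def maximum_regret(full_set, subset, users):
    """Regret of the most unhappy user."""
    return max(regrets(full_set, subset, users))


def average_regret(full_set, subset, users):
    """Mean regret across the sampled users."""
    values = regrets(full_set, subset, users)
    return sum(values) / len(values)


# Toy instance with d = 2.
S = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
users = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

compromise = [(0.6, 0.6)]   # subset holding only the balanced object
extreme = [(1.0, 0.0)]      # subset holding only one extreme object
```

In this instance the `compromise` subset has a maximum regret of about 0.4 and an average regret of about 0.27. The `extreme` subset has a maximum regret of 1.0 and an average regret of about 0.37.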


2016 ◽  
Vol 26 (1) ◽  
pp. 47-63 ◽  
Author(s):  
Zhi-Long Chen ◽  
Nicholas G. Hall ◽  
Hans Kellerer
