Refined knowledge-gradient policy for learning probabilities

2015 ◽  
Vol 43 (2) ◽  
pp. 143-147 ◽  
Author(s):  
Bogumił Kamiński
Author(s):  
Xiaozhou Wang ◽  
Xi Chen ◽  
Qihang Lin ◽  
Weidong Liu

The performance of clustering depends on an appropriately defined similarity between two items. When the similarity is measured based on human perception, human workers are often employed to estimate a similarity score between items in order to support clustering, leading to a procedure called crowdsourced clustering. When a worker is paid for each similarity score, the budget is limited, and both the pairwise similarities and the workers' reliability vary widely, it is critical to assign pairs of items to workers wisely to optimize the clustering result. We model this budget allocation problem as a Markov decision process where item pairs are dynamically assigned to workers based on the historical similarity scores they provided. We propose an optimistic knowledge gradient policy where the assignment of items in each stage is based on the minimum-weight K-cut defined on a similarity graph. We provide simulation studies and real data analysis to demonstrate the performance of the proposed method.


2011 ◽  
Vol 23 (3) ◽  
pp. 346-363 ◽  
Author(s):  
Diana M. Negoescu ◽  
Peter I. Frazier ◽  
Warren B. Powell

2016 ◽  
Vol 31 (2) ◽  
pp. 239-263 ◽  
Author(s):  
James Edwards ◽  
Paul Fearnhead ◽  
Kevin Glazebrook

The knowledge gradient (KG) policy was originally proposed for online ranking and selection problems but has recently been adapted for use in online decision-making in general and multi-armed bandit problems (MABs) in particular. We study its use in a class of exponential family MABs and identify weaknesses, including a propensity to take actions which are dominated with respect to both exploitation and exploration. We propose variants of KG which avoid such errors. These new policies include an index heuristic, which deploys a KG approach to develop an approximation to the Gittins index. A numerical study shows this policy to perform well over a range of MABs including those for which index policies are not optimal. While KG does not take dominated actions when bandits are Gaussian, it fails to be index consistent and appears not to enjoy a performance advantage over competitor policies when arms are correlated to compensate for its greater computational demands.
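
For readers unfamiliar with the policy discussed in the abstract above, the following is a minimal sketch of one common form of online KG for a Bernoulli bandit with independent Beta posteriors. It is an illustration under stated assumptions, not the variants proposed in that paper: the function names `kg_factor` and `online_kg_choice`, the `(alpha, beta)` tuple representation, and the use of the number of remaining pulls as the weight on the KG factor are all choices made here.

```python
from typing import List, Tuple

# Each arm's posterior is Beta(alpha, beta); this representation is an
# assumption of the sketch, not something taken from the listed papers.
Posterior = Tuple[float, float]


def kg_factor(posteriors: List[Posterior], i: int) -> float:
    """Expected one-step increase in the best posterior mean if arm i is pulled."""
    means = [a / (a + b) for a, b in posteriors]
    a, b = posteriors[i]
    # Best posterior mean among the arms that are NOT sampled (unchanged by the pull).
    best_other = max(
        (m for j, m in enumerate(means) if j != i), default=float("-inf")
    )
    p_success = means[i]                 # predictive probability of a success
    mean_up = (a + 1) / (a + b + 1)      # posterior mean of arm i after a success
    mean_down = a / (a + b + 1)          # posterior mean of arm i after a failure
    expected_best = (
        p_success * max(best_other, mean_up)
        + (1 - p_success) * max(best_other, mean_down)
    )
    return expected_best - max(means)


def online_kg_choice(posteriors: List[Posterior], remaining: int) -> int:
    """Pick the arm maximising posterior mean + remaining * KG factor."""
    scores = [
        a / (a + b) + remaining * kg_factor(posteriors, i)
        for i, (a, b) in enumerate(posteriors)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    arms = [(1.0, 1.0), (6.0, 4.0)]      # hypothetical posteriors
    print(online_kg_choice(arms, remaining=0))
    print(online_kg_choice(arms, remaining=10))
```

With no pulls remaining the score reduces to the posterior mean, so the rule is purely greedy; as the remaining horizon grows, arms whose next observation could change which arm looks best receive more weight.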


2021 ◽  
pp. 146144562110168
Author(s):  
Paulien Harms ◽  
Tom Koole ◽  
Ninke Stukker ◽  
Jaap Tulleken

This paper examines how expertise is treated as a separable domain of epistemics by looking at simulated intensive care shift-handovers between resident physicians. In these handovers, medical information about a patient is transferred from an outgoing physician (OP) to an incoming physician (IP). These handovers contain different interactional activities, such as discussing the patient identifiers, giving a clinical impression, and discussing tasks and focus points. We found that with respect to (factual) knowledge about the patient, the OPs display an orientation to a knowledge imbalance, but with respect to (clinical) procedures, reasoning, and activities, they display an orientation to a knowledge balance. We use ‘expertise’ to refer to this latter type of knowledge. ‘Expertise’ differs from, and adds to, how knowledge is often treated in epistemics in that it is concerned with professional competence or ‘knowing how’. In terms of epistemics, the participants in the handovers orient to a steep epistemic or knowledge gradient when it concerns the patient, while simultaneously displaying an orientation to a horizontal expertise gradient.


2018 ◽  
Vol 30 (4) ◽  
pp. 750-767 ◽  
Author(s):  
Yan Li ◽  
Kristofer G. Reyes ◽  
Jorge Vazquez-Anderson ◽  
Yingfei Wang ◽  
Lydia M. Contreras ◽  
...  
