Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability

On transforming an index for generalised bandit problems

Journal of Applied Probability ◽

10.2307/3214927 ◽

1995 ◽

Vol 32 (1) ◽

pp. 168-182 ◽

Cited By ~ 4

Author(s):

K. D. Glazebrook ◽

S. Greatrix

Keyword(s):

Dynamic Programming ◽

Policy Evaluation ◽

Gittins Index ◽

Bandit Problem ◽

Bandit Problems ◽

Index Policies

Nash (1980) demonstrated that index policies are optimal for a class of generalised bandit problems. A transform of the index concerned has many of the attributes of the Gittins index. The transformed index is positive-valued, with maximal values yielding optimal actions. It may be characterised as the value of a restart problem and is hence computable via dynamic programming methodologies. The transformed index can also be used in procedures for policy evaluation.
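The restart characterisation mentioned in this abstract can be illustrated on the classical Gittins index, which is a simpler setting than the transformed index for Nash's generalised bandits that the paper treats. The sketch below is an assumption-laden illustration, not the authors' method: it assumes a single arm modelled as a finite-state Markov chain with transition matrix `P`, reward vector `r`, and discount factor `beta`, and it uses the standard restart-in-state formulation solved by value iteration. The function name and parameters are hypothetical.

```python
import numpy as np

def gittins_index_restart(P, r, beta, state, tol=1e-10, max_iter=100_000):
    """Classical Gittins index of `state` via the restart-in-state problem.

    Assumptions (not taken from the abstract): a finite-state arm with
    transition matrix P, per-state rewards r, and discount 0 < beta < 1.
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    V = np.zeros(len(r))
    for _ in range(max_iter):
        # Value of continuing from each current state.
        cont = r + beta * (P @ V)
        # Value of restarting: behave as if the arm were in `state`.
        restart = cont[state]
        V_new = np.maximum(cont, restart)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    # Rescale the restart value so the index is in reward-per-step units.
    return (1.0 - beta) * V[state]
```

For an arm that never changes state, the index computed this way reduces to that state's reward, which is a convenient sanity check.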


Minimization in Cooperative Response to Failing Database Queries

International Journal of Cooperative Information Systems ◽

10.1142/s0218843097000070 ◽

1997 ◽

Vol 06 (02) ◽

pp. 95-149 ◽

Cited By ~ 62

Author(s):

Parke Godfrey

Keyword(s):

Optimal Algorithm ◽

Simple Algorithm ◽

Search Space ◽

Database Systems ◽

Identification Problems ◽

Database Queries ◽

The Past ◽

Cause Of Failure ◽

Algorithmic Approaches ◽

Answer Set

When a query fails, it is more cooperative to identify the cause of failure, rather than just to report the empty answer set. When there is not a cause per se for the query's failure, it is then worthwhile to report the part of the query which failed. To identify a Minimal Failing Subquery (MFS) of the query is the best way to do this. (This MFS is not unique; there may be many of them.) Likewise, to identify a Maximal Succeeding Subquery (XSS) can help a user to recast a new query that leads to a non-empty answer set. Database systems do not provide the functionality of these types of cooperative responses. This may be, in part, because algorithmic approaches to finding the MFSs and the XSSs to a failing query are not obvious. The search space of subqueries is large. Despite work on MFSs in the past, the algorithmic complexity of these identification problems had remained uncharted. This paper shows the complexity profile of MFS and XSS identification. It is shown that there exists a simple algorithm for finding an MFS or an XSS by asking N subsequent queries, in which N is the length of the query. To find more MFSs (or XSSs) can be hard. It is shown that to find N MFSs (or XSSs) is NP-hard. To find k MFSs (or XSSs), for a fixed k, remains polynomial. An optimal algorithm for enumerating MFSs and XSSs, ISHMAEL, is developed and presented. The algorithm has ideal performance in enumeration, finding the first answers quickly, and only decaying toward intractability in a predictable manner as further answers are found. The complexity results and the algorithmic approaches given in this paper should allow for the construction of cooperative facilities which identify MFSs and XSSs for database systems. These results are relevant to a number of problems outside of databases too, and may find further application.
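The claim that one MFS can be found with N subsequent queries can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's own algorithm or ISHMAEL: the query is assumed to be a conjunction of N conjuncts, `fails` is a hypothetical oracle that runs a subquery and reports whether its answer set is empty, and failure is assumed to be monotone (adding conjuncts to a failing query cannot make it succeed).

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

def find_minimal_failing_subquery(
    conjuncts: Sequence[T],
    fails: Callable[[List[T]], bool],
) -> List[T]:
    """Return one minimal failing subquery of a failing conjunctive query.

    `fails` is a hypothetical oracle (assumed monotone). Apart from the
    initial sanity check, it is called once per conjunct.
    """
    if not fails(list(conjuncts)):
        raise ValueError("the full query does not fail")
    # Track kept conjuncts by position so duplicates are handled correctly.
    kept = list(range(len(conjuncts)))
    for i in range(len(conjuncts)):
        candidate = [j for j in kept if j != i]
        # If the query still fails without conjunct i, that conjunct is
        # not needed to explain the failure and can be dropped.
        if fails([conjuncts[j] for j in candidate]):
            kept = candidate
    return [conjuncts[j] for j in kept]
```

Which MFS is returned depends on the order in which conjuncts are tried, consistent with the abstract's remark that the MFS is not unique.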


Index policies for discounted bandit problems with availability constraints

Advances in Applied Probability ◽

10.1017/s0001867800002573 ◽

2008 ◽

Vol 40 (02) ◽

pp. 377-400 ◽

Cited By ~ 1

Author(s):

Savas Dayanik ◽

Warren Powell ◽

Kazutoshi Yamazaki

Keyword(s):

Bandit Problem ◽

Bandit Problems ◽

Index Policy ◽

State Action ◽

Index Policies ◽

Availability Constraints ◽

Whittle Index ◽

Multiarmed Bandit

A multiarmed bandit problem is studied when the arms are not always available. The arms are first assumed to be intermittently available with some state/action-dependent probabilities. It is proven that no index policy can attain the maximum expected total discounted reward in every instance of that problem. The Whittle index policy is derived, and its properties are studied. Then it is assumed that the arms may break down, but repair is an option at some cost, and the new Whittle index policy is derived. Both problems are indexable. The proposed index policies cannot be dominated by any other index policy over all multiarmed bandit problems considered here. Whittle indices are evaluated for Bernoulli arms with unknown success probabilities.


Optimal Posted-Price Mechanism in Microtask Crowdsourcing

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/33 ◽

2017 ◽

Cited By ~ 3

Author(s):

Zehong Hu ◽

Jie Zhang

Keyword(s):

Upper Bound ◽

Optimal Algorithm ◽

Price Mechanism ◽

Bandit Problem ◽

Price Data ◽

Real Price ◽

Price Range ◽

Posted Price

Posted-price mechanisms are widely adopted to decide the price of tasks in popular microtask crowdsourcing. In this paper, we propose a novel posted-price mechanism which not only outperforms existing mechanisms but also avoids their need for a finite price range. The advantages are achieved by converting the pricing problem into a multi-armed bandit problem and designing an optimal algorithm to exploit the unique features of microtask crowdsourcing. We theoretically show the optimality of our algorithm and prove that the performance upper bound can be achieved without the need for a prior price range. We also conduct extensive experiments using real price data to verify the advantages and practicability of our mechanism.


A note about a partial no-go theorem for quantum PCP

Quantum Information and Computation ◽

10.26421/qic11.11-12-10 ◽

2011 ◽

Vol 11 (11&12) ◽

pp. 1019-1027

Author(s):

Itai Arad

Keyword(s):

Critical Point ◽

Upper Bound ◽

Open Problem ◽

Preliminary Step ◽

Important Open Problem

This is not a disproof of the quantum PCP conjecture! In this note we use perturbation on the commuting Hamiltonian problem on a graph, based on results by Bravyi and Vyalyi, to provide a very partial no-go theorem for quantum PCP. Specifically, we derive an upper bound on how large the promise gap can be for the quantum PCP still to hold, as a function of the non-commuteness of the system. As the system becomes more and more commuting, the maximal promise gap shrinks. We view these results as possibly a preliminary step towards disproving the quantum PCP conjecture posed in \cite{ref:Aha09}. A different way to view these results is actually as indications that a critical point exists, beyond which quantum PCP indeed holds; in any case, we hope that these results will lead to progress on this important open problem.


An Online Minimax Optimal Algorithm for Adversarial Multiarmed Bandit Problem

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2018.2806006 ◽

2018 ◽

Vol 29 (11) ◽

pp. 5565-5580 ◽

Cited By ~ 1

Author(s):

Kaan Gokcesu ◽

Suleyman Serdar Kozat

Keyword(s):

Optimal Algorithm ◽

Bandit Problem ◽

Multiarmed Bandit


Robust separations in inductive inference

Journal of Symbolic Logic ◽

10.2178/jsl/1305810752 ◽

2011 ◽

Vol 76 (2) ◽

pp. 368-376 ◽

Cited By ~ 1

Author(s):

Mark Fulk

Keyword(s):

Language Learning ◽

Open Problem ◽

Formal Language ◽

Inductive Inference ◽

Function Learning ◽

Preliminary Results ◽

Original Idea ◽

Important Open Problem ◽

Inference Theory ◽

Formal Language Learning

Results in recursion-theoretic inductive inference have been criticized as depending on unrealistic self-referential examples. J. M. Bārzdiņš proposed a way of ruling out such examples, and conjectured that one of the earliest results of inductive inference theory would fall if his method were used. In this paper we refute Bārzdiņš' conjecture. We propose a new line of research examining robust separations; these are defined using a strengthening of Bārzdiņš' original idea. The preliminary results of the new line of research are presented, and the most important open problem is stated as a conjecture. Finally, we discuss the extension of this work from function learning to formal language learning.


Index policies for discounted bandit problems with availability constraints

Advances in Applied Probability ◽

10.1239/aap/1214950209 ◽

2008 ◽

Vol 40 (2) ◽

pp. 377-400 ◽

Cited By ~ 5

Author(s):

Savas Dayanik ◽

Warren Powell ◽

Kazutoshi Yamazaki

Keyword(s):

Bandit Problem ◽

Bandit Problems ◽

Index Policy ◽

State Action ◽

Index Policies ◽

Availability Constraints ◽

Whittle Index ◽

Multiarmed Bandit

A multiarmed bandit problem is studied when the arms are not always available. The arms are first assumed to be intermittently available with some state/action-dependent probabilities. It is proven that no index policy can attain the maximum expected total discounted reward in every instance of that problem. The Whittle index policy is derived, and its properties are studied. Then it is assumed that the arms may break down, but repair is an option at some cost, and the new Whittle index policy is derived. Both problems are indexable. The proposed index policies cannot be dominated by any other index policy over all multiarmed bandit problems considered here. Whittle indices are evaluated for Bernoulli arms with unknown success probabilities.


On transforming an index for generalised bandit problems

Journal of Applied Probability ◽

10.1017/s0021900200102633 ◽

1995 ◽

Vol 32 (01) ◽

pp. 168-182

Author(s):

K. D. Glazebrook ◽

S. Greatrix

Keyword(s):

Dynamic Programming ◽

Policy Evaluation ◽

Gittins Index ◽

Bandit Problem ◽

Bandit Problems ◽

Index Policies

Nash (1980) demonstrated that index policies are optimal for a class of generalised bandit problems. A transform of the index concerned has many of the attributes of the Gittins index. The transformed index is positive-valued, with maximal values yielding optimal actions. It may be characterised as the value of a restart problem and is hence computable via dynamic programming methodologies. The transformed index can also be used in procedures for policy evaluation.


How truth wins in opinion dynamics along issue sequences

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1710603114 ◽

2017 ◽

Vol 114 (43) ◽

pp. 11380-11385 ◽

Cited By ~ 22

Author(s):

Noah E. Friedkin ◽

Francesco Bullo

Keyword(s):

Mathematical Model ◽

Open Problem ◽

General Model ◽

Network Science ◽

Social Groups ◽

Opinion Dynamics ◽

Interpersonal Influence ◽

Important Open Problem

How truth wins in social groups is an important open problem. Classic experiments on social groups dealing with truth statement issues present mixed findings on the conditions of truth abandonment and reaching a consensus on the truth. No theory has been developed and evaluated that might integrate these findings with a mathematical model of the interpersonal influence system that alters some or all of its members’ positions on an issue. In this paper we provide evidence that a general model in the network science on opinion dynamics substantially clarifies how truth wins in groups.
