Cost-aware Cascading Bandits

Author(s):  
Ruida Zhou ◽  
Chao Gan ◽  
Jing Yang ◽  
Cong Shen

In this paper, we propose a cost-aware cascading bandits model, a new variant of multi-armed bandits with cascading feedback that accounts for the random cost of pulling arms. In each step, the learning agent chooses an ordered list of items and examines them sequentially until a certain stopping condition is satisfied. Our objective is to maximize the expected net reward in each step, i.e., the reward obtained in each step minus the total cost incurred in examining the items, by deciding the ordered list of items as well as when to stop examination. We study both the offline and online settings, depending on whether the state and cost statistics of the items are known beforehand. For the offline setting, we show that the Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal. For the online setting, we propose a Cost-aware Cascading Upper Confidence Bound (CC-UCB) algorithm and show that the cumulative regret scales as $O(\log T)$. We also provide a lower bound for all $\alpha$-consistent policies, which scales as $\Omega(\log T)$ and matches our upper bound. The performance of the CC-UCB algorithm is evaluated with both synthetic and real-world data.
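As a rough illustration of the online setting, the sketch below runs a UCB-style loop with per-arm examination costs and a threshold-style stopping rule. The confidence-width constant, cost model, and stopping rule are our assumptions for illustration, not the paper's exact CC-UCB specification.

```python
import numpy as np

rng = np.random.default_rng(0)
K, T = 5, 10_000
theta = rng.uniform(0.1, 0.9, K)       # unknown arm success probabilities
mean_cost = rng.uniform(0.0, 0.3, K)   # unknown mean examination costs

succ = np.zeros(K); cost_sum = np.zeros(K); pulls = np.zeros(K)

def examine(i):
    """Examine arm i: observe its Bernoulli state and a random cost."""
    x = rng.binomial(1, theta[i])
    c = float(np.clip(mean_cost[i] + rng.normal(0, 0.05), 0.0, 1.0))
    succ[i] += x; cost_sum[i] += c; pulls[i] += 1
    return x

for i in range(K):                     # initialization: examine each arm once
    examine(i)

for t in range(K + 1, T + 1):
    ucb = succ / pulls + np.sqrt(1.5 * np.log(t) / pulls)
    cost_hat = cost_sum / pulls
    order = np.argsort(-(ucb - cost_hat))   # rank by optimistic net value
    for i in order:
        if ucb[i] - cost_hat[i] <= 0:       # stop: optimistic net gain non-positive
            break
        if examine(i) == 1:                 # cascading feedback: a success
            break                           # ends the examination in this step
```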

Author(s):  
Julian Berk ◽  
Sunil Gupta ◽  
Santu Rana ◽  
Svetha Venkatesh

To improve the performance of Bayesian optimisation, we develop a modified Gaussian process upper confidence bound (GP-UCB) acquisition function that samples the exploration-exploitation trade-off parameter from a distribution. We prove that this allows the expected trade-off parameter to be altered to better suit the problem without compromising the bound on the Bayesian regret. We also provide results showing that our method achieves better performance than GP-UCB on a range of real-world and synthetic problems.
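The idea can be sketched in a few lines: at each round, draw the trade-off parameter from a distribution before forming the UCB acquisition. The Gamma distribution, kernel, toy objective, and use of scikit-learn below are our assumptions for illustration, not the authors' exact choices.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
f = lambda x: -np.sin(3 * x) - x**2 + 0.7 * x   # toy objective to maximize
grid = np.linspace(-1.0, 2.0, 500).reshape(-1, 1)

X = rng.uniform(-1.0, 2.0, (3, 1))              # initial design points
y = f(X).ravel() + rng.normal(0, 0.01, 3)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-4)
for t in range(30):
    gp.fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    beta_t = rng.gamma(shape=2.0, scale=1.0)    # sampled exploration weight
    acq = mu + np.sqrt(beta_t) * sigma          # randomized UCB acquisition
    x_next = grid[np.argmax(acq)].reshape(1, 1)
    y_next = f(x_next).ravel() + rng.normal(0, 0.01, 1)
    X = np.vstack([X, x_next]); y = np.concatenate([y, y_next])

print("best observed point:", grid[np.argmax(gp.predict(grid))].item())
```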


Author(s):  
Zejian Li ◽  
Yongchuan Tang ◽  
Wei Li ◽  
Yongxing He

Unsupervised disentangled representation learning is one of the foundational methods for learning interpretable factors in data. Existing methods assume that the disentangled factors are mutually independent and incorporate this assumption into the evidence lower bound. However, our experiments reveal that factors in real-world data tend to be only pairwise independent. Accordingly, we propose a new method that learns the disentangled representation under a pairwise independence assumption. Because the evidence lower bound implicitly encourages mutual independence of the latent codes, it is too strong for our assumption; we therefore introduce another lower bound in our method. Extensive experiments show that our proposed method achieves performance competitive with other state-of-the-art methods.
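As a loose illustration of relaxing mutual independence to a pairwise criterion, the snippet below penalizes pairwise correlation between latent dimensions. Note that uncorrelatedness is strictly weaker than pairwise independence, so this is only a stand-in for the idea, not the paper's lower bound; the regularizer name and training-loop usage are hypothetical.

```python
import torch

def pairwise_decorrelation_penalty(z: torch.Tensor) -> torch.Tensor:
    """Penalize off-diagonal entries of the latent correlation matrix.
    z: (batch, dim) latent codes sampled from the encoder."""
    zc = z - z.mean(dim=0, keepdim=True)
    cov = zc.T @ zc / (z.shape[0] - 1)
    std = torch.sqrt(torch.diag(cov)).clamp_min(1e-8)
    corr = cov / (std[:, None] * std[None, :])
    off = corr - torch.diag(torch.diag(corr))    # zero out the diagonal
    return (off ** 2).sum() / (z.shape[1] * (z.shape[1] - 1))

# Hypothetical usage inside a VAE training step (recon_loss, kl_loss,
# and lam are assumed to be defined by the surrounding training code):
# loss = recon_loss + kl_loss + lam * pairwise_decorrelation_penalty(z)
z = torch.randn(128, 10)
print(pairwise_decorrelation_penalty(z))
```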


Author(s):  
Takanori Maehara ◽  
Atsuhiro Narita ◽  
Jun Baba ◽  
Takayuki Kawabata

Brand advertising is a type of advertising that aims at increasing the awareness of companies or products. This type of advertising is well studied in the economic, marketing, and psychological literature; however, there are no studies in the area of computational advertising because the effect of such advertising is difficult to observe. In this study, we consider a real-time bidding strategy for brand advertising. Here, our objective is to maximize the total number of users who remember the advertisement, averaged over time. For this objective, we first introduce a new objective function that captures the cognitive-psychological properties of memory retention and can be optimized efficiently in the online setting (i.e., it is a monotone submodular function). Then, we propose an algorithm for the bid optimization problem with the proposed objective function under the second-price mechanism by reducing the problem to the online knapsack-constrained monotone submodular maximization problem. We evaluated the proposed objective function and the algorithm on real-world data collected from our system and a questionnaire survey. We observed that our objective function is reasonable in a real-world setting and that the proposed algorithm outperformed the baseline online algorithms.
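A minimal sketch of the bidding side is given below: bid on an impression when its marginal retention gain per unit cost clears a threshold that rises as the budget is spent, a standard online-knapsack heuristic. The exponential forgetting-curve gain and all constants are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
B = 100.0                      # total budget
spent = 0.0
last_seen = {}                 # user id -> time of last ad exposure

def marginal_retention_gain(user, t, decay=0.05):
    """Toy memory-retention gain: re-exposing a user is worth more the
    longer ago they last saw the ad (assumed exponential forgetting)."""
    if user not in last_seen:
        return 1.0
    return 1.0 - np.exp(-decay * (t - last_seen[user]))

L, U = 0.1, 10.0               # assumed lower/upper bounds on value density
for t in range(2000):
    user = rng.integers(0, 300)
    price = rng.uniform(0.05, 1.0)           # second-price clearing price
    gain = marginal_retention_gain(user, t)
    frac = spent / B
    threshold = L * (U / L) ** frac          # threshold grows with spend
    if spent + price <= B and gain / price >= threshold:
        spent += price                       # win and pay the second price
        last_seen[user] = t

print(f"spent {spent:.1f} of {B}, reached {len(last_seen)} users")
```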


Author(s):  
Xueying Guo ◽  
Xiaoxiao Wang ◽  
Xin Liu

In this paper, we propose and study opportunistic contextual bandits, a special case of contextual bandits in which the exploration cost varies with environmental conditions such as network load or return variation in recommendations. When the exploration cost is low, so is the actual regret of pulling a sub-optimal arm (e.g., trying a suboptimal recommendation). Intuitively, therefore, we should explore more when the exploration cost is relatively low and exploit more when it is relatively high. Inspired by this intuition, for opportunistic contextual bandits with linear payoffs, we propose an Adaptive Upper-Confidence-Bound algorithm (AdaLinUCB) to adaptively balance the exploration-exploitation trade-off for opportunistic learning. We prove that AdaLinUCB achieves an $O((\log T)^2)$ problem-dependent regret upper bound, with a smaller coefficient than that of the traditional LinUCB algorithm. Moreover, on both synthetic and real-world datasets, we show that AdaLinUCB significantly outperforms other contextual bandit algorithms under large exploration-cost fluctuations.
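The intuition can be sketched as a LinUCB variant whose confidence width shrinks when the exploration-cost signal (e.g., load) is high. The linear scaling of the width with load below is our simplification, not the paper's exact AdaLinUCB schedule.

```python
import numpy as np

rng = np.random.default_rng(3)
d, K, T = 5, 10, 5000
theta_star = rng.normal(size=d); theta_star /= np.linalg.norm(theta_star)

A = np.eye(d)                  # ridge-regularized Gram matrix
b = np.zeros(d)
alpha_max = 1.0                # maximum exploration width

for t in range(T):
    X = rng.normal(size=(K, d)) / np.sqrt(d)   # contexts for K arms
    load = rng.uniform()                       # exploration-cost signal in [0, 1]
    alpha_t = alpha_max * (1.0 - load)         # low cost -> wide confidence bonus
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b
    widths = np.sqrt(np.einsum('ki,ij,kj->k', X, A_inv, X))
    a = int(np.argmax(X @ theta_hat + alpha_t * widths))
    r = X[a] @ theta_star + rng.normal(0, 0.1)
    A += np.outer(X[a], X[a]); b += r * X[a]   # standard LinUCB update
```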


Author(s):  
Avinash Balakrishnan ◽  
Djallel Bouneffouf ◽  
Nicholas Mattei ◽  
Francesca Rossi

AI systems that learn through reward feedback about the actions they take are increasingly deployed in domains that have significant impact on our daily life. However, in many cases the online rewards should not be the only guiding criterion, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online agent that learns a set of behavioral constraints by observation and uses these learned constraints as a guide when making decisions in an online setting while still being reactive to reward feedback. To define this agent, we propose a novel extension to the classical contextual multi-armed bandit setting and provide a new algorithm called Behavior Constrained Thompson Sampling (BCTS) that allows for online learning while obeying exogenous constraints. Our agent learns a constrained policy that implements the observed behavioral constraints demonstrated by a teacher agent, and then uses this constrained policy to guide the reward-based online exploration and exploitation. We characterize the upper bound on the expected regret of the contextual bandit algorithm that underlies our agent and provide a case study with real-world data in two application domains. Our experiments show that the designed agent is able to act within the set of behavior constraints without significantly degrading its overall reward performance.
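A minimal sketch of the idea, assuming a Beta-Bernoulli bandit and a constraint set read directly off teacher demonstrations (a simplification of the learned constrained policy described above):

```python
import numpy as np

rng = np.random.default_rng(4)
K = 6
true_p = rng.uniform(0.1, 0.9, K)         # unknown reward probabilities
teacher_allowed = {0, 1, 3, 4}            # arms observed in teacher demonstrations

alpha = np.ones(K); beta = np.ones(K)     # Beta posterior per arm
for t in range(5000):
    samples = rng.beta(alpha, beta)       # Thompson sample per arm
    # mask out arms that violate the learned behavioral constraints
    for k in range(K):
        if k not in teacher_allowed:
            samples[k] = -np.inf
    a = int(np.argmax(samples))
    r = rng.binomial(1, true_p[a])
    alpha[a] += r; beta[a] += 1 - r       # posterior update from reward
```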


2016 ◽  
Vol 22 ◽  
pp. 219
Author(s):  
Roberto Salvatori ◽  
Olga Gambetti ◽  
Whitney Woodmansee ◽  
David Cox ◽  
Beloo Mirakhur ◽  
...  

VASA ◽  
2019 ◽  
Vol 48 (2) ◽  
pp. 134-147 ◽  
Author(s):  
Mirko Hirschl ◽  
Michael Kundi

Abstract. Background: In randomized controlled trials (RCTs), direct acting oral anticoagulants (DOACs) showed a superior risk-benefit profile in comparison to vitamin K antagonists (VKAs) for patients with nonvalvular atrial fibrillation. However, patients enrolled in such studies do not necessarily reflect the whole target population treated in real-world practice. Materials and methods: A systematic literature search identified 88 studies including 3,351,628 patients and over 2.9 million patient-years of follow-up. Hazard ratios and event rates for the main efficacy and safety outcomes were extracted, and the results for DOACs and VKAs were combined by network meta-analysis. In addition, meta-regression was performed to identify factors responsible for heterogeneity across studies. Results: For stroke and systemic embolism, as well as for major bleeding and intracranial bleeding, real-world studies gave virtually the same results as RCTs, with higher efficacy and lower major bleeding risk (for dabigatran and apixaban) and lower risk of intracranial bleeding (all DOACs) compared with VKAs. Results for gastrointestinal bleeding were consistently better for DOACs, and hazard ratios for myocardial infarction were significantly lower in real-world studies for dabigatran and apixaban than in RCTs. A ranking analysis found apixaban to be the safest anticoagulant, while rivaroxaban, closely followed by dabigatran, was the most efficacious. Risk of bias and heterogeneity were assessed and had little impact on the overall results. Analysis of effect modification can guide clinical decisions, as no single DOAC was superior or inferior to the others under all conditions. Conclusions: DOACs were at least as efficacious as VKAs. In terms of safety endpoints, DOACs performed better under real-world conditions than in RCTs. The current real-world data show that differences in efficacy and safety, despite generally low event rates, exist between DOACs. Knowledge of these differences in performance can contribute to more personalized medicine.
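For readers unfamiliar with the pooling step, the snippet below shows fixed-effect inverse-variance pooling of log hazard ratios, the basic building block behind such a meta-analysis. The study numbers are hypothetical, and the full network meta-analysis and meta-regression are beyond this sketch.

```python
import numpy as np

# Hypothetical per-study hazard ratios with upper 95% CI bounds
hr = np.array([0.79, 0.88, 0.66])
ci_upper = np.array([0.95, 1.03, 0.80])

log_hr = np.log(hr)
se = (np.log(ci_upper) - log_hr) / 1.96     # recover SE from the CI half-width
w = 1.0 / se**2                              # inverse-variance weights
pooled_log = np.sum(w * log_hr) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))
lo = np.exp(pooled_log - 1.96 * pooled_se)
hi = np.exp(pooled_log + 1.96 * pooled_se)
print(f"pooled HR {np.exp(pooled_log):.2f} (95% CI {lo:.2f}-{hi:.2f})")
```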

