Improved Algorithms for Conservative Exploration in Bandits

Evrard Garcelon; Mohammad Ghavamzadeh; Alessandro Lazaric; Matteo Pirotta

doi:10.1609/aaai.v34i04.5812

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5812 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3962-3969

Author(s):

Evrard Garcelon ◽

Mohammad Ghavamzadeh ◽

Alessandro Lazaric ◽

Matteo Pirotta

Keyword(s):

State Of The Art ◽

Digital Marketing ◽

Learning Problem ◽

Online Learning Algorithms ◽

Empirical Performance ◽

Regret Bounds ◽

Healthcare Finance ◽

And Robotics ◽

Novel Algorithm ◽

Real World Problems

In many fields such as digital marketing, healthcare, finance, and robotics, it is common to have a well-tested and reliable baseline policy running in production (e.g., a recommender system). Nonetheless, the baseline policy is often suboptimal. In this case, it is desirable to deploy online learning algorithms (e.g., a multi-armed bandit algorithm) that interact with the system to learn a better/optimal policy under the constraint that during the learning process the performance is almost never worse than the performance of the baseline itself. In this paper, we study the conservative learning problem in the contextual linear bandit setting and introduce a novel algorithm, the Conservative Constrained LinUCB (CLUCB2). We derive regret bounds for CLUCB2 that match existing results and empirically show that it outperforms state-of-the-art conservative bandit algorithms in a number of synthetic and real-world problems. Finally, we consider a more realistic constraint where the performance is verified only at predefined checkpoints (instead of at every step) and show how this relaxed constraint favorably impacts the regret and empirical performance of CLUCB2.
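The conservative constraint described above can be illustrated with a minimal sketch. It assumes one common formalization from the conservative bandit literature, in which the learner's cumulative reward must stay at or above a (1 - alpha) fraction of the baseline's cumulative reward; the abstract does not state the exact condition, so the function name `violated_steps`, the parameter `alpha`, and the `checkpoints` argument are all hypothetical. The sketch only checks recorded rewards after the fact and is not the CLUCB2 algorithm.

```python
from itertools import accumulate


def violated_steps(rewards, baseline_rewards, alpha, checkpoints=None):
    """Return the 1-indexed steps at which the conservative condition fails.

    Assumed condition (an illustration, not taken from the paper): at each
    checked step t,
        sum(rewards[:t]) >= (1 - alpha) * sum(baseline_rewards[:t]).
    If `checkpoints` is None, every step is checked; otherwise only the
    listed steps are checked (the relaxed, checkpoint-based variant).
    """
    if len(rewards) != len(baseline_rewards):
        raise ValueError("reward sequences must have the same length")
    learner_cum = list(accumulate(rewards))
    baseline_cum = list(accumulate(baseline_rewards))
    steps = range(1, len(rewards) + 1) if checkpoints is None else checkpoints
    return [
        t for t in steps
        if learner_cum[t - 1] < (1 - alpha) * baseline_cum[t - 1]
    ]


# Illustrative data: the learner explores early and recovers later.
learner = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
baseline = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]

every_step = violated_steps(learner, baseline, alpha=0.2)
at_checkpoints = violated_steps(learner, baseline, alpha=0.2, checkpoints=[4, 6])
```

On this made-up data the per-step check flags the early exploratory steps, while the checkpoint check at steps 4 and 6 flags nothing. That is the sense in which verifying only at checkpoints leaves more room for exploration.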

Unifying the Stochastic and the Adversarial Bandits with Knapsack

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/459 ◽

2019 ◽

Cited By ~ 1

Author(s):

Anshuka Rangi ◽

Massimo Franceschetti ◽

Long Tran-Thanh

Keyword(s):

Learning Algorithm ◽

Unit Cost ◽

Learning Problem ◽

Maximum Cost ◽

Common Assumption ◽

Performance Guarantees ◽

Fixed Action ◽

Regret Bounds ◽

The Difference ◽

Novel Algorithm

This work investigates the adversarial Bandits with Knapsack (BwK) learning problem, where a player repeatedly chooses to perform an action, pays the corresponding cost of the action, and receives a reward associated with the action. The player is constrained by the maximum budget that can be spent to perform the actions, and the rewards and the costs of these actions are assigned by an adversary. This setting is studied in terms of expected regret, defined as the difference between the total expected rewards per unit cost corresponding to the best fixed action and the total expected rewards per unit cost of the learning algorithm. We propose a novel algorithm, EXP3.BwK, and show that the expected regret of the algorithm is order optimal in the budget. We then propose another algorithm, EXP3++.BwK, which is order optimal in the adversarial BwK setting and incurs an almost optimal expected regret in the stochastic BwK setting, where the rewards and the costs are drawn from unknown underlying distributions. These results are then extended to a more general online learning setting by designing another algorithm, EXP3++.LwK, and providing its performance guarantees. Finally, we investigate the scenario where the costs of the actions are large and comparable to the budget. We show that for the adversarial setting, the achievable regret bounds scale at least linearly with the maximum cost for any learning algorithm, and are significantly worse than in the case of costs bounded by a constant, which is a common assumption in the BwK literature.

Computer Vision and robotics in postal automation

Human Systems Management ◽

10.3233/hsm-1999-183-411 ◽

1999 ◽

Vol 18 (3-4) ◽

pp. 265-273

Author(s):

Giovanni B. Garibotto

Keyword(s):

Image Processing ◽

Computer Vision ◽

Pattern Recognition ◽

Material Handling ◽

State Of The Art ◽

Short Description ◽

The Other ◽

Functional Requirements ◽

Postal Automation ◽

And Robotics

The paper provides an overview of advanced robotic technologies in the context of Postal Automation services. The main functional requirements of the application are briefly reviewed, as well as the state of the art and new emerging solutions. Image Processing and Pattern Recognition have always played a fundamental role in Address Interpretation and Mail sorting, and the new challenging objective is now off-line handwritten cursive recognition, in order to handle all kinds of addresses in a uniform way. On the other hand, advanced electromechanical and robotic solutions are extremely important for solving the problems of mail storage, transportation and distribution, as well as for material handling and logistics. Finally, a short description of new Postal Automation services is given, considering the emerging services of hybrid mail and paper-to-electronic conversion.

Imbalanced Learning Based on Logistic Discrimination

Computational Intelligence and Neuroscience ◽

10.1155/2016/5423204 ◽

2016 ◽

Vol 2016 ◽

pp. 1-10 ◽

Cited By ~ 3

Author(s):

Huaping Guo ◽

Weimei Zhi ◽

Hongbing Liu ◽

Mingliang Xu

Keyword(s):

Statistical Model ◽

Cost Function ◽

State Of The Art ◽

Class Imbalance ◽

Imbalanced Learning ◽

Learning Problem ◽

Logistic Discrimination ◽

Positive Class ◽

Negative Class ◽

Novel Method

In recent years, the imbalanced learning problem has attracted increasing attention from both academia and industry. The problem concerns the performance of learning algorithms in the presence of data with severe class distribution skews. In this paper, we apply the well-known statistical model logistic discrimination to this problem and propose a novel method to improve its performance. To fully account for the class imbalance, we design a new cost function that takes into account the accuracies of both the positive class and the negative class as well as the precision of the positive class. Unlike traditional logistic discrimination, the proposed method learns its parameters by maximizing the proposed cost function. Experimental results show that, compared with other state-of-the-art methods, the proposed one performs significantly better on the measures of recall, g-mean, f-measure, AUC, and accuracy.
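The evaluation measures named above have standard definitions, which a short sketch can make concrete. The function `imbalance_metrics` and its argument names are hypothetical, and the code computes only the usual confusion-matrix metrics; it does not reproduce the paper's cost function, which the abstract does not specify. AUC is omitted because it needs ranked scores rather than counts.

```python
import math


def imbalance_metrics(tp, fp, tn, fn):
    """Compute standard imbalance-aware metrics from confusion-matrix counts.

    tp and fn count positive-class examples; tn and fp count negative-class
    examples. Ratios with an empty denominator are reported as 0.0.
    """
    recall = tp / (tp + fn) if tp + fn else 0.0        # accuracy on positives
    specificity = tn / (tn + fp) if tn + fp else 0.0   # accuracy on negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(recall * specificity)
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "g_mean": g_mean,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# A classifier that finds most of the 10 positives among 100 examples.
metrics = imbalance_metrics(tp=8, fp=2, tn=88, fn=2)

# A classifier that always predicts the negative (majority) class.
majority = imbalance_metrics(tp=0, fp=0, tn=90, fn=10)
```

The second call shows why accuracy alone is a poor guide under class skew: always predicting the majority class still scores 0.9 accuracy, while recall, g-mean, and f-measure all drop to zero.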

Relocation of macrophages maintains the barrier function of the urothelium and protects against persistent infection

10.1101/649137 ◽

2019 ◽

Author(s):

Jenny Bottek ◽

Camille Soun ◽

Julia K Volke ◽

Akanksha Dixit ◽

Stephanie Thiebes ◽

...

Keyword(s):

Mass Spectrometry ◽

Urinary Tract Infection ◽

Connective Tissue ◽

Barrier Function ◽

Bacterial Infections ◽

Mass Spectrometry Imaging ◽

Recurrent Urinary Tract Infection ◽

State Of The Art ◽

Urothelial Cells ◽

Novel Algorithm

SUMMARY: Macrophages perform essential functions during bacterial infections, such as phagocytosis of pathogens and elimination of neutrophils to reduce spreading of infection, inflammation and tissue damage. The spatial distribution of macrophages is critical to respond to tissue-specific adaptations upon infections. Using a novel algorithm for correlative mass spectrometry imaging and state-of-the-art multiplex microscopy, we report here that macrophages within the urinary bladder are positioned in the connective tissue underneath the urothelium. Invading uropathogenic E. coli induced an IL-6–dependent CX3CL1 expression by urothelial cells, facilitating relocation of macrophages from the connective tissue into the urothelium. These cells phagocytosed UPECs and eliminated neutrophils to maintain barrier function of the urothelium, preventing persistent and recurrent urinary tract infection.

Efficient Lifted Planning with Regression-Based Heuristics

10.32920/ryerson.14647755 ◽

2021 ◽

Author(s):

Hadi Qovaizi

Keyword(s):

Organic Synthesis ◽

State Of The Art ◽

Transition System ◽

Modern State ◽

Synthetic Approach ◽

Best First Search ◽

Planning Task ◽

International Planning ◽

Domain Independent ◽

Novel Algorithm

Modern state-of-the-art planners operate by generating a grounded transition system prior to performing search for a solution to a given planning task. Some tasks involve a significant number of objects or entail managing predicates and action schemas with a significant number of arguments. Hence, this instantiation procedure can exhaust all available memory and therefore prevent a planner from performing search to find a solution. This thesis explores this limitation by presenting a benchmark set of problems based on Organic Chemistry Synthesis that was submitted to the latest International Planning Competition (IPC-2018). This benchmark was constructed to gauge the performance of the competing planners when instantiation is an issue. Furthermore, a novel algorithm, the Regression-Based Heuristic Planner (RBHP), is developed with the aim of averting this issue. RBHP was inspired by the retro-synthetic approach commonly used to solve organic synthesis problems efficiently. RBHP solves planning tasks by applying domain-independent heuristics, computed by regression, and performing best-first search. In contrast to most modern planners, RBHP computes heuristics backwards by applying the goal-directed regression operator. However, the best-first search proceeds forward, as in other planners. The proposed planner is evaluated on a set of planning tasks included in previous International Planning Competitions (IPC) against a subset of the top-scoring state-of-the-art planners submitted to the IPC-2018.

Review of single clustering methods

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v8.i3.pp221-227 ◽

2019 ◽

Vol 8 (3) ◽

pp. 221-227

Author(s):

Nurshazwani Muhamad Mahfuz ◽

Marina Yusoff ◽

Zakiah Ahmad

Keyword(s):

Data Analytics ◽

Review Paper ◽

State Of The Art ◽

Learning Method ◽

Clustering Methods ◽

Optimal Result ◽

Clustering Method ◽

Noise Data ◽

Research Findings ◽

Real World Problems

Clustering plays an important role as an unsupervised learning method in data analytics, assisting with many real-world problems such as image segmentation, object recognition, and information retrieval. Traditional clustering techniques often struggle because the presence of outliers and noisy data leads to non-optimal results. This paper reviews single clustering methods that have been applied in various domains. The aim is to identify potentially suitable applications and aspects of the methods that could be improved. Three categories of single clustering methods are suggested. The review should help researchers understand the clustering aspects and determine the requirements of a clustering method for a given application, based on the state of the art in previous research findings.

Digi-HTA: Health technology assessment framework for digital healthcare services

Finnish Journal of eHealth and eWelfare ◽

10.23996/fjhw.82538 ◽

2019 ◽

Vol 11 (4) ◽

Cited By ~ 3

Author(s):

Jari Haverinen ◽

Niina Keränen ◽

Petra Falkenbach ◽

Anna Maijala ◽

Timo Kolehmainen ◽

...

Keyword(s):

Health Technology Assessment ◽

Technology Assessment ◽

State Of The Art ◽

Health Technology ◽

Healthcare Services ◽

The State ◽

Systematic Evaluation ◽

Health Technologies ◽

Digital Healthcare ◽

And Robotics

Health technology assessment (HTA) refers to the systematic evaluation of the properties, effects, and/or impacts of health technology. The main purpose of the assessment is to inform decisionmakers in order to better support the introduction of new health technologies. New digital healthcare solutions like mHealth, artificial intelligence (AI), and robotics have brought with them a great potential to further develop healthcare services, but their introduction should follow the same criteria as that of other healthcare methods. They must provide evidence-based benefits and be safe to use, and their impacts on patients and organizations need to be clarified. The first objective of this study was to describe the state-of-the-art HTA methods for mHealth, AI, and robotics. The second objective of this study was to evaluate the domains needed in the assessment. The final aim was to develop an HTA framework for digital healthcare services to support the introduction of novel technologies into Finnish healthcare. In this study, the state-of-the-art HTA methods were evaluated using a literature review and interviews. It was noted that some good practices already existed, but the overall picture showed that further development is still needed, especially in the AI and robotics fields. With the cooperation of professionals, key aspects and domains that should be taken into account to make fast but comprehensive assessments were identified. Based on this information, we created a new framework which supports the HTA process for digital healthcare services. The framework was named Digi-HTA.

Transfer Incremental Learning Using Data Augmentation

Applied Sciences ◽

10.3390/app8122512 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2512 ◽

Cited By ~ 2

Author(s):

Ghouthi Boukli Hacene ◽

Vincent Gripon ◽

Nicolas Farrugia ◽

Matthieu Arzel ◽

Michel Jezequel

Keyword(s):

Incremental Learning ◽

Deep Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Low Complexity ◽

Computational Power ◽

Learning Problem ◽

Learning Techniques ◽

Using Data ◽

Selection Of

Deep learning-based methods have reached state-of-the-art performance, relying on a large quantity of available data and computational power. Such methods remain poorly suited to a major open machine learning problem: incrementally learning new classes and examples over time. Combining the outstanding performance of Deep Neural Networks (DNNs) with the flexibility of incremental learning techniques is a promising avenue of research. In this contribution, we introduce Transfer Incremental Learning using Data Augmentation (TILDA). TILDA is based on pre-trained DNNs as feature extractors, robust selection of feature vectors in subspaces using a nearest-class-mean based technique, majority votes and data augmentation at both the training and the prediction stages. Experiments on challenging vision datasets demonstrate the ability of the proposed method to perform low-complexity incremental learning while achieving significantly better accuracy than existing incremental counterparts.
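The nearest-class-mean idea mentioned above lends itself to incremental learning because each class is summarized by a running mean that can be updated one example at a time. The following is a minimal sketch of that single ingredient, assuming feature vectors are already available as plain lists of floats; the class `NearestClassMean` and its method names are hypothetical. It leaves out the pre-trained DNN feature extractor, the subspace splitting, the majority votes, and the data augmentation that TILDA combines with it, so it should not be read as the authors' method.

```python
import math


class NearestClassMean:
    """Incremental nearest-class-mean classifier over fixed feature vectors."""

    def __init__(self):
        self._sums = {}    # label -> per-dimension running sum of features
        self._counts = {}  # label -> number of examples seen for that label

    def partial_fit(self, features, label):
        """Add one example; an unseen label creates a new class on the fly."""
        if label not in self._sums:
            self._sums[label] = [0.0] * len(features)
            self._counts[label] = 0
        self._sums[label] = [s + x for s, x in zip(self._sums[label], features)]
        self._counts[label] += 1

    def predict(self, features):
        """Return the label whose class mean is closest in Euclidean distance."""
        def distance(label):
            n = self._counts[label]
            mean = [s / n for s in self._sums[label]]
            return math.dist(mean, features)
        return min(self._sums, key=distance)


model = NearestClassMean()
model.partial_fit([0.0, 0.0], "cat")
model.partial_fit([0.2, 0.0], "cat")
model.partial_fit([1.0, 1.0], "dog")
first = model.predict([0.1, 0.1])

# A class that arrives later is added without revisiting earlier examples.
model.partial_fit([5.0, 5.0], "bird")
second = model.predict([4.0, 4.5])
```

Storing only sums and counts keeps memory proportional to the number of classes rather than the number of examples, which is one reason this kind of classifier is considered low complexity.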

Partial Classifier Chains with Feature Selection by Exploiting Label Correlation in Multi-Label Classification

Entropy ◽

10.3390/e22101143 ◽

2020 ◽

Vol 22 (10) ◽

pp. 1143

Author(s):

Zhenwu Wang ◽

Tielin Wang ◽

Benting Wan ◽

Mengjie Han

Keyword(s):

Feature Selection ◽

State Of The Art ◽

Predictive Performance ◽

Chain Structure ◽

Classification Performance ◽

Learning Problem ◽

Feature Spaces ◽

Label Correlations ◽

Classifier Chains ◽

Label Correlation

Multi-label classification (MLC) is a supervised learning problem where an object is naturally associated with multiple concepts because it can be described from various dimensions. How to exploit the resulting label correlations is the key issue in MLC problems. The classifier chain (CC) is a well-known MLC approach that can learn complex coupling relationships between labels. CC suffers from two obvious drawbacks: (1) label ordering is decided at random although it usually has a strong effect on predictive performance; (2) all the labels are inserted into the chain, although some of them may carry irrelevant information that discriminates against the others. In this work, we propose a partial classifier chain method with feature selection (PCC-FS) that exploits the label correlation between label and feature spaces and thus solves the two previously mentioned problems simultaneously. In the PCC-FS algorithm, feature selection is performed by learning the covariance between feature set and label set, thus eliminating the irrelevant features that can diminish classification performance. Couplings in the label set are extracted, and the coupled labels of each label are inserted simultaneously into the chain structure to execute the training and prediction activities. The experimental results from five metrics demonstrate that, in comparison to eight state-of-the-art MLC algorithms, the proposed method is a significant improvement on existing multi-label classification.

Multilabel Classification with Principal Label Space Transformation

Neural Computation ◽

10.1162/neco_a_00320 ◽

2012 ◽

Vol 24 (9) ◽

pp. 2508-2542 ◽

Cited By ~ 117

Author(s):

Farbound Tai ◽

Hsuan-Tien Lin

Keyword(s):

Singular Value ◽

Data Sets ◽

Classification Problems ◽

Real World Data ◽

Multilabel Classification ◽

Binary Relevance ◽

Space Transformation ◽

Empirical Performance ◽

Value Decomposition ◽

Novel Algorithm

We consider a hypercube view to perceive the label space of multilabel classification problems geometrically. The view allows us not only to unify many existing multilabel classification approaches but also to design a novel algorithm, principal label space transformation (PLST), that captures key correlations between labels before learning. The simple and efficient PLST relies only on singular value decomposition as the key step. We derive the theoretical guarantee of PLST and evaluate its empirical performance using real-world data sets. Experimental results demonstrate that PLST is faster than the traditional binary relevance approach and is superior to the modern compressive sensing approach in terms of both accuracy and efficiency.
