regret bound
Recently Published Documents

TOTAL DOCUMENTS: 31 (five years: 26)
H-INDEX: 2 (five years: 1)

Entropy ◽  
2021 ◽  
Vol 23 (10) ◽  
pp. 1257
Author(s):  
Dimitri Meunier ◽  
Pierre Alquier

Online learning methods, such as the online gradient algorithm (OGA) and exponentially weighted aggregation (EWA), often depend on tuning parameters that are difficult to set in practice. We consider an online meta-learning scenario and propose a meta-strategy to learn these parameters from past tasks. Our strategy is based on the minimization of a regret bound. It allows us to learn the initialization and the step size in OGA with guarantees, and likewise the prior or the learning rate in EWA. We provide a regret analysis of the strategy, which identifies settings where meta-learning indeed improves on learning each task in isolation.
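As a rough illustration of the update whose initialization and step size the meta-strategy tunes, here is a minimal OGA loop in Python; the function names, the quadratic toy losses, and the parameter values are our illustrative assumptions, not the paper's:

```python
import numpy as np

def oga(grad_fns, w0, eta):
    """Online gradient algorithm (OGA): predict w_t, observe the
    round's loss gradient, take a gradient step. The initialization
    w0 and step size eta are the tuning parameters a meta-strategy
    would learn across past tasks."""
    w = np.asarray(w0, dtype=float).copy()
    iterates = []
    for grad in grad_fns:            # one gradient oracle per round
        iterates.append(w.copy())
        w = w - eta * grad(w)        # OGD update
    return iterates

# Toy task: quadratic losses (w - c_t)^2 with targets near 1.0,
# so a good meta-learned initialization would sit close to 1.0.
rng = np.random.default_rng(0)
targets = 1.0 + 0.1 * rng.standard_normal(20)
grads = [lambda w, c=c: 2.0 * (w - c) for c in targets]
iterates = oga(grads, w0=np.array([0.0]), eta=0.1)
```

With a step size of 0.1, the iterate contracts toward the (noisy) targets geometrically, which is the behavior a learned initialization would shortcut.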


Author(s):  
Yanchen Deng ◽  
Runsheng Yu ◽  
Xinrun Wang ◽  
Bo An

Distributed constraint optimization problems (DCOPs) are a powerful model for multi-agent coordination and optimization, where information and controls are by nature distributed among multiple agents. Sampling-based algorithms are important incomplete techniques for solving medium-scale DCOPs. However, they use tables to exactly store all the information (e.g., costs, confidence bounds) needed to facilitate sampling, which limits their scalability. This paper tackles that limitation by incorporating deep neural networks into DCOP solving for the first time and presents a neural-based sampling scheme built upon regret matching. In the algorithm, each agent trains a neural network to approximate the regret related to its local problem and performs sampling according to the estimated regret. Furthermore, to ensure exploration, we propose a regret rounding scheme that rounds small regret values up to positive numbers. We theoretically establish the regret bound of our algorithm, and extensive evaluations indicate that it scales up to large-scale DCOPs and significantly outperforms state-of-the-art methods.
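A minimal sketch of the regret-rounding idea described above; the helper name, the eps threshold, and the proportional sampling rule are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def regret_matching_probs(regrets, eps=1e-3):
    """Turn (estimated) regrets into a sampling distribution.
    Regret rounding: values below eps are rounded up to eps so every
    action keeps a strictly positive probability, which guarantees
    exploration even for actions with zero or negative regret."""
    r = np.maximum(np.asarray(regrets, dtype=float), eps)
    return r / r.sum()

probs = regret_matching_probs([0.5, 0.0, -0.2, 1.5])
```

Actions with larger estimated regret are sampled more often, while rounded-up actions retain a small but nonzero chance of being tried.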


Author(s):  
Xi Chen ◽  
Yining Wang ◽  
Yuan Zhou

We study the dynamic assortment planning problem, where for each arriving customer the seller offers an assortment of substitutable products, and the customer makes a purchase among the offered products according to an uncapacitated multinomial logit (MNL) model. Because the utility parameters of the MNL model are unknown, the seller needs to simultaneously learn customers' choice behavior and make dynamic assortment decisions based on the current knowledge. The goal of the seller is to maximize the expected revenue or, equivalently, to minimize the expected regret. Although the dynamic assortment planning problem has received increasing attention in revenue management, most existing policies require estimating the mean utility of each product, and the final regret usually involves the number of products [Formula: see text]. The optimal regret of the dynamic assortment planning problem under the most basic and popular choice model, the MNL model, is still open. By carefully analyzing a revenue potential function, we develop a trisection-based policy combined with adaptive confidence-bound construction, which achieves an item-independent regret bound of [Formula: see text], where [Formula: see text] is the length of the selling horizon. We further establish a matching lower bound showing the optimality of our policy. The proposed policy has two major advantages. First, the regret of all our policies has no dependence on [Formula: see text]. Second, our policies are almost assumption-free: there is no assumption on the mean utilities, nor any “separability” condition on the expected revenues of different assortments. We also extend our trisection search algorithm to capacitated MNL models and obtain the optimal regret [Formula: see text] (up to logarithmic factors) without any assumption on the mean utility parameters of the items.
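For concreteness, the uncapacitated MNL choice probabilities and the resulting expected revenue of an assortment can be sketched as follows; the function name and the toy utilities/revenues are ours, not the paper's:

```python
import numpy as np

def mnl_expected_revenue(utilities, revenues):
    """Uncapacitated MNL: a customer buys product i from the offered
    assortment with probability exp(v_i) / (1 + sum_j exp(v_j)); the
    '1' in the denominator is the no-purchase option."""
    ev = np.exp(np.asarray(utilities, dtype=float))
    probs = ev / (1.0 + ev.sum())
    return float(np.dot(probs, revenues))

# Two products with mean utilities 0.0 and 0.5, revenues 1.0 and 0.8.
rev = mnl_expected_revenue(utilities=[0.0, 0.5], revenues=[1.0, 0.8])
```

The seller's problem is to pick the assortment maximizing this quantity while the utilities themselves must be learned from observed purchases.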


2021 ◽  
Author(s):  
Songnam Hong ◽  
Jeongmin Chae

Random feature-based online multi-kernel learning (RF-OMKL) is a promising framework for functional learning tasks. It is well suited to online learning with continuous streaming data due to its low complexity and scalability. Within the RF-OMKL framework, numerous algorithms can be derived from the underlying online learning and optimization techniques. The best-known algorithm (termed Raker) was proposed through the lens of the celebrated framework of online learning with expert advice, where each kernel in a kernel dictionary is viewed as an expert. Harnessing this relation, it was proved that Raker yields a sublinear "expert" regret bound, in which, as the name implies, the benchmark function is restricted to the expert-based framework; that is, it is not an actual sublinear regret bound under the RF-OMKL framework. In this paper, we propose a novel algorithm (named BestOMKL) for the RF-OMKL framework and prove that it achieves a sublinear regret bound under a certain condition. Beyond our theoretical contribution, we demonstrate the superiority of our algorithm via numerical tests with real datasets. Notably, BestOMKL outperforms state-of-the-art kernel-based algorithms (including Raker) on various online learning tasks while having a complexity as low as Raker's. These results suggest the practicality of BestOMKL.
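The "RF" in RF-OMKL refers to the random-feature (random Fourier feature) approximation, which replaces kernel evaluations with fixed-dimensional features so per-round cost does not grow with the stream. A minimal sketch, assuming a Gaussian kernel and illustrative dimensions:

```python
import numpy as np

def rff_features(X, n_features, bandwidth, seed=0):
    """Random Fourier features approximating a Gaussian (RBF) kernel:
    k(x, y) ~= z(x) . z(y). The feature map has fixed dimension
    n_features, so an online learner can work with plain linear
    models on z(x) instead of a growing kernel expansion."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, n_features)) / bandwidth  # spectral samples
    b = rng.uniform(0.0, 2 * np.pi, n_features)           # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).standard_normal((5, 3))
Z = rff_features(X, n_features=200, bandwidth=1.0)
```

Each kernel in the dictionary gets its own feature map; an expert-advice scheme such as Raker then weights the per-kernel predictions online.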


2021 ◽  
Author(s):  
Songnam Hong ◽  
Jeongmin Chae

Online federated learning (OFL) is a promising framework for learning a sequence of global functions from distributed sequential data at local devices. In this framework, we first introduce a single kernel-based OFL (termed S-KOFL) by properly combining the random-feature (RF) approximation, online gradient descent (OGD), and federated averaging (FedAvg). However, it is nontrivial to develop a communication-efficient method with multiple kernels. One can construct a multi-kernel method (termed vM-KOFL) by following the extension principle of the centralized counterpart, but this vanilla method is not practical, as its communication overhead grows linearly with the size of the kernel dictionary. Moreover, this problem is not addressed by existing communication-efficient techniques in federated learning, such as quantization or sparsification. Our major contribution is a novel randomized algorithm (named eM-KOFL) that enjoys the advantage of multiple kernels while keeping a communication overhead similar to that of S-KOFL. We theoretically prove that eM-KOFL yields the same asymptotic performance as vM-KOFL, i.e., both methods achieve an optimal sublinear regret bound. By mimicking the key principle of eM-KOFL efficiently, we then present pM-KOFL. Via numerical tests with real datasets, we demonstrate that pM-KOFL yields the same performance as vM-KOFL and eM-KOFL on various online learning tasks while having the same communication overhead as S-KOFL. These results suggest the practicality of the proposed pM-KOFL.
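The FedAvg ingredient that S-KOFL builds on can be sketched in a few lines: the server replaces the global model with a mean of the clients' local parameter vectors (here an unweighted mean for simplicity; the actual algorithm combines this with RF features and OGD):

```python
import numpy as np

def fedavg(local_weights):
    """Federated averaging: aggregate the clients' locally updated
    parameter vectors into a new global model by simple averaging.
    Only these vectors travel to the server, not the raw data."""
    return np.mean(np.stack(local_weights), axis=0)

# Two clients report local RF-model weights after their OGD steps.
w_global = fedavg([np.array([1.0, 2.0]), np.array([3.0, 4.0])])
```

With multiple kernels, a vanilla scheme would transmit one such vector per dictionary entry, which is exactly the linear communication blow-up the abstract describes.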


Author(s):  
Weijia Shao ◽  
Lukas Friedemann Radke ◽  
Fikret Sivrikaya ◽  
Sahin Albayrak

We study the problem of predicting time series data using the autoregressive integrated moving average (ARIMA) model in an online manner. Existing algorithms require model selection, which is time-consuming and ill-suited to the online learning setting. Using adaptive online learning techniques, we develop algorithms for fitting ARIMA models with the fewest possible hyperparameters. We analyse the regret bounds of the proposed algorithms and examine their performance in experiments on both synthetic and real-world datasets.
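A common way online-learning methods sidestep offline ARIMA model selection is to fit an autoregressive predictor online with gradient descent on the squared one-step error. A minimal sketch; the AR(p) reduction, names, and constants are our illustrative assumptions, not necessarily the authors' exact algorithm:

```python
import numpy as np

def online_ar(series, p, eta):
    """Fit an AR(p) predictor online: at each step, predict the next
    value from the last p observations, then take a gradient step on
    the squared one-step-ahead error. Only p and eta must be chosen."""
    w = np.zeros(p)
    preds = []
    for t in range(p, len(series)):
        x = series[t - p:t][::-1]                  # last p observations
        yhat = float(w @ x)
        preds.append(yhat)
        w -= eta * 2.0 * (yhat - series[t]) * x    # OGD step
    return np.array(preds), w

rng = np.random.default_rng(2)
y = np.sin(0.1 * np.arange(300)) + 0.05 * rng.standard_normal(300)
preds, w = online_ar(y, p=5, eta=0.01)
```

On this noisy sinusoid the later predictions are far more accurate than the early ones, which is the behavior a sublinear regret bound formalizes.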


Author(s):  
Thibaut Cuvelier ◽  
Richard Combes ◽  
Eric Gourdin

We consider combinatorial semi-bandits over a set of arms X ⊆ {0,1}^d where rewards are uncorrelated across items. For this problem, the algorithm ESCB yields the smallest known regret bound R(T) = O(d (ln m)² (ln T) / Δ_min) after T rounds, where m = max_{x ∈ X} 1ᵀx. However, ESCB has computational complexity O(|X|), which is typically exponential in d, so it cannot be used in large dimensions. We propose the first algorithm that is both computationally and statistically efficient for this problem, with regret R(T) = O(d (ln m)² (ln T) / Δ_min) and asymptotic computational complexity O(δ_T⁻¹ poly(d)), where δ_T is a function that vanishes arbitrarily slowly. Our approach involves carefully designing AESCB, an approximate version of ESCB with the same regret guarantees. We show that, whenever budgeted linear maximization over X can be solved up to a given approximation ratio, AESCB is implementable in polynomial time O(δ_T⁻¹ poly(d)) by repeatedly maximizing a linear function over X subject to a linear budget constraint, and we show how to solve these maximization problems efficiently.
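An ESCB-style optimistic index has the flavor sketched below: empirical reward plus a confidence bonus that couples the chosen items' observation counts, which is why exact maximization over X is expensive. The bonus form and constants here are simplified assumptions for illustration, not the paper's exact index:

```python
import numpy as np

def escb_index(x, theta_hat, counts, t):
    """Schematic optimistic index for a combinatorial arm x in {0,1}^d:
    empirical reward theta_hat . x plus a joint confidence bonus over
    the selected items. The true ESCB bonus uses a more careful
    exploration function of t and m; this is a simplified sketch."""
    x = np.asarray(x, dtype=float)
    bonus = np.sqrt(np.log(t + 1) / 2.0 * np.sum(x / np.maximum(counts, 1)))
    return float(theta_hat @ x + bonus)

# Arm selecting items 0 and 2, with per-item mean estimates and counts.
idx = escb_index([1, 0, 1], np.array([0.2, 0.5, 0.1]),
                 counts=np.array([3, 1, 4]), t=10)
```

Because the bonus is a square root over a sum tied to x, the index is not linear in x, so it cannot be maximized by an off-the-shelf linear optimizer; AESCB's budgeted linear maximizations work around exactly this.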

