Fatigue-Aware Bandits for Dependent Click Models

2020 ◽  
Vol 34 (04) ◽  
pp. 3341-3348
Author(s):  
Junyu Cao ◽  
Wei Sun ◽  
Zuo-Jun (Max) Shen ◽  
Markus Ettl

As recommender systems send a massive amount of content to keep users engaged, users may experience fatigue, caused by 1) overexposure to irrelevant content, and 2) boredom from seeing too many similar recommendations. To address this problem, we consider an online learning setting where a platform learns a policy to recommend content that takes user fatigue into account. We propose an extension of the Dependent Click Model (DCM) to describe users' behavior. We stipulate that for each piece of content, its attractiveness to a user depends on its intrinsic relevance and a discount factor which measures how much similar content has already been shown. Users view the recommended content sequentially and click on the ones that they find attractive. Users may leave the platform at any time, and the probability of exiting is higher when they do not like the content. Based on users' feedback, the platform learns the relevance of the underlying content as well as the discounting effect due to content fatigue. We refer to this learning task as the "fatigue-aware DCM bandit" problem. We consider two learning scenarios depending on whether the discounting effect is known. For each scenario, we propose a learning algorithm which simultaneously explores and exploits, and characterize its regret bound.
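The session dynamics the abstract describes can be sketched as follows. This is a minimal illustrative simulation, not the paper's exact model: the geometric discount `gamma ** num_similar_shown` and the fixed exit probability after an unattractive item are assumptions made for concreteness.

```python
import numpy as np

def attractiveness(relevance, num_similar_shown, gamma=0.8):
    """Attractiveness = intrinsic relevance times a fatigue discount that
    decays with the number of similar items already shown (illustrative
    geometric form; the paper's discount need not be geometric)."""
    return relevance * (gamma ** num_similar_shown)

def simulate_session(items, topics, relevance, gamma=0.8,
                     exit_prob_no_click=0.3, rng=None):
    """One user session under a fatigue-aware DCM: the user scans items in
    order, clicks attractive ones, and may abandon the platform with higher
    probability after an item they did not find attractive."""
    rng = rng or np.random.default_rng(0)
    shown = {}   # topic -> count of similar items already shown
    clicks = []
    for i in items:
        t = topics[i]
        a = attractiveness(relevance[i], shown.get(t, 0), gamma)
        shown[t] = shown.get(t, 0) + 1
        if rng.random() < a:
            clicks.append(i)          # user clicks the attractive item
        elif rng.random() < exit_prob_no_click:
            break                     # user abandons the platform
    return clicks
```

Feedback from such simulated sessions is exactly what a bandit algorithm would use to estimate the relevance vector and, in the unknown-discount scenario, `gamma` itself.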

Author(s):  
Chuang Zhang ◽  
Chen Gong ◽  
Tengfei Liu ◽  
Xun Lu ◽  
Weiqiang Wang ◽  
...  

Positive and Unlabeled learning (PU learning) aims to build a binary classifier when only positive and unlabeled data are available for training. However, existing PU learning methods all operate in batch mode, and thus cannot handle online scenarios with sequentially arriving data. Therefore, this paper proposes a novel positive and unlabeled learning algorithm in an online training mode, which trains a classifier solely on the positive and unlabeled data arriving in sequential order. Specifically, we adopt an unbiased estimate of the loss induced by the arriving positive or unlabeled examples at each time step. We then show that for each newly arriving datum, the model can be updated independently and incrementally by a gradient-based online learning method. Furthermore, we extend our method to handle the case where more than one example is received at each time step. Theoretically, we show that the proposed online PU learning method achieves low regret even though it receives only sequential positive and unlabeled data. Empirically, we conduct extensive experiments on both benchmark and real-world datasets, and the results clearly demonstrate the effectiveness of the proposed method.
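The key ingredient, an unbiased loss estimate from positive and unlabeled examples, can be sketched as below. This follows the standard unbiased PU risk decomposition (du Plessis et al.) rather than the paper's exact estimator, and it assumes the class prior `pi` is known; the logistic loss and the `eta / sqrt(t)` step size are illustrative choices.

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient of the logistic loss log(1 + exp(-y w.x)) w.r.t. w."""
    return -y * x / (1.0 + np.exp(y * w.dot(x)))

def unbiased_pu_grad(w, x, s, pi, loss_grad):
    """Per-example gradient of the unbiased PU risk.
    s = 1 marks a positive example, s = 0 an unlabeled one.
    A positive example contributes pi * [grad l(+1) - grad l(-1)];
    an unlabeled example contributes grad l(-1).  In expectation this
    matches the fully supervised risk gradient."""
    if s == 1:
        return pi * (loss_grad(w, x, +1) - loss_grad(w, x, -1))
    return loss_grad(w, x, -1)

def online_pu_sgd(stream, dim, pi, eta=0.1):
    """Process (x, s) pairs one at a time with SGD on the unbiased loss."""
    w = np.zeros(dim)
    for t, (x, s) in enumerate(stream, 1):
        w -= (eta / np.sqrt(t)) * unbiased_pu_grad(w, x, s, pi, logistic_grad)
    return w
```

Because each update uses only the single arriving example, the classifier is maintained incrementally with O(dim) work per step, which is what makes the online setting tractable.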


Author(s):  
Junyu Cao ◽  
Wei Sun

Motivated by the observation that overexposure to unwanted marketing activities leads to customer dissatisfaction, we consider a setting where a platform offers a sequence of messages to its users and is penalized when users abandon the platform due to marketing fatigue. We propose a novel sequential choice model to capture the multiple interactions taking place between the platform and its user: upon receiving a message, a user decides on one of three actions: accept the message, skip it and receive the next message, or abandon the platform. Based on user feedback, the platform dynamically learns users' abandonment distribution and their valuations of messages to determine the length of the sequence and the order of the messages, while maximizing the cumulative payoff over a horizon of length T. We refer to this online learning task as the sequential choice bandit problem. For the offline combinatorial optimization problem, we give a polynomial-time algorithm. For the online problem, we propose an algorithm that balances exploration and exploitation, and characterize its regret bound. Lastly, we demonstrate how to extend the model with user contexts to incorporate personalization.
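The three-way accept/skip/abandon interaction can be sketched as a simple simulator. The functional forms below are illustrative assumptions, not the paper's exact choice model: acceptance probability is taken to be the message's valuation directly, and abandonment after a skipped message is governed by a per-position quit probability.

```python
import numpy as np

def user_response(valuations, quit_probs, rng=None):
    """Simulate one user traversing a message sequence: at message i the
    user accepts with probability valuations[i]; otherwise they abandon
    with probability quit_probs[i], or skip to the next message.
    Returns the terminal action and the position where it occurred."""
    rng = rng or np.random.default_rng(0)
    for i, (v, q) in enumerate(zip(valuations, quit_probs)):
        if rng.random() < v:
            return ("accept", i)      # user accepts this message
        if rng.random() < q:
            return ("abandon", i)     # marketing fatigue: user leaves
    return ("exhausted", len(valuations))
```

A platform running the bandit would use many such outcomes to estimate the valuations and the abandonment distribution, then optimize the length and ordering of the sequence.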


Algorithms ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 18
Author(s):  
Michael Li ◽  
Santoso Wibowo ◽  
Wei Li ◽  
Lily D. Li

Extreme learning machine (ELM) is a popular randomization-based learning algorithm that provides a fast solution for many regression and classification problems. In this article, we present a method based on ELM for solving the spectral data analysis problem, which is essentially an inverse problem: it requires determining the structural parameters of a physical sample from given spectroscopic curves. We propose approximating the unknown inverse function with an ELM, adding a linear neuron to correct the localization effect introduced by the Gaussian basis functions. Unlike conventional methods involving intensive numerical computation, under this framework the task of spectral data analysis becomes a learning task from data. As spectral data are typically high-dimensional, principal component analysis (PCA) is applied to reduce the dimensionality of the dataset and ensure convergence. The proposed framework is illustrated using a set of simulated Rutherford backscattering spectra. The results show that the proposed method achieves prediction errors of less than 1%, outperforming predictions from multi-layer perceptrons and numerical techniques. The presented method could be implemented as application software for real-time spectral data analysis by integrating it into a spectroscopic data collection system.
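The core idea, Gaussian hidden units plus an added linear neuron, with only the output weights solved by least squares, can be sketched as follows. This is a minimal illustration under assumed hyperparameters (random centres drawn from the training inputs, a single shared bandwidth `sigma`); the paper's exact architecture and the PCA preprocessing step are omitted.

```python
import numpy as np

def elm_fit(X, Y, n_hidden=50, sigma=1.0, rng=None):
    """Fit an ELM with Gaussian (RBF) hidden units plus one linear neuron.
    The hidden layer is random and fixed; only the output weights beta
    are learned, via a single least-squares solve."""
    rng = rng or np.random.default_rng(0)
    centres = X[rng.choice(len(X), n_hidden, replace=True)]

    def hidden(Xq):
        d2 = ((Xq[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        H = np.exp(-d2 / (2 * sigma ** 2))       # Gaussian basis responses
        lin = Xq.sum(axis=1, keepdims=True)      # extra linear neuron
        return np.hstack([H, lin, np.ones((len(Xq), 1))])

    H = hidden(X)
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)  # least-squares output weights
    return lambda Xq: hidden(Xq) @ beta
```

The linear neuron matters because purely Gaussian units are local: far from every centre their responses vanish, so a global linear trend in the inverse function would otherwise be poorly captured.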


Author(s):  
Weilin Nie ◽  
Cheng Wang

Online learning is a classical algorithm for optimization problems. Due to its low computational cost, it has been widely used in many areas of machine learning and statistical learning. Its convergence performance depends heavily on the step size. In this paper, a two-stage step size is proposed for the unregularized online learning algorithm based on reproducing kernels. Theoretically, we prove that such an algorithm can achieve a nearly minimax convergence rate, up to a logarithmic term, without any capacity condition.
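The setting can be sketched as below: unregularized online learning in a reproducing kernel Hilbert space with a step size that switches regime partway through the horizon. The split point, decay exponent, and Gaussian kernel are illustrative assumptions, not the schedule proved optimal in the paper.

```python
import numpy as np

def two_stage_step(t, T, eta0=0.5, theta=0.6):
    """Two-stage step size: constant in the first stage, then
    polynomially decaying (split point and exponent are illustrative)."""
    t0 = max(1, T // 2)
    return eta0 if t <= t0 else eta0 * (t0 / t) ** theta

def online_kernel_regression(X, Y, T, sigma=0.5):
    """Unregularized online kernel learning:
        f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .)
    The iterate is stored as kernel coefficients at the seen points."""
    coefs, centres = [], []
    k = lambda a, b: np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

    def f(x):
        return sum(c * k(z, x) for c, z in zip(coefs, centres))

    for t in range(1, T + 1):
        x, y = X[(t - 1) % len(X)], Y[(t - 1) % len(X)]
        eta = two_stage_step(t, T)
        coefs.append(-eta * (f(x) - y))   # gradient step on the squared loss
        centres.append(x)
    return f
```

The intuition behind the two stages: a large constant step drives the error down quickly early on, while the decaying step in the second stage suppresses the variance of the updates, which is what yields the near-minimax rate.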


2017 ◽  
Vol 10 (13) ◽  
pp. 284
Author(s):  
Ankush Rai ◽  
Jagadeesh Kannan R

In the past decade, the development of machine learning algorithms for network settings has witnessed few advances, owing to the slow development of technologies for improving bandwidth and latency. In this study, we present a novel online learning algorithm for network-based computational operations in an image processing setting.

