Pattern-enhanced Contrastive Policy Learning Network for Sequential Recommendation

Sequential recommendation aims to predict users’ future behaviors given their historical interactions. However, due to the randomness and diversity of a user’s behaviors, not all historical items are informative for predicting his/her next choice. Identifying relevant items and extracting meaningful sequential patterns are therefore necessary for better recommendation. Unfortunately, few works have focused on this sequence denoising process. In this paper, we propose a PatteRn-enhanced ContrAstive Policy Learning Network (RAP for short) for sequential recommendation. RAP formalizes the denoising problem as a Markov Decision Process (MDP) and samples an action for each item to determine whether it is relevant to the target item. To tackle the lack of relevance supervision, RAP fuses a series of mined sequential patterns into the policy learning process, which serve as prior knowledge to guide the denoising process. After that, RAP splits the initial item sequence into two disjoint subsequences: a positive subsequence and a negative subsequence. A novel contrastive learning mechanism is then introduced to guide the sequence denoising and, simultaneously, to estimate preferences from the positive subsequence. Extensive experiments on four public real-world datasets demonstrate the effectiveness of our approach for sequential recommendation.
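The splitting step described in this abstract can be illustrated with a minimal sketch. It assumes a per-item keep probability is already available; the name `keep_probs` and the function `split_sequence` are hypothetical stand-ins for the output of the learned policy, which the abstract does not specify.

```python
import random
from typing import List, Sequence, Tuple


def split_sequence(
    items: Sequence[str],
    keep_probs: Sequence[float],
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Sample one binary action per item and split the sequence in two.

    keep_probs is an assumed stand-in for the policy's per-item probability
    of judging an item relevant to the target item.
    """
    positive: List[str] = []  # items sampled as relevant
    negative: List[str] = []  # items sampled as noise
    for item, p in zip(items, keep_probs):
        # rng.random() lies in [0, 1), so p = 1.0 always keeps the item
        # and p = 0.0 always drops it.
        if rng.random() < p:
            positive.append(item)
        else:
            negative.append(item)
    return positive, negative
```

Because every item goes to exactly one of the two lists, the subsequences are disjoint and together cover the original sequence, with item order preserved in each.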

Reinforcement Learning with Non-Markovian Rewards

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5814 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3980-3987

Author(s):

Maor Gaon ◽

Ronen Brafman

Keyword(s):

Reinforcement Learning ◽

Real World ◽

Decision Process ◽

Policy Learning ◽

Learning From Experience ◽

World Model ◽

Basic Premise ◽

Q Learning ◽

Markov Decision ◽

Automata Learning

The standard RL world model is that of a Markov Decision Process (MDP). A basic premise of MDPs is that the rewards depend on the last state and action only. Yet, many real-world rewards are non-Markovian. For example, a reward for bringing coffee only if it was requested earlier and not yet served is non-Markovian if the state only records current requests and deliveries. Past work considered the problem of modeling and solving MDPs with non-Markovian rewards (NMR), but we know of no principled approaches for RL with NMR. Here, we address the problem of policy learning from experience with such rewards. We describe and empirically evaluate four combinations of the classical RL algorithms Q-learning and R-max with automata learning algorithms to obtain new RL algorithms for domains with NMR. We also prove that some of these variants converge to an optimal policy in the limit.
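The coffee example can be made concrete with a small sketch. It assumes a hand-written two-state automaton that remembers whether a request is pending; the event names and the functions `step_automaton` and `rewards_for` are illustrative and are not taken from the paper, which learns such automata rather than fixing them by hand.

```python
from typing import Iterable, List, Tuple

# Hypothetical automaton states:
#   0 = no pending request, 1 = coffee requested and not yet served.


def step_automaton(q: int, event: str) -> Tuple[int, float]:
    """Return (next automaton state, reward) for one observed event."""
    if event == "request":
        return 1, 0.0
    if event == "deliver":
        # A delivery is rewarded only when a request is pending.
        return 0, (1.0 if q == 1 else 0.0)
    return q, 0.0  # any other event leaves the memory unchanged


def rewards_for(events: Iterable[str]) -> List[float]:
    """Replay a sequence of events and collect the reward at each step."""
    q = 0
    rewards: List[float] = []
    for event in events:
        q, r = step_automaton(q, event)
        rewards.append(r)
    return rewards
```

The same "deliver" event yields different rewards depending on the history, so the reward is not a function of the current event alone. Once the automaton state `q` is added to the environment state, the reward depends only on the pair (q, event), which is the form a standard learner such as Q-learning expects.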

COG-DICE: An Algorithm for Solving Continuous-Observation Dec-POMDPs

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/638 ◽

2017 ◽

Author(s):

Madison Clark-Turner ◽

Christopher Amato

Keyword(s):

Markov Decision Process ◽

Real World ◽

Decision Process ◽

Extended Version ◽

Continuous Observation ◽

Solution Methods ◽

Markov Decision ◽

Multi Agent ◽

Partially Observable Markov ◽

Partially Observable

The decentralized partially observable Markov decision process (Dec-POMDP) is a powerful model for representing multi-agent problems with decentralized behavior. Unfortunately, current Dec-POMDP solution methods cannot solve problems with continuous observations, which are common in many real-world domains. To that end, we present a framework for representing and generating Dec-POMDP policies that explicitly include continuous observations. We apply our algorithm to a novel tagging problem and an extended version of a common benchmark, where it generates policies that meet or exceed the values of equivalent discretized domains without the need for finding an adequate discretization.

A Smart Cache Content Update Policy Based on Deep Reinforcement Learning

Wireless Communications and Mobile Computing ◽

10.1155/2020/8836592 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Lincan Li ◽

Chiew Foong Kwong ◽

Qianyu Liu ◽

Jing Wang

Keyword(s):

Decision Process ◽

Training Data ◽

Q Learning ◽

Learning Network ◽

The Neural Network ◽

Markov Decision ◽

Experience Replay ◽

Average Latency ◽

Simulation Results ◽

Cache Hit Ratio

This paper proposes a DRL-based cache content update policy for cache-enabled networks to improve the cache hit ratio and reduce the average latency. In contrast to existing policies, this work considers a more practical cache scenario in which content requests vary by both time and location. Given the constraint of limited cache capacity, the dynamic content update problem is modeled as a Markov decision process (MDP), and the deep Q-learning network (DQN) algorithm is used to solve it. Specifically, the neural network is optimised to approximate the Q value, with the training data drawn from the experience replay memory. The DQN agent derives the optimal policy for the cache decision. Compared with existing policies, the simulation results show that the proposed policy improves the cache hit ratio by 56%–64% and reduces the average latency by 56%–59%.
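The two ingredients named in this abstract, an experience replay memory and a Q-value training target, can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' implementation: the class `ReplayMemory`, the function `td_target`, and the transition layout are hypothetical, and the neural network that would be trained toward the target is omitted.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

# Assumed transition layout: (state, action, reward, next_state, done).
Transition = Tuple[int, int, float, int, bool]


class ReplayMemory:
    """Fixed-capacity buffer; the oldest transitions are evicted first."""

    def __init__(self, capacity: int) -> None:
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(self, transition: Transition) -> None:
        self.buffer.append(transition)

    def sample(self, batch_size: int, rng: random.Random) -> List[Transition]:
        # Draw without replacement, never more than what is stored.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self) -> int:
        return len(self.buffer)


def td_target(
    reward: float,
    next_q_values: Sequence[float],
    gamma: float,
    done: bool,
) -> float:
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward  # no bootstrapping from a terminal state
    return reward + gamma * max(next_q_values)
```

Sampling minibatches from the memory rather than training on consecutive transitions reduces the correlation between training samples, which is the usual motivation for experience replay in DQN-style methods.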

DA-GCN: A Domain-aware Attentive Graph Convolution Network for Shared-account Cross-domain Sequential Recommendation

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/342 ◽

2021 ◽

Author(s):

Lei Guo ◽

Li Tang ◽

Tong Chen ◽

Lei Zhu ◽

Quoc Viet Hung Nguyen ◽

...

Keyword(s):

Real World ◽

Message Passing ◽

Sequential Patterns ◽

Graph Structure ◽

Domain Specific ◽

User Behaviors ◽

Cross Domain ◽

Latent Space ◽

Real World Datasets ◽

Multiple Domains

Shared-account Cross-domain Sequential Recommendation (SCSR) is the task of recommending the next item based on a sequence of recorded user behaviors, where multiple users share a single account, and their behaviours are available in multiple domains. Existing work on solving SCSR mainly relies on mining sequential patterns via RNN-based models, which are not expressive enough to capture the relationships among multiple entities. Moreover, all existing algorithms try to bridge two domains via knowledge transfer in the latent space, and the explicit cross-domain graph structure is unexploited. In this work, we propose a novel graph-based solution, namely DA-GCN, to address the above challenges. Specifically, we first link users and items in each domain as a graph. Then, we devise a domain-aware graph convolution network to learn user-specific node representations. To fully account for users' domain-specific preferences on items, two novel attention mechanisms are further developed to selectively guide the message passing process. Extensive experiments on two real-world datasets are conducted to demonstrate the superiority of our DA-GCN method.

A Markov Decision Process Approach for Cost-Benefit Analysis of Infrastructure Resilience Upgrades

SSRN Electronic Journal ◽

10.2139/ssrn.3657479 ◽

2020 ◽

Author(s):

Qianru Zhu ◽

Benjamin D. Leibowicz

Keyword(s):

Markov Decision Process ◽

Decision Process ◽

Cost Benefit Analysis ◽

Cost Benefit ◽

Process Approach ◽

Benefit Analysis ◽

Markov Decision ◽

Infrastructure Resilience

A Markov Decision Process Workflow for Automating Interior Design

KSCE Journal of Civil Engineering ◽

10.1007/s12205-021-1272-6 ◽

2021 ◽

Author(s):

Ebrahim Karan ◽

Sadegh Asgari ◽

Abbas Rashidi

Keyword(s):

Markov Decision Process ◽

Interior Design ◽

Decision Process ◽

Markov Decision

Time-Efficient Ensemble Learning with Sample Exchange for Edge Computing

ACM Transactions on Internet Technology ◽

10.1145/3409265 ◽

2021 ◽

Vol 21 (3) ◽

pp. 1-17

Author(s):

Wu Chen ◽

Yong Yu ◽

Keke Gai ◽

Jiamou Liu ◽

Kim-Kwang Raymond Choo

Keyword(s):

Ensemble Learning ◽

Real World ◽

Interaction Mechanism ◽

Training Model ◽

Edge Computing ◽

Learning Techniques ◽

Multi Agent ◽

Real World Datasets ◽

Entire Dataset ◽

Exchange Data

In existing ensemble learning algorithms (e.g., random forest), each base learner’s model needs the entire dataset for sampling and training. However, this may not be practical in many real-world applications, and it incurs additional computational costs. To achieve better efficiency, we propose a decentralized framework: Multi-Agent Ensemble. The framework leverages edge computing to facilitate ensemble learning techniques by focusing on the balancing of access restrictions (small sub-dataset) and accuracy enhancement. Specifically, network edge nodes (learners) are utilized to model classifications and predictions in our framework. Data is then distributed to multiple base learners who exchange data via an interaction mechanism to achieve improved prediction. The proposed approach relies on a training model rather than conventional centralized learning. Findings from the experimental evaluations using 20 real-world datasets suggest that Multi-Agent Ensemble outperforms other ensemble approaches in terms of accuracy even though the base learners require fewer samples (i.e., significant reduction in computation costs).

A constraint partially observable semi-Markov decision process for the attack–defence relationships in various critical infrastructures

Cyber-Physical Systems ◽

10.1080/23335777.2021.1879935 ◽

2021 ◽

pp. 1-26

Author(s):

Nadia Niknami ◽

Jie Wu

Keyword(s):

Markov Decision Process ◽

Decision Process ◽

Critical Infrastructures ◽

Markov Decision ◽

Partially Observable

OFCOD: On the Fly Clustering Based Outlier Detection Framework

Data ◽

10.3390/data6010001 ◽

2020 ◽

Vol 6 (1) ◽

pp. 1

Author(s):

Ahmed Elmogy ◽

Hamada Rizk ◽

Amany M. Sarhan

Keyword(s):

Data Mining ◽

Image Processing ◽

Intrusion Detection ◽

Real Time ◽

Outlier Detection ◽

Real World ◽

Medical Data ◽

Experimental Results ◽

Real Time Applications ◽

Real World Datasets

In data mining, outlier detection is a major challenge, as it plays an important role in many applications such as medical data analysis, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively and promptly upon request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms existing approaches on several evaluation metrics.

Overlapping Community Detection Based on Attribute Augmented Graph

Entropy ◽

10.3390/e23060680 ◽

2021 ◽

Vol 23 (6) ◽

pp. 680

Author(s):

Hanyang Lin ◽

Yongzhao Zhan ◽

Zizheng Zhao ◽

Yuzhong Chen ◽

Chen Dong

Keyword(s):

Community Detection ◽

Real World ◽

Detection Algorithm ◽

Overlapping Community Detection ◽

Overlapping Communities ◽

Adjustment Strategy ◽

Topology Information ◽

Overlapping Community ◽

Real World Datasets ◽

Community Detection Algorithm

There is a wealth of information in real-world social networks. In addition to the topology information, the vertices or edges of a social network often have attributes, with many of the overlapping vertices belonging to several communities simultaneously. It is challenging to fully utilize the additional attribute information to detect overlapping communities. In this paper, we first propose an overlapping community detection algorithm based on an augmented attribute graph. An improved weight adjustment strategy for attributes is embedded in the algorithm to help detect overlapping communities more accurately. Second, we enhance the algorithm to automatically determine the number of communities by a node-density-based fuzzy k-medoids process. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed algorithms can effectively detect overlapping communities with fewer parameters compared to the baseline methods.
