sDeepFM: Multi-Scale Stacking Feature Interactions for Click-Through Rate Prediction

Baohua Qiang; Yongquan Lu; Minghao Yang; Xianjun Chen; Jinlong Chen; Yawei Cao

doi:10.3390/electronics9020350

Electronics ◽

10.3390/electronics9020350 ◽

2020 ◽

Vol 9 (2) ◽

pp. 350

Author(s):

Baohua Qiang ◽

Yongquan Lu ◽

Minghao Yang ◽

Xianjun Chen ◽

Jinlong Chen ◽

...

Keyword(s):

State Of The Art ◽

Receptive Fields ◽

Area Under The Curve ◽

Sparse Data ◽

High Order ◽

Multi Scale ◽

Feature Interactions ◽

Novel Structure ◽

Real World Datasets ◽

Click Through Rate

Existing approaches to estimating the click-through rate of advertisements have several problems: features cannot be constructed automatically, the constructed features are relatively simple, or high-order combination features are difficult to learn from sparse data. To solve these problems, we propose a novel structure, multi-scale stacking pooling (MSSP), which constructs multi-scale features based on different receptive fields. The structure stacks multi-scale features bi-directionally, in depth and in width, by constructing multiple observers with different angles and different fields of view, which ensures the diversity of the extracted features. Furthermore, by learning the parameters through factorization, the structure ensures that high-order features are learned effectively from sparse data. We further combine MSSP with a classical deep neural network (DNN) to form a unified model named sDeepFM. Experimental results on two real-world datasets show that sDeepFM outperforms state-of-the-art models with respect to area under the curve (AUC) and log loss.
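The abstract above reports results in terms of AUC and log loss. As a reminder of what these two metrics measure, here is a minimal sketch in plain Python. The function names and the toy labels and scores are illustrative assumptions, not taken from the paper, and the pairwise AUC shown here is the simple quadratic-time definition rather than an optimized implementation.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up clicks and predicted click probabilities).
labels = [1, 0, 1, 0]
scores = [0.9, 0.1, 0.4, 0.6]
print(auc(labels, scores))       # 3 of the 4 positive/negative pairs are ordered correctly
print(log_loss(labels, scores))
```

A higher AUC and a lower log loss are better: AUC depends only on how clicks are ranked against non-clicks, while log loss also penalizes poorly calibrated probabilities.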

An Attention-Based Latent Information Extraction Network (ALIEN) for High-Order Feature Interactions

Applied Sciences ◽

10.3390/app10165468 ◽

2020 ◽

Vol 10 (16) ◽

pp. 5468

Author(s):

Ruo Huang ◽

Shelby McIntyre ◽

Meina Song ◽

Haihong E ◽

Zhonghong Ou

Keyword(s):

Information Extraction ◽

Recommender Systems ◽

Vital Role ◽

High Order ◽

Sequential Patterns ◽

Sequence Information ◽

Recommendation Algorithm ◽

Feature Interactions ◽

Real World Datasets ◽

Click Through Rate

One of the primary tasks for commercial recommender systems is to predict the probabilities of users clicking items, e.g., advertisements, music and products. This is because such predictions have a decisive impact on profitability. The classic recommendation algorithm, collaborative filtering (CF), still plays a vital role in many industrial recommender systems. However, although straight CF is good at capturing similar users’ preferences for items based on their past interactions, it falls short in (1) modeling the influence of users’ sequential patterns from their individual interaction histories and (2) capturing the relevance of users’ and items’ attributes. In this work, we developed an attention-based latent information extraction network (ALIEN) for click-through rate prediction, which integrates (1) implicit user similarity in terms of click patterns (analogous to CF), (2) low- and high-order feature interactions, and (3) historical sequence information. The new model is based on deep learning, which goes beyond the capabilities of econometric approaches, such as matrix factorization (MF) and k-means. In addition, the approach provides explainability to the recommendation by interpreting the contributions of different features and historical interactions. We have conducted experiments on real-world datasets that demonstrate considerable improvements over strong baselines.

Embedding-Based Complex Feature Value Coupling Learning for Detecting Outliers in Non-IID Categorical Data

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015541 ◽

2019 ◽

Vol 33 ◽

pp. 5541-5548 ◽

Cited By ~ 2

Author(s):

Hongzuo Xu ◽

Yongjun Wang ◽

Zhiyue Wu ◽

Yijie Wang

Keyword(s):

Outlier Detection ◽

Categorical Data ◽

State Of The Art ◽

High Order ◽

Detection Methods ◽

Order Complex ◽

Value Network ◽

Learning Framework ◽

A Value ◽

Real World Datasets

Non-IID categorical data is ubiquitous in real-world applications. Learning various kinds of couplings has been proved to be a reliable measure when detecting outliers in such non-IID data. However, it is a critical yet challenging problem to model, represent, and utilise high-order complex value couplings. Existing outlier detection methods normally only focus on pairwise primary value couplings and fail to uncover real relations that hide in complex couplings, resulting in suboptimal and unstable performance. This paper introduces a novel unsupervised embedding-based complex value coupling learning framework EMAC and its instance SCAN to address these issues. SCAN first models primary value couplings. Then, coupling bias is defined to capture complex value couplings with different granularities and highlight the essence of outliers. An embedding method is performed on the value network constructed via biased value couplings, which further learns high-order complex value couplings and embeds these couplings into a value representation matrix. Bidirectional selective value coupling learning is proposed to show how to estimate value and object outlierness through value couplings. Substantial experiments show that SCAN (i) significantly outperforms five state-of-the-art outlier detection methods on thirteen real-world datasets; and (ii) has much better resilience to noise than its competitors.

Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/435 ◽

2017 ◽

Cited By ~ 141

Author(s):

Jun Xiao ◽

Hao Ye ◽

Xiangnan He ◽

Hanwang Zhang ◽

Fei Wu ◽

...

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Feature Interaction ◽

Model Parameters ◽

Learning Approach ◽

Attention Networks ◽

Feature Interactions ◽

Factorization Machine ◽

Real World Datasets ◽

Novel Model

Factorization Machines (FMs) are a supervised learning approach that enhances the linear regression model by incorporating second-order feature interactions. Despite its effectiveness, FM can be hindered by its modelling of all feature interactions with the same weight, as not all feature interactions are equally useful and predictive. For example, the interactions with useless features may even introduce noise and degrade the performance. In this work, we improve FM by discriminating the importance of different feature interactions. We propose a novel model named Attentional Factorization Machine (AFM), which learns the importance of each feature interaction from data via a neural attention network. Extensive experiments on two real-world datasets demonstrate the effectiveness of AFM. Empirically, on the regression task AFM outperforms FM with an 8.6% relative improvement, and it consistently outperforms the state-of-the-art deep learning methods Wide&Deep [Cheng et al., 2016] and DeepCross [Shan et al., 2016] with a much simpler structure and fewer model parameters. Our implementation of AFM is publicly available at: https://github.com/hexiangnan/attentional_factorization_machine
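To make the contrast in the abstract above concrete, the following sketch computes the standard FM second-order term, in which every pair of features contributes with the same weight, and a simplified attention-weighted variant. This is only an illustration under stated assumptions: the names `x`, `V` and `pair_scores` are hypothetical, the attention scores are passed in by hand rather than produced by a learned attention network, and the weights are applied to scalar inner products, which is a simplification and not the AFM architecture itself.

```python
import math
from itertools import combinations


def fm_pairwise(x, V):
    """Naive FM second-order term: sum over i<j of <V[i], V[j]> * x[i] * x[j]."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, V):
    """Same value via the identity 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    total = 0.0
    for f in range(len(V[0])):
        terms = [V[i][f] * x[i] for i in range(len(x))]
        s = sum(terms)
        total += s * s - sum(t * t for t in terms)
    return 0.5 * total


def attention_weighted_pairwise(x, V, pair_scores):
    """Simplified attention pooling: softmax over hand-supplied pair scores (assumption)."""
    pairs = list(combinations(range(len(x)), 2))
    m = max(pair_scores)
    exps = [math.exp(s - m) for s in pair_scores]
    z = sum(exps)
    total = 0.0
    for (i, j), e in zip(pairs, exps):
        dot = sum(a * b for a, b in zip(V[i], V[j]))
        total += (e / z) * dot * x[i] * x[j]
    return total


# Toy input: three active features, each with a 2-dimensional embedding.
x = [1.0, 1.0, 1.0]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(fm_pairwise(x, V), fm_pairwise_fast(x, V))            # both equal 2.0
print(attention_weighted_pairwise(x, V, [0.0, 0.0, 5.0]))   # weight concentrates on pair (1, 2)
```

With equal scores every pair receives the same weight, which is the uniform weighting the abstract identifies as a limitation of plain FM (up to a constant factor); unequal scores let some interactions dominate.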

Multi-scale Information Diffusion Prediction with Reinforced Recurrent Networks

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/560 ◽

2019 ◽

Cited By ~ 7

Author(s):

Cheng Yang ◽

Jian Tang ◽

Maosong Sun ◽

Ganqu Cui ◽

Zhiyuan Liu

Keyword(s):

Information Diffusion ◽

State Of The Art ◽

Sequential Data ◽

Recurrent Networks ◽

Multi Scale ◽

Structural Context ◽

Learning Techniques ◽

Proposed Model ◽

Real World Datasets ◽

Diffusion Prediction

Information diffusion prediction is an important task which studies how information items spread among users. With the success of deep learning techniques, recurrent neural networks (RNNs) have shown their powerful capability in modeling information diffusion as sequential data. However, previous works focused on either microscopic diffusion prediction which aims at guessing the next influenced user or macroscopic diffusion prediction which estimates the total numbers of influenced users during the diffusion process. To the best of our knowledge, no previous works have suggested a unified model for both microscopic and macroscopic scales. In this paper, we propose a novel multi-scale diffusion prediction model based on reinforcement learning (RL). RL incorporates the macroscopic diffusion size information into the RNN-based microscopic diffusion model by addressing the non-differentiable problem. We also employ an effective structural context extraction strategy to utilize the underlying social graph information. Experimental results show that our proposed model outperforms state-of-the-art baseline models on both microscopic and macroscopic diffusion predictions on three real-world datasets.

Multi-Scale Contrastive Siamese Networks for Self-Supervised Graph Representation Learning

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/204 ◽

2021 ◽

Author(s):

Ming Jin ◽

Yizhen Zheng ◽

Yuan-Fang Li ◽

Chen Gong ◽

Chuan Zhou ◽

...

Keyword(s):

State Of The Art ◽

Representation Learning ◽

Vital Role ◽

Graph Representation ◽

Input Graph ◽

Global Perspectives ◽

Multi Scale ◽

Recent Success ◽

Real World Datasets ◽

Siamese Networks

Graph representation learning plays a vital role in processing graph-structured data. However, prior work on graph representation learning relies heavily on labeling information. To overcome this problem, inspired by the recent success of graph contrastive learning and Siamese networks in visual representation learning, we propose a novel self-supervised approach in this paper to learn node representations by enhancing Siamese self-distillation with multi-scale contrastive learning. Specifically, we first generate two augmented views from the input graph based on local and global perspectives. Then, we employ two objectives called cross-view and cross-network contrastiveness to maximize the agreement between node representations across different views and networks. To demonstrate the effectiveness of our approach, we perform empirical experiments on five real-world datasets. Our method not only achieves new state-of-the-art results but also surpasses some semi-supervised counterparts by large margins. Code is made available at https://github.com/GRAND-Lab/MERIT

Deep context interaction network based on attention mechanism for click-through rate prediction

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210830 ◽

2021 ◽

pp. 1-16

Author(s):

Ling Yuan ◽

Zhuwen Pan ◽

Ping Sun ◽

Yinzhen Wei ◽

Haiping Yu

Keyword(s):

Online Advertising ◽

Dimensional Space ◽

Interaction Network ◽

High Order ◽

Attention Mechanism ◽

Feature Interaction ◽

Feature Interactions ◽

Low Dimensional ◽

Click Through Rate ◽

The Relationship

Click-through rate (CTR) prediction, which aims to predict the probability of a user clicking on an ad, is a critical task in online advertising systems. The problem is very challenging since (1) an effective prediction relies on high-order combinatorial features, and (2) the relationship to auxiliary ads may affect the CTR. In this paper, we propose Deep Context Interaction Network on Attention Mechanism (DCIN-Attention) to process feature interaction and context at the same time. The context includes other ads in the current search page and the user's historically clicked and unclicked ads. Specifically, we use the attention mechanism to learn the interactions between the target ad and each type of auxiliary ad. The residual network is used to model the feature interactions in the low-dimensional space, and with the multi-head self-attention neural network, high-order feature interactions can be modeled. Experimental results on the Avito dataset show that DCIN outperforms several existing methods for CTR prediction.
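The abstract above says attention is used to relate the target ad to each type of auxiliary ad but does not give the scoring function. The sketch below therefore assumes generic scaled dot-product attention purely for illustration: the target ad embedding acts as the query, and the auxiliary ad embeddings act as keys and values. All names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights and the weighted sum of the value vectors.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    pooled = [sum(w * v[t] for w, v in zip(weights, values)) for t in range(len(values[0]))]
    return weights, pooled


# Toy example: one target ad and three auxiliary ads with 2-dimensional embeddings.
target_ad = [1.0, 0.0]
auxiliary_ads = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, context = attend(target_ad, auxiliary_ads, auxiliary_ads)
print(weights)   # the auxiliary ad most similar to the target gets the largest weight
print(context)
```

The pooled vector summarizes one type of auxiliary ad relative to the target ad; a model of the kind described would presumably compute one such summary per context type before combining them.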

CFM: Convolutional Factorization Machines for Context-Aware Recommendation

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/545 ◽

2019 ◽

Cited By ~ 11

Author(s):

Xin Xin ◽

Bo Chen ◽

Xiangnan He ◽

Dong Wang ◽

Yue Ding ◽

...

Keyword(s):

Second Order ◽

High Order ◽

Inner Product ◽

Context Aware ◽

Effective Solution ◽

High Order Interaction ◽

Outer Product ◽

Feature Interactions ◽

Factorization Machine ◽

Real World Datasets

Factorization Machine (FM) is an effective solution for context-aware recommender systems (CARS) which models second-order feature interactions by inner product. However, it is insufficient to capture high-order and nonlinear interaction signals. While several recent efforts have enhanced FM with neural networks, they assume the embedding dimensions are independent of each other and model high-order interactions in a rather implicit manner. In this paper, we propose Convolutional Factorization Machine (CFM) to address the above limitations. Specifically, CFM models second-order interactions with outer product, resulting in "images" which capture correlations between embedding dimensions. Then all generated "images" are stacked, forming an interaction cube. 3D convolution is applied on top of it to learn high-order interaction signals in an explicit approach. Besides, we also leverage a self-attention mechanism to perform the pooling of features to reduce time complexity. We conduct extensive experiments on three real-world datasets, demonstrating significant improvement of CFM over competing methods for context-aware top-k recommendation.
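The difference between the inner product used by FM and the outer product described above can be shown with a few lines of plain Python. This is a minimal sketch with hypothetical names and toy embeddings. It covers only the construction of the stacked interaction maps, not the 3D convolution or the self-attention pooling mentioned in the abstract.

```python
from itertools import combinations


def outer(u, v):
    """Outer product of two embedding vectors: a k-by-k interaction map."""
    return [[ui * vj for vj in v] for ui in u]


def interaction_cube(embeddings):
    """Stack the outer products of all embedding pairs (i < j) into a list of maps."""
    return [outer(embeddings[i], embeddings[j])
            for i, j in combinations(range(len(embeddings)), 2)]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


# Toy example: three field embeddings of dimension 2.
embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
cube = interaction_cube(embeddings)
print(len(cube), len(cube[0]), len(cube[0][0]))  # number of pairs, then k, then k
print(cube[0])         # map for the first pair of embeddings
print(trace(cube[0]))  # the diagonal sums to the inner product of that pair
```

The diagonal of each map recovers the inner product that FM uses, while the off-diagonal entries hold the products between different embedding dimensions, which is the extra correlation information the abstract refers to.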

Multi-Scale Shape Adaptive Network for Raindrop Detection and Removal from a Single Image

Sensors ◽

10.3390/s20236733 ◽

2020 ◽

Vol 20 (23) ◽

pp. 6733

Author(s):

Hao Luo ◽

Qingbo Wu ◽

King Ngi Ngan ◽

Hanxiao Luo ◽

Haoran Wei ◽

...

Keyword(s):

Real World ◽

Large Scale ◽

State Of The Art ◽

Limited Capacity ◽

Single Image ◽

Adaptive Network ◽

Multi Scale ◽

Shape Invariant ◽

Large Scale Dataset ◽

Real World Datasets

Removing raindrops from a single image is a challenging problem due to the complex changes in shape, scale, and transparency among raindrops. Previous explorations have mainly been limited in two ways. First, publicly available raindrop image datasets have limited capacity in terms of modeling raindrop characteristics (e.g., raindrop collision and fusion) in real-world scenes. Second, recent deraining methods tend to apply shape-invariant filters to cope with diverse rainy images and fail to remove raindrops that are especially varied in shape and scale. In this paper, we address these raindrop removal problems from two perspectives. First, we establish a large-scale dataset named RaindropCityscapes, which includes 11,583 pairs of raindrop and raindrop-free images, covering a wide variety of raindrops and background scenarios. Second, a two-branch Multi-scale Shape Adaptive Network (MSANet) is proposed to detect and remove diverse raindrops, effectively filtering the occluded raindrop regions and keeping the clean background well-preserved. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed method achieves significant improvements over the recent state-of-the-art raindrop removal methods. Moreover, the extension of our method towards the rainy image segmentation and detection tasks validates the practicality of the proposed method in outdoor applications.

A Novel Multi-Scale Attention PFE-UNet for Forest Image Segmentation

Forests ◽

10.3390/f12070937 ◽

2021 ◽

Vol 12 (7) ◽

pp. 937

Author(s):

Boyang Zhang ◽

Hongbo Mu ◽

Mingyu Gao ◽

Haiming Ni ◽

Jianfeng Chen ◽

...

Keyword(s):

Feature Extraction ◽

Image Segmentation ◽

State Of The Art ◽

Computational Cost ◽

Receptive Fields ◽

Feature Maps ◽

Multi Scale ◽

Extensive Evaluation ◽

Network Transition ◽

Segmentation Task

The precise segmentation of forest areas is essential for monitoring tasks related to forest exploration, extraction, and statistics. However, the effective and accurate segmentation of forest images is affected by factors such as blurring and discontinuity of forest boundaries. Therefore, a Pyramid Feature Extraction-UNet network (PFE-UNet) based on the traditional UNet is proposed for end-to-end forest image segmentation. The Pyramid Feature Extraction module (PFE) is introduced in the network transition layer and obtains multi-scale forest image information through different receptive fields. The spatial attention module (SA) and the channel-wise attention module (CA) are applied to low-level feature maps and PFE feature maps, respectively, to highlight specific segmentation task features while fusing context information and suppressing irrelevant regions. The standard convolution block is replaced by a novel depthwise separable convolutional unit (DSC Unit), which not only reduces the computational cost but also prevents overfitting. This paper presents an extensive evaluation with the DeepGlobe dataset and a comparative analysis with several state-of-the-art networks. The experimental results show that the PFE-UNet network obtains an accuracy of 94.23% in handling the real-time forest image segmentation, which is significantly higher than other advanced networks. The proposed PFE-UNet thus provides a valuable reference for the precise segmentation of forest images.
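The cost reduction that the abstract attributes to depthwise separable convolution can be illustrated with simple parameter arithmetic. The sketch below counts weights for a generic depthwise separable convolution (a per-channel spatial filter followed by a 1x1 pointwise convolution) and compares it with a standard convolution. It is not the paper's DSC Unit, whose exact layout is not given here, and the layer sizes are made-up examples; biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (per-channel) conv plus a 1x1 pointwise conv (no bias)."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
sep = depthwise_separable_params(3, 64, 128)
print(std, sep)    # 73728 8768
print(std / sep)   # reduction factor, roughly 8.4 for this layer
```

The reduction factor works out to 1 / (1/c_out + 1/kernel^2), so for 3x3 kernels and wide layers it approaches 9.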

Attention-over-Attention Field-Aware Factorization Machine

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6101 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6323-6330

Author(s):

Zhibo Wang ◽

Jinxin Ma ◽

Yongquan Zhang ◽

Qian Wang ◽

Ju Ren ◽

...

Keyword(s):

State Of The Art ◽

Popular Approach ◽

Large Margin ◽

Feature Interactions ◽

Great Performance ◽

Benchmark Datasets ◽

Field Information ◽

Factorization Machine ◽

Click Through Rate ◽

Novel Algorithm

Factorization Machine (FM) has been a popular approach in supervised predictive tasks, such as click-through rate prediction and recommender systems, due to its great performance and efficiency. Recently, several variants of FM have been proposed to improve its performance. However, most state-of-the-art prediction algorithms neglect the field information of features, and they also fail to discriminate the importance of feature interactions because of redundant features. In this paper, we present a novel algorithm called Attention-over-Attention Field-aware Factorization Machine (AoAFFM) for better capturing the characteristics of feature interactions. Specifically, we propose a field-aware embedding layer to exploit the field information of features, and combine it with the attention-over-attention mechanism to learn both feature-level and interaction-level attention to estimate the weight of feature interactions. Experimental results show that the proposed AoAFFM improves on FM and FFM by a large margin, and outperforms state-of-the-art algorithms on three public benchmark datasets.
