Sequential Recommendation with Relation-Aware Kernelized Self-Attention

2020 ◽  
Vol 34 (04) ◽  
pp. 4304-4311
Author(s):  
Mingi Ji ◽  
Weonyoung Joo ◽  
Kyungwoo Song ◽  
Yoon-Yeong Kim ◽  
Il-Chul Moon

Recent studies identified that sequential recommendation is improved by the attention mechanism. Following this development, we propose Relation-Aware Kernelized Self-Attention (RKSA), which adopts the self-attention mechanism of the Transformer and augments it with a probabilistic model. The original self-attention of the Transformer is a deterministic measure without relation-awareness. Therefore, we introduce a latent space to the self-attention, and the latent space models the recommendation context from relations as a multivariate skew-normal distribution with a kernelized covariance matrix built from co-occurrences, item characteristics, and user information. This work merges the self-attention of the Transformer with sequential recommendation by adding a probabilistic model of the recommendation task specifics. We evaluated RKSA on benchmark datasets, where it shows significant improvements over recent baseline models. RKSA is also able to produce a latent space model that explains the reasons behind a recommendation.
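As an illustration of the general idea only (not the authors' probabilistic skew-normal formulation), the sketch below adds a relation-aware kernel bias, built here from hypothetical item co-occurrence counts, to the logits of standard scaled dot-product self-attention.

```python
import numpy as np

def relation_aware_attention(X, Wq, Wk, Wv, K_rel):
    """Scaled dot-product self-attention with an additive relation-aware
    bias K_rel on the attention logits (built here from item co-occurrence
    counts). A simplified deterministic sketch, not the probabilistic
    skew-normal formulation used by RKSA."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d) + K_rel              # relation bias added to the logits
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ V

# toy usage: a session of 5 items with 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
cooc = rng.poisson(2.0, size=(5, 5)).astype(float)     # hypothetical co-occurrence counts
K_rel = np.log1p((cooc + cooc.T) / 2)                  # symmetric kernel from co-occurrences
out = relation_aware_attention(X, Wq, Wk, Wv, K_rel)   # shape (5, 8)
```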

2020 ◽  
Vol 34 (04) ◽  
pp. 5289-5297
Author(s):  
Luke J. O'Connor ◽  
Muriel Medard ◽  
Soheil Feizi

A latent space model for a family of random graphs assigns real-valued vectors to the nodes of a graph such that edge probabilities are determined by the latent positions. Latent space models provide a natural statistical framework for graph visualization and clustering. A latent space model of particular interest is the Random Dot Product Graph (RDPG), which can be fit using an efficient spectral method; however, this method is based on a heuristic that can fail, even in simple cases. Here, we consider a closely related latent space model, the Logistic RDPG, which uses a logistic link function to map from latent positions to edge likelihoods. For this model, we show that asymptotically exact maximum likelihood inference of the latent position vectors can be achieved with an efficient spectral method. Our method involves computing the top eigenvectors of a normalized adjacency matrix and scaling the eigenvectors using a regression step; this novel regression-scaling step is an essential part of the proposed method. In simulations, we show that our method is more accurate and more robust than common practice. We also demonstrate its effectiveness on standard real networks, the karate club and political blogs networks.
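A minimal sketch of the two-step recipe described above (top eigenvectors of a normalized adjacency matrix, then a regression-based rescaling); the centering and scaling choices here are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def spectral_latent_positions(A, d):
    """Spectral estimate of d-dimensional latent positions in the spirit of
    the two-step recipe above: (1) take the top eigenvectors of a centered
    adjacency matrix, (2) rescale each eigenvector with a least-squares
    (regression) coefficient. Details are illustrative assumptions."""
    n = A.shape[0]
    p_hat = A.sum() / (n * (n - 1))               # overall edge density
    B = A - p_hat * (1 - np.eye(n))               # remove the constant part off-diagonal
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(np.abs(vals))[::-1][:d]      # components with largest |eigenvalue|
    X = np.zeros((n, d))
    for j, k in enumerate(top):
        u = vecs[:, k]
        M = np.outer(u, u)
        beta = (B * M).sum() / (M * M).sum()      # regress B on the rank-one term u u^T
        X[:, j] = u * np.sqrt(max(beta, 0.0))     # scale so X X^T approximates B
    return X

# toy usage on a small undirected graph
rng = np.random.default_rng(1)
A = (rng.random((30, 30)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T                    # symmetric, zero diagonal
Z_hat = spectral_latent_positions(A, d=2)
```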


Methodology ◽  
2006 ◽  
Vol 2 (1) ◽  
pp. 24-33 ◽  
Author(s):  
Susan Shortreed ◽  
Mark S. Handcock ◽  
Peter Hoff

Recent advances in latent space and related random effects models hold much promise for representing network data. The inherent dependency between ties in a network makes modeling data of this type difficult. In this article we consider a recently developed latent space model that is particularly appropriate for the visualization of networks. We suggest a new estimator of the latent positions and perform two network analyses, comparing four alternative estimators. We demonstrate a method of checking the validity of the positional estimates. These estimators are implemented via a package in the freeware statistical language R. The package allows researchers to efficiently fit the latent space model to data and to visualize the results.
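For readers unfamiliar with the model class, a latent distance model places each node in a low-dimensional space and lets the tie probability decrease with pairwise distance. The toy Python sketch below fits such a model by gradient ascent on the log-likelihood; it is a stand-in for illustration only, not the R package's estimators or the estimators compared in the article.

```python
import numpy as np

def fit_latent_distance_model(Y, d=2, steps=2000, lr=0.01, seed=0):
    """Crude maximum-likelihood fit of a latent distance model,
    P(Y_ij = 1) = logistic(alpha - ||z_i - z_j||), by gradient ascent.
    Positions are only identified up to rotation, reflection, translation."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    Z = rng.normal(scale=0.1, size=(n, d))
    alpha = 0.0
    mask = ~np.eye(n, dtype=bool)                       # no self-ties
    for _ in range(steps):
        diff = Z[:, None, :] - Z[None, :, :]            # pairwise differences
        dist = np.sqrt((diff ** 2).sum(-1)) + 1e-9
        p = 1.0 / (1.0 + np.exp(-(alpha - dist)))
        R = (Y - p) * mask                              # score residuals
        alpha += lr * R.sum()                           # gradient step for the intercept
        grad_Z = -(((R + R.T) / dist)[:, :, None] * diff).sum(axis=1)
        Z += lr * grad_Z                                # gradient step for the positions
    return Z, alpha
```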


2020 ◽  
Vol 10 (24) ◽  
pp. 9132
Author(s):  
Liguo Weng ◽  
Xiaodong Zhang ◽  
Junhao Qian ◽  
Min Xia ◽  
Yiqing Xu ◽  
...  

Non-intrusive load disaggregation (NILD) is of great significance to the development of smart grids. Current energy disaggregation methods extract features from sequences, a process that easily loses load features and makes detection difficult, resulting in a low recognition rate for low-use electrical appliances. To solve this problem, a non-intrusive sequential energy disaggregation method based on a multi-scale attention residual network is proposed. Multi-scale convolutions are used to learn features, and the attention mechanism is used to enhance the learning of load features. Residual learning further improves the performance of the algorithm, avoids network degradation, and improves the precision of load decomposition. Experimental results on two benchmark datasets show that the proposed algorithm outperforms existing algorithms in terms of load disaggregation accuracy and on/off state judgments, and that the attention mechanism further improves the disaggregation accuracy for low-frequency electrical appliances.
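The block below is a hedged sketch of how the three ingredients named in the abstract can be combined in PyTorch: parallel multi-scale 1-D convolutions, a channel-attention gate over the fused features, and a residual connection. The kernel sizes, channel counts, and attention design are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiScaleAttentionResBlock(nn.Module):
    """Illustrative multi-scale attention residual block for sequence
    (mains-power) features; hyperparameters are assumptions."""
    def __init__(self, channels=32):
        super().__init__()
        # parallel convolutions with different receptive fields
        self.branches = nn.ModuleList([
            nn.Conv1d(channels, channels, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.fuse = nn.Conv1d(3 * channels, channels, kernel_size=1)
        # simple squeeze-and-excitation style channel attention
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool1d(1),
            nn.Conv1d(channels, channels // 4, 1), nn.ReLU(),
            nn.Conv1d(channels // 4, channels, 1), nn.Sigmoid(),
        )
        self.act = nn.ReLU()

    def forward(self, x):                       # x: (batch, channels, time)
        feats = torch.cat([b(x) for b in self.branches], dim=1)
        fused = self.fuse(feats)
        gated = fused * self.attn(fused)        # re-weight channels by attention
        return self.act(x + gated)              # residual connection

# toy usage on a window of mains-power features
block = MultiScaleAttentionResBlock(channels=32)
window = torch.randn(8, 32, 256)                # batch of 8, 256 time steps
out = block(window)                             # same shape as the input
```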


Author(s):  
Reinaldo B. Arellano-Valle ◽  
Adelchi Azzalini

For the family of multivariate probability distributions variously denoted as unified skew-normal, closed skew-normal, and other names, a number of properties are already known, but many others are not, even some basic ones. The present contribution aims at filling some of these gaps. Specifically, the moments up to the fourth order are obtained, and from these the expressions of Mardia's measures of multivariate skewness and kurtosis. Other results concern the log-concavity of the distribution, closure with respect to conditioning on intervals, and a possible alternative parameterization.
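For reference, Mardia's measures of multivariate skewness and kurtosis, whose expressions the paper derives for this family, are defined as follows (standard definitions, not the paper's derived expressions):

```latex
% Mardia's measures for a d-dimensional random vector X with mean \mu and
% covariance \Sigma; X' denotes an independent copy of X.
\beta_{1,d} = \mathbb{E}\!\left\{ \left[ (X-\mu)^{\top} \Sigma^{-1} (X'-\mu) \right]^{3} \right\},
\qquad
\beta_{2,d} = \mathbb{E}\!\left\{ \left[ (X-\mu)^{\top} \Sigma^{-1} (X-\mu) \right]^{2} \right\}.
```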


Symmetry ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 815
Author(s):  
Christopher Adcock

A recent paper presents an extension of the skew-normal distribution which is a copula. Under this model, the standardized marginal distributions are standard normal. The copula itself depends on the familiar skewing construction based on the normal distribution function. This paper is concerned with two topics. First, it presents a number of extensions of the skew-normal copula, notably including a case in which the standardized marginal distributions are Student's t, with different degrees of freedom allowed for each margin. In this case the skewing function need not be the distribution function of Student's t, but can depend on certain special functions. Secondly, several multivariate versions of the skew-normal copula model are presented. The paper contains several illustrative examples.
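The "familiar skewing construction" referred to above is the one underlying the scalar skew-normal density, recalled below for reference; this is the building block, not the copula extensions discussed in the paper.

```latex
% Standard skew-normal density from the normal skewing construction;
% \phi and \Phi are the standard normal pdf and cdf, \alpha the slant parameter.
f(x;\alpha) = 2\,\phi(x)\,\Phi(\alpha x), \qquad x \in \mathbb{R}.
```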


2005 ◽  
Vol 19 (3) ◽  
pp. 205-214 ◽  
Author(s):  
G. Mateu-Figueras ◽  
V. Pawlowsky-Glahn ◽  
C. Barceló-Vidal
