Collaborative filtering recommendation algorithm based on variational inference

2020 ◽  
Vol 4 (1) ◽  
pp. 31-44
Author(s):  
Kai Zheng ◽  
Xianjun Yang ◽  
Yilei Wang ◽  
Yingjie Wu ◽  
Xianghan Zheng

Purpose: The purpose of this paper is to alleviate the problem of poor robustness and overfitting caused by large-scale data in collaborative filtering recommendation algorithms.

Design/methodology/approach: Interpreting user behavior from the probabilistic perspective of hidden variables helps to improve robustness and reduce overfitting. Constructing a recommendation network by variational inference can effectively handle the complex distribution calculations in the probabilistic recommendation model. Based on this analysis, the paper uses a variational auto-encoder to construct a generative network that restores user-rating data, addressing the poor robustness and overfitting caused by large-scale data. In addition, to address the KL-vanishing problem found in deep learning models based on variational inference, the paper optimizes the model with the KL annealing and Free Bits methods.

Findings: The performance of the basic model improves considerably after KL annealing or Free Bits is used to counter KL vanishing. The proposed models evidently perform worse than competitors on small data sets, such as MovieLens 1M. By contrast, they perform better on large data sets such as MovieLens 10M and MovieLens 20M.

Originality/value: This paper presents the use of a variational inference model for collaborative filtering recommendation and introduces the KL annealing and Free Bits methods to improve the basic model. Because training with variational inference learns a probability distribution over the hidden vector, the problem of poor robustness and overfitting is alleviated. When the amount of data in an actual application scenario is relatively large, the probability distribution fitted to the actual data can better represent the user and the item. Therefore, using variational inference for collaborative filtering recommendation is of practical value.
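The abstract does not give the exact formulation used, so the following is only a minimal sketch of how KL annealing and Free Bits are commonly applied to a variational auto-encoder loss. It assumes a diagonal Gaussian posterior with a standard normal prior, a linear warm-up schedule, and a per-dimension Free Bits floor; the function names and the parameters `warmup_steps`, `beta_max` and `free_bits` are illustrative and do not come from the paper.

```python
import math


def kl_anneal_weight(step, warmup_steps, beta_max=1.0):
    """Linear KL annealing: ramp the KL weight from 0 to beta_max (assumed schedule)."""
    if warmup_steps <= 0:
        return beta_max
    return beta_max * min(1.0, step / warmup_steps)


def gaussian_kl_per_dim(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) for each latent dimension, in nats."""
    return [0.5 * (m * m + math.exp(lv) - lv - 1.0) for m, lv in zip(mu, logvar)]


def free_bits_kl(kl_per_dim, free_bits):
    """Free Bits (simplified, per dimension): each dimension counts for at least
    `free_bits` nats, so pushing its KL below that floor no longer lowers the loss."""
    return sum(max(kl, free_bits) for kl in kl_per_dim)


def vae_loss(recon_nll, mu, logvar, step, warmup_steps=1000, free_bits=0.0):
    """Reconstruction loss plus the annealed, floored KL term (hypothetical helper)."""
    kl = free_bits_kl(gaussian_kl_per_dim(mu, logvar), free_bits)
    return recon_nll + kl_anneal_weight(step, warmup_steps) * kl
```

Both techniques aim to stop the optimizer from driving the KL term to zero early in training, which is the KL-vanishing behavior the abstract refers to. In practice the two are often tried separately, as the abstract's "KL annealing or Free Bits" suggests.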

Author(s):  
Jun Huang ◽  
Linchuan Xu ◽  
Jing Wang ◽  
Lei Feng ◽  
Kenji Yamanishi

Existing multi-label learning (MLL) approaches mainly assume all the labels are observed and construct classification models with a fixed set of target labels (known labels). However, in some real applications, multiple latent labels may exist outside this set and hide in the data, especially for large-scale data sets. Discovering and exploring the latent labels hidden in the data may not only find interesting knowledge but also help us to build a more robust learning model. In this paper, a novel approach named DLCL (i.e., Discovering Latent Class Labels for MLL) is proposed which can not only discover the latent labels in the training data but also predict new instances with the latent and known labels simultaneously. Extensive experiments show a competitive performance of DLCL against other state-of-the-art MLL approaches.


Author(s):  
Vo Ngoc Phu ◽  
Vo Thi Ngoc Tran

Artificial intelligence (ARTINT) and information have been prominent fields for many years. One reason is that many different areas have advanced quickly on the basis of ARTINT and information, creating significant value over that time. This value has been used increasingly by national economies around the world, other sciences, companies, organizations, etc. As these economies have grown strongly, many massive corporations and large organizations have been established rapidly. Unsurprisingly, these corporations and organizations have generated large amounts of information and large-scale data sets. Processing and storing them successfully has been a major challenge for many commercial applications, studies, etc. To handle this problem, many algorithms have been proposed for processing these big data sets.


2017 ◽  
Author(s):  
Shirley M. Matteson ◽  
Sonya E. Sherrod ◽  
Sevket Ceyhun Cetin

2017 ◽  
Vol 8 (2) ◽  
pp. 30-43
Author(s):  
Mrutyunjaya Panda

Big Data, due to its complicated and diverse nature, poses many challenges for extracting meaningful observations. This calls for smart and efficient algorithms that can cope with the computational complexity and memory constraints arising from their iterative behavior. The issue may be addressed with parallel computing techniques, in which a single machine or multiple machines perform the work simultaneously by dividing the problem into subproblems and assigning private memory to each subproblem. Clustering analysis has been found useful for handling such large data in the recent past. Although many investigations into Big Data analysis are under way, Canopy and K-Means++ clustering are used here to process large-scale data in a shorter amount of time with no memory constraints. To assess the suitability of the approach, several data sets are considered, ranging from small to very large and drawn from diverse fields of application. The experimental results indicate that the proposed approach is fast and accurate.
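The abstract does not say how the two techniques are combined or parallelized, so the sketch below only illustrates the two building blocks in their textbook form on a single machine: Canopy clustering with a loose threshold `t1` and a tight threshold `t2`, and K-Means++ seeding, where each new center is drawn with probability proportional to its squared distance from the nearest center already chosen. All function names, thresholds and the sample points are assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def canopy_centers(points, t1, t2):
    """Canopy clustering (textbook form): returns a list of (center, members).
    Points within t1 of a center join its canopy; points within t2 are
    removed from the candidate list and cannot start a new canopy."""
    assert t1 > t2, "the loose threshold t1 must exceed the tight threshold t2"
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)
        members = [center] + [p for p in remaining if sq_dist(center, p) <= t1 ** 2]
        remaining = [p for p in remaining if sq_dist(center, p) > t2 ** 2]
        canopies.append((center, members))
    return canopies


def kmeanspp_seeds(points, k, rng):
    """K-Means++ seeding: first center uniform, later ones weighted by D^2."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point coincides with a chosen center
            break
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


# Hypothetical toy data: two well-separated groups of two points each.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
canopies = canopy_centers(points, t1=1.0, t2=0.5)
seeds = kmeanspp_seeds(points, k=len(canopies), rng=random.Random(0))
```

One common design, assumed here rather than taken from the paper, is to use the number of canopies as an estimate of `k` before running K-Means++, since Canopy needs only a single cheap pass over the data.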

