A similarity measure based on Kullback–Leibler divergence for collaborative filtering in sparse data

2018 ◽  
Vol 45 (5) ◽  
pp. 656-675 ◽  
Author(s):  
Jiangzhou Deng ◽  
Yong Wang ◽  
Junpeng Guo ◽  
Yongheng Deng ◽  
Jerry Gao ◽  
...  

In neighbourhood-based collaborative filtering (CF) algorithms, a user similarity measure is used to find other users similar to an active user. Most existing user similarity measures rely on co-rated items. However, sparse datasets contain too few co-rated items, which usually leads to poor prediction. In this article, a new similarity scheme is proposed that breaks free of the co-rated-item constraint. Moreover, an item similarity measure based on the Kullback–Leibler (KL) divergence is presented, which identifies the relation between items based on the probability density distribution of ratings. Because the KL-based item similarity makes full use of all ratings, it is more flexible on sparse datasets. The CF algorithm using our proposed similarity scheme is implemented and compared with some classic CF algorithms. The comparison shows that CF using our similarity has better predictive performance. Therefore, our similarity scheme is a good solution for the sparsity problem and has great potential to be applied to recommendation systems.
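As a rough illustration of the idea in this abstract, the sketch below compares two items through the distribution of their rating values rather than through co-rated users. It is not the paper's formula: the 1 to 5 rating scale, the add-one smoothing, the symmetrised KL divergence, and the `1 / (1 + D)` mapping to a similarity are all assumptions made for this example, and the function names are hypothetical.

```python
import math
from collections import Counter

RATING_SCALE = (1, 2, 3, 4, 5)  # assumed rating scale, not taken from the paper


def rating_distribution(ratings, scale=RATING_SCALE, smoothing=1.0):
    """Smoothed probability of each rating value for one item."""
    counts = Counter(ratings)
    total = len(ratings) + smoothing * len(scale)
    return [(counts.get(r, 0) + smoothing) / total for r in scale]


def kl_divergence(p, q):
    """KL divergence D(p || q) for two discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def kl_item_similarity(ratings_i, ratings_j):
    """Map a symmetrised KL divergence to a similarity in (0, 1]."""
    p = rating_distribution(ratings_i)
    q = rating_distribution(ratings_j)
    symmetric = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + symmetric)  # assumed mapping, for illustration only


# Ratings may come from completely different users; no co-rating is needed.
item_a = [5, 4, 5, 5]
item_b = [4, 5, 5]
item_c = [1, 2, 1, 1, 2]
sim_ab = kl_item_similarity(item_a, item_b)
sim_ac = kl_item_similarity(item_a, item_c)
```

Under these assumptions, `sim_ab` comes out larger than `sim_ac`, because items A and B are both rated mostly 4 and 5 while item C is rated mostly 1 and 2.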

2021 ◽  
Vol 11 (13) ◽  
pp. 6108
Author(s):  
Jehan Al-Safi ◽  
Cihan Kaleli

A technique employed by recommendation systems is collaborative filtering, which predicts item ratings and recommends items that may be interesting to the user. Naturally, users have diverse opinions, and trusting only user ratings of products may produce inaccurate recommendations. Therefore, it is essential to offer a new similarity measure that enhances recommendation accuracy, even for customers who leave only a few ratings. Thus, this article proposes an algorithm for user similarity measures that exploits item genre information to make more accurate recommendations. This algorithm measures the relationship between users using item genre information, discovers the active user's nearest neighbors in each genre, and builds the final list of nearest neighbors who share the active user's preference in a genre. Finally, it predicts the active user's rating of items using a definite prediction procedure. To measure the accuracy, we propose new evaluation criteria: the rating level and the reliability among users according to rating level. We implement the proposed method on real datasets. The empirical results show that the proposed algorithm achieves better predicted rating accuracy, rating level, and reliability between users than many existing collaborative filtering algorithms.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Lili Wang ◽  
Ting Shi ◽  
Shijin Li

Since the user recommendation complex matrix is characterized by strong sparsity, it is difficult to correctly recommend relevant services for users by using the recommendation method based on location and collaborative filtering. The similarity measure between users is low. This paper proposes a fusion method based on KL divergence and cosine similarity. A comparison of three similarity metrics at different K values shows that KL divergence and cosine similarity have advantages. Using the fusion method of the two, the user's similarity with the preference is reused. Compared with the location-based collaborative filtering (LCF) algorithm, the user-based collaborative filtering (UCF) algorithm, and the user recommendation algorithm (F2F), the proposed method has advantages in precision rate, recall rate, and experimental effect. In different median values, the proposed method also has an advantage in experimental results.
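The abstract does not say how the two measures are fused, so the following is only a minimal sketch under stated assumptions: users are rating vectors with 0 meaning "unrated", the KL part compares the users' rating-value distributions with add-one smoothing, and the fusion is a simple weighted average controlled by a hypothetical parameter `alpha`. None of these choices is taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length rating vectors (0 = unrated)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rating_distribution(vector, scale=(1, 2, 3, 4, 5), smoothing=1.0):
    """Smoothed distribution of the rating values a user has given."""
    counts = Counter(r for r in vector if r != 0)
    total = sum(counts.values()) + smoothing * len(scale)
    return [(counts.get(r, 0) + smoothing) / total for r in scale]


def kl_similarity(u, v):
    """Similarity derived from a symmetrised KL divergence (assumed form)."""
    p, q = rating_distribution(u), rating_distribution(v)
    kl_pq = sum(a * math.log(a / b) for a, b in zip(p, q))
    kl_qp = sum(b * math.log(b / a) for a, b in zip(p, q))
    return 1.0 / (1.0 + 0.5 * (kl_pq + kl_qp))


def fused_similarity(u, v, alpha=0.5):
    """Weighted blend of the two measures; alpha is a hypothetical weight."""
    return alpha * kl_similarity(u, v) + (1.0 - alpha) * cosine_similarity(u, v)


alice = [5, 0, 4, 0, 1]
bob = [0, 5, 0, 4, 0]  # shares no rated item with alice
cos_ab = cosine_similarity(alice, bob)
fused_ab = fused_similarity(alice, bob)
```

In this toy case the cosine similarity is zero because the two users have no item in common, while the fused score stays positive because the KL term still compares how the two users distribute their ratings.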


2013 ◽  
Vol 13 (Special-Issue) ◽  
pp. 122-130
Author(s):  
Yue Huang ◽  
Xuedong Gao ◽  
Shujuan Gu

User similarity measurement plays a key role in collaborative filtering recommendation, which is the most widely applied technique in recommender systems. Traditional user-based collaborative filtering recommendation methods focus on the absolute rating difference on commonly rated items while neglecting the relative rating-level difference on the same items. To overcome this drawback, we propose a novel user similarity measure that takes into account the rating-level gap that users could accept. Collaborative filtering recommendation based on the User Acceptable Rating Radius (UARR), evaluated on a real movie rating data set, the MovieLens data set, generates more accurate predictions than the traditional similarity methods.


2020 ◽  
Vol 9 (9) ◽  
pp. 519 ◽  
Author(s):  
Soroush Ojagh ◽  
Mohammad Reza Malek ◽  
Sara Saeedi

Providing recommendations in cold start situations is one of the most challenging problems for collaborative filtering based recommender systems (RSs). Although user social context information has largely contributed to addressing the cold start problem, most RSs still suffer from the lack of initial social links for newcomers. In this study, we address this issue using a proposed user similarity detection engine (USDE). Utilizing users' personal smart devices enables the proposed USDE to automatically extract real-world social interactions between users. Moreover, the proposed USDE uses a user clustering algorithm that includes contextual information for identifying similar users based on their profiles. The dynamically updated contextual information for the user profiles helps with user similarity clustering and provides more personalized recommendations. The proposed RS is evaluated using movie recommendations as a case study. The results show that the proposed RS can improve the accuracy and personalization level of recommendations compared with two other widely applied collaborative filtering RSs. In addition, the performance of the USDE is evaluated in different scenarios. The experimental results show that the proposed USDE outperforms widely applied similarity measures in cold start and data sparsity situations.


Author(s):  
Keunho Choi ◽  
Yongmoo Suh ◽  
Donghee Yoo

Many online shopping malls have implemented personalized recommendation systems to improve customer retention in the age of high competition and information overload. Sellers make use of these recommendation systems to survive high competition and buyers utilize them to find proper product information for their own needs. However, transaction data of most online shopping malls prevent us from using collaborative filtering (CF) technique to recommend products, for the following two reasons: 1) explicit rating information is rarely available in the transaction data; 2) the sparsity problem usually occurs in the data, which makes it difficult to identify reliable neighbors, resulting in less effective recommendations. Therefore, this paper first suggests a means to derive implicit rating information from the transaction data of an online shopping mall and then proposes a new user similarity function to mitigate the sparsity problem. The new user similarity function computes the user similarity of two users if they rated similar items, while the user similarity function of traditional CF technique computes it only if they rated common items. Results from several experiments using an online shopping mall dataset in Korea demonstrate that our approach significantly outperforms the traditional CF technique.


2019 ◽  
Vol 3 (3) ◽  
pp. 39 ◽  
Author(s):  
Mahamudul Hasan ◽  
Falguni Roy

Item-based collaborative filtering is one of the most popular techniques in recommender systems for retrieving useful items for users by finding the correlation among items. Traditional item-based collaborative filtering works well when sufficient rating data exist but cannot calculate similarity for new items, which is known as the cold-start problem. Because rating data are lacking, identifying the similarity among cold-start items is usually difficult. As a result, existing techniques fail to produce accurate recommendations for cold-start items, which also affects the recommender system's performance. In this paper, two item-based similarity measures are designed to overcome this problem by incorporating items' genre data. An item may resemble other items because they can belong to more than one common genre. Thus, one of the similarity measures is defined by determining the degree of direct asymmetric correlation between items by considering their association with common genres. However, the similarity is determined between a pair of items where one item could be a cold-start item and the other could be any highly rated item. Thus, the proposed similarity measure is treated as asymmetric by taking the item's rating data into consideration. The other similarity measure is defined as the relative interconnection between items based on transitive inference. In addition, an enhanced prediction algorithm is proposed so that it can calculate a better prediction for the recommendation. The proposed approach was evaluated on two popular datasets, MovieLens and MovieTweets, and was found to perform better than the traditional techniques in a collaborative filtering recommender system. On MovieLens and MovieTweets respectively, the proposed approach improved mean absolute error by approximately 3.42% and 8.58%, precision by 7.25% and 3.29%, recall by 7.20% and 7.55%, F-measure by 8.76% and 5.15%, and mean reciprocal rank by 49.3% and 16.49%.
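To make the notion of an asymmetric genre-based similarity concrete, here is a minimal sketch. It is a simplification and not the paper's measure: the score is just the fraction of one item's genres that the other item also has, so the asymmetry comes from the differing genre-set sizes rather than from rating data. The function name and the genre labels are hypothetical.

```python
def asymmetric_genre_similarity(genres_i, genres_j):
    """Share of item i's genres that item j also has; not symmetric in i and j."""
    gi, gj = set(genres_i), set(genres_j)
    if not gi:
        return 0.0
    return len(gi & gj) / len(gi)


# A cold-start item with no ratings can still be compared through its genres.
cold_item = {"Action", "Sci-Fi"}
rated_item = {"Action", "Sci-Fi", "Thriller", "Adventure"}

forward = asymmetric_genre_similarity(cold_item, rated_item)   # 2 of 2 genres covered
backward = asymmetric_genre_similarity(rated_item, cold_item)  # 2 of 4 genres covered
```

Here `forward` is 1.0 and `backward` is 0.5: every genre of the cold-start item appears in the rated item, but only half of the rated item's genres appear in the cold-start item.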


2022 ◽  
Vol 24 (3) ◽  
pp. 0-0

The cost-effective and easy availability of handheld mobile devices and the ubiquity of location acquisition services such as GPS and GSM networks have helped expedient logging and sharing of location histories of mobile users. This work aims to find semantic user similarity using users' past travel histories. The semantic similarity measure can be applied in tourism-related recommender systems and information retrieval. The paper presents an Earth Mover's Distance (EMD) based semantic user similarity measure using users' GPS logs. The similarity measure is applied and evaluated on the GPS dataset of 182 users collected from April 2007 to August 2012 by Microsoft's GeoLife project. The proposed similarity measure is compared with conventional similarity measures used in the literature, such as Jaccard, Dice, and Pearson's correlation. The EMD-based approach improves on existing approaches by 10.70% in average RMSE and 5.73% in average MAE.
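The sketch below shows two of the set-based baselines named in this abstract (Jaccard and Dice) next to a one-dimensional Earth Mover's Distance. It rests on assumptions not found in the abstract: users are represented by hypothetical sets of visited place categories for the baselines, and by normalized histograms over the same ordered bins, with unit distance between adjacent bins, for the EMD. The paper's semantic EMD over GPS logs is likely more involved than this 1-D special case.

```python
def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Dice coefficient: 2 |A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def emd_1d(p, q):
    """EMD between two normalized histograms over the same ordered bins.

    With unit distance between adjacent bins, the 1-D EMD equals the sum of
    absolute differences between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    distance = 0.0
    carried = 0.0  # mass that still has to be moved to later bins
    for pi, qi in zip(p, q):
        carried += pi - qi
        distance += abs(carried)
    return distance


# Hypothetical place categories visited by two users.
user_a_places = {"museum", "park", "cafe"}
user_b_places = {"park", "cafe", "stadium"}
jac = jaccard(user_a_places, user_b_places)  # 2 shared out of 4 distinct
dic = dice(user_a_places, user_b_places)     # 2 * 2 / (3 + 3)

# Hypothetical visit histograms over three ordered bins.
far = emd_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])   # all mass moves two bins
near = emd_1d([0.5, 0.5, 0.0], [0.5, 0.0, 0.5])  # half the mass moves one bin
```

Unlike Jaccard and Dice, which only register whether categories overlap, the EMD also reflects how far mass has to move between bins, which is why `far` is larger than `near`.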

