A similarity measure based on Kullback–Leibler divergence for collaborative filtering in sparse data
In neighbourhood-based collaborative filtering (CF) algorithms, a user similarity measure is used to find users similar to an active user. Most existing user similarity measures rely on co-rated items. However, sparse datasets contain too few co-rated items, which usually leads to poor prediction. In this article, a new similarity scheme is proposed that is free of the co-rated-item constraint. Moreover, an item similarity measure based on the Kullback–Leibler (KL) divergence is presented, which identifies the relation between items from the probability density distribution of their ratings. Because the KL-based item similarity makes full use of all ratings, it is more flexible on sparse datasets. The CF algorithm using the proposed similarity scheme is implemented and compared with several classic CF algorithms. The comparison results show that CF using the proposed similarity has better predictive performance. Therefore, the proposed similarity scheme is a good solution for the sparsity problem and has great potential for application in recommendation systems.
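To illustrate how an item similarity can be derived from rating distributions without any co-rated items, the following is a minimal sketch only. The abstract does not state the exact formula, so several choices here are assumptions rather than the proposed method: a discrete 1 to 5 rating scale, additive (Laplace) smoothing of the rating histograms, a symmetrised KL divergence, and the mapping 1 / (1 + D) from divergence to similarity. The names `rating_distribution`, `kl_divergence` and `kl_item_similarity` are hypothetical.

```python
import math
from collections import Counter

# Assumption: ratings are integers on a 1..5 scale.
RATING_SCALE = (1, 2, 3, 4, 5)


def rating_distribution(ratings, scale=RATING_SCALE, alpha=1.0):
    """Estimate an item's rating distribution from all of its ratings.

    Additive smoothing (alpha) is an assumption made here so that no
    probability is zero and the KL divergence stays finite.
    """
    counts = Counter(ratings)
    total = len(ratings) + alpha * len(scale)
    return [(counts[r] + alpha) / total for r in scale]


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) for strictly positive p and q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_item_similarity(ratings_i, ratings_j):
    """Map a symmetrised KL divergence to a similarity in (0, 1].

    Both the symmetrisation and the 1 / (1 + D) mapping are illustrative
    assumptions, not the formula from the article.
    """
    p = rating_distribution(ratings_i)
    q = rating_distribution(ratings_j)
    d = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return 1.0 / (1.0 + d)


# The two rating lists may come from entirely different users:
# no co-rated items are required.
liked_a = [5, 5, 4]
liked_b = [5, 4, 4]
disliked = [1, 1, 2]
sim_close = kl_item_similarity(liked_a, liked_b)
sim_far = kl_item_similarity(liked_a, disliked)
```

Under these assumptions, `sim_close` exceeds `sim_far`, because the first pair of items has similar rating histograms while the second pair does not. Each item's distribution is estimated from all of its ratings, which is why the measure remains usable when the rating matrix is sparse.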