Augmenting Black Sheep Neighbour Importance for Enhancing Rating Prediction Accuracy in Collaborative Filtering

2021 ◽  
Vol 11 (18) ◽  
pp. 8369
Author(s):  
Dionisis Margaris ◽  
Dimitris Spiliotopoulos ◽  
Costas Vassilakis

In this work, an algorithm for enhancing rating prediction accuracy in collaborative filtering (CF) is presented; it needs no supplementary information, utilising only the users’ ratings on items. This accuracy enhancement is achieved by augmenting the importance of the opinions of ‘black sheep near neighbours’, which are pairs of near neighbours whose agreeing opinions on an item deviate from the dominant community opinion on the same item. The presented work substantiates that the weights of near neighbours can be adjusted based on the degree to which the target user and the near neighbour deviate from the dominant ratings for each item. This concept can be utilised in various other CF algorithms. The experimental evaluation was conducted on six datasets broadly used in CF research, using two user similarity metrics and two rating prediction error metrics. The results show that the proposed technique increases rating prediction accuracy both when used independently and when combined with other CF algorithms. The proposed algorithm is designed to work without requiring any supplementary sources of information, such as user relations in social networks or detailed item descriptions. These findings indicate both the efficacy and the applicability of the proposed work.
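
The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea under stated assumptions: the "dominant community opinion" is modelled as the item's mean rating, and the function name `black_sheep_boost` and the parameters `threshold` and `alpha` are hypothetical, not taken from the paper.

```python
def black_sheep_boost(ratings, user, neighbour, sim, threshold=1.0, alpha=0.5):
    """Scale a neighbour's similarity weight by joint deviation from the crowd.

    ratings: dict mapping user -> {item: rating}.
    Assumption (not from the paper): an item counts as a 'black sheep'
    agreement when both users deviate from the item's mean rating by at
    least `threshold` and in the same direction. `alpha` controls how
    strongly such agreements increase the weight.
    """
    common = set(ratings[user]) & set(ratings[neighbour])
    if not common:
        return sim
    joint = 0
    for item in common:
        # Dominant opinion approximated by the mean of all ratings on the item.
        item_ratings = [r[item] for r in ratings.values() if item in r]
        dominant = sum(item_ratings) / len(item_ratings)
        dev_u = ratings[user][item] - dominant
        dev_v = ratings[neighbour][item] - dominant
        if abs(dev_u) >= threshold and abs(dev_v) >= threshold and dev_u * dev_v > 0:
            joint += 1
    # Weight grows with the share of co-rated items showing joint deviation.
    return sim * (1 + alpha * joint / len(common))
```

The adjusted weight could then replace the plain similarity in any neighbourhood-based prediction formula, which is one way to read the claim that the concept carries over to other CF algorithms.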

Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 174
Author(s):  
Dionisis Margaris ◽  
Dimitris Spiliotopoulos ◽  
Gregory Karagiorgos ◽  
Costas Vassilakis

Collaborative filtering algorithms formulate personalized recommendations for a user, first by analysing already entered ratings to identify other users with similar tastes to the user (termed near neighbours), and then using the opinions of the near neighbours to predict which items the target user would like. However, in sparse datasets, too few near neighbours can be identified, resulting in low accuracy predictions and even a total inability to formulate personalized predictions. This paper addresses the sparsity problem by presenting an algorithm that uses robust predictions, that is, predictions deemed highly probable to be accurate, as derived ratings. Thus, the density of sparse datasets increases, and improved rating prediction coverage and accuracy are achieved. The proposed algorithm, termed CFDR, is extensively evaluated using (1) seven widely-used collaborative filtering datasets, (2) the two most widely-used correlation metrics in collaborative filtering research, namely the Pearson correlation coefficient and the cosine similarity, and (3) the two most widely-used error metrics in collaborative filtering, namely the mean absolute error and the root mean square error. The evaluation results show that, by successfully increasing the density of the datasets, the capacity of collaborative filtering systems to formulate personalized and accurate recommendations is considerably improved.
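
The two-step process described above (find near neighbours, then combine their opinions) can be sketched as a textbook user-based predictor with the Pearson correlation coefficient. This is a generic illustration, not the CFDR algorithm itself; the function names and the choice to keep only positively correlated neighbours are assumptions made for the example.

```python
from math import sqrt


def pearson(ratings_u, ratings_v):
    """Pearson correlation computed over the items both users have rated."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)


def predict(ratings, user, item):
    """Mean-centred prediction from near neighbours who rated `item`.

    Returns None when no positively correlated neighbour has rated the
    item, i.e. when no personalized prediction can be formulated.
    """
    mean_user = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # assumption: only positively correlated users are near neighbours
        mean_other = sum(other_ratings.values()) / len(other_ratings)
        num += sim * (other_ratings[item] - mean_other)
        den += sim
    if den == 0:
        return None
    return mean_user + num / den
```

The `None` return corresponds to the coverage failure the abstract attributes to sparse datasets: with too few co-rated items, no usable neighbour exists for the requested item.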


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 68301-68310 ◽  
Author(s):  
Dionisis Margaris ◽  
Anna Kobusinska ◽  
Dimitris Spiliotopoulos ◽  
Costas Vassilakis

2014 ◽  
Vol 610 ◽  
pp. 747-751
Author(s):  
Jian Sun ◽  
Xiao Ying Chen

To address the extreme sparsity of user-item rating data and the resulting poor recommendation quality, we put forward a collaborative filtering recommendation algorithm based on the cloud model, item attributes and user data, building on the existing literature. A rating prediction algorithm based on the cloud model and item attributes is proposed, based on the idea that similar users’ ratings for the same item are similar, and that the same user’s ratings for similar items are similar and stable. By comparing and analysing the experimental results of this paper against those of other studies, we conclude that the rating prediction accuracy is improved.


Author(s):  
Shulin Cheng ◽  
Bofeng Zhang ◽  
Guobing Zou

The collaborative filtering (CF) approach has been successfully applied to rating prediction in personalized recommendation. However, many CF methods leverage only a single information source, i.e., they use information derived from a single perspective of the user-item matrix: the user-based CF method mainly utilizes information from the user view, while the item-based CF method mainly exploits information from the item view. In this paper, in order to take full advantage of the multiple information sources embedded in the user-item rating matrix, we propose a rating-based integrated recommendation framework of CF approaches to improve rating prediction accuracy. First, to address the sparsity problem of the conventional item-based CF method, we improve it by fusing the inner similarity and outer similarity based on the local sparsity factor. We also propose an improved user-based CF method in line with the user-item-interest model (UIIM) by means of preliminary rating. Second, we put forward a background method called user-item-based improved CF (UIBCF-I), which utilizes the information of both similar items and similar users, to smooth the item-based and user-based CF methods. Lastly, we leverage the three information sources and fuse their corresponding ratings into an integrated CF model (INTE-CF). Experiments demonstrate that, compared with other mainstream CF approaches, the proposed rating-based INTE-CF improves prediction accuracy, is strongly robust and has low sensitivity to dataset sparsity.


2020 ◽  
Vol 10 (12) ◽  
pp. 4599-4613
Author(s):  
Fabio Morgante ◽  
Wen Huang ◽  
Peter Sørensen ◽  
Christian Maltecca ◽  
Trudy F. C. Mackay

The ability to accurately predict complex trait phenotypes from genetic and genomic data is critical for the implementation of personalized medicine and precision agriculture; however, prediction accuracy for most complex traits is currently low. Here, we used data on whole genome sequences, deep RNA sequencing, and high quality phenotypes for three quantitative traits in the ∼200 inbred lines of the Drosophila melanogaster Genetic Reference Panel (DGRP) to compare the prediction accuracies of gene expression and genotypes for three complex traits. We found that expression levels (r = 0.28 and 0.38, for females and males, respectively) provided higher prediction accuracy than genotypes (r = 0.07 and 0.15, for females and males, respectively) for starvation resistance, similar prediction accuracy for chill coma recovery (null for both models and sexes), and lower prediction accuracy for startle response (r = 0.15 and 0.14 for female and male genotypes, respectively; and r = 0.12 and 0.11, for female and male transcripts, respectively). Models including both genotype and expression levels did not outperform the best single component model. However, accuracy increased considerably for all three traits when we included gene ontology (GO) category as an additional layer of information for both genomic variants and transcripts. We found strongly predictive GO terms for each of the three traits, some of which had a clear, plausible biological interpretation. For example, for starvation resistance in females, GO:0033500 (r = 0.39 for transcripts) and GO:0032870 (r = 0.40 for transcripts) have been implicated in carbohydrate homeostasis and cellular response to hormone stimulus (including the insulin receptor signaling pathway), respectively. In summary, this study shows that integrating different sources of information improved prediction accuracy and helped elucidate the genetic architecture of three Drosophila complex phenotypes.


Author(s):  
ROSA M. RODRÍGUEZ ◽  
LUIS MARTÍNEZ ◽  
DA RUAN ◽  
JUN LIU

Nuclear safeguards evaluation aims to verify that countries are not misusing nuclear programs for nuclear weapons purposes. Experts of the International Atomic Energy Agency (IAEA) carry out an evaluation process in which several hundred indicators are assessed according to the information obtained from different sources, such as State declarations, on-site inspections, IAEA non-safeguards databases and other open sources. These assessments are synthesized hierarchically to obtain a global assessment. Much of the information, and many of the information sources, related to nuclear safeguards are vague, imprecise and ill-defined. The fuzzy linguistic approach has provided good results in dealing with such uncertainties in this type of problem. However, a new challenge in nuclear safeguards evaluation has attracted the attention of researchers. Due to the complexity and vagueness of the sources of information obtained by IAEA experts and the huge number of indicators involved in the problem, the experts commonly cannot assess all of the indicators, so missing values appear in the evaluation, which can bias the nuclear safeguards results. This paper proposes a model based on collaborative filtering (CF) techniques to impute missing values and provides a trust measure that indicates the reliability of the nuclear safeguards evaluation with the imputed values.


2018 ◽  
Vol 2 (2) ◽  
pp. 81-87 ◽  
Author(s):  
Pushpendra Kumar ◽  
Vinod Kumar ◽  
Ramjeevan Singh Thakur

2012 ◽  
Vol 251 ◽  
pp. 185-190
Author(s):  
Dun Hong Yao ◽  
Xiao Ning Peng ◽  
Jia He

In every field that requires data processing, data sparsity is an essential problem that must be resolved, especially for movie and shopping sites. Ratings from users with the same commodity preferences are what make the rating data valuable; without such ratings, the data obtained are sparsely distributed overall. This article introduces a collaborative filtering technique used for processing sparse data, the project-based rating prediction algorithm, and extends it to the area of rough sets, covering the processing of sparse information tables and the sparsity issues in rough set data preprocessing.


2020 ◽  
Vol 10 (1) ◽  
pp. 34-47
Author(s):  
Abba Almu ◽  
Abubakar Roko ◽  
Aminu Mohammed ◽  
Ibrahim Saidu

The existing similarity functions use the user-item rating matrix to identify similar neighbours that can be used to predict ratings for users. However, these functions heavily penalise highly popular items, which leads to predicting items that may not be of interest to active users, owing to the punishment function employed. The functions also reduce the chances of selecting less popular items as similar neighbours, because they rely on items with common ratings. In this article, a popularised similarity function (pop_sim) is proposed to provide effective recommendations to users. The pop_sim function introduces a modified punishment function to minimise the penalty on highly popular items. The function also employs a popularity constraint, which uses a ratings threshold to increase the chances of selecting less popular items as similar neighbours. The experimental studies indicate that the proposed pop_sim is effective in improving the accuracy of rating prediction, lowering both the MAE and the RMSE.
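
For reference, the two error metrics named above, the mean absolute error (MAE) and the root mean square error (RMSE), can be computed as in the following minimal sketch. These are the standard definitions; the function names and the input validation are illustrative choices, not taken from the article.

```python
from math import sqrt


def mae(predicted, actual):
    """Mean absolute error: average magnitude of the prediction errors."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean square error: penalises large errors more heavily than MAE."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))
```

Lower values are better for both metrics. Because RMSE squares each error before averaging, it is never smaller than MAE on the same predictions, and a method that lowers both reduces typical errors as well as large ones.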

