Effective Collaborative Filtering Approaches Based on Missing Data Imputation

Author(s):  
Weiwei Xia ◽  
Liang He ◽  
Junzhong Gu ◽  
Keqin He
2021 ◽  
Vol 11 (8) ◽  
pp. 3719
Author(s):  
Sun-Young Ihm ◽  
Shin-Eun Lee ◽  
Young-Ho Park ◽  
Aziz Nasridinov ◽  
Miyeon Kim ◽  
...  

Collaborative filtering (CF) is a recommendation technique that analyzes the behavior of various users and recommends the items preferred by users with similar preferences. However, CF methods suffer from poor recommendation accuracy when the user preference data used in the recommendation process is sparse. Data imputation can alleviate the data sparsity problem by substituting a virtual part of the missing user preferences. In this paper, we propose a k-recursive reliability-based imputation (k-RRI) that first selects data with high reliability and then recursively imputes data with additional selection while gradually lowering the reliability criterion. We also propose a new similarity measure that weights common interests and indifferences between users and items. The proposed method can overcome disregarding the importance of missing data and resolve the problem of poor data imputation of existing methods. The experimental results demonstrate that the proposed approach significantly improves recommendation accuracy compared to those resulting from the state-of-the-art methods while demanding less computational complexity.


2021 ◽  
pp. 147592172110219
Author(s):  
Huachen Jiang ◽  
Chunfeng Wan ◽  
Kang Yang ◽  
Youliang Ding ◽  
Songtao Xue

Wireless sensors are the key components of structural health monitoring systems. During the signal transmission, sensor failure is inevitable, among which, data loss is the most common type. Missing data problem poses a huge challenge to the consequent damage detection and condition assessment, and therefore, great importance should be attached. Conventional missing data imputation basically adopts the correlation-based method, especially for strain monitoring data. However, such methods often require delicate model selection, and the correlations for vehicle-induced strains are much harder to be captured compared with temperature-induced strains. In this article, a novel data-driven generative adversarial network (GAN) for imputing missing strain response is proposed. As opposed to traditional ways where correlations for inter-strains are explicitly modeled, the proposed method directly imputes the missing data considering the spatial–temporal relationships with other strain sensors based on the remaining observed data. Furthermore, the intact and complete dataset is not even necessary during the training process, which shows another great superiority over the model-based imputation method. The proposed method is implemented and verified on a real concrete bridge. In order to demonstrate the applicability and robustness of the GAN, imputation for single and multiple sensors is studied. Results show the proposed method provides an excellent performance of imputation accuracy and efficiency.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nishith Kumar ◽  
Md. Aminul Hoque ◽  
Masahiro Sugimoto

AbstractMass spectrometry is a modern and sophisticated high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional large-scale matrix (samples × metabolites) of quantified data that often contain missing cells in the data matrix as well as outliers that originate for several reasons, including technical and biological sources. Although several missing data imputation techniques are described in the literature, all conventional existing techniques only solve the missing value problems. They do not relieve the problems of outliers. Therefore, outliers in the dataset decrease the accuracy of the imputation. We developed a new kernel weight function-based proposed missing data imputation technique that resolves the problems of missing values and outliers. We evaluated the performance of the proposed method and other conventional and recently developed missing imputation techniques using both artificially generated data and experimentally measured data analysis in both the absence and presence of different rates of outliers. Performances based on both artificial data and real metabolomics data indicate the superiority of our proposed kernel weight-based missing data imputation technique to the existing alternatives. For user convenience, an R package of the proposed kernel weight-based missing value imputation technique was developed, which is available at https://github.com/NishithPaul/tWLSA.


Sign in / Sign up

Export Citation Format

Share Document