collaborative clustering
Recently Published Documents


TOTAL DOCUMENTS

72
(FIVE YEARS 22)

H-INDEX

11
(FIVE YEARS 2)

Author(s):  
Daoming Wan ◽  
Roozbeh Razavi-Far ◽  
Mehrdad Saif ◽  
Niloofar Mozafari

PLoS ONE ◽  
2021 ◽  
Vol 16 (1) ◽  
pp. e0244691
Author(s):  
WAQAR ISHAQ ◽  
ELIYA BUYUKKAYA ◽  
MUSHTAQ ALI ◽  
ZAKIR KHAN

Vertical collaborative clustering aims to uncover the hidden structure of data (similarity) across different sites, helping data owners make informed decisions without sharing the actual data. For example, hospitals in different regions may want to investigate the structure of a common disease across different populations to identify latent causes without sharing actual data with other hospitals. Similarly, a chain of regional educational institutions may want to evaluate the performance of students from different regions based on common latent constructs. Existing methods for finding hidden structure are complicated and biased when collaborating to measure similarity among multiple sites. This study proposes vertical collaborative clustering using a bit plane slicing approach (VCC-BPS), a simple method that manages collaboration among data sites with improved accuracy. VCC-BPS transforms data from input space to code space, capturing maximum similarity locally and collaboratively at a particular bit plane. The findings highlight the significance of the particular bits that fit the model in correctly classifying class labels locally and collaboratively. The data owner then appraises the local and collaborative results to reach a better decision. VCC-BPS is validated on the Geyser, Skin and Iris datasets, and its results are compared with those obtained on the composite dataset. VCC-BPS is found to outperform existing solutions in terms of purity and the Davies-Bouldin index when managing collaboration among different data sites. It also performs data compression by representing a large number of observations with a small number of data symbols.
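The idea of reading cluster structure off a single bit plane, and scoring it by purity, can be illustrated with a small sketch. This is not the authors' VCC-BPS algorithm: the 8-bit linear quantization, the helper names (`quantize`, `bit_plane`, `purity`) and the toy data are all assumptions made for illustration only.

```python
from collections import Counter

def quantize(values, bits=8):
    """Map real values linearly onto integer codes in [0, 2**bits - 1].

    Assumption: a plain min-max quantizer stands in for the paper's
    input-space to code-space transform, which the abstract does not specify.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    scale = (2 ** bits - 1) / (hi - lo)
    return [int(round((v - lo) * scale)) for v in values]

def bit_plane(codes, plane):
    """Return bit number `plane` (0 = least significant) of each integer code."""
    return [(c >> plane) & 1 for c in codes]

def purity(cluster_ids, labels):
    """Fraction of points that belong to the majority class of their cluster."""
    by_cluster = {}
    for cid, lab in zip(cluster_ids, labels):
        by_cluster.setdefault(cid, Counter())[lab] += 1
    majority = sum(max(c.values()) for c in by_cluster.values())
    return majority / len(labels)

# Hypothetical one-dimensional data with two well-separated groups.
values = [1.0, 1.2, 1.1, 4.0, 4.2, 3.9]
labels = ["a", "a", "a", "b", "b", "b"]

codes = quantize(values)
msb = bit_plane(codes, 7)  # most significant plane of the 8-bit codes
print(msb, purity(msb, labels))
```

In this toy case the most significant bit alone separates the two groups, while lower planes carry progressively finer detail. How VCC-BPS selects the plane and exchanges information between sites is not described in the abstract and is not modeled here.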


Author(s):  
Xu Yang ◽  
Cheng Deng ◽  
Zhiyuan Dang ◽  
Dacheng Tao

2020 ◽  
Vol 30 (1) ◽  
pp. 327-345
Author(s):  
Sarah Zouinina ◽  
Younès Bennani ◽  
Nicoleta Rogovschi ◽  
Abdelouahid Lyhyaoui

Interest in data anonymization is growing exponentially, motivated by governments' desire to open their data. The main challenge of data anonymization is to balance data utility against disclosure risk. One of the best-known frameworks for data anonymization is k-anonymity, which assumes that a dataset is anonymous if and only if, for each element of the dataset, there exist at least k − 1 elements identical to it. In this paper, we propose two techniques to achieve k-anonymity through microaggregation: k-CMVM and Constrained-CMVM. Both use topological collaborative clustering to obtain k-anonymous data. The first determines the k levels automatically, and the second defines them by exploration. We also improve the results of these two approaches by using pLVQ2 as a weighted vector quantization method. The four proposed methods are shown to be efficient using two data utility measures, the separability utility and the structural utility. The experimental results show very promising performance.
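The k-anonymity condition and the general idea of microaggregation can be sketched in a few lines. The code below is a hedged illustration only: it uses a simple univariate, fixed-size grouping as a stand-in, not the k-CMVM or Constrained-CMVM methods, and the function names and sample ages are hypothetical.

```python
from collections import Counter

def is_k_anonymous(records, k):
    """True if every record has at least k - 1 identical companions."""
    counts = Counter(tuple(r) for r in records)
    return all(n >= k for n in counts.values())

def microaggregate(values, k):
    """Univariate fixed-size microaggregation (illustrative stand-in).

    Sort the values, cut them into consecutive groups of k, and replace each
    value by its group mean. The last group absorbs any remainder so that no
    group has fewer than k members.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [None] * n
    start = 0
    while start < n:
        end = start + k
        if n - end < k:  # too few left for another full group
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            out[i] = mean
        start = end
    return out

# Hypothetical quasi-identifier column (ages).
ages = [21, 23, 22, 40, 41, 45, 44]
masked = microaggregate(ages, k=3)
print(masked)
print(is_k_anonymous([(a,) for a in ages], 3))    # raw data
print(is_k_anonymous([(a,) for a in masked], 3))  # after microaggregation
```

Replacing values by group means is what trades utility for privacy: larger k lowers disclosure risk but moves each released value further from the original.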


2020 ◽  
Vol 67 (10) ◽  
pp. 2735-2744
Author(s):  
Hangfan Liu ◽  
Hongming Li ◽  
Mohamad Habes ◽  
Yuemeng Li ◽  
Pamela Boimel ◽  
...  

2020 ◽  
Vol 2 (1) ◽  
pp. 13-18
Author(s):  
Hartatik Hartatik ◽  
Rosyid Rosyid

Sparsity is a common problem in collaborative clustering techniques: users have very little information (in this study, ratings), which often makes the system inaccurate when giving recommendations. Many methods can be used to address data sparsity, one of which is KNN. However, KNN has a weakness, namely scalability. The scalability problem arises as the amount of data whose similarity must be computed grows. One possible solution is to find user profiles and group them into clusters. The experiment in this study addresses the sparsity and scalability problems by combining the silhouette, k-means, and K-Nearest Neighbour algorithms. The dataset consists of 700 ratings crawled from the Traveloka website. The user-item rating data are stored in a database and then converted into a user-item array. Testing with 5 test data yielded an average RMSE of 1.33%, giving an average accuracy of 100% - 1.33% = 98.67%.
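The conversion of stored ratings into a user-item array and the RMSE-based accuracy figure can be sketched as follows. This is a minimal illustration under stated assumptions: the rating triples, user and item names, and helper functions are hypothetical, and the k-means, silhouette and KNN stages of the study are not reproduced.

```python
import math

def build_user_item(ratings):
    """Turn (user, item, rating) triples into a dense user-item array.

    Missing ratings are stored as None, which is where sparsity shows up.
    """
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    row = {u: r for r, u in enumerate(users)}
    col = {i: c for c, i in enumerate(items)}
    matrix = [[None] * len(items) for _ in users]
    for u, i, value in ratings:
        matrix[row[u]][col[i]] = value
    return users, items, matrix

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

# Hypothetical ratings; the study's 700 crawled ratings are not available here.
ratings = [
    ("u1", "hotel_a", 5),
    ("u1", "hotel_b", 3),
    ("u2", "hotel_a", 4),
    ("u3", "hotel_c", 2),
]
users, items, matrix = build_user_item(ratings)
empty = sum(v is None for r in matrix for v in r)
sparsity = empty / (len(users) * len(items))
print(matrix, sparsity)

# The abstract's accuracy figure is simply the complement of the RMSE percentage.
print(round(100 - 1.33, 2))
```

Even in this four-rating example more than half of the array is empty, which is the sparsity the abstract refers to.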

