Fuzzy Microaggregation for Microdata Protection

Author(s):  
Josep Domingo-Ferrer ◽  
Vicenç Torra

In this work we describe a microdata protection method based on fuzzy clustering and, more specifically, on fuzzy c-means. Microaggregation is a well-known masking method for microdata protection used by National Statistical Offices. Given a set of objects described in terms of a set of variables, the method consists of building a partition of the objects and then replacing the original value of each variable with the aggregate for the corresponding cluster of the partition. That is, the values in a given cluster are aggregated (fused) and used instead of the original ones. As finding the best partition for microdata protection is an NP-hard problem, heuristic methods are considered in the literature. Our approach uses fuzzy c-means to build a fuzzy partition instead of a crisp one.
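The idea of masking records by cluster aggregates can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' method: it runs a textbook fuzzy c-means, then assigns each record to its highest-membership cluster and replaces it with the plain mean of that cluster. The function names (`fuzzy_c_means`, `microaggregate`), the max-membership assignment rule, and the parameter defaults are all hypothetical choices made for illustration. No minimum cluster size is enforced, which a real microaggregation procedure would need.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, iters=200, tol=1e-9, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Assumption: initialise the centers with c randomly chosen records.
    centers = [list(p) for p in rng.sample(points, c)]
    u = []
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        u = []
        for p in points:
            d2 = [_sq_dist(p, v) for v in centers]
            if min(d2) == 0.0:
                # The record coincides with a center: full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                row = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(row)
            u.append([r / total for r in row])
        # Center update: mean of the records weighted by u_ik ** m.
        new_centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        shift = max(_sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def microaggregate(points, c, m=2.0, seed=0):
    """Replace each record with the mean of its highest-membership cluster."""
    _, u = fuzzy_c_means(points, c, m=m, seed=seed)
    labels = [max(range(c), key=lambda k: row[k]) for row in u]
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)
    dim = len(points[0])
    means = {
        lab: [sum(points[i][d] for i in idx) / len(idx) for d in range(dim)]
        for lab, idx in groups.items()
    }
    return [means[lab] for lab in labels], labels


# Toy data: two well-separated groups of three records each.
records = [(1.0, 2.0), (1.5, 1.7), (0.8, 2.6),
           (8.0, 9.0), (8.4, 9.5), (7.9, 8.8)]
masked, labels = microaggregate(records, c=2)
```

After masking, all records in one group share the same values, so an individual record can no longer be told apart from the others in its cluster, while the per-cluster means of the data are preserved.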

2012 ◽  
Vol 548 ◽  
pp. 740-743
Author(s):  
Yi Lan Chen ◽  
Huan Bao Wang

In this paper, we present a novel hybrid classification model based on fuzzy clustering and design a new combinatorial classifier for error data in joining processes with diverse-granular computing. The classifier is an ensemble of a naïve Bayes classifier and fuzzy c-means clustering. We apply it to improve on the classification performance of traditional hard classifiers in more complex real-world situations. Fuzzy c-means clustering is applied to a fuzzy partition based on a given propositional function to augment the combinatorial classifier. This strategy would work better than a conventional hard classifier without fuzzy clustering. Proper scale granularity of objects contributes to higher classification performance of the combinatorial classifier. Our experimental results show that the new combinatorial classifier improves the accuracy and stability of classification.


2020 ◽  
Vol 39 (5) ◽  
pp. 5999-6008
Author(s):  
Vicenç Torra

Microaggregation is an effective data-driven protection method that permits us to achieve a good trade-off between disclosure risk and information loss. In this work we propose a microaggregation method based on fuzzy c-means that is appropriate when there are linear constraints on the variables that describe the data. Our method leads to results that satisfy these constraints even when the data to be masked do not satisfy them.


2011 ◽  
Vol 211-212 ◽  
pp. 793-797
Author(s):  
Chin Chun Chen ◽  
Yuan Horng Lin ◽  
Jeng Ming Yih ◽  
Sue Fen Huang

This study applies interpretive structural modeling to construct the knowledge structure of linear algebra. A new fuzzy clustering algorithm, an improved fuzzy c-means algorithm based on the Mahalanobis distance, performs better than the standard fuzzy c-means algorithm. Each cluster of data can readily describe the features of its knowledge structure individually. The results show that there are six clusters, each with its own cognitive characteristics. The methodology can make knowledge management in the classroom more feasible.


2014 ◽  
Vol 7 (2) ◽  
Author(s):  
Anif Hanifa Setianingrum

Education often faces the problem of failing to achieve the goals set out in an institution's vision and mission. Many factors cause the intended output targets to go unmet. Internal factors such as human resources, teaching methods, and the formulated curriculum sometimes fail to meet the qualification standards of stakeholders. An evaluation and monitoring method maps the problems in the teaching methods used by the institution's staff. The mapping and application of teaching methods is evaluated using Fuzzy C-Means Clustering (FCM), with data collected from lecturers' assessments in student grade lists. The assessment must also take stakeholder assessments into account. The clustering yields five (5) clusters grouping Student Qualifications (SO1, SO2, SO3) and the identification of the SKKNI assessment against JRP: Cluster 1 for K, V, AD, AG; Cluster 2: D, H, O, W, AN; Cluster 3 for students A, M, R, T, AA, AJ; Cluster 4: Y, AC, AI, AK, AO; Cluster 5: E, I, J, N, AL. There are both matches and mismatches in the student names between the internal and external assessment results, meaning that the internal assessment of student graduation qualifications differs from the stakeholders' assessment criteria for SKKNI standardization. Keywords: Fuzzy, Clustering, SKKNI Standardization, FCM


2014 ◽  
Vol 685 ◽  
pp. 638-641
Author(s):  
Zhi Xin Ma ◽  
Bin Bin Wen ◽  
Da Gan Nie

Fuzzy clustering can express the ambiguity of sample category, and better reflect the actual needs of data mining. By introducing wavelet transform and artificial immune algorithm to fuzzy clustering, Wavelet-based Immune Fuzzy C-means Algorithm (WIFCM) is proposed for overcoming the imperfections of fuzzy clustering, such as falling easily into local optimal solution, slower convergence speed and initialization-dependence of clustering centers. Innovations of WIFCM are the elite extraction operator and the descent reproductive mode. Using the locality and multi-resolution of wavelet transform, the elite extraction operator explores the distribution and density information of spatial data objects in multi-dimensional space to guide the search of cluster centers. Taking advantage of the relationship between the relative positions of elite centers and inferior centers, the descent reproductive mode obtains the approximate fastest descent direction of objective function values, and assures fast convergence of algorithm. Compared to the classic fuzzy C-means algorithm, experiments on 3 UCI data sets show that WIFCM has obvious advantages in average number of iterations and accuracy.


Author(s):  
Mashhour H. Baeshen ◽  
Malcolm J. Beynon ◽  
Kate L. Daunt

This chapter presents a study of the development of clustering methodology for data analysis, with particular attention to the move from a crisp environment to a fuzzy environment. An applied problem concerning the service quality (using SERVQUAL) experienced by mobile phone users, and their subsequent loyalty and satisfaction, forms the data set used to demonstrate the clustering issue. Following details on both the crisp k-means and fuzzy c-means clustering techniques, comparable results from their analysis are shown, on a subset of the data, to enable both graphical and statistical elucidation. Fuzzy c-means is then employed on the full SERVQUAL dimensions, and the established results are interpreted before being tested against external variables, namely the level of loyalty and satisfaction across the different clusters established.
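The contrast between the crisp and fuzzy settings can be illustrated with a short Python sketch. It assumes fixed cluster centers and Euclidean distance, and the function names are hypothetical: a k-means style assignment gives each object to exactly one cluster, whereas the standard fuzzy c-means rule gives graded memberships that sum to one.

```python
def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def crisp_memberships(point, centers):
    """k-means style: membership 1 for the nearest center, 0 elsewhere."""
    d2 = [_sq_dist(point, v) for v in centers]
    nearest = d2.index(min(d2))  # ties go to the first nearest center
    return [1.0 if k == nearest else 0.0 for k in range(len(centers))]


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means style: graded memberships with fuzzifier m > 1."""
    d2 = [_sq_dist(point, v) for v in centers]
    if min(d2) == 0.0:
        # The point coincides with one or more centers: split membership there.
        hits = sum(1 for d in d2 if d == 0.0)
        return [1.0 / hits if d == 0.0 else 0.0 for d in d2]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


centers = [(0.0, 0.0), (4.0, 0.0)]
near_first = (1.0, 0.0)   # clearly closer to the first center
in_between = (2.0, 0.0)   # equidistant from both centers

crisp_near = crisp_memberships(near_first, centers)
fuzzy_near = fuzzy_memberships(near_first, centers)
crisp_mid = crisp_memberships(in_between, centers)
fuzzy_mid = fuzzy_memberships(in_between, centers)
```

For the equidistant point the crisp rule still has to pick one cluster, while the fuzzy rule reports equal membership in both, which is the kind of ambiguity a fuzzy partition is meant to expose.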


Author(s):  
Frank Rehm ◽  
Roland Winkler ◽  
Rudolf Kruse

A well-known issue with prototype-based clustering is the user's obligation to know the right number of clusters in a dataset in advance or to determine it as part of the data analysis process. There are different approaches to cope with this non-trivial problem. This chapter addresses the problem as an integrated part of the clustering process. An extension to repulsive fuzzy c-means clustering is proposed, equipping non-Euclidean prototypes with repulsive properties. Experimental results are presented that demonstrate the feasibility of the authors' technique.


Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 158
Author(s):  
Tran Dinh Khang ◽  
Nguyen Duc Vuong ◽  
Manh-Kien Tran ◽  
Michael Fowler

Clustering is an unsupervised machine learning technique with many practical applications that has gathered extensive research interest. Aside from deterministic or probabilistic techniques, fuzzy C-means clustering (FCM) is also a common clustering technique. Since the advent of the FCM method, many improvements have been made to increase clustering efficiency. These improvements focus on adjusting the membership representation of elements in the clusters, or on fuzzifying and defuzzifying techniques, as well as the distance function between elements. This study proposes a novel fuzzy clustering algorithm using multiple different fuzzification coefficients depending on the characteristics of each data sample. The proposed fuzzy clustering method has similar calculation steps to FCM with some modifications. The formulas are derived to ensure convergence. The main contribution of this approach is the utilization of multiple fuzzification coefficients as opposed to only one coefficient in the original FCM algorithm. The new algorithm is then evaluated with experiments on several common datasets and the results show that the proposed algorithm is more efficient compared to the original FCM as well as other clustering methods.
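For orientation, the original FCM algorithm with its single fuzzification coefficient can be written as follows. The notation is an assumption introduced here, not taken from the abstract: $x_i$ are the $n$ data samples, $v_k$ the $c$ cluster centers, $u_{ik}$ the membership of sample $i$ in cluster $k$, and $m > 1$ the fuzzification coefficient.

```latex
% Standard FCM objective, minimised subject to the memberships of each sample summing to one
J_m(U, V) = \sum_{i=1}^{n} \sum_{k=1}^{c} u_{ik}^{m} \, \lVert x_i - v_k \rVert^{2},
\qquad \sum_{k=1}^{c} u_{ik} = 1 \quad \text{for all } i .

% Alternating update rules
v_k = \frac{\sum_{i=1}^{n} u_{ik}^{m} \, x_i}{\sum_{i=1}^{n} u_{ik}^{m}},
\qquad
u_{ik} = \left[ \sum_{j=1}^{c}
  \left( \frac{\lVert x_i - v_k \rVert}{\lVert x_i - v_j \rVert} \right)^{\frac{2}{m-1}}
\right]^{-1} .
```

The proposal summarised above replaces the single coefficient $m$ with multiple coefficients that depend on the characteristics of each data sample. The modified update formulas are not given in the abstract, so they are not reproduced here.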

