Efficient and Privacy-Preserving Multi-User Outsourced K-Means Clustering

Na Li; Lianguan Huang; Yanling Li; Meng Sun

doi:10.5539/cis.v14n2p26

Computer and Information Science ◽

10.5539/cis.v14n2p26 ◽

2021 ◽

Vol 14 (2) ◽

pp. 26

Author(s):

Na Li ◽

Lianguan Huang ◽

Yanling Li ◽

Meng Sun

Keyword(s):

Data Mining ◽

Big Data ◽

Clustering Algorithm ◽

Privacy Preserving ◽

Locality Sensitive Hashing ◽

Sensitive Information ◽

The Public ◽

Big Data Mining ◽

Euclidean Distances ◽

Computational Resources

In recent years, with the development of the Internet, the volume of data on the network has grown explosively. Big data mining aims to obtain useful information through data processing such as clustering and classification. Clustering is an important branch of big data mining, and it is popular because of its simplicity. A new trend for clients who lack storage and computational resources is to outsource their data and clustering tasks to public cloud platforms. However, because the datasets used for clustering may contain sensitive information (e.g., identity information, health information), simply outsourcing them to a cloud platform cannot protect privacy. Clients therefore tend to encrypt their databases before uploading them to the cloud for clustering. In this paper, we focus on privacy protection and efficiency improvement for k-means clustering, and we propose a new privacy-preserving multi-user outsourced k-means clustering algorithm based on locality sensitive hashing (LSH). In this algorithm, we use the Paillier cryptosystem to encrypt the databases and combine it with LSH to prune unnecessary computations during clustering. That is, we do not need to compute the Euclidean distance between every data record and every cluster center. Finally, theoretical and experimental results show that our algorithm is more efficient than most existing privacy-preserving k-means clustering algorithms.
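The pruning idea in this abstract can be illustrated with a minimal plaintext sketch. It is not the authors' protocol: the abstract does not specify the hash family or how hashing interacts with Paillier encryption, so the sketch assumes a random-projection (p-stable) LSH and omits encryption entirely. The names `make_hash`, `sq_dist`, and `assign_with_lsh` are hypothetical. Each point is compared only with the centers that share at least one hash bucket with it, so the assignment is approximate and may occasionally miss the true nearest center.

```python
import math
import random


def make_hash(dim, width, rng):
    """Build one p-stable LSH function h(x) = floor((a . x + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0.0, width)                    # random offset in [0, width]

    def h(x):
        return math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / width)

    return h


def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def assign_with_lsh(points, centers, hashes):
    """Assign each point to the nearest center among its LSH candidates.

    Returns (labels, number of distance computations actually performed).
    """
    center_keys = [[h(c) for h in hashes] for c in centers]
    labels, n_dist = [], 0
    for p in points:
        p_keys = [h(p) for h in hashes]
        # Candidate centers collide with the point under at least one hash.
        candidates = [
            j for j, keys in enumerate(center_keys)
            if any(pk == ck for pk, ck in zip(p_keys, keys))
        ]
        if not candidates:
            # Assumed fallback: no collision, so compare against every center.
            candidates = list(range(len(centers)))
        n_dist += len(candidates)
        labels.append(min(candidates, key=lambda j: sq_dist(p, centers[j])))
    return labels, n_dist


if __name__ == "__main__":
    rng = random.Random(0)
    hashes = [make_hash(2, 4.0, rng) for _ in range(4)]
    centers = [(0.0, 0.0), (10.0, 10.0), (-10.0, 10.0)]
    points = [(0.5, -0.3), (9.0, 11.0), (-9.5, 9.0)]
    labels, n_dist = assign_with_lsh(points, centers, hashes)
    print(labels, n_dist, "distance computations; brute force needs",
          len(points) * len(centers))
```

The bucket width and the number of hash functions trade accuracy against savings: wider buckets or more hash functions produce more candidates per point and therefore fewer missed neighbors but less pruning.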

Improved K-Means Clustering Algorithm for Big Data Mining under Hadoop Parallel Framework

Journal of Grid Computing ◽

10.1007/s10723-019-09503-0 ◽

2019 ◽

Vol 18 (2) ◽

pp. 239-250 ◽

Cited By ~ 3

Author(s):

Weijia Lu

Keyword(s):

Data Mining ◽

Big Data ◽

Clustering Algorithm ◽

Big Data Mining

Research on Parallel Adaptive Canopy-K-Means Clustering Algorithm for Big Data Mining Based on Cloud Platform

Journal of Grid Computing ◽

10.1007/s10723-019-09504-z ◽

2020 ◽

Vol 18 (2) ◽

pp. 263-273 ◽

Cited By ~ 1

Author(s):

Dongliang Xia ◽

Feifei Ning ◽

Weina He

Keyword(s):

Data Mining ◽

Big Data ◽

Clustering Algorithm ◽

Cloud Platform ◽

Big Data Mining

Optimal Privacy Preserving Technique Over Big Data Analytics Using Oppositional Fruit Fly Algorithm

Recent Advances in Computer Science and Communications ◽

10.2174/2213275911666181119113913 ◽

2020 ◽

Vol 13 (2) ◽

pp. 283-295

Author(s):

Ajmeera Kiran ◽

Vasumathi Devara

Keyword(s):

Big Data ◽

Data Analytics ◽

Input Data ◽

Clustering Algorithm ◽

Big Data Analytics ◽

Fruit Fly ◽

Privacy Preserving ◽

Sensitive Information ◽

Convolution Process ◽

Fuzzy C Means Clustering

Background: Big data analytics is the process of using collections of data gathered over the internet that can be stored and retrieved anywhere and at any time. Big data is not simply data; it comprises the data generated by a variety of gadgets, devices, and applications. Objective: When a massive volume of data is stored, the data held on the server is open to malicious attacks because existing privacy-preserving approaches are inadequate. These traditional methods have many drawbacks when sensitive information is attacked. Hence, the proposed method aims to enhance privacy preservation for sensitive information stored in the database. Methods: In this manuscript, an optimal privacy-preserving technique for big data using Hadoop and the MapReduce framework is proposed. Initially, the input data is grouped by a modified fuzzy c-means clustering algorithm. The clustered data is then fed to the mapper of the MapReduce framework, where the privacy of the input data is preserved by a convolution process. To validate the privacy of the input data, the proposed technique uses an optimal artificial neural network. Here, the oppositional fruit fly algorithm is used to enhance the neural network. Results: The performance of the proposed system is assessed in terms of clustering accuracy, error value, memory, and time. The experiments are performed on the KDD dataset. Conclusion: The results show that the proposed system achieves maximum accuracy and an effective convolution process that improves privacy preservation.

Privacy Preserving Big Data mining using Pseudonymization and Homomorphic Encryption

10.1109/gcat52182.2021.9587765 ◽

2021 ◽

Author(s):

Ila Chandrakar ◽

Vishwanath R Hulipalled

Keyword(s):

Data Mining ◽

Big Data ◽

Homomorphic Encryption ◽

Privacy Preserving ◽

Big Data Mining

Efficient and Privacy-Preserving k-Means Clustering for Big Data Mining

2016 IEEE Trustcom/BigDataSE/ISPA ◽

10.1109/trustcom.2016.0140 ◽

2016 ◽

Cited By ~ 8

Author(s):

Zakaria Gheid ◽

Yacine Challal

Keyword(s):

Data Mining ◽

Big Data ◽

Privacy Preserving ◽

Big Data Mining

A Comprehensive Survey on Privacy Preserving Big Data Mining

International Journal of Computer Applications Technology and Research ◽

10.7753/ijcatr0602.1002 ◽

2017 ◽

Vol 6 (2) ◽

pp. 79-86 ◽

Cited By ~ 1

Author(s):

S. Srijayanthi ◽

R. Sethukkarasi

Keyword(s):

Data Mining ◽

Big Data ◽

Privacy Preserving ◽

Big Data Mining ◽

Comprehensive Survey

An Empirical Perusal of Distance Measures for Clustering with Big Data Mining

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8078.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 606-616 ◽

Cited By ~ 1

Keyword(s):

Data Mining ◽

Big Data ◽

Clustering Algorithm ◽

Distance Measure ◽

Confusion Matrix ◽

Heterogeneous Data ◽

Distance Measures ◽

Research Perspective ◽

Big Data Mining ◽

Data Criterion

The distance measure is the core idea of data mining techniques such as classification, clustering, and statistical analysis. All clustering taxonomies, such as partition, hierarchical, density, grid, model, fuzzy, and graph, use distance measures to categorize data points into different clusters and for cluster construction and validation. Big data mining is the extension of data mining to the dimensions of big data. When a traditional clustering algorithm is used for big data mining, the distance measure needs to be scalable and to support huge datasets, heterogeneous data and sources, and the velocity characteristics of big data. From theoretical, practical, and existing research perspectives, the paper focuses on the volume, variety, and velocity criteria of big data to identify a distance measure for big data mining and to understand how distance measures work under each clustering taxonomy. The study also analyzes the accuracy of all distance measures with the help of a confusion matrix obtained through clustering.
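As a rough illustration of what such a comparison involves, the sketch below defines three common distance measures and a confusion-matrix tally of known labels against cluster assignments. The abstract does not say which measures the study evaluates, so Euclidean, Manhattan, and cosine distance are assumed here purely as examples, and all function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """City-block (L1) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_distance(x, y):
    """One minus cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def confusion_matrix(true_labels, cluster_labels):
    """Count how often each (true label, cluster id) pair occurs."""
    return Counter(zip(true_labels, cluster_labels))


if __name__ == "__main__":
    p, q = (0.0, 0.0), (3.0, 4.0)
    print(euclidean(p, q), manhattan(p, q))
    print(confusion_matrix(["a", "a", "b"], [0, 0, 1]))
```

Swapping the distance function inside a clustering routine and comparing the resulting confusion matrices is one simple way to see how the choice of measure changes cluster membership.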

Privacy preserving big data mining: association rule hiding using fuzzy logic approach

IET Information Security ◽

10.1049/iet-ifs.2015.0545 ◽

2018 ◽

Vol 12 (1) ◽

pp. 15-24 ◽

Cited By ~ 12

Author(s):

Golnar Assadat Afzali ◽

Shahriar Mohammadi

Keyword(s):

Data Mining ◽

Fuzzy Logic ◽

Big Data ◽

Association Rule ◽

Privacy Preserving ◽

Big Data Mining ◽

Fuzzy Logic Approach ◽

Logic Approach ◽

Mining Association Rule

Legal Models In Privacy-Preserving Big Data Mining

International Journal of Engineering Trends and Technology ◽

10.14445/22315381/ijett-v68i7p213s ◽

2020 ◽

Vol 68 (7) ◽

pp. 83-92

Author(s):

Preeti Gulia ◽

Hem lata

Keyword(s):

Data Mining ◽

Big Data ◽

Privacy Preserving ◽

Big Data Mining

PROPOSE AND ENLARGEMENT OF PRIVACY PRESERVING TECHNIQUES IN BIG DATA MINING

International Journal of Advanced Research in Computer Science ◽

10.26483/ijarcs.v9i2.5855 ◽

2018 ◽

Vol 9 (2) ◽

pp. 470-471

Author(s):

P. Latha ◽

Keyword(s):

Data Mining ◽

Big Data ◽

Privacy Preserving ◽

Big Data Mining
