Optimal Privacy Preserving Technique Over Big Data Analytics Using Oppositional Fruit Fly Algorithm

2020, Vol 13 (2), pp. 283-295
Author(s): Ajmeera Kiran, Vasumathi Devara

Background: Big data analytics is the process of using collections of data hosted on the internet, which can be stored and retrieved anywhere and at any time. Big data is not simply data; it comprises the data generated by a variety of gadgets, devices, and applications. Objective: When a massive volume of data is stored, the searchable data held on the server is exposed to malicious attacks because existing privacy-preserving approaches are weak. These traditional methods have many drawbacks owing to various attacks on sensitive information. Hence, an efficient method is proposed to enhance privacy preservation for sensitive information stored in the database. Methods: In this manuscript, an optimal privacy-preserving technique over big data using the Hadoop MapReduce framework is proposed. Initially, the input data is grouped by a modified fuzzy c-means clustering algorithm. The clustered data is then fed to the mapper of the MapReduce framework, where the privacy of the input data is preserved by a convolution process. To validate the privacy of the input data, the proposed technique uses an optimal artificial neural network, with the oppositional fruit fly algorithm used to optimize the neural network. Results: The performance of the proposed system is assessed in terms of clustering accuracy, error value, memory, and time. The experiments are performed on the KDD dataset. Conclusion: The results show that the proposed system attains maximum accuracy and an effective convolution process that improves privacy preservation.
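The clustering step can be illustrated with a minimal sketch of standard fuzzy c-means. The abstract does not describe the authors' modification, so this is the textbook algorithm only; the function name `fuzzy_c_means` and its parameters (`m`, `iters`, `tol`, `seed`) are assumptions made for illustration.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, iters=200, tol=1e-6, seed=0):
    """Textbook fuzzy c-means (not the paper's modified variant).

    Returns (centers, u), where u[i][j] is the membership of point i
    in cluster j and each row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, normalised per point.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    centers = []
    for _ in range(iters):
        # Center j is the mean of all points weighted by u[i][j] ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ])
        # Membership is proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            dists = [math.dist(points[i], centers[j]) for j in range(c)]
            if any(d == 0.0 for d in dists):
                # Point coincides with a center: full membership there.
                inv = [1.0 if d == 0.0 else 0.0 for d in dists]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            s = sum(inv)
            new_row = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u
```

A hard label for each record can be read off as the index of the largest membership in its row; in the pipeline described above, the resulting groups would then be handed to the mapper.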

2021, Vol 14 (2), pp. 26
Author(s): Na Li, Lianguan Huang, Yanling Li, Meng Sun

In recent years, with the development of the Internet, the volume of data on the network has grown explosively. Big data mining aims to obtain useful information through data processing such as clustering and classification. Clustering is an important branch of big data mining and is popular because of its simplicity. A new trend for clients who lack storage and computational resources is to outsource the data and the clustering task to public cloud platforms. However, as datasets used for clustering may contain sensitive information (e.g., identity information, health information), simply outsourcing them to the cloud platforms cannot protect privacy, so clients tend to encrypt their databases before uploading them to the cloud for clustering. In this paper, we focus on privacy protection and efficiency with respect to k-means clustering, and we propose a new privacy-preserving multi-user outsourced k-means clustering algorithm based on locality-sensitive hashing (LSH). In this algorithm, we use the Paillier cryptosystem to encrypt the databases and combine it with LSH to prune unnecessary computations during clustering. That is, we do not need to compute the Euclidean distance between every data record and every cluster center. Finally, the theoretical and experimental results show that our algorithm is more efficient than most existing privacy-preserving k-means clustering schemes.
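The pruning idea can be sketched on plaintext data as follows. This is only an illustration of how LSH can cut down distance computations in the k-means assignment step: it omits the Paillier encryption and the multi-user protocol entirely, and the abstract does not say which hash family the authors use. The sketch assumes the common random-projection family for Euclidean distance, h(v) = floor((a·v + b) / w); the names `make_lsh` and `assign_with_lsh` are hypothetical.

```python
import math
import random

def make_lsh(dim, n_proj, width, rng):
    """Build a hash h(v) made of n_proj random projections, each
    floor((a . v + b) / width) with Gaussian a and uniform b."""
    projections = [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
        for _ in range(n_proj)
    ]

    def h(v):
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / width)
            for a, b in projections
        )

    return h

def assign_with_lsh(points, centers, h):
    """Assign each point to the nearest center in its own LSH bucket.

    Falls back to scanning all centers when the bucket holds none.
    Returns (labels, n_dist), where n_dist counts the distance
    computations actually performed.
    """
    buckets = {}
    for j, center in enumerate(centers):
        buckets.setdefault(h(center), []).append(j)
    labels, n_dist = [], 0
    for p in points:
        candidates = buckets.get(h(p)) or range(len(centers))
        n_dist += len(candidates)
        labels.append(min(candidates, key=lambda j: math.dist(p, centers[j])))
    return labels, n_dist
```

The assignment is approximate: a point and its true nearest center can land in different buckets, so the bucket width and the number of projections trade accuracy against the number of distances computed, which never exceeds the exhaustive count of points times centers.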


2020
Author(s): Hidayath Ali Baig, Dr. Yogesh Kumar Sharma, Syed Zakir Ali

Big data is a collection of vast amounts of information that grows at an ever-increasing rate. Big data may be structured, unstructured, semi-structured, or mixed in nature. Researchers are designing, implementing, and analyzing different applications. In the medical industry, vast amounts of data are available, but people are not able to extract the significant information. Healthcare big data analytics (HBDA) becomes "healthier analytics" through a fusion of techniques. In this paper, we discuss and implement clustering algorithms using the RStudio tool. Clustering is defined as the method of partitioning a set of patterns into groups of similar items, called clusters. We can extract information from vast datasets in the form of clustering rules. These clustering techniques are scalable. We also compare the accuracy of two partition-based clustering techniques, k-means and CLARA, on healthcare datasets with the aim of supporting good-quality healthcare services. The experimental results demonstrate that the k-means method gives better accuracy values than the CLARA algorithm.
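For readers unfamiliar with CLARA, the sketch below shows the general idea: run a k-medoids search on a few random samples and keep the medoids that give the lowest cost on the full dataset. It is written in Python rather than R, uses a simple swap-based search in place of a full PAM implementation, and is not the authors' code; the function names and the defaults (5 samples of size 40 + 2k, the conventional choice) are assumptions for illustration.

```python
import math
import random

def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)

def swap_k_medoids(sample, k):
    """Simplified PAM-style search: start from the first k points and
    keep swapping a medoid for a non-medoid while the cost drops."""
    medoids = list(sample[:k])
    best = total_cost(sample, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in sample:
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(sample, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    return medoids

def clara(points, k, n_samples=5, sample_size=None, seed=0):
    """CLARA sketch: cluster several random samples and keep the
    medoids with the lowest cost on the whole dataset."""
    rng = random.Random(seed)
    size = min(len(points), sample_size or 40 + 2 * k)
    best_medoids, best_cost = None, float("inf")
    for _ in range(n_samples):
        sample = rng.sample(points, size)
        medoids = swap_k_medoids(sample, k)
        cost = total_cost(points, medoids)  # evaluate on all points
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    return best_medoids, best_cost
```

Unlike k-means, whose centers are averages, the representatives returned here are actual data records; the sampling step is what lets the method scale to datasets too large for an exhaustive medoid search.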

