A Secure Clustering Technique for Unstructured and Uncertain Big Data

Author(s):  
Md Tabrez Nafis ◽  
Ranjit Biswas
2019 ◽  
Vol 5 (2) ◽  
pp. 134-147 ◽  
Author(s):  
Mustafa Hajeer ◽  
Dipankar Dasgupta

Computers play a major role in day-to-day life, and as technology advances, the volume of data collected from various fields keeps increasing. These fields produce large amounts of data every second, and that data is not easy to process. Such data is called Big Data. A large number of small files is also considered Big Data, and small files are not easy to store and process in Hadoop. Existing methods use merging and clustering techniques to combine small files into larger files of up to 128 MB before sending them to HDFS. The proposed system uses the CSFC (Clustering Small Files based on Centroid) clustering technique, which does not require the number of clusters to be specified in advance; if the number were fixed beforehand, all files would be forced into that limited number of clusters. Instead, clusters are generated according to the number of related files in the dataset. Relevant files are combined in a cluster up to 128 MB. If a file is not relevant to any existing cluster, or if the cluster has reached 128 MB, a new cluster is generated and the file is stored there. Processing related files together is easier than processing unrelated files. When fetching data from the data node, this method produces more efficient results than other clustering techniques.
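The assignment rule described in this abstract (join the most relevant cluster, or open a new one when no cluster is relevant or the 128 MB limit would be exceeded) can be sketched in Python. This is only an illustrative sketch, not the authors' CSFC implementation: the abstract does not say how relevance is measured, so the cosine similarity over word counts, the `threshold` value, and the names `Cluster` and `cluster_small_files` are all assumptions made here.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt

BLOCK_LIMIT = 128 * 1024 * 1024  # 128 MB, the limit named in the abstract


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (assumed relevance measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class Cluster:
    # Summed term counts stand in for the centroid; cosine similarity
    # ignores scale, so the sum and the mean give the same result.
    centroid: Counter = field(default_factory=Counter)
    size: int = 0                      # bytes merged into this cluster so far
    files: list = field(default_factory=list)

    def add(self, name: str, terms: Counter, size: int) -> None:
        self.centroid.update(terms)
        self.size += size
        self.files.append(name)


def cluster_small_files(files, threshold=0.3, limit=BLOCK_LIMIT):
    """Group (name, text, size_in_bytes) records without fixing the cluster count.

    `threshold` is a hypothetical relevance cut-off, not a value from the paper.
    """
    clusters = []
    for name, text, size in files:
        terms = Counter(text.lower().split())
        best, best_sim = None, threshold
        for c in clusters:
            if c.size + size > limit:      # cluster would exceed the size limit
                continue
            sim = cosine(terms, c.centroid)
            if sim >= best_sim:            # most relevant cluster so far
                best, best_sim = c, sim
        if best is None:                   # nothing relevant, or all clusters full
            best = Cluster()
            clusters.append(best)
        best.add(name, terms, size)
    return clusters


sample = [
    ("a.txt", "hadoop hdfs block", 10),
    ("b.txt", "hadoop hdfs namenode", 10),
    ("c.txt", "football match score", 10),
]
groups = [c.files for c in cluster_small_files(sample)]
```

With the sample records, the two Hadoop-related files share a cluster and the unrelated file opens its own. Lowering `limit` below the combined size of two related files forces them into separate clusters, which mirrors the 128 MB rule in the abstract.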


A rapidly emerging innovation in information technology in recent years is "Big Data". Clustering is one of the significant tasks in a wide range of domains dealing with massive data. This survey reviews research papers that propose different methods for effective big data clustering, such as K-means clustering, variants of K-means clustering, Fuzzy C-means clustering, Possibilistic C-means clustering, collaborative filtering, and optimization-based clustering. In addition, an elaborate analysis is carried out with respect to the implementation tools used, the datasets used, and the frameworks adopted for big data clustering. An effective scheme must therefore be developed to surpass present methods for effective management of big data. Finally, the research issues and gaps of the various big data clustering methods are presented to help researchers work towards better big data clustering.


2016 ◽  
Vol 153 (4) ◽  
pp. 53-56
Author(s):  
Twisha Phirke ◽  
Pushkara Dighe ◽  
Darshana Rathi ◽  
Jayalakshmi Iyer ◽  
Nupur Giri
