Analysis of large volume data processing using clustering algorithms
The study of large dataset with velocity, variety and volume which is also known as Big data. When the dataset has limited number of clusters, low dimensions and small number of data points the existing traditional clustering algorithms can be used.. As we know this is the internet age, the data is growing very fast and existing clustering algorithms are not giving the acceptable results in terms of time complexity and spatial complexity. So there is a need to develop a new approach of applying clustering of large volume of data processing with low time and spatial complexity through MapReduce and Hadoop frame work applying to different clustering algorithms, k-means, Canopy clustering and proposed algorithm .The analysis shows that the large volume of data processing will take low time and spatial complexity when compared to small volume of data.