Scalable Differential Evolutionary Clustering Algorithm for Big Data Using Map-Reduce Paradigm
One remarkable result of the rapid advances in information technology is the production of data sets so large or complex that available processing methods, cluster analysis among them, are inadequate. Clustering thus becomes more challenging and complex. In this paper, the authors describe a highly scalable Differential Evolution (DE) algorithm based on the map-reduce programming model. Applying traditional DE to cluster large data sets is so time-consuming that it is not feasible. Map-reduce, on the other hand, is a recently emerged programming model that supports the design of parallel and distributed approaches. A four-stage map-reduce differential evolution algorithm, termed DE-MRC, is presented; each of the four stages is a map-reduce process dedicated to a particular DE operation. DE-MRC has been tested on a real parallel platform of 128 interconnected computers with more than 30 GB of data. Experimental results show the high scalability and robustness of DE-MRC.
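To make the idea concrete, the following is a minimal single-machine sketch of how one DE operation, fitness evaluation, could be phrased as a map step over data chunks followed by a reduce step. It is not the DE-MRC algorithm itself: the abstract does not spell out the four stages, so the DE/rand/1/bin variant, the sum-of-squared-errors fitness, the parameter values, and every function name here (`map_partial_sse`, `fitness`, `de_trial`, `de_generation`) are assumptions made for illustration. Map-reduce is simulated in-process with Python's built-in `map` and `functools.reduce`.

```python
import random
from functools import reduce


def map_partial_sse(chunk, centroids):
    """Map step (assumed): squared distance of each point in one data
    chunk to its nearest centroid, summed over the chunk."""
    total = 0.0
    for point in chunk:
        total += min(
            sum((p - c) ** 2 for p, c in zip(point, centroid))
            for centroid in centroids
        )
    return total


def fitness(chunks, centroids):
    """Reduce step (assumed): add the per-chunk partial sums."""
    partials = map(lambda chunk: map_partial_sse(chunk, centroids), chunks)
    return reduce(lambda a, b: a + b, partials, 0.0)


def to_centroids(vector, dim):
    """Split a flat individual into a list of dim-dimensional centroids."""
    return [vector[k:k + dim] for k in range(0, len(vector), dim)]


def de_trial(population, i, f=0.5, cr=0.9, rng=random):
    """DE/rand/1/bin trial vector for individual i (needs >= 4 individuals)."""
    others = [j for j in range(len(population)) if j != i]
    a, b, c = (population[j] for j in rng.sample(others, 3))
    target = population[i]
    forced = rng.randrange(len(target))  # at least one gene from the mutant
    return [
        a[k] + f * (b[k] - c[k]) if (k == forced or rng.random() < cr)
        else target[k]
        for k in range(len(target))
    ]


def de_generation(population, chunks, dim, rng=random):
    """One generation: mutation and crossover, evaluation, greedy selection."""
    next_population = []
    for i, target in enumerate(population):
        trial = de_trial(population, i, rng=rng)
        trial_fit = fitness(chunks, to_centroids(trial, dim))
        target_fit = fitness(chunks, to_centroids(target, dim))
        next_population.append(trial if trial_fit <= target_fit else target)
    return next_population


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical data chunks, as if stored on two worker nodes.
    chunks = [[(0.0, 0.0), (1.0, 0.0)], [(10.0, 10.0), (11.0, 10.0)]]
    # Six individuals, each encoding two 2-D centroids as a flat list.
    population = [[rng.uniform(0, 11) for _ in range(4)] for _ in range(6)]
    for _ in range(20):
        population = de_generation(population, chunks, dim=2, rng=rng)
    best = min(population, key=lambda v: fitness(chunks, to_centroids(v, 2)))
    print(to_centroids(best, 2))
```

In a real deployment the chunks would live on separate nodes and only the candidate centroids and the partial sums would cross the network, which is what makes this decomposition attractive for large data sets.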