Map reduce programming model: Construction of inverted index for automated document clustering

Author(s):  
K. Santhiya ◽  
V. Bhuvaneswari
2013 ◽  
Vol 2013 ◽  
pp. 1-12
Author(s):  
Lei Liu ◽  
Dongqing Liu ◽  
Shuai Lü ◽  
Peng Zhang

Map-Reduce-Merge is an improved parallel programming model based on Map-Reduce for cloud computing environments. Through its new Merge module, Map-Reduce-Merge can process multiple related heterogeneous datasets more efficiently. To demonstrate the validity and effectiveness of this new model, we present a rigorous description of the Map-Reduce-Merge model using Haskell. First, we describe the basic program skeleton of the Map-Reduce-Merge programming model. Second, we present an abstract description of the Merge module by analyzing its structure and function, with Haskell as the description tool. Third, we evaluate the Map-Reduce-Merge model on the basis of our description. Our abstract description captures the functional characteristics of the Map-Reduce-Merge model and can provide a theoretical basis for designing more efficient parallel programming models to process join operations.
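
To illustrate the idea of a Merge step that follows two independent Map-Reduce passes, here is a minimal in-memory sketch. It is written in Python rather than Haskell and is not the authors' formal description; the function names (`map_reduce`, `merge`) and the employee and department records are hypothetical, and the merge shown is a plain equi-join on shared keys.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one in-memory map-reduce pass and return {key: reduced_value}."""
    groups = defaultdict(list)
    for record in records:
        # The mapper emits zero or more (key, value) pairs per record.
        for key, value in mapper(record):
            groups[key].append(value)
    # The reducer collapses all values that share a key.
    return {key: reducer(key, values) for key, values in groups.items()}


def merge(left, right, merger):
    """Merge two reduced outputs on their shared keys (an equi-join)."""
    shared = sorted(left.keys() & right.keys())
    return {key: merger(left[key], right[key]) for key in shared}


# Hypothetical heterogeneous datasets: (name, department, salary) and
# (department, city).
employees = [("alice", "eng", 100), ("bob", "eng", 80), ("carol", "ops", 90)]
departments = [("eng", "Berlin"), ("ops", "Lyon")]

# Pass 1: total salary per department.
payroll = map_reduce(employees, lambda e: [(e[1], e[2])], lambda k, vs: sum(vs))
# Pass 2: city per department.
sites = map_reduce(departments, lambda d: [(d[0], d[1])], lambda k, vs: vs[0])
# Merge: join the two reduced outputs by department.
joined = merge(payroll, sites, lambda total, city: (city, total))
print(joined)
```

Keeping the merge as a separate function over two already reduced outputs mirrors the point made in the abstract: the join across datasets is a distinct step layered on top of ordinary Map-Reduce passes.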


2017 ◽  
Vol 8 (1) ◽  
pp. 45-60 ◽  
Author(s):  
Zakaria Benmounah ◽  
Souham Meshoul ◽  
Mohamed Batouche

One of the remarkable results of the rapid advances in information technology is the production of tremendous amounts of data, in sets so large or complex that available processing methods, cluster analysis among them, are inadequate. Clustering thus becomes more challenging and complex. In this paper, the authors describe a highly scalable Differential Evolution (DE) algorithm based on the map-reduce programming model. The traditional use of DE for clustering large data sets is so time-consuming that it is not feasible. Map-reduce, on the other hand, is a recently emerged programming model that allows the design of parallel and distributed approaches. In this paper, a four-stage map-reduce differential evolution algorithm, termed DE-MRC, is presented; each of the four stages is a map-reduce process dedicated to a particular DE operation. DE-MRC has been tested on a real parallel platform of 128 interconnected computers with more than 30 GB of data. Experimental results show the high scalability and robustness of DE-MRC.


Technological advancement plays a major role in the current digital era of ever-growing data, and this data must be analysed in order to make good decisions. In the domain of data analytics, clustering is one of the significant tasks. The main difficulty in MapReduce is the clustering of massive datasets. Within a computing cluster, MapReduce serves as a main programming model for parallel and distributed methods. In this work, a MapReduce-based Firefly algorithm, known as MR-FF, is proposed for clustering data. It is implemented using the MapReduce model within the Hadoop framework. It improves clustering by minimizing the sum of Euclidean distances between every data instance and the centroid of the cluster it belongs to. The experimental results show that the proposed algorithm performs better when dealing with very large datasets while maintaining clustering quality.
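
The objective mentioned above, the sum of Euclidean distances between each data instance and its cluster centroid, can be sketched as a map step followed by a reduce step. This is only an illustration of the cost function, not the MR-FF algorithm or its Hadoop implementation; the function names and the sample points are hypothetical, and each point is assumed to belong to its nearest centroid.

```python
import math


def nearest_centroid(point, centroids):
    """Map step: return (index of the closest centroid, distance to it)."""
    distances = [math.dist(point, c) for c in centroids]
    best = min(range(len(centroids)), key=distances.__getitem__)
    return best, distances[best]


def clustering_cost(points, centroids):
    """Reduce step: sum the per-point distances emitted by the map step."""
    return sum(nearest_centroid(p, centroids)[1] for p in points)


# Hypothetical two-dimensional data and two candidate centroids.
points = [(0.0, 0.0), (0.0, 3.0), (10.0, 0.0), (10.0, 4.0)]
centroids = [(0.0, 0.0), (10.0, 0.0)]

cost = clustering_cost(points, centroids)
print(cost)
```

A search procedure such as a firefly algorithm would evaluate this cost for many candidate centroid sets and keep those with lower values; because the per-point distances are independent, that part can be distributed across mappers.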


Author(s):  
Zhehuang Huang ◽  
Jianxin Huang

The rapid updating of resources and media in the big data age provides new opportunities for overseas Chinese education. Using big data effectively to boost the development of overseas Chinese education is an urgent task, yet very few studies have been conducted in this area. Map-Reduce is a cloud computing programming model used for the parallel processing of large-scale data sets, and it enables programmers to run their own programs on a distributed system. In this paper we propose a personalized overseas Chinese education model based on the Map-Reduce mechanism, which can analyze the behavioral habits and personal preferences of users from a large pool of Chinese educational resources. In this way, user needs can be accurately grasped and users' favorite resources can be recommended from a huge number of resources. The proposed model has good application prospects for overseas Chinese education.

