A Comparative Survey of Big Data Computing and HPC: From a Parallel Programming Model to a Cluster Architecture

Author(s):  
Fei Yin ◽  
Feng Shi
2014 ◽  
Vol 08 (03) ◽  
pp. 279-299
Author(s):  
Guigang Zhang ◽  
Chao Li ◽  
Yong Zhang ◽  
Chunxiao Xing

Big data is playing a more and more important role in every area such as medical health, internet finance, culture and education etc. How to process these big data efficiently is a huge challenge. MapReduce is a good parallel programming language to process big data. However, it has lots of shortcomings. For example, it cannot process complex computing. It cannot suit real-time computing. In order to overcome these shortcomings of MapReduce and its variants, in this paper, we propose a Semantic++ MapReduce parallel programming model. This study includes the following parts. (1) Semantic++ MapReduce parallel programming model. It includes physical framework of semantic++ MapReduce parallel programming model and logic framework of semantic++ MapReduce parallel programming model; (2) Semantic++ extraction and management method for big data; (3) Semantic++ MapReduce parallel programming computing framework. It includes semantic++ map, semantic++ reduce and semantic++ shuffle; (4) Semantic++ MapReduce for multi-data centers. It includes basic framework of semantic++ MapReduce for multi-data centers and semantic++ MapReduce application framework for multi-data centers; (5) A Case Study of semantic++ MapReduce across multi-data centers.


2015 ◽  
Vol 44 (4) ◽  
pp. 832-866 ◽  
Author(s):  
Ren Li ◽  
Haibo Hu ◽  
Heng Li ◽  
Yunsong Wu ◽  
Jianxi Yang

2016 ◽  
Vol 43 ◽  
pp. 95-103 ◽  
Author(s):  
James A. Ross ◽  
David A. Richie ◽  
Song J. Park ◽  
Dale R. Shires

2021 ◽  
Vol 24 (1) ◽  
pp. 157-183
Author(s):  
Никита Андреевич Катаев

Automation of parallel programming is important at any stage of parallel program development. These stages include profiling of the original program, program transformation, which allows us to achieve higher performance after program parallelization, and, finally, construction and optimization of the parallel program. It is also important to choose a suitable parallel programming model to express parallelism available in a program. On the one hand, the parallel programming model should be capable to map the parallel program to a variety of existing hardware resources. On the other hand, it should simplify the development of the assistant tools and it should allow the user to explore the parallel program the assistant tools generate in a semi-automatic way. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to automation of parallel programming. Moreover, it allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model which simplify the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming, which SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem which is useful to guide the SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with programs the SAPFOR system builds.


Author(s):  
Françoise Baude ◽  
Guy Vidal-Naquet

Sign in / Sign up

Export Citation Format

Share Document