Parallel Computing Framework Based on MapReduce and GPU Clusters

Abstract: With the rapid growth of data volume, massive data has become one of the factors hampering enterprise development. The need to process data effectively and to reduce the concurrency pressure of data access drives the continuous development of big data solutions. This article studies a MapReduce parallel computing framework based on multiple data fusion sensors and GPU clusters. The experiments use a fully distributed Hadoop cluster, and the MapReduce-based single-source shortest path algorithm is implemented entirely in Java. Eight ordinary physical machines make up the cluster, and each node has essentially the same configuration. The MapReduce framework divides a requested job into several map tasks and assigns them to different computing nodes. The map phase generates intermediate files in the same format as the final output. The system then generates several reduce tasks and distributes these files to different cluster nodes for execution. The experiment measures how the running time of the PSON algorithm changes as the test data set grows while the hardware and software configuration of the Hadoop platform stays unchanged. When the number of computing nodes increases from 2 to 4, the running time drops significantly; as more nodes are added, the reduction becomes less and less significant. The results show that NESTOR can complete the basic MapReduce workflow and simplifies user development of GPU positive tree order, giving a significant speedup for computation-heavy applications.
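
The map and reduce steps described in the abstract can be illustrated with a minimal in-memory sketch of an iterative single-source shortest path computation. This is not the paper's Java/Hadoop implementation: it is plain Python, the function names (`map_phase`, `reduce_phase`, `shortest_paths`) and the record layout are assumptions made for illustration, and the "shuffle" is simulated with a dictionary rather than distributed across cluster nodes.

```python
from collections import defaultdict

INF = float("inf")

def map_phase(node, dist, edges):
    """Emit the node's own record plus a candidate distance for each neighbour."""
    yield node, ("NODE", dist, edges)
    if dist < INF:
        for neighbour, weight in edges:
            yield neighbour, ("DIST", dist + weight, None)

def reduce_phase(node, values):
    """Keep the adjacency list and the smallest distance seen for this node."""
    best, edges = INF, []
    for kind, dist, adj in values:
        if kind == "NODE":
            edges = adj
        best = min(best, dist)
    return node, best, edges

def shortest_paths(graph, source):
    """Repeat map -> shuffle -> reduce rounds until no distance changes."""
    state = {n: (0 if n == source else INF, adj) for n, adj in graph.items()}
    while True:
        # Shuffle: group every emitted value by its key (hypothetical stand-in
        # for the framework's distribution of intermediate files).
        shuffled = defaultdict(list)
        for node, (dist, edges) in state.items():
            for key, value in map_phase(node, dist, edges):
                shuffled[key].append(value)
        new_state = {}
        for node, values in shuffled.items():
            n, best, edges = reduce_phase(node, values)
            new_state[n] = (best, edges)
        if new_state == state:
            return {n: d for n, (d, _) in new_state.items()}
        state = new_state

# Small example graph with non-negative edge weights (illustrative data only).
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 2), ("D", 6)], "C": [("D", 3)], "D": []}
distances = shortest_paths(graph, "A")
```

Each round plays the role of one MapReduce job; the output of a round has the same record shape as its input, which mirrors the abstract's remark that intermediate files match the final file format.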

TopADD: a 2D/3D integrated topology optimization parallel-computing framework for arbitrary design domains

Structural and Multidisciplinary Optimization ◽

10.1007/s00158-021-02917-z ◽

2021 ◽

Author(s):

Zhi-Dong Zhang ◽

Osezua Ibhadode ◽

Ali Bonakdar ◽

Ehsan Toyserkani

Keyword(s):

Parallel Computing ◽

Topology Optimization ◽

Computing Framework

Snow: A Parallel Computing Framework for the R System

International Journal of Parallel Programming ◽

10.1007/s10766-008-0077-2 ◽

2008 ◽

Vol 37 (1) ◽

pp. 78-90 ◽

Cited By ~ 39

Author(s):

Luke Tierney ◽

A. J. Rossini ◽

Na Li

Keyword(s):

Parallel Computing ◽

Computing Framework

A Practical Approach for Scalable Record Linkage on Hadoop

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.753-755.3018 ◽

2013 ◽

Vol 753-755 ◽

pp. 3018-3024 ◽

Cited By ~ 1

Author(s):

Fen Gyu Yang ◽

Ying Chen ◽

Ye Zhang

Keyword(s):

Parallel Computing ◽

Record Linkage ◽

Practical Approach ◽

Massive Data ◽

Traditional Methods ◽

High Recall ◽

Data Parallel ◽

Computing Framework ◽

Data Parallel Computing ◽

Hadoop Cluster

As more data are collected in many applications, record linkage must handle millions of records. Traditional methods face a major performance challenge when dealing with massive data. Parallel computing frameworks such as MapReduce have become an efficient and practical way to address this problem. In this paper, we propose a practical 3-phase MapReduce approach that performs blocking, filtering, and linking in three consecutive processes on a Hadoop cluster. Experiments show that our approach runs efficiently and effectively while maintaining high recall compared with traditional methods.
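
The three consecutive phases named in the abstract (blocking, filtering, linking) can be sketched in a few lines of plain Python. The abstract does not specify the blocking key, the filter, or the similarity measure, so everything concrete here is an assumption for illustration: the first-letter blocking key, the length-difference filter, the `difflib.SequenceMatcher` ratio, the threshold values, and the sample records.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

def blocking(records):
    """Phase 1: group records by a cheap key (assumed: first letter of the name)."""
    blocks = defaultdict(list)
    for rec_id, name in records:
        blocks[name[:1].lower()].append((rec_id, name))
    return blocks

def filtering(blocks, max_len_diff=3):
    """Phase 2: within each block, drop pairs that fail a cheap check
    (assumed: name lengths differ by more than max_len_diff)."""
    for members in blocks.values():
        for (id_a, a), (id_b, b) in combinations(members, 2):
            if abs(len(a) - len(b)) <= max_len_diff:
                yield (id_a, a), (id_b, b)

def linking(pairs, threshold=0.85):
    """Phase 3: run the expensive similarity comparison on surviving pairs."""
    links = []
    for (id_a, a), (id_b, b) in pairs:
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
            links.append((id_a, id_b))
    return links

# Illustrative records only; not data from the paper.
records = [(1, "Jon Smith"), (2, "John Smith"), (3, "Jane Doe"), (4, "Mary Jones")]
matches = linking(filtering(blocking(records)))
```

Blocking and filtering exist to shrink the number of pairs that reach the costly comparison; in a MapReduce setting each phase would run as its own job, with the blocking key serving as the shuffle key.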

An Efficient Parallel Computing Framework for Over the Obstacle VLSI Routing

Advances in Intelligent Systems and Computing - Progress in Computing, Analytics and Networking ◽

10.1007/978-981-15-2414-1_38 ◽

2020 ◽

pp. 383-395

Author(s):

G. Shyamala ◽

G. R. Prasad

Keyword(s):

Parallel Computing ◽

Computing Framework

Multicore Parallel Computing and DSP Processor for the Design of Bio-inspired Soft Computing Framework for Speech and Image Processing Applications

Recent Trends in Intelligent and Emerging Systems - Signals and Communication Technology ◽

10.1007/978-81-322-2407-5_10 ◽

2015 ◽

pp. 125-134 ◽

Cited By ~ 1

Author(s):

Dipjyoti Sarma ◽

Kandarpa Kumar Sarma

Keyword(s):

Image Processing ◽

Parallel Computing ◽

Soft Computing ◽

Computing Framework

A System for Noncontact Estimation of Cognitive Load Using Saccadic Parameters Based on a Serio-Parallel Computing Framework

IEEE Transactions on Cognitive and Developmental Systems ◽

10.1109/tcds.2019.2901024 ◽

2019 ◽

Vol 11 (3) ◽

pp. 450-459

Keyword(s):

Parallel Computing ◽

Cognitive Load ◽

Computing Framework

A Parallel Computing Framework for Finding the Supported Solutions to a Biobjective Network Optimization Problem

Journal of Multi-Criteria Decision Analysis ◽

10.1002/mcda.1541 ◽

2015 ◽

Vol 22 (5-6) ◽

pp. 244-259 ◽

Cited By ~ 4

Author(s):

Fernando Antonio Medrano ◽

Richard Lee Church

Keyword(s):

Parallel Computing ◽

Network Optimization ◽

Optimization Problem ◽

Computing Framework

Mr4Soil: A MapReduce-Based Framework Integrated with GIS for Soil Erosion Modelling

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi8030103 ◽

2019 ◽

Vol 8 (3) ◽

pp. 103

Author(s):

Zhigang Han ◽

Fen Qin ◽

Caihui Cui ◽

Yannan Liu ◽

Lingling Wang ◽

...

Keyword(s):

Parallel Computing ◽

Soil Erosion ◽

Laser Scanning ◽

Three Dimensional ◽

Large Data ◽

Data Sets ◽

Precision Data ◽

Model Calculations ◽

Erosion Modelling ◽

Computing Framework

A soil erosion model is used to evaluate the conditions of soil erosion and guide agricultural production. Recently, high spatial resolution data have been collected in new ways, such as three-dimensional laser scanning, providing the foundation for refined soil erosion modelling. However, serial computing cannot fully meet the computational requirements of massive data sets. Therefore, it is necessary to perform soil erosion modelling under a parallel computing framework. This paper focuses on a parallel computing framework for soil erosion modelling based on the Hadoop platform. The framework includes three layers: the methodology, algorithm, and application layers. In the methodology layer, two types of parallel strategies for data splitting are defined as row-oriented and sub-basin-oriented methods. The algorithms for six parallel calculation operators for local, focal and zonal computing tasks are designed in detail. These operators can be called to calculate the model factors and perform model calculations. We defined the key-value data structure of GeoCSV format for vector, row-based and cell-based rasters as the inputs for the algorithms. A geoprocessing toolbox is developed and integrated with the geographic information system (GIS) platform in the application layer. The performance of the framework is examined by taking the Gushanchuan basin as an example. The results show that the framework can perform calculations involving large data sets with high computational efficiency and GIS integration. This approach is easy to extend and use and provides essential support for applying high-precision data to refine soil erosion modelling.
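
The row-oriented splitting strategy and a local (cell-by-cell) operator can be illustrated with a minimal in-memory sketch. It is not the Mr4Soil implementation and does not use the GeoCSV format: the function names, the `(row_index, row_values)` record layout, and the two small factor rasters are all assumptions made for illustration.

```python
def split_rows(raster):
    """Row-oriented split: one (row_index, row_values) record per raster row."""
    return [(i, row) for i, row in enumerate(raster)]

def local_operator(record_a, record_b, func):
    """Local operator: combine two rasters cell by cell within a single row."""
    (i, row_a), (_, row_b) = record_a, record_b
    return i, [func(a, b) for a, b in zip(row_a, row_b)]

def assemble(records):
    """Reassemble the output raster by sorting records on the row index."""
    return [row for _, row in sorted(records)]

# Two hypothetical factor rasters of equal shape (illustrative values only).
factor_a = [[1.0, 2.0], [3.0, 4.0]]
factor_b = [[0.5, 0.5], [2.0, 1.0]]

pairs = zip(split_rows(factor_a), split_rows(factor_b))
result = assemble(local_operator(a, b, lambda x, y: x * y) for a, b in pairs)
```

Because a local operator needs only the cells at the same position, each row record can be processed independently. Focal and zonal operators would additionally need neighbouring rows or zone membership, which this sketch does not cover.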

PAPIRUS, a parallel computing framework for sensitivity analysis, uncertainty propagation, and estimation of parameter distribution

Nuclear Engineering and Design ◽

10.1016/j.nucengdes.2015.07.002 ◽

2015 ◽

Vol 292 ◽

pp. 237-247 ◽

Cited By ~ 12

Author(s):

Jaeseok Heo ◽

Kyung Doo Kim

Keyword(s):

Sensitivity Analysis ◽

Parallel Computing ◽

Uncertainty Propagation ◽

Parameter Distribution ◽

Computing Framework
