Parallel Computing for Sorting Algorithms

The expanding use of multi-processor supercomputers has made a significant impact on the speed and size of many problems. The adaptation of standard Message Passing Interface protocol (MPI) has enabled programmers to write portable and efficient codes across a wide variety of parallel architectures. Sorting is one of the most common operations performed by a computer. Because sorted data are easier to manipulate than randomly ordered data, many algorithms require sorted data. Sorting is of additional importance to parallel computing because of its close relation to the task of routing data among processes, which is an essential part of many parallel algorithms. In this paper, sequential sorting algorithms, the parallel implementation of many sorting methods in a variety of ways using MPICH.NT.1.2.3 library under C++ programming language and comparisons between the parallel and sequential implementations are presented. Then, these methods are used in the image processing field. It have been built a median filter based on these submitted algorithms. As the parallel platform is unavailable, the time is computed in terms of a number of computations steps and communications steps

Download Full-text

A Robot's Response Acceleration Using the Metric Dimension Problem

10.20944/preprints201911.0194.v1 ◽

2019 ◽

Author(s):

Elsayed Badr ◽

Khalid Aloufi

Keyword(s):

Message Passing ◽

Message Passing Interface ◽

Parallel Implementation ◽

Metric Dimension ◽

Complete Problem ◽

Path Graph ◽

Ladder Graph ◽

Current Location ◽

C Programming ◽

Np Complete

Consider a robot that is navigating in a space modeled by a graph, and that wants to know its current location. It can send a signal to determine how far it is from each landmark among a set of fixed landmarks. We study the problem of computing the minimum required number of landmarks, and where they should be placed so that the robot can always determine its location. Since the problem is an NP-complete problem, the robot's responses to the actions are slow. To accelerate this response, we can use the parallel version of this problem. In this work, we introduce a new parallel implementation for determining the metric dimension of a given graph. We run the proposed algorithm on a symmetric multi-processing (SMP) cluster using C programming language and the Message Passing Interface (MPI) library. Finally, we run our implementation on four categories of graphs (the tracks in which the robot moves): a cycle graph Cn, a path graph Pn, a triangular snake graph and a ladder graph Ln. Preliminary computational results indicate that the metric dimension problem is an NP-complete problem and prove the ability of the proposed algorithm to achieve a speedup of 6 for 8 processors.

Download Full-text

Parallel implementation for HSLO(3)-FDTD with message passing interface on Distributed Memory Architecture

2006 International Conference on Computing & Informatics ◽

10.1109/icoci.2006.5276531 ◽

2006 ◽

Author(s):

Mohammad Khatim Hasan ◽

Mohamed Othman ◽

Jalil Md Desa ◽

Zulkifly Abbas ◽

Jumat Sulaiman

Keyword(s):

Message Passing ◽

Message Passing Interface ◽

Distributed Memory ◽

Parallel Implementation ◽

Memory Architecture ◽

Distributed Memory Architecture

Download Full-text

Simulation-Based Scheduling of Waterway Projects Using a Parallel Genetic Algorithm

Transportation Systems and Engineering ◽

10.4018/978-1-4666-8473-7.ch016 ◽

2015 ◽

pp. 334-347 ◽

Cited By ~ 2

Author(s):

Ning Yang ◽

Shiaaulir Wang ◽

Paul Schonfeld

Keyword(s):

Genetic Algorithm ◽

Parallel Computing ◽

Message Passing ◽

Message Passing Interface ◽

Computation Time ◽

Parallel Genetic Algorithm ◽

Simulation Based ◽

Multiple Processors ◽

Simulation Based Optimization ◽

Speed Up

A Parallel Genetic Algorithm (PGA) is used for a simulation-based optimization of waterway project schedules. This PGA is designed to distribute a Genetic Algorithm application over multiple processors in order to speed up the solution search procedure for a very large combinational problem. The proposed PGA is based on a global parallel model, which is also called a master-slave model. A Message-Passing Interface (MPI) is used in developing the parallel computing program. A case study is presented, whose results show how the adaption of a simulation-based optimization algorithm to parallel computing can greatly reduce computation time. Additional techniques which are found to further improve the PGA performance include: (1) choosing an appropriate task distribution method, (2) distributing simulation replications instead of different solutions, (3) avoiding the simulation of duplicate solutions, (4) avoiding running multiple simulations simultaneously in shared-memory processors, and (5) avoiding using multiple processors which belong to different clusters (physical sub-networks).

Download Full-text

Reducing communication in algebraic multigrid with multi-step node aware communication

The International Journal of High Performance Computing Applications ◽

10.1177/1094342020925535 ◽

2020 ◽

Vol 34 (5) ◽

pp. 547-561

Author(s):

Amanda Bienz ◽

William D Gropp ◽

Luke N Olson

Keyword(s):

Message Passing ◽

Message Passing Interface ◽

Parallel Implementation ◽

Algebraic Multigrid ◽

Sparse Linear Systems ◽

Parallel Scalability ◽

Strong Scaling ◽

The Cost ◽

Communication Schedule ◽

Inter Process Communication

Algebraic multigrid (AMG) is often viewed as a scalable [Formula: see text] solver for sparse linear systems. Yet, AMG lacks parallel scalability due to increasingly large costs associated with communication, both in the initial construction of a multigrid hierarchy and in the iterative solve phase. This work introduces a parallel implementation of AMG that reduces the cost of communication, yielding improved parallel scalability. It is common in Message Passing Interface (MPI), particularly in the MPI-everywhere approach, to arrange inter-process communication, so that communication is transported regardless of the location of the send and receive processes. Performance tests show notable differences in the cost of intra- and internode communication, motivating a restructuring of communication. In this case, the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and the size of internode messages. Node-centric communication extends to the range of components in both the setup and solve phase of AMG, yielding an increase in the weak and strong scaling of the entire method.

Download Full-text

Interpretive MPI for Parallel Computing

Volume 3: 28th Computers and Information in Engineering Conference, Parts A and B ◽

10.1115/detc2008-49996 ◽

2008 ◽

Author(s):

Yu-Cheng Chou ◽

Harry H. Cheng

Keyword(s):

Parallel Computing ◽

Programming Languages ◽

Message Passing ◽

Large Scale ◽

Message Passing Interface ◽

Rapid Development ◽

Web Based ◽

Heterogeneous Platforms ◽

C Programs ◽

Computation Speedup

Message Passing Interface (MPI) is a standardized library specification designed for message-passing parallel programming on large-scale distributed systems. A number of MPI libraries have been implemented to allow users to develop portable programs using the scientific programming languages, Fortran, C and C++. Ch is an embeddable C/C++ interpreter that provides an interpretive environment for C/C++ based scripts and programs. Combining Ch with any MPI C/C++ library provides the functionality for rapid development of MPI C/C++ programs without compilation. In this article, the method of interfacing Ch scripts with MPI C implementations is introduced by using the MPICH2 C library as an example. The MPICH2-based Ch MPI package provides users with the ability to interpretively run MPI C program based on the MPICH2 C library. Running MPI programs through the MPICH2-based Ch MPI package across heterogeneous platforms consisting of Linux and Windows machines is illustrated. Comparisons for the bandwidth, latency, and parallel computation speedup between C MPI, Ch MPI, and MPI for Python in an Ethernet-based environment comprising identical Linux machines are presented. A Web-based example is given to demonstrate the use of Ch and MPICH2 in C based CGI scripting to facilitate the development of Web-based applications for parallel computing.

Download Full-text

Study on the Numerical Simulation of Explosion and Impact Processes Using PC Cluster System

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.2892 ◽

2012 ◽

Vol 433-440 ◽

pp. 2892-2898

Author(s):

Guang Lei Fei ◽

Jian Guo Ning ◽

Tian Bao Ma

Keyword(s):

Numerical Simulation ◽

Operating System ◽

Parallel Computing ◽

Message Passing ◽

Message Passing Interface ◽

Parallel Program ◽

Pc Cluster ◽

Computing Platform ◽

Impact Processes ◽

Platform System

Parallel computing has been applied in many fields, and the parallel computing platform system, PC cluster based on MPI (Message Passing Interface) library under Linux operating system is a cost-effectiveness approach to parallel compute. In this paper, the key algorithm of parallel program of explosion and impact is presented. The techniques of solving data dependence and realizing communication between subdomain are proposed. From the test of program, the portability of MMIC-3D parallel program is satisfied, and compared with the single computer, PC cluster can improve the calculation speed and enlarge the scale greatly.

Download Full-text

A highly portable parallel implementation of AMBER4 using the message passing interface standard

Journal of Computational Chemistry ◽

10.1002/jcc.540161110 ◽

1995 ◽

Vol 16 (11) ◽

pp. 1420-1427 ◽

Cited By ~ 18

Author(s):

James J. Vincent ◽

Kenneth M. Merz

Keyword(s):

Message Passing ◽

Message Passing Interface ◽

Parallel Implementation

Download Full-text

Efficient Message Passing Interface (MPI) for Parallel Computing on Clusters of Workstations

Journal of Parallel and Distributed Computing ◽

10.1006/jpdc.1996.1267 ◽

1997 ◽

Vol 40 (1) ◽

pp. 19-34 ◽

Cited By ~ 32

Author(s):

Jehoshua Bruck ◽

Danny Dolev ◽

Ching-Tien Ho ◽

Marcel-Cătălin Roşu ◽

Ray Strong

Keyword(s):

Parallel Computing ◽

Message Passing ◽

Message Passing Interface

Download Full-text

Optimized parallel simulations of analytic bond-order potentials on hybrid shared/distributed memory with MPI and OpenMP

The International Journal of High Performance Computing Applications ◽

10.1177/1094342017727060 ◽

2017 ◽

Vol 33 (2) ◽

pp. 227-241 ◽

Cited By ~ 1

Author(s):

Carlos Teijeiro ◽

Thomas Hammerschmidt ◽

Ralf Drautz ◽

Godehard Sutmann

Keyword(s):

Message Passing ◽

Bond Order ◽

Message Passing Interface ◽

Parallel Implementation ◽

Computational Cost ◽

Parallel Simulations ◽

Decomposition Scheme ◽

Significant Performance ◽

Restricted Volume ◽

Bond Order Potentials

Analytic bond-order potentials (BOPs) allow to obtain a highly accurate description of interatomic interactions at a reasonable computational cost. However, for simulations with very large systems, the high memory demands require the use of a parallel implementation, which at the same time also optimizes the use of computational resources. The calculations of analytic BOPs are performed for a restricted volume around every atom and therefore have shown to be well suited for a message passing interface (MPI)-based parallelization based on a domain decomposition scheme, in which one process manages one big domain using the entire memory of a compute node. On the basis of this approach, the present work focuses on the analysis and enhancement of its performance on shared memory by using OpenMP threads on each MPI process, in order to use many cores per node to speed up computations and minimize memory bottlenecks. Different algorithms are described and their corresponding performance results are presented, showing significant performance gains for highly parallel systems with hybrid MPI/OpenMP simulations up to several thousands of threads.

Download Full-text

CIP and Parallel Computing Based Numerical Solutions of 3-D Slamming Problems

Volume 11: Prof. Robert F. Beck Honoring Symposium on Marine Hydrodynamics ◽

10.1115/omae2015-41292 ◽

2015 ◽

Author(s):

Peng Wen ◽

Wei Qiu

Keyword(s):

Parallel Computing ◽

Message Passing ◽

Message Passing Interface ◽

Numerical Solutions ◽

Three Dimensional ◽

Simulation Method ◽

Water Entry ◽

Computational Domain ◽

Cip Method ◽

Constrained Interpolation

This paper presents the further development of numerical simulation method to solve 3-D highly non-linear slamming problems using parallel computing algorithms. The water entry problems are treated as multi-phase problems (solid, water and air) and governed by the Navier-Stokes (N-S) equations. They are solved by the three-dimensional constrained interpolation profile (CIP) method. The interfaces between different phases are captured using density functions. In the computation, the 3-D CIP method is employed for the advection phase of the N-S equations and a pressure-based algorithm is applied for the non-advection phase. The bi-conjugate gradient stabilized method (BiCGSTAB) is utilized to solve the linear equation systems. A Message Passing Interface (MPI) parallel computing scheme was implemented in the computations. For the parallel computations, the three-dimensional Cartesian decomposition of the computational domain was used. The speed-up performance of various decomposition schemes were studied. Validation studies were carried out for the water entry of a 3-D wedge and a 3-D ship section with prescribed velocities. The computed slamming force, pressure distribution and free-surface elevations are compared with experimental results and numerical results by other methods.

Download Full-text