VISUAL PROGRAMMING FOR MESSAGE-PASSING SYSTEMS

Author(s): NENAD STANKOVIC, KANG ZHANG

The attractiveness of visual programming stems in large part from the direct interaction with program elements as if they were real objects, since people deal better with concrete objects than with abstractions. This paper describes a new graph-based software visualization tool for parallel message-passing programming, named Visper, that combines the levels of abstraction at which message-passing parallel programs are expressed and makes use of compositional programming. Central to the tool is the Process Communication Graph, which correlates the control and data flow graphs into a single graph formalism without the need for complex textual annotation. The graph can express static and runtime communication and replication structures, as found in the Message Passing Interface (MPI) and Parallel Virtual Machine (PVM). It also forms the basis for visualizing parallel debugging and performance.
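As a rough, hypothetical illustration of the concept (not Visper's actual data model), a Process Communication Graph can be thought of as nodes for processes and edges carrying either control-flow or message (data-flow) semantics; a minimal C sketch:

    #include <stdio.h>

    /* Hypothetical sketch of a process communication graph (PCG):
       nodes are processes, edges carry either control-flow or
       message (data-flow) semantics. Not Visper's actual model. */
    typedef enum { EDGE_CONTROL, EDGE_MESSAGE } EdgeKind;

    typedef struct {
        int from, to;       /* process (node) ids */
        EdgeKind kind;      /* control dependency or message */
        const char *label;  /* e.g. message tag or loop condition */
    } Edge;

    int main(void) {
        /* master sends work to two workers, then collects results */
        Edge pcg[] = {
            { 0, 1, EDGE_MESSAGE, "work"    },
            { 0, 2, EDGE_MESSAGE, "work"    },
            { 1, 1, EDGE_CONTROL, "iterate" },
            { 1, 0, EDGE_MESSAGE, "result"  },
            { 2, 0, EDGE_MESSAGE, "result"  },
        };
        for (size_t i = 0; i < sizeof pcg / sizeof pcg[0]; i++)
            printf("P%d -> P%d [%s: %s]\n", pcg[i].from, pcg[i].to,
                   pcg[i].kind == EDGE_CONTROL ? "control" : "message",
                   pcg[i].label);
        return 0;
    }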

2015, Vol 8 (3), pp. 2369-2402
Author(s): W. He, C. Beyer, J. H. Fleckenstein, E. Jang, O. Kolditz, ...

Abstract. This technical paper presents an efficient and performance-oriented method to model reactive mass transport processes in environmental and geotechnical subsurface systems. The open source scientific software packages OpenGeoSys and IPhreeqc have been coupled to combine their individual strengths and features to simulate thermo-hydro-mechanical-chemical coupled processes in porous and fractured media with simultaneous consideration of aqueous geochemical reactions. Furthermore, a flexible parallelization scheme using MPI (Message Passing Interface) grouping techniques has been implemented, which allows computer resources to be allocated optimally between the node-wise calculation of chemical reactions on the one hand and the underlying processes such as groundwater flow or solute transport on the other. The coupling interface and parallelization scheme have been tested and verified in terms of precision and performance.
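The MPI grouping idea can be sketched with communicator splitting in C; the 1:3 division between transport and chemistry ranks below is a hypothetical example, not the actual OpenGeoSys/IPhreeqc allocation scheme:

    #include <mpi.h>
    #include <stdio.h>

    /* Minimal sketch: split MPI_COMM_WORLD into one group for the
       flow/transport solver and one for node-wise chemistry, so each
       group can be sized independently. Hypothetical 1:3 split. */
    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* color 0: transport ranks, color 1: chemistry ranks */
        int color = (rank < size / 4) ? 0 : 1;
        MPI_Comm group_comm;
        MPI_Comm_split(MPI_COMM_WORLD, color, rank, &group_comm);

        int grank, gsize;
        MPI_Comm_rank(group_comm, &grank);
        MPI_Comm_size(group_comm, &gsize);
        printf("world rank %d -> %s rank %d of %d\n", rank,
               color == 0 ? "transport" : "chemistry", grank, gsize);

        MPI_Comm_free(&group_comm);
        MPI_Finalize();
        return 0;
    }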


2014, Vol 571-572, pp. 26-29
Author(s): Xiang Wei Duan, Wei Chang Shen, Jun Guo

This paper introduces the Mandelbrot set, the Message Passing Interface (MPI), and shared-memory programming with OpenMP, and analyses the characteristics of algorithm design in the MPI and OpenMP environments. It describes parallel implementations of the Mandelbrot set computation in both environments, reports a series of evaluations and performance tests carried out at runtime, and compares the differences between the two implementations.
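A minimal sketch of the shared-memory (OpenMP) variant, assuming the standard escape-time iteration; the image dimensions and iteration cap are illustrative:

    #include <stdio.h>
    #include <complex.h>

    #define W 800
    #define H 600
    #define MAX_ITER 1000

    /* Standard escape-time test: iterate z = z*z + c until |z| > 2. */
    static int escape_time(double complex c) {
        double complex z = 0;
        for (int n = 0; n < MAX_ITER; n++) {
            z = z * z + c;
            if (cabs(z) > 2.0) return n;
        }
        return MAX_ITER;
    }

    int main(void) {
        static int img[H][W];
        /* rows are independent; a dynamic schedule balances the
           uneven per-pixel cost typical of the Mandelbrot set */
        #pragma omp parallel for schedule(dynamic)
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++) {
                double complex c = (-2.5 + 3.5 * x / W)
                                 + (-1.25 + 2.5 * y / H) * I;
                img[y][x] = escape_time(c);
            }
        printf("center pixel: %d\n", img[H/2][W/2]);
        return 0;
    }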


2017, Vol 2017, pp. 1-12
Author(s): Anuj Sharma, Irene Moulitsas

High-resolution numerical methods and unstructured meshes are required in many applications of Computational Fluid Dynamics (CFD). These methods are computationally expensive and hence benefit from being parallelized. The Message Passing Interface (MPI) has traditionally been used as the parallelization strategy. However, the inherent complexity of MPI adds to the existing complexity of CFD scientific codes. The Partitioned Global Address Space (PGAS) parallelization paradigm was introduced in an attempt to improve the clarity of the parallel implementation. We present our experience of converting an unstructured, high-resolution compressible Navier-Stokes CFD solver from MPI to PGAS Coarray Fortran. We present the challenges, methodology, and performance measurements of our approach using Coarray Fortran. With the Cray compiler, we find Coarray Fortran to be a viable alternative to MPI. We are hopeful that Intel and open-source implementations can be utilized in the future.
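For context, the kind of MPI boilerplate such a conversion replaces is exemplified by the nonblocking halo exchange between mesh partitions; the fragment below is a generic C sketch, not the solver's actual code:

    #include <mpi.h>

    /* Generic sketch of a nonblocking halo exchange between two
       neighbouring mesh partitions (boundary ranks may pass
       MPI_PROC_NULL as a neighbour). Not the solver's actual code. */
    void exchange_halo(double *send_lo, double *recv_lo,
                       double *send_hi, double *recv_hi,
                       int n, int lo_nbr, int hi_nbr, MPI_Comm comm) {
        MPI_Request req[4];
        MPI_Irecv(recv_lo, n, MPI_DOUBLE, lo_nbr, 0, comm, &req[0]);
        MPI_Irecv(recv_hi, n, MPI_DOUBLE, hi_nbr, 1, comm, &req[1]);
        MPI_Isend(send_lo, n, MPI_DOUBLE, lo_nbr, 1, comm, &req[2]);
        MPI_Isend(send_hi, n, MPI_DOUBLE, hi_nbr, 0, comm, &req[3]);
        /* interior computation could overlap the exchange here */
        MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    }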


Author(s): George W. Leaver, Martin J. Turner, James S. Perrin, Paul M. Mummery, Philip J. Withers

Remote scientific visualization, where rendering services are provided by larger-scale systems than are available on the desktop, is becoming increasingly important as dataset sizes grow beyond the capabilities of desktop workstations. Uptake of such services relies on access to suitable visualization applications and the ability to view the resulting visualization in a convenient form. We consider five rules from the e-Science community to meet these goals when porting a commercial visualization package to a large-scale system. The application uses the Message Passing Interface (MPI) to distribute data among data processing and rendering processes. The use of MPI in such an interactive application is not compatible with restrictions imposed by the Cray system being considered. We present details, and a performance analysis, of a new MPI proxy method that allows the application to run within the Cray environment yet still support the MPI communication it requires. Example use cases from materials science are considered.
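Although the paper's proxy details are specific to the Cray environment, the core relay idea can be sketched as a rank that forwards arbitrary point-to-point traffic unchanged; route() below is a hypothetical routing function:

    #include <mpi.h>

    #define MAX_MSG (1 << 20)

    /* Hypothetical routing: map (source, tag) to a destination rank. */
    extern int route(int source, int tag);

    /* Generic relay loop: a proxy rank receives any message and
       forwards it unchanged, preserving the tag. Sketch only. */
    void proxy_loop(MPI_Comm comm) {
        static char buf[MAX_MSG];
        for (;;) {
            MPI_Status st;
            MPI_Recv(buf, MAX_MSG, MPI_BYTE, MPI_ANY_SOURCE,
                     MPI_ANY_TAG, comm, &st);
            int count;
            MPI_Get_count(&st, MPI_BYTE, &count);
            MPI_Send(buf, count, MPI_BYTE,
                     route(st.MPI_SOURCE, st.MPI_TAG), st.MPI_TAG, comm);
        }
    }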


2000, Vol 10 (04), pp. 371-382
Author(s): SOULLA LOUCA, NEOPHYTOS NEOPHYTOU, ADRIANOS LACHANAS, PARASKEVAS EVRIPIDOU

In this paper, we propose the design and development of a fault-tolerance and recovery scheme for the Message Passing Interface (MPI). The proposed scheme consists of a mechanism for detecting process failures and a recovery mechanism. Two different cases are considered, both assuming the existence of a monitoring process, the Observer, which triggers the recovery procedure in case of failure. In the first case, each process keeps a buffer with its own message traffic to be used in case of failure, while the implementation periodically tests for notification of failure by the Observer. The recovery function simulates all communication with the dead process by re-sending to the replacement process all the messages destined for the dead one. In the second case, the Observer receives and stores all message traffic, and sends the buffered messages destined for the dead process to its replacement. Solutions are provided to the dead-communicator problem caused by the death of a process. A description of the prototype developed is provided, along with the results of experiments performed to assess its efficiency and performance.
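The sender-side buffering of the first case can be sketched as a thin wrapper that logs each outgoing message for possible replay; the log structure below is a hypothetical simplification of such a scheme:

    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical log entry for replaying traffic to a replacement
       process; a real scheme would also bound or checkpoint the log. */
    typedef struct LogEntry {
        void *data;
        int count, dest, tag;
        struct LogEntry *next;
    } LogEntry;

    static LogEntry *log_head = NULL;

    /* Wrapper: send and keep a copy of the message for recovery. */
    int logged_send(const void *buf, int count, int dest, int tag,
                    MPI_Comm comm) {
        LogEntry *e = malloc(sizeof *e);
        e->data = malloc(count);
        memcpy(e->data, buf, count);
        e->count = count; e->dest = dest; e->tag = tag;
        e->next = log_head; log_head = e;
        return MPI_Send(buf, count, MPI_BYTE, dest, tag, comm);
    }

    /* On failure, re-send everything destined for the dead rank to
       its replacement (invoked after the Observer signals failure).
       Note: this simple list replays newest-first; a real scheme
       would preserve the original message order. */
    void replay_to(int dead, int replacement, MPI_Comm comm) {
        for (LogEntry *e = log_head; e; e = e->next)
            if (e->dest == dead)
                MPI_Send(e->data, e->count, MPI_BYTE, replacement,
                         e->tag, comm);
    }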


2019, Vol 214, pp. 05029
Author(s): Alexey Rybalchenko, Dennis Klein, Mohammad Al-Turany, Thorsten Kollegger

The high data rates expected for the next generation of particle physics experiments (e.g., new experiments at FAIR/GSI and the upgrade of CERN experiments) call for dedicated attention to the design of the needed computing infrastructure. The common ALICE-FAIR framework ALFA is a modern software layer that serves as a platform for simulation, reconstruction, and analysis of particle physics experiments. Besides the standard services needed for simulation and reconstruction, ALFA also provides tools for data transport, configuration, and deployment. The FairMQ module in ALFA offers building blocks for creating distributed software components (processes) that communicate with each other via message passing. The abstract "message passing" interface in FairMQ currently has three implementations: ZeroMQ, nanomsg, and shared memory. We present the newly developed shared memory transport, which provides significant performance benefits for transferring large data chunks between components on the same node. The implementation in FairMQ allows users to switch between the different transports via a trivial configuration change. The design decisions, implementation details, and performance numbers of the shared memory transport in FairMQ/ALFA are highlighted.
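The underlying idea of a shared memory transport, where the payload stays in a region mapped by both processes and only a small reference crosses the process boundary, can be sketched with POSIX shared memory; this is a generic illustration, not FairMQ's implementation:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Generic sketch: the producer writes a large chunk into a shared
       segment; a consumer maps the same segment, so only the segment
       name (a small message) needs to cross the process boundary. */
    int main(void) {
        const char *name = "/demo_chunk";  /* hypothetical segment name */
        const size_t size = 1 << 20;       /* 1 MiB payload */

        int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
        if (fd < 0 || ftruncate(fd, size) != 0) return 1;

        char *chunk = mmap(NULL, size, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);
        if (chunk == MAP_FAILED) return 1;

        strcpy(chunk, "event data...");  /* producer fills the chunk */
        /* a consumer process would shm_open(name) and mmap the same
           region; only "name" is sent over the message queue */

        munmap(chunk, size);
        close(fd);
        shm_unlink(name);
        return 0;
    }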


2015, Vol 2015, pp. 1-14
Author(s): Magne Haveraaen, Karla Morris, Damian Rouson, Hari Radhakrishnan, Clayton Carson

This paper presents ideas for using coordinate-free numerics in modern Fortran to achieve code flexibility in the partial differential equation (PDE) domain. We also show how Fortran, over the last few decades, has changed to become a language well-suited for state-of-the-art software development. Fortran's new coarray distributed data structure, the language's class mechanism, and its side-effect-free, pure procedure capability provide the scaffolding on which we implement HPC software. These features empower compilers to organize parallel computations with efficient communication. We present some programming patterns that support asynchronous evaluation of expressions composed of parallel operations on distributed data. We implemented these patterns using coarrays and the Message Passing Interface (MPI), and compared the codes' complexity and performance. The MPI code is much more complex and depends on external libraries. The MPI code on Cray hardware using the Cray compiler is 1.5–2 times faster than the coarray code on the same hardware. The Intel compiler implements coarrays atop Intel's MPI library, with the result being apparently 2–2.5 times slower than manually coded MPI despite exhibiting nearly linear scaling efficiency. As compilers mature and further improvements to coarrays come in Fortran 2015, we expect this performance gap to narrow.
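On the MPI side of such a comparison, the asynchronous-evaluation pattern amounts to overlapping communication with independent local work; a minimal C sketch using an MPI-3 nonblocking collective (the coarray version would instead rely on compiler-managed communication):

    #include <mpi.h>
    #include <stdio.h>

    /* Sketch: start a nonblocking reduction, do independent local
       work while it progresses, then wait for the result. */
    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        double local = rank + 1.0, global = 0.0;
        MPI_Request req;
        MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                       MPI_COMM_WORLD, &req);

        /* ... independent local computation overlaps here ... */

        MPI_Wait(&req, MPI_STATUS_IGNORE);
        if (rank == 0) printf("sum = %f\n", global);
        MPI_Finalize();
        return 0;
    }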


2020, Vol 15
Author(s): Weiwen Zhang, Long Wang, Theint Theint Aye, Juniarto Samsudin, Yongqing Zhu

Background: Genotype imputation as a service is developed to enable researchers to estimate genotypes on haplotyped data without performing whole-genome sequencing. However, genotype imputation is computation-intensive, and it thus remains a challenge to satisfy the high performance requirements of genome-wide association studies (GWAS). Objective: In this paper, we propose a high performance computing solution for genotype imputation on supercomputers to enhance its execution performance. Method: We design and implement a multi-level parallelization that includes job-level, process-level, and thread-level parallelization, enabled by job scheduling management, the Message Passing Interface (MPI), and OpenMP, respectively. It involves job distribution, chunk partition and execution, parallelized iteration for imputation, and data concatenation. Owing to this multi-level design, we can exploit multi-machine/multi-core architectures to improve the performance of genotype imputation. Results: Experimental results show that our proposed method outperforms the Hadoop-based implementation of genotype imputation. Moreover, we conduct experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it can significantly shorten the execution time, thus improving the performance of genotype imputation. Conclusion: The proposed multi-level parallelization, when deployed as imputation as a service, will facilitate bioinformatics researchers in Singapore in conducting genotype imputation and enhancing association studies.
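The process- and thread-level layers can be sketched as a hybrid MPI+OpenMP structure in C; the chunk and marker counts below are hypothetical stand-ins for the paper's imputation work units:

    #include <mpi.h>
    #include <stdio.h>

    /* Hypothetical kernel standing in for imputing one marker. */
    static void impute_marker(int chunk, int marker) {
        (void)chunk; (void)marker;
    }

    int main(int argc, char **argv) {
        int provided, rank, size;
        /* FUNNELED: only the main thread issues MPI calls */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int nchunks = 64, nmarkers = 1000;  /* illustrative sizes */
        /* process level: distribute chunks round-robin across ranks */
        for (int c = rank; c < nchunks; c += size) {
            /* thread level: parallelize the per-marker loop with OpenMP */
            #pragma omp parallel for
            for (int m = 0; m < nmarkers; m++)
                impute_marker(c, m);
        }
        if (rank == 0) printf("done\n");
        MPI_Finalize();
        return 0;
    }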

