Improving data transfer for model coupling

2015 ◽  
Vol 8 (10) ◽  
pp. 8981-9020 ◽  
Author(s):  
C. Zhang ◽  
L. Liu ◽  
G. Yang ◽  
R. Li ◽  
B. Wang

Abstract. Data transfer, which means transferring data fields between two component models or rearranging data fields among processes of the same component model, is a fundamental operation of a coupler. Most state-of-the-art couplers currently use an implementation based on the point-to-point (P2P) communication of the Message Passing Interface (MPI) (referred to as the "P2P implementation" for short). In this paper, we reveal the drawbacks of the P2P implementation, including low communication bandwidth due to small message sizes, a variable and large number of MPI messages, and congestion during communication. To overcome these drawbacks, we propose a butterfly implementation for data transfer. Although the butterfly implementation can outperform the P2P implementation in many cases, it degrades performance in some cases because the total message size transferred by the butterfly implementation is larger than that of the P2P implementation. To improve data transfer in all cases, we design and implement an adaptive data transfer library that combines the advantages of both the butterfly implementation and the P2P implementation. Performance evaluation shows that the adaptive data transfer library significantly improves the performance of data transfer in most cases and does not decrease performance in any case. The adaptive data transfer library is now publicly available and has been incorporated into the coupler C-Coupler1 to improve its data transfer performance. We believe that it can also benefit other couplers.
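The contrast between the two implementations comes down to message counts: P2P can require one small message per sender/receiver pair, while a butterfly exchange runs log2(P) stages with one larger message per process per stage. A back-of-the-envelope sketch in Python (a hypothetical worst-case count for illustration, not a measurement from the paper):

```python
import math

def p2p_message_count(n_senders, n_receivers):
    # Worst case for P2P: every sender process holds data destined
    # for every receiver process, so up to n_senders * n_receivers
    # small messages cross the network.
    return n_senders * n_receivers

def butterfly_message_count(n_procs):
    # A butterfly exchange over P processes runs log2(P) stages; in
    # each stage every process exchanges one message with a partner
    # whose rank differs in exactly one bit.
    stages = int(math.log2(n_procs))
    return n_procs * stages

# With 64 processes on each side: 4096 P2P messages vs. 384
# butterfly messages (64 processes * 6 stages).
print(p2p_message_count(64, 64), butterfly_message_count(64))
```

The butterfly's messages are fewer but individually larger, which is exactly the trade-off the abstract identifies: higher per-message bandwidth at the cost of a larger total transferred volume.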

2016 ◽  
Vol 9 (6) ◽  
pp. 2099-2113
Author(s):  
Cheng Zhang ◽  
Li Liu ◽  
Guangwen Yang ◽  
Ruizhe Li ◽  
Bin Wang

Abstract. Data transfer means transferring data fields from a sender to a receiver. It is a fundamental and frequently used operation of a coupler. Most versions of state-of-the-art couplers currently use an implementation based on the point-to-point (P2P) communication of the Message Passing Interface (MPI) (referred to as the “P2P implementation” hereafter). In this paper, we reveal the drawbacks of the P2P implementation when the parallel decompositions of the sender and the receiver are different, including low communication bandwidth due to small message sizes, a variable and high number of MPI messages, as well as network contention. To overcome these drawbacks, we propose a butterfly implementation for data transfer. Although the butterfly implementation outperforms the P2P implementation in many cases, it degrades performance when the sender and the receiver have similar parallel decompositions or when the number of processes used for running the models is small. To ensure data transfer with optimal performance, we design and implement an adaptive data transfer library that combines the advantages of both the butterfly implementation and the P2P implementation. As the adaptive data transfer library automatically uses the better implementation for each transfer, it outperforms the P2P implementation in many cases while not decreasing performance in any case. The adaptive data transfer library is now publicly available and has been incorporated into the C-Coupler1 coupler to improve its data transfer performance. We believe that other couplers can also benefit from it.
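The abstract does not give the library's actual decision rule, but an adaptive chooser of this kind can be sketched with a simple latency/bandwidth cost model. Everything here (the model form and the default constants) is an illustrative assumption, not the paper's algorithm:

```python
import math

def choose_transfer(p2p_bytes, butterfly_bytes, p2p_messages, n_procs,
                    latency=1e-6, bandwidth=1e9):
    # Hypothetical cost model: time ~ latency * n_messages + bytes / bandwidth.
    # The butterfly typically moves more total bytes but far fewer messages.
    stages = max(1, int(math.log2(max(n_procs, 2))))
    t_p2p = latency * p2p_messages + p2p_bytes / bandwidth
    t_butterfly = latency * n_procs * stages + butterfly_bytes / bandwidth
    return "butterfly" if t_butterfly < t_p2p else "p2p"

# Many tiny messages between very different decompositions: butterfly wins.
print(choose_transfer(1e8, 2e8, 1_000_000, 1024))   # -> butterfly
# Similar decompositions, few processes, few messages: P2P wins.
print(choose_transfer(1e8, 3e8, 8, 8))              # -> p2p
```

This mirrors the behaviour the abstract describes: the message-count term dominates when decompositions differ across many processes, while the byte-volume term dominates when decompositions are similar or process counts are small.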


2009 ◽  
Vol 6 (2) ◽  
pp. 23
Author(s):  
Siti Arpah Ahmad ◽  
Mohamed Faidz Mohamed Said ◽  
Norazan Mohamed Ramli ◽  
Mohd Nasir Taib

This paper focuses on the performance of basic communication primitives, namely the overlap of message transfer with computation in point-to-point communication within a small cluster of four nodes. The mpptest benchmark was used to measure the basic performance of MPI message-passing routines with a variety of message sizes. The mpptest tool can measure performance with many participating processes, thus exposing contention and scalability problems, and it enables programmers to select message sizes in order to isolate and evaluate sudden changes in performance. Investigating these matters is interesting because non-blocking calls have the advantage of allowing the system to schedule communications even when many processes are running simultaneously. On the other hand, understanding the characteristics of computation and communication overlap is significant, because high-performance kernels often strive to achieve this overlap, since it benefits both data transfer and latency hiding. The results indicate that certain overlap sizes utilize greater node processing power in either blocking or non-blocking send and receive operations. The results provide a detailed MPI characterization of the performance of overlapping message transfer with computation in a small cluster system.
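The payoff of overlap can be expressed with a small timing model (an illustrative first-order model, not mpptest's methodology): blocking calls serialize communication and computation, while non-blocking calls let some fraction of the communication hide behind computation.

```python
def blocking_time(t_comm, t_comp):
    # Blocking MPI_Send/MPI_Recv: communication and computation
    # serialize, so their costs simply add.
    return t_comm + t_comp

def nonblocking_time(t_comm, t_comp, overlap=1.0, overhead=0.0):
    # Non-blocking MPI_Isend/MPI_Irecv + MPI_Wait: the fraction of
    # communication the system can progress in the background hides
    # behind computation; only the un-hidden remainder adds to the total.
    hidden = min(t_comm * overlap, t_comp)
    return t_comm + t_comp - hidden + overhead

# 2 units of communication fully hidden behind 5 units of computation:
print(blocking_time(2, 5))      # -> 7
print(nonblocking_time(2, 5))   # -> 5
```

The `overlap` parameter (a made-up knob for this sketch) captures the paper's point that the achievable overlap depends on message size: for some sizes the system progresses communication fully in the background, for others hardly at all.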


Author(s):  
Omer Subasi ◽  
Tatiana Martsinkevich ◽  
Ferad Zyulkyarov ◽  
Osman Unsal ◽  
Jesus Labarta ◽  
...  

We present a unified fault-tolerance framework for task-parallel message-passing applications to mitigate transient errors. First, we propose a fault-tolerant message-logging protocol that only requires the restart of the task that experienced the error and transparently handles any Message Passing Interface calls inside the task. In our experiments we demonstrate that our fault-tolerant solution has a reasonable overhead, with a maximum observed overhead of 4.5%. We also show that fine-grained parallelization is important for hiding the overheads related to the protocol as well as the recovery of tasks. Second, we develop a mathematical model that unifies task-level checkpointing and our protocol with system-wide checkpointing in order to provide complete failure coverage. We provide closed formulas for the optimal checkpointing interval and the performance score of the unified scheme. Experimental results show that the performance improvement can be as high as 98% with the unified scheme.
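The paper's own closed formulas are not reproduced in the abstract. As an illustration of the kind of result involved, here is Young's classic first-order approximation for the optimal system-wide checkpoint interval; this is a well-known stand-in for such derivations, not the unified scheme's actual formula:

```python
import math

def young_interval(checkpoint_cost, mtbf):
    # Young's first-order approximation: the interval between
    # checkpoints that minimizes expected lost work plus checkpoint
    # overhead is sqrt(2 * C * MTBF), where C is the cost of taking
    # one checkpoint and MTBF is the mean time between failures.
    return math.sqrt(2.0 * checkpoint_cost * mtbf)

# A 30 s checkpoint with a one-day MTBF suggests checkpointing
# roughly every 38 minutes.
print(young_interval(30, 86400))
```

A unified model like the paper's would additionally fold in the cheaper task-level restart path, shifting the optimum toward longer system-wide intervals.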


Author(s):  
Brendan J. Donegan ◽  
Daniel C. Doolan ◽  
Sabin Tabirca

The Mobile Message Passing Interface is a library which implements MPI functionality on Bluetooth-enabled mobile phones. It provides many of the functions available in MPI, including point-to-point and global communication. The main restriction of the library is that it was designed to work over Bluetooth piconets. Piconet-based networks allow a maximum of eight devices to be connected simultaneously, which limits the library's usefulness for parallel computing. A solution to this problem is presented that provides the same functionality as the original Mobile MPI library, but implemented over a Bluetooth scatternet. A scatternet may be defined as a number of piconets interconnected by common node(s). An outline of the scatternet design is explained and its major components are discussed.
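The capacity arithmetic behind the scatternet can be sketched quickly. Assuming a simple chain topology in which each piconet holds one master plus up to seven active slaves and adjacent piconets share a single bridge node (the paper's actual topology may differ), the piconet count grows as follows:

```python
import math

def piconets_needed(n_devices):
    # One piconet holds 1 master + up to 7 active slaves = 8 devices.
    # In a chain-style scatternet each additional piconet shares one
    # bridge node with its neighbour, so every piconet after the
    # first contributes at most 7 new devices.
    if n_devices <= 8:
        return 1
    return 1 + math.ceil((n_devices - 8) / 7)

# 8 phones fit in one piconet; a 9th forces a second piconet.
print(piconets_needed(8), piconets_needed(9))
```

This makes the restriction concrete: a single piconet caps a parallel job at eight phones, while a scatternet's size is limited only by how many bridge hops the routing layer tolerates.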


Author(s):  
Daniel C. Doolan ◽  
Sabin Tabirca ◽  
Laurence T. Yang

The Message Passing Interface (MPI) standardization effort began in 1992, and many implementations have since been developed. The MPICH library is one of the most well-known and freely available implementations. These libraries allow for the simplification of parallel computing on clusters and parallel machines. The system provides the developer with an easy-to-use set of functions for point-to-point and global communications. The details of how the actual communication takes place are hidden from the programmers, allowing them to focus on the domain-specific problem at hand. Communication between nodes on such systems is carried out via high-speed cabled interconnects (Gigabit Ethernet and upwards). Mobile computing, especially on mobile phones, is now ubiquitous. Mobile devices do not have any facility to allow for connections using traditional high-speed cabling; therefore, it is necessary to make use of wireless communication mechanisms to achieve interdevice communication. The majority of medium- to high-end phones are Bluetooth-enabled as standard, allowing for wireless communication to take place. The Mobile Message Passing Interface (MMPI) provides the developer with an intuitive set of functions to allow for communications between nodes (mobile phones) across a Bluetooth network. This chapter looks at the MMPI library and how it may be used for parallel computing on mobile phones (Smartphones).


2012 ◽  
Vol 9 (4) ◽  
pp. 1361-1383
Author(s):  
Yufei Lin ◽  
Xinhai Xu ◽  
Yuhua Tang ◽  
Xin Zhang ◽  
Xiaowei Guo

Designing and implementing a large-scale parallel system can be time-consuming and costly. It is therefore desirable to enable system developers to predict the performance of a parallel system at its design phase so that they can evaluate design alternatives to better meet performance requirements. Before the target machine is completely built, the developers can always build a symmetric multi-processor (SMP) for evaluation purposes. In this paper, we introduce an SMP-based discrete-event execution-driven performance simulation method for Message Passing Interface (MPI) programs and describe the design and implementation of a simulator called SMP-SIM. As the processes share the same memory space on an SMP, SMP-SIM manages events globally at the granularity of central processing units (CPUs). Furthermore, by re-implementing the core MPI point-to-point communication primitives, SMP-SIM handles communication virtually while executing the sequential computation directly. Our experimental results show that SMP-SIM is highly accurate and scalable, with errors of less than 7.60% for both SMP and SMP-Cluster target machines.
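The discrete-event core of such a simulator is a priority queue of timestamped events whose handlers may schedule further events. A minimal sketch of that engine (illustrative only; SMP-SIM's event model and MPI primitive handling are far richer):

```python
import heapq

def simulate(initial_events):
    # Minimal discrete-event engine: repeatedly pop the earliest
    # event and run its handler, which may schedule later events.
    # Events are (time, seq, name, handler); the integer seq breaks
    # timestamp ties so handlers themselves are never compared.
    queue = list(initial_events)
    heapq.heapify(queue)
    log = []
    while queue:
        time, _, name, handler = heapq.heappop(queue)
        log.append((time, name))
        for event in handler(time) or []:
            heapq.heappush(queue, event)
    return log

# Model one point-to-point message: a send at t=0 whose matching
# receive completes 5 time units later.
seq = iter(range(1000))
def send(t):
    return [(t + 5, next(seq), "recv", lambda _t: None)]

log = simulate([(0, next(seq), "send", send)])
print(log)  # [(0, 'send'), (5, 'recv')]
```

An execution-driven simulator like SMP-SIM would compute the "+5" from a network model while running the program's real computation between events; only the communication is virtualized.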


2020 ◽  
Vol 15 ◽  
Author(s):  
Weiwen Zhang ◽  
Long Wang ◽  
Theint Theint Aye ◽  
Juniarto Samsudin ◽  
Yongqing Zhu

Background: Genotype imputation as a service enables researchers to estimate genotypes on haplotyped data without performing whole-genome sequencing. However, genotype imputation is computation-intensive, and it remains a challenge to satisfy the high performance requirements of genome-wide association studies (GWAS). Objective: In this paper, we propose a high-performance computing solution for genotype imputation on supercomputers to enhance its execution performance. Method: We design and implement a multi-level parallelization that includes job-level, process-level and thread-level parallelization, enabled by job scheduling management, the Message Passing Interface (MPI) and OpenMP, respectively. It involves job distribution, chunk partition and execution, parallelized iteration for imputation, and data concatenation. This multi-level design lets us exploit the multi-machine/multi-core architecture to improve the performance of genotype imputation. Results: Experimental results show that our proposed method outperforms a Hadoop-based implementation of genotype imputation. Moreover, we conduct experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it can significantly shorten the execution time, thus improving the performance of genotype imputation. Conclusion: The proposed multi-level parallelization, when deployed as imputation as a service, will help bioinformatics researchers in Singapore conduct genotype imputation and enhance association studies.
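The chunk-partition / parallel-execution / concatenation pipeline at the thread level can be sketched with a thread pool. The per-chunk "imputation" below is a deliberately trivial stand-in (filling missing entries with the chunk's most common value), since the paper's actual statistical method is not described in the abstract:

```python
from concurrent.futures import ThreadPoolExecutor

def impute_chunk(chunk):
    # Stand-in for the real per-chunk imputation step: fill missing
    # genotypes (None) with the chunk's most common observed value.
    known = [g for g in chunk if g is not None]
    fill = max(set(known), key=known.count) if known else 0
    return [fill if g is None else g for g in chunk]

def impute(genotypes, chunk_size=4, workers=4):
    # Chunk partition -> parallel execution -> concatenation,
    # mirroring the thread level of the multi-level scheme.
    chunks = [genotypes[i:i + chunk_size]
              for i in range(0, len(genotypes), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(impute_chunk, chunks))
    return [g for chunk in results for g in chunk]

print(impute([1, None, 1, 0, None, 0, 0, 1], chunk_size=4))
```

In the full scheme the same partitioning idea recurs one level up: MPI processes own disjoint genomic regions, and the job scheduler distributes whole cohorts across machines.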


Energies ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 2284
Author(s):  
Krzysztof Przystupa ◽  
Mykola Beshley ◽  
Olena Hordiichuk-Bublivska ◽  
Marian Kyryk ◽  
Halyna Beshley ◽  
...  

The problem of analyzing large amounts of user data to determine user preferences and, based on these data, to recommend new products is important. Depending on the correctness and timeliness of the recommendations, significant profits or losses can result. The task of analyzing data on the users of a company's services is carried out by special recommendation systems. However, with a large number of users, the data to be processed becomes very large, which complicates the work of recommendation systems. For efficient data analysis in commercial systems, the Singular Value Decomposition (SVD) method can perform intelligent analysis of the information. For large amounts of processed information, we propose to use distributed systems. This approach reduces the time needed for data processing and for delivering recommendations to users. For the experimental study, we implemented the distributed SVD method using Message Passing Interface, Hadoop and Spark technologies, and obtained results showing reduced data processing time when using distributed systems compared to non-distributed ones.
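The numerical core that gets distributed is the SVD itself. A serial pure-Python sketch of its simplest piece, the dominant singular triple via power iteration on AᵀA (an illustrative kernel, not the paper's distributed algorithm):

```python
def rank1_svd(A, iters=100):
    # Power iteration on A^T A converges to the dominant right
    # singular vector v; then sigma = |A v| and u = A v / sigma.
    def matvec(M, x):
        return [sum(row[j] * x[j] for j in range(len(x))) for row in M]
    n = len(A[0])
    AT = [[A[i][j] for i in range(len(A))] for j in range(n)]
    v = [1.0] * n
    for _ in range(iters):
        w = matvec(AT, matvec(A, v))
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    Av = matvec(A, v)
    sigma = sum(x * x for x in Av) ** 0.5
    u = [x / sigma for x in Av]
    return sigma, u, v

# diag(2, 1) has singular values 2 and 1; the dominant one is 2.
sigma, u, v = rank1_svd([[2.0, 0.0], [0.0, 1.0]])
print(sigma)
```

In a user-item rating matrix, u and v are the first latent factors for users and items; the distributed implementations in the paper parallelize exactly these matrix-vector products across MPI processes or Spark partitions.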


1996 ◽  
Vol 22 (6) ◽  
pp. 789-828 ◽  
Author(s):  
William Gropp ◽  
Ewing Lusk ◽  
Nathan Doss ◽  
Anthony Skjellum
