Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language

Author(s):  
Masahiro Nakao ◽  
Hitoshi Murai ◽  
Hidetoshi Iwashita ◽  
Taisuke Boku ◽  
Mitsuhisa Sato

The XcalableMP PGAS language has been proposed to improve the productivity of developing parallel applications on high-performance computing systems. XcalableMP supports both typical parallelization under the “global-view memory model”, which uses directives, and flexible parallelization under the “local-view memory model”, which uses coarray features. The goal of the present paper is to clarify XcalableMP’s productivity and performance. To do so, we implement and evaluate the High Performance Computing Challenge benchmarks, namely EP STREAM Triad, High Performance Linpack, Global Fast Fourier Transform, and RandomAccess, on the K computer using up to 16,384 compute nodes and on a generic cluster system using up to 128 compute nodes. We found that the benchmarks were easier to implement with XcalableMP than with MPI. Moreover, most of the performance results obtained with XcalableMP were almost the same as those obtained with MPI.

2017 ◽  
Vol 108 ◽  
pp. 495-504 ◽  
Author(s):  
Jack Dongarra ◽  
Sven Hammarling ◽  
Nicholas J. Higham ◽  
Samuel D. Relton ◽  
Pedro Valero-Lara ◽  
...  

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Marek Nowicki ◽  
Łukasz Górski ◽  
Piotr Bała

With the development of peta- and exascale computational systems, there is growing interest in running Big Data and Artificial Intelligence (AI) applications on them. Big Data and AI applications are implemented in Java, Scala, Python, and other languages that are not widely used in High-Performance Computing (HPC), which is still dominated by C and Fortran. Moreover, they are based on dedicated environments such as Hadoop or Spark, which are difficult to integrate with traditional HPC management systems. We have developed the Parallel Computing in Java (PCJ) library, a tool for scalable high-performance computing and Big Data processing in Java. In this paper, we present the basic functionality of the PCJ library with examples of highly scalable applications running on large resources. Performance results are presented for different classes of applications, including traditional compute-intensive (HPC) workloads (e.g. stencil), as well as communication-intensive algorithms such as the Fast Fourier Transform (FFT). We present implementation details and performance results for Big Data processing running on petascale systems. Examples of large-scale AI workloads parallelized using PCJ are also presented.
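As a rough illustration of the PCJ programming model described in the abstract, the sketch below sums a series in parallel: each PCJ thread computes a partial sum into a shared variable, and thread 0 gathers the partial results with one-sided get operations. It is a minimal sketch assuming the PCJ 5 API (org.pcj.PCJ, StartPoint, the @RegisterStorage/@Storage annotations, and the executeBuilder launcher); the class name SumExample, the Shared enum, the partialSum field, and the nodes.txt host file are illustrative choices, not taken from the paper.

    import java.io.File;
    import org.pcj.PCJ;
    import org.pcj.RegisterStorage;
    import org.pcj.StartPoint;
    import org.pcj.Storage;

    // Illustrative sketch: parallel partial sums gathered by thread 0.
    @RegisterStorage(SumExample.Shared.class)
    public class SumExample implements StartPoint {

        // Shared (PGAS) variables are declared as an enum whose constants
        // name fields of the storage class.
        @Storage(SumExample.class)
        enum Shared { partialSum }

        double partialSum;

        static final long N = 100_000_000L;

        @Override
        public void main() {
            int myId = PCJ.myId();            // id of this PCJ thread
            int threads = PCJ.threadCount();  // total number of PCJ threads

            // Each thread sums its own slice of the series sum(1/i^2).
            double local = 0.0;
            for (long i = myId + 1; i <= N; i += threads) {
                local += 1.0 / (i * i);
            }
            // Direct field write; assumes that when the StartPoint class is
            // also the storage class, this instance backs the shared variable.
            partialSum = local;

            PCJ.barrier();  // ensure every partialSum has been written

            if (myId == 0) {
                // Thread 0 reads the remote partial sums one-sidedly.
                double total = 0.0;
                for (int t = 0; t < threads; t++) {
                    double remote = PCJ.get(t, Shared.partialSum);
                    total += remote;
                }
                System.out.println("sum = " + total);  // approaches pi^2 / 6
            }
        }

        public static void main(String[] args) {
            // Launch via the PCJ builder; nodes.txt (assumed) lists one
            // hostname per PCJ thread, e.g. several "localhost" lines.
            PCJ.executeBuilder(SumExample.class)
               .addNodes(new File("nodes.txt"))
               .start();
        }
    }

The one-sided put/get operations and the barrier used here are the PGAS-style primitives the abstract refers to; a production code would more likely use PCJ's asynchronous variants or collective operations rather than a hand-written gather loop.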

