Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language

Author(s):  
Masahiro Nakao ◽  
Hitoshi Murai ◽  
Hidetoshi Iwashita ◽  
Taisuke Boku ◽  
Mitsuhisa Sato

The XcalableMP PGAS language has been proposed to improve the productivity of developing parallel applications on high-performance computing systems. XcalableMP supports both typical parallelization under the “global-view memory model”, which uses directives, and flexible parallelization under the “local-view memory model”, which uses coarray features. The goal of the present paper is to clarify XcalableMP’s productivity and performance. To do so, we implement and evaluate the High Performance Computing Challenge benchmarks, namely EP STREAM Triad, High Performance Linpack, Global Fast Fourier Transform, and RandomAccess, on the K computer using up to 16,384 compute nodes and on a generic cluster system using up to 128 compute nodes. We found that the benchmarks were easier to implement with XcalableMP than with MPI. Moreover, most of the performance results obtained with XcalableMP were almost the same as those obtained with MPI.

2017 ◽  
Vol 108 ◽  
pp. 495-504 ◽  
Author(s):  
Jack Dongarra ◽  
Sven Hammarling ◽  
Nicholas J. Higham ◽  
Samuel D. Relton ◽  
Pedro Valero-Lara ◽  
...  

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Marek Nowicki ◽  
Łukasz Górski ◽  
Piotr Bała

With the development of peta- and exascale computational systems, there is growing interest in running Big Data and Artificial Intelligence (AI) applications on them. Big Data and AI applications are implemented in Java, Scala, Python, and other languages that are not widely used in High-Performance Computing (HPC), which is still dominated by C and Fortran. Moreover, they are based on dedicated environments such as Hadoop or Spark, which are difficult to integrate with traditional HPC management systems. We have developed the Parallel Computing in Java (PCJ) library, a tool for scalable high-performance computing and Big Data processing in Java. In this paper, we present the basic functionality of the PCJ library with examples of highly scalable applications running on large resources. Performance results are presented for different classes of applications, including traditional compute-intensive (HPC) workloads (e.g. stencil), as well as communication-intensive algorithms such as the Fast Fourier Transform (FFT). We present implementation details and performance results for Big Data processing running on petascale systems. Examples of large-scale AI workloads parallelized using PCJ are also presented.
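As a rough illustration of the PCJ programming model described in the abstract, the sketch below sums a series in parallel: each PCJ thread computes a partial sum into a shared variable, and thread 0 gathers the partial results with one-sided get operations. It is a minimal sketch assuming the PCJ 5 API (org.pcj.PCJ, StartPoint, the @RegisterStorage/@Storage annotations, and the executeBuilder launcher); the class name SumExample, the Shared enum, the partialSum field, and the nodes.txt host file are illustrative choices, not taken from the paper.

    import java.io.File;
    import org.pcj.PCJ;
    import org.pcj.RegisterStorage;
    import org.pcj.StartPoint;
    import org.pcj.Storage;

    // Illustrative sketch: parallel partial sums gathered by thread 0.
    @RegisterStorage(SumExample.Shared.class)
    public class SumExample implements StartPoint {

        // Shared (PGAS) variables are declared as an enum whose constants
        // name fields of the storage class.
        @Storage(SumExample.class)
        enum Shared { partialSum }

        double partialSum;

        static final long N = 100_000_000L;

        @Override
        public void main() {
            int myId = PCJ.myId();            // id of this PCJ thread
            int threads = PCJ.threadCount();  // total number of PCJ threads

            // Each thread sums its own slice of the series sum(1/i^2).
            double local = 0.0;
            for (long i = myId + 1; i <= N; i += threads) {
                local += 1.0 / (i * i);
            }
            // Direct field write; assumes that when the StartPoint class is
            // also the storage class, this instance backs the shared variable.
            partialSum = local;

            PCJ.barrier();  // ensure every partialSum has been written

            if (myId == 0) {
                // Thread 0 reads the remote partial sums one-sidedly.
                double total = 0.0;
                for (int t = 0; t < threads; t++) {
                    double remote = PCJ.get(t, Shared.partialSum);
                    total += remote;
                }
                System.out.println("sum = " + total);  // approaches pi^2 / 6
            }
        }

        public static void main(String[] args) {
            // Launch via the PCJ builder; nodes.txt (assumed) lists one
            // hostname per PCJ thread, e.g. several "localhost" lines.
            PCJ.executeBuilder(SumExample.class)
               .addNodes(new File("nodes.txt"))
               .start();
        }
    }

The one-sided put/get operations and the barrier used here are the PGAS-style primitives the abstract refers to; a production code would more likely use PCJ's asynchronous variants or collective operations rather than a hand-written gather loop.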

