A novel shared memory framework for distributed deep learning in high-performance computing architecture

Author(s):  
Shinyoung Ahn ◽  
Joongheon Kim ◽  
Sungwon Kang

Author(s):  
Italo Epicoco ◽  
Silvia Mocavero ◽  
Andrew R Porter ◽  
Stephen M Pickles ◽  
Mike Ashworth ◽  
...  

This work describes the introduction of a second level of parallelism, based on the OpenMP shared-memory paradigm, into NEMO, one of the most widely used ocean models in the European climate community. The existing parallelisation scheme in NEMO, based on MPI, has served it well for many years, but it is becoming unsuited to current high-performance computing architectures, whose nodes are increasingly "fat", containing tens of compute cores. Three different approaches for introducing OpenMP are presented, discussed and compared on several platforms. Finally, we consider the effect on performance of the data layout employed in NEMO.
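
To make the idea of a second, shared-memory level of parallelism concrete, the following is a minimal sketch in C, not NEMO code (NEMO is written in Fortran) and not one of the three approaches compared in the paper, which the abstract does not specify. It assumes each MPI rank already owns one rectangular subdomain and uses OpenMP threads to share the rows of that subdomain. The function name `smooth_subdomain`, the five-point average and the row-major layout are all illustrative assumptions.

```c
#include <stddef.h>

/* Hypothetical kernel: five-point average over the interior of one
 * subdomain, i.e. the block of the grid owned by a single MPI rank.
 * Arrays are stored row-major with ni points per row, so index i is
 * contiguous in memory (an assumed layout, not NEMO's). */
void smooth_subdomain(const double *in, double *out, int ni, int nj)
{
    /* Second level of parallelism: the rows of this subdomain are
     * divided among the OpenMP threads of the node. If the compiler is
     * run without OpenMP support, the pragma is ignored and the loop
     * runs serially with the same result. */
    #pragma omp parallel for
    for (int j = 1; j < nj - 1; ++j) {
        /* The inner loop walks contiguous memory, which is where the
         * choice of data layout affects performance. */
        for (int i = 1; i < ni - 1; ++i) {
            size_t c = (size_t)j * ni + i;
            out[c] = 0.2 * (in[c] + in[c - 1] + in[c + 1]
                            + in[c - ni] + in[c + ni]);
        }
    }
}
```

In a hybrid code, each MPI rank would call such a kernel on its own subdomain after exchanging halo rows with its neighbours. Running fewer ranks per node, each with several threads, is one way to use nodes with many cores.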

