scholarly journals Interaction with the User in the SAPFOR System

2021 ◽  
Vol 24 (1) ◽  
pp. 157-183
Author(s):  
Никита Андреевич Катаев

Automation of parallel programming is important at any stage of parallel program development. These stages include profiling of the original program, program transformation, which allows us to achieve higher performance after program parallelization, and, finally, construction and optimization of the parallel program. It is also important to choose a suitable parallel programming model to express parallelism available in a program. On the one hand, the parallel programming model should be capable to map the parallel program to a variety of existing hardware resources. On the other hand, it should simplify the development of the assistant tools and it should allow the user to explore the parallel program the assistant tools generate in a semi-automatic way. The SAPFOR (System FOR Automated Parallelization) system combines various approaches to automation of parallel programming. Moreover, it allows the user to guide the parallelization if necessary. SAPFOR produces parallel programs according to the high-level DVMH parallel programming model which simplify the development of efficient parallel programs for heterogeneous computing clusters. This paper focuses on the approach to semi-automatic parallel programming, which SAPFOR implements. We discuss the architecture of the system and present the interactive subsystem which is useful to guide the SAPFOR through program parallelization. We used the interactive subsystem to parallelize programs from the NAS Parallel Benchmarks in a semi-automatic way. Finally, we compare the performance of manually written parallel programs with programs the SAPFOR system builds.

2020 ◽  
Vol 23 (4) ◽  
pp. 866-886
Author(s):  
Vladimir Aleksandrovich Bakhtin ◽  
Dmitry Aleksandrovich Zakharov ◽  
Aleksandr Aleksandrovich Ermichev ◽  
Victor Alekseevich Krukov

DVM-system is designed for the development of parallel programs of scientific and technical calculations in the C-DVMH and Fortran-DVMH languages. These languages use a single DVMH-model of parallel programming model and are an extension of the standard C and Fortran languages with parallelism specifications in the form of compiler directives. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters, in the nodes of which accelerators, graphic processors or Intel Xeon Phi coprocessors can be used as computing devices along with universal multi-core processors. The article describes the method of debugging parallel programs in DVM-system, as well as new features of DVM-debugger.


2020 ◽  
Vol 23 (3) ◽  
pp. 247-270
Author(s):  
Valery Fedorovich Aleksahin ◽  
Vladimir Aleksandrovich Bakhtin ◽  
Olga Fedorovna Zhukova ◽  
Dmitry Aleksandrovich Zakharov ◽  
Victor Alekseevich Krukov ◽  
...  

DVM-system is designed for the development of parallel programs of scientific and technical calculations in the C-DVMH and Fortran-DVMH languages. These languages use a single DVMH-model of parallel programming model and are an extension of the standard C and Fortran languages with parallelism specifications in the form of compiler directives. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters, in the nodes of which accelerators, graphic processors or Intel Xeon Phi coprocessors can be used as computing devices along with universal multi-core processors. The article presents new features of DVM-system that have been developed recently.


2020 ◽  
Vol 23 (4) ◽  
pp. 594-614
Author(s):  
Vladimir Aleksandrovich Bakhtin ◽  
Dmitry Aleksandrovich Zakharov ◽  
Andrey Nikolaevich Kozlov ◽  
Veniamin Sergeevich Konovalov

DVM-system is designed for the development of parallel programs of scientific and technical calculations in the C-DVMH and Fortran-DVMH languages. These languages use a single DVMH-model of parallel programming model and are an extension of the standard C and Fortran languages with parallelism specifications in the form of compiler directives. The DVMH model makes it possible to create efficient parallel programs for heterogeneous computing clusters, in the nodes of which accelerators, graphic processors or Intel Xeon Phi coprocessors can be used as computing devices along with universal multi-core processors. The article describes the experience of the successful using of DVM-system to develop a parallel software code for calculating the problem of radiation magnetic gas dynamics and for research of plasma dynamics in the QSPA channel.


2018 ◽  
Vol 29 (01) ◽  
pp. 1850010 ◽  
Author(s):  
Claudio Bonati ◽  
Enrico Calore ◽  
Massimo D’Elia ◽  
Michele Mesiti ◽  
Francesco Negro ◽  
...  

This paper describes a state-of-the-art parallel Lattice QCD Monte Carlo code for staggered fermions, purposely designed to be portable across different computer architectures, including GPUs and commodity CPUs. Portability is achieved using the OpenACC parallel programming model, used to develop a code that can be compiled for several processor architectures. The paper focuses on parallelization on multiple computing nodes using OpenACC to manage parallelism within the node, and OpenMPI to manage parallelism among the nodes. We first discuss the available strategies to be adopted to maximize performances, we then describe selected relevant details of the code, and finally measure the level of performance and scaling-performance that we are able to achieve. The work focuses mainly on GPUs, which offer a significantly high level of performances for this application, but also compares with results measured on other processors.


2015 ◽  
Vol 44 (4) ◽  
pp. 832-866 ◽  
Author(s):  
Ren Li ◽  
Haibo Hu ◽  
Heng Li ◽  
Yunsong Wu ◽  
Jianxi Yang

2016 ◽  
Vol 43 ◽  
pp. 95-103 ◽  
Author(s):  
James A. Ross ◽  
David A. Richie ◽  
Song J. Park ◽  
Dale R. Shires

2021 ◽  
Vol 251 ◽  
pp. 03032
Author(s):  
Haiwang Yu ◽  
Zhihua Dong ◽  
Kyle Knoepfel ◽  
Meifeng Lin ◽  
Brett Viren ◽  
...  

The Liquid Argon Time Projection Chamber (LArTPC) technology plays an essential role in many current and future neutrino experiments. Accurate and fast simulation is critical to developing efficient analysis algorithms and precise physics model projections. The speed of simulation becomes more important as Deep Learning algorithms are getting more widely used in LArTPC analysis and their training requires a large simulated dataset. Heterogeneous computing is an efficient way to delegate computationally intensive tasks to specialized hardware. However, as the landscape of compute accelerators quickly evolves, it becomes increasingly difficult to manually adapt the code to the latest hardware or software environments. A solution which is portable to multiple hardware architectures without substantially compromising performance would thus be very beneficial, especially for long-term projects such as the LArTPC simulations. In search of a portable, scalable and maintainable software solution for LArTPC simulations, we have started to explore high-level portable programming frameworks that support several hardware backends. In this paper, we present our experience porting the LArTPC simulation code in the Wire-Cell Toolkit to NVIDIA GPUs, first with the CUDA programming model and then with a portable library called Kokkos. Preliminary performance results on NVIDIA V100 GPUs and multi-core CPUs are presented, followed by a discussion of the factors affiecting the performance and plans for future improvements.


Sign in / Sign up

Export Citation Format

Share Document