dense linear algebra
Recently Published Documents


Total documents: 104 (five years: 5)
H-index: 18 (five years: 1)

2021
Author(s): Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini

2018, Vol 106 (11), pp. 2040-2055
Author(s): Jack Dongarra, Mark Gates, Jakub Kurzak, Piotr Luszczek, Yaohung M. Tsai

2018, Vol 30 (22), pp. e4696
Author(s): Dániel Berényi, András Leitereg, Gábor Lehel

Author(s): João Vicente Ferreira Lima, Issam Raïs, Laurent Lefèvre, Thierry Gautier

In this article, we analyze the performance and energy consumption of five OpenMP runtime systems on a non-uniform memory access (NUMA) platform. We also select three CPU-level optimizations or techniques and evaluate their impact on the runtime systems: the processor features Turbo Boost and C-States, and CPU Dynamic Voltage and Frequency Scaling (DVFS) through Linux CPUFreq governors. We present an experimental study that characterizes OpenMP runtime systems on the three main kernels in dense linear algebra algorithms (Cholesky, LU, and QR) in terms of performance and energy consumption. Our experimental results suggest that OpenMP runtime systems can be considered a new energy leverage, and that Turbo Boost and C-States significantly impacted performance and energy. CPUFreq governors had more impact with Turbo Boost disabled, since both optimizations reduced performance due to CPU thermal limits. An LU factorization with the concurrent-write extension from libKOMP achieved up to a 63% performance gain and a 29% energy decrease over the original PLASMA algorithm using the GNU C compiler (GCC) libGOMP runtime.

