A Memory Access Pattern-Based Program Profiling System for Dynamic Parallelism Prediction

Author(s):  
Zijun Han ◽  
Guangzhi Qu ◽  
Dona Burkard ◽  
Kelvin Dobbins
Author(s):  
Yuto Nakano ◽  
Shinsaku Kiyomoto ◽  
Yutaka Miyake

Author(s):  
S. Arash Ostadzadeh ◽  
Roel J. Meeuws ◽  
Carlo Galuzzi ◽  
Koen Bertels

2021 ◽  
Vol 40 (2) ◽  
pp. 1-17
Author(s):  
Milan Jaroš ◽  
Lubomír Říha ◽  
Petr Strakoš ◽  
Matěj Špeťko

This article presents a solution to path tracing of massive scenes on multiple GPUs. Our approach analyzes the memory access pattern of a path tracer and defines how the scene data should be distributed across up to 16 GPUs with minimal effect on performance. The key concept is that the parts of the scene with the highest number of memory accesses are replicated on all GPUs. We propose two methods for maximizing the performance of path tracing when working with partially distributed scene data. Both methods work at the memory-management level, so path tracer data structures do not have to be redesigned, which makes our approach applicable to other path tracers with only minor changes to their code. As a proof of concept, we have enhanced the open-source Blender Cycles path tracer. The approach was validated on scenes of up to 169 GB. We show that only 1–5% of the scene data needs to be replicated to all machines for such large scenes. On smaller scenes, we have verified that the performance is very close to that of rendering a fully replicated scene. In terms of scalability, we have achieved a parallel efficiency of over 94% using up to 16 GPUs.
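
The replication idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how scene data is partitioned or how the non-replicated part is assigned to GPUs, so the chunk granularity, the function name `plan_placement`, the `replicate_fraction` budget, and the round-robin assignment of the remaining chunks are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def plan_placement(
    access_counts: List[int],
    chunk_sizes: List[int],
    num_gpus: int,
    replicate_fraction: float,
) -> Tuple[List[int], Dict[int, int]]:
    """Hypothetical planner: returns (replicated chunk ids, {chunk id: owning GPU}).

    access_counts[i] is the measured number of accesses to chunk i and
    chunk_sizes[i] is its size in bytes. Chunks are an assumed unit of
    scene data; the source does not define the granularity.
    """
    # Replication budget: an assumed cap on the share of scene data
    # that is copied to every GPU.
    budget = replicate_fraction * sum(chunk_sizes)

    # Visit chunks from most to least accessed.
    order = sorted(
        range(len(access_counts)),
        key=lambda i: access_counts[i],
        reverse=True,
    )

    replicated: List[int] = []
    owner: Dict[int, int] = {}
    used = 0
    next_gpu = 0
    for chunk in order:
        if used + chunk_sizes[chunk] <= budget:
            # Hot chunk that still fits the budget: copy it to all GPUs.
            replicated.append(chunk)
            used += chunk_sizes[chunk]
        else:
            # Everything else lives on a single GPU (round-robin is an assumption).
            owner[chunk] = next_gpu
            next_gpu = (next_gpu + 1) % num_gpus
    return replicated, owner
```

For example, calling `plan_placement([5, 100, 1, 50], [10, 10, 10, 10], num_gpus=2, replicate_fraction=0.25)` replicates only chunk 1, the most frequently accessed one, and spreads the other three chunks over the two GPUs.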

