Automatic Thread Block Size Selection Strategy in GPU Parallel Code Generation

Author(s):  
Weifang Hu ◽  
Lin Han ◽  
Pu Han ◽  
Jiandong Shang
Author(s):  
Mark R. Gilder ◽  
Mukkai S. Krishnamoorthy ◽  
John R. Punin

2008 ◽  
Vol 81 (8) ◽  
pp. 1389-1405 ◽  
Author(s):  
Theodore Andronikos ◽  
Florina M. Ciorba ◽  
Panayiotis Theodoropoulos ◽  
Dimitrios Kamenopoulos ◽  
George Papakonstantinou

1997 ◽  
Vol 07 (04) ◽  
pp. 425-436 ◽  
Author(s):  
Yunheung Paek ◽  
David A. Padua

Due to the complexity of programming scalable multiprocessors with physically distributed memories, it is onerous to manually generate parallel code for these machines. As a consequence, there has been much research on the development of compiler techniques to simplify programming, to increase reliability, and to reduce development costs. For code generation, a compiler applies a number of transformations in areas such as data privatization, data copying and replication, synchronization, and data and work distribution. In this paper, we discuss our recent work on the development and implementation of a few compiler techniques for some of these transformations. We use Polaris, a parallelizing Fortran restructurer developed at Illinois, as the infrastructure to implement our algorithms. The paper includes experimental results obtained by applying our techniques to several benchmark codes.

