Data Dependence Profiling for Speculative Optimizations

Abstract: In the unstructured finite volume method, loops over mesh components such as cells, faces, and nodes are widely used to traverse data. A mesh loop results in direct or indirect data access, which significantly affects data locality. When a loop runs over the mesh, many threads accessing the same data leads to data dependence. Both data locality and data dependence play an important part in the performance of GPU simulations. To optimize a GPU-accelerated unstructured finite volume Computational Fluid Dynamics (CFD) program, the performance of hot spots under loops over cells, faces, and nodes is evaluated on Nvidia Tesla V100 and K80 GPUs. Numerical tests at different mesh scales show that the mesh loop modes affect data locality and data dependence differently. Specifically, the face loop gives the best data locality whenever kernels access face data. The cell loop incurs the smallest overhead from non-coalesced data access when kernels use both cell and node data but no face data. The cell loop also performs best when kernels involve only indirect access to cell data. Atomic operations substantially reduce kernel performance on the K80, whereas the effect is not obvious on the V100. With a suitable mesh loop mode in all kernels, the overall performance of GPU simulations can be increased by 15%-20%. Finally, the program on a single V100 GPU achieves a maximum speedup of 21.7 and an average speedup of 14.1 compared with 28 MPI tasks on two Intel Xeon Gold 6132 CPUs.
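The difference between the face loop and the cell loop described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the 1D mesh, the flux values, and the function names `face_loop`, `build_cell_faces`, and `cell_loop` are all hypothetical. The face loop scatters each face flux to its two neighbouring cells, so on a GPU with one thread per face, two threads could update the same cell and would need atomic operations. The cell loop gathers fluxes per cell, so each cell is written only once.

```python
# Hypothetical 1D mesh: 4 cells in a row, 3 interior faces (assumed data).
face_cells = [(0, 1), (1, 2), (2, 3)]   # (left cell, right cell) of each face
face_flux = [1.0, 2.0, 4.0]             # made-up flux through each face
n_cells = 4


def face_loop(face_cells, face_flux, n_cells):
    """Scatter: loop over faces and update both neighbouring cells.

    Faces 0 and 1 both write to cell 1, so a one-thread-per-face GPU
    kernel would need atomic additions here (a data dependence).
    """
    residual = [0.0] * n_cells
    for f, (left, right) in enumerate(face_cells):
        residual[left] -= face_flux[f]    # flux leaves the left cell
        residual[right] += face_flux[f]   # flux enters the right cell
    return residual


def build_cell_faces(face_cells, n_cells):
    """Build the cell-to-face connectivity with a sign per face."""
    cell_faces = [[] for _ in range(n_cells)]
    for f, (left, right) in enumerate(face_cells):
        cell_faces[left].append((f, -1.0))
        cell_faces[right].append((f, +1.0))
    return cell_faces


def cell_loop(cell_faces, face_flux):
    """Gather: loop over cells and read the fluxes of their faces.

    Each cell is written exactly once, so no atomics are needed,
    but face data is accessed indirectly through cell_faces.
    """
    residual = []
    for faces in cell_faces:
        total = 0.0
        for f, sign in faces:
            total += sign * face_flux[f]
        residual.append(total)
    return residual


scatter_result = face_loop(face_cells, face_flux, n_cells)
gather_result = cell_loop(build_cell_faces(face_cells, n_cells), face_flux)
print(scatter_result)
print(gather_result)
```

Both loops compute the same residual. They differ in which data is accessed indirectly and in whether concurrent writes to one cell can occur, which is the trade-off the abstract evaluates.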

Effectiveness of data dependence analysis

International Journal of Parallel Programming ◽ 10.1007/bf02577784 ◽ 1995 ◽ Vol 23 (1) ◽ pp. 63-81 ◽ Cited By ~ 7

Author(s): Dror E. Maydan ◽ John L. Hennessy ◽ Monica S. Lam

Keyword(s): Data Dependence ◽ Dependence Analysis ◽ Data Dependence Analysis

Designing parallel sparse matrix algorithms beyond data dependence analysis

Proceedings International Conference on Parallel Processing Workshops ◽ 10.1109/icppw.2001.951838 ◽ 2002 ◽ Cited By ~ 2

Author(s): H.X. Lin

Keyword(s): Sparse Matrix ◽ Data Dependence ◽ Dependence Analysis ◽ Matrix Algorithms ◽ Data Dependence Analysis

The use of data dependence graphs in the design of bit-level systolic arrays

IEEE Transactions on Acoustics Speech and Signal Processing ◽ 10.1109/29.56023 ◽ 1990 ◽ Vol 38 (5) ◽ pp. 787-793 ◽ Cited By ~ 29

Author(s): J.V. McCanny ◽ J.G. McWhirter ◽ S.-Y. Kung

Keyword(s): Data Dependence ◽ Systolic Arrays ◽ Use Of Data ◽ Dependence Graphs

Data dependence and data-flow analysis of arrays

Languages and Compilers for Parallel Computing - Lecture Notes in Computer Science ◽ 10.1007/3-540-57502-2_63 ◽ 1993 ◽ pp. 434-448 ◽ Cited By ~ 13

Author(s): D. Maydan ◽ S. Amarasinghe ◽ M. Lam

Keyword(s): Data Flow ◽ Flow Analysis ◽ Data Dependence ◽ Data Flow Analysis

Comparison of Data Dependence Analysis Tests

Lecture Notes in Computer Science - Computer Systems: Architectures, Modeling, and Simulation ◽ 10.1007/978-3-540-27776-7_16 ◽ 2004 ◽ pp. 149-158

Author(s): Miia Viitanen ◽ Timo D. Hämäläinen

Keyword(s): Data Dependence ◽ Dependence Analysis ◽ Data Dependence Analysis

Reliability and Performance Models for Grid Computing

Handbook of Research on Scalable Computing Technologies ◽ 10.4018/978-1-60566-661-7.ch010 ◽ 2010 ◽ pp. 219-245 ◽ Cited By ~ 1

Author(s): Yuan-Shun Dai ◽ Jack Dongarra

Keyword(s): Grid Computing ◽ Resource Sharing ◽ Large Scale ◽ Optimization Problems ◽ Data Dependence ◽ Performance Models ◽ Task Partitioning ◽ Modeling And Analysis ◽ Failure Correlation ◽ And Performance

Grid computing is a newly developed technology for complex systems with large-scale resource sharing, wide-area communication, and multi-institutional collaboration. Grid reliability is hard to analyze and model because of the system's size, complexity, and stiffness. Therefore, this chapter introduces the Grid computing technology, presents the different types of failures in a grid system, models grid reliability with star and tree structures, and finally studies optimization problems for grid task partitioning and allocation. The chapter then presents models for the star topology that consider data dependence and for the tree structure that consider failure correlation. Evaluation tools and algorithms are developed, derived from the universal generating function and graph theory. Failure correlation and data dependence are then considered in the model. Numerical examples are given to illustrate the modeling and analysis.
