Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX
Performance modeling and optimization of sparse matrix-vector multiplication on NVIDIA CUDA platform
2011 ◽ Vol 63 (3) ◽ pp. 710-721
2014 ◽ Vol 27 (13) ◽ pp. 3281-3294
2014 ◽ Vol 25 (5) ◽ pp. 1112-1123
2017 ◽ Vol 43 (4) ◽ pp. 1-49
Keyword(s):