Modeling contention of sparse-matrix-vector multiplication (SMV) in three parallel programming paradigms

Proceedings of the 6th international workshop on Software and performance - WOSP '07 ◽

10.1145/1216993.1217003 ◽

2007 ◽

Author(s):

Ahmed Sameh ◽

Tarek El-Ghazawi ◽

Yesha Yacoov

Keyword(s):

Parallel Programming ◽

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Programming Paradigms ◽

SpaceA: Sparse Matrix Vector Multiplication on Processing-in-Memory Accelerator

2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) ◽

10.1109/hpca51647.2021.00055 ◽

2021 ◽

Author(s):

Xinfeng Xie ◽

Zheng Liang ◽

Peng Gu ◽

Abanti Basak ◽

Lei Deng ◽

...

Keyword(s):

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Conflict-free symmetric sparse matrix-vector multiplication on multicore architectures

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis ◽

10.1145/3295500.3356148 ◽

2019 ◽

Author(s):

Athena Elafrou ◽

Georgios Goumas ◽

Nectarios Koziris

Keyword(s):

Sparse Matrix ◽

Multicore Architectures ◽

Matrix Vector Multiplication ◽

Sparse Matrix-Vector Multiplication on GPGPUs

ACM Transactions on Mathematical Software ◽

10.1145/3017994 ◽

2017 ◽

Vol 43 (4) ◽

pp. 1-49 ◽

Author(s):

Salvatore Filippone ◽

Valeria Cardellini ◽

Davide Barbieri ◽

Alessandro Fanfarillo

Keyword(s):

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Multicoloring for Fast Sparse Matrix-Vector Multiplication in Solving PDE Problems

1993 International Conference on Parallel Processing - ICPP'93 Vol1 ◽

10.1109/icpp.1993.119 ◽

1993 ◽

Author(s):

H.C. Wang ◽

Kai Hwang

Keyword(s):

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Performance Evaluation of Sparse Matrix-Vector Multiplication Using GPU/MIC Cluster

2015 Third International Symposium on Computing and Networking (CANDAR) ◽

10.1109/candar.2015.73 ◽

2015 ◽

Author(s):

Hiroshi Maeda ◽

Daisuke Takahashi

Keyword(s):

Performance Evaluation ◽

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Optimising Sparse Matrix Vector multiplication for large scale FEM problems on FPGA

2016 26th International Conference on Field Programmable Logic and Applications (FPL) ◽

10.1109/fpl.2016.7577352 ◽

2016 ◽

Author(s):

Paul Grigoras ◽

Pavel Burovskiy ◽

Wayne Luk ◽

Spencer Sherwin

Keyword(s):

Large Scale ◽

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Addressing Volume and Latency Overheads in 1D-parallel Sparse Matrix-Vector Multiplication

Lecture Notes in Computer Science - Euro-Par 2017: Parallel Processing ◽

10.1007/978-3-319-64203-1_45 ◽

2017 ◽

pp. 625-637

Author(s):

Seher Acer ◽

Oguz Selvitopi ◽

Cevdet Aykanat

Keyword(s):

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Image convolution optimization using sparse matrix vector multiplication technique

2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) ◽

10.1109/icacci.2016.7732252 ◽

2016 ◽

Author(s):

B Bipin ◽

Jyothisha J Nair

Keyword(s):

Sparse Matrix ◽

Matrix Vector Multiplication ◽

SIMD Parallel Sparse Matrix-Vector and Transposed-Matrix-Vector Multiplication in DD Precision

High Performance Computing for Computational Science – VECPAR 2016 - Lecture Notes in Computer Science ◽

10.1007/978-3-319-61982-8_4 ◽

2017 ◽

pp. 21-34 ◽

Author(s):

Toshiaki Hishinuma ◽

Hidehiko Hasegawa ◽

Teruo Tanaka

Keyword(s):

Sparse Matrix ◽

Matrix Vector Multiplication ◽

Acceleration of Sparse Matrix-Vector Multiplication by Region Traversal

Acta Polytechnica ◽

10.14311/1029 ◽

2008 ◽

Vol 48 (4) ◽

Author(s):

I. Šimeček

Keyword(s):

Sparse Matrix ◽

Numerical Linear Algebra ◽

Matrix Transformations ◽

Matrix Vector Multiplication ◽

Tightly Coupled ◽

Partial Multiplication ◽

Access Patterns ◽

Matrix Vector ◽

Register Blocking

Sparse matrix-vector multiplication (SpM×V for short) is one of the most common subroutines in numerical linear algebra. The problem is that memory access patterns during SpM×V are irregular, so cache utilization can suffer from low spatial or temporal locality. Approaches to improving the performance of SpM×V are based on matrix reordering and register blocking. These matrix transformations are designed to handle randomly occurring dense blocks in a sparse matrix, and their efficiency depends strongly on the presence of suitable blocks. The overhead of reorganizing a matrix from one format to another is often of the order of tens of executions of SpM×V. For this reason, such a reorganization pays off only if the same matrix A is multiplied by multiple different vectors, e.g., in iterative linear solvers.

This paper introduces an unusual approach to accelerating SpM×V. The approach can be combined with other acceleration approaches and consists of three steps:
1) dividing matrix A into non-empty regions,
2) choosing an efficient way to traverse these regions (in other words, choosing an efficient ordering of partial multiplications),
3) choosing the optimal type of storage for each region.

All three steps are tightly coupled. The first step divides the whole matrix into smaller parts (regions) that can fit in the cache. The second step improves locality during multiplication through better utilization of distant references. The last step maximizes the machine computation performance of the partial multiplication for each region.

In this paper, we describe aspects of these three steps in more detail (including fast, low-cost algorithms for all steps). Our measurements show that our approach gives a significant speedup for almost all matrices arising from various technical areas.
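
The three steps named in the abstract can be illustrated with a small Python sketch. This is not the paper's algorithm, which the abstract does not spell out. It assumes fixed-size square regions, a plain row-major traversal of regions, and a fill-ratio threshold for picking dense versus coordinate storage. All function names (`split_into_regions`, `choose_storage`, `build_regions`, `spmv_by_regions`) and the `dense_threshold` parameter are hypothetical.

```python
from collections import defaultdict


def split_into_regions(entries, region_size):
    """Step 1 (sketch): group nonzeros (row, col, value) by square region.

    Only regions that contain at least one nonzero get a key.
    """
    regions = defaultdict(list)
    for r, c, v in entries:
        regions[(r // region_size, c // region_size)].append((r, c, v))
    return regions


def choose_storage(items, region_size, dense_threshold=0.5):
    """Step 3 (sketch): pick a storage type from the region's fill ratio.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    fill = len(items) / (region_size * region_size)
    return "dense" if fill >= dense_threshold else "coo"


def build_regions(entries, region_size, dense_threshold=0.5):
    """Store each non-empty region in the format chosen for it."""
    built = {}
    for (br, bc), items in split_into_regions(entries, region_size).items():
        if choose_storage(items, region_size, dense_threshold) == "dense":
            block = [[0.0] * region_size for _ in range(region_size)]
            for r, c, v in items:
                block[r - br * region_size][c - bc * region_size] += v
            built[(br, bc)] = ("dense", block)
        else:
            built[(br, bc)] = ("coo", items)
    return built


def spmv_by_regions(built, x, n_rows, region_size):
    """Step 2 (sketch): visit regions in row-major order (an assumption)
    and accumulate each region's partial product into y."""
    y = [0.0] * n_rows
    for br, bc in sorted(built):
        kind, data = built[(br, bc)]
        r0, c0 = br * region_size, bc * region_size
        if kind == "dense":
            n_r = min(region_size, n_rows - r0)   # clip at the matrix edge
            n_c = min(region_size, len(x) - c0)
            for i in range(n_r):
                acc = 0.0
                for j in range(n_c):
                    acc += data[i][j] * x[c0 + j]
                y[r0 + i] += acc
        else:
            for r, c, v in data:
                y[r] += v * x[c]
    return y


# A 4x4 example: one fully dense 2x2 region plus two isolated nonzeros.
entries = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0),
           (2, 3, 5.0), (3, 0, 6.0)]
built = build_regions(entries, region_size=2)
y = spmv_by_regions(built, [1.0, 2.0, 3.0, 4.0], n_rows=4, region_size=2)
print(y)  # [5.0, 11.0, 20.0, 6.0]
```

In this sketch the traversal order is the only place where locality is decided, so a different ordering of the region keys could be swapped in without touching the per-region multiplication. The preprocessing in `build_regions` is the kind of one-time reorganization cost that, as the abstract notes, pays off only when the same matrix is multiplied by many vectors.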
