A high-performance matrix-multiplication algorithm on a distributed-memory parallel computer, using overlapped communication
1994 ◽ Vol 38 (6) ◽ pp. 673-681
1998 ◽ Vol 10 (8) ◽ pp. 655-670