Latency Analysis in the 2-Dimensional Systolic Arrays for Matrix Multiplication

2021 ◽  
Vol 15 ◽  
pp. 1-7
Author(s):  
Halil Snopce ◽  
Azir Aliu

This paper analyzes the latency of a two-dimensional systolic array for matrix multiplication. The latency for all possible connection schemes is discussed, which yields the lower bound on the latency achievable with such arrays.
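To make the notion of latency concrete, the following is a minimal sketch of one common connection scheme, an output-stationary array with skewed inputs. It is an illustrative assumption and not necessarily one of the schemes the paper analyzes. The function name `systolic_matmul` and the register layout are hypothetical.

```python
def systolic_matmul(A, B):
    """Simulate an m x n output-stationary systolic array computing C = A @ B.

    Assumed scheme (not taken from the paper): row i of A enters from the
    left delayed by i cycles, column j of B enters from the top delayed by
    j cycles, and each PE accumulates one element of C in place.
    Returns (C, latency), where latency is the number of cycles up to and
    including the last multiply-accumulate.
    """
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    a_reg = [[None] * n for _ in range(m)]  # A operand held in each PE
    b_reg = [[None] * n for _ in range(m)]  # B operand held in each PE
    latency = 0

    # m + n + k cycles is a safe upper bound on the simulation length.
    for t in range(m + n + k):
        new_a = [[None] * n for _ in range(m)]
        new_b = [[None] * n for _ in range(m)]
        for i in range(m):
            for j in range(n):
                if j == 0:
                    s = t - i  # skewed injection at the left edge
                    new_a[i][j] = A[i][s] if 0 <= s < k else None
                else:
                    new_a[i][j] = a_reg[i][j - 1]  # shift right by one PE
                if i == 0:
                    s = t - j  # skewed injection at the top edge
                    new_b[i][j] = B[s][j] if 0 <= s < k else None
                else:
                    new_b[i][j] = b_reg[i - 1][j]  # shift down by one PE
        a_reg, b_reg = new_a, new_b

        # Every PE holding a valid operand pair performs one MAC this cycle.
        for i in range(m):
            for j in range(n):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    C[i][j] += a_reg[i][j] * b_reg[i][j]
                    latency = t + 1
    return C, latency


C, latency = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C, latency)
```

Under this particular scheme the last multiply-accumulate happens in PE (m-1, n-1), so the latency is m + n + k - 2 cycles, or 3n - 2 for square n × n matrices. Other connection schemes give different values.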

2021 ◽  
Vol 20 (5s) ◽  
pp. 1-20
Author(s):  
Hyungmin Cho

Depthwise convolutions are widely used in convolutional neural networks (CNNs) targeting mobile and embedded systems. Depthwise convolution layers reduce the computational load and the number of parameters compared to conventional convolution layers. Many deep neural network (DNN) accelerators adopt an architecture that exploits the high data-reuse factor of DNN computations, such as a systolic array. However, depthwise convolutions have a low data-reuse factor and under-utilize the processing elements (PEs) in systolic arrays. In this paper, we present a DNN accelerator design called RiSA, which provides a novel mechanism that boosts the PE utilization for depthwise convolutions on a systolic array with minimal overheads. In addition, the PEs in systolic arrays can be used efficiently only if the data items (tensors) are arranged in the desired layout. Typical DNN accelerators provide various types of PE interconnects or additional modules to flexibly rearrange the data items and manage data movements during DNN computations. RiSA provides a lightweight set of tensor management tasks within the PE array itself that eliminates the need for an additional module for tensor reshaping tasks. Using this embedded tensor reshaping, RiSA supports various DNN models, including convolutional neural networks and natural language processing models, while maintaining a high area efficiency. Compared to Eyeriss v2, RiSA improves the area and energy efficiency for MobileNet-V1 inference by 1.91× and 1.31×, respectively.
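The reduction in computation and parameters, and the lower data reuse, can be illustrated with the standard cost formulas for convolution layers. This is a minimal sketch that assumes stride 1, "same" padding, and no bias terms. The function name `conv_costs` and the example layer shape are hypothetical and are not taken from the paper.

```python
def conv_costs(h, w, c_in, c_out, k):
    """Count multiply-accumulates (MACs) and weights for an h x w feature map.

    Assumptions: stride 1, 'same' padding (output is also h x w), no bias.
    Compares a standard k x k convolution with a depthwise k x k convolution
    followed by a 1 x 1 pointwise convolution.
    """
    return {
        # Standard: every output channel reads every input channel.
        "standard_macs": h * w * c_in * c_out * k * k,
        "standard_params": c_in * c_out * k * k,
        # Depthwise: each input channel has its own single k x k filter.
        "depthwise_macs": h * w * c_in * k * k,
        "depthwise_params": c_in * k * k,
        # Pointwise: 1 x 1 convolution that mixes channels.
        "pointwise_macs": h * w * c_in * c_out,
        "pointwise_params": c_in * c_out,
        # Approximate MACs per input activation (ignoring borders),
        # a rough proxy for how often each input value is reused.
        "standard_reuse": c_out * k * k,
        "depthwise_reuse": k * k,
    }


costs = conv_costs(h=14, w=14, c_in=512, c_out=512, k=3)
separable = costs["depthwise_macs"] + costs["pointwise_macs"]
print(f"MAC ratio (separable / standard): {separable / costs['standard_macs']:.3f}")
print(f"reuse per input: standard={costs['standard_reuse']}, "
      f"depthwise={costs['depthwise_reuse']}")
```

In a depthwise layer each input activation feeds only k × k MACs instead of c_out × k × k, which is one way to see why an array built around high operand reuse is left under-utilized.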


2011 ◽  
Vol 03 (01) ◽  
pp. 77-86
Author(s):  
DRAGAN M. RANDJELOVIĆ

The objective of this paper is to discuss systolic arrays (SAs) that are suitable for implementing regular three-nested-loop algorithms and that enable high-dependability calculations on the SAs obtained in this way. This is done by considering the different possible values of the processor flow period for SAs synthesized on adaptable algorithms. The multiplication of two matrices is one typical adaptable algorithm, and the results are illustrated at the end of the paper by examples of rectangular matrix multiplication realized on the so-called flowing and hexagonal two-dimensional (2D) SAs of the planar SA group.


1997 ◽  
Vol 33 (6) ◽  
pp. 17-35 ◽  
Author(s):  
I.Z. Milentijević ◽  
I.Ž. Milovanović ◽  
E.I. Milovanović ◽  
M.K. Stojčev
