A 13.7 TFLOPS/W Floating-point DNN Processor using Heterogeneous Computing Architecture with Exponent-Computing-in-Memory

Author(s):  
Juhyoung Lee ◽  
Jihoon Kim ◽  
Wooyoung Jo ◽  
Sangyeob Kim ◽  
Sangjin Kim ◽  
...  
Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3409 ◽  
Author(s):  
Shiyu Wang ◽  
Shengbing Zhang ◽  
Xiaoping Huang ◽  
Jianfeng An ◽  
Libo Chang

The expansion and improvement of synthetic aperture radar (SAR) technology have greatly enhanced its practicality. SAR imaging requires real-time processing of large input images under a limited power budget. Designing a dedicated heterogeneous array processor is an effective way to meet the power consumption constraints and real-time processing requirements of an application system. In this paper, taking a commonly used algorithm for SAR imaging, the chirp scaling algorithm (CSA), as an example, the characteristics of each calculation stage in the SAR imaging process are analyzed, and the data flow model of SAR imaging is extracted. A heterogeneous array architecture for SAR imaging that effectively supports fast Fourier transform/inverse fast Fourier transform (FFT/IFFT) and phase compensation operations is proposed. First, the heterogeneous array architecture consists of fixed-point PE units and floating-point FPE units, designed respectively for the FFT/IFFT and phase compensation operations, which increases energy efficiency by 50% compared with an architecture using floating-point units. Second, data cross-placement and simultaneous access strategies are proposed to support the intra-block parallel processing of SAR block imaging, achieving up to 115.2 GOPS throughput. Third, a resource management strategy for heterogeneous computing arrays is designed, which supports pipelined processing of the FFT/IFFT and phase compensation operations, improving PE utilization by a factor of 1.82 and increasing energy efficiency by a factor of 1.5. Experimental results show that the processor, implemented in 65-nm technology, achieves an energy efficiency of up to 254 GOPS/W. The imaging fidelity and accuracy of the proposed processor were verified by evaluating the image quality of an actual scene.
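The FFT, phase compensation, and IFFT data flow described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names (`fft`, `ifft`, `phase_compensate`, `process_line`) are hypothetical, the phase arrays are placeholders rather than the CSA's actual phase functions, and ordinary floating-point arithmetic is used throughout, so the sketch does not model the fixed-point PE units or the paper's hardware pipeline.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft(x):
    """Inverse FFT with 1/n scaling."""
    n = len(x)
    return [v / n for v in fft(x, inverse=True)]


def phase_compensate(spectrum, phase):
    """Pointwise multiply each frequency bin by exp(j * phase[k])."""
    return [s * cmath.exp(1j * p) for s, p in zip(spectrum, phase)]


def process_line(samples, phase):
    """One FFT -> phase compensation -> IFFT pass over a single data line."""
    return ifft(phase_compensate(fft(samples), phase))
```

As a sanity check on the sketch, an all-zero phase returns the input line unchanged, while a linear phase of `-2 * math.pi * k * d / n` circularly shifts the line by `d` samples. This is the standard Fourier shift property, not a result from the paper.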


2014 ◽  
Vol 15 (1) ◽  
pp. 216 ◽  
Author(s):  
Davor Sluga ◽  
Tomaz Curk ◽  
Blaz Zupan ◽  
Uros Lotric
