GA driven integrated exploration of loop unrolling factor and datapath for optimal scheduling of CDFGs during high level synthesis

Author(s):  
Pallabi Sarkar ◽  
Anirban Sengupta ◽  
Mrinal Kanti Naskar
10.29007/x3tx ◽  
2019 ◽  
Author(s):  
Luka Daoud ◽  
Fady Hussein ◽  
Nader Rafla

Advanced Encryption Standard (AES) represents a fundamental building module of many network security protocols to ensure data confidentiality in various applications ranging from data servers to low-power hardware embedded systems. In order to optimize such hardware implementations, High-Level Synthesis (HLS) provides flexibility in designing and rapid optimization of dedicated hardware to meet the design constraints. In this paper, we present the implementation of AES encryption processor on FPGA using Xilinx Vivado HLS. The AES architecture was analyzed and designed by loop unrolling, and inner-round and outer-round pipelining techniques to achieve a maximum throughput of the AES algorithm up to 1290 Mbps (Mega bit per second) with very significant low resources of 3.24% slices of the FPGA, achieving 3 Mbps per slice area.


2021 ◽  
Vol 29 (2) ◽  
Author(s):  
Panadda Solod ◽  
Nattha Jindapetch ◽  
Kiattisak Sengchuai ◽  
Apidet Booranawong ◽  
Pakpoom Hoyingcharoen ◽  
...  

In this work, we proposed High-Level Synthesis (HLS) optimization processes to improve the speed and the resource usage of complex algorithms, especially nested-loop. The proposed HLS optimization processes are divided into four steps: array sizing is performed to decrease the resource usage on Programmable Logic (PL) part, loop analysis is performed to determine which loop must be loop unrolling or loop pipelining, array partitioning is performed to resolve the bottleneck of loop unrolling and loop pipelining, and HLS interface is performed to select the best block level and port level interface for array argument of RTL design. A case study road lane detection was analyzed and applied with suitable optimization techniques to implement on the Xilinx Zynq-7000 family (Zybo ZC7010-1) which was a low-cost FPGA. From the experimental results, our proposed method reaches 6.66 times faster than the primitive method at clock frequency 100 MHz or about 6 FPS. Although the proposed methods cannot reach the standard real-time (25 FPS), they can instruct HLS developers for speed increasing and resource decreasing on an FPGA.


Author(s):  
Akira OHCHI ◽  
Nozomu TOGAWA ◽  
Masao YANAGISAWA ◽  
Tatsuo OHTSUKI

Sign in / Sign up

Export Citation Format

Share Document