Latin hypercube initialization strategy for design space exploration of deep neural network architectures

Author(s):  
Heitor R. Medeiros ◽  
Diogo M. F. Izidio ◽  
Antonyus P. do A. Ferreira ◽  
Edna N. da S. Barros
Author(s):  
Tao Yang ◽  
Yadong Wei ◽  
Zhijun Tu ◽  
Haolun Zeng ◽  
Michel A. Kinsy ◽  
...  

Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1921
Author(s):  
Hongmin Huang ◽  
Zihao Liu ◽  
Taosheng Chen ◽  
Xianghong Hu ◽  
Qiming Zhang ◽  
...  

The You Only Look Once (YOLO) neural network offers great advantages and has extensive applications in computer vision. The convolutional layers are the most important part of the network and account for most of its computation time, so improving the efficiency of the convolution operations can greatly increase the speed of the neural network. Field-programmable gate arrays (FPGAs) have been widely used in accelerators for convolutional neural networks (CNNs) thanks to their configurability and parallel computing capability. This paper proposes a design space exploration for the YOLO neural network on FPGA. A data block transmission strategy is proposed, and a multiply-and-accumulate (MAC) unit consisting of two 14 × 14 processing element (PE) matrices is designed; the PE matrices are configurable for different CNNs according to the required functions. To take full advantage of the limited logic resources and memory bandwidth of the given FPGA device while simultaneously achieving the best performance, an improved roofline model is used to evaluate the hardware design and balance computing throughput against the memory bandwidth requirement. The accelerator achieves 41.99 giga operations per second (GOPS) and consumes 7.50 W running at a frequency of 100 MHz on the Xilinx ZC706 board.
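The roofline evaluation mentioned in the abstract can be sketched in a few lines: attainable throughput is bounded by the minimum of the device's peak compute and what the memory system can supply at a given operational intensity. This is a minimal illustration of the general roofline idea, not the paper's improved model; all design points and bandwidth numbers below are invented for the example.

```python
def attainable_gops(computational_roof, bandwidth_roof, intensity):
    """Basic roofline: min(peak compute, bandwidth x operational intensity).

    computational_roof -- peak throughput of the PE array (GOPS)
    bandwidth_roof     -- off-chip memory bandwidth (GB/s)
    intensity          -- operations performed per byte moved (ops/byte)
    """
    return min(computational_roof, bandwidth_roof * intensity)

# Hypothetical design points: (name, peak GOPS, ops/byte).
# Larger tiles reuse on-chip data more, raising operational intensity.
candidates = [
    ("small tiles", 30.0, 2.0),   # low reuse -> likely memory bound
    ("large tiles", 50.0, 8.0),   # high reuse -> likely compute bound
]

BANDWIDTH = 4.0  # GB/s, an assumed DDR bandwidth for illustration

for name, roof, oi in candidates:
    gops = attainable_gops(roof, BANDWIDTH, oi)
    print(f"{name}: {gops:.1f} GOPS attainable")
```

Ranking candidate tilings this way lets a design space exploration discard memory-bound configurations before committing logic resources to them.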
