Design space exploration for layer-parallel execution of convolutional neural networks on CGRAs

Author(s):  
Christian Heidorn ◽  
Frank Hannig ◽  
Jürgen Teich
Author(s):  
Abeer Al-Hyari ◽  
Shawki Areibi

This paper proposes a framework for design space exploration of Convolutional Neural Networks (CNNs) using Genetic Algorithms (GAs). CNNs have many hyperparameters that need to be tuned carefully in order to achieve favorable results when used for image classification tasks or similar vision applications. Genetic Algorithms are adopted to efficiently traverse the huge search space of CNN hyperparameters and generate the best architecture that fits the given task. Some of the hyperparameters that were tested include the number of convolutional and fully connected layers, the number of filters for each convolutional layer, and the number of nodes in the fully connected layers. The proposed approach was tested using the MNIST dataset for handwritten digit classification, and the results obtained indicate that it is able to generate a CNN architecture with validation accuracy up to 96.66% on average.
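To make the idea of a genetic search over CNN hyperparameters concrete, here is a minimal sketch in Python. It is not the authors' implementation: the search-space bounds, the operators (truncation selection, uniform crossover, random-reset mutation), and every function name are assumptions chosen for illustration. In particular, `toy_fitness` is a placeholder for the validation accuracy that would normally come from training each candidate network.

```python
import random

# Hypothetical search space; the bounds are illustrative, not taken from the paper.
SPACE = {
    "conv_layers": (1, 4),
    "filters": (8, 64),
    "fc_layers": (1, 3),
    "fc_nodes": (16, 256),
}


def random_genome(rng):
    """Draw one candidate architecture uniformly from the search space."""
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SPACE.items()}


def toy_fitness(genome):
    """Placeholder score (assumption): closeness to an arbitrary target.

    A real run would train the CNN described by `genome` and return its
    validation accuracy instead.
    """
    target = {"conv_layers": 2, "filters": 32, "fc_layers": 2, "fc_nodes": 128}
    return -sum(
        abs(genome[k] - target[k]) / (SPACE[k][1] - SPACE[k][0]) for k in SPACE
    )


def crossover(a, b, rng):
    """Uniform crossover: each gene comes from either parent."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}


def mutate(genome, rng, rate=0.2):
    """Random-reset mutation: resample a gene with probability `rate`."""
    child = dict(genome)
    for k, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            child[k] = rng.randint(lo, hi)
    return child


def evolve(fitness, pop_size=20, generations=30, seed=0):
    """Evolve a population and return the best genome found."""
    rng = random.Random(seed)
    pop = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the elite
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)


best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

Because the top half of each generation is carried over unchanged, the best score never decreases from one generation to the next. Replacing `toy_fitness` with a function that trains and validates a network is the expensive step in practice.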


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2200
Author(s):  
Alireza Ghaffari ◽  
Yvon Savaria

Convolutional Neural Networks (CNNs) have a major impact on our society because of the numerous services they provide. These services include, but are not limited to, image classification, video analysis, and speech recognition. Recently, the number of studies that use FPGAs to implement CNNs has been increasing rapidly, owing to the lower power consumption and easy reconfigurability that these platforms offer. As a result of the research effort put into topics such as architecture, synthesis, and optimization, new challenges are arising in integrating suitable hardware solutions with high-level machine learning software libraries. This paper introduces an integrated framework (CNN2Gate), which supports compilation of a CNN model for an FPGA target. CNN2Gate is capable of parsing CNN models from several popular high-level machine learning libraries, such as Keras, PyTorch, and Caffe2. CNN2Gate extracts the computation flow of layers, in addition to weights and biases, and applies a “given” fixed-point quantization. Furthermore, it writes this information in the proper format for the FPGA vendor’s OpenCL synthesis tools, which are then used to build and run the project on the FPGA. CNN2Gate performs design-space exploration and automatically fits the design on different FPGAs with limited logic resources. This paper reports results of automatic synthesis and design-space exploration of AlexNet and VGG-16 on various Intel FPGA platforms.
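As an illustration of what applying a given fixed-point quantization to weights can look like, the following Python sketch converts floating-point values to signed fixed-point codes and back. It is a generic example, not CNN2Gate's code: the function names and the default format (8 bits in total, 4 of them fractional) are assumptions.

```python
def quantize_fixed_point(values, total_bits=8, frac_bits=4):
    """Map floats to signed fixed-point integer codes, saturating on overflow.

    The format (total_bits, frac_bits) is supplied by the caller; the
    defaults here are illustrative only.
    """
    scale = 1 << frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    # Scale, round to the nearest integer code, then clamp to the signed range.
    return [max(qmin, min(qmax, round(v * scale))) for v in values]


def dequantize(codes, frac_bits=4):
    """Recover the real values represented by the integer codes."""
    scale = 1 << frac_bits
    return [c / scale for c in codes]


weights = [0.5, -0.25, 100.0, -100.0]
codes = quantize_fixed_point(weights)
print(codes)              # [8, -4, 127, -128]
print(dequantize(codes))  # [0.5, -0.25, 7.9375, -8.0]
```

Values outside the representable range saturate at the largest or smallest code, which is why 100.0 and -100.0 come back as 7.9375 and -8.0.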


2011 ◽  
Vol 467-469 ◽  
pp. 812-817 ◽  
Author(s):  
Dan Zhang ◽  
Rong Cai Zhao ◽  
Lin Han ◽  
Wei Fang Liang ◽  
Jin Qu ◽  
...  

Using FPGAs for general-purpose computation has become a hot research topic in high-performance computing. However, the design complexity and limited resources of FPGAs make it impossible to apply a single common approach to problems with mixed constraints. Targeting the common loop structures of applications, a design space exploration method based on FPGA hardware constraints is proposed. The method combines the features of the FPGA chip with those of the application to perform loop optimization that reduces memory demand. Experimental results show that the method significantly improves the rate of data reuse, reduces the number of external memory accesses, achieves parallel execution of multiple pipelines, and effectively improves the performance of applications implemented on FPGAs.
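The abstract does not say which loop transformations are used. As one common example of a loop optimization that raises data reuse and lowers external memory traffic, the sketch below compares a naive matrix multiplication with a tiled one in Python. Loop tiling is a swapped-in technique chosen for illustration, and the load counter is a simplified model (assumption): every operand read in the naive version counts as an external access, while the tiled version pays only for copying each tile into a small local buffer.

```python
def matmul_naive(A, B):
    """Multiply square matrices, counting every operand read as an external load."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    loads = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                loads += 2  # one element of A and one of B
    return C, loads


def matmul_tiled(A, B, tile):
    """Multiply square matrices tile by tile, reusing tiles held in local buffers."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    loads = 0
    for ii in range(0, n, tile):
        for jj in range(0, n, tile):
            for kk in range(0, n, tile):
                # Copy one tile of A and one tile of B into local buffers.
                a_buf = [row[kk:kk + tile] for row in A[ii:ii + tile]]
                b_buf = [row[jj:jj + tile] for row in B[kk:kk + tile]]
                loads += sum(len(r) for r in a_buf) + sum(len(r) for r in b_buf)
                # All arithmetic below touches only the local buffers.
                for i, a_row in enumerate(a_buf):
                    for k, a in enumerate(a_row):
                        for j, b in enumerate(b_buf[k]):
                            C[ii + i][jj + j] += a * b
    return C, loads


A = [[i * 4 + j for j in range(4)] for i in range(4)]
B = [[(i + j) % 3 for j in range(4)] for i in range(4)]
C_naive, loads_naive = matmul_naive(A, B)
C_tiled, loads_tiled = matmul_tiled(A, B, tile=2)
print(C_naive == C_tiled, loads_naive, loads_tiled)  # True 128 64
```

Under this model the external load count falls from 2n³ to 2n³/T for tile size T, at the cost of 2T² elements of local buffer. On an FPGA, that buffer size would be bounded by the available on-chip memory.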


2018 ◽  
Vol 6 (2) ◽  
pp. 37-49 ◽  
Author(s):  
Kohei Fujisawa ◽  
Atsushi Nunome ◽  
Kiyoshi Shibayama ◽  
Hiroaki Hirata

To enlarge the opportunities for parallelizing a sequentially coded program, the authors have previously proposed speculative memory (SM). With SM, they can start the parallel execution of a program by assuming that it does not violate the data dependencies in the program. When the SM system detects a violation, it recovers the computational state of the program and restarts the execution. In this article, the authors explore the design space for implementing a software-based SM system. They compare the possible choices from the following three viewpoints: (1) which waiting scheme, suspending or busy-waiting, should be used, (2) when a speculative thread should be committed, and (3) which version of data a speculative thread should read. The results show that the busy-waiting system that makes speculative threads commit early and read non-speculative values performs better than the others.
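The speculate, detect, recover, and restart cycle described above can be modeled in a few lines of single-threaded Python. This is a toy sketch and not the authors' SM system: the class and function names are hypothetical, violations are detected by comparing per-location version counters at commit time (an assumption), and the `interfere` callback merely simulates a conflicting write from another thread during the first attempt.

```python
class VersionedMemory:
    """Shared memory with a version counter per location (hypothetical model)."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {k: 0 for k in data}

    def write(self, key, value):
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1


def run_speculative(mem, task, interfere=None):
    """Run `task` speculatively; restart it whenever a read turns out stale.

    Returns the number of attempts needed before the commit succeeded.
    """
    attempts = 0
    while True:
        attempts += 1
        read_set = {}   # location -> version observed on first read
        write_buf = {}  # speculative writes, invisible until commit

        def read(key):
            if key in write_buf:
                return write_buf[key]
            read_set.setdefault(key, mem.version[key])
            return mem.data[key]

        def write(key, value):
            write_buf[key] = value

        task(read, write)
        if interfere is not None and attempts == 1:
            interfere(mem)  # simulated conflicting write by another thread

        # Commit only if every location read is still at the observed version.
        if all(mem.version[k] == v for k, v in read_set.items()):
            for k, v in write_buf.items():
                mem.write(k, v)
            return attempts
        # Violation detected: drop the buffered writes and restart the task.


def task(read, write):
    write("y", read("x") + 1)


mem = VersionedMemory({"x": 1, "y": 0})
attempts = run_speculative(mem, task, interfere=lambda m: m.write("x", 10))
print(attempts, mem.data["y"])  # 2 11
```

Buffering writes until commit makes recovery trivial here, since a failed attempt leaves shared memory untouched. The design choices the article compares, such as when to commit and which version a speculative thread reads, correspond to varying the commit check and the `read` function in a model like this one.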

