Systematic realization of a fully connected deep and convolutional neural network architecture on a field programmable gate array

2022, Vol. 97, pp. 107628
Author(s): Anand Kumar Mukhopadhyay, Sampurna Majumder, Indrajit Chakrabarti
Author(s): Chong Wang, Yu Jiang, Kai Wang, Fenglin Wei

Subsea pipelines are the safest, most reliable, and most economical way to transport oil and gas from an offshore platform to an onshore terminal. However, pipelines may rupture under harsh working conditions, causing oil and gas leakage. This calls for a proper device and method to detect the state of subsea pipelines in a timely and precise manner. An autonomous underwater vehicle carrying side-scan sonar offers a desirable way to detect targets in the complex undersea environment. This article therefore combines a field-programmable gate array, which features high throughput, low energy consumption, and a high degree of parallelism, with a convolutional neural network in a sonar image recognition system. First, a training set was constructed by screening and splitting the sonar images collected by the sensors and labeling them one by one. Next, a convolutional neural network model was trained on this set on a workstation platform. The trained model was then integrated into the field-programmable gate array system and applied to recognize actual datasets, and the recognition results were compared with those of the workstation platform. The comparison shows that the computational precision of the designed field-programmable gate array system is equivalent to that of the workstation platform, while the system reduces recognition time by more than 77% and energy consumption by more than 96.67%. Our system therefore satisfies the demand for energy-efficient, real-time, and accurate recognition of sonar images.
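The core operation such a system accelerates is the convolutional layer, which an FPGA can parallelize across output pixels and channels. As a minimal sketch (not the authors' implementation; the patch and kernel sizes are illustrative assumptions), the forward pass of one convolution–ReLU–pooling stage on a sonar image patch can be written as:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2D convolution (cross-correlation): the inner multiply-accumulate
    loop is what an FPGA pipeline unrolls across output pixels."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling."""
    h, w = x.shape[0] // s * s, x.shape[1] // s * s
    return x[:h, :w].reshape(h // s, s, w // s, s).max(axis=(1, 3))

rng = np.random.default_rng(0)
patch = rng.standard_normal((8, 8))    # stand-in for a sonar image patch
kernel = rng.standard_normal((3, 3))   # one trained filter
feat = max_pool(relu(conv2d(patch, kernel)))
print(feat.shape)  # (3, 3)
```

Stacking several such stages and a final fully connected classifier gives the inference pipeline whose trained weights are loaded onto the FPGA.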


2022, Vol. 15 (3), pp. 1-25
Author(s): Stefan Brennsteiner, Tughrul Arslan, John Thompson, Andrew McCormick

Machine learning in the physical layer of communication systems holds the potential to improve performance and simplify design methodology. Many algorithms have been proposed; however, their model complexity is often infeasible for real-time deployment, and the real-time processing capability of these systems has not yet been proven. In this work, we propose a novel, less complex, fully connected neural network to perform channel estimation and signal detection in an orthogonal frequency division multiplexing system. The memory requirement, which is often the bottleneck for fully connected neural networks, is reduced by ≈27 times by applying known compression techniques in a three-step training process. Extensive experiments were performed on pruning and quantizing the weights of the neural network detector, and Huffman encoding was applied to the weights to further reduce the memory requirement. Based on this approach, we propose the first field-programmable gate array-based, real-time-capable neural network accelerator specifically designed to accelerate the orthogonal frequency division multiplexing detector workload. The accelerator is synthesized for a Xilinx RFSoC field-programmable gate array, uses small-batch processing to increase throughput, efficiently supports branching neural networks, and implements superscalar Huffman decoders.
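The three compression steps described above (pruning, quantization, and Huffman encoding of the weights) can be sketched end to end as follows; the sparsity level, bit width, and random weight distribution are illustrative assumptions, not the paper's settings, and the resulting compression factor is only indicative:

```python
import heapq
from collections import Counter
import numpy as np

def prune(w, sparsity=0.9):
    """Magnitude pruning: zero out the smallest-magnitude weights."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < thresh, 0.0, w)

def quantize(w, bits=4):
    """Uniform symmetric quantization to signed integer levels."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale).astype(int), scale

def huffman_lengths(freq):
    """Huffman code length per symbol, built by merging the two
    least-frequent subtrees until one tree remains."""
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, id(merged), merged))
    return heap[0][2]

rng = np.random.default_rng(1)
w = rng.standard_normal(10_000).astype(np.float32)  # toy weight matrix
q, scale = quantize(prune(w), bits=4)
freq = Counter(q.tolist())
lengths = huffman_lengths(freq)
bits = sum(freq[s] * l for s, l in lengths.items())
print(f"compression vs. float32: {w.size * 32 / bits:.1f}x")
```

Because pruning makes the zero symbol dominate, Huffman coding assigns it a very short code, which is where most of the memory saving comes from.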


2019, Vol. 2019, pp. 1-13
Author(s): Gianmarco Dinelli, Gabriele Meoni, Emilio Rapuano, Gionata Benelli, Luca Fanucci

In recent years, convolutional neural networks have been used for many applications thanks to their ability to carry out tasks with fewer parameters than other deep learning approaches. However, the power consumption and memory footprint constraints typical of edge and portable applications usually collide with accuracy and latency requirements. For this reason, commercial hardware accelerators have become popular, thanks to architectures designed for the inference of general convolutional neural network models. Nevertheless, field-programmable gate arrays represent an interesting alternative, since they offer the possibility to implement a hardware architecture tailored to a specific convolutional neural network model, with promising results in terms of latency and power consumption. In this article, we propose a fully on-chip field-programmable gate array hardware accelerator for a separable convolutional neural network designed for a keyword spotting application. We started from the model implemented in a previous work for the Intel Movidius Neural Compute Stick. For our goals, we quantized the model through a bit-true simulation and realized a dedicated architecture that uses on-chip memories exclusively. A benchmark comparing the results on different field-programmable gate array families by Xilinx and Intel with the implementation on the Neural Compute Stick was carried out. The analysis shows that the FPGA solution achieves better inference time and energy per inference with comparable accuracy, at the expense of a higher design effort and longer development time.
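The bit-true quantization step can be illustrated with a small fixed-point simulation of a depthwise-separable layer (depthwise 3×3 followed by pointwise 1×1), the kind of integer-only datapath an FPGA implements; all shapes, Q-formats, and value ranges below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def to_fixed(x, frac_bits=8):
    """Round to signed fixed-point integers with frac_bits fractional bits."""
    return np.round(x * (1 << frac_bits)).astype(np.int64)

def depthwise_separable_q(x, dw, pw, frac_bits=8):
    """Depthwise 3x3 conv then pointwise 1x1 conv, computed entirely in
    integers with right-shift rescaling, as an FPGA datapath would."""
    C, H, W = x.shape
    xq, dwq, pwq = (to_fixed(a, frac_bits) for a in (x, dw, pw))
    Ho, Wo = H - 2, W - 2
    acc = np.zeros((C, Ho, Wo), dtype=np.int64)
    for c in range(C):                      # one 3x3 kernel per channel
        for i in range(Ho):
            for j in range(Wo):
                acc[c, i, j] = np.sum(xq[c, i:i + 3, j:j + 3] * dwq[c])
    acc >>= frac_bits                       # rescale product back to Q-format
    out = np.tensordot(pwq, acc, axes=([1], [0]))  # 1x1 channel mixing
    return (out >> frac_bits) / float(1 << frac_bits)  # float for comparison

def depthwise_separable_f(x, dw, pw):
    """Floating-point reference for checking the bit-true simulation."""
    C, H, W = x.shape
    mid = np.zeros((C, H - 2, W - 2))
    for c in range(C):
        for i in range(H - 2):
            for j in range(W - 2):
                mid[c, i, j] = np.sum(x[c, i:i + 3, j:j + 3] * dw[c])
    return np.tensordot(pw, mid, axes=([1], [0]))

rng = np.random.default_rng(2)
x = 0.5 * rng.standard_normal((4, 8, 8))   # 4 input channels
dw = 0.5 * rng.standard_normal((4, 3, 3))  # depthwise kernels
pw = 0.5 * rng.standard_normal((2, 4))     # pointwise: 4 -> 2 channels
q = depthwise_separable_q(x, dw, pw)
ref = depthwise_separable_f(x, dw, pw)
print(q.shape, float(np.max(np.abs(q - ref))))
```

Comparing the integer result against the float reference is exactly what a bit-true simulation does: it quantifies the precision loss before committing the architecture to hardware.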

