Systematic realization of a fully connected deep and convolutional neural network architecture on a field programmable gate array

2022, Vol. 97, pp. 107628
Author(s): Anand Kumar Mukhopadhyay, Sampurna Majumder, Indrajit Chakrabarti
Author(s): Chong Wang, Yu Jiang, Kai Wang, Fenglin Wei

Subsea pipelines are the safest, most reliable, and most economical way to transport oil and gas from an offshore platform to an onshore terminal. However, pipelines may rupture under harsh working conditions, causing oil and gas leakage. This calls for a proper device and method to detect the state of subsea pipelines in a timely and precise manner. An autonomous underwater vehicle carrying side-scan sonar offers a desirable way to detect targets in the complex undersea environment. This article therefore combines a field-programmable gate array, which features high throughput, low energy consumption, and a high degree of parallelism, with a convolutional neural network in a sonar image recognition system. First, a training set was constructed by screening and splitting the sonar images collected by the sensors and labeling them one by one. Next, a convolutional neural network model was trained on this set on a workstation platform. The trained model was then integrated into the field-programmable gate array system and applied to recognize actual datasets, and the recognition results were compared with those of the workstation platform. The comparison shows that the computational precision of the designed field-programmable gate array system is equivalent to that of the workstation platform, while the system reduces recognition time by more than 77% and energy consumption by more than 96.67%. Our system therefore satisfies the demand for energy-efficient, real-time, and accurate recognition of sonar images.
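The core operation such a system accelerates is the convolutional layer, which an FPGA can parallelize across output pixels and channels. As a minimal sketch (not the authors' implementation; the patch and kernel sizes are illustrative assumptions), the forward pass of one convolution–ReLU–pooling stage on a sonar image patch can be written as:

```python
import numpy as np

def conv2d(x, k):
    """Valid 2D convolution (cross-correlation): the inner multiply-accumulate
    loop is what an FPGA pipeline unrolls across output pixels."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling."""
    h, w = x.shape[0] // s * s, x.shape[1] // s * s
    return x[:h, :w].reshape(h // s, s, w // s, s).max(axis=(1, 3))

rng = np.random.default_rng(0)
patch = rng.standard_normal((8, 8))    # stand-in for a sonar image patch
kernel = rng.standard_normal((3, 3))   # one trained filter
feat = max_pool(relu(conv2d(patch, kernel)))
print(feat.shape)  # (3, 3)
```

Stacking several such stages and a final fully connected classifier gives the inference pipeline whose trained weights are loaded onto the FPGA.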


2022, Vol. 15 (3), pp. 1-25
Author(s): Stefan Brennsteiner, Tughrul Arslan, John Thompson, Andrew McCormick

Machine learning in the physical layer of communication systems holds the potential to improve performance and simplify design methodology. Many algorithms have been proposed; however, their model complexity is often infeasible for real-time deployment, and the real-time processing capability of these systems has not yet been proven. In this work, we propose a novel, less complex, fully connected neural network to perform channel estimation and signal detection in an orthogonal frequency division multiplexing system. The memory requirement, which is often the bottleneck for fully connected neural networks, is reduced by ≈27 times by applying known compression techniques in a three-step training process. Extensive experiments were performed on pruning and quantizing the weights of the neural network detector, and Huffman encoding was applied to the weights to further reduce the memory requirement. Based on this approach, we propose the first field-programmable gate array-based, real-time-capable neural network accelerator specifically designed to accelerate the orthogonal frequency division multiplexing detector workload. The accelerator is synthesized for a Xilinx RFSoC field-programmable gate array, uses small-batch processing to increase throughput, efficiently supports branching neural networks, and implements superscalar Huffman decoders.
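The three compression steps described above (pruning, quantization, and Huffman encoding of the weights) can be sketched end to end as follows; the sparsity level, bit width, and random weight distribution are illustrative assumptions, not the paper's settings, and the resulting compression factor is only indicative:

```python
import heapq
from collections import Counter
import numpy as np

def prune(w, sparsity=0.9):
    """Magnitude pruning: zero out the smallest-magnitude weights."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < thresh, 0.0, w)

def quantize(w, bits=4):
    """Uniform symmetric quantization to signed integer levels."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale).astype(int), scale

def huffman_lengths(freq):
    """Huffman code length per symbol, built by merging the two
    least-frequent subtrees until one tree remains."""
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, id(merged), merged))
    return heap[0][2]

rng = np.random.default_rng(1)
w = rng.standard_normal(10_000).astype(np.float32)  # toy weight matrix
q, scale = quantize(prune(w), bits=4)
freq = Counter(q.tolist())
lengths = huffman_lengths(freq)
bits = sum(freq[s] * l for s, l in lengths.items())
print(f"compression vs. float32: {w.size * 32 / bits:.1f}x")
```

Because pruning makes the zero symbol dominate, Huffman coding assigns it a very short code, which is where most of the memory saving comes from.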


2019, Vol. 2019, pp. 1-13
Author(s): Gianmarco Dinelli, Gabriele Meoni, Emilio Rapuano, Gionata Benelli, Luca Fanucci

In recent years, convolutional neural networks have been used for many applications thanks to their ability to carry out tasks with fewer parameters than other deep learning approaches. However, the power consumption and memory footprint constraints typical of edge and portable applications usually collide with accuracy and latency requirements. For this reason, commercial hardware accelerators have become popular, thanks to architectures designed for the inference of general convolutional neural network models. Nevertheless, field-programmable gate arrays represent an interesting alternative, since they offer the possibility to implement a hardware architecture tailored to a specific convolutional neural network model, with promising results in terms of latency and power consumption. In this article, we propose a fully on-chip field-programmable gate array hardware accelerator for a separable convolutional neural network designed for a keyword spotting application. We started from the model implemented in a previous work for the Intel Movidius Neural Compute Stick. For our goals, we quantized the model through a bit-true simulation and realized a dedicated architecture that uses on-chip memories exclusively. A benchmark comparing the results on different field-programmable gate array families by Xilinx and Intel with the implementation on the Neural Compute Stick was carried out. The analysis shows that the FPGA solution achieves better inference time and energy per inference with comparable accuracy, at the expense of a higher design effort and longer development time.
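The bit-true quantization step can be illustrated with a small fixed-point simulation of a depthwise-separable layer (depthwise 3×3 followed by pointwise 1×1), the kind of integer-only datapath an FPGA implements; all shapes, Q-formats, and value ranges below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def to_fixed(x, frac_bits=8):
    """Round to signed fixed-point integers with frac_bits fractional bits."""
    return np.round(x * (1 << frac_bits)).astype(np.int64)

def depthwise_separable_q(x, dw, pw, frac_bits=8):
    """Depthwise 3x3 conv then pointwise 1x1 conv, computed entirely in
    integers with right-shift rescaling, as an FPGA datapath would."""
    C, H, W = x.shape
    xq, dwq, pwq = (to_fixed(a, frac_bits) for a in (x, dw, pw))
    Ho, Wo = H - 2, W - 2
    acc = np.zeros((C, Ho, Wo), dtype=np.int64)
    for c in range(C):                      # one 3x3 kernel per channel
        for i in range(Ho):
            for j in range(Wo):
                acc[c, i, j] = np.sum(xq[c, i:i + 3, j:j + 3] * dwq[c])
    acc >>= frac_bits                       # rescale product back to Q-format
    out = np.tensordot(pwq, acc, axes=([1], [0]))  # 1x1 channel mixing
    return (out >> frac_bits) / float(1 << frac_bits)  # float for comparison

def depthwise_separable_f(x, dw, pw):
    """Floating-point reference for checking the bit-true simulation."""
    C, H, W = x.shape
    mid = np.zeros((C, H - 2, W - 2))
    for c in range(C):
        for i in range(H - 2):
            for j in range(W - 2):
                mid[c, i, j] = np.sum(x[c, i:i + 3, j:j + 3] * dw[c])
    return np.tensordot(pw, mid, axes=([1], [0]))

rng = np.random.default_rng(2)
x = 0.5 * rng.standard_normal((4, 8, 8))   # 4 input channels
dw = 0.5 * rng.standard_normal((4, 3, 3))  # depthwise kernels
pw = 0.5 * rng.standard_normal((2, 4))     # pointwise: 4 -> 2 channels
q = depthwise_separable_q(x, dw, pw)
ref = depthwise_separable_f(x, dw, pw)
print(q.shape, float(np.max(np.abs(q - ref))))
```

Comparing the integer result against the float reference is exactly what a bit-true simulation does: it quantifies the precision loss before committing the architecture to hardware.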

