hybrid fpga
Recently Published Documents


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7933
Author(s):  
António Silva ◽  
Duarte Fernandes ◽  
Rafael Névoa ◽  
João Monteiro ◽  
Paulo Novais ◽  
...  

Research on deep learning for object detection in LiDAR data has grown rapidly in recent years, achieving notable gains in precision and inference speed. These improvements have been facilitated by powerful GPU servers, whose capacity allows networks to be trained in reasonable time and whose parallel architecture enables high-performance, real-time inference. However, such hardware is of limited use in autonomous driving because of space, power, and inference-time constraints, and onboard devices are not as powerful as the machines used for training. This paper investigates a deep learning-based method on edge devices for onboard real-time inference that is power-efficient and has a small physical footprint. A methodology is proposed for deploying high-end GPU-specific models on edge devices for onboard inference, consisting of a two-fold flow: studying the implications of model hyperparameters for meeting application requirements, and compressing the network to meet the board's resource limitations. A hybrid FPGA-CPU board is proposed as an effective onboard inference solution by comparing its performance on the KITTI dataset with that of a PC. The achieved accuracy is comparable to the PC-based deep learning method, with the added advantage that the board is better suited to real-time inference under power and space constraints.


2021 ◽  
Vol 12 (10) ◽  
pp. 6496
Author(s):  
Bartlomiej Kowalski ◽  
Xiaojing Huang ◽  
Samuel Steven ◽  
Alfredo Dubra

Author(s):  
VE. Jayanthi ◽  
Senthil Pitchai ◽  
M. Smitha

A hybrid field programmable gate array (FPGA) implementation is proposed to improve the performance of visible image watermarking systems. The visible watermarking process is implemented either as a pixel-by-pixel operation in the spatial domain or as a vector operation in the frequency domain. The proposed approach is mainly designed for watermarking images of various sizes taken with digital cameras. A padding technique is used when the watermark image and the original host image differ in size. The architecture data path consists of an eight-stage pipeline for the pixel-based operation and a six-stage pipeline for the vector-based operation. The dual image watermarking architecture data path consists of a 13-stage pipeline. Pipelining and parallelism are used to improve throughput. To improve the performance of discrete cosine transform operations in the frequency domain, a shift-add technique replaces conventional multipliers. Clock gating is employed to reduce power by preventing unnecessary switching in a path. The hardware implementation of the algorithm is tested on an Intel Cyclone FPGA (device EP4CGX22CF19C6), achieving a throughput of 1.27 Gbits/s with a total area utilization of 35 digital signal processing (DSP) blocks, 378 look-up tables (LUTs) and 486 registers.
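The shift-add idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' hardware design: the function name `shift_add_multiply`, the 8-bit fixed-point scaling, and the constant 181 (approximately 256·cos(π/4), a typical DCT coefficient) are assumptions chosen only for illustration.

```python
def shift_add_multiply(x: int, constant: int) -> int:
    """Multiply x by a non-negative integer constant using only shifts and adds.

    Hypothetical helper for illustration: each set bit of `constant`
    contributes one shifted copy of x, which is how a fixed multiplier
    can be replaced by adders in hardware.
    """
    if constant < 0:
        raise ValueError("constant must be non-negative")
    result = 0
    bit = 0
    while constant >> bit:            # stop once no set bits remain
        if (constant >> bit) & 1:     # this bit of the constant is 1
            result += x << bit        # add x * 2**bit
        bit += 1
    return result


# Assumed fixed-point coefficient: cos(pi/4) ~= 181 / 256 (8 fractional bits).
COS_PI_4_Q8 = 181


def scale_by_cos_pi_4(pixel: int) -> int:
    """Approximate pixel * cos(pi/4) with shifts and adds only."""
    return shift_add_multiply(pixel, COS_PI_4_Q8) >> 8  # drop the 8 fractional bits
```

Because 181 is 10110101 in binary, the product needs only five shifted additions, which is the kind of saving that makes multiplier-free DCT data paths attractive on small FPGAs.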


2021 ◽  
Author(s):  
Mateus Saquetti ◽  
Raphael M. Brum ◽  
Bruno Zatt ◽  
Samuel Pagliarini ◽  
Weverton Cordeiro ◽  
...  

Author(s):  
Weiyun Jiang ◽  
Kaiqi Zhang ◽  
Colin Yu Lin ◽  
Feng Xing ◽  
Zheng Zhang

2020 ◽  
Vol 245 ◽  
pp. 09006
Author(s):  
Lukas On Arnold ◽  
Muhsen Owaida

Covariance matrices are used for a wide range of applications in particle physics, including Kalman filters for tracking and Principal Component Analysis for dimensionality reduction. Based on a novel decomposition of the covariance matrix, a design is presented that requires only one pass over the data to calculate the covariance matrix. Two computation engines are used, depending on the parallelizability of the necessary computation steps. The design is implemented on a hybrid FPGA/CPU system and yields a speed-up of up to five orders of magnitude compared to a previous FPGA implementation.
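The abstract does not spell out its decomposition, so the sketch below uses the textbook one-pass identity instead: the sample covariance equals (Σxxᵀ − (Σx)(Σx)ᵀ/n)/(n − 1). It only shows why a single pass over the data can suffice; the function name `one_pass_covariance` is hypothetical and the code is not the paper's method.

```python
from typing import Iterable, List, Sequence


def one_pass_covariance(samples: Iterable[Sequence[float]]) -> List[List[float]]:
    """Sample covariance matrix computed while reading each sample once.

    Accumulates the per-dimension sums and the sums of pairwise products,
    then combines them at the end. Illustrative only: this textbook form
    can lose precision when the mean is large relative to the spread.
    """
    n = 0
    sums = None    # running sum of x_i
    cross = None   # running sum of x_i * x_j
    for x in samples:
        if sums is None:
            d = len(x)
            sums = [0.0] * d
            cross = [[0.0] * d for _ in range(d)]
        n += 1
        for i in range(d):
            sums[i] += x[i]
            for j in range(d):
                cross[i][j] += x[i] * x[j]
    if n < 2:
        raise ValueError("at least two samples are required")
    return [
        [(cross[i][j] - sums[i] * sums[j] / n) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
```

The accumulation of pairwise products is independent across matrix entries and therefore parallelizes well, while the final combination is a small sequential step. That split is one plausible reason for using two different computation engines, although the abstract does not confirm the exact partitioning.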


2019 ◽  
Vol 1 (1) ◽  
pp. 26-32
Author(s):  
Bahadır ÖZKILBAÇ

FPGAs offer capabilities such as low power consumption, multiple I/O pins, and parallel processing. Because of these capabilities, FPGAs are commonly used in areas that require mathematical computing, such as signal processing, artificial neural network design, image processing and filter applications. From the simplest to the most complex, all mathematical applications are based on multiplication, division, subtraction and addition. Such calculations often involve numbers that are fractional, large or negative. In this study, an Arithmetic Logic Unit (ALU) that performs multiplication, division, addition and subtraction on IEEE 754 32-bit floating-point numbers, the format used to represent fractional and large numbers, is designed using the FPGA part of the Xilinx Zynq-7000 integrated circuit. The design is written in VHDL. The designed ALU is then controlled by commands sent from the ARM processor part of the same integrated circuit.
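As a reminder of the IEEE 754 32-bit format that such an ALU operates on, the following Python sketch splits a value into its sign, exponent and fraction fields. The function name `float32_fields` is hypothetical; the study's actual design is in VHDL and is not reproduced here.

```python
import struct
from typing import Tuple


def float32_fields(value: float) -> Tuple[int, int, int]:
    """Return (sign, biased exponent, fraction) of value as an IEEE 754 float32.

    The value is first rounded to single precision by struct.pack, then its
    32-bit pattern is split into 1 sign bit, 8 exponent bits and 23 fraction bits.
    """
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                 # bit 31
    exponent = (bits >> 23) & 0xFF    # bits 30..23, biased by 127
    fraction = bits & 0x7FFFFF        # bits 22..0, implicit leading 1 for normal numbers
    return sign, exponent, fraction
```

For example, -6.5 is -1.625 × 2², so its fields are sign 1, biased exponent 129 (2 + 127) and fraction 0x500000 (0.625 × 2²³). A floating-point ALU has to unpack these fields, operate on them separately, and then normalize and repack the result.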

