Real-Time Image and Video Processing Using High-Level Synthesis (HLS)

Computer Vision, 10.4018/978-1-5225-5204-8.ch042, 2018, pp. 1004-1022

Author(s): Murad Qasaimeh, Ehab Najeh Salahat

Keyword(s): Video Processing, High Performance, Low Cost, Optimization Techniques, High Level Synthesis, Image And Video Processing, Digital Hardware, Processing Algorithms, Computationally Intensive, High Level

Implementing high-performance, low-cost hardware accelerators for computationally intensive image and video processing algorithms has attracted considerable attention over the last 20 years. Most recent research efforts have sought new design automation methods to close the gap between the ability to realize efficient accelerators in hardware and the tight performance requirements of complex image processing algorithms. High-level synthesis (HLS) is a new method that automates the design process by transforming a high-level algorithmic description into digital hardware while satisfying the design constraints. This chapter evaluates the suitability of HLS as a tool for accelerating the most demanding image and video processing algorithms in hardware. It discusses the benefits and current limitations, recent academic and commercial tools, the compiler's optimization techniques, and four case studies.

Low-Cost Image and Video Processing Using High-Performance Middleware in Single-board Computers with Open Internet Standards

IEEE Latin America Transactions, 10.1109/tla.2020.9085285, 2020, Vol 18 (02), pp. 311-318

Author(s): Carlos Alejandro Perez, Mario Sergio Cleva, Diego Orlando Liska, Dominga Concepcion Aquino, Claudio Rodrigues da Fonseca

Keyword(s): Video Processing, High Performance, Low Cost, Image And Video Processing

A Hybrid Scheme Based on Pipelining and Multitasking in Mobile Application Processors for Advanced Video Coding

Scientific Programming, 10.1155/2015/197843, 2015, Vol 2015, pp. 1-16

Author(s): Muhammad Asif, Imtiaz A. Taj, S. M. Ziauddin, Maaz Bin Ahmad, M. Tahir

Keyword(s): Video Processing, Mobile Application, High Performance, Optimization Techniques, Processing Unit, Software Modules, Computationally Intensive, Hardware Processing, Memory Accesses, Advanced Video Coding

One of the key requirements for mobile devices is to provide high-performance computing at lower power consumption. The processors used in these devices provide specific hardware resources to handle computationally intensive video processing and interactive graphical applications. Moreover, processors designed for low-power applications may introduce limitations on the availability and usage of resources, which present additional challenges to system designers. Owing to the specific design of the JZ47x series of mobile application processors, a hybrid software-hardware implementation scheme for an H.264/AVC encoder is proposed in this work. The proposed scheme distributes the encoding tasks among hardware and software modules. A series of optimization techniques is developed to speed up memory access and data transfer among memories. Moreover, an efficient data reuse design is proposed for the deblocking filter video processing unit to reduce memory accesses. Furthermore, fine-grained macroblock (MB) level parallelism is exploited, and a pipelined approach is proposed for efficient utilization of the hardware processing cores. Finally, based on the parallelism in the proposed design, encoding tasks are distributed between two processing cores. Experiments show that, with the proposed techniques, the hybrid encoder is 12 times faster than a highly optimized sequential encoder.

Accelerating Sobel Edge Detection Using Compressor Cells Over FPGAs

Computer Vision, 10.4018/978-1-5225-5204-8.ch047, 2018, pp. 1133-1154

Author(s): Ahmed Abouelfarag, Marwa Ali Elshenawy, Esraa Alaaeldin Khattab

Keyword(s): Edge Detection, Real Time, Video Processing, High Level Synthesis, Computational Time, The Novel, Image And Video Processing, Sobel Edge Detection, High Level, Time Image

Computer vision plays an important role in many essential human-computer interaction applications. These applications are subject to a “real-time” constraint and therefore require a fast and reliable computational system. Edge detection is the most widely used approach for segmenting images based on changes in intensity. Various kernels are used to perform edge detection, such as Sobel, Roberts, and Prewitt, of which Sobel is the most commonly used. In this research, a novel type of operator cell that performs addition is introduced to achieve computational acceleration. The novel operator cells are deployed on the chosen FPGA platform, the ZedBoard, which is well suited for real-time image and video processing. The Sobel edge detection technique is accelerated using tools such as the High-Level Synthesis tools provided by Vivado. This enhancement decreases the computational time by 26% compared to conventional adder cells.
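
For readers unfamiliar with the operator the abstract refers to, the following is a minimal software sketch of Sobel edge detection on an 8-bit grayscale image. It uses the standard 3x3 Sobel kernels; the function name `sobel_reference`, the row-major image layout, the |Gx| + |Gy| magnitude approximation, the clamp to 255, and the zeroed border are all assumptions made for illustration. It does not model the compressor-based adder cells or the FPGA design described in the chapter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Software reference for the Sobel operator on a row-major 8-bit grayscale image.
// Hypothetical helper for illustration only; border pixels are left at 0.
std::vector<std::uint8_t> sobel_reference(const std::vector<std::uint8_t>& img,
                                          int width, int height) {
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);

    std::vector<std::uint8_t> out(img.size(), 0);
    auto px = [&](int x, int y) { return static_cast<int>(img[y * width + x]); };

    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            // Horizontal gradient: kernel [-1 0 1; -2 0 2; -1 0 1]
            int gx = -px(x - 1, y - 1) + px(x + 1, y - 1)
                     - 2 * px(x - 1, y) + 2 * px(x + 1, y)
                     - px(x - 1, y + 1) + px(x + 1, y + 1);
            // Vertical gradient: kernel [-1 -2 -1; 0 0 0; 1 2 1]
            int gy = -px(x - 1, y - 1) - 2 * px(x, y - 1) - px(x + 1, y - 1)
                     + px(x - 1, y + 1) + 2 * px(x, y + 1) + px(x + 1, y + 1);
            // Cheap magnitude approximation, clamped to the 8-bit range.
            int mag = std::abs(gx) + std::abs(gy);
            out[y * width + x] = static_cast<std::uint8_t>(std::min(mag, 255));
        }
    }
    return out;
}
```

Each output pixel needs ten additions or subtractions plus the final sum, which is why the speed of the adder cells matters for this kernel.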

Accelerating Sobel Edge Detection Using Compressor Cells Over FPGAs

Advances in Business Information Systems and Analytics - Smart Technology Applications in Business Environments, 10.4018/978-1-5225-2492-2.ch001, 2017, pp. 1-21

Author(s): Ahmed Abouelfarag, Marwa Ali Elshenawy, Esraa Alaaeldin Khattab

Keyword(s): Edge Detection, Real Time, Video Processing, High Level Synthesis, Computational Time, The Novel, Image And Video Processing, Sobel Edge Detection, High Level, Time Image

Computer vision plays an important role in many essential human-computer interaction applications. These applications are subject to a “real-time” constraint and therefore require a fast and reliable computational system. Edge detection is the most widely used approach for segmenting images based on changes in intensity. Various kernels are used to perform edge detection, such as Sobel, Roberts, and Prewitt, of which Sobel is the most commonly used. In this research, a novel type of operator cell that performs addition is introduced to achieve computational acceleration. The novel operator cells are deployed on the chosen FPGA platform, the ZedBoard, which is well suited for real-time image and video processing. The Sobel edge detection technique is accelerated using tools such as the High-Level Synthesis tools provided by Vivado. This enhancement decreases the computational time by 26% compared to conventional adder cells.

Image and video processing platform for field programmable gate arrays using a high-level synthesis

IET Computers & Digital Techniques, 10.1049/iet-cdt.2011.0156, 2012, Vol 6 (6), pp. 414-425, Cited By ~ 7

Author(s): C. Desmouliers, F.M. Vallina, S. Aslan, J. Saniie, E. Oruklu

Keyword(s): Video Processing, Field Programmable Gate Arrays, High Level Synthesis, Gate Arrays, Image And Video Processing, Field Programmable, Programmable Gate Arrays, High Level, Processing Platform

High Level Synthesis Optimizations of Road Lane Detection Development on Zynq-7000

Pertanika Journal of Science and Technology, 10.47836/pjst.29.2.01, 2021, Vol 29 (2)

Author(s): Panadda Solod, Nattha Jindapetch, Kiattisak Sengchuai, Apidet Booranawong, Pakpoom Hoyingcharoen, ...

Keyword(s): Low Cost, Optimization Techniques, Lane Detection, High Level Synthesis, Resource Usage, Clock Frequency, Loop Analysis, Loop Unrolling, Loop Pipelining, High Level

In this work, we propose High-Level Synthesis (HLS) optimization processes to improve the speed and resource usage of complex algorithms, especially those with nested loops. The proposed HLS optimization processes are divided into four steps: array sizing, to decrease resource usage on the Programmable Logic (PL) part; loop analysis, to determine which loops should be unrolled or pipelined; array partitioning, to resolve the bottlenecks of loop unrolling and loop pipelining; and HLS interface selection, to choose the best block-level and port-level interfaces for the array arguments of the RTL design. A road lane detection case study was analyzed, and suitable optimization techniques were applied to implement it on the Xilinx Zynq-7000 family (Zybo ZC7010-1), a low-cost FPGA. In the experimental results, the proposed method is 6.66 times faster than the primitive method at a clock frequency of 100 MHz, or about 6 FPS. Although the proposed methods cannot reach standard real-time performance (25 FPS), they can guide HLS developers in increasing speed and reducing resource usage on an FPGA.
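
The first three steps named in the abstract (array sizing, loop unrolling and pipelining, array partitioning) can be sketched on a toy kernel as follows. This is only an illustration, not the paper's lane detection code: the function `sum_samples`, the sizes `kN` and `kUnroll`, and the choice of kernel are hypothetical. The pragma spellings follow the Xilinx Vivado HLS style as an assumption (newer tool versions may expect slightly different syntax), and an ordinary C++ compiler ignores them, so the function also runs as plain software.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kN = 64;      // array sizing: a fixed, small buffer (hypothetical size)
constexpr int kUnroll = 4;  // hypothetical unroll factor
static_assert(kN % kUnroll == 0, "kN must be a multiple of kUnroll");

// Sums kN 8-bit samples using four parallel partial sums.
std::uint32_t sum_samples(const std::uint8_t in[kN]) {
    assert(in != nullptr);

    // Narrow element type and fixed size keep on-chip memory usage small.
    std::uint8_t buf[kN];
    // Array partitioning: split buf cyclically across 4 memories so that the
    // 4 reads per iteration of the unrolled loop below do not contend for ports.
#pragma HLS ARRAY_PARTITION variable=buf cyclic factor=4 dim=1

    for (int i = 0; i < kN; ++i) {
        // Loop pipelining: start a new iteration every clock cycle if possible.
#pragma HLS PIPELINE II=1
        buf[i] = in[i];
    }

    std::uint32_t partial[kUnroll] = {0, 0, 0, 0};
#pragma HLS ARRAY_PARTITION variable=partial complete dim=1

    for (int i = 0; i < kN; i += kUnroll) {
#pragma HLS PIPELINE II=1
        for (int j = 0; j < kUnroll; ++j) {
            // Loop unrolling: replicate the body so the 4 additions run in parallel.
#pragma HLS UNROLL
            partial[j] += buf[i + j];
        }
    }
    return partial[0] + partial[1] + partial[2] + partial[3];
}
```

Without the partitioning pragma, a single dual-port memory could serve at most two of the four reads per cycle, which is the kind of bottleneck the abstract says array partitioning resolves.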

Fast FPGA Prototyping based Real-Time Image and Video Processing with High-Level Synthesis

International Journal of Advanced Computer Science and Applications, 10.14569/ijacsa.2020.0110215, 2020, Vol 11 (2)

Author(s): Refka Ghodhbani, Layla Horrigue, Taoufik Saidani, Mohamed Atri

Keyword(s): Real Time, Video Processing, High Level Synthesis, Image And Video Processing, Real Time Image, High Level, Time Image

Low-Cost Image and Video Processing Using High-Performance Middleware in Single-board Computers with Open Internet Standards

IEEE Latin America Transactions, 10.1109/tla.2019.9082243, 2019, Vol 18 (02), pp. 311-318

Author(s): Carlos Alejandro Perez, Mario Sergio Cleva, Diego Orlando Liska, Dominga Concepcion Aquino, Claudio Rodrigues da Fonseca

Keyword(s): Video Processing, High Performance, Low Cost, Image And Video Processing

High-Level Synthesis Design for Stencil Computations on FPGA with High Bandwidth Memory

Electronics, 10.3390/electronics9081275, 2020, Vol 9 (8), pp. 1275

Author(s): Changdao Du, Yoshiki Yamaguchi

Keyword(s): Programming Languages, High Performance, Design Space Exploration, Scale Up, High Level Synthesis, Stencil Computations, Temporal Domain, High Bandwidth, Promising Solution, High Level

Due to performance and energy requirements, FPGA-based accelerators have become a promising solution for high-performance computations. Meanwhile, with the help of high-level synthesis (HLS) compilers, FPGAs can be programmed using common programming languages such as C, C++, or OpenCL, thereby improving design efficiency and portability. Stencil computations are significant kernels in various scientific applications. In this paper, we introduce an architecture design for implementing stencil kernels on a state-of-the-art FPGA with high bandwidth memory (HBM). Traditional FPGAs are usually equipped with external memory, e.g., DDR3 or DDR4, which limits the design space exploration in the spatial domain of stencil kernels. Therefore, many previous studies mainly relied on exploiting parallelism in the temporal domain to eliminate the bandwidth limitations. In our approach, we scale up the design performance by considering both the spatial and temporal parallelism of the stencil kernel equally. We also discuss the design portability among different HLS compilers. We use typical stencil kernels to evaluate our design on a Xilinx U280 FPGA board and compare the results with other existing studies. By adopting our method, developers can adopt a broad range of parallelization strategies based on specific FPGA resources to improve performance.
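
To make the terms "spatial" and "temporal" concrete, here is a minimal software sketch of a 1D three-point stencil iterated over several time steps. The function name `jacobi_1d`, the integer sum stencil, and the fixed boundary cells are assumptions chosen to keep the example small and exact; it is a plain C++ reference, not the HBM-based architecture proposed in the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Applies a three-point sum stencil to the interior of grid for the given
// number of time steps. The first and last cells are fixed boundaries.
std::vector<int> jacobi_1d(std::vector<int> grid, int steps) {
    assert(steps >= 0);
    std::vector<int> next(grid);  // copy so the boundary cells carry over

    // Temporal domain: each time step depends on the previous one.
    for (int t = 0; t < steps; ++t) {
        // Spatial domain: every interior point of one time step is independent
        // of the others, so these iterations could be computed in parallel.
        for (std::size_t i = 1; i + 1 < grid.size(); ++i) {
            next[i] = grid[i - 1] + grid[i] + grid[i + 1];
        }
        std::swap(grid, next);
    }
    return grid;
}
```

Spatial parallelism processes many values of `i` at once and therefore needs more memory bandwidth per cycle, while temporal parallelism chains several values of `t` on chip and reuses data already fetched; this is the trade-off the abstract describes.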
