A Scalable High-Performance Hardware Architecture for Real-Time Stereo Vision by Semi-Global Matching

Designing an H.264/AVC interpolation unit is challenging because the new coding features of variable block size (VBS) and the 6-tap filter demand high memory bandwidth and heavy computation. This paper proposes a novel one-step interpolation algorithm that reduces processing cycles by accessing memory less often. A data reuse scheme further saves processing cycles and memory bandwidth. A high-performance hardware architecture implements these methods, achieving a 26% reduction in memory bandwidth and a 45% reduction in processing cycles, which shows that the architecture is an efficient hardware acceleration solution suitable for real-time encoders.
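
For reference, the 6-tap filter named in the abstract is the standard H.264/AVC half-sample luma filter with taps (1, -5, 20, 20, -5, 1). The following is a minimal software sketch of that filter applied along one row; it is not the paper's one-step architecture, and the function name `half_pel_row` and the edge-replication border handling are assumptions made for illustration.

```python
def half_pel_row(samples):
    """Interpolate the horizontal half-sample positions of one row of 8-bit luma samples.

    Returns len(samples) - 1 values, one between each pair of neighbours.
    Border samples are replicated (an assumption of this sketch).
    """
    taps = (1, -5, 20, 20, -5, 1)
    n = len(samples)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(taps):
            # The window covers samples i-2 .. i+3; clamp indices at the row borders.
            j = min(max(i - 2 + k, 0), n - 1)
            acc += tap * samples[j]
        # Round, divide by 32 (the sum of the taps), and clip to the 8-bit range.
        out.append(min(max((acc + 16) >> 5, 0), 255))
    return out


row = [0, 0, 0, 255, 255, 255]
print(half_pel_row(row))
```

Each output sample reads six input samples, which is why the filter, combined with variable block sizes, drives up memory bandwidth in a hardware implementation.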

An intelligent ADAS processor with real-time semi-global matching and intention prediction for 720p stereo vision

2016 IEEE Hot Chips 28 Symposium (HCS) ◽

10.1109/hotchips.2016.7936225 ◽

2016 ◽

Cited By ~ 2

Author(s):

Kyuho J. Lee ◽

Kyeongryeol Bong ◽

Changhyeon Kim ◽

Hoi-Jun Yoo

Keyword(s):

Real Time ◽

Stereo Vision ◽

Global Matching

Real-time stereo vision system using semi-global matching disparity estimation: Architecture and FPGA-implementation

2010 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation ◽

10.1109/icsamos.2010.5642077 ◽

2010 ◽

Cited By ~ 65

Author(s):

Christian Banz ◽

Sebastian Hesselbarth ◽

Holger Flatt ◽

Holger Blume ◽

Peter Pirsch

Keyword(s):

Real Time ◽

Stereo Vision ◽

Vision System ◽

FPGA Implementation ◽

Disparity Estimation ◽

Stereo Vision System ◽

Global Matching

Matching cost computation algorithm and high speed FPGA architecture for high quality real-time Semi Global Matching stereo vision for road scenes

17th International IEEE Conference on Intelligent Transportation Systems (ITSC) ◽

10.1109/itsc.2014.6958182 ◽

2014 ◽

Cited By ~ 9

Author(s):

Frank Schumacher ◽

Thomas Greiner

Keyword(s):

Real Time ◽

Stereo Vision ◽

High Speed ◽

High Quality ◽

FPGA Architecture ◽

Global Matching ◽

Matching Cost ◽

Computation Algorithm

Real-time stereo vision using semi-global matching on programmable graphics hardware

ACM SIGGRAPH 2006 Research posters on - SIGGRAPH '06 ◽

10.1145/1179849.1179960 ◽

2006 ◽

Cited By ~ 12

Author(s):

Ilya D. Rosenberg ◽

Philip L. Davidson ◽

Casey M. R. Muller ◽

Jefferson Y. Han

Keyword(s):

Real Time ◽

Stereo Vision ◽

Graphics Hardware ◽

Global Matching

ReS2tAC—UAV-Borne Real-Time SGM Stereo Optimized for Embedded ARM and CUDA Devices

Sensors ◽

10.3390/s21113938 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3938

Author(s):

Boitumelo Ruf ◽

Jonas Mohrs ◽

Martin Weinmann ◽

Stefan Hinz ◽

Jürgen Beyerer

Keyword(s):

Power Consumption ◽

Real Time ◽

High Performance ◽

Low Cost ◽

Qualitative Evaluation ◽

Image Resolution ◽

Massively Parallel ◽

Graphics Hardware ◽

Global Matching ◽

Stereo Processing

With the emergence of low-cost robotic systems, such as UAVs, the importance of embedded high-performance image processing has increased. For a long time, FPGAs were the only processing hardware capable of high-performance computing while preserving the low power consumption essential for embedded systems. However, the recently increasing availability of embedded GPU-based systems, such as the NVIDIA Jetson series, comprising an ARM CPU and an NVIDIA Tegra GPU, allows for massively parallel embedded computing on graphics hardware. With this in mind, we propose an approach for real-time embedded stereo processing on ARM and CUDA-enabled devices, which is based on the popular and widely used Semi-Global Matching algorithm. We optimize the algorithm for embedded CUDA GPUs by using massively parallel computing, and for embedded ARM CPUs by using NEON intrinsics for vectorized SIMD processing. We have evaluated our approach with different configurations on two public stereo benchmark datasets to demonstrate that it can reach an error rate as low as 3.3%. Furthermore, our experiments show that the fastest configuration of our approach reaches up to 46 FPS at VGA image resolution. Finally, in a use-case-specific qualitative evaluation, we have evaluated the power consumption of our approach and deployed it on the DJI Manifold 2-G attached to a DJI Matrice 210v2 RTK UAV, demonstrating its suitability for real-time stereo processing onboard a UAV.
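
As background for the Semi-Global Matching algorithm the abstract builds on, the core step is a cost aggregation along one-dimensional paths, with a small penalty for disparity changes of one level and a larger penalty for bigger jumps. The following is a minimal sketch of that recurrence for a single left-to-right path; it is plain Python and not the paper's CUDA or NEON implementation, and the names `aggregate_path`, `p1`, and `p2` are assumptions made for illustration.

```python
def aggregate_path(costs, p1, p2):
    """Aggregate matching costs along one left-to-right path.

    costs[x][d] is the matching cost of pixel x at disparity d.
    p1 penalises a disparity change of one level, p2 any larger change.
    """
    num_disp = len(costs[0])
    prev = list(costs[0])  # the first pixel has no predecessor on the path
    aggregated = [prev]
    for x in range(1, len(costs)):
        prev_min = min(prev)
        cur = []
        for d in range(num_disp):
            best = prev[d]  # same disparity: no penalty
            if d > 0:
                best = min(best, prev[d - 1] + p1)
            if d < num_disp - 1:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)  # arbitrary disparity jump
            # Subtracting prev_min keeps the values bounded along the path.
            cur.append(costs[x][d] + best - prev_min)
        aggregated.append(cur)
        prev = cur
    return aggregated


example = [[0, 10, 10], [10, 0, 10]]
print(aggregate_path(example, p1=1, p2=5))
```

In a full pipeline this aggregation is repeated for several path directions and the results are summed per pixel and disparity before the minimum is selected. The pixels along different paths are independent of each other, which is what makes the algorithm amenable to massively parallel and SIMD execution.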

A High-Throughput Hardware Architecture for the H.264/AVC Half-Pixel Motion Estimation Targeting High-Definition Videos

International Journal of Reconfigurable Computing ◽

10.1155/2011/254730 ◽

2011 ◽

Vol 2011 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Marcel M. Corrêa ◽

Mateus T. Schoenknecht ◽

Robson S. Dornelles ◽

Luciano V. Agostini

Keyword(s):

Motion Estimation ◽

Real Time ◽

High Throughput ◽

High Performance ◽

Hardware Architecture ◽

Interpolation Process ◽

High Definition ◽

Efficient Search ◽

Xilinx FPGA ◽

Very High

This paper presents a high-performance hardware architecture for H.264/AVC half-pixel motion estimation that targets high-definition videos. The design can process very high-definition videos such as QHDTV () in real time (30 frames per second). It also presents an optimized arrangement of interpolated samples, which is key to achieving an efficient search. The interpolation process is interleaved with the SAD calculation and comparison, which enables the high throughput. The architecture was fully described in VHDL and synthesized for two different Xilinx FPGA devices, and it achieved very good results when compared to related works.
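
The SAD calculation and comparison mentioned in the abstract can be illustrated in software: each candidate block is scored by its sum of absolute differences against the current block, and the lowest score wins. This is a minimal sketch only, not the paper's VHDL architecture; the function names `sad` and `best_candidate` and the list-of-rows block layout are assumptions made for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks (lists of rows)."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_candidate(current, candidates):
    """Return (index, SAD) of the candidate block closest to the current block."""
    best_index, best_sad = None, None
    for index, candidate in enumerate(candidates):
        score = sad(current, candidate)
        # Keep the first candidate with the lowest score seen so far.
        if best_sad is None or score < best_sad:
            best_index, best_sad = index, score
    return best_index, best_sad


current = [[10, 20], [30, 40]]
candidates = [
    [[0, 0], [0, 0]],
    [[10, 21], [29, 40]],
    [[10, 20], [30, 44]],
]
print(best_candidate(current, candidates))
```

In half-pixel motion estimation the candidates would be the interpolated blocks around the best integer-pixel match, so scoring each candidate as soon as its samples are interpolated avoids storing all of them first.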

An FPGA-based real-time occlusion robust stereo vision system using semi-global matching

A Real-Time Low-Power Stereo Vision Engine Using Semi-Global Matching

A High-Performance Hardware Architecture for a Frameless Stereo Vision Algorithm Implemented on a FPGA Platform

A High Performance Hardware Architecture of Sub-Pixel Interpolator for H.264/AVC Encoder
