New Motion Estimation Algorithms and its VLSI architectures for Real Time High Definition Video Coding

2012 ◽  
Vol 7 (1) ◽  
pp. 37-46
Author(s):  
Gustavo Sanchez ◽  
Marcelo Porto ◽  
Diego Noble ◽  
Sergio Bampi ◽  
Luciano Agostini

This paper presents an efficient hardware design for two new Motion Estimation (ME) algorithms: Multi-Point Diamond Search (MPDS) and Dynamic Multi-Point Diamond Search (DMPDS). These algorithms avoid falling into local minima more effectively than traditional fast algorithms. This improves the quality of the motion vectors, especially in High Definition (HD) videos, where the number of local minima is considerably higher. Two versions of the MPDS algorithm were proposed. The first one, focused on high performance, is capable of processing QFHD videos at 30 frames per second when synthesized to Altera Stratix 4 and 90 nm TSMC, with only 18 mW. The second version is focused on quality enhancement and is capable of processing HD 1080p videos in real time. The DMPDS architecture was developed with a focus on high performance and was synthesized to Altera Stratix 4. This architecture is capable of processing QFHD videos at 34 frames per second. In comparison to related works, our solutions obtained the highest processing rates and a good trade-off among power consumption, area, memory bits and performance.

2011 ◽  
Vol 2011 ◽  
pp. 1-9 ◽  
Author(s):  
Marcel M. Corrêa ◽  
Mateus T. Schoenknecht ◽  
Robson S. Dornelles ◽  
Luciano V. Agostini

This paper presents a high-performance hardware architecture for H.264/AVC Half-Pixel Motion Estimation that targets high-definition videos. This design can process very high-definition videos like QHDTV () in real time (30 frames per second). It also presents an optimized arrangement of interpolated samples, which is key to achieving an efficient search. The interpolation process is interleaved with the SAD calculation and comparison, which enables high throughput. The architecture was fully described in VHDL and synthesized for two different Xilinx FPGA devices, and it achieved very good results when compared to related works.
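
As an illustration of the interpolation step, the sketch below computes horizontal half-pixel luma samples with the 6-tap filter (1, -5, 20, 20, -5, 1) defined by the H.264/AVC standard. It is a software sketch only and says nothing about how the paper's architecture arranges or pipelines the samples; the function name `half_pel_row` and the edge-replication padding are assumptions made for the example.

```python
import numpy as np

# 6-tap half-sample luma filter from the H.264/AVC standard (taps sum to 32).
TAPS = np.array([1, -5, 20, 20, -5, 1], dtype=np.int64)

def half_pel_row(row):
    """Return the half-pixel samples lying between consecutive pixels of a row.

    Output element i is the sample halfway between row[i] and row[i + 1],
    so a row of n pixels yields n - 1 half-pixel samples.
    """
    row = np.asarray(row, dtype=np.int64)
    # Assumption: replicate the border pixels so the filter has full support.
    padded = np.pad(row, (2, 2), mode="edge")
    # The taps are symmetric, so convolution and correlation coincide.
    acc = np.convolve(padded, TAPS, mode="valid")
    # Round, divide by 32, and clip to the 8-bit sample range.
    return np.clip((acc + 16) >> 5, 0, 255)

flat = half_pel_row([100] * 8)
edge = half_pel_row([0, 0, 0, 0, 255, 255, 255, 255])
```

A flat row stays flat, and the sample in the middle of a sharp 0-to-255 edge lands at 128, while the negative filter lobes next to the edge are clipped to 0.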


2012 ◽  
Vol 2012 ◽  
pp. 1-12 ◽  
Author(s):  
Gustavo Sanchez ◽  
Felipe Sampaio ◽  
Marcelo Porto ◽  
Sergio Bampi ◽  
Luciano Agostini

This paper presents a new fast motion estimation (ME) algorithm targeting high resolution digital videos and its efficient hardware architecture design. The new Dynamic Multipoint Diamond Search (DMPDS) algorithm is a fast algorithm which increases the ME quality when compared with other fast ME algorithms. The DMPDS achieves better digital video quality by reducing the occurrence of falls into local minima, especially in high definition videos. The quality results show that the DMPDS is able to reach an average PSNR gain of 1.85 dB when compared with the well-known Diamond Search (DS) algorithm. When compared to the optimum results generated by the Full Search (FS) algorithm, the DMPDS shows a loss of only 1.03 dB in PSNR. On the other hand, the DMPDS reached a complexity reduction of more than 45 times when compared to FS. The quality gains relative to DS caused an expected increase in the DMPDS complexity, which uses 6.4 times more calculations than DS. The DMPDS architecture was designed with a focus on high performance and low cost, targeting the processing of Quad Full High Definition (QFHD) videos in real time (30 frames per second). The architecture was described in VHDL and synthesized to Altera Stratix 4 and Xilinx Virtex 5 FPGAs. The synthesis results show that the architecture is able to achieve processing rates higher than 53 QFHD fps, meeting the real-time requirements. The DMPDS architecture achieved the highest processing rate when compared to related works in the literature. This high processing rate was obtained by designing an architecture with a high operating frequency and a low number of cycles necessary to process each block.
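
For reference, the classic Diamond Search (DS) baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of DS with a SAD cost, not the DMPDS algorithm itself, whose details the abstract does not give. The function names, the 8x8 block size, and the step limit are assumptions made for the example.

```python
import numpy as np

# Large and small diamond search patterns as (dy, dx) offsets.
LDSP = [(0, 0), (-2, 0), (2, 0), (0, -2), (0, 2),
        (-1, -1), (-1, 1), (1, -1), (1, 1)]
SDSP = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def sad(cur_block, ref, y, x):
    """Sum of absolute differences between a block and the reference at (y, x)."""
    n = cur_block.shape[0]
    cand = ref[y:y + n, x:x + n]
    return int(np.abs(cur_block.astype(np.int64) - cand.astype(np.int64)).sum())

def diamond_search(cur, ref, by, bx, n=8, max_steps=32):
    """Return ((dy, dx), sad) for the n x n block of `cur` located at (by, bx)."""
    cur_block = cur[by:by + n, bx:bx + n]
    h, w = ref.shape

    def cost(y, x):
        # Candidates that fall outside the reference frame are skipped.
        if y < 0 or x < 0 or y + n > h or x + n > w:
            return None
        return sad(cur_block, ref, y, x)

    def best_of(cy, cx, pattern):
        best = (cost(cy, cx), cy, cx)
        for dy, dx in pattern:
            c = cost(cy + dy, cx + dx)
            if c is not None and c < best[0]:
                best = (c, cy + dy, cx + dx)
        return best

    cy, cx = by, bx
    # Repeat the large pattern until its centre is the best candidate.
    for _ in range(max_steps):
        _, ny, nx = best_of(cy, cx, LDSP)
        if (ny, nx) == (cy, cx):
            break
        cy, cx = ny, nx
    # Refine once with the small pattern.
    best_cost, cy, cx = best_of(cy, cx, SDSP)
    return (cy - by, cx - bx), best_cost

# Deterministic test frame; the current frame is the reference shifted by 2 columns.
yy, xx = np.indices((48, 48))
ref_frame = ((7 * yy + 13 * xx) % 251).astype(np.uint8)
cur_frame = np.roll(ref_frame, -2, axis=1)
still = diamond_search(ref_frame, ref_frame, 16, 16)
moved = diamond_search(cur_frame, ref_frame, 16, 16)
```

Because the search only follows decreasing SAD from a single starting point, it can stop at a local minimum, which is the weakness that the multi-point variants in these papers aim to reduce.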


Author(s):  
Kersten Schuster ◽  
Philip Trettner ◽  
Leif Kobbelt

We present a numerical optimization method to find highly efficient (sparse) approximations for convolutional image filters. Using a modified parallel tempering approach, we solve a constrained optimization that maximizes approximation quality while strictly staying within a user-prescribed performance budget. The results are multi-pass filters where each pass computes a weighted sum of bilinearly interpolated sparse image samples, exploiting hardware acceleration on the GPU. We systematically decompose the target filter into a series of sparse convolutions, trying to find good trade-offs between approximation quality and performance. Since our sparse filters are linear and translation-invariant, they do not exhibit the aliasing and temporal coherence issues that often appear in filters working on image pyramids. We show several applications, ranging from simple Gaussian or box blurs to the emulation of sophisticated Bokeh effects with user-provided masks. Our filters achieve high performance as well as high quality, often providing significant speed-up at acceptable quality even for separable filters. The optimized filters can be baked into shaders and used as a drop-in replacement for filtering tasks in image processing or rendering pipelines.
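
The benefit of bilinearly interpolated samples can be seen in one dimension: two adjacent filter taps can be replaced by a single interpolated fetch at a fractional position. The sketch below shows only this well-known identity, not the authors' optimization method; `lerp_sample` and `merge_taps` are hypothetical helper names, and the weights are assumed to be positive.

```python
import numpy as np

def lerp_sample(signal, pos):
    """Linearly interpolated read at a fractional position (what GPU filtering does)."""
    i = int(np.floor(pos))
    t = pos - i
    return (1.0 - t) * signal[i] + t * signal[i + 1]

def merge_taps(i, w0, w1):
    """Merge taps at i and i + 1 (weights w0, w1) into one (position, weight) pair."""
    w = w0 + w1
    return i + w1 / w, w

signal = np.array([1.0, 4.0, 9.0, 16.0])
# Two explicit taps on samples 1 and 2 ...
direct = 0.25 * signal[1] + 0.75 * signal[2]
# ... versus one interpolated fetch with the merged weight.
pos, weight = merge_taps(1, 0.25, 0.75)
merged = weight * lerp_sample(signal, pos)
```

Both `direct` and `merged` evaluate the same weighted sum, but the merged form needs one sample instead of two.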


2010 ◽  
Vol 5 (1) ◽  
pp. 78-88 ◽  
Author(s):  
Marcelo Porto ◽  
André Silva ◽  
Sergo Almeida ◽  
Eduardo Da Costa ◽  
Sergio Bampi

This paper presents a real-time HDTV (High Definition Television) architecture for Motion Estimation (ME) using efficient adder compressors. The architecture is based on the Quarter Sub-sampled Diamond Search (QSDS) algorithm with the Dynamic Iteration Control (DIC) algorithm. The main characteristic of the proposed architecture is the large number of Processing Units (PUs) that are used to calculate the SAD (Sum of Absolute Differences) metric. Internally, the PUs perform a large number of addition operations to calculate the SADs. In this paper, efficient 4-2 and 8-2 adder compressors are used in the PU architecture to achieve the performance needed to process HDTV videos in real time at 30 frames per second. These adder compressors enable the simultaneous addition of 4 and 8 operands, respectively. The PUs, using adder compressors, were applied to the ME architecture. The implemented architecture was described in VHDL and synthesized to FPGA and, with the Leonardo Spectrum tool, to the TSMC 0.18 μm CMOS standard cell technology. Synthesis results indicate that the new QSDS-DIC architecture reaches the best performance result and enables gains of 12% in terms of processing rate. The architecture can reach real time for full HDTV (1920x1080 pixels), processing 65 frames per second in the worst case and 269 HDTV frames per second in the average case.
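
The idea behind a 4-2 compressor can be sketched at word level: four operands are reduced to two without propagating carries, and only one ordinary addition is needed at the end. The version below cascades two carry-save stages, which is one common way to build such a compressor; the gate-level structure used in the paper is not described in the abstract, and the function names are assumptions made for the example.

```python
def carry_save(a, b, c):
    """3:2 carry-save stage: returns (sum, carry) with sum + carry == a + b + c."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry

def compress_4_2(a, b, c, d):
    """Reduce four non-negative operands to two whose sum equals a + b + c + d."""
    s1, c1 = carry_save(a, b, c)
    return carry_save(s1, c1, d)

def sad4(cur, ref):
    """SAD of four pixel pairs using one 4-2 compression and one final addition."""
    diffs = [abs(x - y) for x, y in zip(cur, ref)]
    s, carry = compress_4_2(*diffs)
    return s + carry

result = sad4([10, 20, 30, 40], [12, 18, 33, 40])
```

Here `result` is 7, the same value a chain of three ordinary additions would give. In hardware, the gain comes from deferring carry propagation to the single final adder.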


2016 ◽  
Vol 25 (08) ◽  
pp. 1650083
Author(s):  
P. Muralidhar ◽  
C. B. Rama Rao

Motion estimation (ME) is a highly computationally intensive operation in video compression. Efficient ME architectures are proposed in the literature. This paper presents an efficient, low-computational-complexity systolic architecture for the full search block matching ME (FSBME) algorithm. The proposed architecture is based on the one-bit transform-based full search (FS) algorithm. The proposed ME hardware architecture performs FS ME for four macroblocks (MBs) in parallel. The proposed hardware architecture is implemented in VHDL. The FSBME hardware consumes 34% of the slices in a Xilinx Virtex XC6VLX240T FPGA device with a maximum frequency of 133 MHz and is capable of processing full high definition (HD) ([Formula: see text]) frames at a rate of 60 frames per second.
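
A one-bit transform replaces each pixel by a single bit, so block matching reduces to XOR and bit counting. The sketch below is a simplified software illustration under stated assumptions: each pixel is thresholded against its local mean (a plain box filter stands in for whatever kernel the paper uses), and the search window of +/-4 pixels and 8x8 block size are chosen only for the example.

```python
import numpy as np

def one_bit_transform(frame, radius=2):
    """Binarize a frame: True where a pixel is >= the mean of its neighbourhood."""
    f = frame.astype(np.int64)
    k = 2 * radius + 1
    padded = np.pad(f, radius, mode="edge")
    local_sum = np.zeros_like(f)
    for dy in range(k):
        for dx in range(k):
            local_sum += padded[dy:dy + f.shape[0], dx:dx + f.shape[1]]
    # Compare pixel * k^2 with the window sum to stay in integer arithmetic.
    return f * (k * k) >= local_sum

def nnmp(a_bits, b_bits):
    """Number of non-matching points: XOR the bit planes and count the ones."""
    return int(np.count_nonzero(a_bits ^ b_bits))

def full_search_1bt(cur_bits, ref_bits, by, bx, n=8, search=4):
    """Exhaustively test every offset in the window and keep the lowest NNMP."""
    block = cur_bits[by:by + n, bx:bx + n]
    h, w = ref_bits.shape
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + n > h or x + n > w:
                continue
            cost = nnmp(block, ref_bits[y:y + n, x:x + n])
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return (best[1], best[2]), best[0]

rng = np.random.default_rng(0)
ref_bits = rng.integers(0, 2, size=(32, 32)).astype(bool)
# Current bit plane: the reference displaced by (+1, -2).
cur_bits = np.roll(ref_bits, (-1, 2), axis=(0, 1))
mv, cost = full_search_1bt(cur_bits, ref_bits, 12, 12)
```

Since the cost is an XOR followed by a population count, each comparison is far cheaper than an 8-bit SAD, which is what makes an exhaustive search affordable in this scheme.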


1998 ◽  
Vol 4 (1) ◽  
pp. 67-79 ◽  
Author(s):  
Marco Accame ◽  
Francesco G.B. De Natale ◽  
Daniele D. Giusto

Author(s):  
Wael Farag ◽  

In this paper, based on the fusion of Lidar and Radar measurement data, high-definition probabilistic maps, and a tailored particle filter, a Real-Time Monte Carlo Localization (RT_MCL) method for autonomous cars is proposed. The lidar and radar devices are installed on the ego car, and a customized Unscented Kalman Filter (UKF) is used for their data fusion. Lidars are accurate in determining objects' positions and have a much higher spatial resolution. On the other hand, Radars are more accurate in measuring objects' velocities and perform well in extreme weather conditions. Therefore, the merits of both sensors are combined using the UKF to provide pose estimations of pole-like static objects that are well suited to serve as landmarks for vehicle localization in urban environments. These pose estimations are then clustered using the Grid-Based Density-Based Spatial Clustering of Applications with Noise (GB-DBSCAN) algorithm to represent each pole landmark in the form of a source-point model to reduce computational cost and memory requirements. A reference map that includes pole landmarks is generated off-line and extracted from a 3-D lidar to be used by a carefully designed Particle Filter (PF) for accurate ego-car localization. The particle filter is initialized by the combined GPS+IMU reading and uses an ego-car motion model to predict the states of the particles. The data association between the landmarks estimated by the UKF and those in the reference map is performed using the Iterative Closest Point (ICP) algorithm. The proposed pipeline is implemented using the high-performance language C++ and utilizes highly optimized math and optimization libraries for best real-time performance. Extensive simulation studies have been carried out to evaluate the performance of the RT_MCL in both longitudinal and lateral localization.
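
The predict, weight, and resample cycle of a particle filter can be sketched as below. This is a generic 2-D illustration under stated assumptions, not the RT_MCL pipeline: it uses a constant-velocity motion model and range measurements to landmarks with known positions, and it omits the UKF fusion, GB-DBSCAN clustering, and ICP data association described in the abstract. All function names are hypothetical.

```python
import numpy as np

def predict(particles, velocity, dt, noise_std, rng):
    """Move every particle with the motion model and add process noise."""
    noise = rng.normal(0.0, noise_std, particles.shape)
    return particles + velocity * dt + noise

def update_weights(particles, landmarks, ranges, meas_std):
    """Weight particles by the Gaussian likelihood of the measured landmark ranges."""
    # Expected range from every particle to every landmark.
    d = np.linalg.norm(particles[:, None, :] - landmarks[None, :, :], axis=2)
    log_w = -0.5 * np.sum(((d - ranges) / meas_std) ** 2, axis=1)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return w / w.sum()

def systematic_resample(particles, weights, rng):
    """Draw a new particle set in proportion to the weights (systematic scheme)."""
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(weights), positions, side="right")
    return particles[np.minimum(idx, n - 1)]

rng = np.random.default_rng(0)
landmarks = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
truth = np.array([3.0, 4.0])
ranges = np.linalg.norm(landmarks - truth, axis=1)
particles = np.array([[3.0, 4.0], [5.0, 5.0], [0.0, 0.0]])
weights = update_weights(particles, landmarks, ranges, meas_std=0.5)
survivors = systematic_resample(particles, np.array([0.0, 0.0, 1.0]), rng)
moved = predict(particles, np.array([1.0, 0.0]), 0.5, 0.0, rng)
```

The particle that coincides with the true position receives the largest weight, and resampling concentrates the particle set on high-weight hypotheses.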

