scholarly journals Approximate Hardware Architecture for Interpolation Filter of Versatile Video Coding

2021 ◽  
Vol 16 (2) ◽  
pp. 1-8
Author(s):  
Giovane Gomes Silva ◽  
Ícaro Gonçalves Siqueira ◽  
Mateus Grellert ◽  
Claudio Machado Diniz

The new Versatile Video Coding (VVC) standard was recently developed to improve compression efficiency of previous video coding standards and to support new applications. This was achieved at the cost of an increase in the computational complexity of the encoder algorithms, which leads to the need to develop hardware accelerators and to apply approximate computing techniques to achieve the performance and power dissipation required for systems that encode video. This work proposes the implementation of an approximate hardware architecture for interpolation filters defined in the VVC standard targeting real-time processing of high resolution videos. The architecture is able to process up to 2560x1600 pixels videos at 30 fps with power dissipation of 23.9 mW when operating at a frequency of 522 MHz, with an average compression efficiency degradation of only 0.41% compared to default VVC video encoder software configuration.

Author(s):  
MyungJun Kim ◽  
Yung-Lyul Lee

High Efficiency Video Coding (HEVC) uses an 8-point filter and a 7-point filter, which are based on the discrete cosine transform (DCT), for the 1/2-pixel and 1/4-pixel interpolations, respectively. In this paper, discrete sine transform (DST)-based interpolation filters (IF) are proposed. The first proposed DST-based IFs (DST-IFs) use 8-point and 7-point filters for the 1/2-pixel and 1/4-pixel interpolations, respectively. The final proposed DST-IFs use 12-point and 11-point filters for the 1/2-pixel and 1/4-pixel interpolations, respectively. These DST-IF methods are proposed to improve the motion-compensated prediction in HEVC. The 8-point and 7-point DST-IF methods showed average BD-rate reductions of 0.7% and 0.3% in the random access (RA) and low delay B (LDB) configurations, respectively. The 12-point and 11-point DST-IF methods showed average BD-rate reductions of 1.4% and 1.2% in the RA and LDB configurations for the Luma component, respectively.


2020 ◽  
Vol 10 (3) ◽  
pp. 24
Author(s):  
Stefania Preatto ◽  
Andrea Giannini ◽  
Luca Valente ◽  
Guido Masera ◽  
Maurizio Martina

High Efficiency Video Coding (HEVC) is the latest video standard developed by the Joint Video Exploration Team. HEVC is able to offer better compression results than preceding standards but it suffers from a high computational complexity. In particular, one of the most time consuming blocks in HEVC is the fractional-sample interpolation filter, which is used in both the encoding and the decoding processes. Integrating different state-of-the-art techniques, this paper presents an architecture for interpolation filters, able to trade quality for energy and power efficiency by exploiting approximate interpolation filters and by halving the amount of required memory with respect to state-of-the-art implementations.


2020 ◽  
Author(s):  
Daniela Catelan ◽  
Ricardo Santos ◽  
Liana Duenha

With the end of Dennard's scale, designers have been looking for new alternatives and approximate computing (AC) has managed to attract the attention of researchers, by offering techniques ranging from the application level to the circuit level. When applying approximate circuit techniques in hardware design, the program user may speed up the application while a designer may save area and power dissipation at the cost of less accuracy on the operations results. This paper discusses the compromise between accuracy versus physical efficiency by presenting a set of experiments and results of tailor-made approximate arithmetic circuits on Field-Programmable Gate Array (FPGA) platforms. Our results reveal that an approximate circuit with accuracy control could not be useful if the goal is to save circuit area or even power dissipation. Even for circuits that seem to have power efficiency, we should care about the size and prototyping platform where the hardware will be used.


Author(s):  
Daiane Freitas ◽  
Rafael da Silva ◽  
Icaro Siqueira ◽  
Claudio M. Diniz ◽  
Ricardo A. L. Reis ◽  
...  

2020 ◽  
Vol 10 (1) ◽  
pp. 9
Author(s):  
Samar Ghabraei ◽  
Morteza Rezaalipour ◽  
Masoud Dehyadegari ◽  
Mahdi Nazm Bojnordi

Floating-point multipliers have been the key component of nearly all forms of modern computing systems. Most data-intensive applications, such as deep neural networks (DNNs), expend the majority of their resources and energy budget for floating-point multiplication. The error-resilient nature of these applications often suggests employing approximate computing to improve the energy-efficiency, performance, and area of floating-point multipliers. Prior work has shown that employing hardware-oriented approximation for computing the mantissa product may result in significant system energy reduction at the cost of an acceptable computational error. This article examines the design of an approximate comparator used for preforming mantissa products in the floating-point multipliers. First, we illustrate the use of exact comparators for enhancing power, area, and delay of floating-point multipliers. Then, we explore the design space of approximate comparators for designing efficient approximate comparator-enabled multipliers (AxCEM). Our simulation results indicate that the proposed architecture can achieve a 66% reduction in power dissipation, another 66% reduction in die-area, and a 71% decrease in delay. As compared with the state-of-the-art approximate floating-point multipliers, the accuracy loss in DNN applications due to the proposed AxCEM is less than 0.06%.


2021 ◽  
Author(s):  
Daiane Freitas ◽  
Claudio M. Diniz ◽  
Mateus Grellert ◽  
Guilherme Correa

High-performance VLSI systems are essential in real-time applications, in order to increase the performance of the VLSI systems, an approximate computing technique is followed where the performance of the circuit is enhanced by trading off it with a slight loss in the accuracy. These approximate circuits are used in error-tolerant applications, where output need not be accurate. This paper concentrates mainly on approximate adders, as they are major building blocks of DSP systems. The analysis of the Lower-part OR Adder for 4-bit addition and comparison of it with the precise adder i.e., Ripple Carry Adder using the mentor graphics tool in 90 nm CMOS technology are presented in this paper. Our experimental results show that there is 17%-70% savings in power dissipation, 4%-32% saving in the area, and 19%-84% savings in time due to approximate adder. As the LOA-2 and LOA-3 are performing optimally these two adders can be used for error-tolerant applications and based on the requirement LOA-2 or LOA-3 can be selected.


Sign in / Sign up

Export Citation Format

Share Document