Approximate Hardware Architecture for Interpolation Filter of Versatile Video Coding

Giovane Gomes Silva; Ícaro Gonçalves Siqueira; Mateus Grellert; Claudio Machado Diniz

doi:10.29292/jics.v16i2.327

Journal of Integrated Circuits and Systems ◽

10.29292/jics.v16i2.327 ◽

2021 ◽

Vol 16 (2) ◽

pp. 1-8

Author(s):

Giovane Gomes Silva ◽

Ícaro Gonçalves Siqueira ◽

Mateus Grellert ◽

Claudio Machado Diniz

Keyword(s):

Video Coding ◽

Power Dissipation ◽

Hardware Architecture ◽

Approximate Computing ◽

Compression Efficiency ◽

Real Time Processing ◽

Interpolation Filter ◽

Interpolation Filters ◽

Software Configuration ◽

The Cost

The new Versatile Video Coding (VVC) standard was recently developed to improve on the compression efficiency of previous video coding standards and to support new applications. This gain came at the cost of higher computational complexity in the encoder algorithms, which creates a need for hardware accelerators and approximate computing techniques to reach the performance and power dissipation required by video encoding systems. This work proposes an approximate hardware architecture for the interpolation filters defined in the VVC standard, targeting real-time processing of high-resolution videos. The architecture can process videos of up to 2560x1600 pixels at 30 fps with a power dissipation of 23.9 mW when operating at 522 MHz, with an average compression efficiency degradation of only 0.41% compared to the default VVC encoder software configuration.
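
To put the reported operating point in perspective, the figures in the abstract can be turned into a rough per-sample budget. The sketch below uses only the numbers quoted above (resolution, frame rate, clock frequency, power); the function name and the idea of dividing evenly per luma sample are illustrative assumptions, not something the paper states. An encoder may run the interpolation filter several times per output sample, so the result is an upper-level budget rather than a measured figure.

```python
# Figures quoted in the abstract.
WIDTH, HEIGHT, FPS = 2560, 1600, 30
CLOCK_HZ = 522e6      # operating frequency
POWER_W = 23.9e-3     # power dissipation at that frequency


def throughput_budget(width, height, fps, clock_hz, power_w):
    """Derive per-sample and per-frame budgets (hypothetical helper)."""
    samples_per_second = width * height * fps
    return {
        "samples_per_second": samples_per_second,
        # Clock cycles available per luma sample of the output video.
        "cycles_per_sample": clock_hz / samples_per_second,
        # Energy spent during one frame period, in joules.
        "energy_per_frame_j": power_w / fps,
    }


budget = throughput_budget(WIDTH, HEIGHT, FPS, CLOCK_HZ, POWER_W)
for name, value in budget.items():
    print(f"{name}: {value:.6g}")
```

At this operating point the clock provides a little over four cycles per luma sample, and one frame period corresponds to somewhat less than a millijoule.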

Discrete Sine Transform-Based Interpolation Filter for Video Compression

10.20944/preprints201710.0097.v1 ◽

2017 ◽

Author(s):

MyungJun Kim ◽

Yung-Lyul Lee

Keyword(s):

Video Coding ◽

Discrete Cosine Transform ◽

Video Compression ◽

High Efficiency ◽

Random Access ◽

High Efficiency Video Coding ◽

Interpolation Filter ◽

Interpolation Filters ◽

Low Delay ◽

Sine Transform

High Efficiency Video Coding (HEVC) uses an 8-point filter and a 7-point filter, both based on the discrete cosine transform (DCT), for the 1/2-pixel and 1/4-pixel interpolations, respectively. In this paper, discrete sine transform (DST)-based interpolation filters (IFs) are proposed. The first proposed DST-based IFs (DST-IFs) use 8-point and 7-point filters for the 1/2-pixel and 1/4-pixel interpolations, respectively. The final proposed DST-IFs use 12-point and 11-point filters for the 1/2-pixel and 1/4-pixel interpolations, respectively. These DST-IF methods are proposed to improve the motion-compensated prediction in HEVC. The 8-point and 7-point DST-IF methods showed average BD-rate reductions of 0.7% and 0.3% in the random access (RA) and low delay B (LDB) configurations, respectively. The 12-point and 11-point DST-IF methods showed average BD-rate reductions of 1.4% and 1.2% for the luma component in the RA and LDB configurations, respectively.
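
As a point of reference for the baseline this abstract describes, the following is a minimal one-dimensional sketch of the standard HEVC DCT-based luma filters (8-point for the 1/2-sample position, 7-point for the 1/4-sample position). It does not implement the proposed DST-IFs, whose coefficients are not given here. The function name, the edge-repeating border handling, and the single-stage rounding are simplifying assumptions; the real codec filters in two dimensions with higher intermediate precision.

```python
# HEVC luma DCT-IF taps (each set sums to 64).
HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)   # 8-point, 1/2-sample position
QUARTER_PEL_TAPS = (-1, 4, -10, 58, 17, -5, 1)     # 7-point, 1/4-sample position

LEFT_REACH = 3  # both filters start 3 integer samples left of the current one


def interpolate_row(samples, taps, bit_depth=8):
    """Return one fractional sample to the right of each integer sample."""
    right_reach = len(taps) - 1 - LEFT_REACH
    # Assumed border handling: repeat the edge samples.
    padded = [samples[0]] * LEFT_REACH + list(samples) + [samples[-1]] * right_reach
    max_value = (1 << bit_depth) - 1
    out = []
    for i in range(len(samples)):
        acc = sum(tap * padded[i + k] for k, tap in enumerate(taps))
        value = (acc + 32) >> 6              # round, then divide by 64
        out.append(min(max(value, 0), max_value))  # clip to the sample range
    return out


ramp = list(range(0, 100, 10))               # 0, 10, ..., 90
half = interpolate_row(ramp, HALF_PEL_TAPS)
quarter = interpolate_row(ramp, QUARTER_PEL_TAPS)
print(half[4], quarter[4])                   # samples just right of ramp[4] == 40
```

On the linear ramp, the interior half-sample value lands midway between its neighbours (45 between 40 and 50), and the quarter-sample value is 42 after integer rounding.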

Optimized VLSI Architecture of HEVC Fractional Pixel Interpolators with Approximate Computing

Journal of Low Power Electronics and Applications ◽

10.3390/jlpea10030024 ◽

2020 ◽

Vol 10 (3) ◽

pp. 24

Author(s):

Stefania Preatto ◽

Andrea Giannini ◽

Luca Valente ◽

Guido Masera ◽

Maurizio Martina

Keyword(s):

Computational Complexity ◽

Power Efficiency ◽

High Efficiency ◽

State Of The Art ◽

VLSI Architecture ◽

Approximate Computing ◽

High Efficiency Video Coding ◽

Interpolation Filter ◽

Interpolation Filters ◽

High Computational Complexity

High Efficiency Video Coding (HEVC) is the latest video standard developed by the Joint Video Exploration Team. HEVC offers better compression results than preceding standards, but it suffers from high computational complexity. In particular, one of the most time-consuming blocks in HEVC is the fractional-sample interpolation filter, which is used in both the encoding and the decoding processes. Integrating different state-of-the-art techniques, this paper presents an architecture for interpolation filters that trades quality for energy and power efficiency by exploiting approximate interpolation filters and by halving the amount of required memory with respect to state-of-the-art implementations.

Accuracy and Physical Characterization of Approximate Arithmetic Circuits

10.5753/wscad.2020.14065 ◽

2020 ◽

Author(s):

Daniela Catelan ◽

Ricardo Santos ◽

Liana Duenha

Keyword(s):

Power Dissipation ◽

Power Efficiency ◽

Arithmetic Circuits ◽

Approximate Computing ◽

Accuracy Control ◽

Field Programmable ◽

Speed Up ◽

Circuit Techniques ◽

The Cost

With the end of Dennard scaling, designers have been looking for new alternatives, and approximate computing (AC) has attracted the attention of researchers by offering techniques ranging from the application level to the circuit level. When approximate circuit techniques are applied in hardware design, the program user may speed up the application while the designer may save area and power dissipation, at the cost of less accurate operation results. This paper discusses the trade-off between accuracy and physical efficiency by presenting a set of experiments and results for tailor-made approximate arithmetic circuits on Field-Programmable Gate Array (FPGA) platforms. Our results reveal that an approximate circuit with accuracy control may not be useful if the goal is to save circuit area or even power dissipation. Even for circuits that appear to be power efficient, the circuit size and the prototyping platform on which the hardware will be used must be taken into account.

Hardware Architecture for the Regular Interpolation Filter of the AV1 Video Coding Standard

2020 28th European Signal Processing Conference (EUSIPCO) ◽

10.23919/eusipco47968.2020.9287551 ◽

2021 ◽

Author(s):

Daiane Freitas ◽

Rafael da Silva ◽

Icaro Siqueira ◽

Claudio M. Diniz ◽

Ricardo A. L. Reis ◽

...

Keyword(s):

Video Coding ◽

Hardware Architecture ◽

Interpolation Filter

AxCEM: Designing Approximate Comparator-Enabled Multipliers

Journal of Low Power Electronics and Applications ◽

10.3390/jlpea10010009 ◽

2020 ◽

Vol 10 (1) ◽

pp. 9

Author(s):

Samar Ghabraei ◽

Morteza Rezaalipour ◽

Masoud Dehyadegari ◽

Mahdi Nazm Bojnordi

Keyword(s):

Power Dissipation ◽

Floating Point ◽

Computational Error ◽

Approximate Computing ◽

Computing Systems ◽

Error Resilient ◽

Data Intensive ◽

Efficiency Performance ◽

The Cost ◽

Data Intensive Applications

Floating-point multipliers have been a key component of nearly all forms of modern computing systems. Most data-intensive applications, such as deep neural networks (DNNs), spend the majority of their resources and energy budget on floating-point multiplication. The error-resilient nature of these applications often suggests employing approximate computing to improve the energy efficiency, performance, and area of floating-point multipliers. Prior work has shown that employing hardware-oriented approximation for computing the mantissa product may result in a significant system energy reduction at the cost of an acceptable computational error. This article examines the design of an approximate comparator used for performing mantissa products in floating-point multipliers. First, we illustrate the use of exact comparators for enhancing the power, area, and delay of floating-point multipliers. Then, we explore the design space of approximate comparators for designing efficient approximate comparator-enabled multipliers (AxCEM). Our simulation results indicate that the proposed architecture can achieve a 66% reduction in power dissipation, a 66% reduction in die area, and a 71% decrease in delay. Compared with state-of-the-art approximate floating-point multipliers, the accuracy loss in DNN applications due to the proposed AxCEM is less than 0.06%.

New insights into improving compression efficiency for distributed video coding

2009 17th International Packet Video Workshop ◽

10.1109/packet.2009.5152162 ◽

2009 ◽

Cited By ~ 1

Author(s):

Guogang Hua ◽

Chang Wen Chen

Keyword(s):

Video Coding ◽

Distributed Video Coding ◽

Compression Efficiency

Adaptive pre-interpolation filter for high efficiency video coding

Journal of Visual Communication and Image Representation ◽

10.1016/j.jvcir.2010.12.008 ◽

2011 ◽

Vol 22 (8) ◽

pp. 697-703 ◽

Cited By ~ 1

Author(s):

Jie Dong ◽

King Ngi Ngan

Keyword(s):

Video Coding ◽

High Efficiency ◽

High Efficiency Video Coding ◽

Interpolation Filter

High-Throughput Sharp Interpolation Filter Hardware Architecture for the AV1 Video Codec

10.1109/sbcci53441.2021.9529993 ◽

2021 ◽

Author(s):

Daiane Freitas ◽

Claudio M. Diniz ◽

Mateus Grellert ◽

Guilherme Correa

Keyword(s):

High Throughput ◽

Hardware Architecture ◽

Video Codec ◽

Interpolation Filter

Separable adaptive interpolation filter for video coding

2008 15th IEEE International Conference on Image Processing ◽

10.1109/icip.2008.4712301 ◽

2008 ◽

Cited By ~ 17

Author(s):

Steffen Wittmann ◽

Thomas Wedi

Keyword(s):

Video Coding ◽

Interpolation Filter ◽

Adaptive Interpolation

Approximate Adder: Lower-Part OR Adder

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.g5832.059720 ◽

2020 ◽

Vol 9 (7) ◽

pp. 1176-1180

Keyword(s):

Real Time ◽

Power Dissipation ◽

High Performance ◽

Building Blocks ◽

CMOS Technology ◽

Approximate Computing ◽

Computing Technique ◽

Ripple Carry Adder ◽

Real Time Applications ◽

DSP Systems

High-performance VLSI systems are essential in real-time applications. To increase the performance of such systems, approximate computing techniques trade a slight loss in accuracy for better circuit performance. These approximate circuits are used in error-tolerant applications, where the output need not be exact. This paper concentrates mainly on approximate adders, as they are major building blocks of DSP systems. It presents an analysis of the Lower-part OR Adder (LOA) for 4-bit addition and compares it with a precise adder, the Ripple Carry Adder, using Mentor Graphics tools in 90 nm CMOS technology. Our experimental results show savings of 17%-70% in power dissipation, 4%-32% in area, and 19%-84% in time due to the approximate adder. Because LOA-2 and LOA-3 perform best, these two adders can be used for error-tolerant applications, and either one can be selected based on the requirement.
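
The behaviour of a Lower-part OR Adder can be illustrated with a short bit-level model. The sketch below follows the commonly described LOA structure: the low bits are approximated with a bitwise OR, the high bits are added exactly, and the carry into the exact part is the AND of the most significant bits of the two low parts. It is an assumption that "LOA-2" and "LOA-3" in the abstract mean two and three approximated low bits; the function and parameter names are hypothetical.

```python
def loa_add(a, b, width=4, approx_bits=2):
    """Behavioural model of a Lower-part OR Adder (LOA)."""
    assert 0 <= a < (1 << width) and 0 <= b < (1 << width)
    assert 0 <= approx_bits <= width
    low_mask = (1 << approx_bits) - 1
    # Approximate part: OR replaces addition on the low bits.
    low = (a | b) & low_mask
    # Carry into the exact part: AND of the top bits of the two low parts.
    if approx_bits > 0:
        carry = (a >> (approx_bits - 1)) & (b >> (approx_bits - 1)) & 1
    else:
        carry = 0
    # Exact part: ordinary addition on the remaining high bits.
    high = (a >> approx_bits) + (b >> approx_bits) + carry
    return (high << approx_bits) | low


def mean_abs_error(width=4, approx_bits=2):
    """Average |approximate - exact| over all input pairs."""
    n = 1 << width
    total = sum(abs(loa_add(a, b, width, approx_bits) - (a + b))
                for a in range(n) for b in range(n))
    return total / (n * n)


print(loa_add(5, 6), 5 + 6)   # no low-bit overlap: result is exact
print(loa_add(3, 3), 3 + 3)   # overlapping low bits: result is off by one
for k in (2, 3):
    print(k, mean_abs_error(4, k))
```

The error appears only when both operands have a 1 in the same low bit position, which is why widening the approximate part saves more hardware but raises the average error.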
