video codecs Latest Research Papers

The block-based coding structure in the hybrid video coding framework inevitably introduces compression artifacts such as blocking, ringing, and so on. To compensate for those artifacts, extensive filtering techniques were proposed in the loop of video codecs, which are capable of boosting the subjective and objective qualities of reconstructed videos. Recently, neural network-based filters were presented with the power of deep learning from a large magnitude of data. Though the coding efficiency has been improved from traditional methods in High-Efficiency Video Coding (HEVC), the rich features and information generated by the compression pipeline have not been fully utilized in the design of neural networks. Therefore, in this article, we propose the Residual-Reconstruction-based Convolutional Neural Network (RRNet) to further improve the coding efficiency to its full extent, where the compression features induced from bitstream in form of prediction residual are fed into the network as an additional input to the reconstructed frame. In essence, the residual signal can provide valuable information about block partitions and can aid reconstruction of edge and texture regions in a picture. Thus, more adaptive parameters can be trained to handle different texture characteristics. The experimental results show that our proposed RRNet approach presents significant BD-rate savings compared to HEVC and the state-of-the-art CNN-based schemes, indicating that residual signal plays a significant role in enhancing video frame reconstruction.

Download Full-text

An Analytic Transform Kernel Derivation Method for Video Codecs

Applied Sciences ◽

10.3390/app11199280 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9280

Author(s):

Ankit Kumar ◽

Bumshik Lee

Keyword(s):

Video Coding ◽

Random Access ◽

Block Size ◽

Unit Element ◽

Video Codecs ◽

Derivation Method ◽

The One ◽

The Relationship ◽

Mathematical Operations ◽

Block Sizes

In the standardization of versatile video coding (VVC), discrete cosine transform (DCT)-2, discrete sine transform (DST)-7, and DCT-8 are regarded as the primary transform kernels. However, DST-4 and DCT-4 can also be considered as the transform kernels instead of using DST-7 and DCT-8 owing to their effectiveness in smaller resolution test sequences. To implement these different block size transform kernels, a considerable amount of memory has to be allocated. Moreover, memory consumption to store different block size transform kernels is regarded as a major issue in video coding standardization. To address this problem, a common sparse unified matrix concept is introduced in this study, where any block size transform kernel matrix can be obtained after some mathematical operations. The proposed common sparse unified matrix saves approximately 80% of the static memory by storing only a few transform kernel elements for DCT-2, DST-7, and DCT-8. Full-required transform kernels are derived using the stored transform kernels and generated unit-element matrices and a permutation matrix. The static memory required is only for 1648 elements instead of 8180 elements, each with 8-bit precision. The defined common sparse unified matrix is composed of two parts: a unified DST-3 matrix and a grouped DST-7 matrix. The unified DST-3 matrix is used to derive different points of DCT-2 transform kernels, and the grouped DST-7 matrix is used to derive different points of DST-7 and DCT-8 transform kernels. The new technique of grouping concept is introduced, which shows the relationship between different rows of DST-7 transform kernels with various block sizes. The proposed grouping concept supports the fast algorithm of DST-7 by implementing the proposed method of the “one group one feature” principle. The simulation was conducted using the VTM-3.0 reference software under common test conditions. The simulation result of the all intra (AI) configuration is Y = 0.00%, U = −0.02%, V = 0.00% with an encoding time of 100%, and a decoding time of 100%. Similarly, the simulation results of random access (RA) configuration are Y = −0.01%, U = 0.09%, V = 0.06%, and the encoding and decoding times are 101% and 100%, respectively. The simulation result of the low delay B (LDB) configuration is Y = 0.01%, U = 0.08%, and V = −0.27%, for encoding and decoding times of 101% and 100%, respectively.

Download Full-text

Universal Modeling of Monoscopic and Multiview Video Codecs with Applications to Encoder Control

10.1109/icip42928.2021.9506735 ◽

2021 ◽

Author(s):

Marek Domanski ◽

Yasir Al-Obaidi ◽

Tomasz Grajek

Keyword(s):

Video Codecs ◽

Multiview Video

Download Full-text

A Method of Codec Comparison and Selection for Good Quality Video Transmission Over Limited-Bandwidth Networks

Sensors ◽

10.3390/s21134589 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4589

Author(s):

Janusz Klink

Keyword(s):

Video Transmission ◽

Interpolation Method ◽

Spline Interpolation ◽

Video Quality ◽

Rate Distortion ◽

Specific Situation ◽

Video Codecs ◽

Limited Bandwidth ◽

Video Footage ◽

Quality Video

Finding a proper balance between video quality and the required bandwidth is an important issue, especially in networks of limited capacity. The problem of comparing the efficiency of video codecs and choosing the most suitable one in a specific situation has become very important. This paper proposes a method of comparing video codecs while also taking into account objective quality assessment metrics. The author shows the process of preparing video footage, assessing its quality, determining the rate–distortion curves, and calculating the bitrate saving for pairs of examined codecs. Thanks to the use of the spline interpolation method, the obtained results are better than those previously presented in the literature, and more resistant to the quality metric used.

Download Full-text

A Dynamic Predictive Search Algorithm for Fast Block-Based Motion Estimation

10.32920/ryerson.14655219 ◽

2021 ◽

Author(s):

Behnaz Abdoli

Keyword(s):

Computational Complexity ◽

Motion Estimation ◽

Compression Ratio ◽

Video Processing ◽

Search Algorithm ◽

Fast Motion Estimation ◽

Low Computational Complexity ◽

Video Codecs ◽

Fast Motion ◽

Block Based

Predictive fast Motion Estimation (ME) algorithms have been widely used in video CODECs due to their performance efficiency and low computational complexity. In this thesis, a new block-based fast motion estimation technique named Dynamic Predictive Search Algorithm (DPSA) is developed, which can be considered in predictive zonal search category. The proposed approach is based on the observation that temporally and spatially adjacent macro-blocks are not just statically correlated, but also dynamic alterations in their motion content are highly coherent. DPSA introduces a new set of six candidate predicted motion vectors. For early termination criteria, DPSA modifies termination procedure of already existing EPZS algorithm. Performance of this newly proposed algorithm has been compared to four other state-of-the-art algorithms implemented on JVT, H.264 standard software platform. Experimental results have proven that DPSA accomplishes up to 38% compression ratio enhancement achieved by a process with more 14.75% less computational complexity and up to0.47 dB higher PSNR values over the EPZS. It also manages to have up to 13% speed up over EPZS algorithm. Because of its simplicity and low computational complexity DPSA is energy efficient for portable video processing in computation- or power-constrained applications and easy to be implemented on both FPGA- and Microcontroller-based embedded systems. Also, higher compression ratio makes DPSA more compatible with limited capacity storage media, and limited band-width transmission networks.

Download Full-text

A Dynamic Predictive Search Algorithm for Fast Block-Based Motion Estimation

10.32920/ryerson.14655219.v1 ◽

2021 ◽

Author(s):

Behnaz Abdoli

Keyword(s):

Computational Complexity ◽

Motion Estimation ◽

Compression Ratio ◽

Video Processing ◽

Search Algorithm ◽

Fast Motion Estimation ◽

Low Computational Complexity ◽

Video Codecs ◽

Fast Motion ◽

Block Based

Predictive fast Motion Estimation (ME) algorithms have been widely used in video CODECs due to their performance efficiency and low computational complexity. In this thesis, a new block-based fast motion estimation technique named Dynamic Predictive Search Algorithm (DPSA) is developed, which can be considered in predictive zonal search category. The proposed approach is based on the observation that temporally and spatially adjacent macro-blocks are not just statically correlated, but also dynamic alterations in their motion content are highly coherent. DPSA introduces a new set of six candidate predicted motion vectors. For early termination criteria, DPSA modifies termination procedure of already existing EPZS algorithm. Performance of this newly proposed algorithm has been compared to four other state-of-the-art algorithms implemented on JVT, H.264 standard software platform. Experimental results have proven that DPSA accomplishes up to 38% compression ratio enhancement achieved by a process with more 14.75% less computational complexity and up to0.47 dB higher PSNR values over the EPZS. It also manages to have up to 13% speed up over EPZS algorithm. Because of its simplicity and low computational complexity DPSA is energy efficient for portable video processing in computation- or power-constrained applications and easy to be implemented on both FPGA- and Microcontroller-based embedded systems. Also, higher compression ratio makes DPSA more compatible with limited capacity storage media, and limited band-width transmission networks.

Download Full-text

Fast Thumbnail Extraction for H.264/AVC, HEVC and VP9

Applied Sciences ◽

10.3390/app11041844 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1844

Author(s):

Joohyung Byeon ◽

Seungchul Jang ◽

Jongseok Lee ◽

Kyungyong Kim ◽

Donggyu Sim

Keyword(s):

High Speed ◽

Intra Prediction ◽

Extraction Time ◽

Memory Usage ◽

Video Codecs ◽

Limited Memory ◽

Full Resolution ◽

Inverse Transform ◽

Partial Inverse

In this paper, we propose a partial decoding method with limited memory usage for high-speed thumbnail extraction. The proposed method performs a partial inverse transform and a partial intra prediction in order to reconstruct pixels for intra prediction and thumbnails. Thereafter, the reconstructed pixels at the bottom and right line of the block are stored in the line buffer and the thumbnail buffer without being stored in the decoded picture buffer with full resolution. H.264/AVC, HEVC and VP9 video codecs have different coding structures, prediction and transforms; however, the proposed algorithm can be applied to the corresponding codecs in the same manner. In order to evaluate the performance of the proposed method, we implemented the proposed algorithm for H.264/AVC, HEVC and VP9. We found that the thumbnail extraction time of the proposed method decreased by 66% in H.264/AVC, 52% in HEVC and 48% in VP9 as compared to the full decoding method.

Download Full-text