Bagged Tree And ResNet Based Joint End-to-End Fast CTU Partition Decision Algorithm For Video Intra Coding

Author(s):  
Yixiao Li ◽  
Lixiang Li ◽  
Yuan Fang ◽  
Haipeng Peng ◽  
Nam Ling

Abstract In the development of video coding standards, each new generation has greatly improved bit-rate efficiency over its predecessor, but at the cost of a huge increase in coding complexity. Coding standards such as High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC) and AOMedia Video 2 (AV2) obtain optimal encoding performance by traversing all possible combinations of coding unit (CU) partitions and selecting the combination with the minimum coding cost. This exhaustive search accounts for a large part of the encoding complexity. To reduce the complexity of coding block partitioning across video coding standards, this paper proposes an end-to-end fast algorithm for deciding the partition structure of a coding tree unit (CTU) in intra coding. It can be extended to various coding standards with fine-tuning, and is applied to the intra coding of the HEVC reference software HM16.7 as an example. In the proposed method, the splitting decision for a CTU is first made by a well-designed bagged tree model. Then, the partition problem of a 32×32 CU is modeled as a 17-output classification task and solved by a well-trained residual network (ResNet). By jointly using the bagged tree and the ResNet, the proposed fast CTU partition algorithm generates the partition quad-tree structure of a CTU through an end-to-end prediction process, instead of multiple decision-making procedures at each depth level. In addition, several effective and representative datasets are constructed in this paper to lay the foundation for high prediction accuracy. Experimental results show that, compared with the original HM16.7 encoder, the proposed algorithm reduces the encoding time by 59.79% on average, while the BD-rate loss is as low as 2.02%, which outperforms most state-of-the-art approaches in the fast intra CU partition area.
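The abstract does not say how the 17 outputs are defined. One plausible reading, offered here only as an assumption, is that a 32×32 CU is either kept whole (1 pattern) or split into four 16×16 CUs, each of which is independently kept or split into 8×8 CUs (2^4 = 16 patterns), giving 17 classes in total. The function names below are hypothetical and only illustrate that counting.

```python
from itertools import product


def enumerate_cu32_partitions():
    """List candidate partition patterns of one 32x32 CU (assumed labeling)."""
    # Pattern None: the 32x32 CU is kept whole.
    patterns = [None]
    # Otherwise the CU splits into four 16x16 CUs, each of which is either
    # kept (False) or split once more into four 8x8 CUs (True).
    for flags in product((False, True), repeat=4):
        patterns.append(flags)
    return patterns


def leaf_sizes(pattern):
    """Return the edge lengths of the leaf CUs produced by one pattern."""
    if pattern is None:
        return [32]
    sizes = []
    for split in pattern:
        sizes.extend([8, 8, 8, 8] if split else [16])
    return sizes


patterns = enumerate_cu32_partitions()
print(len(patterns))  # 17
```

Under this reading, a single classifier output selects the whole sub-tree below the 32×32 level at once, which is what lets the prediction avoid separate decisions at each depth.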

The increasing use of video in everyday life demands ever stronger compression. International standardization bodies for video coding are working to reduce bitrate so that high-resolution video can be compressed efficiently. As resolution increases, so does the size of the coding units. Recent video coding standards such as High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC) introduce larger coding units with flexible quadtree decompositions. In inter-picture prediction, the best partitioning structure has to be found for every sub-block during motion estimation, and with larger coding units this search adds considerable computational complexity. In the proposed work we present a computational complexity control scheme based on predictive data mining. The method predicts whether or not to split a coding unit. The decision tree model, trained offline, achieves a 77.73% saving in encoding time with a minimal change of 0.15 in average PSNR and 0.00074 in average SSIM.


2019 ◽  
Vol 29 (03) ◽  
pp. 2050046
Author(s):  
Xin Li ◽  
Na Gong

The state-of-the-art High Efficiency Video Coding standard (HEVC/H.265) adopts a hierarchical quadtree-structured coding unit (CU) to enhance coding efficiency. However, the computational complexity increases significantly because of the exhaustive rate-distortion (RD) optimization process needed to obtain the optimal coding tree unit (CTU) partition. In this paper, we propose a fast CU size decision algorithm to reduce the heavy computational burden of the encoding process. To achieve this, the CU splitting process is modeled as a three-stage binary classification problem according to the CU size from [Formula: see text], [Formula: see text] to [Formula: see text]. In each CU partition stage, a deep learning approach is applied. Appropriate and efficient features for training the deep learning models are extracted from the spatial and pixel domains to eliminate the dependency on video content as well as on encoding configurations. Furthermore, the deep learning framework is built as a third-party library and embedded into the HEVC simulator to speed up the process. The experimental results show that the proposed algorithm achieves a significant complexity reduction: it reduces the encoding time by 49.65% (Low Delay) and 48.81% (Random Access) on average compared with the traditional HEVC encoder, with a negligible degradation in coding efficiency (2.78% loss in BDBR and 0.145[Formula: see text]dB loss in BDPSNR for Low Delay, and 2.68% loss in BDBR and 0.128[Formula: see text]dB loss in BDPSNR for Random Access).
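A staged split decision of this kind can be sketched as a recursive quadtree walk with one binary classifier per CU size. The sizes are elided in the abstract above; the sketch assumes the usual HEVC stages of 64×64, 32×32 and 16×16. The per-stage "classifiers" here are hypothetical variance thresholds standing in for the trained deep learning models, which the abstract does not describe.

```python
def variance(block):
    """Population variance of the pixel values in a 2-D block."""
    vals = [p for row in block for p in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)


def crop(block, y, x, size):
    """Extract the size x size sub-block whose top-left corner is (y, x)."""
    return [row[x:x + size] for row in block[y:y + size]]


def partition_ctu(block, classifiers, y=0, x=0, size=64):
    """Return the leaf CUs of one CTU as a list of (y, x, size) tuples."""
    decide = classifiers.get(size)
    # No classifier for this size (e.g. 8x8) or "no split": this CU is a leaf.
    if decide is None or not decide(crop(block, y, x, size)):
        return [(y, x, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += partition_ctu(block, classifiers, y + dy, x + dx, half)
    return leaves


# Stand-in stage classifiers (assumption): split when the block is textured.
classifiers = {size: (lambda b: variance(b) > 100.0) for size in (64, 32, 16)}

# Toy CTU: flat everywhere except a checkerboard in the top-left 8x8 corner.
ctu = [[255 if (y < 8 and x < 8 and (x + y) % 2) else 0 for x in range(64)]
       for y in range(64)]
leaves = partition_ctu(ctu, classifiers)
print(len(leaves))  # 10
```

Each classifier is consulted only when its parent chose to split, so flat regions terminate early and the exhaustive RD search over all sub-partitions is skipped.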


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Soulef Bouaafia ◽  
Randa Khemiri ◽  
Amna Maraoui ◽  
Fatma Elzahra Sayadi

High Efficiency Video Coding (HEVC) provides a better compression ratio than the earlier standard, H.264/Advanced Video Coding (AVC). In fact, HEVC saves 50% of the bit rate compared to H.264/AVC for the same subjective quality. This improvement is notably obtained through the hierarchical quadtree-structured Coding Unit (CU). However, the computational complexity increases significantly because of the full-search Rate-Distortion Optimization needed to reach the optimal Coding Tree Unit partition. Despite the many speedup algorithms developed in the literature, HEVC encoding complexity remains a crucial problem in the video coding field. To address it, we propose in this paper a deep learning model-based fast mode decision algorithm for HEVC inter mode. Firstly, we give an in-depth overview of the proposed CNN-LSTM, which plays a central role in this contribution by predicting the CU splitting and thus reducing the HEVC encoding complexity. Secondly, a large training and inference dataset for HEVC inter coding is built to train and test the proposed deep framework. Within this framework, the temporal correlation of the CU partition across video frames is handled by the LSTM network. Numerical results show that the proposed CNN-LSTM scheme reduces the encoding complexity by 58.60% with an increase in the BD-rate of 1.78% and a decrease in the BD-PSNR of 0.053 dB. Compared to related works, the proposed scheme achieves the best compromise between RD performance and complexity reduction, as shown by the experimental results.


2020 ◽  
Vol 10 (2) ◽  
pp. 496-501
Author(s):  
Wen Si ◽  
Qian Zhang ◽  
Zhengcheng Shi ◽  
Bin Wang ◽  
Tao Yan ◽  
...  

High Efficiency Video Coding (HEVC) is the next-generation video coding standard. In HEVC, 35 intra prediction modes are defined to improve coding efficiency. Combined with the flexible coding unit (CU) structure, this large number of prediction modes results in huge computational complexity. To reduce this computational burden, this paper presents a gradient-based candidate list clipping algorithm for intra mode prediction. Experimental results show that the proposed algorithm reduces total encoding time by 29.16% with just a 1.34% BD-rate increase and a 0.07 dB decrease in BD-PSNR.
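The abstract gives no details of the clipping rule. A common gradient-based idea, shown here purely as an illustrative sketch and not as the authors' method, is to estimate the dominant edge orientations of a block with Sobel filters, accumulate gradient amplitude per angular mode, and test only the strongest few modes plus Planar and DC. The mode numbering (Planar 0, DC 1, horizontal 10, vertical 26, angular modes 2 to 34) follows HEVC; the uniform 5.625° spacing between modes and the `keep` parameter are simplifying assumptions.

```python
import math

PLANAR, DC = 0, 1


def angle_to_mode(phi):
    """Map an edge orientation in degrees, [0, 180), to an angular mode 2..34.

    Assumes modes are evenly spaced (real HEVC angles are not uniform):
    0 deg -> 10 (horizontal), 45 -> 18, 90 -> 26 (vertical), 135 -> 34.
    """
    step = 180.0 / 32
    offset = phi if phi <= 135.0 else phi - 180.0
    return max(2, min(34, round(10 + offset / step)))


def candidate_modes(block, keep=4):
    """Return Planar, DC and the `keep` angular modes with the most edge energy."""
    h, w = len(block), len(block[0])
    hist = {}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # 3x3 Sobel responses (x to the right, y downwards).
            gx = (block[y - 1][x + 1] + 2 * block[y][x + 1] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y][x - 1] - block[y + 1][x - 1])
            gy = (block[y + 1][x - 1] + 2 * block[y + 1][x] + block[y + 1][x + 1]
                  - block[y - 1][x - 1] - 2 * block[y - 1][x] - block[y - 1][x + 1])
            amp = abs(gx) + abs(gy)
            if amp == 0:
                continue
            # The edge runs perpendicular to the gradient: direction (-gy, gx).
            phi = math.degrees(math.atan2(gx, -gy)) % 180.0
            mode = angle_to_mode(phi)
            hist[mode] = hist.get(mode, 0) + amp
    strongest = sorted(hist, key=hist.get, reverse=True)[:keep]
    return [PLANAR, DC] + strongest


# A vertical edge should favour the vertical mode.
vertical_edge = [[0] * 4 + [255] * 4 for _ in range(8)]
print(candidate_modes(vertical_edge))  # [0, 1, 26]
```

With the candidate list clipped this way, the encoder evaluates a handful of modes per block instead of all 35, which is where the time saving would come from.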


2016 ◽  
Vol 16 (4) ◽  
pp. 883-899
Author(s):  
Ismail Marzuki ◽  
Jonghyun Ma ◽  
Yong-Jo Ahn ◽  
Donggyu Sim

H.265, also called High Efficiency Video Coding (HEVC), is the international standard proposed by the Joint Collaborative Team on Video Coding and released in 2013 in response to the constantly increasing demand for video applications. The standard halves the bitrate compared to its predecessor H.264, at the expense of a huge computational burden on the encoder. In the proposed work we focus on the intra prediction phase of video encoding, where 33 angular modes are introduced in addition to the DC and Planar modes in order to achieve high-quality video at higher resolutions. We propose applying machine learning to HEVC intra prediction to accelerate the angular mode decision process. The features used are low-complexity features requiring minimal computation, so as to avoid any additional burden on the encoder. The decision tree model built is simple yet efficient, as a complexity reduction scenario requires. The proposed method achieves a substantial average encoding time saving of 86.59% at QP values of 4, 22, 27 and 32, with a minimal loss of 0.033 in PSNR and 0.0023 in SSIM, which makes it suitable for adopting High Efficiency Video Coding in real-time applications.


2020 ◽  
pp. 599-609
Author(s):  
Hajar Touzani ◽  
Ibtissem Wali ◽  
Fatima Errahimi ◽  
Anass Mansouri ◽  
Nouri Masmoudi ◽  
...  

A new and stronger video compression standard, H.265/HEVC (High Efficiency Video Coding), has been developed in recent years. It brings several improvements over H.264/AVC (Advanced Video Coding). For intra prediction, H.265 includes 33 directional modes, instead of the 8 used in H.264, in addition to the planar and DC modes; this makes coding more efficient but adds computational complexity. Reducing encoding time is therefore one of the main issues for embedded implementations of HEVC. In this paper, an embedded implementation of a fast intra prediction algorithm is carried out on ARM processors under the embedded Linux operating system. Experimental results comparing the original HM16.7 with the proposed algorithm show that the encoding time is reduced by an average of 61.5%, with an increase of 1.19 in the bit rate and a small degradation in the PSNR of 0.05%.

