Spatio-Temporal Deformable Convolution for Compressed Video Quality Enhancement

2020 ◽ Vol 34 (07) ◽ pp. 10696-10703
Author(s): Jianing Deng, Li Wang, Shiliang Pu, Cheng Zhuo

Recent years have witnessed remarkable success of deep learning methods in quality enhancement for compressed video. To better explore temporal information, existing methods usually estimate optical flow for temporal motion compensation. However, since compressed video can be seriously distorted by various compression artifacts, the estimated optical flow tends to be inaccurate and unreliable, resulting in ineffective quality enhancement. In addition, optical flow estimation for consecutive frames is generally conducted in a pairwise manner, which is computationally expensive and inefficient. In this paper, we propose a fast yet effective method for compressed video quality enhancement by incorporating a novel Spatio-Temporal Deformable Fusion (STDF) scheme to aggregate temporal information. Specifically, the proposed STDF takes a target frame along with its neighboring reference frames as input to jointly predict an offset field that deforms the spatio-temporal sampling positions of convolution. As a result, complementary information from both target and reference frames can be fused within a single Spatio-Temporal Deformable Convolution (STDC) operation. Extensive experiments show that our method achieves state-of-the-art performance in compressed video quality enhancement in terms of both accuracy and efficiency.
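To make the fusion idea concrete, the following is a minimal PyTorch sketch of a spatio-temporal deformable fusion layer: all input frames are stacked on the channel axis, a small convolutional head jointly predicts one offset field per frame, and a single deformable convolution fuses the deformed samples. The clip length, channel widths and offset-head design are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class STDFSketch(nn.Module):
    """Illustrative spatio-temporal deformable fusion (hypothetical sizes)."""
    def __init__(self, num_frames=7, feat_ch=32, k=3):
        super().__init__()
        # Offset prediction: the stacked clip is mapped to one 2-D offset per
        # kernel sampling position *per frame* (num_frames offset groups).
        self.offset_net = nn.Sequential(
            nn.Conv2d(num_frames, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, 2 * k * k * num_frames, 3, padding=1),
        )
        # A single deformable convolution fuses target and reference frames jointly.
        self.fusion = DeformConv2d(num_frames, feat_ch, k, padding=k // 2)

    def forward(self, clip):                 # clip: (B, T, H, W) grayscale frames
        offsets = self.offset_net(clip)      # (B, 2*k*k*T, H, W)
        return self.fusion(clip, offsets)    # fused features: (B, feat_ch, H, W)

# fused = STDFSketch()(torch.randn(2, 7, 64, 64))
```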

Author(s): Ruixin Liu, Zhenyu Weng, Yuesheng Zhu, Bairong Li

Video inpainting aims to synthesize visually pleasant and temporally consistent content in the missing regions of a video. Due to the variety of motions across different frames, it is highly challenging to exploit effective temporal information to recover videos. Existing deep learning based methods usually estimate optical flow to align frames and thereby exploit useful information between frames. However, these methods tend to generate artifacts once the estimated optical flow is inaccurate. To alleviate this problem, we propose a novel end-to-end Temporal Adaptive Alignment Network (TAAN) for video inpainting. The TAAN aligns reference frames with the target frame via implicit motion estimation at the feature level and then reconstructs the target frame by taking the aggregated aligned reference frame features as input. In the proposed network, a Temporal Adaptive Alignment (TAA) module based on deformable convolutions is designed to perform temporal alignment in a local, dense and adaptive manner. Both quantitative and qualitative evaluation results show that our method significantly outperforms existing deep learning based methods.
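As an illustration of feature-level alignment without explicit optical flow, here is a hedged PyTorch sketch of a deformable-convolution alignment step in the spirit of the TAA module: sampling offsets are inferred from the concatenated target and reference features, and the reference features are then resampled toward the target. Channel sizes and the single-layer offset predictor are simplifying assumptions, not the TAAN implementation.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class TAASketch(nn.Module):
    """Illustrative feature-level temporal alignment (hypothetical sizes)."""
    def __init__(self, ch=64, k=3):
        super().__init__()
        # Implicit motion estimation: offsets come from the concatenated
        # target and reference features, so no explicit optical flow is computed.
        self.offset_net = nn.Conv2d(2 * ch, 2 * k * k, 3, padding=1)
        self.align = DeformConv2d(ch, ch, k, padding=k // 2)

    def forward(self, ref_feat, tgt_feat):     # each: (B, ch, H, W)
        offsets = self.offset_net(torch.cat([ref_feat, tgt_feat], dim=1))
        return self.align(ref_feat, offsets)   # reference features aligned to target

# aligned = TAASketch()(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```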


2020 ◽ Vol 34 (07) ◽ pp. 10713-10720
Author(s): Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, ...

A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame of a video clip is annotated, which prevents most supervised methods from utilizing information in the remaining frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flow, which encodes temporal consistency to improve video segmentation. However, video segmentation and optical flow estimation are still treated as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation brings semantic information to handle occlusion for more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences to guarantee the temporal consistency of the segmentation. Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while requiring no additional computation at inference. Extensive experiments show that the proposed model makes video semantic segmentation and optical flow estimation benefit from each other and outperforms existing methods under the same settings in both tasks.
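One plausible way to couple the two tasks at training time is a flow-guided temporal consistency loss on the segmentation: warp the next frame's segmentation logits back to the current frame with the estimated flow and penalise disagreement only at non-occluded pixels. The sketch below follows that pattern; the tensor conventions, loss form and helper name are assumptions for illustration, not the paper's code.

```python
import torch
import torch.nn.functional as F

def temporal_consistency_loss(seg_t, seg_tp1, flow, occ_mask):
    """seg_t, seg_tp1: (B, C, H, W) logits; flow: (B, 2, H, W) forward flow in
    pixels; occ_mask: (B, 1, H, W), 1 where the correspondence is non-occluded."""
    b, _, h, w = seg_t.shape
    # Pixel coordinates displaced by the flow, then normalised to [-1, 1].
    ys, xs = torch.meshgrid(torch.arange(h, device=seg_t.device),
                            torch.arange(w, device=seg_t.device), indexing="ij")
    gx = (xs.float() + flow[:, 0]) * 2.0 / (w - 1) - 1.0   # (B, H, W)
    gy = (ys.float() + flow[:, 1]) * 2.0 / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                    # (B, H, W, 2)
    warped = F.grid_sample(seg_tp1, grid, align_corners=True)
    # Penalise prediction drift only where pixels are visible in both frames.
    return (F.mse_loss(warped, seg_t, reduction="none") * occ_mask).mean()
```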


2020 ◽ Vol 5 (3) ◽ pp. 159-166
Author(s): Sofiane KHOUDOUR, Zoubeida MESSALI, Rima BELKHITER

In this paper, we present an extensive quantitative comparative study of patch-based video denoising algorithms with optical flow estimation; namely, the SPTWO, VBM3D and VBM4D algorithms are considered. The aim of this study is to combine these video denoising algorithms in a proposed hybrid process that exploits their respective strengths; SPTWO, in particular, takes advantage of the self-similarity and redundancy of adjacent frames. The proposed hybrid algorithm and the three video denoising algorithms are implemented and tested on real sequences degraded by Additive White Gaussian Noise (AWGN) at various levels. The obtained results are compared in terms of the most commonly used performance criteria for various test cases: RMSE and SSIM, in addition to running time and the visual quality of the video sequence. Experimental results illustrate that the proposed algorithm and SPTWO provide the best video quality and are efficient at preserving fine texture and reconstructing detail.
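For readers reproducing the comparison, the two frame-level criteria can be computed as below; this is a minimal sketch that assumes 8-bit grayscale frames and a standard AWGN degradation, since the exact evaluation settings are not given in the abstract.

```python
import numpy as np
from skimage.metrics import structural_similarity

def frame_metrics(clean, denoised):
    """RMSE and SSIM for one clean/denoised frame pair (8-bit grayscale assumed)."""
    clean = clean.astype(np.float64)
    denoised = denoised.astype(np.float64)
    rmse = float(np.sqrt(np.mean((clean - denoised) ** 2)))
    ssim = structural_similarity(clean, denoised, data_range=255)
    return rmse, ssim

def add_awgn(frame, sigma=20):
    """Degrade a clean frame with AWGN of a given level (sigma) before denoising."""
    noisy = frame.astype(np.float64) + np.random.normal(0.0, sigma, frame.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```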

