Video Super Resolution Using Temporal Encoding ConvLSTM and Multi-Stage Fusion

Author(s):  
Yuhang Zhang ◽  
Zhenzhong Chen ◽  
Shan Liu
Author(s):  
Zhongguo Li ◽  
Magnus Oskarsson ◽  
Anders Heyden

Abstract
The task of reconstructing detailed 3D human body models from images is interesting but challenging in computer vision due to the high degree of freedom of human bodies. This work proposes a coarse-to-fine method that reconstructs detailed 3D human bodies from multi-view images by combining Voxel Super-Resolution (VSR) with a learned implicit representation. First, coarse 3D models are estimated by learning a Pixel-aligned Implicit Function based on Multi-scale Features (MF-PIFu), which are extracted from the multi-view images by multi-stage hourglass networks. Then, taking the low-resolution voxel grids generated from the coarse 3D models as input, VSR is implemented by learning an implicit function through a multi-stage 3D convolutional neural network. Finally, VSR produces the refined detailed 3D human body models, preserving detail while reducing the false reconstructions of the coarse models. Benefiting from the implicit representation, the training process in our method is memory efficient, and the detailed 3D human body produced from multi-view images is a continuous decision boundary with high-resolution geometry. In addition, the coarse-to-fine combination of MF-PIFu and VSR simultaneously removes false reconstructions and preserves appearance details in the final result. In experiments, our method achieves competitive 3D human body models, both quantitatively and qualitatively, from images with various poses and shapes on real and synthetic datasets.
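The key property described above is that the surface is represented implicitly: a network maps a 3D query point (plus image features) to an inside/outside value, so geometry can be sampled at any resolution after training. The minimal numpy sketch below illustrates that querying idea only; the `occupancy` function here is a toy analytic stand-in (a sphere), not the paper's learned MF-PIFu or VSR networks, and the grid sizes are arbitrary.

```python
import numpy as np

def occupancy(points, center=np.zeros(3), radius=0.5):
    """Toy implicit function: 1 inside a sphere, 0 outside.
    A learned implicit function would take image features as well,
    but exposes the same point -> occupancy interface."""
    return (np.linalg.norm(points - center, axis=-1) < radius).astype(np.float32)

def voxelize(fn, resolution):
    """Sample the implicit function on a dense grid over [-1, 1]^3.
    The surface is the 0.5 level set (extractable with Marching Cubes);
    note training cost is per sampled point, not per voxel grid."""
    lin = np.linspace(-1.0, 1.0, resolution)
    grid = np.stack(np.meshgrid(lin, lin, lin, indexing="ij"), axis=-1)
    return fn(grid.reshape(-1, 3)).reshape(resolution, resolution, resolution)

coarse = voxelize(occupancy, 32)   # low-resolution voxel grid (the VSR input)
fine = voxelize(occupancy, 128)    # same function queried at higher resolution
```

Because the representation is continuous, `fine` is obtained without retraining or storing a dense high-resolution volume during optimization, which is the memory advantage the abstract refers to.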


2019 ◽  
Vol 19 (04) ◽  
pp. 1950024
Author(s):  
Gunnam Suryanarayana ◽  
Ravindra Dhuli ◽  
Jie Yang

In real-time surveillance video applications, it is often necessary to identify a region of interest in a degraded low-resolution (LR) image. State-of-the-art super-resolution (SR) techniques produce images with poor illumination and degraded high-frequency details. In this paper, we present a different approach to single-image super-resolution (SISR) by correcting the dual-tree complex wavelet transform (DT-CWT) subbands using a multi-stage cascaded joint bilateral filter (MSCJBF) and singular value decomposition (SVD). The proposed method exploits geometric regularity to implement covariance-based interpolation in the spatial domain. We decompose the interpolated LR image into image coefficients and wavelet coefficients by employing the DT-CWT. To preserve edges, we alter the wavelet subbands using the high-frequency details obtained from the MSCJBF. Simultaneously, we retain uniform illumination by improving the image coefficients using SVD. In addition, the wavelet subbands undergo Lanczos interpolation prior to subband refinement. Experimental results demonstrate the effectiveness of our method.
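The SVD step mentioned for illumination can be sketched as the common heuristic of rescaling singular values: the leading singular value of an image matrix tracks its overall intensity, so scaling by the ratio of two leading singular values transfers an illumination level. This is a generic illustration of that heuristic, not the paper's exact subband pipeline; the function name and the choice of reference image are assumptions.

```python
import numpy as np

def svd_illumination_correction(degraded, reference):
    """Equalize global illumination by rescaling singular values.
    xi = sigma_max(reference) / sigma_max(degraded) is applied to all
    singular values of the degraded image, brightening (or darkening)
    it toward the reference illumination level."""
    u, s, vt = np.linalg.svd(degraded, full_matrices=False)
    ref_s = np.linalg.svd(reference, compute_uv=False)
    xi = ref_s[0] / s[0]              # illumination correction coefficient
    return u @ np.diag(s * xi) @ vt   # reconstruct with rescaled spectrum
```

In the paper's setting the correction would be applied to the DT-CWT image (low-pass) coefficients rather than to the raw pixels, leaving the edge-carrying wavelet subbands to the MSCJBF stage.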


Author(s):  
Liang Chen ◽  
Jinshan Pan ◽  
Junjun Jiang ◽  
Jiawei Zhang ◽  
Zhen Han ◽  
...  
Keyword(s):  

2021 ◽  
Vol 13 (16) ◽  
pp. 3167
Author(s):  
Lize Zhang ◽  
Wen Lu ◽  
Yuanfei Huang ◽  
Xiaopeng Sun ◽  
Hongyi Zhang

Mainstream image super-resolution (SR) methods are generally based on paired training samples. Because high-resolution (HR) remote sensing images are difficult to collect with limited imaging devices, most existing remote sensing SR methods down-sample the collected original images to generate auxiliary low-resolution (LR) images, forming a pseudo-paired HR-LR dataset for training. However, the distribution of the generated LR images is generally inconsistent with that of real images due to the limitations of remote sensing imaging devices. In this paper, we propose a perceptually unpaired super-resolution method built on a multi-stage aggregation network (MSAN), whose optimization depends on a set of consistency losses. The first stage preserves the content of the super-resolved results by constraining the content consistency between the down-scaled SR results and the low-quality LR inputs. The second stage minimizes a perceptual feature loss between the current result and the LR input to enforce perceptual-content consistency. The final stage employs a generative adversarial network (GAN) to add photo-realistic textures by constraining perceptual-distribution consistency. Extensive experiments on synthetic remote sensing datasets and real remote sensing images show that our method obtains more plausible results than other SR methods, both quantitatively and qualitatively. The PSNR of our network is 0.06 dB higher than that of the state-of-the-art HAN method on the UC Merced test set with complex degradation.
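The stage-1 content-consistency idea above can be sketched numerically: down-scale the SR output back to the LR grid and penalize its distance to the LR input, so supervision never requires a real HR target. This is a minimal illustration assuming a simple box-filter down-sampler and an L1 penalty; the actual MSAN down-scaling operator, loss weighting, and the perceptual/GAN stages are not reproduced here.

```python
import numpy as np

def downscale(img, factor):
    """Box-filter down-sampling: average each factor x factor block.
    Stands in for whatever down-scaling operator the pipeline uses."""
    h, w = img.shape[0] // factor, img.shape[1] // factor
    return img[:h * factor, :w * factor].reshape(h, factor, w, factor).mean(axis=(1, 3))

def content_consistency(sr, lr, factor=4):
    """Stage-1 loss: mean L1 distance between the down-scaled SR
    result and the LR input. Zero iff the SR image reproduces the
    LR observation exactly after down-scaling."""
    return np.abs(downscale(sr, factor) - lr).mean()
```

The perceptual-content and perceptual-distribution stages would replace the pixel-space L1 term with distances in a feature space and a discriminator's decision space, respectively, but keep the same unpaired structure: every loss is computed against the LR input, never against a ground-truth HR image.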


Author(s):  
Huan Wang ◽  
Hao Wu ◽  
Qian Hu ◽  
Jianning Chi ◽  
Xiaosheng Yu ◽  
...  

2021 ◽  
pp. 140-149
Author(s):  
Chun-Mei Feng ◽  
Huazhu Fu ◽  
Shuhao Yuan ◽  
Yong Xu
Keyword(s):  
