Still image/video frame lossy compression providing a desired visual quality

2015 ◽  
Vol 27 (3) ◽  
pp. 697-718 ◽  
Author(s):  
Alexander Zemliachenko ◽  
Vladimir Lukin ◽  
Nikolay Ponomarenko ◽  
Karen Egiazarian ◽  
Jaakko Astola
Author(s):  
Nikolay Ponomarenko ◽  
Sergey Krivenko ◽  
Vladimir Lukin ◽  
Karen Egiazarian ◽  
Jaakko T. Astola

Author(s):  
Fangfang Li ◽  
Sergey Krivenko ◽  
Vladimir Lukin

Image information technology has become an important perception technology, and one of its tasks is lossy image compression that provides a desired quality with a given encoder. Recent research has shown that a two-step method can perform such compression very simply and with reduced compression time while providing the desired visual quality accurately. However, different encoders use different compression algorithms, and this affects the accuracy with which the desired quality is provided. This paper considers the application of the two-step method to an encoder based on a discrete wavelet transform (DWT). In the experiment, bits per pixel (BPP) is used as the control parameter to vary and predict the compressed image quality, and three visual quality metrics (PSNR, PSNR-HVS, PSNR-HVS-M) are analyzed. In special cases, the two-step method may be modified. This modification concerns cases when the images subject to lossy compression are either too simple or too complex, so that a linear approximation of the dependences is no longer valid. Experimental data show that, compared with the single-step method, the two-step compression method reduces the mean square error of the differences between desired and provided values by an order of magnitude. For PSNR-HVS-M, the error of the two-step method does not exceed 3.6 dB. The experiment has been conducted for Set Partitioning in Hierarchical Trees (SPIHT), a typical image encoder based on DWT, but the proposed method can be expected to apply to other DWT-based image compression techniques. The results show that the application range of the two-step lossy compression method has been expanded: it is suitable not only for encoders based on the discrete cosine transform (DCT) but also for DWT-based encoders.
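The two-step idea can be illustrated with a minimal sketch. It assumes that step one reads a starting control parameter off an average rate-quality curve obtained in advance, and that step two corrects that parameter linearly using the slope of the same curve; the abstract mentions only a linear approximation, so the exact correction rule here is an assumption. All names (`two_step_parameter`, `avg_inverse`, `avg_slope`, `measure`) and the straight-line curves are hypothetical stand-ins, not the authors' code or data.

```python
from typing import Callable, Tuple


def two_step_parameter(
    desired: float,
    avg_inverse: Callable[[float], float],
    avg_slope: Callable[[float], float],
    measure: Callable[[float], float],
) -> Tuple[float, float, float, float]:
    """Return (param_1, quality_1, param_2, quality_2).

    avg_inverse: maps a desired quality to a control parameter (e.g. BPP)
                 using an average curve obtained in advance (assumed).
    avg_slope:   derivative of the average curve at a given parameter.
    measure:     compresses and decompresses THIS image with the given
                 parameter and returns the metric actually obtained.
    """
    # Step 1: starting parameter from the average curve, then one compression.
    p1 = avg_inverse(desired)
    q1 = measure(p1)
    # Step 2: linear correction along the average curve's local slope.
    p2 = p1 + (desired - q1) / avg_slope(p1)
    q2 = measure(p2)
    return p1, q1, p2, q2


# Synthetic stand-ins (not measured data): the average curve is
# quality = 48 - 2 * p, while this particular image follows 50 - 2 * p.
def avg_inverse(quality: float) -> float:
    return (48.0 - quality) / 2.0


def avg_slope(param: float) -> float:
    return -2.0


def this_image(param: float) -> float:
    return 50.0 - 2.0 * param


result = two_step_parameter(40.0, avg_inverse, avg_slope, this_image)
print(result)
```

In this toy case the image's curve is parallel to the average one, so the second step lands exactly on the target. For images that are too simple or too complex the slopes differ, which is the situation the modification described in the abstract addresses.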


Symmetry ◽  
2019 ◽  
Vol 11 (5) ◽  
pp. 619 ◽  
Author(s):  
Ha-Eun Ahn ◽  
Jinwoo Jeong ◽  
Je Woo Kim

Visual quality and algorithm efficiency are two main interests in video frame interpolation. We propose a hybrid task-based convolutional neural network for fast and accurate frame interpolation of 4K videos. The proposed method synthesizes low-resolution frames, then reconstructs high-resolution frames in a coarse-to-fine fashion. We also propose edge loss, to preserve high-frequency information and make the synthesized frames look sharper. Experimental results show that the proposed method achieves state-of-the-art performance and performs 2.69x faster than the existing methods that are operable for 4K videos, while maintaining comparable visual and quantitative quality.


2021 ◽  
Author(s):  
Ágnes Lipovits ◽  
László Czúni ◽  
Katalin Tömördi ◽  
Zsolt Vörösházi

Object tracking is a key task in many applications using video analytics. While there is a huge number of algorithms to track objects, there is still a need for new methods to solve the correspondence problem under certain circumstances. In our article, we assume a very typical but still open scenario: a still image object detector has already identified the objects to be tracked; thus, we have object labels, confidence values, and bounding boxes in each video frame captured at a low sampling rate. As a result, optical flow methods are difficult to apply (also due to bad lighting conditions, cluttered or homogeneous areas, and strong ego-motion), and moreover, many objects look similar (having the same category labels). Our proposed approach is based on the Hungarian method and incorporates the above information into the cost function evaluating the possible pairings of objects. To account for the uncertainty of the detector, the elements of the confusion matrix also contribute to the cost of pairs, as does the probability of spatial translations based on prior observations. As a use case, we apply the algorithm to a dataset in which images were captured from onboard cameras and traffic signs were detected by RetinaNet. We analyze the performance with different parameter settings.
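The pairing step can be sketched as an assignment problem. The cost below is a hypothetical one: the negative log of a confusion-matrix entry times a Gaussian translation likelihood; the article's actual cost terms and weights are not given in the abstract. To stay dependency-free, the sketch finds the minimum-cost pairing by exhaustive search rather than by the Hungarian method itself; both return an optimal assignment, and a production version would more likely call `scipy.optimize.linear_sum_assignment`. The detections, `sigma`, and confusion values are invented for illustration.

```python
import math
from itertools import permutations


def pair_cost(a, b, confusion, sigma=50.0):
    """Hypothetical cost: -log(P(label b | label a) * P(translation))."""
    p_label = confusion[a["label"]][b["label"]]
    dx, dy = a["cx"] - b["cx"], a["cy"] - b["cy"]
    # Gaussian likelihood of the box-centre shift (assumed prior).
    p_move = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
    return -math.log(max(p_label * p_move, 1e-12))


def best_assignment(prev, curr, confusion):
    """Minimum-cost pairing by exhaustive search (needs len(prev) <= len(curr)).

    A stand-in for the Hungarian method; only practical for a few objects.
    """
    best_total, best_perm = None, None
    for perm in permutations(range(len(curr)), len(prev)):
        total = sum(pair_cost(prev[i], curr[j], confusion)
                    for i, j in enumerate(perm))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return list(enumerate(best_perm))


# Invented detections: label plus bounding-box centre in two frames.
confusion = {
    "stop": {"stop": 0.9, "yield": 0.1},
    "yield": {"stop": 0.1, "yield": 0.9},
}
prev = [{"label": "stop", "cx": 100, "cy": 100},
        {"label": "yield", "cx": 300, "cy": 120}]
curr = [{"label": "yield", "cx": 310, "cy": 125},
        {"label": "stop", "cx": 108, "cy": 102}]

pairs = best_assignment(prev, curr, confusion)
print(pairs)  # (index in prev, index in curr) for each tracked object
```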


2013 ◽  
Vol 2013 ◽  
pp. 1-5
Author(s):  
Shaik. Mahaboob Basha ◽  
B. C. Jinaga

Current research in image compression is not adequate for some imaging applications, namely those that require good visual quality in processing. In general, the tradeoff between compression efficiency and picture quality is the most important parameter for validating the work. The existing algorithms for still image compression were developed with compression efficiency in mind, giving the least importance to the visual quality in processing. Hence, we proposed a novel lossless image compression algorithm based on Golomb-Rice coding that is well suited to various types of digital images. In this work, we specifically address the problem of maintaining the compression ratio while obtaining better visual quality in the reconstruction and a considerable gain in peak signal-to-noise ratio (PSNR). We considered medical images, satellite images, and natural images for the evaluation and proposed a novel technique to increase the visual quality of the reconstructed image.
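As background, a minimal Golomb-Rice coder for one image row might look like the sketch below. It is not the authors' algorithm: the previous-pixel predictor, the zigzag mapping of signed residuals, and the fixed parameter `k` are assumptions chosen to keep the example short. Each non-negative value is written as a unary quotient (`n >> k` ones and a terminating zero) followed by a `k`-bit remainder.

```python
def zigzag(e):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2 -> 0,1,2,3."""
    return 2 * e if e >= 0 else -2 * e - 1


def unzigzag(u):
    """Inverse of zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Encode non-negative integers as a bit string with Rice parameter k."""
    parts = []
    for n in values:
        q, r = n >> k, n & ((1 << k) - 1)
        parts.append("1" * q + "0")          # unary quotient
        if k:
            parts.append(format(r, f"0{k}b"))  # k-bit remainder
    return "".join(parts)


def rice_decode(bits, k, count):
    """Decode `count` integers from a bit string produced by rice_encode."""
    out, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                              # skip the terminating 0
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        out.append((q << k) | r)
    return out


def encode_row(row, k):
    """Predict each pixel from its left neighbour (assumed) and code residuals."""
    prev, residuals = 0, []
    for p in row:
        residuals.append(zigzag(p - prev))
        prev = p
    return rice_encode(residuals, k)


def decode_row(bits, k, count):
    prev, row = 0, []
    for u in rice_decode(bits, k, count):
        prev += unzigzag(u)
        row.append(prev)
    return row


row = [100, 102, 101, 101, 105]
bits = encode_row(row, k=2)
restored = decode_row(bits, k=2, count=len(row))
print(len(bits), restored == row)
```

Because the code is lossless, the decoded row equals the input exactly. A small `k` suits small residuals, which is why practical coders adapt `k` to the local statistics.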


Author(s):  
Kangle Deng ◽  
Tianyi Fei ◽  
Xin Huang ◽  
Yuxin Peng

Automatically generating videos according to the given text is a highly challenging task, where visual quality and semantic consistency with captions are two critical issues. In existing methods, when generating a specific frame, the information in the frames generated before it is not fully exploited. Moreover, an effective way to measure the semantic accordance between videos and captions remains to be established. To address these issues, we present a novel Introspective Recurrent Convolutional GAN (IRC-GAN) approach. First, we propose a recurrent transconvolutional generator, where LSTM cells are integrated with 2D transconvolutional layers. As 2D transconvolutional layers put more emphasis on the details of each frame than 3D ones, our generator takes both the definition of each video frame and temporal coherence across the whole video into consideration, and thus can generate videos with better visual quality. Second, we propose mutual information introspection to semantically align the generated videos to text. Unlike other methods that simply judge whether the video and the text match, we further use mutual information to concretely measure the semantic consistency. In this way, our model is able to introspect the semantic distance between the generated video and the corresponding text, and tries to minimize it to boost the semantic consistency. We conduct experiments on 3 datasets and compare with state-of-the-art methods. Experimental results demonstrate the effectiveness of our IRC-GAN in generating plausible videos from given text.


Author(s):  
Ragnar Langseth ◽  
Vamsidhar Reddy Gaddam ◽  
Håkon Kvale Stensland ◽  
Carsten Griwodz ◽  
Pål Halvorsen ◽  
...  

Modern video cameras often only capture a single color per pixel in a single pass operation. This process is called filtering, where pixels are filtered through a color filter array, and the Bayer filter is perhaps the most common filter used today. This means that the missing color channels must be restored in the image or the video frame in a post-processing step, i.e., a process referred to as debayering. In a live video scenario, this operation must be performed efficiently in order to output each video frame in real-time, while also yielding acceptable visual quality. Here, the authors evaluate debayering algorithms implemented on a GPU for real-time panoramic video recordings using multiple 2K-resolution cameras.
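To make the debayering step concrete, here is a minimal CPU sketch of bilinear interpolation, one of the simplest debayering algorithms. It is an illustration only, not one of the GPU implementations the authors evaluate; the RGGB layout and the function names are assumptions. Each missing channel at a pixel is the average of that channel's samples in the surrounding 3x3 window, while the measured channel is kept unchanged.

```python
def bayer_channel(y, x):
    """Channel index (0=R, 1=G, 2=B) sampled at (y, x) in an RGGB mosaic."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2


def debayer_bilinear(raw):
    """Bilinear debayering of a 2D mosaic (at least 2x2) into [R, G, B] pixels."""
    h, w = len(raw), len(raw[0])
    out = [[[0.0, 0.0, 0.0] for _ in range(w)] for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sums, counts = [0.0, 0.0, 0.0], [0, 0, 0]
            # Collect every sample in the 3x3 window, clipped at the border.
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    c = bayer_channel(ny, nx)
                    sums[c] += raw[ny][nx]
                    counts[c] += 1
            own = bayer_channel(y, x)
            for c in range(3):
                # Keep the measured sample; interpolate the two missing ones.
                out[y][x][c] = float(raw[y][x]) if c == own else sums[c] / counts[c]
    return out


# A flat-colour scene seen through the mosaic should be restored exactly.
color = (10, 20, 30)
raw = [[color[bayer_channel(y, x)] for x in range(4)] for y in range(4)]
rgb = debayer_bilinear(raw)
print(rgb[1][2])
```

The per-pixel work is independent of every other pixel, which is what makes this kind of algorithm a natural fit for a GPU in the real-time setting described above.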

