Practical Evaluation of VMAF Perceptual Video Quality for WebRTC Applications

Electronics ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 854 ◽  
Author(s):  
Boni García ◽  
Luis López-Fernández ◽  
Francisco Gortázar ◽  
Micael Gallego

WebRTC is the umbrella term for several emergent technologies aimed at exchanging real-time media on the Web. As with other media-related services, the perceived quality of WebRTC communication can be measured using Quality of Experience (QoE) indicators. QoE assessment methods can be classified as subjective (users' evaluation scores) or objective (models computed as a function of different parameters). In this paper, we focus on VMAF (Video Multi-method Assessment Fusion), an emergent full-reference objective video quality assessment model developed by Netflix. VMAF is typically used to assess video streaming services. This paper evaluates the use of VMAF in a different type of application: WebRTC. To that aim, we present a practical use case built on top of well-known open source technologies, such as JUnit, Selenium, Docker, and FFmpeg. In addition to VMAF, we also calculate other objective QoE video metrics, such as Visual Information Fidelity in the pixel domain (VIFp), Structural Similarity (SSIM), and Peak Signal-to-Noise Ratio (PSNR), applied to a WebRTC communication under different network conditions in terms of packet loss. Finally, we compare these objective results with a subjective evaluation of the same WebRTC streams using a Mean Opinion Score (MOS) scale. As a result, we found that the subjective video quality perceived in WebRTC video calls correlates more strongly with the objective results computed with VMAF and VIFp than with those computed with SSIM and PSNR and their variants.
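
As a minimal illustration of how such scores can be produced offline, the sketch below runs FFmpeg's full-reference filters on a recorded WebRTC stream and its source. It assumes an FFmpeg build with libvmaf enabled; the file names (degraded.y4m, reference.y4m) are placeholders, the two recordings must be spatially and temporally aligned, and the JSON log layout shown matches recent libvmaf releases.

```python
import json
import subprocess

# Placeholder file names: the recorded WebRTC stream (degraded) and the
# original source (reference); both must be decodable by FFmpeg and aligned.
DEGRADED = "degraded.y4m"
REFERENCE = "reference.y4m"

def run_filter(lavfi: str) -> None:
    """Run a full-reference FFmpeg quality filter and discard the video output."""
    subprocess.run(
        ["ffmpeg", "-hide_banner", "-i", DEGRADED, "-i", REFERENCE,
         "-lavfi", lavfi, "-f", "null", "-"],
        check=True,
    )

# VMAF (requires --enable-libvmaf); per-frame and pooled scores go to vmaf.json
run_filter("libvmaf=log_fmt=json:log_path=vmaf.json")

# PSNR and SSIM via FFmpeg's built-in full-reference filters
run_filter("psnr=stats_file=psnr.log")
run_filter("ssim=stats_file=ssim.log")

# Read the pooled VMAF score from the JSON log (layout of recent libvmaf builds)
with open("vmaf.json") as f:
    report = json.load(f)
print("Mean VMAF:", report["pooled_metrics"]["vmaf"]["mean"])
```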

Author(s):  
Jelena Vlaović ◽  
Drago Žagar ◽  
Snježana Rimac-Drlje ◽  
Mario Vranješ

With the development of Video on Demand applications enabled by the availability of high-speed internet access, adaptive streaming algorithms have been steadily developed and improved. The focus is on improving the user's Quality of Experience (QoE) and taking it into account as one of the parameters for the adaptation algorithm. Users often experience changing network conditions, so the goal is to ensure stable video playback with a satisfying QoE level. Although subjective Video Quality Assessment (VQA) methods provide more accurate results regarding the user's QoE, objective VQA methods cost less and are less time-consuming. In this article, nine different objective VQA methods are compared on a large set of video sequences with various spatial and temporal activities. The VQA methods used in this analysis are: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), MultiScale Structural Similarity Index (MS-SSIM), Video Quality Metric (VQM), Mean Sum of Differences (DELTA), Mean Sum of Absolute Differences (MSAD), Mean Squared Error (MSE), Netflix Video Multimethod Assessment Fusion (Netflix VMAF) and Visual Signal-to-Noise Ratio (VSNR). The video sequences used for testing purposes were encoded according to H.264/AVC with twelve different target coding bitrates, at three different spatial resolutions (resulting in a total of 190 sequences). In addition to objective quality assessment, subjective quality assessment was performed for these sequences. All results acquired by objective VQA methods have been compared with subjective Mean Opinion Score (MOS) results using the Pearson Linear Correlation Coefficient (PLCC). Measurement results obtained on this large set of video sequences with different spatial resolutions show that VQA methods such as MS-SSIM and VQM correlate better with MOS results than PSNR, SSIM, VSNR, DELTA, MSE, VMAF and MSAD. However, the PLCC results for MS-SSIM and VQM (0.7799 and 0.7734, respectively) are still too low for these methods to replace subjective testing in streaming services. These results suggest that more efficient VQA methods should be developed for use in streaming testing procedures as well as to support the video segmentation process. Furthermore, a comparison across spatial resolutions shows that, at lower target coding bitrates, video sequences encoded at lower spatial resolutions achieve higher quality than sequences encoded at higher spatial resolutions at the same target bitrate, particularly for video sequences with higher spatial and temporal information.
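
The PLCC figures quoted above can be reproduced for any metric with a few lines of SciPy. The sketch below is only illustrative: the score and MOS arrays are made-up placeholders, not data from the study, and in practice a nonlinear (e.g., logistic) mapping is often fitted before computing the PLCC.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative placeholder values only (not data from the study):
# one objective score and one MOS value per test sequence.
objective_scores = np.array([0.62, 0.71, 0.80, 0.85, 0.90, 0.93])  # e.g. MS-SSIM
mos = np.array([2.1, 2.8, 3.4, 3.9, 4.3, 4.6])                     # subjective ratings

plcc, p_value = pearsonr(objective_scores, mos)
print(f"PLCC = {plcc:.4f} (p = {p_value:.3g})")
```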


2018 ◽  
Vol 5 ◽  
pp. 58-67
Author(s):  
Milan Chikanbanjar

Digital images are a major medium for transmitting visual information, but the presence of noise corrupts the image. Thus, the received image needs to be processed before being used in an application. Image denoising manipulates the data to remove noise and produce a good-quality image that retains details. Quantitative measures, mainly MSE (Mean Square Error), PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity index), have been used to show the improvement in the quality of the image restored by various thresholding techniques. Here, non-linear wavelet transform denoising techniques for natural images are studied, analyzed and compared using thresholding techniques such as soft, hard, semi-soft, LevelShrink, SUREShrink, VisuShrink and BayesShrink. On most of the tests, the PSNR and SSIM values for the LevelShrink Hard thresholding method are higher than for the other thresholding methods. For instance, the PSNR and SSIM values for the Lena image at a noise variance of 10 for the VisuShrink Hard, VisuShrink Soft, VisuShrink Semi-Soft, LevelShrink Hard, LevelShrink Soft, LevelShrink Semi-Soft, SUREShrink and BayesShrink thresholding methods are 23.82, 16.51, 23.25, 24.48, 23.25, 20.67, 23.42, 23.14 and 0.28, 0.28, 0.28, 0.29, 0.22, 0.25, 0.16 respectively, which again shows that LevelShrink Hard yields the highest PSNR and SSIM values. Thus, it can be stated that the LevelShrink Hard thresholding method performs best on most of the tests.
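
A minimal sketch of the kind of experiment described above is given below: hard and soft wavelet thresholding with the VisuShrink universal threshold, scored with PSNR and SSIM. It assumes a known noise sigma and uses PyWavelets and scikit-image; the other variants named in the abstract (LevelShrink, SUREShrink, BayesShrink) differ mainly in how the threshold is chosen per level or sub-band.

```python
import numpy as np
import pywt
from skimage import data, img_as_float
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Clean test image plus additive Gaussian noise with known sigma
clean = img_as_float(data.camera())
sigma = 10 / 255.0
rng = np.random.default_rng(0)
noisy = np.clip(clean + rng.normal(0, sigma, clean.shape), 0, 1)

def wavelet_denoise(img, mode="hard", wavelet="db8", level=3):
    """Threshold the detail coefficients with the VisuShrink universal threshold."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    threshold = sigma * np.sqrt(2 * np.log(img.size))  # universal threshold
    denoised = [coeffs[0]]  # approximation band left untouched
    for detail in coeffs[1:]:
        denoised.append(tuple(pywt.threshold(d, threshold, mode=mode) for d in detail))
    return pywt.waverec2(denoised, wavelet)

for mode in ("hard", "soft"):
    restored = np.clip(wavelet_denoise(noisy, mode), 0, 1)
    psnr = peak_signal_noise_ratio(clean, restored, data_range=1.0)
    ssim = structural_similarity(clean, restored, data_range=1.0)
    print(f"VisuShrink-{mode}: PSNR = {psnr:.2f} dB, SSIM = {ssim:.3f}")
```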


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5540
Author(s):  
Nayeem Hasan ◽  
Md Saiful Islam ◽  
Wenyu Chen ◽  
Muhammad Ashad Kabir ◽  
Saad Al-Ahmadi

This paper proposes an encryption-based image watermarking scheme using a combination of second-level discrete wavelet transform (2DWT) and discrete cosine transform (DCT) with an auto extraction feature. The 2DWT has been selected based on an analysis of the trade-off between imperceptibility of the watermark and embedding capacity at various levels of decomposition. A DCT operation is applied to the selected area to gather the image coefficients into a single vector using a zig-zag operation. We utilized the same random bit sequence as both the watermark and the seed for the embedding zone coefficients. The quality of the reconstructed image was measured according to bit correction rate, peak signal-to-noise ratio (PSNR), and structural similarity index. Experimental results demonstrated that the proposed scheme is highly robust under different types of image-processing attacks. Several image attacks, e.g., JPEG compression, filtering, noise addition, cropping, sharpening, and bit-plane removal, were examined on watermarked images, and the results of our proposed method outstripped existing methods, especially in terms of the bit correction ratio (100%), which is a measure of bit restoration. The results were also highly satisfactory in terms of the quality of the reconstructed image, which demonstrated high imperceptibility in terms of peak signal-to-noise ratio (PSNR ≥ 40 dB) and structural similarity (SSIM ≥ 0.9) under different image attacks.
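
The sketch below is not the authors' scheme (which selects the sub-band from a decomposition-level analysis, orders coefficients with a zig-zag scan, and supports automatic extraction); it only illustrates the generic DWT+DCT embedding idea: a second-level DWT, a DCT of one detail sub-band, additive embedding of a seeded pseudo-random bit sequence, and an imperceptibility check with PSNR and SSIM. All parameter values are assumptions.

```python
import numpy as np
import pywt
from scipy.fft import dctn, idctn
from skimage import data, img_as_float
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

host = img_as_float(data.camera())
rng = np.random.default_rng(seed=42)          # the seed doubles as the watermark key
watermark_bits = rng.integers(0, 2, 256)      # pseudo-random bit sequence

# Second-level DWT: embed in the level-2 horizontal-detail sub-band
cA2, (cH2, cV2, cD2), details1 = pywt.wavedec2(host, "haar", level=2)

# DCT of the chosen sub-band; embed additively into mid-frequency coefficients
alpha = 0.02                                  # embedding strength (imperceptibility trade-off)
band_dct = dctn(cH2, norm="ortho")
flat = band_dct.flatten()
positions = np.arange(100, 100 + watermark_bits.size)   # skip the lowest frequencies
flat[positions] += alpha * (2 * watermark_bits - 1)
band_dct = flat.reshape(band_dct.shape)

# Rebuild the watermarked image (extraction, i.e. correlating the seeded
# sequence with the received coefficients, is omitted for brevity)
watermarked = pywt.waverec2(
    [cA2, (idctn(band_dct, norm="ortho"), cV2, cD2), details1], "haar")

print("PSNR:", peak_signal_noise_ratio(host, watermarked, data_range=1.0))
print("SSIM:", structural_similarity(host, watermarked, data_range=1.0))
```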


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Fayadh Alenezi ◽  
K. C. Santosh

One of the major shortcomings of the Hopfield neural network (HNN) is that the network may not always converge to a fixed point. The HNN is predominantly limited to local optimization during training to achieve network stability. In this paper, the convergence problem is addressed using two approaches: (a) by sequencing the activation of a continuous modified HNN (MHNN) based on the geometric correlation of features within various image hyperplanes via pixel gradient vectors, and (b) by regulating geometric pixel gradient vectors. These are achieved by regularizing the proposed MHNNs under cohomology, which enables them to act as an unconventional filter for pixel spectral sequences. It shifts the focus to both local and global optimizations in order to strengthen feature correlations within each image subspace. As a result, it enhances edges, information content, contrast, and resolution. The proposed algorithm was tested on fifteen different medical images, where evaluations were made based on entropy, visual information fidelity (VIF), weighted peak signal-to-noise ratio (WPSNR), contrast, and homogeneity. Our results confirmed its superiority compared to four existing benchmark enhancement methods.
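
Some of the evaluation measures named above can be computed directly with scikit-image, as in the sketch below (entropy plus GLCM contrast and homogeneity on a stand-in image); VIF and WPSNR need dedicated reference implementations and are not shown.

```python
from skimage import data, img_as_ubyte
from skimage.measure import shannon_entropy
from skimage.feature import graycomatrix, graycoprops  # 'greycomatrix' in older scikit-image

image = img_as_ubyte(data.camera())          # stand-in for an enhanced medical image

# Entropy of the grey-level distribution (information content)
entropy = shannon_entropy(image)

# Grey-level co-occurrence matrix at distance 1, horizontal direction
glcm = graycomatrix(image, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)
contrast = graycoprops(glcm, "contrast")[0, 0]
homogeneity = graycoprops(glcm, "homogeneity")[0, 0]

print(f"entropy = {entropy:.3f}, contrast = {contrast:.1f}, homogeneity = {homogeneity:.3f}")
```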


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Álvaro Garcia ◽  
Maria De Lourdes Melo Guedes Alcoforado ◽  
Francisco Madeiro ◽  
Valdemar Cardoso Da Rocha Jr.

This paper investigates the transmission of grey-scale images encoded with polar codes and decoded with successive cancellation list (SCL) decoders in the presence of additive white Gaussian noise. Polar codes seem a natural choice for this application because of their error-correction efficiency combined with fast decoding. Computer simulations are carried out to evaluate the influence of different code block lengths on the quality of the decoded images. At the encoder, a default polar code construction is used in combination with binary phase shift keying modulation. The results are compared with those obtained by using the classic successive cancellation (SC) decoding introduced by Arikan. The quality of the reconstructed images is assessed by using the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index. Curves of PSNR and SSIM versus code block length are presented, illustrating the improvement in performance of SCL in comparison with SC.
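
The sketch below does not implement polar encoding or SC/SCL decoding; it only shows the surrounding evaluation loop under similar assumptions (BPSK over AWGN, image mapped to a bit stream, reconstruction scored with PSNR and SSIM), here for the uncoded case. A coded system would polar-encode the bit stream and decode the noisy symbols before reassembling the image.

```python
import numpy as np
from skimage import data
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

image = data.camera()                              # 8-bit grey-scale test image
bits = np.unpackbits(image)                        # pixel values -> bit stream

# Uncoded BPSK over AWGN at a chosen Eb/N0
ebn0_db = 6.0
noise_std = np.sqrt(1.0 / (2 * 10 ** (ebn0_db / 10)))
rng = np.random.default_rng(1)
symbols = 1.0 - 2.0 * bits                         # bit 0 -> +1, bit 1 -> -1
received = symbols + rng.normal(0.0, noise_std, symbols.shape)
decided = (received < 0).astype(np.uint8)          # hard decision

reconstructed = np.packbits(decided).reshape(image.shape)
print("BER :", np.mean(decided != bits))
print("PSNR:", peak_signal_noise_ratio(image, reconstructed, data_range=255))
print("SSIM:", structural_similarity(image, reconstructed, data_range=255))
```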


Algorithms ◽  
2019 ◽  
Vol 12 (7) ◽  
pp. 130 ◽  
Author(s):  
Dinh Trieu Duong ◽  
Huy Phi Cong ◽  
Xiem Hoang Van

Distributed video coding (DVC) is an attractive and promising solution for low-complexity constrained video applications, such as wireless sensor networks or wireless surveillance systems. In DVC, visual quality consistency is one of the most important issues in evaluating the performance of a DVC codec. However, the quality of the decoded frames achieved by most recent DVC codecs is not consistent and varies with large fluctuations. In this paper, we propose a novel DVC solution, named Joint exploration model based DVC (JEM-DVC), to solve this problem; it provides not only higher performance compared to traditional DVC solutions, but also an effective scheme for quality consistency control. We first employ several advanced techniques provided in the Joint Exploration Model (JEM) of the Future Video Coding standard (FVC) in the proposed JEM-DVC solution to effectively improve the performance of the JEM-DVC codec. Subsequently, for consistent quality control, we propose two novel methods, named key frame quantization (KF-Q) and Wyner-Ziv frame quantization (WZF-Q), which determine the optimal values of the quantization parameter (QP) and quantization matrix (QM) applied to key frame and WZ frame coding, respectively. The optimal values of QP and QM are adaptively controlled and updated for every key and WZ frame to guarantee consistent video quality for the proposed codec, unlike conventional approaches. Our proposed JEM-DVC is the first DVC codec in the literature to employ the JEM coding techniques, and thus all of the results presented in this paper are new. The experimental results show that the proposed JEM-DVC significantly outperforms the relevant DVC benchmarks, notably the DISCOVER DVC and the recent H.265/HEVC-based DVC, in terms of both peak signal-to-noise ratio (PSNR) performance and consistent visual quality.
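
The KF-Q and WZF-Q methods themselves are not reproduced here; the toy sketch below only illustrates the underlying feedback idea of adapting the quantization parameter frame by frame so that the measured quality stays near a target. The encode_and_measure_psnr function is a hypothetical stand-in driven by a synthetic rate-distortion model; in a real codec it would be an actual encode/decode cycle, and the authors' scheme additionally adapts the quantization matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_and_measure_psnr(frame_index, qp):
    """Hypothetical stand-in for a real encoder call: a synthetic rate-distortion
    model in which PSNR falls roughly linearly with QP, plus per-frame content
    noise. Replace with an actual encode/decode cycle."""
    return 60.0 - 0.7 * qp + rng.normal(0.0, 0.8)

def control_qp(num_frames=50, target_psnr=38.0, qp=30, gain=0.8, qp_range=(10, 51)):
    """Toy proportional controller keeping per-frame PSNR near a target: quantize
    more coarsely after an overshoot, more finely after an undershoot."""
    history = []
    for i in range(num_frames):
        psnr = encode_and_measure_psnr(i, qp)
        history.append((qp, psnr))
        qp = int(np.clip(round(qp + gain * (psnr - target_psnr)), *qp_range))
    return history

for qp, psnr in control_qp()[:5]:
    print(f"QP = {qp:2d}, PSNR = {psnr:.2f} dB")
```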


Mathematics ◽  
2020 ◽  
Vol 8 (9) ◽  
pp. 1636
Author(s):  
Noé Ortega-Sánchez ◽  
Diego Oliva ◽  
Erik Cuevas ◽  
Marco Pérez-Cisneros ◽  
Angel A. Juan

Halftoning techniques are widely used in marketing because they reduce printing costs while maintaining graphic quality. Halftoning converts a digital image into a binary image composed of dots. The halftoned output contains less visual information; a possible benefit of this task is the reduction of ink when graphics are printed. The human eye is not able to detect the missing information, so the printed image still has good quality. The most widely used halftoning method is Floyd-Steinberg error diffusion, which defines a specific matrix for the halftoning conversion. However, most of the proposed halftoning techniques use predefined kernels that do not permit adaptation to different images. This article introduces the use of the harmony search algorithm (HSA) for halftoning. The HSA is a popular evolutionary algorithm inspired by musical improvisation. The different operators of the HSA permit an efficient exploration of the search space. The HSA is applied to find the best configuration of the halftoning kernel, with the structural similarity index (SSIM) proposed as the objective function. A set of rules is also introduced to reduce the regular patterns that could be created by inappropriate kernels. SSIM is used because it is a perception-based metric that compares images and quantifies the differences between them numerically. The aim of combining the HSA with the SSIM for halftoning is to generate an adaptive method that estimates the best kernel for each image based on its intrinsic attributes. The graphical quality of the proposed algorithm has been compared with classical halftoning methodologies. Experimental results and comparisons provide evidence regarding the quality of the images obtained by the proposed optimization-based approach. In this context, classical algorithms yield lower graphical quality than our proposal. The results have been validated by a statistical analysis based on independent experiments over the set of benchmark images, using the mean and standard deviation.
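
A minimal sketch of the building blocks is given below: error-diffusion halftoning with the classical Floyd-Steinberg weights and SSIM between the continuous-tone original and the halftone as the fitness value. The harmony search itself is not shown; it would perturb the four kernel weights and keep whichever candidate maximises this SSIM fitness.

```python
import numpy as np
from skimage import data, img_as_float
from skimage.metrics import structural_similarity

# Classical Floyd-Steinberg weights (right, down-left, down, down-right)
FS_KERNEL = (7 / 16, 3 / 16, 5 / 16, 1 / 16)

def halftone(image, kernel=FS_KERNEL):
    """Error-diffusion halftoning: quantize each pixel to 0/1 and diffuse the
    quantization error onto the not-yet-visited neighbours with `kernel` weights."""
    img = image.copy()
    out = np.zeros_like(img)
    h, w = img.shape
    w_r, w_dl, w_d, w_dr = kernel
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0
            out[y, x] = new
            err = old - new
            if x + 1 < w:
                img[y, x + 1] += err * w_r
            if y + 1 < h:
                if x - 1 >= 0:
                    img[y + 1, x - 1] += err * w_dl
                img[y + 1, x] += err * w_d
                if x + 1 < w:
                    img[y + 1, x + 1] += err * w_dr
    return out

original = img_as_float(data.camera())
binary = halftone(original)

# SSIM acts here as the fitness a harmony-search run would maximise
print("SSIM fitness:", structural_similarity(original, binary, data_range=1.0))
```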


2016 ◽  
Vol 2016 ◽  
pp. 1-17 ◽  
Author(s):  
Diego José Luis Botia Valderrama ◽  
Natalia Gaviria Gómez

The measurement and evaluation of QoE (Quality of Experience) have become one of the main focuses in telecommunications in order to provide services with the quality expected by users. However, factors such as network parameters and coding can affect video quality, limiting the correlation between objective and subjective metrics. This increases the complexity of evaluating the real video quality perceived by users. In this paper, a model based on artificial neural networks, namely BPNNs (Backpropagation Neural Networks) and RNNs (Random Neural Networks), is applied to estimate the subjective quality metric MOS (Mean Opinion Score) from the objective metrics PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity Index Metric), VQM (Video Quality Metric), and QIBF (Quality Index Based Frame). The proposed model allows establishing the QoS (Quality of Service) based on the DiffServ strategy. The metrics were analyzed through Pearson's and Spearman's correlation coefficients, RMSE (Root Mean Square Error), and outlier rate. Correlation values greater than 90% were obtained for all the evaluated metrics.
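
Scikit-learn does not provide the random neural network used in the paper, so the sketch below uses a standard backpropagation-trained MLP as a stand-in and evaluates it with PLCC, SROCC, and RMSE. The data are synthetic placeholders (objective metrics loosely correlated with MOS), not measurements from the study.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data: rows = video clips,
# columns = objective metrics [PSNR, SSIM, VQM], target = MOS.
rng = np.random.default_rng(0)
n = 200
psnr = rng.uniform(25, 45, n)
ssim = np.clip(0.4 + 0.013 * (psnr - 25) + rng.normal(0, 0.03, n), 0, 1)
vqm = np.clip(1.0 - 0.02 * (psnr - 25) + rng.normal(0, 0.05, n), 0, 1)
X = np.column_stack([psnr, ssim, vqm])
mos = np.clip(1 + 4 * (psnr - 25) / 20 + rng.normal(0, 0.25, n), 1, 5)

X_train, X_test, y_train, y_test = train_test_split(X, mos, random_state=0)
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
pred = model.predict(X_test)

plcc, _ = pearsonr(pred, y_test)
srocc, _ = spearmanr(pred, y_test)
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
print(f"PLCC = {plcc:.3f}, SROCC = {srocc:.3f}, RMSE = {rmse:.3f}")
```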


2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Lei He ◽  
Yan Xing ◽  
Kangxiong Xia ◽  
Jieqing Tan

In view of the drawback that texture is not prominent in the results of most image inpainting algorithms, an adaptive inpainting algorithm based on continued fractions was proposed in this paper. In order to restore each damaged point, the information from the known pixels around it was used to interpolate the intensity of the damaged point. The proposed method included two steps: first, Thiele's rational interpolation combined with the mask image was used to adaptively interpolate the intensities of damaged points to obtain an initial repaired image, and then Newton-Thiele's rational interpolation was used to refine the initial repaired image and obtain the final result. In order to show the superiority of the proposed algorithm, extensive experiments were conducted on damaged images. Subjective and objective evaluations were used to assess the quality of the repaired images, the objective evaluation being a comparison of Peak Signal-to-Noise Ratios (PSNRs). The experimental results showed that the proposed algorithm achieved a better visual effect and a higher Peak Signal-to-Noise Ratio compared with state-of-the-art methods.
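
The paper applies Thiele's and Newton-Thiele's rational interpolation in two dimensions, guided by the mask image; the sketch below is only a one-dimensional illustration of Thiele's continued-fraction interpolation via inverted differences, recovering a "damaged" sample from its known neighbours along one row. The sample values are made up for illustration.

```python
import numpy as np

def thiele_coefficients(x, y):
    """Inverted differences; the diagonal entries become the partial
    denominators of Thiele's continued fraction."""
    n = len(x)
    phi = np.zeros((n, n))
    phi[:, 0] = y
    for k in range(1, n):
        for i in range(k, n):
            phi[i, k] = (x[i] - x[k - 1]) / (phi[i, k - 1] - phi[k - 1, k - 1])
    return np.diag(phi)

def thiele_eval(x, a, t):
    """Evaluate a[0] + (t - x[0]) / (a[1] + (t - x[1]) / (...)) from the inside out."""
    val = a[-1]
    for k in range(len(a) - 2, -1, -1):
        val = a[k] + (t - x[k]) / val
    return val

# Known pixel positions and intensities around a damaged position (illustrative)
x_known = np.array([0.0, 1.0, 2.0, 4.0, 5.0])
y_known = np.array([120.0, 118.0, 115.0, 108.0, 104.0])

a = thiele_coefficients(x_known, y_known)
print("interpolated intensity at damaged position x = 3:",
      thiele_eval(x_known, a, 3.0))
```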


Author(s):  
Vinay D R, Dr. Anand Babu J

Data hiding in video streams has become more popular, given the high volume of data communication over the internet. Hiding data in video streams provides more security and a larger embedding capacity than hiding it inside images. However, as the quantity of information embedded into the video increases, it can severely degrade the video quality, making it inappropriate for certain applications. The main concerns in video data hiding are maintaining high visual quality, increasing the hiding capacity, and controlling the video stream size. In this paper, a new data hiding technique for compressed H.264 video streams is proposed. First, the information to be embedded is encrypted using a cryptographic approach, which encrypts the plain information based on elliptic-curve points produced by choosing a large prime number. The encrypted data is then embedded into the transformed DCT coefficients of I, B and P video frames. Experiments are conducted on different sets of video sequences. The results show that the proposed method yields better performance in terms of Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM) and Video Quality Measure (VQM) compared to existing methods.
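
The sketch below is not the authors' H.264-domain method: it operates on an uncompressed frame, uses a seeded keystream XOR as a stand-in for the elliptic-curve-based encryption, and hides each encrypted bit in one mid-frequency DCT coefficient per 8x8 block by quantization index modulation. It only illustrates the encrypt-then-embed idea and the PSNR/SSIM imperceptibility check; all parameters are assumptions.

```python
import numpy as np
from scipy.fft import dctn, idctn
from skimage import data
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

frame = data.camera().astype(float)            # stand-in for one decoded video frame
payload_bits = np.unpackbits(np.frombuffer(b"secret", dtype=np.uint8))

# Keystream XOR as a stand-in for the paper's elliptic-curve-based encryption
key_rng = np.random.default_rng(2024)          # the seed plays the role of the shared key
keystream = key_rng.integers(0, 2, payload_bits.size, dtype=np.uint8)
cipher_bits = payload_bits ^ keystream

Q = 12.0                                       # QIM step size (capacity/robustness trade-off)
POS = (4, 3)                                   # mid-frequency coefficient within each 8x8 block

stego = frame.copy()
h, w = frame.shape
blocks = [(y, x) for y in range(0, h, 8) for x in range(0, w, 8)]
for bit, (y, x) in zip(cipher_bits, blocks):
    block = dctn(stego[y:y + 8, x:x + 8], norm="ortho")
    c = block[POS]
    # even multiples of Q carry a 0, odd multiples carry a 1
    block[POS] = (2 * np.round((c - bit * Q) / (2 * Q)) + bit) * Q
    stego[y:y + 8, x:x + 8] = idctn(block, norm="ortho")

print("PSNR:", peak_signal_noise_ratio(frame, stego, data_range=255))
print("SSIM:", structural_similarity(frame, stego, data_range=255))

# Extraction: re-read the coefficient parity and undo the keystream
recovered = [int(np.round(dctn(stego[y:y + 8, x:x + 8], norm="ortho")[POS] / Q)) % 2
             for (y, x) in blocks[:cipher_bits.size]]
decrypted = np.array(recovered, dtype=np.uint8) ^ keystream
print("payload recovered:", np.packbits(decrypted).tobytes())
```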

