Subjective and Objective Quality Evaluation of Watermarked Audio

Author(s):  
Michael Arnold

This chapter details methods for evaluating the quality of watermarked objects. It gives an overview of the subjective and objective methods that can be used to judge the influence of watermark embedding on the quality of audio tracks. The problem associated with the quality evaluation of watermarked audio data is presented first, followed by the subjective evaluation standards used to test the transparency of marked audio tracks and to evaluate marked items of intermediate quality. Since subjective listening tests are expensive and depend on many parameters that are not easily controlled, objective quality measurement methods are discussed in the section Objective Evaluation Standards. The section Implementation of a Quality Evaluation presents the whole testing process, taking into account the methods discussed in this chapter, with special emphasis on a detailed description of the test setup, item selection and the practical limitations. The last section summarizes the chapter.

Author(s):  
Abdulhussain E. Mahdi

Speech quality is the most visible and important aspect of quality of service (QoS) for telecommunication networks. Hence, the ability to monitor and design for this quality has become a top priority. Speech quality refers to the clearness of a speaker’s voice as perceived by a listener. Speech quality measurement offers a means of adding the human end user’s perspective to traditional ways of performing network management evaluation of voice telephony services. Traditionally, measurement of users’ perception of speech quality has been performed by expensive and time-consuming subjective listening tests. Over the last three decades, numerous attempts have been made to supplement subjective tests with objective measurements based on algorithms that can be computerised and automated. This chapter describes the technicalities associated with speech quality measurement, and presents a review of current subjective and objective speech quality evaluation methods and standards in telecommunications.


Akustika ◽  
2020 ◽  
pp. 58-66
Author(s):  
Stanislav Žiaran ◽  
Ondrej Chlebo ◽  
Ĺubomír Šooš

The quality of bearing production affects not only the reliability and lifetime of the bearings, but also the dynamic load on the working and living environment through excessive vibration and, consequently, noise. The intensity of the noise emitted by a bearing, as perceived by a person, characterizes the quality of its production. Reducing the dynamic load of mechanical systems and their components is reflected in the working environment by reduced noise emissions and immissions. The article proposes an objective method of bearing quality assessment based on measuring the vibro-acoustic parameters of the dynamic load of a new bearing, using FFT analysis and the amplitude of the bearing vibration acceleration, and compares it with a subjective method that uses the human auditory organ to assess bearing quality. The results of the vibro-acoustic measurements were analysed in terms of vibration intensity and the noise of the produced bearings. The proposed objective methodology was compared with the subjective evaluation of bearing quality, and the results of the two matched. The proposed methodology is applicable to all types of bearings and can be automated in the production process.
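To illustrate the kind of spectral analysis the abstract refers to, the sketch below computes a single-sided amplitude spectrum of an acceleration signal and picks its dominant component. It is only a sketch under stated assumptions: the function names, the synthetic test signal and the sample rate are hypothetical, and a direct DFT is used in place of an FFT (it yields the same spectrum, only more slowly). The article's actual measurement chain and quality thresholds are not given in the abstract.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate):
    """Single-sided amplitude spectrum via a direct DFT (same result as an FFT).

    Returns a list of (frequency_hz, amplitude) pairs for bins 0..n//2.
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # DC and (for even n) the Nyquist bin are not doubled.
        is_edge = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 if is_edge else 2.0
        spectrum.append((k * sample_rate / n, scale * abs(coeff) / n))
    return spectrum


def dominant_component(spectrum):
    """Return the (frequency_hz, amplitude) pair with the largest amplitude, ignoring DC."""
    return max(spectrum[1:], key=lambda pair: pair[1])


# Hypothetical example: a 50 Hz acceleration component of amplitude 3 (arbitrary units),
# sampled at 400 Hz for 80 samples so that 50 Hz falls exactly on a DFT bin.
signal = [3.0 * math.sin(2 * math.pi * 50 * i / 400) for i in range(80)]
peak_freq, peak_amp = dominant_component(amplitude_spectrum(signal, 400))
```

In practice a library FFT would replace the direct DFT, and the peak amplitudes would be compared against limits chosen for the bearing type; those limits are outside what the abstract states.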


2013 ◽  
Vol 318 ◽  
pp. 572-575
Author(s):  
Li Li Yu ◽  
Yu Hong Li ◽  
Ai Feng Wang

In this paper, a quality monitoring system for seismic while drilling (SWD) that integrates the whole process of data acquisition was developed. The acquisition equipment, network status, and accelerometer and geophone signals were monitored in real time. With fast signal analysis and quality evaluation, the acquisition parameters and drilling engineering parameters can be adjusted in a timely manner. Applying the system can improve the quality of data acquisition and provide subsequent processing and interpretation with high-quality, reliable data.


2013 ◽  
Vol 411-414 ◽  
pp. 1362-1367 ◽  
Author(s):  
Qing Lan Wei ◽  
Yuan Zhang

This paper presents ideas on applying a saliency map to an objective video quality evaluation system. It computes SMSE and SPSNR values as objective assessment scores according to the saliency map, and compares them with conventional objective evaluation methods such as PSNR and MSE. Experimental results demonstrate that this method fits the subjective assessment results well.
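The abstract does not define SMSE and SPSNR, so the following is a minimal sketch under an assumed definition: each pixel's squared error is weighted by its saliency value, and the PSNR is then derived from that weighted error. The function names and the 8-bit peak value of 255 are assumptions made for illustration, not the paper's formulation.

```python
import math


def mse(ref, test):
    """Plain mean squared error over two equally sized 2-D pixel grids."""
    count = sum(len(row) for row in ref)
    total = sum((r - t) ** 2
                for ref_row, test_row in zip(ref, test)
                for r, t in zip(ref_row, test_row))
    return total / count


def saliency_mse(ref, test, saliency):
    """Saliency-weighted MSE (assumed definition): errors weighted by the saliency map."""
    total_weight = sum(w for row in saliency for w in row)
    if total_weight == 0:
        raise ValueError("saliency map must contain at least one positive weight")
    weighted = sum(w * (r - t) ** 2
                   for ref_row, test_row, sal_row in zip(ref, test, saliency)
                   for r, t, w in zip(ref_row, test_row, sal_row))
    return weighted / total_weight


def psnr_from_mse(err, peak=255.0):
    """Convert an (optionally weighted) MSE into PSNR in dB."""
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


# Hypothetical 2x2 frame with one distorted pixel.
reference = [[10, 10], [10, 10]]
degraded = [[10, 20], [10, 10]]
salient_on_error = [[0, 1], [0, 0]]
spsnr = psnr_from_mse(saliency_mse(reference, degraded, salient_on_error))
psnr = psnr_from_mse(mse(reference, degraded))
```

With a uniform saliency map the weighted score reduces to the ordinary MSE, which makes the effect of the weighting easy to check: an error located in a salient region lowers the weighted PSNR more than the plain PSNR.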


2019 ◽  
Vol 11 (10) ◽  
pp. 204 ◽  
Author(s):  
Dogan ◽  
Haddad ◽  
Ekmekcioglu ◽  
Kondoz

When it comes to evaluating perceptual quality of digital media for overall quality of experience assessment in immersive video applications, typically two main approaches stand out: Subjective and objective quality evaluation. On one hand, subjective quality evaluation offers the best representation of perceived video quality assessed by the real viewers. On the other hand, it consumes a significant amount of time and effort, due to the involvement of real users with lengthy and laborious assessment procedures. Thus, it is essential that an objective quality evaluation model is developed. The speed-up advantage offered by an objective quality evaluation model, which can predict the quality of rendered virtual views based on the depth maps used in the rendering process, allows for faster quality assessments for immersive video applications. This is particularly important given the lack of a suitable reference or ground truth for comparing the available depth maps, especially when live content services are offered in those applications. This paper presents a no-reference depth map quality evaluation model based on a proposed depth map edge confidence measurement technique to assist with accurately estimating the quality of rendered (virtual) views in immersive multi-view video content. The model is applied for depth image-based rendering in multi-view video format, providing comparable evaluation results to those existing in the literature, and often exceeding their performance.


2019 ◽  
Vol 78 (23) ◽  
pp. 33549-33572
Author(s):  
Mohammed Salah Al-Radhi ◽  
Tamás Gábor Csapó ◽  
Géza Németh

In this paper, a novel vocoder is proposed for a Statistical Voice Conversion (SVC) framework using a deep neural network, where multiple features from the speech of two speakers (source and target) are converted acoustically. Traditional conversion methods focus on the prosodic feature represented by the discontinuous fundamental frequency (F0) and the spectral envelope. Studies have shown that speech analysis/synthesis solutions play an important role in the overall quality of the converted voice. Recently, we have proposed a new continuous vocoder, originally for statistical parametric speech synthesis, in which all parameters are continuous. Therefore, this work introduces a new method that uses a continuous F0 (contF0) in SVC to avoid alignment errors that may happen in voiced and unvoiced segments and can degrade the converted speech. Our contribution includes the following. (1) We integrate into the SVC framework the continuous vocoder, which provides an advanced model of the excitation signal, by converting its contF0, maximum voiced frequency, and spectral features. (2) We show that the feed-forward deep neural network (FF-DNN) using our vocoder yields high quality conversion. (3) We apply a geometric approach to spectral subtraction (GA-SS) in the final stage of the proposed framework, to improve the signal-to-noise ratio of the converted speech. Our experimental results, using two male speakers and one female speaker, have shown that the resulting converted speech with the proposed SVC technique is similar to the target speaker and gives state-of-the-art performance as measured by objective evaluation and subjective listening tests.


2018 ◽  
Vol 2018 ◽  
pp. 1-16 ◽  
Author(s):  
Lei He ◽  
Yan Xing ◽  
Kangxiong Xia ◽  
Jieqing Tan

To address the drawback of most image inpainting algorithms, in which texture is not prominent, this paper proposes an adaptive inpainting algorithm based on continued fractions. To restore each damaged point, the information of the known pixels around it is used to interpolate its intensity. The proposed method has two steps: first, Thiele’s rational interpolation combined with the mask image is used to adaptively interpolate the intensities of damaged points to get an initial repaired image; then Newton-Thiele’s rational interpolation is used to refine the initial repaired image to get the final result. To show the superiority of the proposed algorithm, extensive experiments were conducted on damaged images. Subjective and objective evaluation were used to assess the quality of the repaired images, with the objective evaluation being a comparison of Peak Signal to Noise Ratios (PSNRs). The experimental results showed that the proposed algorithm had a better visual effect and a higher PSNR than the state-of-the-art methods.
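As a rough illustration of the first step, the sketch below implements univariate Thiele continued-fraction interpolation from inverse differences and uses it to estimate one missing value from its known neighbours. It is a simplified, one-dimensional sketch only: the paper's adaptive, mask-driven scheme and the bivariate Newton-Thiele refinement are not reproduced, and the function names and sample data are hypothetical.

```python
def thiele_coefficients(xs, ys):
    """Inverse-difference coefficients a_0..a_n of Thiele's continued fraction.

    The interpolant is a_0 + (x - x_0) / (a_1 + (x - x_1) / (a_2 + ...)).
    A ZeroDivisionError is raised if an inverse difference is undefined.
    """
    phi = [float(y) for y in ys]
    coeffs = [phi[0]]
    for k in range(1, len(xs)):
        # phi[k - 1] already holds a_{k-1}; only entries from index k on are updated.
        for i in range(k, len(xs)):
            phi[i] = (xs[i] - xs[k - 1]) / (phi[i] - phi[k - 1])
        coeffs.append(phi[k])
    return coeffs


def thiele_eval(xs, coeffs, x):
    """Evaluate the continued fraction at x, working from the innermost term outwards."""
    value = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        value = coeffs[k] + (x - xs[k]) / value
    return value


# Hypothetical example: known samples of 1 / (1 + x) at x = 0, 1, 2;
# the value at the "damaged" position x = 3 is estimated from them.
known_x = [0.0, 1.0, 2.0]
known_y = [1.0, 0.5, 1.0 / 3.0]
a = thiele_coefficients(known_x, known_y)
estimate = thiele_eval(known_x, a, 3.0)
```

Because the interpolant is a rational function, it can reproduce sharp intensity changes that a polynomial of the same order would smooth out, which is consistent with the abstract's emphasis on texture. A practical implementation would also need to handle the cases where an inverse difference or a partial denominator is zero.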


2020 ◽  
Vol 10 (9) ◽  
pp. 3188
Author(s):  
Miroslaw Narbutt ◽  
Jan Skoglund ◽  
Andrew Allen ◽  
Michael Chinen ◽  
Dan Barry ◽  
...  

Spatial audio is essential for creating a sense of immersion in virtual environments. Efficient encoding methods are required to deliver spatial audio over networks without compromising Quality of Service (QoS). Streaming service providers such as YouTube typically transcode content into various bit rates and need a perceptually relevant audio quality metric to monitor users’ perceived quality and spatial localization accuracy. The aim of the paper is two-fold. First, it is to investigate the effect of Opus codec compression on the quality of spatial audio as perceived by listeners using subjective listening tests. Secondly, it is to introduce AMBIQUAL, a full reference objective metric for spatial audio quality, which derives both listening quality and localization accuracy metrics directly from the B-format Ambisonic audio. We compare AMBIQUAL quality predictions with subjective quality assessments across a variety of audio samples which have been compressed using the Opus 1.2 codec at various bit rates. Listening quality and localization accuracy of first and third-order Ambisonics were evaluated. Several fixed and dynamic audio sources (single and multiple) were used to evaluate localization accuracy. Results show good correlation regarding listening quality and localization accuracy between objective quality scores using AMBIQUAL and subjective scores obtained during listening tests.
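The agreement between objective scores and listening-test scores is commonly summarized with a correlation coefficient. The sketch below computes a Pearson correlation; whether the paper uses Pearson, Spearman or another statistic is not stated in the abstract, and the score lists here are made-up placeholders rather than results from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Placeholder data only: one objective score and one subjective score per test condition.
objective_scores = [0.35, 0.52, 0.71, 0.88]
subjective_scores = [22.0, 41.0, 63.0, 85.0]
r = pearson(objective_scores, subjective_scores)
```

A value of r close to 1 would indicate that the objective metric ranks and spaces the conditions much as listeners do; a rank-based statistic is an alternative when only the ordering matters.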


Symmetry ◽  
2020 ◽  
Vol 12 (9) ◽  
pp. 1535
Author(s):  
Jaroslav Frnda ◽  
Jan Nedoma ◽  
Radek Martinek ◽  
Michael Fridrich

Quality of service (QoS) and quality of experience (QoE) are two major concepts for the quality evaluation of video services. QoS analyzes the technical performance of a network transmission chain (e.g., utilization or packet loss rate). Subjective evaluation (QoE), on the other hand, relies on the observer’s opinion, so it cannot immediately provide output in the form of a score (extensive time requirements). Although several well-known methods for objective evaluation exist (trying to adopt psychological principles of the human visual system via mathematical models), each of them has its own rating scale without an existing symmetric conversion to a standardized subjective output like MOS (mean opinion score), typically represented by a five-point rating scale. This makes it difficult for network operators to recognize when they have to apply resource reservation control mechanisms. For this reason, we propose an application (classifier) that derives the subjective end-user quality perception from an objective assessment score and selected parameters of each video sequence. Our model integrates the unique benefits of unsupervised learning and clustering techniques, such as overfitting avoidance and small dataset requirements. In fact, most of the published papers are based on regression models or supervised clustering. In this article, we also investigate the possibility of using a graphical SOM (self-organizing map) representation called a U-matrix as a feature selection method.


2013 ◽  
Vol 756-759 ◽  
pp. 1259-1263
Author(s):  
Chun Ling Zhang ◽  
Sheng Hui Zhao ◽  
Hong Yuan Xiao ◽  
Jing Wang ◽  
Jing Ming Kuang

In this paper, an improved method is proposed to skip the look-ahead period. The improved method uses the autocorrelation algorithm to calculate the Linear Prediction (LP) coefficients, and the LP coefficients are then used to extrapolate new samples that replace the look-ahead samples. To evaluate the quality of this method, perceptual evaluation of speech quality (PESQ) and the A/B listening test method are used for the objective and subjective evaluation, respectively. The reconstructed quality of the modified method is close to that of the original AMR codec, while the delay of the improved method is 5 ms lower than that of the original method.
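The two steps named in the abstract (LP coefficients from the autocorrelation method, then extrapolation of new samples) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the Levinson-Durbin solver, the prediction order and the toy signal are assumptions, and the windowing and codec integration used with AMR are omitted.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of a finite frame."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    The predictor is s[n] ~ sum_k a[k] * s[n - k].
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            raise ValueError("prediction error vanished; lower the order")
        reflection = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = reflection
        for j in range(1, i):
            updated[j] = a[j] - reflection * a[i - j]
        a = updated
        err *= 1.0 - reflection * reflection
    return a[1:]


def extrapolate(signal, coeffs, count):
    """Predict `count` future samples by running the LP predictor on its own output."""
    history = list(signal)
    for _ in range(count):
        history.append(sum(c * history[-1 - k] for k, c in enumerate(coeffs)))
    return history[len(signal):]


# Toy frame: a decaying exponential, which a first-order predictor models well.
frame = [0.5 ** i for i in range(30)]
lp = levinson_durbin(autocorrelation(frame, 1), 1)
lookahead_substitute = extrapolate(frame, lp, 5)
```

The extrapolated samples stand in for the look-ahead samples the encoder would otherwise wait for, which is where the delay saving reported in the abstract comes from. A real codec would use a higher prediction order and a windowed frame.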

