Evaluation of perceived speech quality for VoIP codecs under different loudness and background noise condition

Author(s):  
Suparnakanti Das ◽  
Paromita Choudhury


1997 ◽  
Vol 84 (2) ◽  
pp. 695-698 ◽  
Author(s):  
Mary E. Reynolds ◽  
Donald Fucci ◽  
Z. S. Bond

This study compared the effect of visual cuing on the intelligibility of DECtalk for native and nonnative speakers of English, both in ideal listening conditions and in the presence of background noise at a signal-to-noise (S/N) ratio of +10 dB. Visual cuing improved DECtalk's intelligibility for nonnative speakers more than for native speakers, especially in the background noise condition. Implications of these findings and the need for further research are discussed.
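For readers unfamiliar with how a fixed S/N ratio such as +10 dB is set up, the following is a minimal sketch of scaling a noise signal so that the speech-to-noise power ratio hits a target value. It is not the procedure used in the study; the function names (`mean_power`, `noise_gain_for_snr`, `mix_at_snr`) and the use of plain lists of samples are assumptions made for illustration.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a signal given as a list of floats."""
    if not samples:
        raise ValueError("empty signal")
    return sum(s * s for s in samples) / len(samples)


def noise_gain_for_snr(speech, noise, snr_db=10.0):
    """Gain to apply to `noise` so that speech power / noise power = snr_db (in dB)."""
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise signal is silent")
    # SNR_dB = 10 * log10(P_speech / (g^2 * P_noise)), solved for g.
    return math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))


def mix_at_snr(speech, noise, snr_db=10.0):
    """Return speech plus scaled noise (hypothetical helper, equal-length inputs assumed)."""
    gain = noise_gain_for_snr(speech, noise, snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]
```

A positive S/N ratio means the speech is more powerful than the noise; at +10 dB the speech power is ten times the noise power.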


1974 ◽  
Vol 39 (3) ◽  
pp. 1255-1262 ◽  
Author(s):  
Philip Tolin ◽  
Paul G. Fisher

80 Ss participated in a visual vigilance task under one of 4 background-noise conditions. Results indicated that (a) female Ss showed a greater time-related performance decrement in correct detections than males in the regular-intermittent background condition, (b) RTs increased with time, (c) males responded more rapidly than females, (d) intermittent noise attenuated time-related changes in incorrect detections, (e) males made more incorrect detections than females in the intermittent background conditions but not in the constant-background conditions, and (f) there was a sex × trial block interaction in the constant noise condition. Several correlations between and within response measures were reported.


2020 ◽  
Vol 19 (04) ◽  
pp. 2050035
Author(s):  
Sandeep Kumar

In general, background noise degrades speech quality. The intelligibility of speech can therefore be enhanced by mitigating the effects of background noise and by suppressing echo, so speech enhancement can be viewed as an optimization problem. In this work, the directed search optimization (DSO) method is used to enhance degraded speech. The performance of the DSO-based speech enhancement method is compared with particle swarm optimization (PSO) and least mean square (LMS)-based methods in terms of output average segmental SNR (ASSNR) and speech quality. The experimental results show that the output spectrogram, output ASSNR and speech quality obtained with the DSO algorithm are far better than those of the PSO and LMS-based methods. Moreover, the DSO-based method is computationally less complex than the PSO-based method.
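The average segmental SNR named as a comparison metric can be illustrated with a short sketch. This is one common formulation, not necessarily the one used in the paper: the frame length, the clamping range of -10 dB to 35 dB, and the function name `segmental_snr_db` are all assumptions made for illustration.

```python
import math


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Average per-frame SNR in dB between a clean reference and an enhanced signal.

    Frames are non-overlapping; each frame's SNR is clamped to [lo, hi]
    (an assumed convention) and frames with no reference energy are skipped.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        sig_energy = 0.0
        err_energy = 0.0
        for i in range(start, start + frame_len):
            sig_energy += clean[i] ** 2
            err_energy += (clean[i] - enhanced[i]) ** 2
        if sig_energy == 0.0:
            continue  # silent reference frame: SNR undefined, skip it
        if err_energy == 0.0:
            snr = hi  # perfect reconstruction: use the upper clamp
        else:
            snr = 10.0 * math.log10(sig_energy / err_energy)
        frame_snrs.append(min(max(snr, lo), hi))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Averaging per-frame values in dB, rather than computing one SNR over the whole signal, keeps loud passages from dominating the score, which is why segmental measures are often preferred for speech.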


2019 ◽  
Vol 19 (1) ◽  
pp. 207-217 ◽  
Author(s):  
Julia A. Zeroth ◽  
Lynnda M. Dahlquist ◽  
Emily C. Foxen-Craft

Background and aims: The present study was designed to evaluate the relative efficacy of two video game display modalities – virtual reality (VR) assisted video game distraction, in which the game is presented via a VR head-mounted display (HMD) helmet, versus standard video game distraction, in which the game is projected on a television – and to determine whether environmental context (quiet versus noisy) moderates the relative efficacy of the two display modalities in reducing cold pressor pain in healthy college students.
Methods: Undergraduate students (n=164) were stratified by sex and self-reported video game skill and were randomly assigned to a quiet or a noisy environment. Participants then underwent three cold pressor trials consisting of one baseline followed by two distraction trials differing in display modality (i.e. VR-assisted or standard distraction) in counter-balanced order.
Results: Participants experienced improvement in pain tolerance from baseline to distraction in both display modality conditions (p<0.001, partial η²=0.41), and there was a trend toward greater improvement in pain tolerance from baseline to distraction when using the VR HMD helmet than during standard video game distraction (p=0.057, partial η²=0.02). Participants rated pain as more intense when experienced with concurrent experimental background noise (p=0.047, partial η²=0.02). Pain tolerance was not influenced by the presence or absence of background noise, and there was not a significant interaction between display modality and noise condition. Though exploratory sex analyses demonstrated a significant three-way interaction between noise condition, sex, and display modality on pain intensity (p=0.040, partial η²=0.040), follow-up post-hoc analyses conducted for males and females separately did not reveal significant differences in pain intensity based on the interaction between noise condition and display modality.
Conclusions: As expected, video game distraction both with and without an HMD helmet increased pain tolerance; however, the two display modalities only marginally differed in efficacy within the population under study. The effect of auditory background noise on pain was mixed; while pain tolerance did not vary as a function of the presence or absence of background noise, the addition of noise increased pain intensity ratings. The interaction between participant sex, noise condition, and distraction modality on pain intensity trended toward significance but would require replication in future research.
Implications: Results suggest that video game distraction via HMD helmet may be superior to standard video game distraction for increasing pain tolerance, though further research is required to replicate the trending findings observed in this study. Though it does not appear that background noise significantly impacted the relative efficacy of the two different video game display modalities, the presence of noise does appear to alter the pain response through amplified pain intensity ratings. Further research utilizing more sophisticated VR technology and clinically relevant background auditory stimuli is necessary in order to better understand the impact of these findings in real-world settings and to test the clinical utility of VR technology for pain management relative to standard video game distraction.


Author(s):  
Lauren V. Hadley ◽  
William M. Whitmer ◽  
W. Owen Brimijoin ◽  
Graham Naylor

Many conversations in our day-to-day lives are held in noisy environments – impeding comprehension, and in groups – taxing auditory attention-switching processes. These situations are particularly challenging for older adults in cognitive and sensory decline. In noisy environments, a variety of extra-linguistic strategies are available to speakers and listeners to facilitate communication, but while models of language account for the impact of context on word choice, there has been little consideration of the impact of context on extra-linguistic behaviour. To address this issue, we investigate how the complexity of the acoustic environment and interaction situation impacts extra-linguistic conversation behaviour of older adults during face-to-face conversations. Specifically, we test whether the use of intelligibility-optimising strategies increases with complexity of the background noise (from quiet to loud, and in speech-shaped vs. babble noise), and with complexity of the conversing group (dyad vs. triad). While some communication strategies are enhanced in more complex background noise, with listeners orienting to talkers more optimally and moving closer to their partner in babble than speech-shaped noise, this is not the case with all strategies, as we find greater vocal level increases in the less complex speech-shaped noise condition. Other behaviours are enhanced in the more complex interaction situation, with listeners using more optimal head orientations, and taking longer turns when gaining the floor in triads compared to dyads. This study elucidates how different features of the conversation context impact individuals’ communication strategies, which is necessary both to develop a comprehensive cognitive model of multimodal conversation behaviour and to effectively support individuals who struggle to converse.


2014 ◽  
Vol 564 ◽  
pp. 129-134
Author(s):  
Abdul Hakim Abdullah ◽  
Zamir A. Zulkefli

This study assesses the speech intelligibility of two Malaysian mosques, and the results are used to develop a set of general acoustical guidelines for mosque design. Two mosques were selected for the research: Masjid UPM and Masjid Jamek. The objective of the research is to compare the acoustics and speech intelligibility of the two mosques as a function of the size, volume, occupancy and other parameters of the main prayer hall. The reverberation time (RT60), speech level (SL), background noise (BN) and signal-to-noise ratio (S/N ratio) were determined and used to develop speech transmission index (STI) and rapid speech transmission index (RASTI) prediction models for both mosques. The results show that the RT60, STI and RASTI values show better performance with increasing occupancy in both mosques. Furthermore, the BN and SL results were visualized using the spatial distribution patterns (SDP) of the main hall. The analysis shows that the overall acoustic and speech quality of Masjid Jamek is better than that of Masjid UPM. These results are then used to develop a set of design recommendations to ensure adequate speech intelligibility in a mosque.


1997 ◽  
Vol 40 (1) ◽  
pp. 159-169 ◽  
Author(s):  
Alison L. Winkworth ◽  
Pamela J. Davis

Respiratory measurements were made using linearized magnetometers placed antero-posteriorly over the rib cages and abdomens of five healthy young women. Background noise was introduced over headphones simultaneously as "babble" presented binaurally at 55 dB ("moderate noise") and 70 dB ("high noise"). Speech during oral reading and spontaneous monologue was transduced with a microphone positioned near the lips, from which a speaking intensity signal (dBA) was derived. Subjects were instructed to speak during the noise conditions, but no instruction was given to alter speaking intensity. Compared with a "no noise" condition, the speaking intensities of all the subjects increased significantly for both speech tasks in the moderate and high noise conditions, thereby replicating the well-documented Lombard effect. No consistent trend of lung volume change was observed, in contrast to the linear increases in speech intensity as the noise level increased. For the higher speech intensities during the moderate and high noise conditions both initiation and termination lung volumes either increased or decreased. These preliminary findings suggest that when speech intensity is increased following the introduction of noise via headphones rather than by specific instructions to speak more loudly, speakers employ variable lung volume strategies for intensity control.


Author(s):  
Edin Šabić ◽  
Daniel Henning ◽  
Justin MacDonald

Missing a message from an in-vehicle device can range in severity from annoying at best to dangerous at worst. The in-cab auditory environment can vary spontaneously, making some volume levels too loud while rendering others too quiet. It is in the best interest of system designers, both from a safety and user experience perspective, to ensure that users are able to adequately hear alerts, and that drivers do not have to alter their gaze or attention during a visually and attentionally demanding task such as driving. To this end, we propose a system for dynamically tracking the background noise intensity level immediately prior to alert presentation in order to present an alert at an appropriate loudness. Furthermore, we evaluated the proposed system across both behavioral (accuracy and reaction time) and subjective (questionnaire results) measures. Behavioral results showed that while the proposed system increased recognition in one noise condition (background music), it also led to slower responses in two other noise conditions (windows-down and windows-up noise).
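The idea of choosing an alert loudness from the background noise measured just before presentation can be sketched as follows. This is an illustrative reading of the abstract, not the authors' system: the function names, the 10 dB target margin above the noise, and the output limits are all hypothetical, and levels are expressed in dB relative to a full-scale amplitude of 1.0.

```python
import math


def rms_level_db(samples, ref=1.0):
    """RMS level of a buffer in dB relative to `ref` (-inf for digital silence)."""
    if not samples:
        raise ValueError("empty buffer")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_sq / (ref * ref))


def alert_level_db(noise_buffer, target_snr_db=10.0, min_db=-40.0, max_db=0.0):
    """Pick an alert level a fixed margin above the most recent noise level.

    `noise_buffer` holds the samples captured immediately before the alert.
    The result is clamped to [min_db, max_db] so the alert is never
    inaudibly quiet nor louder than the assumed output ceiling.
    """
    noise_db = rms_level_db(noise_buffer)
    return min(max(noise_db + target_snr_db, min_db), max_db)
```

Clamping is a design choice in this sketch: without an upper bound, a very noisy cabin would push the alert to an uncomfortable level, and without a lower bound a silent cabin would make it vanish.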

