Localizing Sound Sources in a CAVE-Like Virtual Environment with Loudspeaker Array Reproduction

2007 ◽  
Vol 16 (2) ◽  
pp. 157-171 ◽  
Author(s):  
Matti Gröhn ◽  
Tapio Lokki ◽  
Tapio Takala

In a CAVE-like virtual environment, spatial audio is typically reproduced by amplitude panning over loudspeakers placed behind the screens. We conducted a localization experiment in which the subjects' task was to point to the perceived location of a sound source. Measured accuracy for a static source was as good as the accuracy in previous headphone experiments using head-related transfer functions. We also measured localization accuracy for a moving auditory stimulus; here, accuracy decreased by an amount comparable to the minimum audible movement angle.
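
For context, a minimal sketch of the kind of pairwise amplitude panning used behind CAVE screens; the tangent-law formulation and all names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def tangent_law_gains(source_az_deg, speaker_half_angle_deg=30.0):
    """Gains for a symmetric loudspeaker pair via the tangent panning law.

    tan(phi) / tan(phi0) = (gL - gR) / (gL + gR), where phi is the target
    azimuth and phi0 is the half-angle between the two loudspeakers.
    """
    k = np.tan(np.radians(source_az_deg)) / np.tan(np.radians(speaker_half_angle_deg))
    g_left, g_right = 1.0 + k, 1.0 - k
    norm = np.hypot(g_left, g_right)  # power normalization
    return g_left / norm, g_right / norm

# A source at 0 degrees yields equal gains (~0.707 each).
print(tangent_law_gains(0.0))
```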

2019 ◽  
Vol 9 (13) ◽  
pp. 2618 ◽  
Author(s):  
Tomasz Rudzki ◽  
Ignacio Gomez-Lanzaco ◽  
Jessica Stubbs ◽  
Jan Skoglund ◽  
Damian T. Murphy ◽  
...  

The increasing popularity of Ambisonics as a spatial audio format for streaming services poses new challenges to existing audio coding techniques. Immersive audio delivered to mobile devices requires efficient bitrate compression that does not affect the spatial quality of the content. Good localizability of virtual sound sources is one of the key elements that must be preserved. This study investigated the localization precision of virtual sound sources within Ambisonic scenes encoded with the Opus codec at different low bitrates and Ambisonic orders (1st, 3rd, and 5th). The test stimuli were reproduced over a 50-channel spherical loudspeaker configuration, as well as binaurally using individually measured and generic Head-Related Transfer Functions (HRTFs). Participants were asked to adjust the position of a virtual acoustic pointer to match the position of a virtual sound source within the bitrate-compressed Ambisonic scene. Results show that auditory localization in low-bitrate-compressed Ambisonic scenes is not significantly affected by codec parameters; the key factors influencing localization are the rendering method and Ambisonic order truncation. This suggests that efficient perceptual coding might be successfully used for mobile spatial audio delivery.
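
For illustration, a minimal sketch of horizontal-only Ambisonic encoding that makes order truncation concrete. This is a simplified assumption: real pipelines use full 3D spherical harmonics and standardized channel ordering/normalization (e.g. ACN/SN3D), which are not modeled here.

```python
import numpy as np

def encode_horizontal_ambisonics(signal, azimuth_deg, order):
    """Encode a mono signal as horizontal-only Ambisonic channels.

    Returns shape (2*order + 1, n_samples): W plus one cosine/sine pair
    per order. Truncating `order` (e.g. 5th -> 1st) discards the higher
    spatial harmonics, which coarsens how precisely a source direction
    can be reconstructed.
    """
    sig = np.asarray(signal, dtype=float)
    az = np.radians(azimuth_deg)
    channels = [sig]                             # order 0 (W)
    for m in range(1, order + 1):
        channels.append(sig * np.cos(m * az))    # order m, cosine term
        channels.append(sig * np.sin(m * az))    # order m, sine term
    return np.vstack(channels)
```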


1999 ◽  
Vol 58 (3) ◽  
pp. 170-179 ◽  
Author(s):  
Barbara S. Muller ◽  
Pierre Bovet

Twelve blindfolded subjects localized two different pure tones, randomly played by eight sound sources in the horizontal plane. Subjects either did or did not have access to the information supplied by their pinnae (external ears) and by their head movements. We found that both pinnae and head movements had a marked influence on auditory localization performance with this type of sound. The effects of pinnae and head movements seemed to be additive; the absence of either factor provoked the same loss of localization accuracy, and even much the same error pattern. Head movement analysis showed that subjects turned their faces toward the emitting sound source, except for sources exactly in front or exactly in the rear, which were identified by turning the head to both sides. Head movement amplitude increased smoothly as the sound source moved from the anterior to the posterior quadrant.


2021 ◽  
Vol 263 (5) ◽  
pp. 1488-1496
Author(s):  
Yunqi Chen ◽  
Chuang Shi ◽  
Hao Mu

Earphones are commonly equipped with miniature loudspeaker units, which cannot deliver sufficient power at low frequencies. Moreover, there is often only one loudspeaker unit on each side of the earphone, so multi-channel spatial audio processing cannot be applied. Therefore, the combined use of virtual bass (VB) and head-related transfer functions (HRTFs) is necessary for an immersive listening experience with earphones. However, the combined effect of VB and HRTFs has not been comprehensively reported. VB is based on the missing-fundamental effect, whereby a series of harmonics can be perceived as their fundamental frequency even when the fundamental itself is not present. HRTFs describe the transmission of sound propagating from a source to the human ears; monaural audio processed by a pair of HRTFs is perceived by the listener as a sound source located in the direction associated with those HRTFs. This paper carries out subjective listening tests whose results reveal that the harmonics required by the VB should be generated in the same direction as the high-frequency sound. The bass quality is rarely distorted by the presence of HRTFs, but localization accuracy is occasionally degraded by the VB.
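
As a rough sketch of the VB idea described above (a simplified assumption, not the authors' processing chain): extract the bass band, generate harmonics with a memoryless nonlinearity, and mix back only the harmonics the miniature driver can reproduce.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def virtual_bass(x, fs, cutoff_hz=120.0, mix=0.5):
    """Illustrative virtual-bass chain based on the missing fundamental.

    `mix` is an arbitrary harmonic level; real systems tune it perceptually.
    """
    lp = butter(4, cutoff_hz, btype="low", fs=fs, output="sos")
    hp = butter(4, cutoff_hz, btype="high", fs=fs, output="sos")
    bass = sosfilt(lp, x)
    # Half-wave rectification generates harmonics of the bass band (plus
    # DC and the fundamental, which the high-pass below removes).
    harmonics = sosfilt(hp, np.maximum(bass, 0.0))
    # Output: the signal without its bass, plus reproducible harmonics
    # that evoke the removed fundamental.
    return sosfilt(hp, x) + mix * harmonics
```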


2020 ◽  
Author(s):  
Josefa Oberem ◽  
Jan-Gerrit Richter ◽  
Dorothea Setzer ◽  
Julia Seibold ◽  
Iring Koch ◽  
...  

Binaural reproduction can be used in listening experiments under real-life conditions to achieve high realism and good reproducibility. In recent years, a clear trend toward more individualized reproduction can be observed, as the ability to measure individual head-related transfer functions (HRTFs) is becoming more widespread. However, the question of the accuracy and the reproduction methods needed for realistic playback has not been sufficiently answered. To evaluate an appropriate approach for binaural reproduction via headphones, different HRTFs and reproduction methods were compared in this paper. In a listening test, eleven explicitly trained participants were asked to localize eleven sound sources positioned in the right hemisphere using the proximal pointing method. Binaural stimuli based on individually measured HRTFs were compared to those of an artificial head, in a static reproduction of stimuli and in three dynamic reproduction methods of different resolutions (5°, 2.5°, and 1°). Unsigned errors in azimuth and elevation, as well as front-back confusions and in-head localization, were observed. Dynamic reproduction at any of the applied resolutions turned out to be fundamental for reducing undesired front-back confusions and in-head localization. Individually measured HRTFs showed a smaller effect on localization accuracy than dynamic sound reproduction; they were mainly observed to reduce the front-back confusion rate.
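
To make the resolution comparison concrete, a minimal sketch (an assumption about the mechanics, not the study's renderer) of how a dynamic binaural renderer snaps the tracked head azimuth to an HRTF grid of a given spacing:

```python
def nearest_hrtf_azimuth(head_az_deg: float, grid_deg: float) -> float:
    """Snap a tracked head azimuth to the nearest measured HRTF direction
    on a regular grid; the study compares 5, 2.5 and 1 degree spacings."""
    return (round(head_az_deg / grid_deg) * grid_deg) % 360.0

# Example: with a 5-degree grid, a head yaw of 12.6 degrees selects the
# HRTF measured at 15 degrees.
assert nearest_hrtf_azimuth(12.6, 5.0) == 15.0
```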


2007 ◽  
Vol 16 (5) ◽  
pp. 509-522 ◽  
Author(s):  
Fakheredine Keyrouz ◽  
Klaus Diepold

Telepresence is generally described as the feeling of being immersed in a remote environment, be it virtual or real. A multimodal telepresence environment, equipped with modalities such as vision, audition, and haptics, improves immersion and augments the overall perceptual presence. The present work focuses on acoustic telepresence at both the teleoperator and operator sites. On the teleoperator side, we build a novel binaural sound source localizer using generic Head-Related Transfer Functions (HRTFs). This localizer estimates the direction of a single sound source, in terms of azimuth and elevation angles in free space, using only two microphones, and its algorithm is efficient compared to currently known algorithms used in similar localization processes. On the operator side, the paper addresses the problem of spatially interpolating HRTFs for densely sampled, high-fidelity 3D sound synthesis. In our telepresence application scenario, the synthesized 3D sound is presented to the operator over headphones and shall achieve high-fidelity acoustic immersion. Using measured HRTF data, we create interpolated HRTFs between the existing functions using a matrix-valued interpolation function. Comparison with existing interpolation methods reveals that our new method offers superior performance and is capable of high-fidelity reconstruction of HRTFs.
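
The paper's matrix-valued interpolation is not reproduced here; as a point of reference, a common baseline it would be compared against amounts to linearly weighting the two nearest measured responses (a sketch under that assumption):

```python
import numpy as np

def interp_hrtf_linear(h_a, h_b, az_a, az_b, az_target):
    """Common baseline: linear interpolation between two measured HRTF
    filters bracketing the target azimuth. The paper proposes a
    matrix-valued interpolation function reported to outperform simple
    schemes of this kind.
    """
    w = (az_target - az_a) / (az_b - az_a)
    return (1.0 - w) * np.asarray(h_a) + w * np.asarray(h_b)
```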


2017 ◽  
Author(s):  
Mark A. Steadman ◽  
Chungeun Kim ◽  
Jean-Hugues Lestang ◽  
Dan F. M. Goodman ◽  
Lorenzo Picinali

Head-related transfer functions (HRTFs) capture the direction-dependent way that sound interacts with the head and torso. In virtual audio systems, which aim to emulate these effects, non-individualized, generic HRTFs are typically used, leading to an inaccurate perception of virtual sound location. Training has the potential to exploit the brain's ability to adapt to these unfamiliar cues. In this study, three virtual sound localization training paradigms were evaluated: one provided simple visual confirmation of sound source position, a second introduced game design elements ("gamification"), and a final version additionally used head tracking to give listeners experience of relative sound source motion ("active listening"). The results demonstrate a significant effect of training after a small number of short (12-minute) training sessions, which is retained across multiple days. Gamification alone had no significant effect on the efficacy of the training, but active listening resulted in significantly greater improvements in localization accuracy. In general, improvements in virtual sound localization following training generalized to a second set of non-individualized HRTFs, although some HRTF-specific changes were observed in polar angle judgments for the active listening group. The implications of this for the putative mechanisms of the adaptation process are discussed.


2017 ◽  
Vol 29 (1) ◽  
pp. 72-82 ◽  
Author(s):  
Takuya Suzuki ◽  
Hiroaki Otsuka ◽  
Wataru Akahori ◽  
Yoshiaki Bando ◽  
...  

[Figure: six impulse response measurement signals]

Sound source localization and sound source separation, the two major functions provided by the open-source robot audition software HARK, exploit the acoustic transfer functions of a microphone array to improve performance. The acoustic transfer functions are calculated from measured acoustic impulse responses. In the measurement, special signals such as the Time-Stretched Pulse (TSP) are used to improve the signal-to-noise ratio of the measurement. Recent studies have identified the importance of selecting a measurement signal according to the application. In this paper, we investigate how six measurement signals – up-TSP, down-TSP, M-Series, Log-SS, NW-SS, and MN-SS – influence the performance of the MUSIC-based sound source localization provided by HARK. Experiments with simulated sounds, with up to three simultaneous sound sources, demonstrate no significant difference among the six measurement signals in MUSIC-based sound source localization.
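
For reference, one of the six signal classes (Log-SS, the logarithmic swept sine) is straightforward to generate; a minimal sketch using the standard exponential-sweep formula (the parameters below are illustrative):

```python
import numpy as np

def log_sweep(f_start, f_end, duration_s, fs):
    """Logarithmic (exponential) swept sine, the 'Log-SS' signal class.

    The impulse response is recovered by convolving the recorded response
    with the sweep's amplitude-compensated time reversal (inverse filter).
    """
    t = np.arange(int(duration_s * fs)) / fs
    rate = duration_s / np.log(f_end / f_start)
    return np.sin(2.0 * np.pi * f_start * rate * (np.exp(t / rate) - 1.0))

sweep = log_sweep(20.0, 20000.0, 5.0, 48000)  # 5 s sweep, 20 Hz to 20 kHz
```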


2021 ◽  
Author(s):  
Enrique A. Lopez-Poveda ◽  
Almudena Eustaquio-Martín ◽  
Fernando Martín San Victoriano

Understanding speech presented in competition with other sound sources can be challenging. Here, we reason that this task can be facilitated by improving the signal-to-noise ratio (SNR) in either of the two ears, and that in free-field listening scenarios this can be achieved by attenuating contralateral sounds. We present a binaural (pre)processing algorithm that improves the SNR in the ear ipsilateral to the target sound source by linear subtraction of the weighted contralateral stimulus. Although the weight is regarded as a free parameter, we justify setting it equal to the ratio of ipsilateral to contralateral head-related transfer functions averaged over an appropriate azimuth range. The algorithm is implemented in the frequency domain and evaluated technically and experimentally for normal-hearing listeners in simulated free-field conditions. Results show that (1) it can substantially improve the SNR (up to 20 dB) and the short-term intelligibility metric in the ear ipsilateral to the target source, particularly for speech-like maskers; (2) it can improve speech reception thresholds for sentences in competition with speech-shaped noise by up to 8.5 dB in bilateral listening and 10.0 dB in unilateral listening; (3) it hardly affects sound-source localization; and (4) the improvements and the algorithm's directivity pattern depend on the weights. The algorithm accounts qualitatively for binaural unmasking of speech in competition with multiple maskers and for multiple target-masker spatial arrangements, an unexpected property that can inspire binaural intelligibility models.
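
The core operation the abstract describes reduces to a per-frequency weighted subtraction; a minimal sketch of one plausible reading (the weight construction here is an assumption based on the abstract's wording, not the authors' exact implementation):

```python
import numpy as np

def enhance_ipsilateral(X_ipsi, X_contra, W):
    """Frequency-domain contralateral subtraction:
    Y_ipsi(f) = X_ipsi(f) - W(f) * X_contra(f)."""
    return X_ipsi - W * X_contra

def fixed_weight(hrtf_ipsi, hrtf_contra):
    """One reading of the weight: the per-frequency ratio of ipsilateral
    to contralateral HRTF magnitudes, averaged over an azimuth range.
    Input arrays are shaped (n_azimuths, n_freq_bins)."""
    return np.mean(np.abs(hrtf_ipsi), axis=0) / np.mean(np.abs(hrtf_contra), axis=0)
```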


2019 ◽  
Vol 6 (7) ◽  
pp. 190423 ◽  
Author(s):  
L. Papet ◽  
N. Grimault ◽  
N. Boyer ◽  
N. Mathevon

As top predators, crocodilians have an acute sense of hearing that is useful for their social life and for probing their environment in hunting situations. Although previous studies suggest that crocodilians are able to localize the position of a sound source, how they do so remains largely unknown. In this study, we measured the potential monaural sound localization cues (head-related transfer functions; HRTFs) on live animals and on skulls in two situations, both mimicking natural positions: basking on land and cruising at the interface between air and water. Binaural cues were also estimated by measuring the interaural level differences (ILDs) and the interaural time differences (ITDs). In both conditions, HRTF measurements show large spectral variations (greater than 10 dB) at high frequencies, depending on the azimuthal angle. These localization cues are influenced by head size and by the internal coupling of the ears. ITDs give reliable information regarding sound-source position for low frequencies, while ILDs are more suitable for frequencies above 1.5 kHz. Our results support the hypothesis that crocodilian head morphology is adapted to acquire reliable localization cues from sound sources not only when the animal is out of the water, but also when only a small part of its head is above the air-water interface.
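
The two binaural cues mentioned can be estimated from a pair of ear-canal recordings with textbook estimators; a minimal sketch (not the authors' measurement pipeline):

```python
import numpy as np

def estimate_itd_ild(left, right, fs):
    """Textbook binaural-cue estimators.

    ITD: lag (s) of the peak of the interaural cross-correlation;
         positive means the left signal lags the right.
    ILD: interaural RMS level difference (dB), left re right.
    """
    xcorr = np.correlate(left, right, mode="full")
    itd = (np.argmax(xcorr) - (len(right) - 1)) / fs
    rms = lambda s: np.sqrt(np.mean(np.square(s)))
    ild = 20.0 * np.log10(rms(left) / rms(right))
    return itd, ild
```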


2015 ◽  
Vol 2 (6) ◽  
pp. 140473 ◽  
Author(s):  
Reinhard Lakes-Harlan ◽  
Jan Scherberich

A primary task of auditory systems is the localization of sound sources in space. Sound source localization in azimuth is usually based on temporal or intensity differences of sounds between the bilaterally arranged ears. In mammals, localization in elevation is made possible by the transfer functions at the ear, especially the pinnae. Although insects are able to locate sound sources, little attention has been given to the mechanisms of acoustic orientation toward elevated positions. Here we comparatively analyse the peripheral hearing thresholds of three species of bushcrickets with respect to sound source positions in space. The hearing thresholds across frequencies depend on the location of a sound source in the three-dimensional hearing space in front of the animal: thresholds differ for different azimuthal positions and for different positions in elevation. This position-dependent frequency tuning is species specific. The largest differences in thresholds between positions are found in Ancylecha fenestrata; correspondingly, A. fenestrata has a rather complex ear morphology, including cuticular folds covering the anterior tympanal membrane. The position-dependent tuning might contribute to sound source localization in their habitats. Acoustic orientation might be a selective factor for the evolution of morphological structures at the bushcricket ear and, speculatively, even for frequency fractioning in the ear.

