Influence of head morphology and natural postures on sound localization cues in crocodilians

2019 ◽  
Vol 6 (7) ◽  
pp. 190423 ◽  
Author(s):  
L. Papet ◽  
N. Grimault ◽  
N. Boyer ◽  
N. Mathevon

As top predators, crocodilians have an acute sense of hearing that is useful for their social life and for probing their environment in hunting situations. Although previous studies suggest that crocodilians are able to localize the position of a sound source, how they do this remains largely unknown. In this study, we measured the potential monaural sound localization cues (head-related transfer functions; HRTFs) on live animals and skulls in two situations, both mimicking natural positions: basking on land and cruising at the interface between air and water. Binaural cues were also estimated by measuring the interaural level differences (ILDs) and the interaural time differences (ITDs). In both conditions, HRTF measurements show large spectral variations (greater than 10 dB) for high frequencies, depending on the azimuthal angle. These localization cues are influenced by head size and by the internal coupling of the ears. ITDs give reliable information regarding sound-source position for low frequencies, while ILDs are more suitable for frequencies higher than 1.5 kHz. Our results support the hypothesis that crocodilian head morphology is adapted to acquire reliable localization cues from sound sources when outside the water, but also when only a small part of their head is above the air–water interface.
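To make the binaural-cue estimation concrete, here is a minimal sketch, not the authors' code, of how ITD and ILD could be computed from one measured pair of head-related impulse responses; the function name, toy impulse responses, and 48 kHz sampling rate are assumptions for illustration.

```python
import numpy as np

def itd_ild_from_hrirs(h_left, h_right, fs):
    """Estimate (ITD in seconds, ILD in dB) for one source direction."""
    # ITD: lag of the peak of the interaural cross-correlation.
    # With this convention, a negative lag means the left-ear signal leads.
    xcorr = np.correlate(h_left, h_right, mode="full")
    lags = np.arange(-(len(h_right) - 1), len(h_left))
    itd = lags[np.argmax(xcorr)] / fs
    # ILD: broadband level difference between the two ears.
    ild = 10.0 * np.log10(np.sum(h_left ** 2) / np.sum(h_right ** 2))
    return itd, ild

# Toy example: the right-ear response arrives later and weaker than the left.
fs = 48000
h_l = np.zeros(256); h_l[10] = 1.0
h_r = np.zeros(256); h_r[25] = 0.5
print(itd_ild_from_hrirs(h_l, h_r, fs))  # roughly (-0.0003 s, 6 dB): left leads and is louder
```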

1975 ◽  
Vol 63 (3) ◽  
pp. 569-585 ◽  
Author(s):  
D. L. Renaud ◽  
A. N. Popper

1. Sound localization was measured behaviourally for the Atlantic bottlenose porpoise (Tursiops truncatus) using a wide range of pure tone pulses as well as clicks simulating the species' echolocation click. 2. Measurements of the minimum audible angle (MAA) on the horizontal plane give localization discrimination thresholds of between 2 and 3 degrees for sounds from 20 to 90 kHz and thresholds from 2.8 to 4 degrees at 6, 10 and 100 kHz. With the azimuth of the animal changed relative to the speakers, the MAAs were 1.3-1.5 degrees at an azimuth of 15 degrees and about 5 degrees for an azimuth of 30 degrees. 3. MAAs to clicks were 0.7-0.8 degrees. 4. The animal was able to do almost as well in determining the position of vertical sound sources as it could for horizontal localization. 5. The data indicate that at low frequencies the animal may have been localizing by using the region around the external auditory meatus as a detector, but at frequencies above 20 kHz it is likely that the animal was detecting sounds through the lateral sides of the lower jaw. 6. Above 20 kHz, it is likely that the animal was localizing using binaural intensity cues. 7. Our data support evidence that the lower jaw is an important channel for sound detection in Tursiops.


2019 ◽  
Vol 9 (13) ◽  
pp. 2618 ◽  
Author(s):  
Tomasz Rudzki ◽  
Ignacio Gomez-Lanzaco ◽  
Jessica Stubbs ◽  
Jan Skoglund ◽  
Damian T. Murphy ◽  
...  

The increasing popularity of Ambisonics as a spatial audio format for streaming services poses new challenges to existing audio coding techniques. Immersive audio delivered to mobile devices requires an efficient bitrate compression that does not affect the spatial quality of the content. Good localizability of virtual sound sources is one of the key elements that must be preserved. This study was conducted to investigate the localization precision of virtual sound source presentations within Ambisonic scenes encoded with Opus low-bitrate compression at different bitrates and Ambisonic orders (1st, 3rd, and 5th). The test stimuli were reproduced over a 50-channel spherical loudspeaker configuration and binaurally using individually measured and generic Head-Related Transfer Functions (HRTFs). Participants were asked to adjust the position of a virtual acoustic pointer to match the position of a virtual sound source within the bitrate-compressed Ambisonic scene. Results show that auditory localization in low-bitrate compressed Ambisonic scenes is not significantly affected by codec parameters. The key factors influencing localization are the rendering method and Ambisonic order truncation. This suggests that efficient perceptual coding might be successfully used for mobile spatial audio delivery.
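As background on how a virtual source is placed into an Ambisonic scene, the sketch below encodes a mono signal into first-order Ambisonics. The ACN/SN3D convention, function name, and test tone are assumptions for the example; the study itself used orders up to five and Opus-compressed scenes.

```python
import numpy as np

def encode_foa(mono, azimuth_deg, elevation_deg=0.0):
    """Encode a mono signal into first-order Ambisonics (ACN channel order, SN3D gains)."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    gains = np.array([
        1.0,                      # W (omnidirectional)
        np.sin(az) * np.cos(el),  # Y
        np.sin(el),               # Z
        np.cos(az) * np.cos(el),  # X
    ])
    return gains[:, None] * mono  # shape (4, num_samples)

# Example: a 1 kHz tone placed 45 degrees to the left at ear height.
fs = 48000
t = np.arange(fs) / fs
scene = encode_foa(np.sin(2 * np.pi * 1000 * t), azimuth_deg=45)
```

Higher orders add further spherical-harmonic channels, which is why order truncation limits the achievable localization precision.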


2017 ◽  
Author(s):  
Mark A. Steadman ◽  
Chungeun Kim ◽  
Jean-Hugues Lestang ◽  
Dan F. M. Goodman ◽  
Lorenzo Picinali

Head-related transfer functions (HRTFs) capture the direction-dependent way that sound interacts with the head and torso. In virtual audio systems, which aim to emulate these effects, non-individualized, generic HRTFs are typically used, leading to an inaccurate perception of virtual sound location. Training has the potential to exploit the brain’s ability to adapt to these unfamiliar cues. In this study, three virtual sound localization training paradigms were evaluated: one provided simple visual positional confirmation of sound source location, a second introduced game design elements (“gamification”) and a final version additionally utilized head-tracking to provide listeners with experience of relative sound source motion (“active listening”). The results demonstrate a significant effect of training after a small number of short (12-minute) training sessions, which is retained across multiple days. Gamification alone had no significant effect on the efficacy of the training, but active listening resulted in significantly greater improvements in localization accuracy. In general, improvements in virtual sound localization following training generalized to a second set of non-individualized HRTFs, although some HRTF-specific changes were observed in polar angle judgement for the active listening group. The implications of this for the putative mechanisms of the adaptation process are discussed.
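For context, here is a minimal sketch of the kind of binaural rendering such virtual audio systems perform: a mono source convolved with a left/right head-related impulse response (HRIR) pair for the target direction. The function name is illustrative and this is not the study's implementation; in practice the HRIR pair would be drawn from a generic database and, for the active-listening condition, updated as head orientation changes.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono, hrir_left, hrir_right):
    """Render a mono source at the direction described by the given HRIR pair."""
    left = fftconvolve(mono, hrir_left)
    right = fftconvolve(mono, hrir_right)
    return np.stack([left, right])  # binaural signal, shape (2, len(mono) + len(hrir) - 1)
```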


2019 ◽  
Author(s):  
Daniel P. Kumpik ◽  
Connor Campbell ◽  
Jan W.H. Schnupp ◽  
Andrew J King

Sound localization requires the integration in the brain of auditory spatial cues generated by interactions with the external ears, head and body. Perceptual learning studies have shown that the relative weighting of these cues can change in a context-dependent fashion if their relative reliability is altered. One factor that may influence this process is vision, which tends to dominate localization judgments when both modalities are present and induces a recalibration of auditory space if they become misaligned. It is not known, however, whether vision can alter the weighting of individual auditory localization cues. Using non-individualized head-related transfer functions, we measured changes in subjects’ sound localization biases and binaural localization cue weights after ~55 minutes of training on an audiovisual spatial oddball task. Four different configurations of spatial congruence between visual and auditory cues (interaural time differences (ITDs) and frequency-dependent interaural level differences (interaural level spectra, ILS)) were used. When visual cues were spatially congruent with both auditory spatial cues, we observed an improvement in sound localization, as shown by a reduction in the variance of subjects’ localization biases, which was accompanied by an up-weighting of the more salient ILS cue. However, if the position of either one of the auditory cues was randomized during training, no overall improvement in sound localization occurred. Nevertheless, the spatial gain of whichever cue was matched with vision increased, with different effects observed on the gain for the randomized cue depending on whether ITDs or ILS were matched with vision. As a result, we observed a similar up-weighting in ILS when this cue alone was matched with vision, but no overall change in binaural cue weighting when ITDs corresponded to the visual cues and ILS were randomized. Consistently misaligning both cues with vision produced the ventriloquism aftereffect, i.e., a corresponding shift in auditory localization bias, without affecting the variability of the subjects’ sound localization judgments, and no overall change in binaural cue weighting. These data show that visual contextual information can invoke a reweighting of auditory localization cues, although concomitant improvements in sound localization are only likely to accompany training with fully congruent audiovisual information.
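One simple way to quantify binaural cue weights, offered here as an assumed analysis rather than the authors' exact model, is to regress response azimuths on the azimuths signalled separately by ITD and ILS when the two cues are decoupled across trials; the function and variable names below are hypothetical.

```python
import numpy as np

def fit_cue_weights(response_az, itd_az, ils_az):
    """Least-squares weights relating per-trial responses to the azimuths signalled by each cue."""
    # Columns: ITD-signalled azimuth, ILS-signalled azimuth, constant (localization bias).
    X = np.column_stack([itd_az, ils_az, np.ones_like(itd_az)])
    (w_itd, w_ils, bias), *_ = np.linalg.lstsq(X, response_az, rcond=None)
    return w_itd, w_ils, bias
```

Under this reading, an "up-weighting of ILS" corresponds to an increase in `w_ils` relative to `w_itd` after training.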


2017 ◽  
Vol 29 (1) ◽  
pp. 72-82 ◽  
Author(s):  
Takuya Suzuki ◽  
Hiroaki Otsuka ◽  
Wataru Akahori ◽  
Yoshiaki Bando ◽  
...  

[Figure: six impulse response measurement signals] Two major functions provided by the open-source robot audition software HARK, sound source localization and sound source separation, exploit the acoustic transfer functions of a microphone array to improve performance. The acoustic transfer functions are calculated from the measured acoustic impulse response. In the measurement, special signals such as the Time Stretched Pulse (TSP) are used to improve the signal-to-noise ratio of the measurement. Recent studies have identified the importance of selecting a measurement signal according to the application. In this paper, we investigate how six measurement signals – up-TSP, down-TSP, M-Series, Log-SS, NW-SS, and MN-SS – influence the performance of the MUSIC-based sound source localization provided by HARK. Experiments with simulated sounds, using up to three simultaneous sound sources, demonstrate no significant difference among the six measurement signals in MUSIC-based sound source localization.
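For illustration, the sketch below generates an exponential (logarithmic) swept sine, one common form of Log-SS measurement signal; the parameter values are arbitrary and the exact signal definitions used in HARK may differ.

```python
import numpy as np

def log_swept_sine(f1, f2, duration, fs):
    """Exponential (logarithmic) swept sine from f1 to f2 Hz over `duration` seconds."""
    t = np.arange(int(duration * fs)) / fs
    k = np.log(f2 / f1)
    return np.sin(2.0 * np.pi * f1 * duration / k * (np.exp(t / duration * k) - 1.0))

# Example: a 2 s sweep from 20 Hz to 20 kHz at 48 kHz sampling rate.
sweep = log_swept_sine(f1=20.0, f2=20000.0, duration=2.0, fs=48000)
```

Deconvolving the recorded response with the (inverse) sweep yields the impulse response from which the acoustic transfer functions are obtained.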


1998 ◽  
Vol 80 (2) ◽  
pp. 863-881 ◽  
Author(s):  
John C. Middlebrooks ◽  
Li Xu ◽  
Ann Clock Eddins ◽  
David M. Green

Middlebrooks, John C., Li Xu, Ann Clock Eddins, and David M. Green. Codes for sound-source location in nontonotopic auditory cortex. J. Neurophysiol. 80: 863–881, 1998. We evaluated two hypothetical codes for sound-source location in the auditory cortex. The topographical code assumed that single neurons are selective for particular locations and that sound-source locations are coded by the cortical location of small populations of maximally activated neurons. The distributed code assumed that the responses of individual neurons can carry information about locations throughout 360° of azimuth and that accurate sound localization derives from information that is distributed across large populations of such panoramic neurons. We recorded from single units in the anterior ectosylvian sulcus area (area AES) and in area A2 of α-chloralose–anesthetized cats. Results obtained in the two areas were essentially equivalent. Noise bursts were presented from loudspeakers spaced in 20° intervals of azimuth throughout 360° of the horizontal plane. Spike counts of the majority of units were modulated >50% by changes in sound-source azimuth. Nevertheless, sound-source locations that produced greater than half-maximal spike counts often spanned >180° of azimuth. The spatial selectivity of units tended to broaden and, often, to shift in azimuth as sound pressure levels (SPLs) were increased to a moderate level. We sometimes saw systematic changes in spatial tuning along segments of electrode tracks as long as 1.5 mm, but such progressions were not evident at higher sound levels. Moderate-level sounds presented anywhere in the contralateral hemifield produced greater than half-maximal activation of nearly all units. These results are not consistent with the hypothesis of a topographic code. We used an artificial-neural-network algorithm to recognize spike patterns and, thereby, infer the locations of sound sources. Network input consisted of spike density functions formed by averages of responses to eight stimulus repetitions. Information carried in the responses of single units permitted reasonable estimates of sound-source locations throughout 360° of azimuth. The most accurate units exhibited median errors in localization of <25°, meaning that the network output fell within 25° of the correct location on half of the trials. Spike patterns tended to vary with stimulus SPL, but level-invariant features of patterns permitted estimates of locations of sound sources that varied through 20-dB ranges. Sound localization based on spike patterns that preserved details of spike timing consistently was more accurate than localization based on spike counts alone. These results support the hypothesis that sound-source locations are represented by a distributed code and that individual neurons are, in effect, panoramic localizers.
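The pattern-recognition step can be illustrated with a toy decoder: a small feed-forward network classifying azimuth (18 classes at 20° steps) from spike-density features. The random data below merely stands in for averaged neural responses, so the reported accuracy sits at chance; this is not the authors' network.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy stand-in data: spike-density features (trials x time bins) and azimuth labels.
rng = np.random.default_rng(0)
n_trials, n_bins, n_azimuths = 360, 40, 18
features = rng.poisson(3.0, size=(n_trials, n_bins)).astype(float)
labels = rng.integers(0, n_azimuths, size=n_trials)

# Train on the first 300 trials, test on the remaining 60.
net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000, random_state=0)
net.fit(features[:300], labels[:300])
print(net.score(features[300:], labels[300:]))  # ~1/18 on random data
```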


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 865 ◽  
Author(s):  
Ming Zan ◽  
Zhongming Xu ◽  
Linsen Huang ◽  
Zhifei Zhang

Near-field acoustic holography (NAH) based on the equivalent source method (ESM) is an effective approach for identifying sound sources. Conventional ESM focuses on relatively low frequencies and cannot provide a satisfactory solution at high frequencies, so an improved method, wideband acoustic holography (WBH), has been proposed, which achieves high reconstruction accuracy at medium-to-high frequencies. However, it is less accurate for coherent sound sources at low frequencies. To improve the reconstruction accuracy of conventional ESM and WBH, a sound source identification algorithm based on Bayesian compressive sensing (BCS) and ESM is proposed. This method uses a hierarchical Laplace sparse prior probability distribution and adaptively adjusts the regularization parameter so that the energy is concentrated near the correct equivalent sources. Borrowing the idea of functional beamforming, raising the algorithm to order v improves its dynamic range, so that more accurate position information is obtained. Simulations with an irregular microphone array and comparisons with conventional ESM and WBH show that the proposed method is more accurate, suitable for a wider range of frequencies, and has better reconstruction performance for coherent sources. By increasing the order v, the coherent sources can be located accurately. Finally, the stability and reliability of the proposed method are verified by experiments.
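To show the basic ESM formulation that the proposed BCS approach builds on, here is a minimal sketch in which a Tikhonov-regularized least-squares solve stands in for the hierarchical Laplace prior and adaptive regularization described in the abstract; the matrix and function names are assumptions.

```python
import numpy as np

def greens_matrix(mic_pos, src_pos, k):
    """Free-field Green's functions between equivalent-source and microphone positions."""
    r = np.linalg.norm(mic_pos[:, None, :] - src_pos[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)

def solve_esm(G, p, lam=1e-3):
    """Equivalent-source strengths q minimizing ||G q - p||^2 + lam ||q||^2."""
    GhG = G.conj().T @ G
    return np.linalg.solve(GhG + lam * np.eye(G.shape[1]), G.conj().T @ p)
```

In the BCS variant, the fixed `lam` would be replaced by regularization inferred from the sparse prior, concentrating energy on the correct equivalent sources.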


2012 ◽  
Vol 108 (9) ◽  
pp. 2612-2628 ◽  
Author(s):  
Mitchell L. Day ◽  
Kanthaiah Koka ◽  
Bertrand Delgutte

In the presence of multiple, spatially separated sound sources, the binaural cues used for sound localization in the horizontal plane become distorted relative to the cues produced by each sound in isolation, yet localization in everyday multisource acoustic environments remains robust. We examined changes in the azimuth tuning functions of inferior colliculus (IC) neurons in unanesthetized rabbits to a target broadband noise when a concurrent broadband noise interferer was presented at different locations in virtual acoustic space. The presence of an interferer generally degraded sensitivity to target azimuth and distorted the shape of the tuning function, yet most neurons remained significantly sensitive to target azimuth and maintained tuning function shapes somewhat similar to those for the target alone. Using binaural cue manipulations in virtual acoustic space, we found that single-source tuning functions of neurons with high best frequencies (BFs) were primarily determined by interaural level differences (ILDs) or monaural level, with a small influence of interaural time differences (ITDs) in some neurons. However, with a centrally located interferer, the tuning functions of most high-BF neurons were strongly influenced by ITDs as well as ILDs. Model-based analysis showed that the shapes of these tuning functions were in part produced by decorrelation of the left and right cochlea-induced envelopes that occurs with source separation. The strong influence of ITD on the tuning functions of high-BF neurons poses a challenge to the “duplex theory” of sound localization and suggests that ITD may be important for localizing high-frequency sounds in multisource environments.
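The envelope-decorrelation idea can be sketched as follows: correlate the amplitude envelopes at the two ears, a quantity that drops when a spatially separated interferer is added. A broadband Hilbert envelope is used here as a stand-in for cochlea-induced (band-pass filtered) envelopes; this is an illustration, not the authors' model-based analysis.

```python
import numpy as np
from scipy.signal import hilbert

def interaural_envelope_correlation(left, right):
    """Normalized correlation between the Hilbert envelopes at the two ears."""
    env_l = np.abs(hilbert(left))
    env_r = np.abs(hilbert(right))
    env_l -= env_l.mean()
    env_r -= env_r.mean()
    return np.sum(env_l * env_r) / np.sqrt(np.sum(env_l ** 2) * np.sum(env_r ** 2))
```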


2021 ◽  
Author(s):  
Enrique A. Lopez-Poveda ◽  
Almudena Eustaquio-Martín ◽  
Fernando Martín San Victoriano

Understanding speech presented in competition with other sound sources can be challenging. Here, we reason that this task can be facilitated by improving the signal-to-noise ratio (SNR) in either of the two ears and that, in free-field listening scenarios, this can be achieved by attenuating contralateral sounds. We present a binaural (pre)processing algorithm that improves the SNR in the ear ipsilateral to the target sound source by linear subtraction of the weighted contralateral stimulus. Although the weight is regarded as a free parameter, we justify setting it equal to the ratio of ipsilateral to contralateral head-related transfer functions averaged over an appropriate azimuth range. The algorithm is implemented in the frequency domain and evaluated technically and experimentally for normal-hearing listeners in simulated free-field conditions. Results show that (1) it can substantially improve the SNR (up to 20 dB) and the short-term intelligibility metric in the ear ipsilateral to the target source, particularly for speech-like maskers; (2) it can improve speech reception thresholds for sentences in competition with speech-shaped noise by up to 8.5 dB in bilateral listening and 10.0 dB in unilateral listening; (3) it hardly affects sound-source localization; and (4) the improvements and the algorithm’s directivity pattern depend on the weights. The algorithm accounts qualitatively for binaural unmasking for speech in competition with multiple maskers and for multiple target-masker spatial arrangements, an unexpected property that can inspire binaural intelligibility models.
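Below is a minimal sketch of the contralateral-subtraction idea for one ear and one signal frame, assuming the per-frequency weight is the ratio of ipsilateral to contralateral HRTF magnitudes averaged over some azimuth range; framing, windowing, and the exact weight definition used in the paper are not reproduced here.

```python
import numpy as np

def subtract_contralateral(x_ipsi, x_contra, hrtf_ipsi_mag, hrtf_contra_mag):
    """Frequency-domain subtraction of the weighted contralateral signal for one frame.

    hrtf_*_mag: (n_azimuths, n_bins) magnitude responses, with n_bins matching the
    rfft length of the frame; the azimuth range used for averaging is an assumption.
    """
    w = np.mean(hrtf_ipsi_mag / hrtf_contra_mag, axis=0)  # per-frequency weight
    X_i = np.fft.rfft(x_ipsi)
    X_c = np.fft.rfft(x_contra)
    return np.fft.irfft(X_i - w * X_c, n=len(x_ipsi))
```

Because contralateral sounds are attenuated more than ipsilateral ones, the SNR in the ear nearer the target improves, which is the effect quantified in the abstract.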

