Noise-Robust MUSIC-Based Sound Source Localization Using Steering Vector Transformation for Small Humanoids

2017 ◽  
Vol 29 (1) ◽  
pp. 26-36 ◽  
Author(s):  
Ryu Takeda ◽  
Kazunori Komatani

[Figure: Sound source localization and problem]

We focus on the problem of localizing soft/weak voices recorded by small humanoid robots, such as NAO. Sound source localization (SSL) for such robots requires fast processing and noise robustness owing to the restricted resources and the internal noise close to the microphones. Multiple signal classification using generalized eigenvalue decomposition (GEVD-MUSIC) is a promising method for SSL. It achieves noise robustness by whitening robot internal noise using prior noise information. However, whitening increases the computational cost and creates a direction-dependent bias in the localization score, which degrades the localization accuracy. We have thus developed a new implementation of GEVD-MUSIC based on steering vector transformation (TSV-MUSIC). Applying a transformation equivalent to whitening to the steering vectors in advance reduces the real-time computational cost of TSV-MUSIC. Moreover, normalization of the transformed vectors cancels the direction-dependent bias and improves the localization accuracy. Experiments using simulated data showed that TSV-MUSIC had the highest accuracy of the methods tested. An experiment using real recorded data showed that TSV-MUSIC outperformed GEVD-MUSIC and other MUSIC methods in terms of localization by about 4 points under low signal-to-noise-ratio conditions.
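The idea of whitening with a known noise covariance and then normalizing the transformed steering vectors can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the far-field planar array model, the Cholesky-based whitening, and the `eps` regularizer are all assumptions made for illustration. The sketch whitens the observed covariance `R` with the noise covariance `K`, applies the same transformation to the steering vectors, and scores each direction with the squared norm of the transformed vector divided by its energy in the noise subspace.

```python
import numpy as np

def steering_vectors(mic_xy, angles_rad, freq_hz, c=343.0):
    """Far-field steering vectors for a planar array, shape (n_angles, n_mics).

    mic_xy: (n_mics, 2) microphone positions in metres (hypothetical layout).
    """
    directions = np.stack([np.cos(angles_rad), np.sin(angles_rad)], axis=1)
    delays = directions @ mic_xy.T / c  # relative delays in seconds
    return np.exp(2j * np.pi * freq_hz * delays)

def whitened_music_spectrum(R, K, A, n_sources, eps=1e-12):
    """MUSIC score per direction after whitening with noise covariance K.

    R: (n_mics, n_mics) observed spatial covariance at one frequency.
    K: (n_mics, n_mics) noise covariance measured in advance.
    A: (n_angles, n_mics) steering vectors, one row per candidate direction.
    """
    L = np.linalg.cholesky(K)              # K = L L^H
    L_inv = np.linalg.inv(L)
    R_w = L_inv @ R @ L_inv.conj().T       # whitened covariance
    R_w = 0.5 * (R_w + R_w.conj().T)       # enforce Hermitian symmetry
    _, V = np.linalg.eigh(R_w)             # eigenvalues in ascending order
    V_noise = V[:, : R.shape[0] - n_sources]
    # Transform the steering vectors once; this part can be precomputed
    # because it depends only on K and the array geometry.
    A_w = A @ L_inv.T                      # row k holds L^{-1} a_k
    proj = A_w @ V_noise.conj()            # entries are v_i^H (L^{-1} a_k)
    denom = np.sum(np.abs(proj) ** 2, axis=1) + eps
    numer = np.sum(np.abs(A_w) ** 2, axis=1)  # normalization by the transformed norm
    return numer / denom
```

The direction estimate is the index of the largest score, for example `np.argmax(whitened_music_spectrum(R, K, A, n_sources=1))`. In a wideband system the scores would typically be combined over frequency bins, which this sketch leaves out.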

Author(s):  
Aidan O. T. Hogg ◽  
Vincent W. Neo ◽  
Stephan Weiss ◽  
Christine Evers ◽  
Patrick A. Naylor

2018 ◽  
Vol 30 (3) ◽  
pp. 426-435 ◽  
Author(s):  
Kotaro Hoshiba ◽  
Kazuhiro Nakadai ◽  
Makoto Kumon ◽  
Hiroshi G. Okuno ◽  
...  

We have studied sound source localization using a microphone array embedded on a UAV (unmanned aerial vehicle) to detect people in need of rescue in disaster-stricken areas or other dangerous situations, and we have proposed sound source localization methods for use in outdoor environments. In these methods, noise robustness and real-time processing have a trade-off relationship, which is a problem to be solved for the practical application of the methods. Sound source localization in a disaster area requires both noise robustness and real-time processing. To this end, we propose a sound source localization method using an active frequency range filter based on the MUSIC (MUltiple Signal Classification) method. Our proposed method can successively create and apply a frequency range filter using only the four arithmetic operations, so it can ensure both noise robustness and real-time processing. Numerical simulations comparing the successful localization rate and the processing delay with those of conventional methods confirmed the usefulness of the proposed method, which achieves both noise robustness and real-time processing.


2021 ◽  
Vol 25 ◽  
pp. 233121652110161
Author(s):  
Julian Angermeier ◽  
Werner Hemmert ◽  
Stefan Zirn

Users of a cochlear implant (CI) in one ear, who are provided with a hearing aid (HA) in the contralateral ear, so-called bimodal listeners, are typically affected by a constant and relatively large interaural time delay offset due to differences in signal processing and differences in stimulation. For HA stimulation, the cochlear travelling wave delay is added to the processing delay, while for CI stimulation, the auditory nerve fibers are stimulated directly. In the case of MED-EL CI systems in combination with different HA types, the CI stimulation precedes the acoustic HA stimulation by 3 to 10 ms. A self-designed, battery-powered, portable, and programmable delay line was applied to the CI to reduce the device delay mismatch in nine bimodal listeners. We used an A-B-B-A test design and determined whether sound source localization improves when the device delay mismatch is reduced by delaying the CI stimulation by the HA processing delay (τHA). Results revealed that every subject in our group of nine bimodal listeners benefited from the approach. The root-mean-square error of sound localization improved significantly from 52.6° to 37.9°. The signed bias also improved significantly from 25.2° to 10.5°, with positive values indicating a bias toward the CI. Furthermore, two other delay values (τHA –1 ms and τHA +1 ms) were applied, and with the latter value, the signed bias was further reduced in some test subjects. We conclude that sound source localization accuracy in bimodal listeners improves instantaneously and sustainably when the device delay mismatch is reduced.
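The two outcome measures named above, root-mean-square error and signed bias, can be computed from paired target and response angles as in the sketch below. The function name and the sign convention (positive angles pointing toward the CI side) are assumptions for illustration; the abstract does not describe how the study computed them.

```python
import numpy as np

def localization_errors(target_deg, response_deg):
    """Return (rms_error, signed_bias) in degrees for paired angle lists.

    Assumes angles are measured so that positive values lie on the CI side,
    which makes a positive bias mean responses shifted toward the CI.
    """
    err = np.asarray(response_deg, dtype=float) - np.asarray(target_deg, dtype=float)
    rms_error = float(np.sqrt(np.mean(err ** 2)))  # overall error magnitude
    signed_bias = float(np.mean(err))              # systematic shift, sign kept
    return rms_error, signed_bias
```

For instance, responses that are each 10° to the positive side of their targets give an RMS error of 10° and a signed bias of +10°, whereas errors that are symmetric around the targets give a nonzero RMS error with a bias near zero.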


2015 ◽  
Vol 20 (3) ◽  
pp. 166-171 ◽  
Author(s):  
Louise H. Loiselle ◽  
Michael F. Dorman ◽  
William A. Yost ◽  
René H. Gifford

The aim of this article was to study sound source localization by cochlear implant (CI) listeners with low-frequency (LF) acoustic hearing in both the operated ear and in the contralateral ear. Eight CI listeners had symmetrical LF acoustic hearing and 4 had asymmetrical LF acoustic hearing. The effects of two variables were assessed: (i) the symmetry of the LF thresholds in the two ears and (ii) the presence/absence of bilateral acoustic amplification. Stimuli consisted of low-pass, high-pass, and wideband noise bursts presented in the frontal horizontal plane. Localization accuracy was 23° of error for the symmetrical listeners and 76° of error for the asymmetrical listeners. The presence of a unilateral CI used in conjunction with bilateral LF acoustic hearing does not impair sound source localization accuracy, but amplification for acoustic hearing can be detrimental to sound source localization accuracy.


2016 ◽  
Vol 21 (3) ◽  
pp. 127-131 ◽  
Author(s):  
Michael F. Dorman ◽  
Louise H. Loiselle ◽  
Sarah J. Cook ◽  
William A. Yost ◽  
René H. Gifford

Objective: Our primary aim was to determine whether listeners in the following patient groups achieve localization accuracy within the 95th percentile of accuracy shown by younger or older normal-hearing (NH) listeners: (1) hearing impaired with bilateral hearing aids, (2) bimodal cochlear implant (CI), (3) bilateral CI, (4) hearing preservation CI, (5) single-sided deaf CI and (6) combined bilateral CI and bilateral hearing preservation. Design: The listeners included 57 young NH listeners, 12 older NH listeners, 17 listeners fit with hearing aids, 8 bimodal CI listeners, 32 bilateral CI listeners, 8 hearing preservation CI listeners, 13 single-sided deaf CI listeners and 3 listeners with bilateral CIs and bilateral hearing preservation. Sound source localization was assessed in a sound-deadened room with 13 loudspeakers arrayed in a 180-degree arc. Results: The root mean square (rms) error for the NH listeners was 6 degrees. The 95th percentile was 11 degrees. Nine of 16 listeners with bilateral hearing aids achieved scores within the 95th percentile of normal. Only 1 of 64 CI patients achieved a score within that range. Bimodal CI listeners scored at a level near chance, as did the listeners with a single CI or a single NH ear. Listeners with (1) bilateral CIs, (2) hearing preservation CIs, (3) single-sided deaf CIs and (4) both bilateral CIs and bilateral hearing preservation, all showed rms error scores within a similar range (mean scores between 20 and 30 degrees of error). Conclusion: Modern CIs do not restore a normal level of sound source localization for CI listeners with access to sound information from two ears.


2018 ◽  
Vol 29 (03) ◽  
pp. 197-205 ◽  
Author(s):  
Michael F. Dorman ◽  
Sarah Natale ◽  
Louise Loiselle

Sentence understanding scores for patients with cochlear implants (CIs) when tested in quiet are relatively high. However, sentence understanding scores for patients with CIs plummet with the addition of noise. The aim was to assess, for patients with CIs (MED-EL), (1) the value to speech understanding of two new, noise-reducing microphone settings and (2) the effect of the microphone settings on sound source localization. A single-subject, repeated measures design was used. For tests of speech understanding, repeated measures on (1) number of CIs (one, two), (2) microphone type (omni, natural, adaptive beamformer), and (3) type of noise (restaurant, cocktail party). For sound source localization, repeated measures on type of signal (low-pass [LP], high-pass [HP], broadband noise). Ten listeners, ranging in age from 48 to 83 yr (mean = 57 yr), participated in this prospective study. Speech understanding was assessed in two noise environments using monaural and bilateral CIs fit with three microphone types. Sound source localization was assessed using three microphone types. In Experiment 1, sentence understanding scores (in terms of percent words correct) were obtained in quiet and in noise. For each patient, noise was first added to the signal to drive performance off of the ceiling in the bilateral CI-omni microphone condition. The other conditions were then administered at that signal-to-noise ratio in quasi-random order. In Experiment 2, sound source localization accuracy was assessed for three signal types using a 13-loudspeaker array over a 180° arc. The dependent measure was root-mean-square error. Both the natural and adaptive microphone settings significantly improved speech understanding in the two noise environments. The magnitude of the improvement varied between 16 and 19 percentage points for tests conducted in the restaurant environment and between 19 and 36 percentage points for tests conducted in the cocktail party environment. In the restaurant and cocktail party environments, both the natural and adaptive settings, when implemented on a single CI, allowed scores that were as good as, or better than, scores in the bilateral omni test condition. Sound source localization accuracy was unaltered by either the natural or adaptive settings for LP, HP, or wideband noise stimuli. The data support the use of the natural microphone setting as a default setting. The natural setting (1) provides better speech understanding in noise than the omni setting, (2) does not impair sound source localization, and (3) retains low-frequency sensitivity to signals from the rear. Moreover, bilateral CIs equipped with adaptive beamforming technology can engender speech understanding scores in noise that fall only a little short of scores for a single CI in quiet.

