Fast Sound Source Localization Based on SRP-PHAT Using Density Peaks Clustering

2021 ◽  
Vol 11 (1) ◽  
pp. 445
Author(s):  
De-Bing Zhuo ◽  
Hui Cao

Sound source localization has seen increasing use in recent years. Among the existing techniques, the steered response power–phase transform (SRP-PHAT) is notably robust to noise and reverberation. In real-time applications, however, SRP-PHAT relies on a grid search, and the resulting computational load makes it impossible to localize the sound source in a reasonable time. To solve this problem, the authors proposed an improved procedure called ODB-SRP-PHAT, i.e., steered response power and phase transformation with an offline database (ODB). The basic idea of ODB-SRP-PHAT is to determine the possible sound source positions using SRP-PHAT and density peaks clustering before real-time localization and to store the identified positions in an ODB. At the online positioning stage, only the power values of the positions in the ODB are calculated. In real-time monitoring, e.g., locating the speaker in a video conference, the computational load of ODB-SRP-PHAT is therefore significantly smaller than that of SRP-PHAT. Simulations and experiments in a real environment verified that ODB-SRP-PHAT achieves high localization accuracy with a small computational load while retaining the robustness to noise and reverberation. The proposed procedure showed good applicability in a real environment.
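The difference between a full grid search and evaluating only stored positions can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names, the microphone layout, the simulated frame, and the hand-picked `odb` positions are all hypothetical, and the density peaks clustering step that would populate the database offline is not shown.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def gcc_phat_spectra(signals):
    """PHAT-weighted cross-spectra for every microphone pair.

    signals has shape (n_mics, n_samples); returns {(i, j): spectrum}.
    """
    spectra_in = np.fft.rfft(signals, axis=1)
    n_mics = signals.shape[0]
    out = {}
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cross = spectra_in[i] * np.conj(spectra_in[j])
            out[(i, j)] = cross / (np.abs(cross) + 1e-12)  # keep phase only
    return out


def srp_phat(spectra, mics, candidates, fs, n_samples):
    """Steered response power at each candidate position, shape (n_candidates,)."""
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    # Distance from every candidate to every microphone: (n_candidates, n_mics)
    dists = np.linalg.norm(candidates[:, None, :] - mics[None, :, :], axis=2)
    power = np.zeros(len(candidates))
    for (i, j), spec in spectra.items():
        tdoa = (dists[:, i] - dists[:, j]) / C
        steer = np.exp(2j * np.pi * freqs[None, :] * tdoa[:, None])
        power += np.real(steer @ spec)
    return power


def simulate_frame(source, mics, fs, n_samples, seed=0):
    """Hypothetical test helper: noise-free frame with circular fractional delays."""
    rng = np.random.default_rng(seed)
    spectrum = np.fft.rfft(rng.standard_normal(n_samples))
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    delays = np.linalg.norm(mics - source, axis=1) / C
    shifted = spectrum[None, :] * np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(shifted, n=n_samples, axis=1)


fs, n_samples = 16000, 1024
mics = np.array([[0.0, 0.0, 1.0], [4.0, 0.0, 1.5], [4.0, 3.0, 1.0], [0.0, 3.0, 1.5]])
source = np.array([1.5, 2.0, 1.2])

# Full search grid on a horizontal plane (35 points in this toy setup).
xs = np.arange(0.5, 3.75, 0.5)
ys = np.arange(0.5, 2.75, 0.5)
grid = np.array([[x, y, 1.2] for x in xs for y in ys])

# Stand-in for the offline database: a few positions assumed to have been
# identified beforehand (hand-picked here, not produced by clustering).
odb = np.array([[0.5, 2.0, 1.2], [1.5, 2.0, 1.2], [3.0, 1.5, 1.2]])

spectra = gcc_phat_spectra(simulate_frame(source, mics, fs, n_samples))
full_power = srp_phat(spectra, mics, grid, fs, n_samples)
odb_power = srp_phat(spectra, mics, odb, fs, n_samples)

print("full grid estimate:", grid[np.argmax(full_power)])
print("ODB estimate:      ", odb[np.argmax(odb_power)])
```

The cross-spectra are computed once per frame in both cases; the saving comes from `srp_phat` steering to 3 stored positions instead of 35 grid points. The result is only as good as the stored list: a source at a position missing from `odb` cannot be found by the restricted search.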

2017 ◽  
Vol 29 (1) ◽  
pp. 26-36 ◽  
Author(s):  
Ryu Takeda ◽  
Kazunori Komatani

[Figure: Sound source localization and problem]

We focus on the problem of localizing soft or weak voices recorded by small humanoid robots, such as NAO. Sound source localization (SSL) for such robots requires fast processing and noise robustness owing to the restricted resources and the internal noise close to the microphones. Multiple signal classification using generalized eigenvalue decomposition (GEVD-MUSIC) is a promising method for SSL. It achieves noise robustness by whitening the robot's internal noise using prior noise information. However, whitening increases the computational cost and creates a direction-dependent bias in the localization score, which degrades the localization accuracy. We have thus developed a new implementation of GEVD-MUSIC based on steering vector transformation (TSV-MUSIC). Applying a transformation equivalent to whitening to the steering vectors in advance reduces the real-time computational cost of TSV-MUSIC. Moreover, normalizing the transformed vectors cancels the direction-dependent bias and improves the localization accuracy. Experiments using simulated data showed that TSV-MUSIC had the highest accuracy of the methods tested. An experiment using real recorded data showed that TSV-MUSIC outperformed GEVD-MUSIC and other MUSIC methods in terms of localization by about 4 points under low signal-to-noise-ratio conditions.
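The whitening and normalization ideas can be illustrated with a narrowband MUSIC sketch in NumPy. This is one reading of the abstract, not the authors' TSV-MUSIC: the uniform linear array, the single frequency bin, the toy noise covariance, and all function names are assumptions. It shows only that steering vectors can be whitened and normalized ahead of time; it still whitens the observed covariance online, so it does not reproduce the cost reduction, whose details the abstract does not give.

```python
import numpy as np


def ula_steering(angles_deg, n_mics, spacing, freq, c=343.0):
    """Far-field steering vectors of a uniform linear array, shape (n_angles, n_mics)."""
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    delays = np.arange(n_mics)[None, :] * spacing * np.sin(angles)[:, None] / c
    return np.exp(-2j * np.pi * freq * delays)


def transform_steering(steering, noise_cov):
    """Offline step: whiten each steering vector with the noise covariance, then normalize."""
    chol = np.linalg.cholesky(noise_cov)            # noise_cov = chol @ chol^H
    whitened = np.linalg.solve(chol, steering.T).T  # rows are chol^{-1} a(theta)
    whitened /= np.linalg.norm(whitened, axis=1, keepdims=True)
    return whitened, chol


def music_spectrum(obs_cov, chol, steering_t, n_sources=1):
    """MUSIC pseudo-spectrum in the whitened domain, shape (n_angles,)."""
    chol_inv = np.linalg.inv(chol)
    whitened_cov = chol_inv @ obs_cov @ chol_inv.conj().T
    _, eigvecs = np.linalg.eigh(whitened_cov)       # eigenvalues in ascending order
    noise_subspace = eigvecs[:, : obs_cov.shape[0] - n_sources]
    proj = steering_t.conj() @ noise_subspace       # a_w^H u for each noise vector u
    return 1.0 / (np.sum(np.abs(proj) ** 2, axis=1) + 1e-12)


# Toy setup: 6 microphones, one 2 kHz bin, spatially correlated noise.
angles = np.arange(-90, 91, 5)
steering = ula_steering(angles, n_mics=6, spacing=0.04, freq=2000.0)
idx = np.arange(6)
noise_cov = 0.9 ** np.abs(idx[:, None] - idx[None, :]) + 0.1 * np.eye(6)

# Computed once, before any audio arrives.
steering_t, chol = transform_steering(steering, noise_cov)

# Exact covariance of one source at 20 degrees plus the known noise.
a_true = steering[angles == 20][0]
obs_cov = 4.0 * np.outer(a_true, a_true.conj()) + noise_cov

spectrum = music_spectrum(obs_cov, chol, steering_t)
estimate = angles[np.argmax(spectrum)]
print("estimated direction (degrees):", estimate)
```

Because every transformed steering vector has unit norm, the denominator lies between 0 and 1 for all directions, so no direction gets a larger score merely because whitening stretched its steering vector.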


2013 ◽  
Vol 74 (12) ◽  
pp. 1367-1373 ◽  
Author(s):  
Yilu Zhao ◽  
Xiong Chen ◽  
Bin Wang

Author(s):  
Seunghun Jin ◽  
Dongkyun Kim ◽  
Hyung Soon Kim ◽  
Chang Hoon Lee ◽  
Jong Suk Choi ◽  
...  
