sound source localization

2022 ◽  
Vol 12 (1) ◽  
pp. 83
Author(s):  
Sohaib Siddique Butt ◽  
Mahnoor Fatima ◽  
Ali Asghar ◽  
Wasif Muhammad

Sound source localization (SSL), together with shifting gaze toward the sound source, is an integral part of the perception system of a socially interactive humanoid robot. In noisy and reverberant environments, it is non-trivial to estimate the location of a sound source and accurately shift gaze in its direction. In such challenging acoustic environments, previous SSL algorithms fall short in estimating the distance to audio sources and in accurately detecting, interpreting, and differentiating the actual sound from comparable sound sources. In this article, a learning-based model is presented to achieve noise-robust and reverberation-resistant sound source localization in real-world scenarios. The proposed system uses a multi-layered Generalized Cross-Correlation with Phase Transform (GCC-PHAT) signal processing technique as a baseline for a Generalized Cross-Correlation Convolutional Neural Network (GCC-CNN) model. The proposed model is integrated with an efficient rotation algorithm to predict the direction of the sound source and orient toward it. The performance of the proposed method is compared with state-of-the-art deep network-based sound source localization methods. The proposed method outperforms the existing neural network-based approaches, achieving the highest accuracy of 96.21% for an active binaural auditory perceptual system.
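For readers unfamiliar with GCC-PHAT, the following is a minimal NumPy sketch of the basic single-layer technique for estimating the time difference of arrival between two microphone signals. It is not the multi-layered variant or the GCC-CNN model from the article; the function name `gcc_phat`, the sampling rate, and the synthetic test signal are assumptions made for illustration.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref` with GCC-PHAT."""
    n = len(sig) + len(ref)  # zero-pad so the correlation is linear, not circular
    cross = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    cross /= np.abs(cross) + 1e-12  # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)

    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)

    # Negative lags wrap around to the end of the circular correlation buffer.
    lags = np.arange(-max_shift, max_shift + 1)
    return lags[np.argmax(cc[lags])] / fs

# Hypothetical example: white noise delayed by 5 samples at 16 kHz.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(2048)
sig = np.concatenate((np.zeros(5), ref[:-5]))
tau = gcc_phat(sig, ref, fs)
```

A positive `tau` means `sig` lags `ref`. With a known microphone spacing, the delay can then be converted to a direction-of-arrival estimate.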


Author(s):  
Wageesha Nilmini Manamperi ◽  
Thushara Dheemantha Abhayapala ◽  
Jihui Amiee Zhang ◽  
Prasanga Samarasinghe

Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8031
Author(s):  
Tan-Hsu Tan ◽  
Yu-Tang Lin ◽  
Yang-Lang Chang ◽  
Mohammad Alkhaleefah

In this research, a novel sound source localization model is introduced that integrates a convolutional neural network with a regression model (CNN-R) to estimate the sound source angle and distance based on the acoustic characteristics of the interaural phase difference (IPD). The IPD features of the sound signal are first extracted in the time-frequency domain by the short-time Fourier transform (STFT). Then, the IPD feature map is fed to the CNN-R model as an image for sound source localization. The Pyroomacoustics platform and the multichannel impulse response database (MIRD) are used to generate both simulated and real room impulse response (RIR) datasets. The experimental results show that, in the simulation scenario at SNR = 30 dB and RT60 = 0.16 s, the proposed CNN-R achieves average accuracies of 98.96% and 98.31% for angle and distance estimation, respectively. Moreover, in the real environment, the average accuracies of the angle and distance estimations are 99.85% and 99.38%, respectively, at SNR = 30 dB and RT60 = 0.16 s. The performance obtained in both scenarios is superior to that of existing models, indicating the potential of the proposed CNN-R model for real-life applications.
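The IPD extraction step can be illustrated with a short NumPy sketch. It computes an STFT for each of two channels and takes the phase of their cross-spectrum, which yields a frequency-by-time map of the kind that could be passed to a CNN as an image. The window, FFT size, hop length, and function names are assumptions, not the settings used in the paper.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT; returns an array of shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

def ipd_map(left, right, n_fft=512, hop=256):
    """Interaural phase difference in radians, in (-pi, pi], per time-frequency bin."""
    cross = stft(left, n_fft, hop) * np.conj(stft(right, n_fft, hop))
    return np.angle(cross)

# Hypothetical example: a 500 Hz tone whose right channel lags by pi/4 radians.
fs = 16000
t = np.arange(fs) / fs
left = np.sin(2 * np.pi * 500 * t)
right = np.sin(2 * np.pi * 500 * t - np.pi / 4)
ipd = ipd_map(left, right)
tone_bin = 500 * 512 // fs  # frequency bin index of the 500 Hz tone (bin 16)
```

In this example only the row `ipd[tone_bin]` carries meaningful phase, because the other bins contain almost no signal energy. Real recordings would typically be masked or weighted by magnitude for the same reason.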


Author(s):  
Rongjiang Tang ◽  
Yingxiang Zuo ◽  
Weiya Liu ◽  
Liguo Tang ◽  
Weiguang Zheng ◽  
...  

In this paper, we propose a compressed sensing (CS) sound source localization algorithm based on signal energy to solve the problem of choosing the iteration stopping condition for the orthogonal matching pursuit (OMP) reconstruction algorithm in compressed sensing. The OMP algorithm needs to stop iterating according to either the number of sound sources or the change in the residual. Generally, the number of sound sources cannot be known in advance, and a residual-based criterion often leads to unnecessary calculation. Because the sound sources are sparsely distributed in space, and their energy is concentrated and higher than that of the environmental noise, the signal energy at different positions in the reconstructed signal is compared in each iteration to determine whether a new sound source was added in that iteration. At the same time, block sparsity is introduced by using multiple frequency points, which avoids the problem of different frequency points in the same frame requiring different numbers of iterations because of the uneven energy distribution in the signal frequency domain. Simulation and experimental results show that the proposed algorithm retains the advantages of the OMP sound source localization algorithm and terminates the iteration reliably. Under the premise of not knowing the number of sound sources, the maximum error between the number of iterations and the set number of sound sources is 0.31.
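The idea of stopping OMP by comparing reconstructed energies, rather than by a known source count or a residual threshold, can be sketched as follows. The abstract does not give the exact criterion, so the rule used here is an assumption: a candidate atom is rejected, and iteration stops, when its reconstructed energy falls below a fixed fraction (`energy_ratio`) of the strongest recovered source. The random dictionary, the function name, and all parameter values are likewise hypothetical, and block sparsity across frequencies is not modeled.

```python
import numpy as np

def omp_energy_stop(A, y, energy_ratio=0.1, max_iter=None):
    """OMP that stops when a newly selected atom carries too little energy.

    The energy-ratio rule is an illustrative assumption, not the paper's criterion.
    """
    m, n = A.shape
    max_iter = max_iter or m
    residual = y.copy()
    support, coeffs = [], np.zeros(0)
    for _ in range(max_iter):
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0  # never reselect an atom
        trial = support + [int(np.argmax(corr))]
        sol, *_ = np.linalg.lstsq(A[:, trial], y, rcond=None)
        energy = np.abs(sol) ** 2
        # Reject the candidate if it is weak relative to the strongest source.
        if support and energy[-1] < energy_ratio * energy.max():
            break
        support, coeffs = trial, sol
        residual = y - A[:, support] @ coeffs
    x = np.zeros(n, dtype=coeffs.dtype)
    x[support] = coeffs
    return x, support

# Hypothetical example: two sources on a 256-point grid, 128 measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((128, 256))
A /= np.linalg.norm(A, axis=0)  # unit-norm dictionary columns
x_true = np.zeros(256)
x_true[[20, 150]] = [1.0, 0.8]
y = A @ x_true + 0.01 * rng.standard_normal(128)
x_hat, support = omp_energy_stop(A, y)
```

In this sketch the number of sources is never passed in. The loop ends on its own once the next candidate looks like noise, which is the behavior the abstract describes.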

