Effective DOA Estimation Method for Sound Source Localization Using a Circular Microphone Array

2021 ◽  
pp. 497-505
Author(s):  
Douaer Belgacem
Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 867
Author(s):  
Ali Dehghan Firoozabadi ◽  
Pablo Irarrazaval ◽  
Pablo Adasme ◽  
David Zabala-Blanco ◽  
Pablo Palacios-Játiva ◽  
...  

Sound source localization is an important application area in speech signal processing. The main challenge arises in the simultaneous localization of multiple sound sources from overlapped speech signals when the number of speakers is unknown. A method is therefore required that estimates the number of speakers together with their locations, with high accuracy and in real-time conditions. Spatial aliasing is an undesirable effect of using microphone arrays that decreases the accuracy of localization algorithms in noisy and reverberant conditions. In this article, a cuboids nested microphone array (CuNMA) is first proposed for eliminating spatial aliasing. The CuNMA is designed to receive the speech signals of all speakers from different directions. In addition, the inter-microphone distance is adjusted so that each subarray contains enough microphone pairs, which provides appropriate information for 3D sound source localization. Subsequently, a speech spectral estimation method is used to evaluate the speech spectrum components. The suitable spectrum components are selected and the undesirable components are discarded in the localization process. Because the speech information differs across frequency bands, the adaptive wavelet transform is used for subband processing in the proposed algorithm. The generalized eigenvalue decomposition (GEVD) method is applied in subbands to all nested microphone pairs, and the probability density function (PDF) is calculated for estimating the direction of arrival (DOA) in different subbands and consecutive frames. The proper PDFs are selected by thresholding the standard deviation (SD) of the estimated DOAs, and the rest are eliminated. This process is repeated over time frames to extract the best DOAs. Finally, K-means clustering and the silhouette criterion are used to classify the DOAs in order to estimate the number of clusters (speakers) and the related DOAs. All DOAs in each cluster are intersected to estimate the 3D position of each speaker: the point closest to all DOA planes is selected as the speaker position. The proposed method is compared with the hierarchical grid (HiGRID), perpendicular cross-spectra fusion (PCSF), time-frequency wise spatial spectrum clustering (TF-wise SSC), and spectral source model-deep neural network (SSM-DNN) algorithms in terms of accuracy and computational complexity on real and simulated data in noisy and reverberant conditions. The results show the superiority of the proposed method over these previous works.
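
The final clustering step of this abstract (K-means plus the silhouette criterion to pick the number of speakers) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the DOAs are scalar azimuth angles in degrees, ignores angle wrap-around at 360°, uses a deterministic quantile-based initialisation, and all function names and sample values are hypothetical.

```python
def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalars; returns (centers, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation (an assumption): spread centres over the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def silhouette_score(values, labels, k):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = [[v for v, lab in zip(values, labels) if lab == c] for c in range(k)]
    total = 0.0
    for v, lab in zip(values, labels):
        own = clusters[lab]
        others = [o for c, o in enumerate(clusters) if c != lab and o]
        if len(own) < 2 or not others:
            continue
        a = sum(abs(v - u) for u in own) / (len(own) - 1)       # mean intra-cluster distance
        b = min(sum(abs(v - u) for u in o) / len(o) for o in others)  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(values)


def estimate_speaker_count(doas, k_max=5):
    """Try k = 2..k_max and keep the k with the highest silhouette score."""
    best = None
    for k in range(2, k_max + 1):
        centers, labels = kmeans_1d(doas, k)
        score = silhouette_score(doas, labels, k)
        if best is None or score > best[0]:
            best = (score, k, sorted(centers))
    return best[1], best[2]


# Hypothetical DOA estimates (degrees) collected over several frames and subbands.
doas = [28, 30, 31, 33, 88, 90, 91, 93, 148, 150, 152, 153]
count, directions = estimate_speaker_count(doas)
print(count, directions)
```

Because the silhouette coefficient is undefined for a single cluster, this sketch cannot return one speaker; handling that case would need a separate test.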


2013 ◽  
Vol 397-400 ◽  
pp. 2209-2214
Author(s):  
Chuan Yi Zhang ◽  
Chang Wei Mi ◽  
Pei Yang Yao

In time delay estimation, the basic cross-correlation (CC) often fails to produce an obvious peak. To solve this problem, this paper presents an improved time delay estimation method based on the generalized cross-correlation (GCC) and combines it with the microphone array structure to achieve sound source localization. The simulation results show that this method can locate the sound source accurately in the presence of noise and reverberation, with a distance positioning error of less than 10 cm and a direction angle error below 3°.
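
To make the idea of a generalized cross-correlation concrete, here is a small sketch of one common variant, GCC with PHAT weighting. The abstract does not say which weighting the authors use, so PHAT is an assumption, as are the function names and the test signal. A naive DFT from the standard library keeps the example self-contained; it is only practical for short signals.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig, ref, max_lag):
    """Return the lag in samples by which `sig` trails `ref` (negative if it leads)."""
    n = len(sig) + len(ref)  # zero-pad so circular correlation acts like linear correlation
    S = dft(list(sig) + [0.0] * (n - len(sig)))
    R = dft(list(ref) + [0.0] * (n - len(ref)))
    cross = [s * r.conjugate() for s, r in zip(S, R)]
    # PHAT weighting: discard the magnitude and keep only the phase of each bin,
    # which sharpens the correlation peak compared with the basic CC.
    whitened = [c / abs(c) if abs(c) > 1e-12 else 0j for c in cross]
    cc = [v.real for v in dft(whitened, inverse=True)]
    # Negative lags are stored at the end of the circular correlation.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: cc[lag % n])


# Hypothetical test signal: broadband noise, and a copy delayed by 3 samples.
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(29)]
ref = noise + [0.0, 0.0, 0.0]
sig = [0.0, 0.0, 0.0] + noise
delay_samples = gcc_phat_delay(sig, ref, max_lag=8)
print(delay_samples)
```

Dividing the lag by the sampling rate gives the delay in seconds; for a microphone pair, that delay combined with the pair spacing and the speed of sound constrains the direction of the source.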


Sensors ◽  
2019 ◽  
Vol 19 (19) ◽  
pp. 4326 ◽  
Author(s):  
Haitao Liu ◽  
Thia Kirubarajan ◽  
Qian Xiao

Various microphone array geometries (e.g., linear, circular, square, cubic, and spherical) have been used to improve the positioning accuracy of sound source localization. However, whether these array structures are optimal for specific localization scenarios is still a subject of debate. This paper addresses a microphone array optimization method for sound source localization based on TDOA (time difference of arrival). The geometric structure of the microphone array is established in parametric form. A triangulation method with TDOA is used to build the spatial sound source location model, which consists of a group of nonlinear multivariate equations. Through a suitable transformation, the nonlinear multivariate equations can be converted to a group of linear equations that can be approximately solved by the weighted least squares method. Then, an optimization model based on the particle swarm optimization (PSO) algorithm is constructed, in combination with the spatial sound source localization model, to optimize the geometric parameters of the microphone array under different localization scenarios. In the optimization model, a fitness evaluation function is established that jointly considers the positioning accuracy and robustness of the microphone array. To verify the array optimization method, two specific localization scenarios and two array optimization strategies for each scenario were constructed. The optimal array structure parameters were obtained through iterative numerical simulation. The localization performance of the optimal array structures obtained by the proposed method was compared with the optimal structures proposed in the literature as well as with random array structures. The simulation results show that the optimized array structure gave better positioning accuracy and robustness under both localization scenarios. The proposed optimization model can solve the problem of TDOA-based array geometry design and allows microphone array structures to be customized for different specific localization scenarios.
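
The conversion from nonlinear TDOA equations to a linear system can be sketched as follows. The abstract does not give the authors' transformation, so this uses a standard textbook linearization as an assumption: with reference microphone m_0, range differences d_i = c·τ_i and unknown range r_0 from the source s to m_0, each microphone yields 2(m_i − m_0)·s + 2·d_i·r_0 = |m_i|² − |m_0|² − d_i², which is linear in (s, r_0). The function names, array layout, and weights are hypothetical.

```python
import math


def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def locate_tdoa(mics, tdoas, weights=None, speed=343.0):
    """Weighted least-squares source position from TDOAs.

    mics[0] is the reference; tdoas[i-1] is the arrival time at mics[i]
    minus the arrival time at mics[0]. Needs at least 5 microphones in 3D.
    """
    m0 = mics[0]
    rows, rhs = [], []
    for m, tau in zip(mics[1:], tdoas):
        d = speed * tau  # range difference in metres
        rows.append([2 * (m[j] - m0[j]) for j in range(3)] + [2 * d])
        rhs.append(sum(c * c for c in m) - sum(c * c for c in m0) - d * d)
    w = weights or [1.0] * len(rows)
    # Normal equations (A^T W A) x = A^T W b for the unknowns x = (sx, sy, sz, r0).
    ata = [[sum(wk * r[i] * r[j] for wk, r in zip(w, rows)) for j in range(4)]
           for i in range(4)]
    atb = [sum(wk * r[i] * bk for wk, r, bk in zip(w, rows, rhs)) for i in range(4)]
    return solve_linear(ata, atb)[:3]


# Hypothetical six-microphone array (metres) and a noise-free source position.
mics = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
source = (2.0, 1.5, 0.8)
r0 = math.dist(source, mics[0])
tdoas = [(math.dist(source, m) - r0) / 343.0 for m in mics[1:]]
estimate = locate_tdoa(mics, tdoas)
print(estimate)
```

In an array optimization loop such as the PSO described above, a solver like this would be called inside the fitness function for each candidate geometry; with noisy TDOAs the weights would typically reflect the reliability of each microphone pair.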

