Deep learning assisted sound source localization using two orthogonal first-order differential microphone arrays

2021 ◽  
Vol 149 (2) ◽  
pp. 1069-1084
Author(s):  
Nian Liu ◽  
Huawei Chen ◽  
Kunkun Songgong ◽  
Yanwen Li
Author(s):  
Daniel Gabriel ◽  
Ryosuke Kojima ◽  
Kotaro Hoshiba ◽  
Katsutoshi Itoyama ◽  
Kenji Nishida ◽  
...  

2011 ◽  
Vol 368-373 ◽  
pp. 624-628
Author(s):  
Qing Sheng Wang ◽  
Xin Jiang ◽  
Xiao Hang Liu

Sound source localization is of great value in many engineering applications. In this paper, a new instrument is designed to localize a sound source with a relatively compact structure. This bionic structure is designed to mimic the localization function of the ears of the parasitoid fly Ormia ochracea, and it consists of three elastic diaphragms, three bars connected to the diaphragms, and other mechanical components. Analysis of the structure's dynamic behavior shows that the incident angles of the sound have a specific relationship to the responses of the instrument, so the incident angles can be estimated by detecting the vibrations of the three elastic diaphragms. Compared with traditional microphone arrays, this instrument has the advantages of compactness and a higher level of integration.


Author(s):  
Alif Bin Abdul Qayyum ◽  
K. M. Naimul Hassan ◽  
Adrita Anika ◽  
Md. Farhan Shadiq ◽  
Md Mushfiqur Rahman ◽  
...  

Drone-embedded sound source localization (SSL) has promising applications in search and rescue scenarios made challenging by poor lighting conditions or occlusions. However, the problem is complicated by severe drone ego-noise, which can result in negative signal-to-noise ratios in the recorded microphone signals. In this paper, we present our work on drone-embedded SSL using recordings from an 8-channel cube-shaped microphone array embedded in an unmanned aerial vehicle (UAV). As baselines, we use angular spectrum-based time difference of arrival (TDOA) estimation methods such as generalized cross-correlation with phase transform (GCC-PHAT) and minimum variance distortionless response (MVDR), which are state-of-the-art techniques for SSL. Although we improve the baseline methods by reducing ego-noise with the speed correlated harmonics cancellation (SCHC) technique, our main focus is on using deep learning to solve this challenging problem. We propose an end-to-end deep learning model, called DOANet, for SSL. DOANet is based on a one-dimensional dilated convolutional neural network that computes the azimuth and elevation angles of the target sound source from the raw audio signal. Its advantage is that it does not require any hand-crafted audio features or ego-noise reduction for direction of arrival (DOA) estimation. We then evaluate SSL performance using the proposed and baseline methods and find that DOANet shows promising results compared to the angular spectrum methods both with and without SCHC. To compare the different methods, we also use the area under the curve (AUC) of cumulative histogram plots of angular deviations, a well-known parameter that, to our knowledge, has not previously been used as a performance indicator for this sort of problem.
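
To make the GCC-PHAT baseline named in the abstract above more concrete, the following is a minimal sketch of TDOA estimation for a single microphone pair. It is not the authors' implementation: the function name `gcc_phat`, its parameters, and the synthetic white-noise test signal are assumptions made for illustration, and the abstract's angular-spectrum step (combining pairwise delays into azimuth and elevation) is not shown.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds (illustrative sketch)."""
    n = len(sig) + len(ref)                 # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)           # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:                 # optionally restrict to physically possible lags
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index `max_shift` corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Hypothetical example: white noise delayed by 12 samples at 16 kHz.
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
delay = 12
sig = np.concatenate((np.zeros(delay), ref[:-delay]))
tau = gcc_phat(sig, ref, fs)
```

The PHAT weighting whitens the cross-spectrum so that the correlation peak depends only on phase. With the strongly negative signal-to-noise ratios described in the abstract, the peak of such a correlation may be dominated by ego-noise, which is the difficulty that motivates both SCHC and the learned approach.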


2017 ◽  
Vol 2017 ◽  
pp. 1-11
Author(s):  
Wei Ke ◽  
Xiunan Zhang ◽  
Yanan Yuan ◽  
Jianhua Shao

To enhance the accuracy of sound source localization in noisy and reverberant environments, this paper proposes an adaptive sound source localization method based on distributed microphone arrays. Since sound sources occupy only a few points in the discrete spatial domain, our method exploits this inherent sparsity to convert the localization problem into a sparse recovery problem based on compressive sensing (CS) theory. In this method, a two-step discrete cosine transform (DCT) based feature extraction approach is used to capture both short-time and long-time properties of acoustic signals and to reduce the dimensions of the sparse model. In addition, an online dictionary learning (DL) method is used to adapt the dictionary to changes in the audio signals, so that the sparse solution can better represent the location estimates. Moreover, we propose an improved block-sparse reconstruction algorithm using approximate l0 norm minimization to enhance reconstruction performance for sparse signals in low signal-to-noise ratio (SNR) conditions. The effectiveness of the proposed scheme is demonstrated by simulation and experimental results, which show substantial improvement in localization performance in noisy and reverberant conditions.
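
The idea of casting localization as sparse recovery over a discrete spatial grid can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: plain orthogonal matching pursuit (OMP) is swapped in for the block-sparse approximate l0 algorithm, a random Gaussian matrix stands in for the learned acoustic dictionary, and the grid size, measurement count, and names `omp`, `D`, `x_true` are hypothetical.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: find a k-sparse x with y ~= D @ x (illustrative sketch)."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the grid point whose dictionary atom best matches the residual.
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

# Hypothetical setup: 120 candidate grid points, 60 measurements, 2 active sources.
rng = np.random.default_rng(0)
D = rng.standard_normal((60, 120))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms, one per grid point
x_true = np.zeros(120)
x_true[[17, 63]] = [1.0, 0.7]             # sources at grid indices 17 and 63
y = D @ x_true                            # noiseless measurement vector
x_hat = omp(D, y, k=2)
source_indices = np.flatnonzero(np.abs(x_hat) > 1e-6)
```

Each nonzero entry of the recovered vector marks a grid point estimated to contain a source. In the noisy, reverberant setting the abstract targets, a fixed dictionary and a greedy solver of this kind would be expected to degrade, which is the gap the online dictionary learning and the improved reconstruction algorithm are meant to address.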

