Deep Learning for Binaural Sound Source Localization with Low Signal-to-noise Ratio

A multiple sound source localization and counting method based on an angular spectrum is proposed in this paper. Local signal-to-noise ratio tracking, onset detection, and a coherence test are introduced to filter the generalized cross-correlation angular spectrum in the time-frequency domain for multiple sound source localization and counting in noisy and reverberant environments. Then, dual-width matching pursuit is introduced to replace peak search as the method of localization and counting. A comprehensive comparison of two statistical indicators, mean precision and mean absolute estimated error, indicates that the proposed localization and counting algorithm using both the filtered angular spectrum and dual-width matching pursuit method is more robust and accurate than the classic counterpart, especially in environments with low signal-to-noise ratio, strong reverberation, and abundant sound sources.
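The angular spectrum described above is built on generalized cross-correlation (GCC) between microphone pairs. As a rough illustration only, the sketch below estimates a single time delay with the common PHAT weighting. It does not reproduce the paper's local SNR tracking, onset detection, coherence test, or dual-width matching pursuit, and the names `gcc_phat`, `fs`, and `max_tau` are assumptions made for this example.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x, in seconds, using GCC-PHAT.

    Illustrative sketch for one microphone pair, not the paper's method.
    """
    n = len(x) + len(y)                 # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)       # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically plausible lags (at least 1 sample).
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

# Usage: a white-noise signal delayed by 5 samples at a 16 kHz sampling rate.
rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
y = np.concatenate((np.zeros(5), x[:-5]))
tau = gcc_phat(x, y, fs=16000)
```

For a pair with spacing `d` and sound speed `c`, a delay `tau` maps to an arrival angle through `arcsin(c * tau / d)` under a far-field assumption. Evaluating the correlation over a grid of candidate angles, rather than taking a single peak, gives an angular spectrum.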

Sensitivity Analysis of MPEG-4 Video Based on Frame Structure in DVB-T Transmission (Analisis Sensitivitas Video MPEG-4 Berdasarkan Struktur Frame pada Transmisi DVB-T)

Jurnal Ilmiah Informatika Komputer ◽

10.35760/ik.2020.v25i2.2691 ◽

2020 ◽

Vol 25 (2) ◽

pp. 86-97

Author(s):

Sandy Suryo Prayogo ◽

Tubagus Maulana Kusuma

Keyword(s):

Deep Learning ◽

Bit Error Rate ◽

Error Rate ◽

Signal To Noise Ratio ◽

Similarity Index ◽

Structural Similarity ◽

Signal To Noise ◽

Structural Similarity Index ◽

Noise Ratio

DVB is the most widely used digital television transmission standard today. The most important element of a transmission process is the picture quality of the video received after transmission. Many factors can affect picture quality, one of which is the frame structure of the video. This paper tests the sensitivity of MPEG-4 video to frame structure in DVB-T transmission. The tests are carried out by simulation in MATLAB and Simulink, and ffmpeg is used to prepare the format and settings of the video to be simulated. The video variables that are varied are the bitrate and the group of pictures (GOP), while the DVB-T transmission variable that is varied is the signal-to-noise ratio (SNR) on the AWGN channel between the transmitter (Tx) and the receiver (Rx). The experimental results are the average picture quality of the video, measured with the structural similarity index (SSIM). The bit error rate (BER) of the DVB-T bitstream is also measured. The experiments show how sensitive the video's bitrate and GOP are in DVB-T transmission, with the conclusion that the higher the bitrate, the worse the picture quality, and the smaller the GOP, the better the quality. The research is expected to be extended with deep learning to find the appropriate frame structure for particular conditions in digital television transmission.
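The relationship between channel SNR and bit error rate that this abstract measures can be illustrated with a toy experiment. The sketch below sends uncoded BPSK symbols through an AWGN channel at a chosen SNR and counts bit errors. It is an assumption-laden stand-in, not the authors' MATLAB/Simulink DVB-T chain: there is no OFDM, channel coding, or video, and the names `add_awgn` and `bpsk_ber` are hypothetical.

```python
import numpy as np

def add_awgn(signal, snr_db, rng):
    """Add real white Gaussian noise so that the result has the given SNR in dB."""
    signal_power = np.mean(np.abs(signal) ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def bpsk_ber(snr_db, n_bits=200_000, seed=0):
    """Measure the bit error rate of uncoded BPSK over AWGN (toy model)."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=n_bits)
    symbols = 2.0 * bits - 1.0              # map 0 -> -1, 1 -> +1
    received = add_awgn(symbols, snr_db, rng)
    decided = (received > 0).astype(int)    # hard decision at zero
    return float(np.mean(decided != bits))

# Usage: the error rate falls as the channel SNR rises.
for snr_db in (0, 3, 6, 9):
    print(f"SNR {snr_db} dB -> BER {bpsk_ber(snr_db):.4f}")
```

In a full study such as the one summarized above, the same SNR sweep would be applied to the DVB-T bitstream, and the decoded video would then be scored against the original with SSIM.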

Frame-Level Signal-to-Noise Ratio Estimation Using Deep Learning

10.21437/interspeech.2020-2475 ◽

2020 ◽

Author(s):

Hao Li ◽

DeLiang Wang ◽

Xueliang Zhang ◽

Guanglai Gao

Keyword(s):

Deep Learning ◽

Signal To Noise Ratio ◽

Ratio Estimation ◽

Signal To Noise ◽

Noise Ratio

Towards Robust Multiple Blind Source Localization Using Source Separation and Beamforming

Sensors ◽

10.3390/s21020532 ◽

2021 ◽

Vol 21 (2) ◽

pp. 532

Author(s):

Henglin Pu ◽

Chao Cai ◽

Menglan Hu ◽

Tianping Deng ◽

Rong Zheng ◽

...

Keyword(s):

Source Localization ◽

Sound Source ◽

Indoor Localization ◽

Weighting Function ◽

Signal To Noise Ratio ◽

Source Separation ◽

Sound Source Localization ◽

Angle Of Arrival ◽

Sound Sources ◽

Localization Algorithms

Multiple blind sound source localization is a key technology for a myriad of applications such as robotic navigation and indoor localization. However, existing solutions can locate only a few sound sources simultaneously because of the limit imposed by the number of microphones in an array. To this end, this paper proposes a novel multiple blind sound source localization algorithm using Source seParation and BeamForming (SPBF). The algorithm overcomes the limitations of existing solutions and can locate more blind sources than there are microphones in the array. Specifically, the authors propose a novel microphone layout that enables salient multiple source separation while still preserving arrival time information. Source localization is then performed via beamforming on each demixed source. This design minimizes mutual interference between sound sources, thereby enabling finer angle of arrival (AoA) estimation. To further enhance localization performance, a new spectral weighting function is designed that improves the signal-to-noise ratio, allowing a relatively narrow beam and thus finer AoA estimation. Simulation experiments under typical indoor conditions demonstrate a maximum error of only 4° even with up to 14 sources.

SSLIDE: Sound Source Localization for Indoors Based on Deep Learning

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9415109 ◽

2021 ◽

Author(s):

Yifan Wu ◽

Roshan Ayyalasomayajula ◽

Michael J. Bianco ◽

Dinesh Bharadia ◽

Peter Gerstoft

Keyword(s):

Deep Learning ◽

Source Localization ◽

Sound Source ◽

Sound Source Localization

Deep Learning Based Prediction of Signal-to-Noise Ratio (SNR) for LTE and 5G Systems

2020 8th International Conference on Wireless Networks and Mobile Communications (WINCOM) ◽

10.1109/wincom50532.2020.9272470 ◽

2020 ◽

Author(s):

Thinh Ngo ◽

Brian Kelley ◽

Paul Rad

Keyword(s):

Deep Learning ◽

Signal To Noise Ratio ◽

Signal To Noise ◽

5G Systems ◽

Noise Ratio

Source localization of interictal epileptiform discharges: Comparison of three different techniques to improve signal to noise ratio

Clinical Neurophysiology ◽

10.1016/j.clinph.2005.11.014 ◽

2006 ◽

Vol 117 (3) ◽

pp. 562-571 ◽

Cited By ~ 16

Author(s):

Dominik Zumsteg ◽

Alon Friedman ◽

Heinz Gregor Wieser ◽

Richard A. Wennberg

Keyword(s):

Source Localization ◽

Signal To Noise Ratio ◽

Signal To Noise ◽

Epileptiform Discharges ◽

Interictal Epileptiform Discharges ◽

Noise Ratio

Enhanced Robot Speech Recognition Using Biomimetic Binaural Sound Source Localization

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2018.2830119 ◽

2019 ◽

Vol 30 (1) ◽

pp. 138-150 ◽

Cited By ~ 5

Author(s):

Jorge Davila-Chacon ◽

Jindong Liu ◽

Stefan Wermter

Keyword(s):

Speech Recognition ◽

Source Localization ◽

Sound Source ◽

Sound Source Localization ◽

Binaural Sound

An SNR Estimation Technique Based on Deep Learning

Electronics ◽

10.3390/electronics8101139 ◽

2019 ◽

Vol 8 (10) ◽

pp. 1139 ◽

Cited By ~ 1

Author(s):

Kai Yang ◽

Zhitao Huang ◽

Xiang Wang ◽

Fenghua Wang

Keyword(s):

Deep Learning ◽

Signal To Noise Ratio ◽

A Priori ◽

Intermediate Frequency ◽

Estimation Technique ◽

Signal To Noise ◽

Snr Estimation ◽

Application Range ◽

Noise Ratio ◽

Priori Information

Signal-to-noise ratio (SNR) is a priori information necessary for many signal processing algorithms and techniques. However, conventional SNR estimation techniques have many problems, such as a limited range of applicable modulation types, a narrow effective estimation range, and poor ability to accommodate non-zero timing and frequency offsets. In this paper, an SNR estimation technique based on deep learning (DL) is proposed, which is a non-data-aided (NDA) technique. The second- and fourth-moment (M2M4) estimator is used as a benchmark, and experimental results show that the proposed method performs better and is more robust, and that it applies to a wider range of modulation types. In addition, the proposed method is applicable not only to baseband and incoherent signals but can also estimate the SNR of intermediate frequency signals.
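For context on the benchmark named above, the M2M4 estimator recovers SNR from the second and fourth moments of the received samples alone. The sketch below assumes a constant-modulus signal (such as QPSK) in complex Gaussian noise, for which M2 = S + N and M4 = S² + 4SN + 2N², so that S = sqrt(2·M2² − M4). The function name `m2m4_snr_db` is hypothetical, and the code illustrates the classical benchmark only, not the paper's deep learning method.

```python
import numpy as np

def m2m4_snr_db(y):
    """Blind M2M4 SNR estimate in dB.

    Assumes a constant-modulus signal in complex Gaussian noise.
    Returns NaN when the moment estimates are inconsistent with that model.
    """
    m2 = np.mean(np.abs(y) ** 2)              # second moment: S + N
    m4 = np.mean(np.abs(y) ** 4)              # fourth moment: S^2 + 4SN + 2N^2
    s = np.sqrt(max(2.0 * m2 ** 2 - m4, 0.0)) # signal power estimate
    n = m2 - s                                # noise power estimate
    if s <= 0.0 or n <= 0.0:
        return float("nan")
    return float(10.0 * np.log10(s / n))

# Usage: unit-power QPSK symbols at a true SNR of 10 dB.
rng = np.random.default_rng(1)
num = 100_000
symbols = (rng.choice([-1.0, 1.0], num) + 1j * rng.choice([-1.0, 1.0], num)) / np.sqrt(2)
noise_power = 10 ** (-10 / 10)
noise = np.sqrt(noise_power / 2) * (rng.standard_normal(num) + 1j * rng.standard_normal(num))
estimate_db = m2m4_snr_db(symbols + noise)
```

The estimator needs no pilot symbols, which is why it serves as a non-data-aided baseline. Its moment model changes for non-constant-modulus constellations, which is one of the modulation-range limitations the abstract points to.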

Dynamic binaural sound source localization with ITD cues: Human listeners

The Journal of the Acoustical Society of America ◽

10.1121/1.4920636 ◽

2015 ◽

Vol 137 (4) ◽

pp. 2376-2376 ◽

Cited By ~ 3

Author(s):

Xuan Zhong ◽

William Yost ◽

Liang Sun

Keyword(s):

Source Localization ◽

Sound Source ◽

Sound Source Localization ◽

Binaural Sound
