Sound Source Separation: Latest Research Papers

Sound Source Separation Mechanisms of Different Deep Networks Explained from the Perspective of Auditory Perception

Thanks to the development of deep learning, various sound source separation networks have been proposed and have made significant progress. However, the study of the underlying separation mechanisms is still in its infancy. In this study, deep networks are explained from the perspective of auditory perception mechanisms. For separating two arbitrary sound sources from monaural recordings, three different networks with different parameters are trained and achieve excellent performance. The networks' outputs reach an average scale-invariant signal-to-distortion ratio improvement (SI-SDRi) higher than 10 dB, comparable to human performance in separating natural sources. More importantly, the most intuitive principle, proximity, is explored through simultaneous and sequential organization experiments. Results show that, regardless of network structure and parameters, the proximity principle is learned spontaneously by all networks. If components are proximate in frequency or time, they are not easily separated by the networks. Moreover, the frequency resolution at low frequencies is better than at high frequencies. These behavioral characteristics of all three networks are highly consistent with those of the human auditory system, which implies that the learned proximity principle is not accidental but is the optimal strategy selected by networks and humans when facing the same task. The emergence of auditory-like separation mechanisms makes it possible to develop a universal system that can be adapted to all sources and scenes.
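As a rough illustration of the metric quoted in the abstract above, the following is a minimal sketch of SI-SDR and its improvement (SI-SDRi) in plain Python. The function names, the `eps` guard, and the choice to measure improvement against the unprocessed mixture are assumptions made here for illustration; the abstract does not describe the evaluation code that was actually used.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    # Optimal scaling of the target so that the residual is orthogonal to it.
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / (target_energy + eps)
    # Split the estimate into a scaled-target part and a residual part.
    projection = [alpha * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_power = sum(p * p for p in projection)
    residual_power = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_power + eps) / (residual_power + eps))


def si_sdr_improvement(estimate, mixture, target):
    """SI-SDRi: gain of the separated estimate over the raw mixture, in dB."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)
```

Because the target is rescaled before the residual is computed, multiplying the estimate by any nonzero constant leaves the score essentially unchanged, which is what "scale-invariant" refers to.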


A multichannel learning-based approach for sound source separation in reverberant environments

EURASIP Journal on Audio Speech and Music Processing ◽ 10.1186/s13636-021-00227-2 ◽ 2021 ◽ Vol 2021 (1)

Author(s): You-Siang Chen ◽ Zi-Jie Lin ◽ Mingsian R. Bai

Keyword(s): Frequency Domain ◽ Sound Source ◽ Signal To Noise Ratio ◽ Source Separation ◽ Objective Evaluation ◽ Linear Mapping ◽ Invariant Mean ◽ Scale Invariant ◽ Sound Source Separation ◽ Reverberant Field

Abstract: In this paper, a multichannel learning-based network is proposed for sound source separation in a reverberant field. The network can be divided into two parts according to the training strategies. In the first stage, time-dilated convolutional blocks are trained to estimate the array weights for beamforming the multichannel microphone signals. Next, the output of the network is processed by a weight-and-sum operation that is reformulated to handle real-valued data in the frequency domain. In the second stage, a U-net model is concatenated to the beamforming network to serve as a non-linear mapping filter for joint separation and dereverberation. The scale-invariant mean square error (SI-MSE), a frequency-domain modification of the scale-invariant signal-to-noise ratio (SI-SNR), is used as the objective function for training. Furthermore, the combined network is also trained with speech segments filtered by a great variety of room impulse responses. Simulations are conducted for comprehensive multisource scenarios with various subtending angles of sources and reverberation times. The proposed network is compared with several baseline approaches in terms of objective evaluation metrics. The results demonstrate the excellent performance of the proposed network in dereverberation and separation compared to the baseline methods.
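One way to picture a weight-and-sum operation that handles only real-valued data is to store each complex weight and each complex microphone spectrum as separate real and imaginary parts, then expand the complex product by hand. The sketch below does this for a single time-frequency bin. It assumes the common convention y = sum over microphones of conj(w_m) * x_m; the abstract does not state which convention or data layout the authors use, and the function and argument names are hypothetical.

```python
def weight_and_sum(w_re, w_im, x_re, x_im):
    """Beamformer output for one time-frequency bin using real arithmetic only.

    Each argument is a list with one entry per microphone:
    w_re/w_im are the real and imaginary parts of the array weights,
    x_re/x_im are the real and imaginary parts of the microphone spectra.
    Assumed convention (not confirmed by the source): y = sum_m conj(w_m) * x_m.
    """
    if not (len(w_re) == len(w_im) == len(x_re) == len(x_im)):
        raise ValueError("all inputs must have one entry per microphone")
    # conj(w) * x = (wr*xr + wi*xi) + j*(wr*xi - wi*xr)
    y_re = sum(wr * xr + wi * xi
               for wr, wi, xr, xi in zip(w_re, w_im, x_re, x_im))
    y_im = sum(wr * xi - wi * xr
               for wr, wi, xr, xi in zip(w_re, w_im, x_re, x_im))
    return y_re, y_im
```

Keeping the two parts separate lets a network that only produces real-valued tensors emit the weights directly, at the cost of four real multiplications per complex product.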


An Underdetermined Environmental Sound Source Separation Algorithm Based on Improved Complete Ensemble EMD with Adaptive Noise and ICA

10.1109/icct52962.2021.9658032 ◽ 2021

Author(s): Mei Wang ◽ Lu Bai ◽ Liyan Luo ◽ Ye Jin ◽ Ruibin He ◽ ...

Keyword(s): Sound Source ◽ Source Separation ◽ Separation Algorithm ◽ Environmental Sound ◽ Sound Source Separation ◽ Adaptive Noise


Multiple Sound Source Separation by Using DOA Estimation and ICA

10.1109/icicsp54369.2021.9611980 ◽ 2021

Author(s): Wenjie Xu ◽ Maoshen Jia ◽ Shang Gao ◽ Lu Li

Keyword(s): Sound Source ◽ Source Separation ◽ Doa Estimation ◽ Sound Source Separation


Compute and Memory Efficient Universal Sound Source Separation

Journal of Signal Processing Systems ◽ 10.1007/s11265-021-01683-x ◽ 2021

Author(s): Efthymios Tzinis ◽ Zhepei Wang ◽ Xilin Jiang ◽ Paris Smaragdis

Keyword(s): Sound Source ◽ Source Separation ◽ Sound Source Separation ◽ Memory Efficient


Leveraging Category Information for Single-Frame Visual Sound Source Separation

2021 9th European Workshop on Visual Information Processing (EUVIP) ◽ 10.1109/euvip50544.2021.9484036 ◽ 2021

Author(s): Lingyu Zhu ◽ Esa Rahtu

Keyword(s): Sound Source ◽ Source Separation ◽ Category Information ◽ Single Frame ◽ Sound Source Separation


DBnet: Doa-Driven Beamforming Network for end-to-end Reverberant Sound Source Separation

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp39728.2021.9414187 ◽ 2021

Author(s): Ali Aroudi ◽ Sebastian Braun

Keyword(s): Sound Source ◽ Source Separation ◽ Sound Source Separation ◽ End To End


Auditory Filterbanks Benefit Universal Sound Source Separation

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp39728.2021.9414105 ◽ 2021

Author(s): Han Li ◽ Kean Chen ◽ Bernhard U. Seeber

Keyword(s): Sound Source ◽ Source Separation ◽ Sound Source Separation


Maximum a Posteriori Estimator for Convolutive Sound Source Separation with Sub-Source Based NTF Model and the Localization Probabilistic Prior on the Mixing Matrix

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp39728.2021.9413863 ◽ 2021

Author(s): Mieszko Fras ◽ Konrad Kowalczyk

Keyword(s): Sound Source ◽ Source Separation ◽ Maximum A Posteriori ◽ A Posteriori ◽ Sound Source Separation ◽ Mixing Matrix


Multichannel environmental sound segmentation

Applied Intelligence ◽ 10.1007/s10489-021-02314-5 ◽ 2021

Author(s): Yui Sudo ◽ Katsutoshi Itoyama ◽ Kenji Nishida ◽ Kazuhiro Nakadai

Keyword(s): Source Localization ◽ Sound Source ◽ Source Separation ◽ Sound Source Localization ◽ Segmentation Method ◽ Environmental Sound ◽ Environmental Sounds ◽ Spatial Features ◽ Sound Source Separation ◽ The Relationship

Abstract: This paper proposes a multichannel environmental sound segmentation method. Environmental sound segmentation is an integrated method that achieves sound source localization, sound source separation, and classification simultaneously. When multiple microphones are available, spatial features can be used to improve the localization and separation accuracy of sounds from different directions; however, conventional methods have three drawbacks: (a) Sound source localization and separation methods using spatial features, and classification using spectral features, when trained in the same neural network may overfit to the relationship between the direction of arrival and the class of a sound, thereby reducing their reliability in dealing with novel events. (b) Although permutation invariant training used in automatic speech recognition could be extended, it is impractical for environmental sounds that include an unlimited number of sound sources. (c) Various features, such as complex values of the short-time Fourier transform and interchannel phase differences, have been used as spatial features, but no study has compared them. This paper proposes a multichannel environmental sound segmentation method comprising two discrete blocks: a sound source localization and separation block and a sound source separation and classification block. By separating the blocks, overfitting to the relationship between the direction of arrival and the class is avoided. Simulation experiments using created datasets including 75-class environmental sounds showed that the root mean squared error of the proposed method was lower than that of conventional methods.
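To see why drawback (b) in the abstract above arises, it can help to look at a brute-force form of permutation invariant training (PIT). The sketch below tries every assignment of network outputs to reference sources and keeps the cheapest one. It is a generic illustration with hypothetical function names and a plain mean squared error, not the loss used in the paper.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, targets):
    """Return (best_loss, best_perm) over all output-to-target assignments.

    best_perm[i] is the index of the estimate matched to targets[i].
    The loop visits N! permutations for N sources, which is why this
    exhaustive form scales poorly as the number of sources grows.
    """
    n = len(targets)
    if len(estimates) != n:
        raise ValueError("need exactly one estimate per target")
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[p], targets[i])
                   for i, p in enumerate(perm)) / n
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

With 3 sources the loop evaluates 6 assignments, and with 10 sources it evaluates 3,628,800, so an open-ended number of sources quickly becomes unmanageable for this exhaustive search.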


sound source separation
Recently Published Documents


Sound Source Separation Mechanisms of Different Deep Networks Explained from the Perspective of Auditory Perception

A multichannel learning-based approach for sound source separation in reverberant environments

An Underdetermined Environmental Sound Source Separation Algorithm Based on Improved Complete Ensemble EMD with Adaptive Noise and ICA

Multiple Sound Source Separation by Using DOA Estimation and ICA

Compute and Memory Efficient Universal Sound Source Separation

Leveraging Category Information for Single-Frame Visual Sound Source Separation

DBnet: Doa-Driven Beamforming Network for end-to-end Reverberant Sound Source Separation

Auditory Filterbanks Benefit Universal Sound Source Separation

Maximum a Posteriori Estimator for Convolutive Sound Source Separation with Sub-Source Based NTF Model and the Localization Probabilistic Prior on the Mixing Matrix

Multichannel environmental sound segmentation
