Time-frequency constraints for phase estimation in single-channel speech enhancement

Abstract. To enhance extremely corrupted speech signals, an Improved Psychoacoustically Motivated Spectral Weighting Rule (IPMSWR) is proposed that controls the predefined residual noise level through a time-frequency dependent parameter. Unlike conventional Psychoacoustically Motivated Spectral Weighting Rules (PMSWR), the level of the residual noise is varied throughout the enhanced speech, based on discriminating between regions of speech presence and speech absence by means of the segmental SNR within critical bands. Controlling the residual noise level in noise-only regions in this way avoids the unpleasant residual noise perceived at very low SNRs. Deriving the gain coefficients requires computing the masking curve and estimating the corrupting noise power. Since the clean speech is generally not available to a single-channel speech enhancement technique, the rough clean speech components needed to compute the masking curve are obtained using advanced spectral subtraction techniques. The corrupting noise is estimated with a new technique based on rapid adaptation and recursive smoothing principles. The performance of the proposed approach is compared objectively and subjectively with conventional approaches to highlight the aforementioned improvement.
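The ingredients named in the abstract (recursive smoothing of the noise power, a spectral-subtraction gain, and a residual-noise floor that depends on the segmental SNR) can be illustrated with a minimal sketch. This is not the IPMSWR itself: the function names, the smoothing factor `alpha`, the SNR threshold, and the two floor values are all hypothetical choices made for illustration, and the masking curve is left out entirely.

```python
def smooth_noise_power(noisy_power, alpha=0.9):
    """First-order recursive smoothing of a per-frame power sequence
    for one frequency bin. alpha is an assumed smoothing factor."""
    estimate = noisy_power[0]
    track = [estimate]
    for power in noisy_power[1:]:
        estimate = alpha * estimate + (1.0 - alpha) * power
        track.append(estimate)
    return track


def residual_floor(segmental_snr_db, threshold_db=0.0,
                   speech_floor=0.1, noise_floor=0.01):
    """Pick a gain floor per time-frequency region.
    The threshold and both floor values are illustrative assumptions."""
    if segmental_snr_db > threshold_db:
        return speech_floor   # region treated as speech-present
    return noise_floor        # region treated as noise-only


def floored_subtraction_gain(noisy_power, noise_power, floor):
    """Power spectral subtraction gain, limited from below by the floor."""
    if noisy_power <= 0.0:
        return floor
    gain_squared = max(1.0 - noise_power / noisy_power, 0.0)
    return max(gain_squared ** 0.5, floor)


# Usage: one bin, three frames, with an assumed segmental SNR per frame.
noise_track = smooth_noise_power([1.0, 1.2, 0.9])
gains = [
    floored_subtraction_gain(p, n, residual_floor(snr))
    for p, n, snr in zip([4.0, 1.0, 9.0], noise_track, [6.0, -3.0, 9.0])
]
```

The floor is what sets the residual noise level: wherever the subtraction gain would fall below it, the floor is applied instead, so making it depend on the time-frequency region gives region-dependent residual noise.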


Phase Estimation in Single-Channel Speech Enhancement: Limits-Potential

IEEE/ACM Transactions on Audio Speech and Language Processing ◽ 10.1109/taslp.2015.2430820 ◽ 2015 ◽ Vol 23 (8) ◽ pp. 1283-1294 ◽ Cited By ~ 30

Author(s): Pejman Mowlaee ◽ Josef Kulmer

Keyword(s): Speech Enhancement ◽ Single Channel ◽ Phase Estimation


Speech enhancement algorithm of improved OMLSA based on bilateral spectrogram filtering

Journal of Intelligent & Fuzzy Systems ◽ 10.3233/jifs-192088 ◽ 2020 ◽ Vol 39 (5) ◽ pp. 6881-6889

Author(s): Jie Wang ◽ Linhuang Yan ◽ Jiayi Tian ◽ Minmin Yuan

Keyword(s): Speech Enhancement ◽ Visual Processing ◽ Single Channel ◽ Signal To Noise Ratio ◽ Spectral Amplitude ◽ Signal To Noise ◽ Noisy Speech ◽ Time Frequency ◽ Perceptual Evaluation ◽ Noise Ratio

In this paper, a bilateral spectrogram filtering (BSF)-based optimally modified log-spectral amplitude (OMLSA) estimator is proposed for single-channel speech enhancement. It can significantly improve the performance of OMLSA, especially in highly non-stationary noise environments, by using bilateral filtering (BF), a technique widely used in image and visual processing, to preprocess the spectrogram of the noisy speech. Treating the speech spectrogram as an image, BSF not only sharpens details and removes unwanted textures or background noise, but also preserves edges. The a posteriori signal-to-noise ratio (SNR) of the OMLSA algorithm is estimated after applying BSF to the noisy speech. In addition, to reduce computing costs, a fast and accurate BF is adopted that reduces the algorithm complexity to O(1) for each time-frequency bin. Finally, the proposed algorithm is compared with the original OMLSA and other classic denoising methods using various types of noise at different SNRs, in terms of objective evaluation metrics such as segmental SNR improvement and perceptual evaluation of speech quality. The results show the validity of the improved BSF-based OMLSA algorithm.
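The edge-preserving behaviour described above can be seen in a minimal brute-force sketch of bilateral filtering applied to a spectrogram stored as a 2-D list (rows as frequency bins, columns as frames). This is the textbook bilateral filter, not the fast O(1) variant the abstract refers to and not the authors' implementation; the function name and the parameters `radius`, `sigma_space`, and `sigma_range` are assumptions chosen for illustration.

```python
import math


def bilateral_filter(spec, radius=1, sigma_space=1.0, sigma_range=1.0):
    """Naive bilateral filter over a 2-D list of magnitudes.
    Each output bin is a weighted mean of its neighbours, where the
    weight decays with time-frequency distance and with magnitude
    difference from the centre bin. Cost grows with the window size,
    unlike the O(1)-per-bin filter mentioned in the abstract."""
    rows, cols = len(spec), len(spec[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            center = spec[i][j]
            weight_sum = 0.0
            value_sum = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ni, nj = i + di, j + dj
                    if not (0 <= ni < rows and 0 <= nj < cols):
                        continue  # skip neighbours outside the spectrogram
                    spatial = math.exp(
                        -(di * di + dj * dj) / (2.0 * sigma_space ** 2))
                    similarity = math.exp(
                        -((spec[ni][nj] - center) ** 2)
                        / (2.0 * sigma_range ** 2))
                    weight = spatial * similarity
                    weight_sum += weight
                    value_sum += weight * spec[ni][nj]
            # weight_sum is always > 0 because the centre bin has weight 1
            out[i][j] = value_sum / weight_sum
    return out


# Usage: a sharp onset between frames 1 and 2 survives the smoothing.
step = [[0.0, 0.0, 10.0, 10.0] for _ in range(3)]
filtered = bilateral_filter(step)
```

Because neighbours with very different magnitudes get almost no weight, bins on either side of the onset are averaged only with bins on their own side, which is why the edge is kept while small fluctuations are smoothed out.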


A novel single channel speech enhancement using time frequency mask

2012 International Conference on Computer Science and Information Processing (CSIP) ◽ 10.1109/csip.2012.6308996 ◽ 2012

Author(s): Gongxian Sun ◽ Ming Xiao ◽ Feng Gao

Keyword(s): Speech Enhancement ◽ Single Channel ◽ Time Frequency


Impact of phase estimation on single-channel speech separation based on time-frequency masking

The Journal of the Acoustical Society of America ◽ 10.1121/1.4986647 ◽ 2017 ◽ Vol 141 (6) ◽ pp. 4668-4679 ◽ Cited By ~ 11

Author(s): Florian Mayer ◽ Donald S. Williamson ◽ Pejman Mowlaee ◽ DeLiang Wang

Keyword(s): Single Channel ◽ Phase Estimation ◽ Speech Separation ◽ Time Frequency



Unsupervised single-channel speech enhancement based on phase aware time-frequency mask estimation

Harmonic Phase Estimation in Single-Channel Speech Enhancement Using Phase Decomposition and SNR Information

Harmonic phase estimation in single-channel speech enhancement using von mises distribution and prior SNR

Phase Estimation in Single Channel Speech Enhancement Using Phase Decomposition

A probabilistic approach for phase estimation in single-channel speech enhancement using von mises phase priors

A single channel speech enhancement technique exploiting human auditory masking properties

