An Iterative Kalman Filter with Reduced-Biased Kalman Gain for Single Channel Speech Enhancement in Non-stationary Noise Condition

2019 ◽  
Vol 7 (1) ◽  
pp. 7-13
Author(s):  
Sujan Kumar Roy ◽  
Kuldip K. Paliwal

Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in the Kalman filter (KF) gain and degrade speech enhancement performance. Existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in real-life noise conditions. First, we estimate the noise from each noisy speech frame using a speech presence probability (SPP) method to compute the noise variance. We then construct a whitening filter (with its coefficients computed from the estimated noise) and apply it to the noisy speech, yielding pre-whitened speech from which the speech LPC parameters are computed. A KF is then constructed with the estimated parameters, where the robustness metric offsets the bias in the Kalman gain during speech absence against the sensitivity metric during speech presence to achieve better noise reduction. The noise variance and the speech model parameters are adopted as a speech activity detector. The reduced-biased Kalman gain enables the KF to minimize the noise effect significantly, yielding the enhanced speech. Objective and subjective scores on the NOIZEUS corpus demonstrate that the enhanced speech produced by the proposed method exhibits higher quality and intelligibility than some benchmark methods.
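To make the processing pipeline described in this abstract concrete, below is a minimal numerical sketch of frame-wise Kalman-filter speech enhancement with an AR (LPC) speech model. It is not the authors' implementation: the SPP-based noise estimator and the robustness/sensitivity tuning of the gain are not reproduced; a noise variance taken from an assumed noise-only lead-in segment and plain autocorrelation LPC stand in as placeholders.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc(frame, order):
    """Autocorrelation-method LPC.  Returns predictor coefficients a
    (s[n] ~ sum_i a[i] * s[n-1-i]) and an excitation-variance estimate."""
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    var = max((r[0] - a @ r[1:order + 1]) / len(frame), 1e-12)
    return a, var

def kalman_frame(frame, a, q_w, r_v):
    """Sample-by-sample Kalman filtering of one frame under the AR state-space
    x[k] = A x[k-1] + d w[k],  y[k] = c^T x[k] + v[k]."""
    p = len(a)
    A = np.zeros((p, p)); A[0, :] = a; A[1:, :-1] = np.eye(p - 1)
    c = np.zeros(p); c[0] = 1.0
    x, P = np.zeros(p), np.eye(p) * r_v
    out = np.empty_like(frame)
    for k, y in enumerate(frame):
        x = A @ x                              # predict
        P = A @ P @ A.T; P[0, 0] += q_w
        K = P @ c / (c @ P @ c + r_v)          # Kalman gain (the paper offsets its bias here)
        x = x + K * (y - c @ x)                # update
        P = P - np.outer(K, c) @ P
        out[k] = x[0]                          # current enhanced sample
    return out

def enhance(noisy, fs, frame_len=0.02, p=10, q=20):
    """Frame-wise enhancement.  Placeholder noise estimate: the first 100 ms
    are assumed noise-only (the paper uses an SPP-based estimator instead)."""
    noise = noisy[:int(0.1 * fs)]
    r_v = np.var(noise)
    a_n, _ = lpc(noise, q)                     # noise LPCs for the whitening filter
    N = int(frame_len * fs)
    out = np.zeros_like(noisy)
    for start in range(0, len(noisy) - N + 1, N):
        frame = noisy[start:start + N]
        white = lfilter(np.r_[1.0, -a_n], [1.0], frame)   # pre-whitened frame
        a, q_w = lpc(white, p)                 # speech LPCs from pre-whitened speech
        out[start:start + N] = kalman_frame(frame, a, q_w, r_v)
    return out
```

With the paper's SPP noise estimator and its robustness/sensitivity gain adjustment substituted for the placeholders above, this loop becomes the iterative, reduced-bias scheme the abstract describes.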


Signals ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 434-455
Author(s):  
Sujan Kumar Roy ◽  
Kuldip K. Paliwal

Inaccurate estimates of the linear prediction coefficient (LPC) and noise variance introduce bias in Kalman filter (KF) gain and degrade speech enhancement performance. The existing methods propose a tuning of the biased Kalman gain, particularly in stationary noise conditions. This paper introduces a tuning of the KF gain for speech enhancement in real-life noise conditions. First, we estimate noise from each noisy speech frame using a speech presence probability (SPP) method to compute the noise variance. Then, we construct a whitening filter (with its coefficients computed from the estimated noise) to pre-whiten each noisy speech frame prior to computing the speech LPC parameters. We then construct the KF with the estimated parameters, where the robustness metric offsets the bias in the KF gain during speech absence of the noisy speech against the sensitivity metric during speech presence to achieve better noise reduction. The noise variance and the speech model parameters are adopted as a speech activity detector. The reduced-biased Kalman gain enables the KF to minimize the noise effect significantly, yielding the enhanced speech. Objective and subjective scores on the NOIZEUS corpus demonstrate that the enhanced speech produced by the proposed method exhibits higher quality and intelligibility than some benchmark methods.
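The gain-tuning step itself can be sketched as follows. This is only an illustration of the idea stated in the abstract: the paper's robustness and sensitivity metrics are not reproduced, so a crude speech-activity decision (comparing the excitation variance to the noise variance, as the abstract suggests) and a hypothetical attenuation factor `alpha` stand in for them.

```python
import numpy as np

def tuned_gain(P_pred, c, r_v, q_w, alpha=0.5):
    """Hypothetical bias-offset of the Kalman gain.

    P_pred : predicted error covariance,  c : observation vector,
    r_v    : noise variance,              q_w : speech excitation variance.
    The excitation-to-noise comparison acts as the speech activity detector
    mentioned in the abstract; `alpha` (assumed) damps the biased gain during
    speech absence, mimicking the robustness-metric offset.
    """
    K = P_pred @ c / (c @ P_pred @ c + r_v)
    if q_w <= r_v:                 # speech likely absent in this frame
        K = alpha * K              # suppress the gain -> stronger noise reduction
    return K
```

Dropping this in place of the plain gain computation in the earlier sketch leaves the KF update equations unchanged while making the amount of residual-noise leakage depend on the per-frame speech-activity decision.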


Author(s):  
Maximilian Strake ◽  
Bruno Defraene ◽  
Kristoff Fluyt ◽  
Wouter Tirry ◽  
Tim Fingscheidt

Single-channel speech enhancement in highly non-stationary noise conditions is a very challenging task, especially when interfering speech is included in the noise. Deep learning-based approaches have notably improved the performance of speech enhancement algorithms under such conditions, but still introduce speech distortions when strong noise suppression is to be achieved. We propose to address this problem by using a two-stage approach, first performing noise suppression and subsequently restoring natural-sounding speech, using specifically chosen neural network topologies and loss functions for each task. A mask-based long short-term memory (LSTM) network is employed for noise suppression, and speech restoration is performed via spectral mapping with a convolutional encoder-decoder network (CED). The proposed method improves speech quality (PESQ) over state-of-the-art single-stage methods by about 0.1 points for unseen highly non-stationary noise types including interfering speech. Furthermore, it is able to increase intelligibility in low-SNR conditions and consistently outperforms all reference methods.
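The two-stage topology described above can be sketched compactly in PyTorch. This is a structural illustration only, under assumed layer sizes; the authors' exact network configurations, training targets, and loss functions are not reproduced here.

```python
import torch
import torch.nn as nn

class MaskLSTM(nn.Module):
    """Stage 1: mask-based noise suppression (layer sizes are placeholders)."""
    def __init__(self, n_freq=256, hidden=512, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_freq, hidden, layers, batch_first=True)
        self.fc = nn.Linear(hidden, n_freq)

    def forward(self, noisy_mag):                  # (B, T, F) magnitude spectrogram
        h, _ = self.lstm(noisy_mag)
        mask = torch.sigmoid(self.fc(h))           # bounded time-frequency mask
        return mask * noisy_mag                    # noise-suppressed magnitudes

class RestorationCED(nn.Module):
    """Stage 2: convolutional encoder-decoder (CED) for spectral-mapping restoration."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(1, ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(2 * ch, ch, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(ch, 1, 3, stride=2, padding=1, output_padding=1),
        )

    def forward(self, suppressed_mag):             # (B, T, F)
        x = suppressed_mag.unsqueeze(1)            # add channel axis -> (B, 1, T, F)
        return self.dec(self.enc(x)).squeeze(1)    # restored magnitudes, same shape

# Two-stage pipeline: suppress noise first, then restore natural-sounding speech.
noisy = torch.rand(4, 100, 256)                    # dummy batch: 4 utterances, 100 frames
restored = RestorationCED()(MaskLSTM()(noisy))
print(restored.shape)                              # torch.Size([4, 100, 256])
```

In training, each stage would receive its own loss (for example, a mask-estimation loss for stage 1 and a spectral-mapping loss for stage 2), which is the point of the task-specific topologies and loss functions mentioned in the abstract.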


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 64524-64538
Author(s):  
Sujan Kumar Roy ◽  
Aaron Nicolson ◽  
Kuldip K. Paliwal
