Reduction of residual noise based on eigencomponent filtering for speech enhancement

2018 ◽  
Vol 21 (4) ◽  
pp. 877-886
Author(s):  
Kewen Huang ◽  
Yimin Liu ◽  
Yuanquan Hong

This paper introduces technology for improving sound quality in media and entertainment applications. Eliminating background noise is a major challenge in speech processing applications such as mobile phones, hands-free telephony, in-car communication, teleconferencing systems, hearing aids, voice coders, automatic speech recognition, and forensics. Speech enhancement algorithms are widely used in these applications to remove noise from speech degraded by noisy environments. However, conventional noise reduction methods introduce residual noise and speech distortion: the noise reduction process improves speech quality but can harm the intelligibility of the clean speech signal. In this paper, we introduce a new coherence-based noise reduction model for complex noise environments in which target speech coexists with surrounding coherent noise. Speech presence probability information is added to the coherence model to track noise variation more accurately, and the adaptive coherence-based method is adjusted separately during speech-presence and speech-absence periods. The performance of the proposed method is evaluated under diffuse and real street noise, where it improves speech quality with less speech distortion and residual noise.
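As a rough illustration of the coherence idea (not the paper's exact model), the sketch below computes a magnitude-squared-coherence (MSC) gain from two microphone channels, with recursive smoothing of the auto- and cross-spectra; coherent components such as target speech yield MSC near 1 and are kept, while diffuse noise with low inter-channel coherence is attenuated. All function and parameter names are our own assumptions.

```python
import numpy as np

def msc_gain(x1, x2, frame=256, hop=128, alpha=0.8):
    """Per-frame, per-bin magnitude-squared-coherence gain for two channels.

    `alpha` is the recursive-smoothing factor for the spectral estimates.
    Returns an array of shape (n_frames, frame // 2 + 1) with gains in [0, 1].
    """
    win = np.hanning(frame)
    n_frames = (len(x1) - frame) // hop + 1
    pxx = np.full(frame // 2 + 1, 1e-12)          # smoothed auto-spectrum, ch. 1
    pyy = np.full(frame // 2 + 1, 1e-12)          # smoothed auto-spectrum, ch. 2
    pxy = np.zeros(frame // 2 + 1, dtype=complex)  # smoothed cross-spectrum
    gains = []
    for i in range(n_frames):
        s1 = np.fft.rfft(win * x1[i * hop:i * hop + frame])
        s2 = np.fft.rfft(win * x2[i * hop:i * hop + frame])
        pxx = alpha * pxx + (1 - alpha) * np.abs(s1) ** 2
        pyy = alpha * pyy + (1 - alpha) * np.abs(s2) ** 2
        pxy = alpha * pxy + (1 - alpha) * s1 * np.conj(s2)
        msc = np.abs(pxy) ** 2 / (pxx * pyy + 1e-12)
        gains.append(msc)
    return np.array(gains)
```

The paper's speech presence probability term would further adapt the smoothing and the gain mapping during speech-present versus speech-absent periods; this sketch applies the raw MSC as the gain.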


2010 ◽  
Vol 8 ◽  
pp. 95-99
Author(s):  
F. X. Nsabimana ◽  
V. Subbaraman ◽  
U. Zölzer

Abstract. To enhance extremely corrupted speech signals, an Improved Psychoacoustically Motivated Spectral Weighting Rule (IPMSWR) is proposed, which controls the predefined residual noise level through a time-frequency-dependent parameter. Unlike conventional Psychoacoustically Motivated Spectral Weighting Rules (PMSWR), the level of the residual noise is varied throughout the enhanced speech based on the discrimination between regions of speech presence and speech absence, determined by the segmental SNR within critical bands. Controlling the residual noise level in noise-only regions in this way avoids the unpleasant residual noise perceived at very low SNRs. Deriving the gain coefficients requires computing the masking curve and estimating the corrupting noise power. Since clean speech is generally not available to a single-channel speech enhancement technique, the rough clean speech components needed to compute the masking curve are obtained using advanced spectral subtraction techniques. To estimate the corrupting noise, a new technique is employed that relies on noise power estimation with rapid adaptation and recursive smoothing. The performance of the proposed approach is compared objectively and subjectively with conventional approaches to highlight the aforementioned improvements.
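A minimal sketch of the two ingredients the abstract combines, under our own simplified parameterization (the paper's actual weighting rule is psychoacoustic and per critical band): power spectral subtraction to obtain a rough clean estimate, with a spectral floor that sets the residual noise level and is switched by a frame's segmental SNR, lower in noise-only frames, higher in speech-present frames.

```python
import numpy as np

def ss_with_snr_floor(noisy_mag, noise_mag,
                      floor_speech=0.1, floor_noise=0.01, snr_db_thresh=0.0):
    """Power spectral subtraction on one frame's magnitude spectrum.

    The residual-noise floor is chosen per frame from the segmental SNR:
    noise-only frames get the lower floor so residual noise is suppressed,
    speech frames get the higher floor so musical noise stays masked.
    All thresholds and floor values here are illustrative assumptions.
    """
    clean_pow = np.maximum(noisy_mag ** 2 - noise_mag ** 2, 0.0)
    seg_snr_db = 10 * np.log10((clean_pow.sum() + 1e-12) / (noise_mag ** 2).sum())
    floor = floor_speech if seg_snr_db > snr_db_thresh else floor_noise
    return np.maximum(np.sqrt(clean_pow), floor * noisy_mag)
```

In the IPMSWR itself this rough estimate would then feed the masking-curve computation rather than serve as the final output.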


2021 ◽  
pp. 1-12
Author(s):  
Jie Wang ◽  
Linhuang Yan ◽  
Qiaohe Yang ◽  
Minmin Yuan

In this paper, a single-channel speech enhancement algorithm is proposed that applies guided spectrogram filtering based on masking properties of the human auditory system, treating the speech spectrogram as an image. Guided filtering is capable of sharpening details and estimating unwanted textures or background noise in the noisy speech spectrogram. If the noisy spectrogram is regarded as a degraded image, the spectrogram of the clean speech signal can be estimated with guided filtering after subtracting the noise components. Combined with masking properties of the human auditory system, the proposed algorithm adaptively adjusts and reduces the residual noise of the enhanced speech spectrogram according to the corresponding masking threshold. Because the filtering output is a local linear transform of the guidance spectrogram, the sliding local mask window can be implemented efficiently via a box filter with O(N) computational complexity. Experimental results show that the proposed algorithm effectively suppresses noise in different noisy environments and thus greatly improves speech quality and speech intelligibility.
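The guided filter itself (He et al.'s formulation) and its O(N) box-filter implementation via an integral image can be sketched as follows; applying it to a spectrogram with the spectrogram as its own guide is one plausible reading of the abstract, and the masking-threshold adaptation is omitted here.

```python
import numpy as np

def box_mean(x, r):
    """O(N) mean over a (2r+1)x(2r+1) window using an integral image."""
    h, w = x.shape
    c = np.zeros((h + 1, w + 1))
    c[1:, 1:] = np.cumsum(np.cumsum(x, axis=0), axis=1)
    i0 = np.clip(np.arange(h) - r, 0, h); i1 = np.clip(np.arange(h) + r + 1, 0, h)
    j0 = np.clip(np.arange(w) - r, 0, w); j1 = np.clip(np.arange(w) + r + 1, 0, w)
    s = c[np.ix_(i1, j1)] - c[np.ix_(i0, j1)] - c[np.ix_(i1, j0)] + c[np.ix_(i0, j0)]
    return s / ((i1 - i0)[:, None] * (j1 - j0)[None, :])

def guided_filter(guide, src, r=2, eps=1e-3):
    """Guided filter: the output is a local linear transform of `guide`.

    Per window: a = cov(guide, src) / (var(guide) + eps), b = mean(src) - a * mean(guide);
    the coefficients are then averaged over overlapping windows.
    """
    m_i, m_p = box_mean(guide, r), box_mean(src, r)
    cov = box_mean(guide * src, r) - m_i * m_p
    var = box_mean(guide * guide, r) - m_i * m_i
    a = cov / (var + eps)
    b = m_p - a * m_i
    return box_mean(a, r) * guide + box_mean(b, r)
```

Larger `eps` smooths more aggressively (small `a`, output dominated by local means); small `eps` preserves high-variance structure such as harmonics, which is the edge-preserving behavior the abstract relies on.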


2006 ◽  
Author(s):  
Jong Won Shin ◽  
Seung Yeol Lee ◽  
Hwan Sik Yun ◽  
Nam Soo Kim

2020 ◽  
pp. 2150014
Author(s):  
S. Siva Priyanka ◽  
T. Kishore Kumar

A multi-microphone array speech enhancement method using Generalized Sidelobe Canceller (GSC) beamforming with a Combined Postfilter (CP) and Sparse Non-negative Matrix Factorization (SNMF) is proposed in this paper. GSC beamforming with CP and SNMF is implemented to reduce directional noise, diffuse noise, and residual noise, and to separate interferences in adverse environments. Directional noise is reduced by GSC beamforming, while the diffuse noise in each subband is reduced by a combined postfilter based on the Unconstrained Frequency-domain Normalized Least Mean Square (UFNLMS) algorithm. Finally, the residual noise at the CP output is suppressed by SNMF. The performance of the proposed method is evaluated using the PESQ, SSNR, STOI, SDR, and LSD measures. Noise reduction with four and eight microphones is compared and illustrated with spectrograms. The proposed method outperforms existing methods in terms of intelligibility and quality in adverse environments.
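The adaptive core of such a postfilter is an NLMS noise canceller: a noise reference (in a GSC, the blocking-matrix output) is filtered to predict the noise in the primary channel, and the prediction error is the enhanced signal. Below is a time-domain NLMS sketch as a simplified stand-in for the frequency-domain UFNLMS the abstract describes; names and parameters are ours.

```python
import numpy as np

def nlms_cancel(primary, reference, order=16, mu=0.5, eps=1e-8):
    """Time-domain NLMS adaptive noise canceller.

    `primary` holds speech + noise; `reference` holds correlated noise only.
    The filter learns to predict the noise component; the error signal is
    the enhanced output. `mu` in (0, 2) controls the adaptation speed.
    """
    w = np.zeros(order)
    out = np.zeros(len(primary))
    for n in range(order - 1, len(primary)):
        x = reference[n - order + 1:n + 1][::-1]   # most recent sample first
        y = w @ x                                  # noise estimate
        e = primary[n] - y                         # enhanced sample (error)
        w += mu * e * x / (x @ x + eps)            # normalized LMS update
        out[n] = e
    return out
```

A frequency-domain (subband) version applies the same update per bin on block FFTs, which converges faster for colored noise; the time-domain form above keeps the sketch short.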


2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Soojeong Lee ◽  
Gangseong Lee

This paper proposes a noise-bias compensation of the minimum statistics (MS) method using a nonlinear function and a priori speech absence probability (SAP) for speech enhancement in highly nonstationary noisy environments. The MS method is a well-known technique for noise power estimation in nonstationary noisy environments; however, it tends to bias the noise estimate below the true noise level. The proposed method is combined with an adaptive parameter based on a sigmoid function and the a priori SAP for residual noise reduction. Additionally, our method uses an automatic parameter to control the trade-off between speech distortion and residual noise. We evaluate the noise power estimation in highly nonstationary and varying noise environments; the improvement is confirmed in terms of signal-to-noise ratio (SNR) and the Itakura-Saito Distortion Measure (ISDM).
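For context, the baseline MS estimator that the paper compensates can be sketched per frequency bin as follows: recursively smooth the noisy periodogram, track the minimum over a sliding window of past frames, and scale up by a bias factor (the minimum of a smoothed power systematically underestimates the true noise level). The paper's adaptive sigmoid/SAP-based compensation is replaced here by a constant `bias`, an assumption for illustration only.

```python
import numpy as np

def ms_noise_estimate(noisy_pow, win=64, alpha=0.85, bias=1.5):
    """Minimum-statistics noise tracking on one bin's power trajectory.

    `noisy_pow` is a 1-D array of per-frame noisy power values for a single
    frequency bin; `win` is the search window length in frames, `alpha` the
    recursive smoothing factor, and `bias` a fixed compensation factor.
    """
    smoothed = np.empty_like(noisy_pow)
    s = noisy_pow[0]
    for t in range(len(noisy_pow)):
        s = alpha * s + (1 - alpha) * noisy_pow[t]   # first-order smoothing
        smoothed[t] = s
    est = np.array([smoothed[max(0, t - win + 1):t + 1].min()
                    for t in range(len(smoothed))])
    return bias * est
```

Because the minimum tracks the noise floor even through speech activity, the estimate adapts to slowly varying noise without a voice activity detector, which is exactly the regime where the bias the paper addresses appears.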


2020 ◽  
Vol 10 (8) ◽  
pp. 2894 ◽  
Author(s):  
Andong Li ◽  
Renhua Peng ◽  
Chengshi Zheng ◽  
Xiaodong Li

For voice communication, it is important to extract the speech from its noisy version without introducing unnatural-sounding artificial noise. By studying the subband mean-squared error (MSE) of the speech for unsupervised speech enhancement approaches and revealing its relationship with existing loss functions for supervised approaches, this paper derives a generalized loss function that takes residual noise control into account in a supervised setting. The generalized loss function contains the well-known MSE loss function and many other commonly used loss functions as special cases. Compared with traditional loss functions, the generalized loss function is more flexible in trading off speech distortion against noise reduction, because a group of well-studied noise shaping schemes can be introduced to control the residual noise in practical applications. Objective and subjective test results verify the importance of residual noise control for the supervised speech enhancement approach.
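A generalized spectral loss of this kind can be sketched as below; the exact parameterization is ours, not the paper's. An exponent `p` on the error and a compression exponent `beta` on the magnitudes recover plain magnitude MSE as the special case `beta=1, p=2`, and an optional per-frequency weight stands in for a noise shaping scheme controlling where residual noise is tolerated.

```python
import numpy as np

def generalized_loss(clean_mag, est_mag, beta=1.0, p=2.0, shaping=None):
    """Generalized magnitude-spectral loss (illustrative parameterization).

    beta < 1 compresses magnitudes, emphasizing low-energy bins; p selects
    the error exponent; `shaping` is an optional per-frequency weight array
    implementing a residual-noise shaping scheme. beta=1, p=2, shaping=None
    reduces to the ordinary magnitude MSE.
    """
    err = np.abs(clean_mag ** beta - est_mag ** beta) ** p
    if shaping is not None:
        err = err * shaping          # frequency-weighted residual error
    return err.mean()
```

Training with a larger weight on perceptually sensitive bands penalizes residual noise there more, which is the trade-off knob between speech distortion and noise reduction that the abstract emphasizes.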

