SFTRLS-Based Speech Enhancement Method Using CNN to Determine the Noise Type and the Optimal Forgetting Factor

Author(s): Deyou Tang ◽ Guoqiang Chen

2017 ◽ Vol 31 (19-21) ◽ pp. 1740096
Author(s): Wenhua Shi ◽ Xiongwei Zhang ◽ Xia Zou ◽ Wei Han

In this paper, a speech enhancement method using noise classification and a deep neural network (DNN) is proposed. A Gaussian mixture model (GMM) is employed to determine the noise type from speech-absent frames, and a DNN is used to model the relationship between the noisy observation and the clean speech. Once the noise type is determined, the corresponding DNN model is applied to enhance the noisy speech. The GMM is trained on mel-frequency cepstral coefficients (MFCCs), and its parameters are estimated with an iterative expectation-maximization (EM) algorithm. The noise type is updated by spectrum entropy-based voice activity detection (VAD). Experimental results demonstrate that the proposed method achieves better objective speech quality and smaller speech distortion under both stationary and non-stationary noise conditions.
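
As a rough illustration of the classification stage described above, the sketch below fits one diagonal-covariance GMM per noise type on MFCC frames (the EM iterations run inside scikit-learn's fit) and routes an utterance to the matching enhancement model. This is not the authors' implementation; the feature dictionaries, the DNN model registry, and the helper names are hypothetical placeholders.

```python
# Minimal sketch (assumptions, not the paper's code): per-noise-type GMMs over
# MFCC frames, fitted with EM, used to select which enhancement DNN to apply.
from sklearn.mixture import GaussianMixture

def train_noise_gmms(mfcc_frames_by_noise, n_components=8):
    """Fit one diagonal-covariance GMM per noise type.

    mfcc_frames_by_noise: dict mapping noise-type name -> array (n_frames, n_mfcc)
    of MFCCs extracted from noise-only (speech-absent) frames.
    """
    gmms = {}
    for noise_type, frames in mfcc_frames_by_noise.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag",
                              max_iter=200, random_state=0)
        gmm.fit(frames)          # EM parameter estimation happens here
        gmms[noise_type] = gmm
    return gmms

def classify_noise(gmms, mfcc_frames):
    """Return the noise type whose GMM gives the highest average log-likelihood."""
    scores = {t: g.score(mfcc_frames) for t, g in gmms.items()}
    return max(scores, key=scores.get)

# Usage sketch: classify the speech-absent frames flagged by the VAD,
# then apply the DNN trained for that noise type (dnn_models is hypothetical).
# noise_type = classify_noise(gmms, mfcc_of_noise_only_frames)
# enhanced = dnn_models[noise_type].predict(noisy_features)
```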


2021
Author(s): Kaibei Peng ◽ Xiaoming Sun ◽ Haowei Chen ◽ Zhen He ◽ Jianrong Wang

2014 ◽ Vol 13 (10) ◽ pp. 1730-1736
Author(s): Cao Bin-Fang ◽ Li Jian-Qi ◽ Qu Peixin ◽ Peng Guang-Han

2003 ◽ Vol 114 (4) ◽ pp. 2369-2369
Author(s): Hiroyuki Ono ◽ Takahiro Murakami ◽ Yoshihisa Ishida

2020 ◽ Vol 10 (3) ◽ pp. 1167
Author(s): Lu Zhang ◽ Mingjiang Wang ◽ Qiquan Zhang ◽ Ming Liu

The performance of speech enhancement algorithms can be further improved by considering the application scenarios of speech products. In this paper, we propose an attention-based branchy neural network framework that incorporates prior environmental information for noise reduction. In the overall denoising framework, an environment classification network is first trained to distinguish the noise type of each noisy speech frame. Guided by this classification network, the denoising network gradually learns a separate noise reduction ability in each branch. Unlike most deep neural network (DNN)-based methods, which learn speech reconstruction with a single common neural structure from all training noises, the proposed branchy model obtains greater performance benefits from branches specially trained for previously known noise interference types. Experimental results show that the proposed branchy DNN model not only preserves better enhanced speech quality and intelligibility in seen noisy environments, but also generalizes well to unseen noisy environments.
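
To make the branchy idea described above concrete, the sketch below combines a shared encoder, one denoising branch per known noise type, and an environment classifier whose softmax output acts as an attention-style weighting over the branch outputs. The layer sizes, the mask-based enhancement formulation, and all module names are assumptions for illustration, not the paper's actual architecture.

```python
# Minimal sketch (assumptions, not the paper's model): branchy denoiser with an
# environment classifier that softly gates per-noise-type branches.
import torch
import torch.nn as nn

class BranchyDenoiser(nn.Module):
    def __init__(self, n_features=257, hidden=512, n_noise_types=4):
        super().__init__()
        # Shared trunk over the noisy magnitude spectrum.
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        # Environment (noise-type) classifier on the shared representation.
        self.env_classifier = nn.Linear(hidden, n_noise_types)
        # One mask-estimating branch per prior known noise type.
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                          nn.Linear(hidden, n_features), nn.Sigmoid())
            for _ in range(n_noise_types)])

    def forward(self, noisy_spectrum):
        h = self.encoder(noisy_spectrum)                   # (batch, hidden)
        env_logits = self.env_classifier(h)                # (batch, n_types)
        weights = torch.softmax(env_logits, dim=-1)        # attention over branches
        masks = torch.stack([b(h) for b in self.branches], dim=1)  # (batch, n_types, n_features)
        mask = (weights.unsqueeze(-1) * masks).sum(dim=1)  # weighted branch combination
        return noisy_spectrum * mask, env_logits           # enhanced magnitude, aux logits

# Training sketch: env_logits can be supervised with a cross-entropy loss against
# the known noise label, jointly with a reconstruction loss on the enhanced spectrum.
```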

