An evaluation of alaryngeal speech enhancement methods based on voice conversion techniques

Author(s):  
Hironori Doi ◽  
Keigo Nakamura ◽  
Tomoki Toda ◽  
Hiroshi Saruwatari ◽  
Kiyohiro Shikano
Author(s):  
Wenlong Li ◽  
Kaoru Hirota ◽  
Yaping Dai ◽  
Zhiyang Jia

An improved fully convolutional network with post-processing based on global variance (GV) equalization and noise-aware training (PN-FCN) is proposed for speech enhancement. It aims to reduce the complexity of the speech enhancement system and to address two problems: overly smooth spectrograms of the enhanced speech and poor generalization capability. The PN-FCN is fed noisy speech samples augmented with an estimate of the noise, so that it can use additional online noise information to better predict the clean speech. In addition, the PN-FCN uses global variance information, which improves the subjective score in a voice conversion task. Finally, the proposed framework adopts an FCN, whose number of parameters is one-seventh that of a deep neural network (DNN). Experimental results on the Valentini-Botinhao dataset demonstrate that the proposed framework achieves improvements in both denoising effect and model training speed.
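
The two ideas named in the abstract, noise-aware input augmentation and GV equalization, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `noise_aware_input` and `gv_equalize` are hypothetical, the noise estimate is assumed to be the mean of the first few frames of the utterance, and GV equalization is assumed to be a per-dimension rescaling toward a reference variance; the abstract specifies neither detail.

```python
from statistics import fmean, pvariance


def noise_aware_input(frames, n_init=2):
    """Append a noise estimate to every feature frame.

    Assumption: the noise estimate is the per-dimension mean of the
    first n_init frames, which are taken to contain noise only.
    """
    dims = len(frames[0])
    noise = [fmean(f[d] for f in frames[:n_init]) for d in range(dims)]
    return [list(f) + noise for f in frames]


def gv_equalize(enhanced, ref_gv):
    """Rescale each feature dimension so its variance matches ref_gv.

    Assumption: ref_gv holds one reference (clean-speech) variance per
    dimension; each dimension is scaled around its own mean.
    """
    dims = len(enhanced[0])
    means = [fmean(f[d] for f in enhanced) for d in range(dims)]
    est_gv = [pvariance([f[d] for f in enhanced]) for d in range(dims)]
    # Leave a dimension unchanged if its estimated variance is zero.
    scales = [(r / e) ** 0.5 if e > 0 else 1.0 for r, e in zip(ref_gv, est_gv)]
    return [
        [means[d] + scales[d] * (f[d] - means[d]) for d in range(dims)]
        for f in enhanced
    ]
```

In this sketch the augmented frames would be the network input, and `gv_equalize` would be applied to the network output to counteract over-smoothing by widening the spread of each feature dimension.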


2009 ◽  
Author(s):  
Keigo Nakamura ◽  
Tomoki Toda ◽  
Hiroshi Saruwatari ◽  
Kiyohiro Shikano

2010 ◽  
Vol E93-D (9) ◽  
pp. 2472-2482 ◽  
Author(s):  
Hironori DOI ◽  
Keigo NAKAMURA ◽  
Tomoki TODA ◽  
Hiroshi SARUWATARI ◽  
Kiyohiro SHIKANO

2014 ◽  
Vol 22 (1) ◽  
pp. 172-183 ◽  
Author(s):  
Hironori Doi ◽  
Tomoki Toda ◽  
Keigo Nakamura ◽  
Hiroshi Saruwatari ◽  
Kiyohiro Shikano
