Time-frequency masking for large scale robust speech recognition

Author(s):  
Yuxuan Wang ◽  
Ananya Misra ◽  
Kean K. Chin
2006 ◽  
Vol 48 (11) ◽  
pp. 1486-1501 ◽  
Author(s):  
Soundararajan Srinivasan ◽  
Nicoleta Roman ◽  
DeLiang Wang

Author(s):  
George Mufungulwa ◽  
Hiroshi Tsutsui ◽  
Yoshikazu Miyanaga ◽  
Shin-ichi Abe

In any real environment, noise degrades the performance of Automatic Speech Recognition (ASR) systems. Additionally, in the case of similar pronunciations, it is not easy to achieve high recognition accuracy. From this point of view, our work envisions an enhanced algorithm for processing a speech modulation spectrum, such as Running Spectrum Analysis (RSA). It was also adequately applied to observed speech data. In the envisioned method, a modulation spectrum filtering (MSF) method directly modified the observed cepstral modulation spectrum, obtained by a Fourier transform of the cepstral time frequency. The method and experiments carried out for various passbands had favorable results that showed an improvement of about 1-4% in recognition accuracy compared to conventional methods.
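
The filtering idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the modulation spectrum is the Fourier transform of one cepstral coefficient's values across frames, and that filtering means zeroing modulation-frequency bins outside a passband. The function names, the 100 Hz frame rate, and the 1-16 Hz passband are all hypothetical choices for illustration. A naive DFT from the standard library keeps the example self-contained.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def modulation_bandpass(trajectory, frame_rate_hz, low_hz, high_hz):
    """Band-pass one cepstral coefficient's time trajectory.

    Hypothetical helper: transforms the trajectory to its modulation
    spectrum, zeroes every bin whose modulation frequency lies outside
    [low_hz, high_hz], and transforms back.
    """
    n = len(trajectory)
    spectrum = dft(trajectory)
    for k in range(n):
        # Bins above n/2 mirror the negative frequencies, so fold them.
        freq_hz = min(k, n - k) * frame_rate_hz / n
        if not (low_hz <= freq_hz <= high_hz):
            spectrum[k] = 0j
    return [value.real for value in idft(spectrum)]


# Assumed setup: 100 frames at 100 frames per second (10 ms frame shift).
frames = 100
frame_rate_hz = 100.0

# Synthetic trajectory: constant offset + 4 Hz component + 40 Hz component.
trajectory = [
    5.0
    + math.sin(2 * math.pi * 4 * t / frame_rate_hz)
    + 0.5 * math.sin(2 * math.pi * 40 * t / frame_rate_hz)
    for t in range(frames)
]

# Keep only modulation frequencies in an assumed 1-16 Hz passband.
filtered = modulation_bandpass(trajectory, frame_rate_hz, 1.0, 16.0)

# Only the 4 Hz component should survive the filter.
expected = [math.sin(2 * math.pi * 4 * t / frame_rate_hz) for t in range(frames)]
max_error = max(abs(a - b) for a, b in zip(filtered, expected))
print(f"max deviation from the 4 Hz component: {max_error:.2e}")
```

In this toy setting the constant offset and the 40 Hz component are removed while the 4 Hz component passes through. In a real front end the same operation would be applied to each cepstral coefficient's trajectory, and the passband would be tuned experimentally, as the abstract's mention of "various passbands" suggests.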

