robust speech recognition
Recently Published Documents


TOTAL DOCUMENTS

1134
(FIVE YEARS 49)

H-INDEX

43
(FIVE YEARS 3)

2021
Author(s): Lujun Li, Ludwig Kurzinger, Tobias Watzel, Gerhard Rigoll

Author(s): Yong Lü, Han Lin, Pingping Wu, Yitao Chen

Abstract: In this paper, we propose a novel feature compensation algorithm based on independent noise estimation. It employs a Gaussian mixture model (GMM) with fewer Gaussian components to rapidly estimate the noise parameters from the noisy speech and to monitor noise variation. The estimated noise model is combined with a GMM with sufficient Gaussian mixtures to produce the noisy GMM used for clean speech estimation, so that parameters are updated if and only if the noise varies. Experimental results show that the proposed algorithm achieves recognition accuracy similar to that of traditional GMM-based feature compensation while significantly reducing the computational cost, making it more useful for resource-limited mobile devices.
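The "update only when the noise changes" idea in this abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes scalar log-spectral features, takes the noise mean as already estimated (the paper estimates it with a smaller GMM, which is not shown), combines clean and noise means with the common log-add approximation, and reuses the clean variances unchanged. The class name `CachedCompensator`, the tolerance `tol`, and all numeric values are hypothetical.

```python
import math


def log_add(a, b):
    """Return log(exp(a) + exp(b)) in a numerically stable way."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


class CachedCompensator:
    """Hypothetical 1-D compensator that rebuilds the noisy GMM only on noise drift."""

    def __init__(self, weights, clean_means, variances, tol=0.5):
        self.weights = weights
        self.clean_means = clean_means
        self.variances = variances  # assumption: noisy variances equal clean ones
        self.tol = tol              # assumed drift threshold for a rebuild
        self.noise_mean = None
        self.noisy_means = None
        self.rebuilds = 0

    def update_noise(self, noise_mean):
        """Recombine clean and noise models only if the noise estimate moved by more than tol."""
        if self.noise_mean is None or abs(noise_mean - self.noise_mean) > self.tol:
            self.noise_mean = noise_mean
            self.noisy_means = [log_add(m, noise_mean) for m in self.clean_means]
            self.rebuilds += 1

    def compensate(self, y):
        """MMSE-style estimate: subtract the posterior-weighted mean shift from y."""
        likes = [w * gaussian_pdf(y, m, v)
                 for w, m, v in zip(self.weights, self.noisy_means, self.variances)]
        total = sum(likes)
        # Fall back to the prior weights if every likelihood underflows to zero.
        posts = [l / total for l in likes] if total > 0.0 else list(self.weights)
        shift = sum(p * (my - mx)
                    for p, my, mx in zip(posts, self.noisy_means, self.clean_means))
        return y - shift


comp = CachedCompensator([0.5, 0.5], [2.0, 6.0], [1.0, 1.0], tol=0.5)
comp.update_noise(1.0)   # first estimate: builds the noisy GMM
comp.update_noise(1.2)   # small drift: cached noisy GMM is reused
comp.update_noise(3.0)   # large drift: noisy GMM is rebuilt
estimate = comp.compensate(5.0)
```

Under these assumptions, the computational saving comes from skipping the model combination step whenever the noise estimate stays within `tol` of the cached value.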


Author(s): Risanuri Hidayat, Anggun Winursito

Current research on speech recognition systems is directed toward noise-resistant systems. Mel Frequency Cepstral Coefficients (MFCC) extraction is a popular feature extraction method in speech recognition. In this paper, the MFCC's sensitivity to noise interference is the main motivation for building a robust speech recognition system. The system was developed by improving denoising performance using a wavelet transform. Modifications were made by analyzing the weakness of the wavelet denoising process in a recognition system that uses the MFCC method. The analysis was conducted at one of the MFCC stages, the Fast Fourier Transform (FFT) stage. The proposed method applies wavelet denoising only to the noise-related data identified by the analysis of the FFT stage. The study used speech data consisting of eleven isolated English words with added noise of several different characteristics. Results showed that the proposed method achieved better accuracy than conventional wavelet denoising methods at signal-to-noise ratios (SNR) of 10 dB, 15 dB, and 20 dB using a Fejer-Korovkin 6 wavelet. The largest accuracy increase was at an SNR of 15 dB, with a rise of 4.63%, followed by 3.96% at 20 dB and 2.3% at 10 dB. The performance of the proposed method was then compared with other methods. The results show that the proposed method has the best performance on clean speech and on noisy speech at SNRs of 10 dB, 15 dB, and 20 dB.
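The evaluation above uses speech with noise added at fixed SNRs. As a minimal sketch of how such test signals are commonly prepared, the code below scales a noise signal so that the mixture has a requested SNR, using the standard definition SNR = 10 log10(P_signal / P_noise). It does not reproduce the study's data, noise types, or wavelet processing; the function names and the synthetic sine and random-noise signals are assumptions for illustration.

```python
import math
import random


def power(signal):
    """Mean power (average of squared samples) of a sequence."""
    return sum(s * s for s in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale noise so the clean-to-noise power ratio equals snr_db, then add it."""
    target_noise_power = power(clean) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB, treating (noisy - clean) as the noise component."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(residual))


# Synthetic stand-ins: a 440 Hz tone sampled at 16 kHz and seeded Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * 440.0 * t / 16000.0) for t in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]

# One noisy version per SNR condition named in the abstract.
noisy_by_snr = {snr: mix_at_snr(clean, noise, snr) for snr in (10, 15, 20)}
```

Each entry of `noisy_by_snr` could then be passed through a denoising and feature extraction pipeline to compare accuracy per SNR condition.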

