maximum likelihood linear regression
Recently Published Documents


Total documents: 22 (five years: 3)
H-index: 4 (five years: 0)

2020, Vol 30 (1), pp. 165-179
Author(s): A. Kumar, R.K. Aggarwal

Abstract This paper implements a continuous Hindi Automatic Speech Recognition (ASR) system using a proposed integrated feature vector with Recurrent Neural Network (RNN) based Language Modeling (LM). The system also implements speaker adaptation using Maximum Likelihood Linear Regression (MLLR) and Constrained Maximum Likelihood Linear Regression (C-MLLR). It is discriminatively trained with the Maximum Mutual Information (MMI) and Minimum Phone Error (MPE) techniques, with 256 Gaussian mixtures per Hidden Markov Model (HMM) state. The baseline system is trained on a phonetically rich Hindi dataset. The results show that discriminative training improves baseline system performance by up to 3%. A further improvement of ~7% is recorded by applying the RNN LM. The proposed Hindi ASR system shows significant performance improvement over other current state-of-the-art techniques.
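The abstract does not give the adaptation equations. As a rough illustration of what an MLLR mean transform does, the sketch below estimates a single global affine transform W so that each adapted mean is W applied to [1, mu]. It assumes all Gaussians share one covariance matrix, in which case the maximum likelihood estimate reduces to a weighted least-squares solve. The function names and that simplification are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def estimate_global_mllr(obs, gammas, means):
    """Estimate one global MLLR mean transform W of shape (d, d+1).

    obs:    (T, d) adaptation frames
    gammas: (T, M) per-frame Gaussian occupation posteriors
    means:  (M, d) speaker-independent Gaussian means
    Assumes a covariance shared by all Gaussians (illustrative simplification).
    """
    num_gauss, _ = means.shape
    # Extended mean vectors xi_m = [1, mu_m]
    xi = np.hstack([np.ones((num_gauss, 1)), means])
    occ = gammas.sum(axis=0)                  # total occupancy per Gaussian
    # Z = sum over t, m of gamma_m(t) * o_t * xi_m^T
    Z = (gammas.T @ obs).T @ xi
    # G = sum over m of occ_m * xi_m * xi_m^T (symmetric)
    G = (xi * occ[:, None]).T @ xi
    # W = Z @ inv(G), computed with a linear solve
    return np.linalg.solve(G, Z.T).T

def apply_mllr(means, W):
    """Return adapted means mu' = W @ [1, mu] for every Gaussian."""
    xi = np.hstack([np.ones((means.shape[0], 1)), means])
    return xi @ W.T
```

The first column of W plays the role of the bias term and the remaining columns form the linear part. C-MLLR differs in that the same affine form is applied to means and variances together, which is equivalent to transforming the feature vectors.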


Author(s): Shweta Ghai, Rohit Sinha

An algorithm for adaptive Mel frequency cepstral coefficients (MFCC) feature truncation is proposed to improve automatic speech recognition (ASR) performance under acoustically mismatched conditions. Using the relationship found between MFCC base feature truncation and the degree of acoustic mismatch of speech signals with respect to the recognition models, the proposed algorithm performs utterance-specific MFCC feature truncation for test signals to address their acoustic mismatch in the context of ASR. Without any prior knowledge about the speaker of the test utterance, the proposed technique gives 38% (on a connected-digit recognition task) and 36% (on a continuous speech recognition task) relative improvement over the baseline in ASR performance for children's speech on models trained on adult speech. This improvement is also found to be additive to the improvements obtained with vocal tract length normalization and/or constrained maximum likelihood linear regression. The generality and effectiveness of the algorithm are also validated for automatic recognition of children's and adults' speech under matched and mismatched conditions.


2013, Vol 823, pp. 618-621
Author(s): Hui Li, Yu Hong Dong

This paper adopts the GMM-UBM (Gaussian Mixture Model-Universal Background Model) approach to model a speaker recognition system when training data are scarce. For adaptation in speaker recognition modeling and parameter estimation, the focus is on how to improve the recognition rate. On the modeling side, we refine the conventional MAP (Maximum A Posteriori) method for obtaining the speaker model, apply the MLLR (Maximum Likelihood Linear Regression) and EigenVoice adaptation methods used in speech recognition to speaker recognition modeling, and compare the results with the MAP method.
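The abstract does not say which form of MAP adaptation is used or how it is refined. For orientation only, the sketch below shows the commonly used relevance-factor MAP update of GMM-UBM means, in which each component mean is interpolated between the UBM mean and the data mean according to how many frames the component accounts for. The function name, the default relevance factor of 16, and the mean-only update are assumptions, not details taken from the paper.

```python
import numpy as np

def map_adapt_means(ubm_means, posteriors, frames, relevance=16.0):
    """Relevance-factor MAP adaptation of GMM means (mean-only, illustrative).

    ubm_means:  (M, d) UBM component means
    posteriors: (T, M) per-frame component posteriors under the UBM
    frames:     (T, d) enrollment feature vectors
    relevance:  assumed relevance factor controlling trust in the UBM prior
    """
    n = posteriors.sum(axis=0)                       # soft frame count per component
    first_order = posteriors.T @ frames              # posterior-weighted frame sums
    # Data mean per component; guard against division by zero for unseen components
    data_means = first_order / np.maximum(n, 1e-10)[:, None]
    # Data-dependent interpolation weight: more frames means more trust in the data
    alpha = (n / (n + relevance))[:, None]
    return alpha * data_means + (1.0 - alpha) * ubm_means
```

Components that see no enrollment frames keep their UBM means, which is why this style of adaptation is often preferred when data are limited.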


2012, Vol 195-196, pp. 859-863
Author(s): Fei Ran Wu, Xin Xin Wang, Zhi Qian Ye

Following attempts to apply a speech recognition system to a radiology information system (RIS), we focus on introducing adaptation technology into the system. To be consistent with the practical use of RIS, we aim to solve the problem of how to adapt the existing model and add new content related to RIS reports to it. This paper describes an acoustic model-based speaker and content adaptation scheme using a combined method that introduces a simplified maximum likelihood linear regression (MLLR) module into incremental maximum a posteriori (MAP) processing. We designed and tested a procedure using our own adaptation data collected from hospital diagnostic reports. Finally, the efficiency of this method is supported by experimental results.

