Model adaptation method for recognition of speech with missing frames

This chapter presents an efficient approach to personalized pronunciation assessment of Taiwanese-accented English. The main goal of the study is to detect frequently occurring mispronunciation patterns in Taiwanese-accented English rather than to score English pronunciation directly. The proposed assessment helps to discover a student's personal mispronunciations quickly, so that English teachers can spend more time on teaching or correcting students' pronunciation. In this approach, unsupervised model adaptation is applied to universal acoustic models so that they can recognize the speech of a specific speaker with mispronunciations and a Taiwanese accent. A dynamic sentence selection algorithm, which considers the mutual information of related mispronunciations, is proposed: at each step it chooses the sentence containing the most undetected mispronunciations, so that personalized mispronunciations are extracted quickly. The experimental results show that the proposed unsupervised adaptation approach improves recognition accuracy on Taiwanese-accented English speech by about 2.1%.
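The sentence selection step described in this abstract can be illustrated with a minimal sketch. It is a simplification and not the authors' algorithm: it assumes each candidate sentence is annotated with the set of mispronunciation patterns it can expose, and it greedily picks the sentence covering the most patterns not yet tested. The mutual-information weighting mentioned in the abstract is omitted, and all names and pattern labels are hypothetical.

```python
def pick_next_sentence(sentences, tested):
    """Return the id of the sentence covering the most untested patterns.

    sentences: dict mapping sentence id -> set of mispronunciation patterns
               the sentence can expose (hypothetical labels).
    tested:    set of patterns already checked for this speaker.
    Returns None when no sentence adds anything new.
    """
    best_id, best_gain = None, 0
    for sid in sorted(sentences):  # sorted for deterministic tie-breaking
        gain = len(sentences[sid] - tested)
        if gain > best_gain:
            best_id, best_gain = sid, gain
    return best_id


def selection_order(sentences):
    """Greedily order sentences until no untested pattern remains."""
    tested, order = set(), []
    while True:
        sid = pick_next_sentence(sentences, tested)
        if sid is None:
            break
        order.append(sid)
        tested |= sentences[sid]
    return order


# Hypothetical pattern labels, for illustration only.
corpus = {
    "s1": {"th->s", "v->b"},
    "s2": {"th->s", "r->l", "z->s"},
    "s3": {"v->b"},
}
print(selection_order(corpus))  # prints ['s2', 's1']
```

In this toy run, "s3" is never selected because every pattern it could expose has already been covered by the time it is considered.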


Instantaneous model adaptation method for reverberant speech recognition

Electronics Letters ◽

10.1049/el.2014.4152 ◽

2015 ◽

Vol 51 (6) ◽

pp. 528-530 ◽

Cited By ~ 2

Author(s):

Sung Min Ban ◽

Hyung Soon Kim

Keyword(s):

Speech Recognition ◽

Model Adaptation ◽

Reverberant Speech ◽

Adaptation Method ◽

Reverberant Speech Recognition


Model adaptation apparatus, model adaptation method, storage medium, and pattern recognition apparatus

The Journal of the Acoustical Society of America ◽

10.1121/1.2434321 ◽

2007 ◽

Vol 121 (1) ◽

pp. 27

Author(s):

Hironaga Nakatsuka

Keyword(s):

Pattern Recognition ◽

Model Adaptation ◽

Storage Medium ◽

Adaptation Method


A Pitch-Contour Model Adaptation Method for Integrated Synthesis of Mandarin, Min-Nan, and Hakka Speech

2005 9th International Workshop on Cellular Neural Networks and Their Applications ◽

10.1109/cnna.2005.1543193 ◽

2005 ◽

Author(s):

Hung-Yan Gu ◽

Hai-Ching Tsai

Keyword(s):

Pitch Contour ◽

Model Adaptation ◽

Contour Model ◽

Adaptation Method


MODIFIED QUANTILE BASED ONLINE NOISE ADAPTATION METHOD FOR A ROBUST SPEECH RECOGNITION INTERFACE

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001407005661 ◽

2007 ◽

Vol 21 (04) ◽

pp. 759-772

Author(s):

HEUNGKYU LEE ◽

JUNE KIM

Keyword(s):

Speech Recognition ◽

Estimation Method ◽

Estimation Procedure ◽

Gaussian Mixture ◽

Noise Estimation ◽

Noise Model ◽

Robust Speech Recognition ◽

Model Adaptation ◽

Speech Database ◽

Adaptation Method

This paper proposes an online noise model adaptation technique for a robust speech recognition interface in real car environments. The technique uses a modified quantile-based noise estimation method for feature compensation of noisy speech, with the compensation based on a Gaussian mixture model. The method is designed as an active online model adaptation scheme that copes with varying environmental noise conditions and improves speech recognition accuracy. Compensation is performed in the logarithmic filter-bank energy domain, and the modified quantile-based noise estimation method, which uses a beta-order harmonic mean, is employed in the online noise estimation procedure. The method is evaluated on the Aurora 2 speech database, where it gives more robust results than the other algorithms compared.
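As a rough illustration of the quantile idea this abstract builds on, the sketch below estimates a per-band noise level as a fixed quantile of the log filter-bank energies observed over time. It shows only a plain quantile estimator: the paper's modification with a beta-order harmonic mean and the Gaussian mixture compensation are not reproduced, and the function name and the default quantile of 0.5 are assumptions made for the example.

```python
def quantile_noise_estimate(frames, q=0.5):
    """Estimate a per-band noise level as the q-th quantile over time.

    frames: list of frames, each a list of log filter-bank energies
            (one value per band).
    q:      quantile in [0, 1]; 0.5 (the median) is an assumed default.
    """
    if not frames:
        raise ValueError("need at least one frame")
    n_bands = len(frames[0])
    estimate = []
    for band in range(n_bands):
        values = sorted(frame[band] for frame in frames)
        index = int(q * (len(values) - 1))  # floor to a valid index
        estimate.append(values[index])
    return estimate


# Toy input: 5 frames, 2 bands; the third frame contains a loud burst.
frames = [
    [1.0, 2.0],
    [1.2, 2.1],
    [9.0, 8.0],
    [0.9, 1.9],
    [1.1, 2.2],
]
print(quantile_noise_estimate(frames))  # prints [1.1, 2.1]
```

The loud third frame does not pull the estimate upward, which is the property that makes quantile-based estimators attractive when speech and noise are mixed. An online variant could apply the same function to a sliding buffer of recent frames.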


Improved hidden Markov model adaptation method for reduced frame rate speech recognition

Electronics Letters ◽

10.1049/el.2017.0458 ◽

2017 ◽

Vol 53 (14) ◽

pp. 962-964 ◽

Cited By ~ 4

Author(s):

L.M. Lee ◽

H.H. Le ◽

F.R. Jean

Keyword(s):

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Frame Rate ◽

Model Adaptation ◽

Adaptation Method


Unsupervised adaptation of PLDA models for broadcast diarization

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-019-0167-7 ◽

2019 ◽

Vol 2019 (1) ◽

Author(s):

Ignacio Viñals ◽

Alfonso Ortega ◽

Jesús Villalba ◽

Antonio Miguel ◽

Eduardo Lleida

Keyword(s):

Mean Shift ◽

Basic Block ◽

Model Adaptation ◽

Agglomerative Hierarchical Clustering ◽

Linear Discriminant ◽

Relative Improvement ◽

Data Variability ◽

Fully Bayesian ◽

Novel Model ◽

Adaptation Method

We present a novel model adaptation approach to deal with data variability for speaker diarization in a broadcast environment. Expensive human-annotated data can be used to mitigate the domain mismatch by means of supervised model adaptation approaches. By contrast, we propose an unsupervised adaptation method that needs no in-domain labeled data, only the recording being diarized. We rely on an inner adaptation block which combines Agglomerative Hierarchical Clustering (AHC) and Mean-Shift (MS) clustering techniques with a Fully Bayesian Probabilistic Linear Discriminant Analysis (PLDA) to produce pseudo-speaker labels suitable for model adaptation. We propose multiple adaptation approaches based on this basic block, including unsupervised and semi-supervised ones. Our proposed solutions, evaluated on the Multi-Genre Broadcast 2015 (MGB) dataset, show a significant improvement over the baseline (16% relative) and also outperform a supervised adaptation proposal with low resources (9% relative improvement). Furthermore, the proposed unsupervised adaptation is fully compatible with a supervised one: the joint use of both adaptation techniques (supervised and unsupervised) shows a 13% relative improvement over supervised adaptation alone.
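The role AHC plays in producing pseudo-speaker labels can be sketched as follows. This is a minimal illustration under strong assumptions, not the authors' adaptation block: it uses average-linkage clustering with Euclidean distance on toy 2-D points in place of PLDA scoring on speaker embeddings, it leaves out the Mean-Shift stage entirely, and the function name and threshold value are hypothetical.

```python
import math


def ahc_pseudo_labels(points, threshold):
    """Average-linkage agglomerative clustering with a distance threshold.

    points:    list of equal-length numeric tuples (stand-ins for
               speaker embeddings).
    threshold: merging stops once the closest pair of clusters is
               farther apart than this (hypothetical tuning value).
    Returns one integer pseudo-label per input point.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        pairs = [
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        ]
        dist, x, y = min(pairs)
        if dist > threshold:
            break
        # Merge the closest pair; x < y, so deleting y keeps x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy 2-D "embeddings": two tight groups far apart.
segments = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
print(ahc_pseudo_labels(segments, threshold=1.0))  # prints [0, 0, 1, 1]
```

The resulting labels could then serve as targets for adapting a model to the recording, which is the general purpose the abstract assigns to its pseudo-speaker labels.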
