Concept Type Prediction and Responsive Adaptation in a Dialogue System

Responsive adaptation in spoken dialog systems involves a change in dialog system behavior in response to a user or a dialog situation. In this paper we address responsive adaptation in the automatic speech recognition (ASR) module of a spoken dialog system. We hypothesize that information about the content of a user utterance may help improve speech recognition for the utterance. We use a two-step process to test this hypothesis: first, we automatically predict the task-relevant concept types likely to be present in a user utterance using features from the dialog context and from the output of first-pass ASR of the utterance; and then, we adapt the ASR's language model to the predicted content of the user's utterance and run a second pass of ASR. We show that: (1) it is possible to achieve high accuracy in determining presence or absence of particular concept types in a post-confirmation utterance; and (2) 2-pass speech recognition with concept type classification and language model adaptation can lead to improved speech recognition performance for post-confirmation utterances.
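The two-step process described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy unigram models, the rule-based `predict_concept_types` stand-in for the concept type classifier, linear interpolation as the adaptation method, and n-best rescoring as a stand-in for a full second recognition pass.

```python
import math

# Hypothetical unigram language models (word -> probability).
GENERAL_LM = {"yes": 0.3, "no": 0.3, "boston": 0.1, "austin": 0.1,
              "monday": 0.1, "sunday": 0.1}
CONCEPT_LMS = {
    "city": {"boston": 0.5, "austin": 0.5},
    "day": {"monday": 0.5, "sunday": 0.5},
}
FLOOR = 1e-6  # probability floor for unseen words


def predict_concept_types(first_pass_words, last_system_prompt):
    """Toy stand-in for the classifier: uses dialog context (the last
    system prompt) and the first-pass ASR output as features."""
    predicted = set()
    for concept, lm in CONCEPT_LMS.items():
        in_context = concept in last_system_prompt
        in_first_pass = any(word in lm for word in first_pass_words)
        if in_context or in_first_pass:
            predicted.add(concept)
    return predicted


def adapt_lm(predicted, weight=0.5):
    """Linearly interpolate the general LM with the predicted concept LMs.
    The interpolation weight is an assumed value."""
    if not predicted:
        return dict(GENERAL_LM)
    share = weight / len(predicted)
    vocab = set(GENERAL_LM)
    for concept in predicted:
        vocab |= set(CONCEPT_LMS[concept])
    adapted = {}
    for word in vocab:
        prob = (1.0 - weight) * GENERAL_LM.get(word, 0.0)
        prob += sum(share * CONCEPT_LMS[c].get(word, 0.0) for c in predicted)
        adapted[word] = prob
    return adapted


def log_prob(words, lm):
    """Unigram log-probability of a word sequence under lm."""
    return sum(math.log(max(lm.get(w, 0.0), FLOOR)) for w in words)


def second_pass(hypotheses, lm):
    """Pick the hypothesis the adapted LM scores highest."""
    return max(hypotheses, key=lambda hyp: log_prob(hyp, lm))
```

Under the general model the hypotheses `["no", "boston"]` and `["no", "monday"]` score the same. Once a city concept is predicted, the adapted model prefers the first.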


Boosting of Speech Recognition Performance by Language Model Adaptation

2007 IEEE Aerospace Conference, 2007

DOI: 10.1109/aero.2007.352980

Author(s): Filipp Korkmazsky, Oliver Jojic, Bageshree Shevade

Keyword(s): Speech Recognition, Recognition Performance, Language Model, Model Adaptation, Language Model Adaptation


Phonetic Variation Modeling and a Language Model Adaptation for Korean English Code-Switching Speech Recognition

Applied Sciences, 2021, Vol 11 (6), pp. 2866

DOI: 10.3390/app11062866

Author(s): Damheo Lee, Donghyun Kim, Seung Yun, Sanghun Kim

Keyword(s): Speech Recognition, Language Model, Reduction Rate, Code Switching, Training Data, Target Domain, Phonetic Variation, Language Model Adaptation, Imbalanced Training Data, LM Adaptation

In this paper, we propose a new method for code-switching (CS) automatic speech recognition (ASR) in Korean. First, the phonetic variations in English pronunciation spoken by Korean speakers should be considered. Thus, we tried to find a unified pronunciation model based on phonetic knowledge and deep learning. Second, we extracted the CS sentences semantically similar to the target domain and then applied the language model (LM) adaptation to solve the biased modeling toward Korean due to the imbalanced training data. In this experiment, training data were AI Hub (1033 h) in Korean and Librispeech (960 h) in English. As a result, when compared to the baseline, the proposed method improved the error reduction rate (ERR) by up to 11.6% with phonetic variant modeling and by 17.3% when semantically similar sentences were applied to the LM adaptation. If we considered only English words, the word correction rate improved up to 24.2% compared to that of the baseline. The proposed method seems to be very effective in CS speech recognition.


Attention-based Contextual Language Model Adaptation for Speech Recognition

2021

DOI: 10.18653/v1/2021.findings-acl.175

Author(s): Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, ...

Keyword(s): Speech Recognition, Language Model, Model Adaptation, Language Model Adaptation


Efficient language model adaptation for automatic speech recognition of spoken translations

2015

DOI: 10.21437/interspeech.2015-497

Author(s): Joris Pelemans, Tom Vanallemeersch, Kris Demuynck, Hugo Van hamme, Patrick Wambacq

Keyword(s): Speech Recognition, Automatic Speech Recognition, Language Model, Model Adaptation, Language Model Adaptation


An Interactive Way to Acquire Internet Documents for Language Model Adaptation of Speech Recognition Systems

2011 Third International Conference on Intelligent Human-Machine Systems and Cybernetics, 2011

DOI: 10.1109/ihmsc.2011.29

Cited By ~ 1

Author(s): Hong Zhang, Xiangdong Wang, Yueliang Qian, Shouxun Lin

Keyword(s): Speech Recognition, Language Model, Model Adaptation, Language Model Adaptation, Recognition Systems


A Two-Step Neural Dialog State Tracker for Task-Oriented Dialog Processing

Computational Intelligence and Neuroscience, 2018, Vol 2018, pp. 1-11

DOI: 10.1155/2018/5798684

Author(s): A-Yeong Kim, Hyun-Je Song, Seong-Bae Park

Keyword(s): Attention Mechanism, Data Set, Dialog Systems, Dialog System, Fast Training, Proposed Model, Spoken Dialog System, State Tracking, Dialog State Tracking, Task Oriented

Dialog state tracking in a spoken dialog system is the task that tracks the flow of a dialog and identifies accurately what a user wants from the utterance. Since the success of a dialog is influenced by the ability of the system to catch the requirements of the user, accurate state tracking is important for spoken dialog systems. This paper proposes a two-step neural dialog state tracker which is composed of an informativeness classifier and a neural tracker. The informativeness classifier which is implemented by a CNN first filters out noninformative utterances in a dialog. Then, the neural tracker estimates dialog states from the remaining informative utterances. The tracker adopts the attention mechanism and the hierarchical softmax for its performance and fast training. To prove the effectiveness of the proposed model, we do experiments on dialog state tracking in the human-human task-oriented dialogs with the standard DSTC4 data set. Our experimental results prove the effectiveness of the proposed model by showing that the proposed model outperforms the neural trackers without the informativeness classifier, the attention mechanism, or the hierarchical softmax.
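The filter-then-track structure described in this abstract can be sketched as a small pipeline. This shows the control flow only. The keyword rules below are hypothetical stand-ins for the CNN informativeness classifier and the neural tracker, and the slot names `topic` and `info` are assumed for illustration rather than taken from the DSTC4 schema.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]


def track_dialog(
    utterances: List[str],
    is_informative: Callable[[str], bool],
    update_state: Callable[[State, str], State],
) -> Tuple[State, List[str]]:
    """Step 1 drops noninformative utterances; step 2 updates the
    dialog state from the utterances that remain."""
    state: State = {}
    kept: List[str] = []
    for utterance in utterances:
        if not is_informative(utterance):
            continue  # filtered out before it reaches the tracker
        kept.append(utterance)
        state = update_state(state, utterance)
    return state, kept


def toy_is_informative(utterance: str) -> bool:
    """Hypothetical keyword filter standing in for the CNN classifier."""
    text = utterance.lower()
    return any(key in text for key in ("hotel", "museum", "price"))


def toy_update_state(state: State, utterance: str) -> State:
    """Hypothetical rule-based update standing in for the neural tracker."""
    new_state = dict(state)
    text = utterance.lower()
    if "hotel" in text:
        new_state["topic"] = "accommodation"
    if "museum" in text:
        new_state["topic"] = "attraction"
    if "price" in text:
        new_state["info"] = "pricerange"
    return new_state
```

Passing the classifier and tracker as functions keeps the two steps independent, so either one can be swapped out without touching the other.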
