Are Neural Open-Domain Dialog Systems Robust to Speech Recognition Errors in the Dialog History? An Empirical Study

Author(s):  
Karthik Gopalakrishnan ◽  
Behnam Hedayatnia ◽  
Longshaokan Wang ◽  
Yang Liu ◽  
Dilek Hakkani-Tür
Author(s):  
Ronnie W. Smith ◽  
D. Richard Hipp

Every natural language parser will sometimes misunderstand its input. Misunderstandings can arise from speech recognition errors or inadequacies in the language grammar, or they may result from an input that is ungrammatical or ambiguous. Whatever their cause, misunderstandings can jeopardize the success of the larger system of which the parser is a component. For this reason, it is important to reduce the number of misunderstandings to a minimum.

In a dialog system, it is possible to reduce the number of misunderstandings by requiring the user to verify each utterance. Some speech dialog systems implement verification by requiring the user to speak every utterance twice, or to confirm a word-by-word readback of every utterance. Such verification is effective at reducing errors that result from word misrecognitions, but does nothing to abate misunderstandings that result from other causes. Furthermore, verification of all utterances can be needlessly wearisome to the user, especially if the system is working well.

A superior approach is to have the spoken language system verify the deduced meaning of an input only under circumstances where the accuracy of the deduced meaning is seriously in doubt, or correct understanding is essential to the success of the dialog. The verification is accomplished through the use of a verification subdialog, a short sequence of conversational exchanges intended to confirm or reject the hypothesized meaning. The following example of a verification subdialog will suffice to illustrate the idea.

. . .
computer: What is the LED displaying?
user: The same thing.
computer: Did you mean to say that the LED is displaying the same thing?
user: Yes.
. . .

As will be further seen below, selective verification via a subdialog results in an unintrusive, human-like exchange between user and machine.
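
The selective verification rule described above (verify only when the deduced meaning is in doubt, or when correct understanding is essential) can be sketched roughly as follows. This is only an illustrative sketch, not the system's actual implementation: the `Interpretation` class, the numeric `confidence` score, the `critical` flag, and the 0.5 threshold are all assumptions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Interpretation:
    """Hypothetical container for a parser's output (not from the source)."""
    meaning: str       # deduced meaning, phrased as a clause
    confidence: float  # assumed confidence estimate in [0, 1]
    critical: bool     # assumed flag: is correct understanding essential here?


def needs_verification(interp: Interpretation, threshold: float = 0.5) -> bool:
    """Verify only if the meaning is in doubt or accuracy is critical.

    The threshold value is an arbitrary assumption for this sketch.
    """
    return interp.critical or interp.confidence < threshold


def verification_prompt(interp: Interpretation) -> Optional[str]:
    """Return the opening question of a verification subdialog, or None."""
    if needs_verification(interp):
        return f"Did you mean to say that {interp.meaning}?"
    return None


# An uncertain interpretation triggers a verification question.
doubtful = Interpretation("the LED is displaying the same thing", 0.3, False)
print(verification_prompt(doubtful))

# A confident, non-critical interpretation is accepted without verification.
confident = Interpretation("the switch is up", 0.95, False)
print(verification_prompt(confident))
```

With these assumed inputs, only the first utterance is verified; the second passes through without interrupting the user, which is the behavior that makes selective verification less wearisome than verifying every utterance.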
A recent enhancement to the Circuit Fix-it Shop dialog system is a subsystem that uses a verification subdialog to verify the meaning of the user’s utterance only when the meaning is in doubt or when accuracy is critical for the success of the dialog. Notable features of this new verification subsystem include the following.


2006 ◽  
Vol 32 (3) ◽  
pp. 417-438 ◽  
Author(s):  
Diane Litman ◽  
Julia Hirschberg ◽  
Marc Swerts

This article focuses on the analysis and prediction of corrections, defined as turns where a user tries to correct a prior error made by a spoken dialogue system. We describe our procedure for labeling various correction types and statistical analyses of their features in a corpus collected from a train information spoken dialogue system. We then present results of machine-learning experiments designed to identify user corrections of speech recognition errors. We investigate the predictive power of features automatically computable from the prosody of the turn, the speech recognition process, experimental conditions, and the dialogue history. Our best-performing features reduce classification error from baselines of 25.70–28.99% to 15.72%.
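
To put the reported error rates in perspective, the relative reduction against each end of the baseline range can be computed directly from the figures in the abstract. The helper name `relative_error_reduction` is introduced here for illustration only, and treating the two ends of the range as separate baselines is an assumption about how the range should be read.

```python
def relative_error_reduction(baseline_pct: float, new_pct: float) -> float:
    """Relative reduction in classification error, as a percentage."""
    return 100.0 * (baseline_pct - new_pct) / baseline_pct


BEST_ERROR = 15.72  # best-performing feature set, from the abstract

for baseline in (25.70, 28.99):  # ends of the reported baseline range
    absolute = baseline - BEST_ERROR
    relative = relative_error_reduction(baseline, BEST_ERROR)
    print(f"{baseline:.2f}% -> {BEST_ERROR:.2f}%: "
          f"{absolute:.2f} points absolute, {relative:.1f}% relative")
```

Under this reading, the absolute reduction is between 9.98 and 13.27 percentage points, which corresponds to a relative reduction of roughly 38.8% to 45.8%.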

