A Two-Step Neural Dialog State Tracker for Task-Oriented Dialog Processing

Dialog state tracking in a spoken dialog system is the task of following the flow of a dialog and accurately identifying what the user wants from each utterance. Because the success of a dialog depends on the system's ability to capture the user's requirements, accurate state tracking is important for spoken dialog systems. This paper proposes a two-step neural dialog state tracker composed of an informativeness classifier and a neural tracker. The informativeness classifier, implemented as a CNN, first filters out noninformative utterances in a dialog. The neural tracker then estimates dialog states from the remaining informative utterances. The tracker adopts an attention mechanism and hierarchical softmax for better performance and faster training. To evaluate the proposed model, we conduct experiments on dialog state tracking in human-human task-oriented dialogs with the standard DSTC4 data set. The experimental results show that the proposed model outperforms neural trackers that lack the informativeness classifier, the attention mechanism, or the hierarchical softmax.
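The two-step flow described in this abstract (filter first, then track) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `INFORMATIVE_CUES`, `is_informative`, `toy_tracker`, and `two_step_track` are hypothetical names, and the keyword rules merely stand in for the CNN classifier and the attention-based neural tracker that the paper actually uses.

```python
from typing import Callable, Dict, List

# Hypothetical cue words. The paper trains a CNN for this decision;
# a keyword lookup is used here only to keep the sketch self-contained.
INFORMATIVE_CUES = {"hotel", "restaurant", "museum", "price", "near"}


def is_informative(utterance: str) -> bool:
    """Step 1 stand-in: decide whether an utterance carries state information."""
    tokens = set(utterance.lower().split())
    return bool(tokens & INFORMATIVE_CUES)


def toy_tracker(utterances: List[str]) -> Dict[str, str]:
    """Step 2 stand-in: derive a dialog state from the informative utterances."""
    state: Dict[str, str] = {}
    for utterance in utterances:
        tokens = utterance.lower().split()
        if "hotel" in tokens:
            state["topic"] = "accommodation"
        if "restaurant" in tokens:
            state["topic"] = "food"
    return state


def two_step_track(
    dialog: List[str],
    classifier: Callable[[str], bool] = is_informative,
    tracker: Callable[[List[str]], Dict[str, str]] = toy_tracker,
) -> Dict[str, str]:
    """Filter out noninformative utterances, then track state on the rest."""
    informative = [utterance for utterance in dialog if classifier(utterance)]
    return tracker(informative)


dialog = ["uh huh", "I need a hotel near the river", "okay thanks"]
print(two_step_track(dialog))
```

Passing the classifier and tracker as parameters keeps the two steps independently replaceable, which mirrors the ablation-style comparison the abstract reports.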

Dialog History Construction with Long-Short Term Memory for Robust Generative Dialog State Tracking

Dialogue & Discourse ◽

10.5087/dad.2016.302 ◽

2016 ◽

Vol 7 (3) ◽

pp. 47-64 ◽

Cited By ~ 2

Author(s):

Byung-Jun Lee ◽

Kee-Eung Kim

Keyword(s):

Speech Processing ◽

Short Term Memory ◽

Dialog Systems ◽

Dialog System ◽

Gradient Descent Algorithm ◽

Core Areas ◽

Overall Performance ◽

Long Short Term Memory ◽

State Tracking ◽

Dialog State Tracking

One of the crucial components of a dialog system is the dialog state tracker, which infers the user’s intention from preliminary speech processing. Since the overall performance of the dialog system is heavily affected by that of the dialog state tracker, it has been one of the core areas of research on dialog systems. In this paper, we present a dialog state tracker that combines a generative probabilistic model of dialog state tracking with a recurrent neural network for encoding important aspects of the dialog history. We describe a two-step gradient descent algorithm that optimizes the tracker with a complex loss function. We demonstrate that this approach yields a dialog state tracker that performs competitively with the top-performing trackers that participated in the first and second Dialog State Tracking Challenges.

The Dialog State Tracking Challenge Series: A Review

Dialogue & Discourse ◽

10.5087/dad.2016.301 ◽

2016 ◽

Vol 7 (3) ◽

pp. 4-33 ◽

Cited By ~ 16

Author(s):

Jason D. Williams ◽

Antoine Raux ◽

Matthew Henderson

Keyword(s):

Speech Recognition ◽

Research Area ◽

The State ◽

Evaluation Metrics ◽

Common Resources ◽

Discriminative Models ◽

Dialog System ◽

Spoken Dialog System ◽

State Tracking ◽

Dialog State Tracking

In a spoken dialog system, dialog state tracking refers to the task of correctly inferring the state of the conversation -- such as the user's goal -- given all of the dialog history up to that turn. Dialog state tracking is crucial to the success of a dialog system, yet until recently there were no common resources, hampering progress. The Dialog State Tracking Challenge series of 3 tasks introduced the first shared testbed and evaluation metrics for dialog state tracking, and has underpinned three key advances in dialog state tracking: the move from generative to discriminative models; the adoption of discriminative sequential techniques; and the incorporation of the speech recognition results directly into the dialog state tracker. This paper reviews this research area, covering both the challenge tasks themselves and summarizing the work they have enabled.

Task-oriented spoken dialog system for second-language learning

CALL communities and culture – short papers from EUROCALL 2016 ◽

10.14705/rpnet.2016.eurocall2016.568 ◽

2016 ◽

pp. 237-242 ◽

Cited By ~ 2

Author(s):

Oh-Woog Kwon ◽

Young-Kil Kim ◽

Yunkeun Lee

Keyword(s):

Second Language ◽

Language Learning ◽

Second Language Learning ◽

Dialog System ◽

Spoken Dialog System ◽

Task Oriented

The Dialog State Tracking Challenge Series

AI Magazine ◽

10.1609/aimag.v35i4.2558 ◽

2014 ◽

Vol 35 (4) ◽

pp. 121-124 ◽

Cited By ~ 5

Author(s):

Jason D. Williams ◽

Matthew Henderson ◽

Antoine Raux ◽

Blaise Thomson ◽

Alan Black ◽

...

Keyword(s):

Research Community ◽

Spoken Dialog Systems ◽

Dialog Systems ◽

New Methods ◽

State Tracking ◽

Dialog State Tracking

In spoken dialog systems, dialog state tracking refers to the task of correctly inferring the user's goal at a given turn, given all of the dialog history up to that turn. The Dialog State Tracking Challenge is a research community challenge task that has run for three rounds. The challenge has given rise to a host of new methods for dialog state tracking, and also to a deeper understanding of the problem itself, including methods for evaluation.

Spectral decomposition method of dialog state tracking via collective matrix factorization

Dialogue & Discourse ◽

10.5087/dad.2016.304 ◽

2016 ◽

Vol 7 (3) ◽

pp. 34-46

Author(s):

Julien Perez

Keyword(s):

Matrix Factorization ◽

The State ◽

Computationally Efficient ◽

Reward Function ◽

Dialog Management ◽

Dialog System ◽

Dependent Variables ◽

Novel Method ◽

State Tracking ◽

Dialog State Tracking

The task of dialog management is commonly decomposed into two sequential subtasks: dialog state tracking and dialog policy learning. In an end-to-end dialog system, the aim of dialog state tracking is to accurately estimate the true dialog state from noisy observations produced by the speech recognition and natural language understanding modules. The state tracking task is primarily meant to support a dialog policy. From a probabilistic perspective, this is achieved by maintaining a posterior distribution over hidden dialog states composed of a set of context-dependent variables. Once a dialog policy is learned, it strives to select an optimal dialog act given the estimated dialog state and a defined reward function. This paper introduces a novel method of dialog state tracking based on a bilinear algebraic decomposition model that provides an efficient inference schema through collective matrix factorization. We evaluate the proposed approach on the second Dialog State Tracking Challenge (DSTC-2) dataset and show that the proposed tracker gives encouraging results compared to the state-of-the-art trackers that participated in this standard benchmark. Finally, we show that the prediction schema is computationally efficient in comparison to previous approaches.
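As a rough illustration of what "maintaining a posterior distribution over hidden dialog states" means, the sketch below applies a generic Bayes-rule update to a belief over candidate slot values. It is not the collective matrix factorization method this paper proposes; the function name `update_belief`, the slot values, and the likelihood numbers are all hypothetical.

```python
from typing import Dict


def update_belief(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Generic Bayes update: posterior(s) is proportional to likelihood(s) * prior(s).

    `likelihood[s]` is the assumed probability of the noisy observation
    (for example, an ASR/NLU hypothesis) given that the true state is `s`.
    States missing from `likelihood` are treated as having likelihood 0.
    """
    unnormalized = {state: prior[state] * likelihood.get(state, 0.0) for state in prior}
    total = sum(unnormalized.values())
    if total == 0.0:
        # The observation is incompatible with every state: keep the prior.
        return dict(prior)
    return {state: value / total for state, value in unnormalized.items()}


# Hypothetical belief over a "price range" slot before and after one user turn.
prior = {"cheap": 0.5, "expensive": 0.5}
likelihood = {"cheap": 0.8, "expensive": 0.2}
posterior = update_belief(prior, likelihood)
print(posterior)
```

A downstream dialog policy would then choose its next dialog act from this posterior rather than from a single hard guess.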

Learning Conversational Systems that Interleave Task and Non-Task Content

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/589 ◽

2017 ◽

Cited By ~ 1

Author(s):

Zhou Yu ◽

Alexander Rudnicky ◽

Alan Black

Keyword(s):

Customer Service ◽

Service Providers ◽

Dialog Systems ◽

Dialog System ◽

Task Success ◽

Personal Assistants ◽

Task Oriented ◽

Smooth Transitions ◽

Oriented System ◽

Task Content

Task-oriented dialog systems have been applied in various tasks, such as automated personal assistants, customer service providers and tutors. These systems work well when users have clear and explicit intentions that are well-aligned to the systems' capabilities. However, they fail if users' intentions are not explicit. To address this shortcoming, we propose a framework to interleave non-task content (i.e., everyday social conversation) into task conversations. When the task content fails, the system can still keep the user engaged with the non-task content. We trained a policy using reinforcement learning algorithms to promote long-turn conversation coherence and consistency, so that the system can have smooth transitions between task and non-task content. To test the effectiveness of the proposed framework, we developed a movie promotion dialog system. Experiments with human users indicate that a system that interleaves social and task content achieves a better task success rate and is also rated as more engaging compared to a pure task-oriented system.

Dialog State Tracking for Unseen Values Using an Extended Attention Mechanism

Lecture Notes in Electrical Engineering - 9th International Workshop on Spoken Dialogue System Technology ◽

10.1007/978-981-13-9443-0_7 ◽

2019 ◽

pp. 77-89 ◽

Cited By ~ 1

Author(s):

Takami Yoshida ◽

Kenji Iwata ◽

Hiroshi Fujimura ◽

Masami Akamine

Keyword(s):

Attention Mechanism ◽

State Tracking ◽

Dialog State Tracking

Get The Point of My Utterance! Learning Towards Effective Responses with Multi-Head Attention Mechanism

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/614 ◽

2018 ◽

Cited By ~ 14

Author(s):

Chongyang Tao ◽

Shen Gao ◽

Mingyue Shang ◽

Wei Wu ◽

Dongyan Zhao ◽

...

Keyword(s):

Attention Mechanism ◽

Experimental Results ◽

Dialogue Systems ◽

Dialog Systems ◽

Proposed Model ◽

Semantic Aspect

The attention mechanism has become a popular and widely used component in sequence-to-sequence models. However, previous research on neural generative dialogue systems always generates universal responses, and the attention distribution learned by the model always attends to the same semantic aspect. To solve this problem, we propose a novel Multi-Head Attention Mechanism (MHAM) for generative dialog systems, which aims at capturing multiple semantic aspects from the user utterance. Further, a regularizer is formulated to force different attention heads to concentrate on certain aspects. The proposed mechanism leads to more informative, diverse, and relevant generated responses. Experimental results show that our proposed model outperforms several strong baselines.

MA-DST: Multi-Attention-Based Scalable Dialog State Tracking

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6322 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8107-8114

Author(s):

Adarsh Kumar ◽

Peter Ku ◽

Anuj Goyal ◽

Angeliki Metallinou ◽

Dilek Hakkani-Tur

Keyword(s):

Natural Language ◽

State Of The Art ◽

Core Component ◽

Full Data ◽

Cross Domain ◽

Natural Language Interface ◽

State Tracking ◽

Dialog State Tracking ◽

Task Oriented ◽

Multiple Granularities

Task-oriented dialog agents provide a natural language interface for users to complete their goals. Dialog State Tracking (DST), which is often a core component of these systems, tracks the system's understanding of the user's goal throughout the conversation. To enable accurate multi-domain DST, the model needs to encode dependencies between past utterances and slot semantics and understand the dialog context, including long-range cross-domain references. We introduce a novel architecture for this task to encode the conversation history and slot semantics more robustly by using attention mechanisms at multiple granularities. In particular, we use cross-attention to model relationships between the context and slots at different semantic levels and self-attention to resolve cross-domain coreferences. In addition, our proposed architecture does not rely on knowing the domain ontologies beforehand and can also be used in a zero-shot setting for new domains or unseen slot values. Our model improves the joint goal accuracy by 5% (absolute) in the full-data setting and by up to 2% (absolute) in the zero-shot setting over the present state-of-the-art on the MultiWoZ 2.1 dataset.
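The joint goal accuracy reported here is commonly computed as the fraction of turns whose predicted dialog state matches the gold state on every slot. The sketch below assumes that common definition; the function name `joint_goal_accuracy` and the slot names are hypothetical, and the exact evaluation script used for MultiWoZ 2.1 may differ in details such as value normalization.

```python
from typing import Dict, List

DialogState = Dict[str, str]  # slot name -> slot value


def joint_goal_accuracy(predicted: List[DialogState], gold: List[DialogState]) -> float:
    """Fraction of turns where the whole predicted state equals the gold state.

    A turn counts as correct only if every slot and value matches exactly,
    so a single wrong slot makes the whole turn incorrect.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must contain the same number of turns")
    if not gold:
        return 0.0
    correct = sum(1 for pred, ref in zip(predicted, gold) if pred == ref)
    return correct / len(gold)


# Hypothetical two-turn dialog: the second turn has one wrong slot value.
gold = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "4"},
]
predicted = [
    {"hotel-area": "north"},
    {"hotel-area": "north", "hotel-stars": "3"},
]
print(joint_goal_accuracy(predicted, gold))
```

Because the metric is all-or-nothing per turn, it is much stricter than per-slot accuracy, which is why absolute gains of a few points are considered meaningful.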

CERG: Chinese Emotional Response Generator with Retrieval Method

Research ◽

10.34133/2020/2616410 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Yangyang Zhou ◽

Fuji Ren

Keyword(s):

Emotional Response ◽

Dialogue Systems ◽

Inference Process ◽

Dialogue System ◽

Data Set ◽

Retrieval Method ◽

Input Text ◽

Proposed Model ◽

Semantic Relevance ◽

Task Oriented

The dialogue system has always been one of the important topics in the domain of artificial intelligence. So far, most mature dialogue systems are task-oriented, while non-task-oriented dialogue systems still have a lot of room for improvement. We propose a data-driven non-task-oriented dialogue generator “CERG” based on neural networks. This model has emotion recognition capability and can generate corresponding responses. The data set we adopt comes from the NTCIR-14 STC-3 CECG subtask, which contains more than 1.7 million Chinese Weibo post-response pairs and 6 emotion categories. We concatenate the post and the response with the emotion, then mask the response part of the input text character by character to emulate the encoder-decoder framework. We use improved transformer blocks as the core of the model and add regularization methods to alleviate the problems of overcorrection and exposure bias. We introduce a retrieval method into the inference process to improve the semantic relevance of generated responses. The results of the manual evaluation show that our proposed model can make different responses to different emotions to improve the human-computer interaction experience. This model can be applied to many domains, such as automatic reply bots in social applications.
