Task-Oriented Dialog Systems That Consider Multiple Appropriate Responses under the Same Context

2020 ◽  
Vol 34 (05) ◽  
pp. 9604-9611
Author(s):  
Yichi Zhang ◽  
Zhijian Ou ◽  
Zhou Yu

Conversations have an intrinsic one-to-many property, which means that multiple responses can be appropriate for the same dialog context. In task-oriented dialogs, this property leads to different valid dialog policies towards task completion. However, none of the existing task-oriented dialog generation approaches takes this property into account. We propose a Multi-Action Data Augmentation (MADA) framework to utilize the one-to-many property to generate diverse appropriate dialog responses. Specifically, we first use dialog states to summarize the dialog history, and then discover all possible mappings from every dialog state to its different valid system actions. During dialog system training, we enable the current dialog state to map to all valid system actions discovered in the previous process to create additional state-action pairs. By incorporating these additional pairs, the dialog policy learns a balanced action distribution, which further guides the dialog model to generate diverse responses. Experimental results show that the proposed framework consistently improves dialog policy diversity, and results in improved response diversity and appropriateness. Our model obtains state-of-the-art results on MultiWOZ.
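The state-to-action mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the string encoding of states and actions, the function names `build_state_action_map` and `augment`, and the toy turns are all assumptions made for illustration.

```python
from collections import defaultdict

def build_state_action_map(turns):
    """Collect every system action observed under each dialog state."""
    mapping = defaultdict(set)
    for state, action in turns:
        mapping[state].add(action)
    return mapping

def augment(turns, mapping):
    """For each turn, list all actions seen anywhere in the data for its state."""
    return [(state, sorted(mapping[state])) for state, _ in turns]

# Hypothetical toy data: (dialog state, system action) per turn.
turns = [
    ("restaurant:area=centre", "request-food"),
    ("restaurant:area=centre", "inform-name"),
    ("hotel:stars=4", "request-area"),
]

mapping = build_state_action_map(turns)
augmented = augment(turns, mapping)
for state, actions in augmented:
    print(state, "->", actions)
```

In this sketch both restaurant turns end up paired with the same two candidate actions, which is the kind of additional state-action pair the abstract refers to.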

Author(s):  
Liangchen Luo ◽  
Wenhao Huang ◽  
Qi Zeng ◽  
Zaiqing Nie ◽  
Xu Sun

Most existing works on dialog systems only consider conversation content while neglecting the personality of the user the bot is interacting with, which begets several unsolved issues. In this paper, we present a personalized end-to-end model in an attempt to leverage personalization in goal-oriented dialogs. We first introduce a PROFILE MODEL which encodes user profiles into distributed embeddings and refers to conversation history from other similar users. Then a PREFERENCE MODEL captures user preferences over knowledge base entities to handle the ambiguity in user requests. The two models are combined into the PERSONALIZED MEMN2N. Experiments show that the proposed model achieves qualitative performance improvements over state-of-the-art methods. As for human evaluation, it also outperforms other approaches in terms of task completion rate and user satisfaction.


Author(s):  
Zhou Yu ◽  
Alexander Rudnicky ◽  
Alan Black

Task-oriented dialog systems have been applied in various tasks, such as automated personal assistants, customer service providers and tutors. These systems work well when users have clear and explicit intentions that are well-aligned to the systems' capabilities. However, they fail if users' intentions are not explicit. To address this shortcoming, we propose a framework to interleave non-task content (i.e., everyday social conversation) into task conversations. When the task content fails, the system can still keep the user engaged with the non-task content. We trained a policy using reinforcement learning algorithms to promote long-turn conversation coherence and consistency, so that the system can have smooth transitions between task and non-task content. To test the effectiveness of the proposed framework, we developed a movie promotion dialog system. Experiments with human users indicate that a system that interleaves social and task content achieves a better task success rate and is also rated as more engaging compared to a pure task-oriented system.


2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
A-Yeong Kim ◽  
Hyun-Je Song ◽  
Seong-Bae Park

Dialog state tracking in a spoken dialog system is the task of tracking the flow of a dialog and accurately identifying what a user wants from an utterance. Since the success of a dialog is influenced by the ability of the system to catch the requirements of the user, accurate state tracking is important for spoken dialog systems. This paper proposes a two-step neural dialog state tracker composed of an informativeness classifier and a neural tracker. The informativeness classifier, which is implemented by a CNN, first filters out noninformative utterances in a dialog. Then, the neural tracker estimates dialog states from the remaining informative utterances. The tracker adopts the attention mechanism and the hierarchical softmax for better performance and fast training. To evaluate the proposed model, we conduct experiments on dialog state tracking in human-human task-oriented dialogs with the standard DSTC4 data set. The experimental results show that the proposed model outperforms neural trackers without the informativeness classifier, the attention mechanism, or the hierarchical softmax.


2021 ◽  
Vol 11 (22) ◽  
pp. 10675
Author(s):  
Yinpei Dai ◽  
Yichi Zhang ◽  
Hong Liu ◽  
Zhijian Ou ◽  
Yi Huang ◽  
...  

Slot filling is a crucial component in task-oriented dialog systems that is used to parse (user) utterances into semantic concepts called slots. An ontology is defined by the collection of slots and the values that each slot can take. The most widely used practice of treating slot filling as a sequence labeling task suffers from two main drawbacks. First, the ontology is usually pre-defined and fixed and therefore is not able to detect new labels for unseen slots. Second, the one-hot encoding of slot labels ignores the correlations between slots with similar semantics, which makes it difficult to share knowledge learned across different domains. To address these problems, we propose a new model called elastic conditional random field (eCRF), where each slot is represented by the embedding of its natural language description and modeled by a CRF layer. New slot values can be detected by eCRF whenever a language description is available for the slot. In our experiment, we show that eCRFs outperform existing models in both in-domain and cross-domain tasks, especially in predicting unseen slots and values.


2004 ◽  
Vol 46 (6) ◽  
Author(s):  
Jürgen te Vrugt ◽  
Thomas Portele

Summary: Spoken language dialog systems allow users to control applications by voice. These systems tightly integrate the applications to control them, even though knowledge sources of the building blocks are often configurable. Some dialog systems controlling multiple applications loosen the coupling. This article introduces a dialog system accessing multiple applications with a dynamic setup that can be changed at run-time, separating the applications from the system. This is achieved by application-independent knowledge processing inside the dialog system based on modular ontological descriptions. A clear interface between dialog system and applications is provided, and generic dialog functionality is realized on top of the application-independent knowledge processing. Examples illustrate interactions with the system.


2021 ◽  
Vol 11 (11) ◽  
pp. 4887
Author(s):  
Ting He ◽  
Xiaohong Xu ◽  
Yating Wu ◽  
Huazhen Wang ◽  
Jian Chen

Intent detection and slot filling are important modules in task-oriented dialog systems. To make full use of the relationship between these modules and of shared resources, and to address the lack of semantic information, this paper proposes a multitask-learning intent-detection system based on a joint knowledge-base and slot-filling model. The approach shares information and rich external utility between the intent and slot modules in a three-part process. First, the model obtains shared parameters and features between the two modules based on long short-term memory and convolutional neural networks. Second, a knowledge base is introduced into the model to improve its performance. Finally, a weighted-loss function is built to optimize the joint model. Experimental results demonstrate that our model achieves better performance compared with state-of-the-art algorithms on the benchmark Airline Travel Information System (ATIS) dataset and the Snips dataset. Our joint model achieves state-of-the-art results on the benchmark ATIS dataset with a 1.33% intent-detection accuracy improvement and a 0.94% slot-filling F value improvement, and improvements of 0.19% and 0.31%, respectively, on the Snips dataset.
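The abstract does not give the form of the weighted-loss function. One common way to combine an intent loss and a slot loss is a convex combination controlled by a single weight; the sketch below assumes that form. The names `cross_entropy`, `joint_loss`, and `alpha` are hypothetical and not taken from the paper.

```python
import math

def cross_entropy(probs, gold):
    """Negative log-probability assigned to the gold class index."""
    return -math.log(probs[gold])

def joint_loss(intent_probs, intent_gold, slot_probs, slot_gold, alpha=0.5):
    """Weighted sum of the intent loss and the mean per-token slot loss.

    alpha is an assumed mixing weight in [0, 1]; the paper's actual
    weighting scheme may differ.
    """
    intent_loss = cross_entropy(intent_probs, intent_gold)
    slot_loss = sum(
        cross_entropy(p, g) for p, g in zip(slot_probs, slot_gold)
    ) / len(slot_gold)
    return alpha * intent_loss + (1.0 - alpha) * slot_loss

# Toy example: one utterance with two tokens, two intent classes, two slot labels.
loss = joint_loss(
    intent_probs=[0.5, 0.5], intent_gold=0,
    slot_probs=[[0.25, 0.75], [0.5, 0.5]], slot_gold=[0, 1],
    alpha=0.5,
)
print(loss)
```

Setting `alpha` closer to 1 makes the optimizer favor intent detection, and closer to 0 favors slot filling.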


Author(s):  
K. Mugoye ◽  
H. O. Okoyo ◽  
S. O. Mc Oyowo

Complex domains demand that task-oriented dialog systems (TODS) be able to reason and engage with humans in dialog and in information retrieval. This may require contemporary dialog systems to have improved conversation-handling capabilities. One starting point is supporting conversations that advance logically, so that the system can handle sub-dialogs meant to elicit more information within a topic. This paper presents findings from the authors' research that highlight this problem and suggest a possible solution, one intended to minimize heavy reliance on handcrafted components, which pose varying challenges. The study discusses an experiment for evaluating a novel architecture envisioned to improve this conversational requirement. The experimental results depict the extent to which we have achieved this desired progression, the underlying effects on users and the potential implications for applications. The study recommends combining Agency and Reinforcement learning to deliver the solution and could guide future studies towards achieving even more natural conversations.


2020 ◽  
Vol 34 (05) ◽  
pp. 8327-8335
Author(s):  
Weixin Liang ◽  
Youzhi Tian ◽  
Chengcai Chen ◽  
Zhou Yu

A major bottleneck in training end-to-end task-oriented dialog systems is the lack of data. To utilize limited training data more efficiently, we propose Modular Supervision Network (MOSS), an encoder-decoder training framework that can incorporate supervision from various intermediate dialog system modules including natural language understanding, dialog state tracking, dialog policy learning and natural language generation. With only 60% of the training data, MOSS-all (i.e., MOSS with supervision from all four dialog modules) outperforms state-of-the-art models on CamRest676. Moreover, introducing modular supervision has even bigger benefits when the dialog task has a more complex dialog state and action space. With only 40% of the training data, MOSS-all outperforms the state-of-the-art model on a complex laptop network troubleshooting dataset, LaptopNetwork, that we introduced. LaptopNetwork consists of conversations between real customers and customer service agents in Chinese. Moreover, the MOSS framework can accommodate dialogs that have supervision from different dialog modules at both framework level and model level. Therefore, MOSS is extremely flexible to update in real-world deployment.


2021 ◽  
Vol 9 ◽  
pp. 807-824
Author(s):  
Baolin Peng ◽  
Chunyuan Li ◽  
Jinchao Li ◽  
Shahin Shayandeh ◽  
Lars Liden ◽  
...  

We present a new method, Soloist, that uses transfer learning and machine teaching to build task bots at scale. We parameterize classical modular task-oriented dialog systems using a Transformer-based auto-regressive language model, which subsumes different dialog modules into a single neural model. We pre-train, on heterogeneous dialog corpora, a task-grounded response generation model, which can generate dialog responses grounded in user goals and real-world knowledge for task completion. The pre-trained model can be efficiently adapted to accomplish new tasks with a handful of task-specific dialogs via machine teaching, where training samples are generated by human teachers interacting with the system. Experiments show that (i) Soloist creates new state-of-the-art on well-studied task-oriented dialog benchmarks, including CamRest676 and MultiWOZ; (ii) in the few-shot fine-tuning settings, Soloist significantly outperforms existing methods; and (iii) the use of machine teaching substantially reduces the labeling cost of fine-tuning. The pre-trained models and code are available at https://aka.ms/soloist.


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated increased performance over non-augmented and conventional SMILES-randomization-augmented data when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as attentional gain: an enhancement in the pattern recognition capabilities of the underlying network to molecular motifs.
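For readers unfamiliar with the metric behind the method's name, the sketch below computes the standard Levenshtein (edit) distance between two strings and uses it to pick, from several SMILES variants of a reactant, the one closest to the product string. The selection step and the helper name `closest_variant` are assumptions made for illustration; the abstract describes a local sub-sequence similarity, and the authors' actual pairing procedure may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]

def closest_variant(variants, product):
    """Pick the reactant SMILES variant with the smallest distance to the product."""
    return min(variants, key=lambda v: levenshtein(v, product))

# Two SMILES strings for ethanol, compared against acetaldehyde ("CC=O").
variants = ["OCC", "CCO"]
best = closest_variant(variants, "CC=O")
print(best, levenshtein(best, "CC=O"))
```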

