CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset

2020, Vol 8, pp. 281-295
Author(s): Qi Zhu, Kaili Huang, Zheng Zhang, Xiaoyan Zhu, Minlie Huang

To advance multi-domain (cross-domain) dialogue modeling as well as alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances across 5 domains: hotel, restaurant, attraction, metro, and taxi. Moreover, the corpus contains rich annotation of dialogue states and dialogue acts on both the user and system sides. About 60% of the dialogues have cross-domain user goals that favor inter-domain dependency and encourage natural transitions across domains in conversation. We also provide a user simulator and several benchmark models for pipelined task-oriented dialogue systems, which will help researchers compare and evaluate their models on this corpus. The large size and rich annotation of CrossWOZ make it suitable for investigating a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, and user simulation.
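
To make the annotation scheme concrete: a single annotated turn pairs an utterance with dialogue acts and the accumulated cross-domain state. The sketch below is illustrative Python; the field names are assumptions, not the exact CrossWOZ JSON schema.

```python
# Illustrative structure of a dually annotated turn; field names are
# hypothetical, not the exact CrossWOZ schema.
turn = {
    "role": "user",
    "utterance": "Find me a hotel with free wifi near the attraction.",
    "dialogue_acts": [
        # (intent, domain, slot, value) tuples, annotated on both sides
        ("Inform", "hotel", "facility", "wifi"),
        ("Request", "hotel", "name", ""),
    ],
    "dialogue_state": {
        # cross-domain goal accumulated so far; the hotel constraint
        # depends on the attraction chosen earlier in the dialogue
        "attraction": {"name": "Forbidden City"},
        "hotel": {"facility": "wifi", "near": "Forbidden City"},
    },
}
```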

Author(s): Khaldoon H. Alhussayni, Alexander Zamyatin, S. Eman Alshamery

Dialog state tracking (DST) plays a critical role in the life cycle of a task-oriented dialogue system. DST represents the user's goals at each step of the dialogue, describing those goals as a conceptual structure comprising slot-value pairs and dialogue acts; this directly affects the performance and effectiveness of dialogue systems. DST faces several challenges: linguistic diversity, dynamic social context, and the distribution of the dialogue state over candidate values, both for slot values and for the dialogue acts defined in the ontology. In many turns, users refer indirectly to previous utterances, which makes it challenging to identify and use the relevant dialogue history; recent popular methods for this are ineffective. In this paper, we propose a dialogue-history self-attention framework for DST that recognizes relevant historical context by encoding the previous user utterance alongside the current user utterance and the previous system actions, in which specific slot-value pairs vary, and uses this together with a weighted system utterance to outperform existing models at recognizing the related context and the relevance of a system utterance. The WoZ dataset was used to evaluate the proposed model. The implementation was attempted first with the prior user utterance as a dialogue encoder, and second with an additional score combined with all the candidate slot-value pairs in the context of the previous and current user utterances. The proposed model obtained results 0.8 percent better than all state-of-the-art methods in joint goal accuracy, though not on the turn request metric.
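
As a rough illustration of the scoring idea (not the paper's exact model), the sketch below attention-pools a context built from the current user utterance, the previous user utterance, and a down-weighted system utterance, then scores each candidate slot-value pair against it; the tensor names and the weighting scheme are assumptions.

```python
import torch
import torch.nn.functional as F

def score_candidates(cand_emb, curr_emb, prev_emb, sys_emb, sys_weight=0.5):
    """Score candidate slot-value pairs against an attention-pooled
    dialogue context (sketch only; sys_weight is an assumed knob)."""
    # context tokens: current user turn, previous user turn,
    # and a down-weighted system utterance, each (n_tokens_i, dim)
    context = torch.cat([curr_emb, prev_emb, sys_weight * sys_emb], dim=0)
    # each candidate attends over the pooled history
    attn = F.softmax(cand_emb @ context.T, dim=-1)   # (n_cand, n_ctx)
    pooled = attn @ context                          # (n_cand, dim)
    # similarity between a candidate and its attended view of history
    return (cand_emb * pooled).sum(-1)               # (n_cand,)
```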


Author(s): Mengshi Yu, Jian Liu, Yufeng Chen, Jinan Xu, Yujie Zhang

With task-oriented dialogue systems being widely applied in everyday life, slot filling, the essential component of task-oriented dialogue systems, is required to adapt quickly to new domains that contain domain-specific slots with few or no training data. Previous methods for slot filling usually adopt a sequence labeling framework, which, however, often has limited ability when dealing with domain-specific slots. In this paper, we take a new perspective on cross-domain slot filling by framing it as a machine reading comprehension (MRC) problem. Our approach first transforms slot names into well-designed queries, which contain rich informative prior knowledge and are very helpful for the detection of domain-specific slots. In addition, we utilize a large-scale MRC dataset for pre-training, which further alleviates the data scarcity problem. Experimental results on the SNIPS and ATIS datasets show that our approach consistently outperforms the existing state-of-the-art methods by a large margin.
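
The query-construction step can be pictured with a small, hypothetical sketch: each slot name is mapped to a natural-language question, and an off-the-shelf extractive QA model finds the answer span. The slot names below come from SNIPS, but the query wording, model choice, and confidence threshold are assumptions.

```python
from transformers import pipeline  # assumes the HF transformers package

# Hypothetical handwritten queries for two SNIPS slots
SLOT_QUERIES = {
    "playlist": "Which playlist does the user want to use?",
    "music_item": "What song or album is the user referring to?",
}

qa = pipeline("question-answering")  # any extractive QA checkpoint

def fill_slots(utterance: str) -> dict:
    """Ask one MRC-style question per slot; keep confident answers."""
    filled = {}
    for slot, query in SLOT_QUERIES.items():
        pred = qa(question=query, context=utterance)
        if pred["score"] > 0.5:  # threshold is an assumption
            filled[slot] = pred["answer"]
    return filled

print(fill_slots("add this song to my running playlist"))
```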


2019, Vol 1 (2), pp. 187-200
Author(s): Zhengyu Zhao, Weinan Zhang, Wanxiang Che, Zhigang Chen, Yibo Zhang

Human-computer dialogue has recently attracted extensive attention from both academia and industry as an important branch of artificial intelligence (AI). However, there are few studies on the evaluation of large-scale Chinese human-computer dialogue systems. In this paper, we introduce the Second Evaluation of Chinese Human-Computer Dialogue Technology, which focuses on identifying a user's intents and intelligently processing intent words. The Evaluation consists of user intent classification (Task 1) and online testing of task-oriented dialogues (Task 2), the data sets for which were provided by iFLYTEK Corporation. The evaluation tasks and data sets are introduced in detail, and the evaluation results and remaining problems in the evaluation are discussed.


Author(s): Shiquan Yang, Rui Zhang, Sarah M. Erfani, Jey Han Lau

Knowledge bases (KBs) are usually essential for building practical dialogue systems. Recently we have seen rapidly growing interest in integrating knowledge bases into dialogue systems. However, existing approaches mostly deal with knowledge bases of a single modality, typically textual information. As today's knowledge bases become abundant with multimodal information such as images, audio and video, this limitation greatly hinders the development of dialogue systems. In this paper, we focus on task-oriented dialogue systems and address the limitation by proposing a novel model that integrates external multimodal KB reasoning with pre-trained language models. We further enhance the model via a novel multi-granularity fusion mechanism to capture multi-grained semantics in the dialogue history. To validate the effectiveness of the proposed model, we collect a new large-scale (14K) dialogue dataset, MMDialKB, built upon a multimodal KB. Both automatic and human evaluation results on MMDialKB demonstrate the superiority of our proposed framework over strong baselines.
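
A minimal sketch of what a multi-granularity fusion layer could look like, assuming token-level (fine) and utterance-level (coarse) views of the history are fused into KB entity features with a learned gate; this illustrates the general idea, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MultiGranularityFusion(nn.Module):
    """Fuse fine-grained (token-level) and coarse-grained
    (utterance-level) dialogue context into KB entity features.
    Sketch only; dim must be divisible by num_heads."""
    def __init__(self, dim: int):
        super().__init__()
        self.token_attn = nn.MultiheadAttention(dim, num_heads=4,
                                                batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, kb_emb, token_emb, utt_emb):
        # fine-grained: KB entities attend over history tokens
        fine, _ = self.token_attn(kb_emb, token_emb, token_emb)
        # coarse-grained: broadcast the pooled utterance vector
        coarse = utt_emb.unsqueeze(1).expand_as(fine)
        # learned gate decides how much of each granularity to keep
        g = torch.sigmoid(self.gate(torch.cat([fine, coarse], dim=-1)))
        return g * fine + (1 - g) * coarse
```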


Author(s): Geoffrey Leech

This article introduces the linguistic subdiscipline of pragmatics and shows how this is being applied to the development of spoken dialogue systems — currently perhaps the most important applications area for computational pragmatics. It traces the history of pragmatics from its philosophical roots, and outlines some key notions of theoretical pragmatics — speech acts, illocutionary force, the cooperative principle and relevance. It then discusses the application of pragmatics to dialogue modelling, especially the development of spoken dialogue systems intended to interact with human beings in task-oriented scenarios such as providing travel information and shows how and why computational pragmatics differs from ‘linguistic’ pragmatics, and how pragmatics contributes to the computational analysis of dialogues. One major illustration of this is the application of speech act theory in the analysis and synthesis of service interactions in terms of dialogue acts.


2020, Vol 34 (05), pp. 8107-8114
Author(s): Adarsh Kumar, Peter Ku, Anuj Goyal, Angeliki Metallinou, Dilek Hakkani-Tur

Task-oriented dialog agents provide a natural language interface for users to complete their goals. Dialog State Tracking (DST), which is often a core component of these systems, tracks the system's understanding of the user's goal throughout the conversation. To enable accurate multi-domain DST, the model needs to encode dependencies between past utterances and slot semantics and understand the dialog context, including long-range cross-domain references. We introduce a novel architecture for this task to encode the conversation history and slot semantics more robustly by using attention mechanisms at multiple granularities. In particular, we use cross-attention to model relationships between the context and slots at different semantic levels and self-attention to resolve cross-domain coreferences. In addition, our proposed architecture does not rely on knowing the domain ontologies beforehand and can also be used in a zero-shot setting for new domains or unseen slot values. Our model improves the joint goal accuracy by 5% (absolute) in the full-data setting and by up to 2% (absolute) in the zero-shot setting over the present state-of-the-art on the MultiWoZ 2.1 dataset.
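
For reference, the joint goal accuracy quoted here is a strict turn-level metric: a turn counts as correct only if the entire predicted dialogue state matches the gold state. A minimal sketch:

```python
def joint_goal_accuracy(pred_states, gold_states):
    """Fraction of turns whose full predicted state (all
    domain-slot values) exactly matches the gold state."""
    correct = sum(p == g for p, g in zip(pred_states, gold_states))
    return correct / len(gold_states)

# one wrong slot value makes the whole second turn incorrect
gold = [{"hotel-area": "east"},
        {"hotel-area": "east", "hotel-stars": "4"}]
pred = [{"hotel-area": "east"},
        {"hotel-area": "east", "hotel-stars": "3"}]
print(joint_goal_accuracy(pred, gold))  # 0.5
```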


2018
Author(s): Paweł Budzianowski, Tsung-Hsien Wen, Bo-Hsiang Tseng, Iñigo Casanueva, Stefan Ultes, ...

PLoS ONE, 2020, Vol 15 (11), pp. e0241271
Author(s): Mauajama Firdaus, Arunav Pratap Shandeelya, Asif Ekbal

Multimodal dialogue systems, due to their many-fold applications, have gained much attention from researchers and developers in recent times. With the release of a large-scale multimodal dialogue dataset (Saha et al., 2018) in the fashion domain, it has become possible to investigate dialogue systems having both textual and visual modalities. Response generation is an essential aspect of every dialogue system, and making the responses diverse is an important problem. For any goal-oriented conversational agent, the system's responses must be informative, diverse and polite, which may lead to better user experiences. In this paper, we propose an end-to-end neural framework for generating varied responses in a multimodal dialogue setup, capturing information from both the text and the image. A multimodal encoder with co-attention between the text and image is used to focus on the different modalities and obtain better contextual information. For effective information sharing across the modalities, we combine the information of text and images using the BLOCK fusion technique, which helps in learning an improved multimodal representation. We employ stochastic beam search with the Gumbel-Top-k trick to achieve diversified responses while preserving the content and politeness of the responses. Experimental results show that our proposed approach performs significantly better than the existing and baseline methods in terms of distinct metrics, and thereby generates more diverse responses that are informative, interesting and polite without any loss of information. Empirical evaluation also reveals that images, when used along with the text, improve the model's ability to generate diversified responses.
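
The Gumbel-Top-k trick behind stochastic beam search (Kool et al., 2019) is compact enough to sketch directly: perturbing log-probabilities with i.i.d. Gumbel noise and taking the top k indices draws k items without replacement in proportion to their probabilities.

```python
import numpy as np

def gumbel_top_k(log_probs: np.ndarray, k: int, rng=None) -> np.ndarray:
    """Sample k indices without replacement, proportional to
    exp(log_probs), via the Gumbel-Top-k trick (sketch)."""
    rng = rng or np.random.default_rng()
    gumbel = -np.log(-np.log(rng.uniform(size=log_probs.shape)))
    return np.argsort(log_probs + gumbel)[::-1][:k]

# e.g. pick 3 of 5 candidate continuations without replacement
scores = np.log(np.array([0.4, 0.3, 0.15, 0.1, 0.05]))
print(gumbel_top_k(scores, k=3))
```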


2020, Vol 34 (05), pp. 8689-8696
Author(s): Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan

Virtual assistants such as Google Assistant, Alexa and Siri provide a conversational interface to a large number of services and APIs spanning multiple domains. Such systems need to support an ever-increasing number of services with possibly overlapping functionality. Furthermore, some of these services have little to no training data available. Existing public datasets for task-oriented dialogue do not sufficiently capture these challenges since they cover few domains and assume a single static ontology per domain. In this work, we introduce the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains. Our dataset exceeds the existing task-oriented dialogue corpora in scale, while also highlighting the challenges associated with building large-scale virtual assistants. It provides a challenging testbed for a number of tasks including language understanding, slot filling, dialogue state tracking and response generation. Along the same lines, we present a schema-guided paradigm for task-oriented dialogue, in which predictions are made over a dynamic set of intents and slots, provided as input, using their natural language descriptions. This allows a single dialogue system to easily support a large number of services and facilitates simple integration of new services without requiring additional training data. Building upon the proposed paradigm, we release a model for dialogue state tracking capable of zero-shot generalization to new APIs, while remaining competitive in the regular setting.
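
The schema-guided idea can be sketched in a few lines: because slot descriptions are model input, an unseen service only needs a new schema, not new training data. The sketch below scores slot descriptions against the utterance with a sentence encoder; the schema contents, model name, and scoring rule are all assumptions, not the released model.

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder

schema = {  # hypothetical service schema, supplied at inference time
    "restaurant_name": "Name of the restaurant for the reservation",
    "party_size": "Number of people the booking is for",
}

def active_slot(utterance: str) -> str:
    """Return the slot whose natural-language description best matches
    the utterance; works zero-shot for any schema passed in."""
    u = encoder.encode(utterance, convert_to_tensor=True)
    d = encoder.encode(list(schema.values()), convert_to_tensor=True)
    return list(schema.keys())[int(util.cos_sim(u, d)[0].argmax())]

print(active_slot("book a table for six people"))  # party_size
```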

