scholarly journals UniMF: A Unified Framework to Incorporate Multimodal Knowledge Bases intoEnd-to-End Task-Oriented Dialogue Systems

Author(s):  
Shiquan Yang ◽  
Rui Zhang ◽  
Sarah M. Erfani ◽  
Jey Han Lau

Knowledge bases (KBs) are usually essential for building practical dialogue systems. Recently we have seen rapidly growing interest in integrating knowledge bases into dialogue systems. However, existing approaches mostly deal with knowledge bases of a single modality, typically textual information. As today's knowledge bases become abundant with multimodal information such as images, audios and videos, the limitation of existing approaches greatly hinders the development of dialogue systems. In this paper, we focus on task-oriented dialogue systems and address this limitation by proposing a novel model that integrates external multimodal KB reasoning with pre-trained language models. We further enhance the model via a novel multi-granularity fusion mechanism to capture multi-grained semantics in the dialogue history. To validate the effectiveness of the proposed model, we collect a new large-scale (14K) dialogue dataset MMDialKB, built upon multimodal KB. Both automatic and human evaluation results on MMDialKB demonstrate the superiority of our proposed framework over strong baselines.

Author(s):  
Junshu Wang ◽  
Guoming Zhang ◽  
Wei Wang ◽  
Ka Zhang ◽  
Yehua Sheng

AbstractWith the rapid development of hospital informatization and Internet medical service in recent years, most hospitals have launched online hospital appointment registration systems to remove patient queues and improve the efficiency of medical services. However, most of the patients lack professional medical knowledge and have no idea of how to choose department when registering. To instruct the patients to seek medical care and register effectively, we proposed CIDRS, an intelligent self-diagnosis and department recommendation framework based on Chinese medical Bidirectional Encoder Representations from Transformers (BERT) in the cloud computing environment. We also established a Chinese BERT model (CHMBERT) trained on a large-scale Chinese medical text corpus. This model was used to optimize self-diagnosis and department recommendation tasks. To solve the limited computing power of terminals, we deployed the proposed framework in a cloud computing environment based on container and micro-service technologies. Real-world medical datasets from hospitals were used in the experiments, and results showed that the proposed model was superior to the traditional deep learning models and other pre-trained language models in terms of performance.


2019 ◽  
Vol 1 (2) ◽  
pp. 187-200
Author(s):  
Zhengyu Zhao ◽  
Weinan Zhang ◽  
Wanxiang Che ◽  
Zhigang Chen ◽  
Yibo Zhang

The human-computer dialogue has recently attracted extensive attention from both academia and industry as an important branch in the field of artificial intelligence (AI). However, there are few studies on the evaluation of large-scale Chinese human-computer dialogue systems. In this paper, we introduce the Second Evaluation of Chinese Human-Computer Dialogue Technology, which focuses on the identification of a user's intents and intelligent processing of intent words. The Evaluation consists of user intent classification (Task 1) and online testing of task-oriented dialogues (Task 2), the data sets of which are provided by iFLYTEK Corporation. The evaluation tasks and data sets are introduced in detail, and meanwhile, the evaluation results and the existing problems in the evaluation are discussed.


2021 ◽  
Author(s):  
Yanjie Gou ◽  
Yinjie Lei ◽  
Lingqiao Liu ◽  
Yong Dai ◽  
Chunxu Shen

2020 ◽  
Vol 8 ◽  
pp. 281-295
Author(s):  
Qi Zhu ◽  
Kaili Huang ◽  
Zheng Zhang ◽  
Xiaoyan Zhu ◽  
Minlie Huang

To advance multi-domain (cross-domain) dialogue modeling as well as alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances for 5 domains, including hotel, restaurant, attraction, metro, and taxi. Moreover, the corpus contains rich annotation of dialogue states and dialogue acts on both user and system sides. About 60% of the dialogues have cross-domain user goals that favor inter-domain dependency and encourage natural transition across domains in conversation. We also provide a user simulator and several benchmark models for pipelined task-oriented dialogue systems, which will facilitate researchers to compare and evaluate their models on this corpus. The large size and rich annotation of CrossWOZ make it suitable to investigate a variety of tasks in cross-domain dialogue modeling, such as dialogue state tracking, policy learning, user simulation, etc.


Author(s):  
Sixing Wu ◽  
Ying Li ◽  
Dawei Zhang ◽  
Yang Zhou ◽  
Zhonghai Wu

Insufficient semantic understanding of dialogue always leads to the appearance of generic responses, in generative dialogue systems. Recently, high-quality knowledge bases have been introduced to enhance dialogue understanding, as well as to reduce the prevalence of boring responses. Although such knowledge-aware approaches have shown tremendous potential, they always utilize the knowledge in a black-box fashion. As a result, the generation process is somewhat uncontrollable, and it is also not interpretable. In this paper, we introduce a topic fact-based commonsense knowledge-aware approach, TopicKA. Different from previous works, TopicKA generates responses conditioned not only on the query message but also on a topic fact with an explicit semantic meaning, which also controls the direction of generation. Topic facts are recommended by a recommendation network trained under the Teacher-Student framework. To integrate the recommendation network and the generation network, this paper designs four schemes, which include two non-sampling schemes and two sampling methods. We collected and constructed a large-scale Chinese commonsense knowledge graph. Experimental results on an open Chinese benchmark dataset indicate that our model outperforms baselines in terms of both the objective and the subjective metrics.


Author(s):  
Andrea Madotto ◽  
Samuel Cahyawijaya ◽  
Genta Indra Winata ◽  
Yan Xu ◽  
Zihan Liu ◽  
...  

Research ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Yangyang Zhou ◽  
Fuji Ren

The dialogue system has always been one of the important topics in the domain of artificial intelligence. So far, most of the mature dialogue systems are task-oriented based, while non-task-oriented dialogue systems still have a lot of room for improvement. We propose a data-driven non-task-oriented dialogue generator “CERG” based on neural networks. This model has the emotion recognition capability and can generate corresponding responses. The data set we adopt comes from the NTCIR-14 STC-3 CECG subtask, which contains more than 1.7 million Chinese Weibo post-response pairs and 6 emotion categories. We try to concatenate the post and the response with the emotion, then mask the response part of the input text character by character to emulate the encoder-decoder framework. We use the improved transformer blocks as the core to build the model and add regularization methods to alleviate the problems of overcorrection and exposure bias. We introduce the retrieval method to the inference process to improve the semantic relevance of generated responses. The results of the manual evaluation show that our proposed model can make different responses to different emotions to improve the human-computer interaction experience. This model can be applied to lots of domains, such as automatic reply robots of social application.


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8439
Author(s):  
Shukan Liu ◽  
Ruilin Xu ◽  
Li Duan ◽  
Mingjie Li ◽  
Yiming Liu

The commonly-used large-scale knowledge bases have been facing challenges in open domain question answering tasks which are caused by the loose knowledge association and weak structural logic of triplet-based knowledge. To find a way out of this dilemma, this work proposes a novel metaknowledge enhanced approach for open domain question answering. We design an automatic approach to extract metaknowledge and build a metaknowledge network from Wiki documents. For the purpose of representing the directional weighted graph with hierarchical and semantic features, we present an original graph encoder GE4MK to model the metaknowledge network. Then, a metaknowledge enhanced graph reasoning model MEGr-Net is proposed for question answering, which aggregates both relational and neighboring interactions comparing with R-GCN and GAT. Experiments have proved the improvement of metaknowledge over main-stream triplet-based knowledge. We have found that the graph reasoning models and pre-trained language models also have influences on the metaknowledge enhanced question answering approaches.


Author(s):  
Khaldoon H. Alhussayni ◽  
Alexander Zamyatin ◽  
S. Eman Alshamery

<div><p>Dialog state tracking (DST) plays a critical role in cycle life of a task-oriented dialogue system. DST represents the goals of the consumer at each step by dialogue and describes such objectives as a conceptual structure comprising slot-value pairs and dialogue actions that specifically improve the performance and effectiveness of dialogue systems. DST faces several challenges: diversity of linguistics, dynamic social context and the dissemination of the state of dialogue over candidate values both in slot values and in dialogue acts determined in ontology. In many turns during the dialogue, users indirectly refer to the previous utterances, and that produce a challenge to distinguishing and use of related dialogue history, Recent methods used and popular for that are ineffective. In this paper, we propose a dialogue historical context self-Attention framework for DST that recognizes relevant historical context by including previous user utterance beside current user utterances and previous system actions where specific slot-value piers variations and uses that together with weighted system utterance to outperform existing models by recognizing the related context and the relevance of a system utterance. For the evaluation of the proposed model the WoZ dataset was used. The implementation was attempted with the prior user utterance as a dialogue encoder and second by the additional score combined with all the candidate slot-value pairs in the context of previous user utterances and current utterances. The proposed model obtained 0.8 per cent better results than all state-of-the-art methods in the combined precision of the target, but this is not the turnaround challenge for the submission.</p></div>


Author(s):  
Shukan Liu ◽  
Ruilin Xu ◽  
Li Duan ◽  
Mingjie Li ◽  
Yiming Liu

The commonly-used large-scale knowledge bases have been facing challenges in open domain question answering tasks which are caused by the loose knowledge association and weak structural logic of triplet-based knowledge. To find a way out of this dilemma, this work proposes a novel metaknowledge enhanced approach for open domain question answering. We design an automatic approach to extract metaknowledge and build metaknowledge network from Wiki documents. For the purpose of representing the directional weighted graph with hierarchical and semantic features, we present an original graph encoder GE4MK to model the metaknowledge network. Then a metaknowledge enhanced graph reasoning model MEGr-Net is proposed for question answering, which aggregates both relational and neighboring interactions comparing with R-GCN and GAT. Experiments have proved the improvement of metaknowledge over main-stream triplet-based knowledge. We have found that the graph reasoning models and pre-trained language models also have influences on the metaknowledge enhanced question answering approaches.


Sign in / Sign up

Export Citation Format

Share Document