Learning from Easy to Complex: Adaptive Multi-Curricula Learning for Neural Dialogue Generation

2020 ◽  
Vol 34 (05) ◽  
pp. 7472-7479
Author(s):  
Hengyi Cai ◽  
Hongshen Chen ◽  
Cheng Zhang ◽  
Yonghao Song ◽  
Xiaofang Zhao ◽  
...  

Current state-of-the-art neural dialogue systems are mainly data-driven and are trained on human-generated responses. However, due to the subjectivity and open-ended nature of human conversations, the complexity of training dialogues varies greatly. The noise and uneven complexity of query-response pairs impede the learning efficiency and effectiveness of neural dialogue generation models. Moreover, there is so far no unified measurement of dialogue complexity, which spans multiple attributes: specificity, repetitiveness, relevance, and so on. Inspired by human behaviors of learning to converse, where children learn from easy dialogues to complex ones and dynamically adjust their learning progress, in this paper we first analyze five dialogue attributes to measure dialogue complexity from multiple perspectives on three publicly available corpora. Then, we propose an adaptive multi-curricula learning framework to schedule a committee of the organized curricula. The framework is established upon the reinforcement learning paradigm and automatically chooses different curricula over the course of the evolving learning process according to the learning status of the neural dialogue generation model. Extensive experiments conducted on five state-of-the-art models demonstrate its learning efficiency and effectiveness with respect to 13 automatic evaluation metrics and human judgments.
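The curriculum scheduler can be pictured as a bandit that trades off the attribute-based curricula against the model's current learning progress. Below is a minimal sketch assuming an epsilon-greedy bandit with validation-loss improvement as the reward; the class name, the reward definition, and all hyperparameters are illustrative stand-ins, not the paper's exact reinforcement learning formulation.

```python
import random

class CurriculumScheduler:
    """Epsilon-greedy bandit over attribute-based curricula (a hypothetical
    stand-in for the paper's RL scheduler). Reward is the drop in validation
    loss after training on a batch from the chosen curriculum, i.e., the
    model's learning progress."""

    def __init__(self, curricula, epsilon=0.1):
        self.curricula = list(curricula)
        self.epsilon = epsilon
        self.value = {c: 0.0 for c in self.curricula}  # running mean reward
        self.count = {c: 0 for c in self.curricula}

    def choose(self):
        if random.random() < self.epsilon:              # explore
            return random.choice(self.curricula)
        return max(self.curricula, key=self.value.get)  # exploit best so far

    def update(self, curriculum, prev_val_loss, new_val_loss):
        reward = prev_val_loss - new_val_loss           # learning progress
        self.count[curriculum] += 1
        n = self.count[curriculum]
        self.value[curriculum] += (reward - self.value[curriculum]) / n

# one training iteration: pick a curriculum, train a step, reward the choice
scheduler = CurriculumScheduler(
    ["specificity", "repetitiveness", "relevance", "continuity", "confidence"])
curriculum = scheduler.choose()
# ... train the dialogue model on a batch sampled from `curriculum` ...
scheduler.update(curriculum, prev_val_loss=2.31, new_val_loss=2.27)
```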

2018 ◽  
Vol 61 ◽  
pp. 65-170 ◽  
Author(s):  
Albert Gatt ◽  
Emiel Krahmer

This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past two decades, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures in which such tasks are organised; (b) highlight a number of recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; and (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of NLP, with an emphasis on different evaluation methods and the relationships between them.


Author(s):  
Madhu Vankadari ◽  
Swagat Kumar ◽  
Anima Majumder ◽  
Kaushik Das

This paper presents a new GAN-based deep learning framework for estimating absolute-scale-aware depth and ego-motion from monocular images using a completely unsupervised mode of learning. The proposed architecture uses two separate generators to learn the distributions of depth and pose data for a given input image sequence. The depth and pose data thus generated are then evaluated by a patch-based discriminator using the reconstructed image and its corresponding actual image. The patch-based GAN (or PatchGAN) is shown to detect high-frequency local structural defects in the reconstructed image, thereby improving the accuracy of the overall depth and pose estimation. Unlike conventional GANs, the proposed architecture uses a conditioned version of the input and output of the generator for training the whole network. The resulting framework is shown to outperform all existing deep networks in this field, beating the current state-of-the-art method by 8.7% in absolute error and 5.2% in the RMSE metric. To the best of our knowledge, this is the first deep-network-based model to estimate both depth and pose simultaneously using a conditional patch-based GAN paradigm. The efficacy of the proposed approach is demonstrated through rigorous ablation studies and an exhaustive performance comparison on the popular KITTI outdoor driving dataset.
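To make the patch-based discriminator concrete, here is a minimal PyTorch sketch of a PatchGAN-style discriminator that scores the conditioned input (actual and reconstructed images stacked along channels) per receptive-field patch rather than with a single scalar; the layer widths and depth are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """PatchGAN-style discriminator: outputs a grid of real/fake logits,
    one per receptive-field patch, so high-frequency local defects in the
    reconstructed image are penalised individually."""

    def __init__(self, in_channels=6):  # conditioned pair: actual + reconstructed
        super().__init__()

        def block(ci, co, norm=True):
            layers = [nn.Conv2d(ci, co, kernel_size=4, stride=2, padding=1)]
            if norm:
                layers.append(nn.InstanceNorm2d(co))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.net = nn.Sequential(
            *block(in_channels, 64, norm=False),
            *block(64, 128),
            *block(128, 256),
            nn.Conv2d(256, 1, kernel_size=4, padding=1),  # per-patch logits
        )

    def forward(self, x):
        return self.net(x)

# a 256x256 conditioned input yields a ~31x31 map of patch logits
scores = PatchDiscriminator()(torch.randn(1, 6, 256, 256))
```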


Author(s):  
Abicumaran Uthamacumaran

Cancers are complex, adaptive ecosystems. They remain the leading cause of disease-related death among children in North America. As we approach computational oncology and Deep Learning Healthcare, our mathematical models of cancer dynamics must be revised. Recent findings support the perspective that cancer-microenvironment interactions consist of turbulent flows. As such, cancer pattern formation, protein-folding and metastatic invasion are discussed herein as processes driven by chemical turbulence within the framework of complex systems theory. Current state-of-the-art quantitative approaches used in reconstructing cancer stem cell networks are reviewed. To conclude, cancer stem cells are presented as strange attractors of the Waddington landscape.


Author(s):  
Juan D. Correa ◽  
Jin Tian ◽  
Elias Bareinboim

Cause-and-effect relations are one of the most valuable types of knowledge sought after throughout the data-driven sciences since they translate into stable and generalizable explanations as well as efficient and robust decision-making capabilities. Inferring these relations from data, however, is a challenging task. Two of the most common barriers to this goal are known as confounding and selection biases. The former stems from the systematic bias introduced during the treatment assignment, while the latter comes from the systematic bias during the collection of units into the sample. In this paper, we consider the problem of identifiability of causal effects when both confounding and selection biases are simultaneously present. We first investigate the problem of identifiability when all the available data is biased. We prove that the algorithm proposed by [Bareinboim and Tian, 2015] is, in fact, complete, namely, whenever the algorithm returns a failure condition, no identifiability claim about the causal relation can be made by any other method. We then generalize this setting to when, in addition to the biased data, another piece of external data is available, without bias. It may be the case that a subset of the covariates could be measured without bias (e.g., from a census). We examine the problem of identifiability when a combination of biased and unbiased data is available. We propose a new algorithm that subsumes the current state-of-the-art method based on the back-door criterion.
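As a concrete reminder of what back-door adjustment buys, here is a self-contained toy sketch on synthetic data where a single covariate Z confounds treatment X and outcome Y; the data-generating probabilities are invented for illustration, and the paper's algorithms cover far more general graphs with selection bias.

```python
import random

random.seed(0)

# Toy data: a single covariate Z confounds treatment X and outcome Y.
rows = []
for _ in range(50000):
    z = int(random.random() < 0.5)
    x = int(random.random() < (0.8 if z else 0.2))      # Z influences X
    y = int(random.random() < 0.1 + 0.3 * x + 0.4 * z)  # Z and X influence Y
    rows.append((z, x, y))

def prob(event, given):
    sel = [r for r in rows if given(r)]
    return sum(event(r) for r in sel) / len(sel)

def backdoor(x_val):
    """P(Y=1 | do(X=x)) = sum_z P(Y=1 | X=x, Z=z) * P(Z=z)."""
    return sum(
        prob(lambda r: r[2] == 1, lambda r: r[1] == x_val and r[0] == z_val)
        * prob(lambda r: r[0] == z_val, lambda r: True)
        for z_val in (0, 1)
    )

# The naive conditional overstates the effect; adjustment recovers it:
print(prob(lambda r: r[2] == 1, lambda r: r[1] == 1))  # ~0.72 (confounded)
print(backdoor(1) - backdoor(0))                       # ~0.30 (causal effect)
```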


2020 ◽  
Vol 34 (05) ◽  
pp. 9693-9700
Author(s):  
Yinhe Zheng ◽  
Rongsheng Zhang ◽  
Minlie Huang ◽  
Xiaoxi Mao

Endowing dialogue systems with personas is essential for delivering more human-like conversations. However, this problem is still far from well explored due to the difficulty of embodying personalities in natural language and the persona-sparsity issue observed in most dialogue corpora. This paper proposes a pre-training-based personalized dialogue model that can generate coherent responses from persona-sparse dialogue data. In this method, a pre-trained language model is used to initialize an encoder and decoder, and personal attribute embeddings are devised to model richer dialogue contexts by encoding speakers' personas together with dialogue histories. Further, to incorporate the target persona in the decoding process and to balance its contribution, an attention routing structure is devised in the decoder to merge features extracted from the target persona and dialogue contexts using dynamically predicted weights. Our model can utilize persona-sparse dialogues in a unified manner during training, and can also control the amount of persona-related features exhibited during inference. Both automatic and manual evaluations demonstrate that the proposed model outperforms state-of-the-art methods, generating more coherent and persona-consistent responses with persona-sparse data.
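A rough PyTorch sketch of the attention-routing idea follows: features attended from the target persona and from the dialogue context are merged with a weight that is predicted during training but can be fixed by hand at inference to control how much persona is exhibited. The module names, dimensions, and gating form are assumptions for illustration, not the paper's exact decoder block.

```python
import torch
import torch.nn as nn

class AttentionRouter(nn.Module):
    """Merges features attended from the target persona and the dialogue
    context with a dynamically predicted weight (a simplified sketch)."""

    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.persona_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.context_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, 1)  # predicts the persona weight

    def forward(self, hidden, persona, context, alpha=None):
        p, _ = self.persona_attn(hidden, persona, persona)
        c, _ = self.context_attn(hidden, context, context)
        if alpha is None:  # training: predict how much persona to mix in
            alpha = torch.sigmoid(self.gate(torch.cat([p, c], dim=-1)))
        return alpha * p + (1 - alpha) * c  # routed feature

# at inference, pass e.g. alpha=0.8 to exhibit more persona-related features
router = AttentionRouter()
out = router(torch.randn(2, 10, 256), torch.randn(2, 5, 256), torch.randn(2, 20, 256))
```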


Information ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 348 ◽  
Author(s):  
Marten Düring ◽  
Roman Kalyakin ◽  
Estelle Bunout ◽  
Daniele Guido

The automated enrichment of mass-digitised document collections using techniques such as text mining is becoming increasingly popular. Enriched collections offer new opportunities for interface design to allow data-driven and visualisation-based search, exploration and interpretation. Most such interfaces integrate close and distant reading and represent semantic, spatial, social or temporal relations, but often lack contrastive views. Inspect and Compare (I&C) contributes to the current state of the art in interface design for historical newspapers with highly versatile side-by-side comparisons of query results and curated article sets based on metadata and semantic enrichments. I&C takes search queries and pre-curated article sets as inputs and allows comparisons based on the distributions of newspaper titles, publication dates and automatically generated enrichments, such as language, article types, topics and named entities. Contrastive views of such data reveal patterns, help humanities scholars to improve their search strategies, and facilitate a critical assessment of the overall data quality. I&C is part of the impresso interface for the exploration of digitised and semantically enriched historical newspapers.


2021 ◽  
Vol 9 (4) ◽  
pp. 250-259 ◽  
Author(s):  
Annelien Smets ◽  
Pieter Ballon ◽  
Nils Walravens

Amid the widespread diffusion of digital communication technologies, our cities are at a critical juncture as these technologies enter all aspects of urban life. Data-driven technologies help citizens navigate the city, find friends, or discover new places. While these technology-mediated activities come into the scope of scholarly research, we lack an understanding of the underlying curation mechanisms that select and present the particular information citizens are exposed to. Nevertheless, such an understanding is crucial for dealing with the risk of socio-cultural polarization that algorithmic curation of this kind is presumed to reinforce. Drawing upon the vast amount of work on algorithmic curation in online platforms, we construct an analytical lens that is applied to the urban environment to establish an understanding of the algorithmic curation of urban experiences. In this way, this article demonstrates that cities can be considered a new materiality of curational platforms. Our framework outlines the various urban information flows, curation logics, and stakeholders involved. This work contributes to the current state of the art by bridging the gap between online and offline algorithmic curation and by providing a novel conceptual framework to study this timely topic.


2020 ◽  
Vol 34 (05) ◽  
pp. 9507-9514 ◽  
Author(s):  
Daojian Zeng ◽  
Haoran Zhang ◽  
Qianying Liu

Joint extraction of entities and relations has received significant attention due to its potential to improve performance on both tasks. Among existing methods, CopyRE is effective and novel: it uses a sequence-to-sequence framework with a copy mechanism to directly generate relation triplets. However, it suffers from two fatal problems. The model is extremely weak at distinguishing the head entity from the tail entity, resulting in inaccurate entity extraction, and it cannot predict multi-token entities (e.g., Steven Jobs). To address these problems, we give a detailed analysis of the reasons behind the inaccurate entity extraction, and then propose a simple but extremely effective model structure to solve it. In addition, we propose a multi-task learning framework equipped with a copy mechanism, called CopyMTL, which allows the model to predict multi-token entities. Experiments reveal the problems of CopyRE and show that our model improves over the current state-of-the-art method by 9% on NYT and 16% on WebNLG (F1 score). Our code is available at https://github.com/WindChimeRan/CopyMTL
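To illustrate how a copy mechanism can be paired with an auxiliary task for multi-token entities, here is a minimal PyTorch sketch with two heads over a shared encoder: a pointer-style copy scorer and a BIO sequence-labeling head. The architecture, sizes, and tag set are illustrative assumptions rather than the released CopyMTL model.

```python
import torch
import torch.nn as nn

class CopyMTLSketch(nn.Module):
    """Two heads over one encoder: (1) a copy scorer that points at source
    positions for triplet generation, and (2) a sequence-labeling head
    (B/I/O tags) so multi-token entities like "Steven Jobs" can be
    recovered as spans."""

    def __init__(self, vocab=5000, d=128, n_tags=3):  # tags: B / I / O
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        self.encoder = nn.LSTM(d, d, batch_first=True, bidirectional=True)
        self.copy_proj = nn.Linear(2 * d, d)   # scores for pointing at tokens
        self.tagger = nn.Linear(2 * d, n_tags)  # entity-span labeling task

    def forward(self, src_ids, dec_state):
        enc, _ = self.encoder(self.embed(src_ids))  # (B, T, 2d)
        copy_logits = torch.einsum("btd,bd->bt", self.copy_proj(enc), dec_state)
        tag_logits = self.tagger(enc)               # (B, T, n_tags)
        return copy_logits, tag_logits  # both losses are summed in training

copy_logits, tag_logits = CopyMTLSketch()(torch.randint(0, 5000, (2, 12)),
                                          torch.randn(2, 128))
```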


2020 ◽  
Vol 11 (6) ◽  
pp. 169-176
Author(s):  
Tsvetelin Anastasov

This article is an expansion of Anastasov's master's thesis (2019). It attempts to give a clear definition and taxonomy of Data-Driven Business Models (DDBMs) and to illustrate the data challenges and opportunities that come along with them. These definitions were cross-analyzed against three cases from the Asia-Pacific region to deliver concrete insights and inspiration for Western companies looking to reinvent their businesses in the next five years. A comparison between Data-Driven and Data-Centric models, not previously analyzed in the thesis, is also given as a view of current state-of-the-art data business models.


2020 ◽  
Vol 34 (05) ◽  
pp. 9515-9522 ◽  
Author(s):  
Jiali Zeng ◽  
Linfeng Song ◽  
Jinsong Su ◽  
Jun Xie ◽  
Wei Song ◽  
...  

Simile recognition aims to detect simile sentences and to extract simile components, i.e., tenors and vehicles. It involves two subtasks: simile sentence classification and simile component extraction. Recent work has shown that standard multitask learning is effective for Chinese simile recognition, but it remains uncertain whether the mutual effects between the subtasks are well captured by simple parameter sharing. We propose a novel cyclic multitask learning framework for neural simile recognition, which stacks the subtasks and connects the last to the first to form a loop. It iteratively performs each subtask, taking the outputs of the previous subtask as additional inputs to the current one, so that the interdependence between the subtasks can be better explored. Extensive experiments show that our framework significantly outperforms the current state-of-the-art model and our carefully designed baselines, and the gains remain remarkable when using BERT. The source code for this paper is available at https://github.com/DeepLearnXMU/Cyclic
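The cyclic wiring can be sketched in a few lines of PyTorch: classification and extraction are applied in turn, each consuming a summary of the other's latest output, for a fixed number of loop iterations. The feature summaries, layer shapes, and tag set below are simplifying assumptions, not the released model.

```python
import torch
import torch.nn as nn

class CyclicSimileSketch(nn.Module):
    """Stacks the two simile subtasks into a loop: sentence classification
    feeds component extraction, whose output feeds classification again."""

    def __init__(self, d=128):
        super().__init__()
        self.classify = nn.Linear(d + 1, 1)  # +1: extraction evidence from last pass
        self.extract = nn.Linear(d + 1, 3)   # per-token: tenor / vehicle / other

    def forward(self, token_feats, n_loops=2):
        B, T, _ = token_feats.shape
        cls_prob = torch.zeros(B, 1)
        comp_logits = torch.zeros(B, T, 3)
        for _ in range(n_loops):  # iterate the subtask cycle
            # classification consumes how much tenor/vehicle mass was extracted
            evidence = comp_logits.softmax(-1)[..., :2].sum(-1).mean(-1, keepdim=True)
            cls_prob = torch.sigmoid(self.classify(
                torch.cat([token_feats.mean(1), evidence], dim=-1)))
            # extraction consumes the latest sentence-level simile probability
            comp_logits = self.extract(torch.cat(
                [token_feats, cls_prob[:, None, :].expand(B, T, 1)], dim=-1))
        return cls_prob, comp_logits

cls_prob, comp_logits = CyclicSimileSketch()(torch.randn(2, 16, 128))
```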

