A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer

Unsupervised text style transfer aims to change the underlying style of a text while keeping its main content unchanged, without using parallel data. Most existing methods follow two steps: first separating the content from the original style, and then fusing the content with the desired style. However, the separation in the first step is challenging because content and style interact in subtle ways in natural language. Therefore, in this paper, we propose a dual reinforcement learning framework that directly transfers the style of the text via a one-step mapping model, without any separation of content and style. Specifically, we consider the learning of the source-to-target and target-to-source mappings as a dual task, and two rewards are designed based on this dual structure to reflect style accuracy and content preservation, respectively. In this way, the two one-step mapping models can be trained via reinforcement learning, without any use of parallel data. Automatic evaluations show that our model outperforms the state-of-the-art systems by a large margin, with an improvement of more than 10 BLEU points averaged over two benchmark datasets. Human evaluations also validate the effectiveness of our model in terms of style accuracy, content preservation and fluency. Our code and data, including the outputs of all baselines and our model, are available at https://github.com/luofuli/DualRL.
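
The abstract above names two rewards (style accuracy and content preservation) but does not say how they are combined. The following is only a minimal sketch under stated assumptions: both rewards are taken to be scores in [0, 1], and they are merged with a weighted harmonic mean so that a sentence scoring poorly on either criterion receives a low overall reward. The function names, the `beta` weight, and the baseline subtraction are hypothetical illustrations, not the authors' implementation.

```python
def combined_reward(style_reward: float, content_reward: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of two rewards in [0, 1] (an assumed combination rule).

    The result is low whenever either reward is low, which discourages
    trading content preservation for style accuracy or vice versa.
    """
    if style_reward <= 0.0 or content_reward <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * style_reward * content_reward / (b2 * content_reward + style_reward)


def policy_gradient_weight(reward: float, baseline: float) -> float:
    """Advantage term that would scale the log-likelihood gradient of a sampled sentence."""
    return reward - baseline


if __name__ == "__main__":
    # Hypothetical scores for one sampled transfer.
    r = combined_reward(style_reward=0.9, content_reward=0.1)
    print(r, policy_gradient_weight(r, baseline=0.3))
```

With these hypothetical scores the harmonic mean stays well below the arithmetic mean of 0.5, which is the property that motivates this choice in the sketch.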

Utilizing Non-Parallel Text for Style Transfer by Making Partial Comparisons

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/747 ◽

2019 ◽

Author(s):

Di Yin ◽

Shujian Huang ◽

Xin-Yu Dai ◽

Jiajun Chen

Keyword(s):

Data Augmentation ◽

State Of The Art ◽

Language Models ◽

Style Transfer ◽

Parallel Corpora ◽

Parallel Data ◽

Distributional Information ◽

Parallel Text ◽

Using Data ◽

Content Preservation

Text style transfer aims to rephrase a given sentence in a different style without changing its original content. Since parallel corpora (i.e., sentence pairs with the same content but different styles) are usually unavailable, most previous works guide the transfer process solely with distributional information, i.e., style-related classifiers or language models. These neglect the correspondence between instances, leading to poor transfer performance, especially in content preservation. In this paper, we propose making partial comparisons to explicitly model the content correspondence and the style correspondence of instances. To train the partial comparators, we propose methods to extract partial-parallel training instances automatically from the non-parallel data, and to further enhance the training process by using data augmentation. We perform experiments that compare our method to other existing approaches on two review datasets. Both automatic and manual evaluations show that our approach can significantly improve the performance of existing adversarial methods, and outperforms most state-of-the-art models. Our code and data will be available on GitHub.

Learning to Incorporate Structure Knowledge for Image Inpainting

10.20944/preprints202002.0125.v1 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jie Yang ◽

Zhiquan Qi ◽

Yong Shi

Keyword(s):

Structure Learning ◽

State Of The Art ◽

Image Inpainting ◽

Image Completion ◽

Image Structure ◽

Learning Framework ◽

Task Learning ◽

Pyramid Structure ◽

Benchmark Datasets ◽

Structure Knowledge

This paper develops a multi-task learning framework that attempts to incorporate image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and the corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In addition, we introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thereby providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding, and attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.

Learning to Incorporate Structure Knowledge for Image Inpainting

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6951 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12605-12612 ◽

Cited By ~ 1

Author(s):

Jie Yang ◽

Zhiquan Qi ◽

Yong Shi

Keyword(s):

Structure Learning ◽

State Of The Art ◽

Image Inpainting ◽

Image Completion ◽

Image Structure ◽

Learning Framework ◽

Task Learning ◽

Pyramid Structure ◽

Benchmark Datasets ◽

Structure Knowledge

This paper develops a multi-task learning framework that attempts to incorporate image structure knowledge to assist image inpainting, which is not well explored in previous works. The primary idea is to train a shared generator to simultaneously complete the corrupted image and the corresponding structures (edge and gradient), thus implicitly encouraging the generator to exploit relevant structure knowledge while inpainting. In addition, we introduce a structure embedding scheme to explicitly embed the learned structure features into the inpainting process, thereby providing possible preconditions for image completion. Specifically, a novel pyramid structure loss is proposed to supervise structure learning and embedding. Moreover, an attention mechanism is developed to further exploit the recurrent structures and patterns in the image to refine the generated structures and contents. Through multi-task learning, structure embedding, and attention, our framework takes advantage of the structure knowledge and outperforms several state-of-the-art methods on benchmark datasets quantitatively and qualitatively.

Lifelong Spectral Clustering

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6045 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5867-5874

Author(s):

Gan Sun ◽

Yang Cong ◽

Qianqian Wang ◽

Jun Li ◽

Yun Fu

Keyword(s):

Machine Learning ◽

Real World ◽

Spectral Clustering ◽

State Of The Art ◽

Clustering Algorithms ◽

Orthogonal Basis ◽

Learning Framework ◽

The Past ◽

Benchmark Datasets ◽

Over Time

In the past decades, spectral clustering (SC) has become one of the most effective clustering algorithms. However, most previous studies focus on spectral clustering with a fixed task set, and cannot incorporate a new spectral clustering task without access to previously learned tasks. In this paper, we explore the problem of spectral clustering in a lifelong machine learning framework, i.e., Lifelong Spectral Clustering (L2SC). Its goal is to efficiently learn a model for a new spectral clustering task by selectively transferring previously accumulated experience from a knowledge library. Specifically, the knowledge library of L2SC contains two components: 1) an orthogonal basis library, capturing latent cluster centers among the clusters in each pair of tasks; 2) a feature embedding library, embedding the feature manifold information shared among multiple related tasks. When a new spectral clustering task arrives, L2SC first transfers knowledge from both the basis library and the feature library to obtain an encoding matrix, and then refines the libraries over time to maximize performance across all the clustering tasks. Meanwhile, a general online update formulation is derived to alternately update the basis library and the feature library. Finally, empirical experiments on several real-world benchmark datasets demonstrate that our L2SC model effectively improves clustering performance compared with other state-of-the-art spectral clustering algorithms.

What Makes A Good Story? Designing Composite Rewards for Visual Storytelling

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6305 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7969-7976

Author(s):

Junjie Hu ◽

Yu Cheng ◽

Zhe Gan ◽

Jingjing Liu ◽

Jianfeng Gao ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Quality Criteria ◽

High Quality ◽

Visual Storytelling ◽

Learning Framework ◽

Good Story ◽

Human Evaluation ◽

Reward Functions ◽

New Criteria

Previous storytelling approaches mostly focused on optimizing traditional metrics such as BLEU, ROUGE and CIDEr. In this paper, we re-examine this problem from a different angle, by looking deep into what defines a natural and topically-coherent story. To this end, we propose three assessment criteria: relevance, coherence and expressiveness, which we observe through empirical analysis could constitute a “high-quality” story to the human eye. We further propose a reinforcement learning framework, ReCo-RL, with reward functions designed to capture the essence of these quality criteria. Experiments on the Visual Storytelling Dataset (VIST) with both automatic and human evaluation demonstrate that our ReCo-RL model achieves better performance than state-of-the-art baselines on both traditional metrics and the proposed new criteria.

Extracting Action Sequences from Texts Based on Deep Reinforcement Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/565 ◽

2018 ◽

Cited By ~ 6

Author(s):

Wenfeng Feng ◽

Hankz Hankui Zhuo ◽

Subbarao Kambhampati

Keyword(s):

Reinforcement Learning ◽

Natural Language ◽

State Of The Art ◽

Specific Form ◽

World Knowledge ◽

Learning Framework ◽

Action Sequences

Extracting action sequences from texts is challenging, as it requires commonsense inferences based on world knowledge. Although there has been work on extracting action scripts, instructions, navigation actions, etc., these approaches require either that the set of candidate actions be provided in advance, or that action descriptions be restricted to a specific form, e.g., description templates. In this paper we aim to extract action sequences from texts in free natural language, i.e., without any restricted templates, where the set of actions is unknown. We propose to extract action sequences from texts based on the deep reinforcement learning framework. Specifically, we view “selecting” or “eliminating” words from texts as “actions”, and texts associated with actions as “states”. We build Q-networks to learn policies of extracting actions and extract plans from the labeled texts. We demonstrate the effectiveness of our approach on several datasets with comparison to state-of-the-art approaches.
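
The select-or-eliminate formulation in the abstract above can be illustrated with a small sketch. It assumes a trained Q-network is available as a function returning two values per word position (one for selecting the word, one for eliminating it) and simply takes the greedy action at each position. The names `extract_actions` and `toy_q_values` are hypothetical, and the toy scorer is a hand-written stand-in rather than a learned network.

```python
from typing import Callable, List, Tuple

# A Q-function maps (words, position) to (Q_select, Q_eliminate).
QFunction = Callable[[List[str], int], Tuple[float, float]]


def extract_actions(words: List[str], q_values: QFunction) -> List[str]:
    """Greedy policy: keep a word when selecting it has the higher Q-value."""
    selected = []
    for i in range(len(words)):
        q_select, q_eliminate = q_values(words, i)
        if q_select > q_eliminate:
            selected.append(words[i])
    return selected


def toy_q_values(words: List[str], i: int) -> Tuple[float, float]:
    """Stand-in for a trained Q-network (assumption): favors a few hand-picked verbs."""
    action_words = {"open", "pour", "stir"}
    return (1.0, 0.0) if words[i] in action_words else (0.0, 1.0)


if __name__ == "__main__":
    sentence = "open the jar and pour the sauce".split()
    print(extract_actions(sentence, toy_q_values))
```

In an actual system the Q-values would come from a network trained on labeled texts; the greedy loop only shows how a learned policy turns per-word decisions into an action sequence.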

Unpaired Image Enhancement Featuring Reinforcement-Learning-Controlled Image Editing Software

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6790 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11296-11303 ◽

Cited By ~ 3

Author(s):

Satoshi Kosugi ◽

Toshihiko Yamasaki

Keyword(s):

Reinforcement Learning ◽

Image Enhancement ◽

High Performance ◽

State Of The Art ◽

Mapping Function ◽

Generative Adversarial Networks ◽

Image Editing ◽

Learning Framework ◽

Adversarial Networks ◽

Image Pairs

This paper tackles unpaired image enhancement, the task of learning a mapping function that transforms input images into enhanced images in the absence of input-output image pairs. Our method is based on generative adversarial networks (GANs), but instead of simply generating images with a neural network, we enhance images using image editing software such as Adobe® Photoshop®, which brings three benefits: enhanced images have no artifacts, the same enhancement can be applied to larger images, and the enhancement is interpretable. To incorporate image editing software into a GAN, we propose a reinforcement learning framework where the generator works as the agent that selects the software's parameters and is rewarded when it fools the discriminator. Our framework can use high-quality non-differentiable filters present in image editing software, which enables image enhancement with high performance. We apply the proposed method to two unpaired image enhancement tasks: photo enhancement and face beautification. Our experimental results demonstrate that the proposed method achieves better performance than state-of-the-art methods based on unpaired learning.

Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6274 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7716-7723

Author(s):

Xiaocheng Feng ◽

Yawei Sun ◽

Bing Qin ◽

Heng Gong ◽

Yibo Sun ◽

...

Keyword(s):

State Of The Art ◽

Neural Model ◽

High Fidelity ◽

Semantic Relationship ◽

Style Transfer ◽

Basketball Game ◽

Sentence Level ◽

Parallel Data ◽

Back Translation ◽

Text Content

In this paper, we focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer and aims to preserve text styles while altering the content. In detail, the input is a set of structured records and a reference text describing another recordset. The output is a summary that accurately describes the partial content in the source recordset in the same writing style as the reference. The task is unsupervised due to the lack of parallel data, and it is challenging to select suitable records and style words from the bi-aspect inputs and to generate a high-fidelity long document. To tackle these problems, we first build a dataset based on a basketball game report corpus as our testbed, and present an unsupervised neural model with an interactive attention mechanism, which learns the semantic relationship between records and reference texts to achieve better content transfer and better style preservation. In addition, we explore the effectiveness of back-translation in our task for constructing pseudo-training pairs. Empirical results show the superiority of our approaches over competitive methods, and the models also yield a new state-of-the-art result on a sentence-level dataset.

Knowledge Base Question Answering with Topic Units

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/701 ◽

2019 ◽

Author(s):

Yunshi Lan ◽

Shuohang Wang ◽

Jing Jiang

Keyword(s):

Natural Language Processing ◽

Reinforcement Learning ◽

Knowledge Base ◽

Language Processing ◽

Question Answering ◽

State Of The Art ◽

Entity Linking ◽

Named Entities ◽

Benchmark Datasets ◽

Previous State

Knowledge base question answering (KBQA) is an important task in natural language processing. Existing methods for KBQA usually start with entity linking, which considers mostly named entities found in a question as the starting points in the KB to search for answers to the question. However, relying only on entity linking to look for answer candidates may not be sufficient. In this paper, we propose to perform topic unit linking where topic units cover a wider range of units of a KB. We use a generation-and-scoring approach to gradually refine the set of topic units. Furthermore, we use reinforcement learning to jointly learn the parameters for topic unit linking and answer candidate ranking in an end-to-end manner. Experiments on three commonly used benchmark datasets show that our method consistently works well and outperforms the previous state of the art on two datasets.

On Safety and Time Efficiency Enhancement of Robot Navigation in Crowded Environment utilizing Deep Reinforcement Learning

10.36227/techrxiv.17493605.v1 ◽

2021 ◽

Author(s):

Sunil Srivatsav Samsani

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Industrial Applications ◽

Social Robots ◽

Efficiency Enhancement ◽

Time Efficiency ◽

Learning Framework ◽

Multi Agent ◽

Ablation Study ◽

Crowded Environment

The evolution of social robots has accelerated with the advent of recent artificial intelligence techniques. Alongside humans, social robots play active roles in various household and industrial applications. However, the safety of humans becomes a significant concern when robots navigate in a complex and crowded environment. In the literature, the safety of humans in relation to social robots has been addressed by various methods; however, most of these methods compromise the time efficiency of the robot. For robots, safety and time efficiency are two competing objectives, where improving one tends to come at the expense of the other. To strike a balance between them, a multi-reward formulation in the reinforcement learning framework is proposed, which improves safety together with the time efficiency of the robot. The multi-reward formulation includes both positive and negative rewards that encourage and punish the robot, respectively. The proposed reward formulation is tested on state-of-the-art methods of multi-agent navigation. In addition, an ablation study is performed to evaluate the importance of individual rewards. Experimental results indicate that the proposed approach balances the safety and the time efficiency of the robot while navigating in a crowded environment.
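
A reward with both positive and negative terms, as the abstract above describes, might look like the following sketch. The abstract does not list the individual reward terms or their magnitudes, so every term, constant, and parameter name here (goal bonus, collision penalty, comfort distance, progress weight) is an assumption chosen for illustration and not the formulation evaluated by the author.

```python
def navigation_reward(
    reached_goal: bool,
    collided: bool,
    min_human_distance: float,
    progress: float,
    comfort_distance: float = 0.2,
) -> float:
    """Per-step reward mixing encouragement and punishment (all constants hypothetical).

    reached_goal        positive terminal reward for arriving at the goal
    collided            negative terminal reward for hitting a human
    min_human_distance  closest distance to any human this step, in meters
    progress            reduction in distance to the goal this step, in meters
    """
    if collided:
        return -0.25
    if reached_goal:
        return 1.0
    # Positive shaping term: encourages moving toward the goal (time efficiency).
    reward = 0.1 * progress
    # Negative shaping term: punishes intruding into a human's comfort zone (safety).
    if min_human_distance < comfort_distance:
        reward -= 0.5 * (comfort_distance - min_human_distance)
    return reward


if __name__ == "__main__":
    print(navigation_reward(False, False, min_human_distance=0.1, progress=0.05))
```

Removing one term at a time from such a function is one simple way an ablation over individual rewards could be set up.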
