Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation

Author(s):  
Qiuyuan Huang ◽  
Zhe Gan ◽  
Asli Celikyilmaz ◽  
Dapeng Wu ◽  
Jianfeng Wang ◽  
...  

We propose a hierarchically structured reinforcement learning approach to address the challenges of planning for generating coherent multi-sentence stories for the visual storytelling task. Within our framework, the task of generating a story given a sequence of images is divided across a two-level hierarchical decoder. The high-level decoder constructs a plan by generating a semantic concept (i.e., topic) for each image in sequence. The low-level decoder generates a sentence for each image using a semantic compositional network, which effectively grounds the sentence generation conditioned on the topic. The two decoders are jointly trained end-to-end using reinforcement learning. We evaluate our model on the visual storytelling (VIST) dataset. Empirical results from both automatic and human evaluations demonstrate that the proposed hierarchically structured reinforced training achieves significantly better performance compared to a strong flat deep reinforcement learning baseline.
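
As a rough illustration of this two-level decoding scheme, the following PyTorch sketch pairs a high-level topic planner with a topic-conditioned sentence decoder. All names and dimensions are illustrative assumptions, and topic conditioning is simplified to concatenation rather than the paper's semantic compositional network:

import torch
import torch.nn as nn

class HighLevelDecoder(nn.Module):
    """Plans the story: one topic distribution per image in the sequence."""
    def __init__(self, img_dim=512, hidden=256, num_topics=50):
        super().__init__()
        self.rnn = nn.GRU(img_dim, hidden, batch_first=True)
        self.topic_head = nn.Linear(hidden, num_topics)

    def forward(self, image_feats):               # (B, T, img_dim)
        h, _ = self.rnn(image_feats)              # (B, T, hidden)
        return self.topic_head(h).softmax(-1)     # topic distribution per image

class LowLevelDecoder(nn.Module):
    """Generates word logits for one image, conditioned on its planned topic."""
    def __init__(self, img_dim=512, num_topics=50, hidden=256, vocab=10000):
        super().__init__()
        self.rnn = nn.GRU(img_dim + num_topics, hidden, batch_first=True)
        self.word_head = nn.Linear(hidden, vocab)

    def forward(self, img_feat, topic, max_len=20):
        # Broadcast the image+topic context over the sentence length.
        ctx = torch.cat([img_feat, topic], dim=-1)       # (B, img_dim + topics)
        seq = ctx.unsqueeze(1).repeat(1, max_len, 1)
        h, _ = self.rnn(seq)
        return self.word_head(h)                         # (B, max_len, vocab)

high, low = HighLevelDecoder(), LowLevelDecoder()
feats = torch.randn(2, 5, 512)            # 2 stories, 5 images each
topics = high(feats)                      # (2, 5, 50): the story plan
logits = low(feats[:, 0], topics[:, 0])   # sentence logits for the first image

In the paper both decoders are trained jointly end-to-end with reinforcement learning; the sketch shows only the forward structure of the hierarchy.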

Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, which means that exploration remains one of the key challenges of DRL. Instead of relying solely on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as an exploration signal. While such methods hold the promise of better local exploration, discovering global exploration strategies remains beyond the reach of current methods. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration into reinforcement learning. Our curiosity signal is driven by a fast reward that handles local exploration and a slow reward that incentivizes long-horizon exploration strategies. We formulate curiosity as the error in an agent’s ability to reconstruct observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work in several Atari games.
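
A minimal NumPy sketch of the two-timescale curiosity signal described above, with stand-in reconstruction models; the weighting coefficients and context windows are assumptions, not the paper's settings:

import numpy as np

def reconstruction_error(model, context, observation):
    """Curiosity as the error in reconstructing an observation from its context."""
    return float(np.mean((model(context) - observation) ** 2))

def intrinsic_reward(fast_model, slow_model, short_ctx, long_ctx, obs,
                     beta_fast=1.0, beta_slow=0.5):
    # Fast reward: local exploration, judged against a short recent context.
    r_fast = reconstruction_error(fast_model, short_ctx, obs)
    # Slow reward: long-horizon exploration, judged against an extended context.
    r_slow = reconstruction_error(slow_model, long_ctx, obs)
    return beta_fast * r_fast + beta_slow * r_slow

# Stand-in predictors; in practice these would be learned reconstruction models.
fast = slow = lambda ctx: ctx.mean(axis=0)
rng = np.random.default_rng(0)
r = intrinsic_reward(fast, slow, rng.random((5, 4)), rng.random((50, 4)),
                     obs=rng.random(4))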


2020 ◽  
Vol 34 (09) ◽  
pp. 13659-13662
Author(s):  
Giuseppe De Giacomo ◽  
Luca Iocchi ◽  
Marco Favorito ◽  
Fabio Patrizi

In this work we have investigated the concept of the “restraining bolt”, inspired by science fiction. We have two distinct sets of features extracted from the world, one by the agent and one by the authority imposing some restraining specifications on the behaviour of the agent (the “restraining bolt”). The two sets of features, and hence the models of the world attainable from them, are apparently unrelated, since they are of interest to independent parties; however, they both account for (aspects of) the same world. We have considered the case in which the agent is a reinforcement learning agent operating on a set of low-level (subsymbolic) features, while the restraining bolt is specified logically using linear temporal logic over finite traces (LTLf/LDLf) over a set of high-level symbolic features. We show formally, and illustrate with examples, that, under general circumstances, the agent can learn while shaping its goals to suitably conform (as much as possible) to the restraining bolt specifications.
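
The following toy sketch illustrates the restraining-bolt idea under the assumption that the LTLf/LDLf specification has already been compiled into a deterministic finite automaton (DFA) over high-level fluents, with the agent's reward shaped by the automaton's progress. The class and transition encoding are hypothetical, not the paper's construction:

class RestrainingBolt:
    """DFA tracker for a pre-compiled LTLf/LDLf restraining specification."""
    def __init__(self, transitions, accepting, start=0):
        self.transitions = transitions   # {(dfa_state, fluent): next_dfa_state}
        self.accepting = accepting       # set of accepting DFA states
        self.state = start

    def step(self, fluent, bonus=1.0):
        # Advance the automaton on an observed high-level fluent and return
        # a shaping reward that is added to the agent's extrinsic reward.
        self.state = self.transitions.get((self.state, fluent), self.state)
        return bonus if self.state in self.accepting else 0.0

# Example: "eventually do A and then B", as a hand-written 3-state DFA.
bolt = RestrainingBolt(transitions={(0, "A"): 1, (1, "B"): 2}, accepting={2})
shaped = bolt.step("A") + bolt.step("B")   # 0.0 + 1.0 once the spec is met

The key point is that the bolt's fluents are high-level symbolic features, while the learning agent itself operates only on its own low-level features.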


2020 ◽  
Vol 34 (05) ◽  
pp. 9185-9192
Author(s):  
Ruize Wang ◽  
Zhongyu Wei ◽  
Piji Li ◽  
Qi Zhang ◽  
Xuanjing Huang

Visual storytelling aims at generating a story from an image stream. Most existing methods represent images directly with extracted high-level features, which is neither intuitive nor easy to interpret. We argue that translating each image into a graph-based semantic representation, i.e., a scene graph, which explicitly encodes the objects and relationships detected within the image, would benefit representing and describing images. To this end, we propose a novel graph-based architecture for visual storytelling that models two levels of relationships on scene graphs. In particular, at the within-image level, we employ a Graph Convolution Network (GCN) to enrich the local fine-grained region representations of objects on scene graphs. To further model the interaction among images, at the cross-image level, a Temporal Convolution Network (TCN) is utilized to refine the region representations along the temporal dimension. The relation-aware representations are then fed into a Gated Recurrent Unit (GRU) with an attention mechanism for story generation. Experiments are conducted on the public visual storytelling dataset. Automatic and human evaluation results indicate that our method achieves state-of-the-art performance.
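
A schematic PyTorch sketch of the described pipeline (GCN over each scene graph, TCN across the image stream, GRU decoder); the attention mechanism is omitted and all layer sizes are illustrative assumptions:

import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbour features via adjacency."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, adj):            # x: (N, dim), adj: (N, N)
        return torch.relu(self.lin(adj @ x))

class StoryModel(nn.Module):
    def __init__(self, dim=256, vocab=10000):
        super().__init__()
        self.gcn = GCNLayer(dim)
        # The TCN refines pooled region features across the image stream.
        self.tcn = nn.Conv1d(dim, dim, kernel_size=3, padding=1)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, region_feats, adjs):  # one (features, adjacency) per image
        pooled = [self.gcn(x, a).mean(0) for x, a in zip(region_feats, adjs)]
        seq = torch.stack(pooled).unsqueeze(0)                    # (1, T, dim)
        seq = self.tcn(seq.transpose(1, 2)).transpose(1, 2).contiguous()
        h, _ = self.gru(seq)
        return self.out(h)                                # word logits per step

model = StoryModel()
feats = [torch.randn(n, 256) for n in (4, 6, 5)]   # detected regions per image
adjs = [torch.eye(n) for n in (4, 6, 5)]           # scene-graph adjacency
logits = model(feats, adjs)                        # (1, 3, vocab)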


2020 ◽  
pp. 174569162095069
Author(s):  
Yaacov Trope ◽  
Alison Ledgerwood ◽  
Nira Liberman ◽  
Kentaro Fujita

Adaptive functioning requires the ability both to immerse oneself in the here and now and to move beyond current experience. We leverage and expand construal-level theory to understand how individuals and groups regulate thoughts, feelings, and behavior to address both proximal and distal ends. To connect to distal versus proximal events in a way that meaningfully informs and guides responses in the immediate here and now, people must expand versus contract their regulatory scope. We propose that humans have evolved a number of mental and social tools that enable the modulation of regulatory scope and address the epistemic, emotive, and executive demands of regulation. Critically, across these tools, it is possible to distinguish a hierarchy that varies in abstractness: whereas low-level tools enable contractive scope, high-level tools enable expansive scope. We review empirical results that support these assertions and highlight the novel insights that a regulatory-scope framework provides for understanding diverse phenomena.


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1335
Author(s):  
Menghao Wu ◽  
Yanbin Gao ◽  
Pengfei Wang ◽  
Fan Zhang ◽  
Zhejun Liu

In robotics, obstacle avoidance is an essential ability for distance-sensor-based robots. This type of robot has axisymmetrically distributed distance sensors to acquire obstacle distances, so the state is symmetrical. Training the control policy with reinforcement learning has become a common approach. Considering the complexity of environments such as narrow paths and right-angle turns, robots perform better if the control policy can control steering direction and speed simultaneously. This paper proposes the multi-dimensional action control (MDAC) approach based on reinforcement learning, which can be used in tasks with multiple continuous action spaces. It adopts a hierarchical structure with high- and low-level modules: the low-level policies output concrete actions, and the high-level policy determines when to invoke the low-level modules according to the environment's features. We design robot navigation experiments with continuous action spaces to test the method's performance. The approach is end-to-end and can solve complex obstacle avoidance tasks in navigation.
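
A minimal sketch of this hierarchical control loop, assuming a hypothetical gating rule for the high-level policy and linear low-level modules; in the paper both levels are learned, not hand-written:

import numpy as np

class LowLevelPolicy:
    """Outputs a concrete continuous action: (steering, speed)."""
    def __init__(self, weights):
        self.w = weights

    def act(self, state):
        return np.tanh(self.w @ state)     # bounded two-dimensional action

def high_level_select(state, modules):
    # High-level policy: decide which low-level module to invoke this step.
    # A stand-in rule on the symmetric sensor state; learned in the paper.
    return modules[0] if state.min() > 0.5 else modules[1]

rng = np.random.default_rng(0)
modules = [LowLevelPolicy(rng.normal(size=(2, 8))) for _ in range(2)]
state = rng.random(8)                      # 8 axisymmetric distance readings
steering, speed = high_level_select(state, modules).act(state)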


2021 ◽  
Vol 11 (22) ◽  
pp. 10595
Author(s):  
Wenlong Zhao ◽  
Zhijun Meng ◽  
Kaipeng Wang ◽  
Jiahui Zhang ◽  
Shaoze Lu

Active tracking control is essential for UAVs performing autonomous operations in GPS-denied environments. In the active tracking task, UAVs take high-dimensional raw images as input and execute motor actions to actively follow a dynamic target. Most research focuses on three-stage methods, which entail perception first, followed by high-level decision-making based on the extracted spatial information of the dynamic target, and then UAV movement control using a low-level dynamic controller. Perception methods based on deep neural networks are powerful but require considerable effort for manual ground-truth labeling. Instead, we unify the perception and decision-making stages in a single high-level controller and leverage deep reinforcement learning to learn the mapping from raw images to high-level action commands in a V-REP-based environment, where simulation data are infinite and inexpensive. This end-to-end method also has the advantages of a small parameter size and reduced effort for parameter tuning in the decision-making stage. The high-level controller, which has a novel architecture, explicitly encodes the spatial and temporal features of the dynamic target. Auxiliary segmentation and motion-in-depth losses are introduced to generate denser training signals for fast and stable training of the high-level controller. The high-level controller and a conventional low-level PID controller together constitute our hierarchical active tracking control framework for the UAVs' active tracking task. Simulation experiments show that our controller, trained with several augmentation techniques, generalizes well to dynamic targets with random appearances and velocities and achieves significantly better performance compared with three-stage methods.
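
As a sketch of the low-level stage of such a framework, the following PID controller tracks a velocity command that a learned high-level controller would emit from raw images; the gains, time step, and command interface are illustrative assumptions:

class PID:
    """Conventional PID loop used as the low-level controller."""
    def __init__(self, kp, ki, kd, dt=0.02):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measurement):
        err = setpoint - measurement
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# The learned high-level controller maps raw images to a velocity command;
# here that command is a placeholder value, tracked by the PID at 50 Hz.
yaw_pid = PID(kp=1.2, ki=0.05, kd=0.3)
motor_output = yaw_pid.step(setpoint=0.4, measurement=0.1)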


Author(s):  
Rundong Wang ◽  
Runsheng Yu ◽  
Bo An ◽  
Zinovi Rabinovich

Hierarchical reinforcement learning (HRL) is a promising approach for solving tasks with long time horizons and sparse rewards. It is often implemented as a high-level policy assigning subgoals to a low-level policy. However, it suffers from the high-level non-stationarity problem, since the low-level policy is constantly changing. This non-stationarity also leads to a data efficiency problem: policies need more data at non-stationary states to stabilize training. To address these issues, we propose a novel HRL method: Interactive Influence-based Hierarchical Reinforcement Learning (I^2HRL). First, inspired by agent modeling, we enable interaction between the low-level and high-level policies to stabilize high-level policy training: the high-level policy makes decisions conditioned on the received low-level policy representation as well as the state of the environment. Second, we further stabilize the high-level policy via an information-theoretic regularization with minimal dependence on the changing low-level policy. Third, we propose influence-based exploration to visit more frequently the non-stationary states where more transition data are needed. We experimentally validate the effectiveness of the proposed solution in several tasks in MuJoCo domains, demonstrating that our approach can significantly boost learning performance and accelerate learning compared with state-of-the-art HRL methods.
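
A schematic PyTorch sketch of the first ingredient, a high-level policy conditioned on both the environment state and a representation of the current low-level policy; the dimensions and subgoal space are assumptions, not the paper's settings:

import torch
import torch.nn as nn

class HighLevelPolicy(nn.Module):
    def __init__(self, state_dim=32, policy_repr_dim=16, n_subgoals=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + policy_repr_dim, 64), nn.ReLU(),
            nn.Linear(64, n_subgoals),
        )

    def forward(self, state, low_policy_repr):
        # Conditioning on an embedding of the current low-level policy lets
        # the high-level policy account for its constantly changing partner.
        x = torch.cat([state, low_policy_repr], dim=-1)
        return self.net(x).softmax(-1)     # distribution over subgoals

policy = HighLevelPolicy()
subgoal_probs = policy(torch.randn(1, 32), torch.randn(1, 16))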

