Representation Learning for Grounded Spatial Reasoning

A Topic-Aware Reinforced Model for Weakly Supervised Stance Detection

2019 ◽

Vol 33 ◽

pp. 7249-7256

Author(s):

Penghui Wei ◽

Wenji Mao ◽

Guandan Chen

Keyword(s):

Reinforcement Learning ◽

Opinion Mining ◽

State Of The Art ◽

Public Attitudes ◽

Representation Learning ◽

Experimental Results ◽

Training Data ◽

Policy Network ◽

Proposed Model ◽

Weakly Supervised

Analyzing public attitudes plays an important role in opinion mining systems. Stance detection aims to determine from a text whether its author is in favor of, against, or neutral towards a given target. One challenge of this task is that a text may not explicitly express an attitude towards the target, yet existing approaches build models from target content alone. Moreover, although weakly supervised approaches have been proposed to ease the burden of manually annotating large-scale training data, such approaches suffer from noisy labels. To address these two issues, we propose a Topic-Aware Reinforced Model (TARM) for weakly supervised stance detection. Our model consists of two complementary components: (1) a detection network that incorporates target-related topic information into representation learning to identify stance effectively; and (2) a policy network that learns to eliminate noisy instances from auto-labeled data using off-policy reinforcement learning. The two networks are optimized alternately so that each improves the other’s performance. Experimental results demonstrate that TARM outperforms state-of-the-art approaches.

Learning an Efficient Gait Cycle of a Biped Robot Based on Reinforcement Learning and Artificial Neural Networks

Applied Sciences ◽

10.3390/app9030502 ◽

2019 ◽

Vol 9 (3) ◽

pp. 502 ◽

Cited By ~ 8

Author(s):

Cristyan Gil ◽

Hiram Calvo ◽

Humberto Sossa

Keyword(s):

Reinforcement Learning ◽

Gait Cycle ◽

Biped Robot ◽

Q Learning ◽

Nao Robot ◽

Simulated Environment ◽

Proposed Model ◽

Normal Speed ◽

Multi Level ◽

Efficient Gait

Programming robots to perform different activities requires calculating sequences of joint values while taking into account many factors, such as stability and efficiency, at the same time. For walking in particular, state-of-the-art techniques for approximating these sequences are based on reinforcement learning (RL). In this work we propose a multi-level system in which the same RL method is used first to learn the joint configurations (poses) that allow the robot to stand stably, and then, at the second level, to find the sequence of poses that lets it travel the furthest distance in the shortest time while avoiding falls and keeping a straight path. To evaluate this, we measure the time the robot takes to travel a given distance. To our knowledge, this is the first work to focus on both the speed and the precision of the trajectory at the same time. We implement our model in a simulated environment using Q-learning. Compared with the built-in walking modes of a NAO robot, our model improves on the normal-speed mode and is more robust at fast speed. The proposed model can be extended to other tasks and is independent of any particular robot model.
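
For readers unfamiliar with the Q-learning method named in the abstract above, the following is a minimal sketch of the standard tabular update rule. It is not the authors' implementation: the pose labels, action names, reward value, and the hyperparameters `alpha` and `gamma` are hypothetical placeholders chosen only for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical toy setup: states are pose labels, actions are joint adjustments.
actions = ["lean_forward", "lean_back"]
q = defaultdict(float)                    # unseen (state, action) pairs start at 0.0
q[("pose_b", "lean_forward")] = 1.0       # pretend this value was learned earlier

new_value = q_learning_update(
    q, "pose_a", "lean_forward", reward=0.5, next_state="pose_b", actions=actions
)
```

In this toy call the temporal-difference target is 0.5 + 0.9 × 1.0 = 1.4, so the entry for ("pose_a", "lean_forward") moves one tenth of the way from 0 towards 1.4.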

Iteratively Questioning and Answering for Interpretable Legal Judgment Prediction

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5479 ◽

2020 ◽

Vol 34 (01) ◽

pp. 1250-1257 ◽

Cited By ~ 1

Author(s):

Haoxi Zhong ◽

Yuzhong Wang ◽

Cunchao Tu ◽

Tianyang Zhang ◽

Zhiyuan Liu ◽

...

Keyword(s):

Reinforcement Learning ◽

Gender Bias ◽

Ethical Issues ◽

State Of The Art ◽

Presumption Of Innocence ◽

The World ◽

Comparable Performance ◽

Reward Functions ◽

Real World Datasets ◽

The Given

Legal Judgment Prediction (LJP) aims to predict judgment results according to the facts of cases. In recent years, LJP has rapidly drawn increasing attention from both academia and the legal industry, as it can provide references for legal practitioners and is expected to promote judicial justice. However, research to date usually suffers from a lack of interpretability, which may lead to ethical issues such as inconsistent judgments or gender bias. In this paper, we present QAjudge, a model based on reinforcement learning that visualizes the prediction process and gives interpretable judgments. QAjudge follows two essential principles of legal systems across the world: Presumption of Innocence and Elemental Trial. During inference, a Question Net selects questions from a given set and an Answer Net answers each question according to the fact description. Finally, a Predict Net produces judgment results based on the answers. Reward functions are designed to minimize the number of questions asked. We conduct extensive experiments on several real-world datasets. Experimental results show that QAjudge can provide interpretable judgments while maintaining performance comparable to other state-of-the-art LJP models. The code is available at https://github.com/thunlp/QAjudge.

Rebalancing Expanding EV Sharing Systems with Deep Reinforcement Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/186 ◽

2020 ◽

Author(s):

Man Luo ◽

Wenzhe Zhang ◽

Tianyou Song ◽

Kun Li ◽

Hongming Zhu ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Charging Time ◽

Novel Approach ◽

The World ◽

Net Revenue ◽

Multi Agent ◽

User Demand ◽

Policy Optimization ◽

Over Time

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the world. One of the key challenges in their operation is vehicle rebalancing, i.e., repositioning the EVs across stations to better satisfy future user demand. This is particularly challenging in the shared EV context, because i) the range of EVs is limited while charging time is substantial, which constrains the rebalancing options; and ii) as a new mobility trend, most of the current EV sharing systems are still continuously expanding their station networks, i.e., the targets for rebalancing can change over time. To tackle these challenges, in this paper we model the rebalancing task as a Multi-Agent Reinforcement Learning (MARL) problem, which directly takes the range and charging properties of the EVs into account. We propose a novel approach of policy optimization with action cascading, which isolates the non-stationarity locally, and use two connected networks to solve the formulated MARL. We evaluate the proposed approach using a simulator calibrated with 1-year operation data from a real EV sharing system. Results show that our approach significantly outperforms the state-of-the-art, offering up to 14% gain in order satisfied rate and 12% increase in net revenue.

Multi-UAV Navigation for Partially Observable Communication Coverage by Graph Reinforcement Learning

10.36227/techrxiv.15048273.v1 ◽

2021 ◽

Author(s):

Zhenhui Ye

Keyword(s):

Reinforcement Learning ◽

Ad Hoc Network ◽

Ad Hoc ◽

State Of The Art ◽

Target Area ◽

Proposed Model ◽

Decentralized Execution ◽

Partially Observable ◽

Gated Recurrent Unit ◽

Uav Navigation

In this paper, we aim to design a deep reinforcement learning (DRL) based control solution to navigate a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area and provide optimal communication coverage for ground mobile users. Compared with existing DRL-based solutions, which mainly solve the problem with global observation and centralized training, a practical and efficient Decentralized Training and Decentralized Execution (DTDE) framework is desirable for training and deploying each UAV in a distributed manner. To this end, we propose a novel DRL approach named Deep Recurrent Graph Network (DRGN), which uses a Graph Attention Network-based Flying Ad-hoc Network (GAT-FANET) for inter-UAV communication and a Gated Recurrent Unit (GRU) to record historical information. We conducted extensive experiments to determine an appropriate structure for GAT-FANET and to examine the performance of DRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and four heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of DRGN.

Graph Transformer for Graph-to-Sequence Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6243 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7464-7471

Author(s):

Deng Cai ◽

Wai Lam

Keyword(s):

Neural Networks ◽

Information Exchange ◽

State Of The Art ◽

Structural Information ◽

Representation Learning ◽

Graph Representation ◽

Text Generation ◽

Proposed Model ◽

Graph Neural Networks ◽

Meaning Representation

The dominant graph-to-sequence transduction models employ graph neural networks for graph representation learning, where the structural information is reflected by the receptive field of neurons. Unlike graph neural networks, which restrict information exchange to immediate neighbors, we propose a new model, known as Graph Transformer, that uses explicit relation encoding and allows direct communication between two distant nodes. It provides a more efficient way to model global graph structure. Experiments on text generation from Abstract Meaning Representation (AMR) and on syntax-based neural machine translation show the superiority of our proposed model. Specifically, our model achieves 27.4 BLEU on LDC2015E86 and 29.7 BLEU on LDC2017T10 for AMR-to-text generation, outperforming the state-of-the-art results by up to 2.2 points. On the syntax-based translation tasks, our model establishes new single-model state-of-the-art BLEU scores, 21.3 for English-to-German and 14.1 for English-to-Czech, improving over the existing best results, including ensembles, by over 1 BLEU.

Multi-View Deep Attention Network for Reinforcement Learning (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i10.7177 ◽

2020 ◽

Vol 34 (10) ◽

pp. 13811-13812

Author(s):

Yueyue Hu ◽

Shiliang Sun ◽

Xin Xu ◽

Jing Zhao

Keyword(s):

Reinforcement Learning ◽

Single Agent ◽

Representation Learning ◽

Learning Task ◽

Comprehensive Strategy ◽

Attention Network ◽

Single View ◽

Learning Agents ◽

Proposed Model ◽

First Time

The representation approximated by a single deep network is usually limited for reinforcement learning agents. We propose a novel multi-view deep attention network (MvDAN), which introduces multi-view representation learning into the reinforcement learning task for the first time. The proposed model approximates a set of strategies from multiple representations and combines these strategies using attention mechanisms to provide a comprehensive strategy for a single agent. Experimental results on eight Atari video games show that MvDAN is competitive with single-view reinforcement learning methods.
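
The idea of combining several per-view strategies with attention can be illustrated with a small sketch. The abstract does not say how MvDAN computes or applies its attention weights, so everything here is an assumption: each view is taken to output a probability distribution over actions, the attention scores are plain numbers supplied by the caller, and the names `softmax` and `combine_policies` are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that are positive and sum to 1."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_policies(view_policies, attention_scores):
    """Attention-weighted average of per-view action distributions."""
    weights = softmax(attention_scores)
    n_actions = len(view_policies[0])
    return [
        sum(w * policy[a] for w, policy in zip(weights, view_policies))
        for a in range(n_actions)
    ]


# Two hypothetical views that disagree about the best of three actions.
view_policies = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
]
attention_scores = [0.0, 0.0]             # equal scores give equal weights
combined = combine_policies(view_policies, attention_scores)
```

Because the weights sum to one and each input is a valid distribution, the combined output is also a valid distribution over actions; raising one view's score shifts the result towards that view's preferences.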

Intelligent scheduling using a neural network model in conjunction with reinforcement learning

Proceedings of the Institution of Mechanical Engineers Part B Journal of Engineering Manufacture ◽

10.1243/095440505x8181 ◽

2005 ◽

Vol 219 (2) ◽

pp. 229-235

Author(s):

C J Fourie

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Reinforcement Learning ◽

Learning From Experience ◽

Scheduling System ◽

Simulated Environment ◽

The Neural Network ◽

Learning Techniques ◽

Proposed Model ◽

Intelligent Scheduling

This paper describes the use of an artificial neural network in conjunction with reinforcement learning techniques to develop an intelligent scheduling system that is capable of learning from experience. In a simulated environment the model controls a mobile robot that transports material to machines. States of ‘happiness’ are defined for each machine, which are the inputs to the neural network. The output of the neural network is the decision on which machine to service next. After every decision, a critic evaluates the decision and a teacher ‘rewards’ the network to encourage good decisions and discourage bad decisions. From the results obtained, it is concluded that the proposed model is capable of learning from past experience and thereby improving the intelligence of the system.

Shared Generative Latent Representation Learning for Multi-View Clustering

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6146 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6688-6695

Author(s):

Ming Yin ◽

Weitian Huang ◽

Junbin Gao

Keyword(s):

Large Scale ◽

State Of The Art ◽

Poor Performance ◽

Representation Learning ◽

Performance Criteria ◽

Data Share ◽

Proposed Model ◽

Nonlinear Features ◽

Fundamental Research ◽

Mixture Of Gaussian

Clustering multi-view data has been a fundamental research topic in the computer vision community. It has been shown that integrating information from all the views achieves better accuracy than using any single view. However, existing methods often struggle to handle large-scale datasets and perform poorly at reconstructing samples. This paper proposes a novel multi-view clustering method that learns a shared generative latent representation obeying a mixture of Gaussian distributions. The motivation is that multi-view data share a common latent embedding despite the diversity among the various views. Specifically, benefitting from the success of deep generative learning, the proposed model can not only extract nonlinear features from the views but also capture the correlations among all the views. Extensive experimental results on several datasets of different scales demonstrate that the proposed method outperforms state-of-the-art methods under a range of performance criteria.

Multi-UAV Navigation for Partially Observable Communication Coverage by Graph Reinforcement Learning

10.36227/techrxiv.15048273 ◽

2021 ◽

Author(s):

Zhenhui Ye

Keyword(s):

Reinforcement Learning ◽

Ad Hoc Network ◽

Ad Hoc ◽

State Of The Art ◽

Target Area ◽

Proposed Model ◽

Decentralized Execution ◽

Partially Observable ◽

Gated Recurrent Unit ◽

Uav Navigation

In this paper, we aim to design a deep reinforcement learning (DRL) based control solution to navigate a swarm of unmanned aerial vehicles (UAVs) to fly around an unexplored target area and provide optimal communication coverage for ground mobile users. Compared with existing DRL-based solutions, which mainly solve the problem with global observation and centralized training, a practical and efficient Decentralized Training and Decentralized Execution (DTDE) framework is desirable for training and deploying each UAV in a distributed manner. To this end, we propose a novel DRL approach named Deep Recurrent Graph Network (DRGN), which uses a Graph Attention Network-based Flying Ad-hoc Network (GAT-FANET) for inter-UAV communication and a Gated Recurrent Unit (GRU) to record historical information. We conducted extensive experiments to determine an appropriate structure for GAT-FANET and to examine the performance of DRGN. The simulation results show that the proposed model outperforms four state-of-the-art DRL-based approaches and four heuristic baselines, and demonstrate the scalability, transferability, robustness, and interpretability of DRGN.
