Generalized Representation Learning Methods for Deep Reinforcement Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/748 ◽

2020 ◽

Author(s):

Hanhua Zhu

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Representation Learning ◽

Learning Methods ◽

New Methods ◽

Compact State

Deep reinforcement learning (DRL) has widened the range of successful applications of reinforcement learning (RL) techniques, but it also brings challenges such as low sample efficiency. In this work, I propose generalized representation learning methods to obtain a compact state space suitable for RL from the raw observation state. I expect these new methods to increase the sample efficiency of RL through understandable state representations and therefore to improve the performance of RL.


Hierarchical Neuro-Fuzzy Systems Part II

Encyclopedia of Artificial Intelligence ◽

10.4018/978-1-59904-849-9.ch121 ◽

2011 ◽

pp. 817-824

Author(s):

Marley Vellasco ◽

Marco Pacheco ◽

Karla Figueiredo ◽

Flavio Souza

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Learning Process ◽

Fuzzy Systems ◽

Space Partitioning ◽

Learning Methods ◽

New Class ◽

Neuro Fuzzy ◽

Binary Space Partitioning ◽

Priori Information

This paper describes a new class of neuro-fuzzy models, called Reinforcement Learning Hierarchical Neuro-Fuzzy Systems (RL-HNF). These models employ the BSP (Binary Space Partitioning) and Politree partitioning of the input space [Chrysanthou,1992] and have been developed to bypass traditional drawbacks of neuro-fuzzy systems: the reduced number of allowed inputs and the poor capacity to create their own structure and rules (ANFIS [Jang,1997], NEFCLASS [Kruse,1995] and FSOM [Vuorimaa,1994]). These new models, named Reinforcement Learning Hierarchical Neuro-Fuzzy BSP (RL-HNFB) and Reinforcement Learning Hierarchical Neuro-Fuzzy Politree (RL-HNFP), descend from the original HNFB that uses Binary Space Partitioning (see Hierarchical Neuro-Fuzzy Systems Part I). By using hierarchical partitioning together with the Reinforcement Learning (RL) methodology, a new class of Neuro-Fuzzy Systems (SNF) was obtained which, in addition to automatically learning its structure, autonomously learns the actions to be taken by an agent, dispensing with a priori information (number of rules, fuzzy rules and sets) relative to the learning process. These characteristics represent an important differential when compared with existing intelligent agent learning systems, because in applications involving continuous environments and/or environments considered to be highly dimensional, the use of traditional Reinforcement Learning methods based on lookup tables (a table that stores value functions for a small or discrete state space) is no longer possible, since the state space becomes too large. This second part on hierarchical neuro-fuzzy systems focuses on the use of the reinforcement learning process. The first part presented HNFB models based on supervised learning methods. The RL-HNFB and RL-HNFP models were evaluated in a benchmark control application and a simulated Khepera robot environment with multiple obstacles.
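To make the idea of binary space partitioning of a continuous state space concrete, the following is a minimal sketch. It is not the RL-HNFB model: it uses crisp (non-fuzzy) cells, always halves a cell at its midpoint, and stores one Q-value per action in each leaf. The names `BSPNode`, `leaf_for`, and `q_update`, along with the `alpha` and `gamma` defaults, are hypothetical choices for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BSPNode:
    """One box-shaped cell of a crisp binary space partition (illustrative only)."""
    low: List[float]
    high: List[float]
    depth: int = 0
    q_values: List[float] = field(default_factory=list)  # one Q-value per action
    left: Optional["BSPNode"] = None
    right: Optional["BSPNode"] = None

    def is_leaf(self) -> bool:
        return self.left is None

    def split_dim(self) -> int:
        # Assumption: cycle through the dimensions as the tree gets deeper.
        return self.depth % len(self.low)

    def split(self) -> None:
        """Halve this cell at its midpoint along the current split dimension."""
        dim = self.split_dim()
        mid = (self.low[dim] + self.high[dim]) / 2.0
        left_high = list(self.high)
        left_high[dim] = mid
        right_low = list(self.low)
        right_low[dim] = mid
        # Children start from a copy of the parent's Q-values.
        self.left = BSPNode(list(self.low), left_high, self.depth + 1, list(self.q_values))
        self.right = BSPNode(right_low, list(self.high), self.depth + 1, list(self.q_values))

    def leaf_for(self, state: List[float]) -> "BSPNode":
        """Descend to the leaf cell that contains the given continuous state."""
        node = self
        while not node.is_leaf():
            dim = node.split_dim()
            mid = (node.low[dim] + node.high[dim]) / 2.0
            node = node.left if state[dim] < mid else node.right
        return node


def q_update(root: BSPNode, state: List[float], action: int, reward: float,
             next_state: List[float], alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Standard one-step Q-learning update, applied to the leaf containing `state`."""
    leaf = root.leaf_for(state)
    target = reward + gamma * max(root.leaf_for(next_state).q_values)
    leaf.q_values[action] += alpha * (target - leaf.q_values[action])
    return leaf.q_values[action]


if __name__ == "__main__":
    root = BSPNode([0.0, 0.0], [1.0, 1.0], q_values=[0.0, 0.0])
    root.split()       # splits dimension 0
    root.left.split()  # splits dimension 1 inside the left half only
    cell = root.leaf_for([0.2, 0.8])
    print(cell.low, cell.high)
    print(q_update(root, [0.2, 0.8], action=1, reward=1.0, next_state=[0.9, 0.9]))
```

Because only the cells that are split grow the tree, the partition can be fine where the agent needs resolution and coarse elsewhere, which is the property that a fixed lookup table lacks.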


A novel state space representation for the solution of 2D-HP protein folding problem using reinforcement learning methods

Applied Soft Computing ◽

10.1016/j.asoc.2014.09.047 ◽

2015 ◽

Vol 26 ◽

pp. 213-223 ◽

Cited By ~ 3

Author(s):

Berat Doğan ◽

Tamer Ölmez

Keyword(s):

Protein Folding ◽

Reinforcement Learning ◽

State Space ◽

Space Representation ◽

Learning Methods ◽

State Space Representation


Combined Reinforcement Learning via Abstract Representations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013582 ◽

2019 ◽

Vol 33 ◽

pp. 3582-3589 ◽

Cited By ~ 4

Author(s):

Vincent Francois-Lavet ◽

Yoshua Bengio ◽

Doina Precup ◽

Joelle Pineau

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Transfer Learning ◽

Computationally Efficient ◽

Dimensional Representation ◽

Learning Methods ◽

Model Free ◽

Abstract Representations ◽

Low Dimensional ◽

New Strategies

In the quest for efficient and robust reinforcement learning methods, both model-free and model-based approaches offer advantages. In this paper we propose a new way of explicitly bridging both approaches via a shared low-dimensional learned encoding of the environment, meant to capture summarizing abstractions. We show that the modularity brought by this approach leads to good generalization while being computationally efficient, with planning happening in a smaller latent state space. In addition, this approach recovers a sufficient low-dimensional representation of the environment, which opens up new strategies for interpretable AI, exploration and transfer learning.


Deep Reinforcement Learning for Multiparameter Optimization in de novo Drug Design

10.26434/chemrxiv.7990910.v2 ◽

2019 ◽

Author(s):

Niclas Ståhl ◽

Göran Falkman ◽

Alexander Karlsson ◽

Gunnar Mathiason ◽

Jonas Boström

Keyword(s):

Reinforcement Learning ◽

Short Term Memory ◽

De Novo ◽

De Novo Drug Design ◽

Generative Process ◽

New Methods ◽

Multiparameter Optimization ◽

Long Short Term Memory ◽

New Compounds

In medicinal chemistry programs it is key to design and make compounds that are efficacious and safe. This is a long, complex and difficult multi-parameter optimization process, often including several properties with orthogonal trends. New methods for the automated design of compounds against profiles of multiple properties are thus of great value. Here we present a fragment-based reinforcement learning approach based on an actor-critic model, for the generation of novel molecules with optimal properties. The actor and the critic are both modelled with bidirectional long short-term memory (LSTM) networks. The AI method learns how to generate new compounds with desired properties by starting from an initial set of lead molecules and then improving these by replacing some of their fragments. A balanced binary tree based on the similarity of fragments is used in the generative process to bias the output towards structurally similar molecules. The method is demonstrated by a case study showing that 93% of the generated molecules are chemically valid, and a third satisfy the targeted objectives, while there were none in the initial set.


A Drug Target Interaction Prediction Based on LINE-RF Learning

Current Bioinformatics ◽

10.2174/1574893615666191227092453 ◽

2020 ◽

Vol 15 (7) ◽

pp. 750-757

Author(s):

Jihong Wang ◽

Yue Shi ◽

Xiaodan Wang ◽

Huiyou Chang

Keyword(s):

Network Topology ◽

Drug Target ◽

Large Scale ◽

Representation Learning ◽

New Drugs ◽

Combination Method ◽

Learning Methods ◽

Network Representation ◽

On Line ◽

Clinical Experiments

Background: At present, using computer methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and drug relocation processes. The potential DTIs identified by machine learning methods can provide guidance in biochemical or clinical experiments. Objective: The goal of this article is to combine the latest network representation learning methods for drug-target prediction research, improve model prediction capabilities, and promote new drug development. Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate features obtained from heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs. Results: The experiments in this paper compare the common classifiers RF, LR, and SVM, as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The learning method based on the LINE network can effectively learn hidden features of drugs, targets, diseases and other entities from the network topology. The combination of features learned through multiple networks can enhance the expression ability. RF is an effective method of supervised learning. Therefore, the LINE-RF combination method is a widely applicable method.
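As a rough illustration of two steps the abstract mentions, constructing binary classification samples from learned embeddings and scoring predictions by AUC, here is a small sketch. The abstract does not say how drug and target features are combined, so the concatenation in `pair_features` is an assumption; both function names are hypothetical, and neither the LINE embedding step nor the random forest classifier is reproduced here.

```python
from typing import Dict, List, Sequence, Tuple


def pair_features(drug_emb: Dict[str, Sequence[float]],
                  target_emb: Dict[str, Sequence[float]],
                  pairs: Sequence[Tuple[str, str]]) -> List[List[float]]:
    """Build one feature vector per (drug, target) pair.

    Assumption: features are combined by simple concatenation.
    """
    return [list(drug_emb[d]) + list(target_emb[t]) for d, t in pairs]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the number of samples, which is
    fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy embeddings, made up for illustration only.
    drugs = {"d1": [0.1, 0.2]}
    targets = {"t1": [0.3]}
    print(pair_features(drugs, targets, [("d1", "t1")]))
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The feature vectors produced this way would be passed, together with 0/1 interaction labels, to whichever classifier is being compared.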


Continuity and Smoothness Analysis and Possible Improvement of Traditional Reinforcement Learning Methods

2020 IEEE International Conference on Mechatronics and Automation (ICMA) ◽

10.1109/icma49215.2020.9233547 ◽

2020 ◽

Author(s):

Tianhao Chen ◽

Wenchuan Jia ◽

Jianjun Yuan ◽

Shugen Ma ◽

Limei Cheng

Keyword(s):

Reinforcement Learning ◽

Learning Methods


Reinforcement Learning Methods on Optimization Problems of Natural Gas Pipeline Networks

2020 4th International Conference on Smart Grid and Smart Cities (ICSGSC) ◽

10.1109/icsgsc50906.2020.9248563 ◽

2020 ◽

Author(s):

Dong Yang ◽

Siyun Yan ◽

Dengji Zhou ◽

Tiemin Shao ◽

Lin Zhang ◽

...

Keyword(s):

Reinforcement Learning ◽

Natural Gas ◽

Optimization Problems ◽

Gas Pipeline ◽

Learning Methods ◽

Natural Gas Pipeline ◽

Pipeline Networks


Reinforcement learning versus swarm intelligence for autonomous multi-HAPS coordination

SN Applied Sciences ◽

10.1007/s42452-021-04658-6 ◽

2021 ◽

Vol 3 (6) ◽

Author(s):

Ogbonnaya Anicho ◽

Philip B. Charlesworth ◽

Gurvinder S. Baicher ◽

Atulya K. Nagar

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Swarm Intelligence ◽

Performance Indicators ◽

Convergence Rates ◽

Tuning Parameters ◽

Continuous State Space ◽

Continuous State ◽

User Coverage ◽

Better Than

This work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work which looked at various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge within this work by using partitioning to manage the high dimensionality problem. This enabled comparing the performance of the classical cases of both RL and SI, establishing a baseline for future comparisons of improved versions. From previous work, SI was observed to perform better across various key performance indicators. However, after tuning parameters and empirically choosing a suitable partitioning ratio for the RL state space, it was observed that the SI algorithm still maintained superior coordination capability by achieving higher mean overall user coverage (about 20% better than the RL algorithm), in addition to faster convergence rates. Though the RL technique showed better average peak user coverage, the unpredictable coverage dip was a key weakness, making SI a more suitable algorithm within the context of this work.
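One common way to partition a continuous state space for a classical tabular RL algorithm is a uniform grid, sketched below. The abstract does not describe the paper's actual partitioning scheme or ratio, so the uniform bins, the clamping of out-of-range values, and the names `discretize` and `table_size` are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def discretize(state: Sequence[float], lows: Sequence[float],
               highs: Sequence[float], bins: Sequence[int]) -> Tuple[int, ...]:
    """Map a continuous state to a tuple of bin indices, one per dimension."""
    cell = []
    for x, lo, hi, n in zip(state, lows, highs, bins):
        frac = (x - lo) / (hi - lo)
        idx = int(frac * n)
        # Clamp so that values on or beyond the bounds fall in the edge bins.
        cell.append(min(max(idx, 0), n - 1))
    return tuple(cell)


def table_size(bins: Sequence[int], n_actions: int) -> int:
    """Number of Q-table entries implied by a given partitioning."""
    size = n_actions
    for n in bins:
        size *= n
    return size


if __name__ == "__main__":
    lows, highs = [0.0, 0.0], [1.0, 10.0]
    print(discretize([0.25, 9.9], lows, highs, bins=[4, 5]))
    # Finer partitions give more resolution but a larger table to learn.
    print(table_size([4, 5], n_actions=3))
    print(table_size([40, 50], n_actions=3))
```

The table size grows with the product of the bin counts, which is why the choice of partitioning granularity has to be tuned empirically: too coarse loses information, too fine slows learning.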


RLPath: a knowledge graph link prediction method using reinforcement learning based attentive relation path searching and representation learning

Applied Intelligence ◽

10.1007/s10489-021-02672-0 ◽

2021 ◽

Author(s):

Ling Chen ◽

Jun Cui ◽

Xing Tang ◽

Yuntao Qian ◽

Yansheng Li ◽

...

Keyword(s):

Reinforcement Learning ◽

Link Prediction ◽

Prediction Method ◽

Representation Learning ◽

Knowledge Graph ◽

Graph Link


An Exploratory Study of COVID-19 Information on Twitter in the Greater Region

Big Data and Cognitive Computing ◽

10.3390/bdcc5010005 ◽

2021 ◽

Vol 5 (1) ◽

pp. 5

Author(s):

Ninghan Chen ◽

Zhiqiang Zhong ◽

Jun Pang

Keyword(s):

Machine Learning ◽

Social Networks ◽

Online Social Networks ◽

Exploratory Study ◽

Representation Learning ◽

Data Driven ◽

Learning Methods ◽

Twitter Users ◽

Over Time

The outbreak of COVID-19 led to a burst of information in major online social networks (OSNs). Facing this constantly changing situation, OSNs have become an essential platform for people to express opinions and seek up-to-the-minute information. Thus, discussions on OSNs may become a reflection of reality. This paper aims to figure out how Twitter users in the Greater Region (GR) and related countries react differently over time by conducting a data-driven exploratory study of COVID-19 information using machine learning and representation learning methods. We find that tweet volume and COVID-19 cases in GR and related countries are correlated, but this correlation only exists in a particular period of the pandemic. Moreover, we plot how topics change in each country and region from 22 January 2020 to 5 June 2020, identifying the main differences between GR and related countries.
