End-to-end Deep Reinforcement Learning for Targeted Drug Generation

Author(s):  
Tiago Oliveira Pereira ◽  
Maryam Abbasi ◽  
Bernardete Ribeiro ◽  
Joel P. Arrais
Author(s):  
Nathan Hunt ◽  
Nathan Fulton ◽  
Sara Magliacane ◽  
Trong Nghia Hoang ◽  
Subhro Das ◽  
...  

Author(s):  
Nicolas Bougie ◽  
Ryutaro Ichise

Deep reinforcement learning (DRL) methods traditionally struggle with tasks where environment rewards are sparse or delayed, so exploration remains one of the key challenges of DRL. Instead of relying solely on extrinsic rewards, many state-of-the-art methods use intrinsic curiosity as an exploration signal. While such methods hold the promise of better local exploration, discovering global exploration strategies remains beyond the reach of current approaches. We propose a novel end-to-end intrinsic reward formulation that introduces high-level exploration into reinforcement learning. Our curiosity signal is driven by a fast reward that handles local exploration and a slow reward that incentivizes long-horizon exploration strategies. We formulate curiosity as the error in an agent's ability to reconstruct observations given their contexts. Experimental results show that this high-level exploration enables our agents to outperform prior work on several Atari games.
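
A minimal sketch of the dual-timescale curiosity idea described above, assuming a mean-squared reconstruction error as the fast signal and an exponential moving average as the slow signal; the mixing weight and the predictor interface are illustrative choices, not the authors' exact formulation:

    import numpy as np

    def reconstruction_error(predicted_obs, actual_obs):
        # Error between a context-conditioned prediction and the actual observation.
        return float(np.mean((np.asarray(predicted_obs) - np.asarray(actual_obs)) ** 2))

    class DualCuriosity:
        def __init__(self, beta=0.5, slow_decay=0.99):
            self.beta = beta              # mix between fast (local) and slow (long-horizon) signals
            self.slow_decay = slow_decay
            self.slow_error = 0.0         # running average of reconstruction errors

        def reward(self, predicted_obs, actual_obs):
            fast = reconstruction_error(predicted_obs, actual_obs)
            # The slow signal stays high only when novelty persists across many steps,
            # encouraging longer-horizon exploration strategies.
            self.slow_error = self.slow_decay * self.slow_error + (1 - self.slow_decay) * fast
            return self.beta * fast + (1 - self.beta) * self.slow_error

In such a setup, this intrinsic reward would simply be added to the extrinsic reward at each environment step.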


2021 ◽  
Vol 14 (11) ◽  
pp. 2563-2575
Author(s):  
Junwen Yang ◽  
Yeye He ◽  
Surajit Chaudhuri

Recent work has made significant progress in helping users automate single data-preparation steps, such as string transformations and table-manipulation operators (e.g., Join, GroupBy, Pivot, etc.). In this work, we propose to automate multiple such steps end-to-end by synthesizing complex data pipelines with both string transformations and table-manipulation operators. We propose a novel by-target paradigm that allows users to easily specify the desired pipeline, a significant departure from the traditional by-example paradigm. Using by-target, users provide input tables (e.g., csv or json files) and point us to a "target table" (e.g., an existing database table or BI dashboard) that demonstrates what the output of the desired pipeline should schematically "look like". While the problem is seemingly under-specified, our key insight is that implicit table constraints such as FDs and keys can be exploited to significantly constrain the search space and make the problem tractable. We develop an AUTO-PIPELINE system that learns to synthesize pipelines using deep reinforcement learning (DRL) and search. Experiments on a benchmark of 700 real pipelines crawled from GitHub and commercial vendors suggest that AUTO-PIPELINE can successfully synthesize around 70% of complex pipelines with up to 10 steps.
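
A minimal sketch of how the by-target idea could prune candidates, assuming candidate-pipeline outputs are filtered by schema and key constraints implied by the target table; the function names and the uniqueness-based key check are illustrative assumptions, not the AUTO-PIPELINE implementation:

    import pandas as pd

    def key_columns(table: pd.DataFrame) -> set:
        # Columns whose values are all unique serve as a rough proxy for candidate keys.
        return {c for c in table.columns if table[c].is_unique}

    def satisfies_target_constraints(candidate: pd.DataFrame, target: pd.DataFrame) -> bool:
        # Schema check: the candidate must produce at least the target's columns.
        if not set(target.columns).issubset(set(candidate.columns)):
            return False
        # Key check: every key of the target must also be a key of the candidate,
        # pruning pipelines that, e.g., omit a needed GroupBy or a deduplicating Join.
        return key_columns(target).issubset(key_columns(candidate))

Checks of this kind could be used to filter the search frontier before a DRL policy scores the remaining candidate pipelines.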

