Diversity Oriented Deep Reinforcement Learning for Targeted Molecule Generation

AbstractIn this work, we explore the potential of deep learning to streamline the process of identifying new potential drugs through the computational generation of molecules with interesting biological properties. Two deep neural networks compose our targeted generation framework: the Generator, which is trained to learn the building rules of valid molecules employing SMILES strings notation, and the Predictor which evaluates the newly generated compounds by predicting their affinity for the desired target. Then, the Generator is optimized through Reinforcement Learning to produce molecules with bespoken properties. The innovation of this approach is the exploratory strategy applied during the reinforcement training process that seeks to add novelty to the generated compounds. This training strategy employs two Generators interchangeably to sample new SMILES: the initially trained model that will remain fixed and a copy of the previous one that will be updated during the training to uncover the most promising molecules. The evolution of the reward assigned by the Predictor determines how often each one is employed to select the next token of the molecule. This strategy establishes a compromise between the need to acquire more information about the chemical space and the need to sample new molecules, with the experience gained so far. To demonstrate the effectiveness of the method, the Generator is trained to design molecules with an optimized coefficient of partition and also high inhibitory power against the Adenosine $$A_{2A}$$ A 2 A and $$\kappa$$ κ opioid receptors. The results reveal that the model can effectively adjust the newly generated molecules towards the wanted direction. More importantly, it was possible to find promising sets of unique and diverse molecules, which was the main purpose of the newly implemented strategy.

Download Full-text

An Exploration Strategy Improves the Diversity of de novo Ligands Using Deep Reinforcement Learning – A Case for the Adenosine A2A Receptor

10.26434/chemrxiv.7436789 ◽

2019 ◽

Author(s):

Xuhan Liu ◽

Kai Ye ◽

Herman Van Vlijmen ◽

Adriaan P. IJzerman ◽

Gerard JP Van westen

Keyword(s):

Reinforcement Learning ◽

Language Processing ◽

De Novo ◽

Chemical Space ◽

Chemical Diversity ◽

Reward Function ◽

Chemical Structures ◽

Exploration Strategy ◽

Machine Learning Model ◽

Adenosine A2a

Over the last five years deep learning has progressed tremendously in both image recognition and natural language processing. Now it is increasingly applied to other data rich fields. In drug discovery, recurrent neural networks (RNNs) have been shown to be an effective method to generate novel chemical structures in the form of SMILES. However, ligands generated by current methods have so far provided relatively low diversity and do not fully cover the whole chemical space occupied by known ligands. Here, we propose a new method (DrugEx) to discover de novo drug-like molecules. DrugEx is an RNN model (generator) trained through reinforcement learning which was integrated with a special exploration strategy. As a case study we applied our method to design ligands against the adenosine A2A receptor. From ChEMBL data, a machine learning model (predictor) was created to predict whether generated molecules are active or not. Based on this predictor as the reward function, the generator was trained by reinforcement learning without any further data. We then compared the performance of our method with two previously published methods, REINVENT and ORGANIC. We found that candidate molecules our model designed, and predicted to be active, had a larger chemical diversity, and better covered the chemical space of known ligands compared to the state-of-the-art.

Download Full-text

An Exploration Strategy Improves the Diversity of de novo Ligands Using Deep Reinforcement Learning – A Case for the 3 Adenosine A2A Receptor

10.26434/chemrxiv.7436789.v1 ◽

2018 ◽

Author(s):

Xuhan Liu ◽

Kai Ye ◽

Herman Van Vlijmen ◽

Adriaan P. IJzerman ◽

Gerard JP Van westen

Keyword(s):

Reinforcement Learning ◽

Language Processing ◽

De Novo ◽

Chemical Space ◽

Chemical Diversity ◽

Reward Function ◽

Chemical Structures ◽

Exploration Strategy ◽

Machine Learning Model ◽

Adenosine A2a

Over the last five years deep learning has progressed tremendously in both image recognition and natural language processing. Now it is increasingly applied to other data rich fields. In drug discovery, recurrent neural networks (RNNs) have been shown to be an effective method to generate novel chemical structures in the form of SMILES. However, ligands generated by current methods used to provide relatively little diversity and do not fully cover the whole chemical space occupied by known ligands. Here, we propose a new method (DrugEx) to discover de novo drug-like molecules. DrugEx is an RNN model (generator) trained through a special exploration strategy integrated into reinforcement learning. As a case study we applied our method to design ligands against the adenosine A2A receptor. From ChEMBL data, a machine learning model (predictor) was created to predict whether generated molecules are active or not. Based on this predictor as the reward function, the generator was trained by reinforcement learning without any further data. We then compared the performance of our method with two previously published methods, REINVENT and ORGANIC. We found that candidate molecules our model designed that were predicted to be active, had a larger chemical diversity, and better covered the chemical space of known ligands compared to the state-of-the-art.

Download Full-text

An Exploration Strategy Improves the Diversity of de novo Ligands Using Deep Reinforcement Learning – A Case for the Adenosine A2A Receptor

10.26434/chemrxiv.7436789.v3 ◽

2019 ◽

Author(s):

Xuhan Liu ◽

Kai Ye ◽

Herman Van Vlijmen ◽

Adriaan P. IJzerman ◽

Gerard JP Van westen

Keyword(s):

Reinforcement Learning ◽

Language Processing ◽

De Novo ◽

Chemical Space ◽

Chemical Diversity ◽

Reward Function ◽

Chemical Structures ◽

Exploration Strategy ◽

Machine Learning Model ◽

Adenosine A2a

Over the last five years deep learning has progressed tremendously in both image recognition and natural language processing. Now it is increasingly applied to other data rich fields. In drug discovery, recurrent neural networks (RNNs) have been shown to be an effective method to generate novel chemical structures in the form of SMILES. However, ligands generated by current methods have so far provided relatively low diversity and do not fully cover the whole chemical space occupied by known ligands. Here, we propose a new method (DrugEx) to discover de novo drug-like molecules. DrugEx is an RNN model (generator) trained through reinforcement learning which was integrated with a special exploration strategy. As a case study we applied our method to design ligands against the adenosine A2A receptor. From ChEMBL data, a machine learning model (predictor) was created to predict whether generated molecules are active or not. Based on this predictor as the reward function, the generator was trained by reinforcement learning without any further data. We then compared the performance of our method with two previously published methods, REINVENT and ORGANIC. We found that candidate molecules our model designed, and predicted to be active, had a larger chemical diversity, and better covered the chemical space of known ligands compared to the state-of-the-art.

Download Full-text

An Exploration Strategy Improves the Diversity of de novo Ligands Using Deep Reinforcement Learning – A Case for the Adenosine A2A Receptor

10.26434/chemrxiv.7436789.v2 ◽

2018 ◽

Author(s):

Xuhan Liu ◽

Kai Ye ◽

Herman Van Vlijmen ◽

Adriaan P. IJzerman ◽

Gerard JP Van westen

Keyword(s):

Reinforcement Learning ◽

Language Processing ◽

De Novo ◽

Chemical Space ◽

Chemical Diversity ◽

Reward Function ◽

Chemical Structures ◽

Exploration Strategy ◽

Machine Learning Model ◽

Adenosine A2a

Over the last five years deep learning has progressed tremendously in both image recognition and natural language processing. Now it is increasingly applied to other data rich fields. In drug discovery, recurrent neural networks (RNNs) have been shown to be an effective method to generate novel chemical structures in the form of SMILES. However, ligands generated by current methods used to provide relatively little diversity and do not fully cover the whole chemical space occupied by known ligands. Here, we propose a new method (DrugEx) to discover de novo drug-like molecules. DrugEx is an RNN model (generator) trained through a special exploration strategy integrated into reinforcement learning. As a case study we applied our method to design ligands against the adenosine A2A receptor. From ChEMBL data, a machine learning model (predictor) was created to predict whether generated molecules are active or not. Based on this predictor as the reward function, the generator was trained by reinforcement learning without any further data. We then compared the performance of our method with two previously published methods, REINVENT and ORGANIC. We found that candidate molecules our model designed that were predicted to be active, had a larger chemical diversity, and better covered the chemical space of known ligands compared to the state-of-the-art.

Download Full-text

A Decision-Making Method for Connected Autonomous Driving Based on Reinforcement Learning

10.4271/2020-01-5154 ◽

2020 ◽

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Learning Strategy ◽

Ride Comfort ◽

Autonomous Driving ◽

Training Process ◽

Training Strategy ◽

Reward Function ◽

Decision Making System ◽

Model Training

At present, with the development of Intelligent Vehicle Infrastructure Cooperative Systems (IVICS), the decision-making for automated vehicle based on connected environment conditions has attracted more attentions. Reliability, efficiency and generalization performance are the basic requirements for the vehicle decision-making system. Therefore, this paper proposed a decision-making method for connected autonomous driving based on Wasserstein Generative Adversarial Nets-Deep Deterministic Policy Gradient (WGAIL-DDPG) algorithm. In which, the key components for reinforcement learning (RL) model, reward function, is designed from the aspect of vehicle serviceability, such as safety, ride comfort and handling stability. To reduce the complexity of the proposed model, an imitation learning strategy is introduced to improve the RL training process. Meanwhile, the model training strategy based on cloud computing effectively solves the problem of insufficient computing resources of the vehicle-mounted system. Test results show that the proposed method can improve the efficiency for RL training process with reliable decision making performance and reveals excellent generalization capability.

Download Full-text

Location- and Person-Independent Activity Recognition with WiFi, Deep Neural Networks, and Reinforcement Learning

ACM Transactions on Internet of Things ◽

10.1145/3424739 ◽

2021 ◽

Vol 2 (1) ◽

pp. 1-25

Author(s):

Yongsen Ma ◽

Sheheryar Arshad ◽

Swetha Muniraju ◽

Eric Torkildson ◽

Enrico Rantala ◽

...

Keyword(s):

Neural Network ◽

Neural Networks ◽

Reinforcement Learning ◽

Activity Recognition ◽

Deep Neural Networks ◽

State Machine ◽

Recognition Algorithm ◽

The State ◽

Neural Architecture ◽

Learning Agent

In recent years, Channel State Information (CSI) measured by WiFi is widely used for human activity recognition. In this article, we propose a deep learning design for location- and person-independent activity recognition with WiFi. The proposed design consists of three Deep Neural Networks (DNNs): a 2D Convolutional Neural Network (CNN) as the recognition algorithm, a 1D CNN as the state machine, and a reinforcement learning agent for neural architecture search. The recognition algorithm learns location- and person-independent features from different perspectives of CSI data. The state machine learns temporal dependency information from history classification results. The reinforcement learning agent optimizes the neural architecture of the recognition algorithm using a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM). The proposed design is evaluated in a lab environment with different WiFi device locations, antenna orientations, sitting/standing/walking locations/orientations, and multiple persons. The proposed design has 97% average accuracy when testing devices and persons are not seen during training. The proposed design is also evaluated by two public datasets with accuracy of 80% and 83%. The proposed design needs very little human efforts for ground truth labeling, feature engineering, signal processing, and tuning of learning parameters and hyperparameters.

Download Full-text

Critical Assessment of Artificial Intelligence Methods for Prediction of hERG Channel Inhibition in the ‘Big Data’ Era

10.26434/chemrxiv.12119040 ◽

2020 ◽

Cited By ~ 1

Author(s):

Vishal Babu Siramshetty ◽

Dac-Trung Nguyen ◽

Natalia J. Martinez ◽

Anton Simeonov ◽

Noel T. Southall ◽

...

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Big Data ◽

Recurrent Neural Networks ◽

Deep Neural Networks ◽

Prediction Models ◽

Chemical Space ◽

Superior Performance ◽

Gradient Boosting ◽

Artificial Intelligence Methods

The rise of novel artificial intelligence methods necessitates a comparison of this wave of new approaches with classical machine learning for a typical drug discovery project. Inhibition of the potassium ion channel, whose alpha subunit is encoded by human Ether-à-go-go-Related Gene (hERG), leads to prolonged QT interval of the cardiac action potential and is a significant safety pharmacology target for the development of new medicines. Several computational approaches have been employed to develop prediction models for assessment of hERG liabilities of small molecules including recent work using deep learning methods. Here we perform a comprehensive comparison of prediction models based on classical (random forests and gradient boosting) and modern (deep neural networks and recurrent neural networks) artificial intelligence methods. The training set (~9000 compounds) was compiled by integrating hERG bioactivity data from ChEMBL database with experimental data generated from an in-house, high-throughput thallium flux assay. We utilized different molecular descriptors including the latent descriptors, which are real-valued continuous vectors derived from chemical autoencoders trained on a large chemical space (> 1.5 million compounds). The models were prospectively validated on ~840 in-house compounds screened in the same thallium flux assay. The deep neural networks performed significantly better than the classical methods with the latent descriptors. The recurrent neural networks that operate on SMILES provided highest model sensitivity. The best models were merged into a consensus model that offered superior performance compared to reference models from academic and commercial domains. Further, we shed light on the potential of artificial intelligence methods to exploit the chemistry big data and generate novel chemical representations useful in predictive modeling and tailoring new chemical space.

Download Full-text

Generating stable molecules using imitation and reinforcement learning

Machine Learning: Science and Technology ◽

10.1088/2632-2153/ac3eb4 ◽

2021 ◽

Author(s):

Søren Ager Meldgaard ◽

Jonas Köhler ◽

Henrik Lund Mortensen ◽

Mads-Peter Verner Christiansen ◽

Frank Noé ◽

...

Keyword(s):

Reinforcement Learning ◽

Chemical Space ◽

Training Data ◽

Graph Representation ◽

Imitation Learning ◽

Training Set ◽

Machine Learning Methods ◽

Multiple Copies ◽

The Stability ◽

3D Information

Abstract Chemical space is routinely explored by machine learning methods to discover interesting molecules, before time-consuming experimental synthesizing is attempted. However, these methods often rely on a graph representation, ignoring 3D information necessary for determining the stability of the molecules. We propose a reinforcement learning approach for generating molecules in cartesian coordinates allowing for quantum chemical prediction of the stability. To improve sample-efficiency we learn basic chemical rules from imitation learning on the GDB-11 database to create an initial model applicable for all stoichiometries. We then deploy multiple copies of the model conditioned on a specific stoichiometry in a reinforcement learning setting. The models correctly identify low energy molecules in the database and produce novel isomers not found in the training set. Finally, we apply the model to larger molecules to show how reinforcement learning further refines the imitation learning model in domains far from the training data.

Download Full-text

A Novel Rank Selection Scheme in Tensor Ring Decomposition Based on Reinforcement Learning for Deep Neural Networks

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9053292 ◽

2020 ◽

Cited By ~ 1

Author(s):

Zhiyu Cheng ◽

Baopu Li ◽

Yanwen Fan ◽

Yingze Bao

Keyword(s):

Neural Networks ◽

Reinforcement Learning ◽

Deep Neural Networks ◽

Selection Scheme

Download Full-text