DockStream: a docking wrapper to enhance de novo molecular design

Interest in generative models for molecules in drug development has risen in recent years. In de novo molecular design, these models are used to generate molecules with desired properties from scratch. They are sometimes used in place of virtual screening, which is limited by the size of the libraries that can be searched in practice: rather than screening existing libraries, generative models can build custom libraries from scratch. Because generative models can optimize molecules directly towards the desired profile, they can speed up this time-consuming process. The purpose of this work is to show how current shortcomings in evaluating generative models for molecules can be avoided. We cover both distribution-learning and goal-directed generation, with a focus on the latter. Data for three well-known targets were downloaded from ChEMBL: Janus kinase 2 (JAK2), epidermal growth factor receptor (EGFR), and dopamine receptor D2 (DRD2) (Bento et al. 2014). We preprocessed the data to obtain binary classification tasks. Before a scoring function is built, the data are split into two halves, which we refer to as split 1 and split 2. The ratio of active to inactive compounds. Our goal is to train three bioactivity models with equal predictive performance, one to be used as a scoring function for chemical optimization and the other two to be used as performance evaluation models. Our findings suggest that, in distribution-learning, even the most basic and completely useless models can attain near-perfect scores on many existing metrics. According to benchmark studies, likelihood-based models account for many of the best-performing methods, and we propose that test set likelihoods be included in future comparisons.

De novo molecular design and generative models

Drug Discovery Today ◽

10.1016/j.drudis.2021.05.019 ◽

2021 ◽

Author(s):

Joshua Meyers ◽

Benedek Fabian ◽

Nathan Brown

Keyword(s):

De Novo ◽

Molecular Design ◽

Generative Models ◽

De Novo Molecular Design

Memory-assisted reinforcement learning for diverse molecular de novo design

10.21203/rs.3.rs-52871/v2 ◽

2020 ◽

Author(s):

Thomas Blaschke ◽

Ola Engkvist ◽

Jürgen Bajorath ◽

Hongming Chen

Keyword(s):

Reinforcement Learning ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

Scoring Function ◽

Chemical Diversity ◽

Chemical Structures ◽

Machine Learning Model ◽

Low Diversity ◽

De Novo Molecular Design

In de novo molecular design, recurrent neural networks (RNN) have been shown to be effective methods for sampling and generating novel chemical structures. Using a technique called reinforcement learning (RL), an RNN can be tuned to target a particular section of chemical space with optimized desirable properties using a scoring function. However, ligands generated by current RL methods so far tend to have relatively low diversity, and sometimes even result in duplicate structures when optimizing towards desired properties. Here, we propose a new method to address the low diversity issue in RL for molecular design. Memory-assisted RL is an extension of the known RL, with the introduction of a so-called memory unit. As proof of concept, we applied our method to generate structures with a desired AlogP value. In a second case study, we applied our method to design ligands for the dopamine type 2 receptor and the 5-hydroxytryptamine type 1A receptor. For both receptors, a machine learning model was developed to predict whether generated molecules were active or not for the receptor. In both case studies, it was found that memory-assisted RL led to the generation of more compounds predicted to be active having higher chemical diversity, thus achieving better coverage of chemical space of known ligands compared to established RL methods.
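The abstract above names a memory unit but does not say how it works. The following is only a minimal sketch of one plausible reading, not the authors' implementation: high-scoring structures are remembered in buckets, and the score is set to zero for exact duplicates or once a bucket is full, so the optimizer has less incentive to keep revisiting the same region. The class name `MemoryUnit`, the parameters `bucket_size` and `threshold`, and the `bucket_key` function (a stand-in for whatever grouping a real system might use) are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MemoryUnit:
    """Hypothetical memory unit: remembers rewarded structures per bucket."""

    def __init__(self, bucket_size: int,
                 bucket_key: Callable[[str], str] = lambda s: s):
        self.bucket_size = bucket_size  # assumed cap on structures per bucket
        self.bucket_key = bucket_key    # assumed grouping of related structures
        self.buckets: Dict[str, List[str]] = defaultdict(list)

    def adjust(self, structure: str, score: float,
               threshold: float = 0.5) -> float:
        """Return the score after applying the diversity penalty."""
        # Low-scoring structures pass through unchanged and are not stored.
        if score < threshold:
            return score
        bucket = self.buckets[self.bucket_key(structure)]
        # Exact duplicates and structures landing in a full bucket get no reward.
        if structure in bucket or len(bucket) >= self.bucket_size:
            return 0.0
        bucket.append(structure)
        return score


# Toy usage: strings stand in for structures, and the first two characters
# stand in for a grouping key (an illustrative assumption only).
memory = MemoryUnit(bucket_size=2, bucket_key=lambda s: s[:2])
adjusted = [memory.adjust(s, 0.9) for s in ["CCO", "CCO", "CCN", "CCC"]]
```

In this toy run the second `"CCO"` is penalized as a duplicate and `"CCC"` is penalized because its bucket already holds two structures, while the other two keep their original score.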

Attention-based generative models for de novo molecular design

Chemical Science ◽

10.1039/d1sc01050f ◽

2021 ◽

Author(s):

Orion Dollar ◽

Nisarg Joshi ◽

David Beck ◽

Jim Pfaendtner

Keyword(s):

De Novo ◽

Molecular Design ◽

Data Modeling ◽

Generative Models ◽

Sequential Data ◽

De Novo Molecular Design ◽

Generative Algorithms ◽

The Impact

Attention mechanisms have led to many breakthroughs in sequential data modeling but have yet to be incorporated into any generative algorithms for molecular design. Here we explore the impact of...

Deep Generative Models for Ligand-based de Novo Design Applied to Multi-parametric Optimization

10.26434/chemrxiv.13622417.v1 ◽

2021 ◽

Author(s):

Quentin Perron ◽

Olivier Mirguet ◽

Hamza Tajmouati ◽

Adam Skiredj ◽

Anne Rojas ◽

...

Keyword(s):

Deep Learning ◽

Drug Discovery ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

New Technology ◽

De Novo Design ◽

Generative Models ◽

De Novo Molecular Design

Multi-Parameter Optimization (MPO) is a major challenge in New Chemical Entity (NCE) drug discovery projects, and the inability to identify molecules meeting all the criteria of lead optimization (LO) is an important cause of NCE project failure. Several ligand- and structure-based de novo design methods have been published over the past decades, some of which have proved useful for multiobjective optimization. However, there is still a need for improvement to better address the chemical feasibility of generated compounds and to increase the explored chemical space while tackling the MPO challenge. Recently, promising results have been reported for deep learning generative models applied to de novo molecular design, but until now, to our knowledge, no report has been made of the value of this new technology for addressing MPO in an actual drug discovery project. Our objective in this study was to evaluate the potential of a ligand-based de novo design technology using deep learning generative models to accelerate the discovery of an optimized lead compound meeting all in vitro late-stage LO criteria.

Deep Generative Models for Ligand-based de Novo Design Applied to Multi-parametric Optimization

10.26434/chemrxiv.13622417.v2 ◽

2021 ◽

Author(s):

Quentin Perron ◽

Olivier Mirguet ◽

Hamza Tajmouati ◽

Adam Skiredj ◽

Anne Rojas ◽

...

Keyword(s):

Deep Learning ◽

Drug Discovery ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

New Technology ◽

De Novo Design ◽

Generative Models ◽

De Novo Molecular Design

Multi-Parameter Optimization (MPO) is a major challenge in New Chemical Entity (NCE) drug discovery projects, and the inability to identify molecules meeting all the criteria of lead optimization (LO) is an important cause of NCE project failure. Several ligand- and structure-based de novo design methods have been published over the past decades, some of which have proved useful for multiobjective optimization. However, there is still a need for improvement to better address the chemical feasibility of generated compounds and to increase the explored chemical space while tackling the MPO challenge. Recently, promising results have been reported for deep learning generative models applied to de novo molecular design, but until now, to our knowledge, no report has been made of the value of this new technology for addressing MPO in an actual drug discovery project. Our objective in this study was to evaluate the potential of a ligand-based de novo design technology using deep learning generative models to accelerate the discovery of an optimized lead compound meeting all in vitro late-stage LO criteria.

Memory-Assisted Reinforcement Learning for Diverse Molecular De Novo Design

10.26434/chemrxiv.12693152.v1 ◽

2020 ◽

Cited By ~ 1

Author(s):

Thomas Blaschke ◽

Ola Engkvist ◽

Jürgen Bajorath ◽

Hongming Chen

Keyword(s):

Reinforcement Learning ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

Scoring Function ◽

De Novo Design ◽

Memory Unit ◽

Chemical Structures ◽

Low Diversity ◽

De Novo Molecular Design

In de novo molecular design, recurrent neural networks (RNN) have been shown to be effective methods for sampling and generating novel chemical structures. Using a technique called reinforcement learning (RL), an RNN can be tuned to target a particular section of chemical space with optimized desirable properties using a scoring function. However, ligands generated by current RL methods so far tend to have relatively low diversity, and sometimes even result in duplicate structures when optimizing towards particular properties. Here, we propose a new method to address the low diversity issue in RL. Memory-assisted RL is an extension of the known RL, with the introduction of a so-called memory unit.

Memory-assisted Reinforcement Learning for Diverse Molecular De Novo Design

10.21203/rs.3.rs-52871/v1 ◽

2020 ◽

Author(s):

Thomas Blaschke ◽

Ola Engkvist ◽

Jürgen Bajorath ◽

Hongming Chen

Keyword(s):

Reinforcement Learning ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

Scoring Function ◽

Chemical Diversity ◽

Chemical Structures ◽

Low Diversity ◽

De Novo Molecular Design ◽

Dopamine 2 Receptor

In de novo molecular design, recurrent neural networks (RNN) have been shown to be effective methods for sampling and generating novel chemical structures. Using a technique called reinforcement learning (RL), an RNN can be tuned to target a particular section of chemical space with optimized desirable properties using a scoring function. However, ligands generated by current RL methods so far tend to have relatively low diversity, and sometimes even result in duplicate structures when optimizing towards particular properties. Here, we propose a new method to address the low diversity issue in RL. Memory-assisted RL is an extension of the known RL, with the introduction of a so-called memory unit. As proof of concept, we applied our method to generate structures with an optimized logP. In a second case study, we applied our method to design ligands for the dopamine 2 receptor and the 5-hydroxytryptamine 1A receptor. For both receptors, a machine learning model was developed to predict whether generated molecules were active or not for the receptor. In both case studies, it was found that memory-assisted RL led to the generation of more active compounds with higher chemical diversity, thus achieving better coverage of the chemical space of known ligands compared to established RL methods.

Deep Generative Models for Ligand-based de Novo Design Applied to Multi-parametric Optimization

10.26434/chemrxiv.13622417 ◽

2021 ◽

Author(s):

Quentin Perron ◽

Olivier Mirguet ◽

Hamza Tajmouati ◽

Adam Skiredj ◽

Anne Rojas ◽

...

Keyword(s):

Deep Learning ◽

Drug Discovery ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

New Technology ◽

De Novo Design ◽

Generative Models ◽

De Novo Molecular Design

Multi-Parameter Optimization (MPO) is a major challenge in New Chemical Entity (NCE) drug discovery projects, and the inability to identify molecules meeting all the criteria of lead optimization (LO) is an important cause of NCE project failure. Several ligand- and structure-based de novo design methods have been published over the past decades, some of which have proved useful for multiobjective optimization. However, there is still a need for improvement to better address the chemical feasibility of generated compounds and to increase the explored chemical space while tackling the MPO challenge. Recently, promising results have been reported for deep learning generative models applied to de novo molecular design, but until now, to our knowledge, no report has been made of the value of this new technology for addressing MPO in an actual drug discovery project. Our objective in this study was to evaluate the potential of a ligand-based de novo design technology using deep learning generative models to accelerate the discovery of an optimized lead compound meeting all in vitro late-stage LO criteria.

Memory-Assisted Reinforcement Learning for Diverse Molecular De Novo Design

10.26434/chemrxiv.12693152 ◽

2020 ◽

Author(s):

Thomas Blaschke ◽

Ola Engkvist ◽

Jürgen Bajorath ◽

Hongming Chen

Keyword(s):

Reinforcement Learning ◽

De Novo ◽

Molecular Design ◽

Chemical Space ◽

Scoring Function ◽

De Novo Design ◽

Memory Unit ◽

Chemical Structures ◽

Low Diversity ◽

De Novo Molecular Design

In de novo molecular design, recurrent neural networks (RNN) have been shown to be effective methods for sampling and generating novel chemical structures. Using a technique called reinforcement learning (RL), an RNN can be tuned to target a particular section of chemical space with optimized desirable properties using a scoring function. However, ligands generated by current RL methods so far tend to have relatively low diversity, and sometimes even result in duplicate structures when optimizing towards particular properties. Here, we propose a new method to address the low diversity issue in RL. Memory-assisted RL is an extension of the known RL, with the introduction of a so-called memory unit.