Artificial Intelligence Guided De Novo Molecular Design Targeting COVID-19

2020 ◽  
Author(s):  
Srilok Srinivasan ◽  
Rohit Batra ◽  
Henry Chan ◽  
Ganesh Kamath ◽  
Mathew J. Cherukara ◽  
...  

An extensive search for active therapeutic agents against SARS-CoV-2 is being conducted across the globe. Computational docking simulations have traditionally been used for in silico ligand design and remain a popular method of choice for high-throughput screening of therapeutic agents in the fight against COVID-19. Despite the vast chemical space (millions to billions of biomolecules) that can potentially be explored for therapeutic agents, we remain severely limited in the search for candidate compounds owing to the high computational cost of the ensemble docking simulations employed in traditional in silico ligand design. Here, we present a de novo molecular design strategy that leverages artificial intelligence to discover new therapeutic biomolecules against SARS-CoV-2. A Monte Carlo Tree Search algorithm, combined with a multi-task neural network (MTNN) surrogate model for expensive docking simulations and recurrent neural networks (RNN) for rollouts, is used to sample the exhaustive SMILES space of candidate biomolecules. Using Vina scores as the target objective to measure binding of therapeutic molecules either to the isolated spike protein (S-protein) of SARS-CoV-2 at its host receptor region or to the S-protein:Angiotensin converting enzyme 2 (ACE2) receptor interface, we generate several (~100s) new biomolecules that outperform FDA (~1000s) and non-FDA (~million) biomolecules from existing databases. A transfer learning strategy is deployed to retrain the MTNN surrogate as new candidate molecules are identified; this iterative search-and-retrain strategy is shown to accelerate the discovery of desired candidates. We perform a detailed analysis using Lipinski's rules and also analyze the structural similarities between the top-performing candidates. We split the molecules using a molecular fragmenting algorithm and identify the common chemical fragments and patterns; such information is important for identifying the moieties responsible for improved performance. Although we focus on therapeutic biomolecules, our AI strategy is broadly applicable to the accelerated design and discovery of any chemical molecules with user-desired functionality.
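
The Lipinski analysis mentioned in the abstract above can be illustrated with a minimal sketch. It uses the commonly quoted rule-of-five thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Descriptors` container, the function names, and the tolerance of one violation are assumptions made for illustration; the abstract does not say how the authors applied the rules, and descriptor values would in practice come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float       # molecular weight in daltons
    log_p: float            # octanol-water partition coefficient
    h_bond_donors: int      # number of hydrogen-bond donors
    h_bond_acceptors: int   # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_lipinski(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: tolerate at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations
```

A candidate produced by a generative search could be passed through `passes_lipinski` as a cheap drug-likeness filter before any docking score is computed.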


2020 ◽  
Author(s):  
Francesca Grisoni ◽  
Berend Huisman ◽  
Alexander Button ◽  
Michael Moret ◽  
Kenneth Atz ◽  
...  

Automation of the molecular design-make-test-analyze cycle speeds up the identification of hit and lead compounds for drug discovery. Using deep learning for computational molecular design and a customized microfluidics platform for on-chip compound synthesis, liver X receptor (LXR) agonists were generated from scratch. The computational pipeline was tuned to explore the chemical space defined by known LXRα agonists, and to suggest structural analogs of known ligands and novel molecular cores. To further the design of lead-like molecules and ensure compatibility with automated on-chip synthesis, this chemical space was confined to the set of virtual products obtainable from 17 different one-step reactions. Overall, 25 de novo generated compounds were successfully synthesized in flow via formation of sulfonamide, amide bond, and ester bond. First-pass in vitro activity screening of the crude reaction products in hybrid Gal4 reporter gene assays revealed 17 (68%) hits, with up to 60-fold LXR activation. The batch re-synthesis, purification, and re-testing of 14 of these compounds confirmed that 12 of them were potent LXRα or LXRβ agonists. These results support the utilization of the proposed design-make-test-analyze framework as a blueprint for automated drug design with artificial intelligence and miniaturized bench-top synthesis.


2021 ◽  
Vol 7 (24) ◽  
pp. eabg3338
Author(s):  
Francesca Grisoni ◽  
Berend J. H. Huisman ◽  
Alexander L. Button ◽  
Michael Moret ◽  
Kenneth Atz ◽  
...  

Automating the molecular design-make-test-analyze cycle accelerates hit and lead finding for drug discovery. Using deep learning for molecular design and a microfluidics platform for on-chip chemical synthesis, liver X receptor (LXR) agonists were generated from scratch. The computational pipeline was tuned to explore the chemical space of known LXRα agonists and generate novel molecular candidates. To ensure compatibility with automated on-chip synthesis, the chemical space was confined to the virtual products obtainable from 17 one-step reactions. Twenty-five de novo designs were successfully synthesized in flow. In vitro screening of the crude reaction products revealed 17 (68%) hits, with up to 60-fold LXR activation. The batch resynthesis, purification, and retesting of 14 of these compounds confirmed that 12 of them were potent LXR agonists. These results support the suitability of the proposed design-make-test-analyze framework as a blueprint for automated drug design with artificial intelligence and miniaturized bench-top synthesis.


2018 ◽  
Vol 18 (20) ◽  
pp. 1804-1826 ◽  
Author(s):  
Sahil Sharma ◽  
Deepak Sharma

The intertwining of chemoinformatics with artificial intelligence (AI) has given a tremendous fillip to the field of drug discovery. With the rapid growth of chemical data from high-throughput screening and combinatorial synthesis, AI has become an indispensable tool for drug designers to mine chemical information from large compound databases and develop drugs at a much faster rate than ever before. The applications of AI have gone beyond bioactivity predictions and have shown promise in addressing diverse problems in drug discovery such as de novo molecular design, synthesis prediction and biological image analysis. In this article, we provide an overview of the algorithms under the umbrella of AI, list the tools and frameworks required for implementing these algorithms, and present a compendium of web servers, databases and open-source platforms used in drug discovery, Quantitative Structure-Activity Relationship (QSAR) modeling, data mining, solvation free energy and molecular graph mining.


2021 ◽  
Author(s):  
Kevin Greenman ◽  
William Green ◽  
Rafael Gómez-Bombarelli

Optical properties are central to molecular design for many applications, including solar cells and biomedical imaging. A variety of ab initio and statistical methods have been developed for their prediction, each with a trade-off between accuracy, generality, and cost. Existing theoretical methods such as time-dependent density functional theory (TD-DFT) are generalizable across chemical space because of their robust physics-based foundations but still exhibit random and systematic errors with respect to experiment despite their high computational cost. Statistical methods can achieve high accuracy at a lower cost, but data sparsity and unoptimized molecule and solvent representations often limit their ability to generalize. Here, we utilize directed message passing neural networks (D-MPNNs) to represent both dye molecules and solvents for predictions of molecular absorption peaks in solution. Additionally, we demonstrate a multi-fidelity approach based on an auxiliary model trained on over 28,000 TD-DFT calculations that further improves accuracy and generalizability, as shown through rigorous splitting strategies. Combining several openly-available experimental datasets, we benchmark these methods against a state-of-the-art regression tree algorithm and compare the D-MPNN solvent representation to several alternatives. Finally, we explore the interpretability of the learned representations using dimensionality reduction and evaluate the use of ensemble variance as an estimator of the epistemic uncertainty in our predictions of molecular peak absorption in solution. The prediction methods proposed herein can be integrated with active learning, generative modeling, and experimental workflows to enable the more rapid design of molecules with targeted optical properties.
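
The use of ensemble variance as an uncertainty estimate, mentioned in the abstract above, can be sketched as follows. This is a generic illustration, not the authors' implementation: the function name and the choice of the population variance are assumptions, and the input is taken to be the predictions of several independently trained models for a single molecule-solvent pair.

```python
from statistics import mean, pvariance


def ensemble_prediction(predictions: list[float]) -> tuple[float, float]:
    """Return (mean, variance) over the per-model predictions.

    The mean serves as the ensemble's point prediction; the spread
    between models is used as a rough proxy for epistemic uncertainty.
    """
    if len(predictions) < 2:
        raise ValueError("an ensemble needs at least two model predictions")
    return mean(predictions), pvariance(predictions)
```

A large variance flags inputs on which the models disagree, which is one way such predictions could be prioritized in an active-learning loop.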


2020 ◽  
Author(s):  
Thomas Blaschke ◽  
Ola Engkvist ◽  
Jürgen Bajorath ◽  
Hongming Chen

In de novo molecular design, recurrent neural networks (RNN) have been shown to be effective methods for sampling and generating novel chemical structures. Using a technique called reinforcement learning (RL), an RNN can be tuned to target a particular section of chemical space with optimized desirable properties using a scoring function. However, ligands generated by current RL methods so far tend to have relatively low diversity, and sometimes even result in duplicate structures when optimizing towards desired properties. Here, we propose a new method to address the low diversity issue in RL for molecular design. Memory-assisted RL is an extension of the known RL, with the introduction of a so-called memory unit. As proof of concept, we applied our method to generate structures with a desired AlogP value. In a second case study, we applied our method to design ligands for the dopamine type 2 receptor and the 5-hydroxytryptamine type 1A receptor. For both receptors, a machine learning model was developed to predict whether generated molecules were active or not for the receptor. In both case studies, it was found that memory-assisted RL led to the generation of more compounds predicted to be active having higher chemical diversity, thus achieving better coverage of chemical space of known ligands compared to established RL methods.
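
One way a memory unit could discourage repeated structures is sketched below. The abstract above does not describe the mechanism, so everything here is an assumption made for illustration: fingerprints are modeled as plain sets of integers, similarity is the Tanimoto (Jaccard) coefficient, and the class name, threshold, and bucket size are hypothetical. The idea shown is that once enough similar compounds have been stored, further look-alikes receive a score of zero, which removes the incentive to keep generating them.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class MemoryUnit:
    """Hypothetical memory that zeroes scores once a similarity bucket is full."""

    def __init__(self, similarity_threshold: float = 0.6, bucket_size: int = 25):
        self.similarity_threshold = similarity_threshold
        self.bucket_size = bucket_size
        # Each bucket is (reference fingerprint, number of stored members).
        self.buckets: list[tuple[frozenset, int]] = []

    def adjust(self, fingerprint: frozenset, score: float) -> float:
        """Return the score to use in the RL update for this compound."""
        for i, (reference, count) in enumerate(self.buckets):
            if tanimoto(fingerprint, reference) >= self.similarity_threshold:
                if count >= self.bucket_size:
                    return 0.0  # bucket full: penalize further look-alikes
                self.buckets[i] = (reference, count + 1)
                return score
        # No similar bucket found: start a new one with this compound.
        self.buckets.append((fingerprint, 1))
        return score
```

In an RL loop, `adjust` would wrap the scoring function so that the policy gradient sees the penalized score rather than the raw one.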


Author(s):  
Oleksii Prykhodko ◽  
Simon Viet Johansson ◽  
Panagiotis-Christos Kotsias ◽  
Josep Arús-Pous ◽  
Esben Jannik Bjerrum ◽  
...  

Deep learning methods applied to drug discovery have been used to generate novel structures. In this study, we propose a new deep learning architecture, LatentGAN, which combines an autoencoder and a generative adversarial neural network for de novo molecular design. We applied the method in two scenarios: one to generate random drug-like compounds and another to generate target-biased compounds. Our results show that the method works well in both cases: sampled compounds from the trained model can largely occupy the same chemical space as the training set and also generate a substantial fraction of novel compounds. Moreover, the drug-likeness score of compounds sampled from LatentGAN is also similar to that of the training set. Lastly, generated compounds differ from those obtained with a Recurrent Neural Network-based generative model approach, indicating that both methods can be used complementarily.


2020 ◽  
Author(s):  
Navneet Bung ◽  
Sowmya Ramaswamy Krishnan ◽  
Gopalakrishnan Bulusu ◽  
Arijit Roy

The novel SARS-CoV-2 is the cause of the global COVID-19 pandemic, which has severely affected the health and economy of several countries. Multiple studies are in progress, employing diverse approaches to design novel therapeutics against the potential target proteins in SARS-CoV-2. One of the well-studied protein targets for coronaviruses is the chymotrypsin-like (3CL) protease, responsible for post-translational modifications of viral polyproteins essential for the virus's survival and replication in the host. There are ongoing attempts to repurpose existing viral protease inhibitors against the 3CL protease of SARS-CoV-2. Recent studies have demonstrated the efficiency of artificial intelligence techniques in learning the known chemical space and generating novel small molecules. In this study, we employed deep neural network-based generative and predictive models for de novo design of new small molecules capable of inhibiting the 3CL protease. The generated small molecules were filtered and screened against the binding site of the 3CL protease structure of SARS-CoV-2. Based on the screening results and further analysis, we have identified 31 potential compounds as ideal candidates for further synthesis and testing against SARS-CoV-2. The generated small molecules were also compared with available natural products. Two of the generated small molecules showed high similarity to a plant natural product, Aurantiamide, which can be used for rapid testing during this time of crisis.

