Navigating Chemical Space by Interfacing Generative Artificial Intelligence and Molecular Docking

2020 ◽  
Author(s):  
Ziqiao Xu ◽  
Orrette R. Wauchope ◽  
Aaron T. Frank

Here we report the testing and application of a simple, structure-aware framework to design target-specific screening libraries for drug development. Our approach combines advances in generative artificial intelligence (AI) with conventional molecular docking to rapidly explore chemical space conditioned on the unique physicochemical properties of the active site of a biomolecular target. As a proof of concept, we used our framework to construct a focused library for cyclin-dependent kinase type-2 (CDK2). We then used it to rapidly generate a library specific to the active site of the main protease (Mpro) of the SARS-CoV-2 virus, which causes COVID-19. By comparing approved and experimental drugs to compounds in our library, we also identified six drugs, namely Naratriptan, Etryptamine, Panobinostat, Procainamide, Sertraline, and Lidamidine, as possible SARS-CoV-2 Mpro-targeting compounds and, as such, potential drug repurposing candidates. To complement the open-science COVID-19 drug discovery initiatives, we make our SARS-CoV-2 Mpro library fully accessible to the research community (https://github.com/atfrank/SARS-CoV-2).
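The abstract does not say how drugs were compared with library compounds. One common choice, assumed here purely for illustration, is Tanimoto similarity over binary fingerprints. The sketch below treats each fingerprint as a set of "on" bit indices; the compound names and bit sets are hypothetical toy values, not data from the study.

```python
from typing import Dict, FrozenSet, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_library_match(
    drug_bits: FrozenSet[int], library: Dict[str, FrozenSet[int]]
) -> Tuple[str, float]:
    """Return (name, similarity) of the library compound most similar to the drug."""
    scored = ((name, tanimoto(drug_bits, bits)) for name, bits in library.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy fingerprints (assumption: not taken from the published library).
library = {
    "lib_001": frozenset({1, 4, 7, 9}),
    "lib_002": frozenset({2, 3, 5}),
}
drug = frozenset({1, 4, 7, 8})

best_name, best_sim = nearest_library_match(drug, library)
```

In practice the bit sets would come from a cheminformatics toolkit, and a similarity cutoff would decide which drugs count as close to the generated library.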


Author(s):  
Vishal Babu Siramshetty ◽  
Dac-Trung Nguyen ◽  
Natalia J. Martinez ◽  
Anton Simeonov ◽  
Noel T. Southall ◽  
...  

The rise of novel artificial intelligence methods necessitates a comparison of this wave of new approaches with classical machine learning for a typical drug discovery project. Inhibition of the potassium ion channel whose alpha subunit is encoded by the human Ether-à-go-go-Related Gene (hERG) leads to a prolonged QT interval of the cardiac action potential, which makes the channel a significant safety pharmacology target in the development of new medicines. Several computational approaches have been employed to develop prediction models for assessing the hERG liabilities of small molecules, including recent work using deep learning methods. Here we perform a comprehensive comparison of prediction models based on classical (random forests and gradient boosting) and modern (deep neural networks and recurrent neural networks) artificial intelligence methods. The training set (~9000 compounds) was compiled by integrating hERG bioactivity data from the ChEMBL database with experimental data generated from an in-house, high-throughput thallium flux assay. We utilized different molecular descriptors, including latent descriptors, which are real-valued continuous vectors derived from chemical autoencoders trained on a large chemical space (> 1.5 million compounds). The models were prospectively validated on ~840 in-house compounds screened in the same thallium flux assay. The deep neural networks performed significantly better than the classical methods with the latent descriptors. The recurrent neural networks that operate on SMILES provided the highest model sensitivity. The best models were merged into a consensus model that offered superior performance compared to reference models from academic and commercial domains. Further, we shed light on the potential of artificial intelligence methods to exploit chemistry big data and generate novel chemical representations useful in predictive modeling and tailoring new chemical space.


2021 ◽  
Vol 23 (1) ◽  
pp. 393
Author(s):  
Sebastjan Kralj ◽  
Marko Jukič ◽  
Urban Bren

Since December 2019, the new SARS-CoV-2-related COVID-19 disease has caused a global pandemic and shut down public life worldwide. Several proteins have emerged as potential therapeutic targets for drug development, and we set out to review the commercially available and marketed SARS-CoV-2-targeted libraries ready for high-throughput virtual screening (HTVS). We evaluated the SARS-CoV-2-targeted, protease-inhibitor-focused and protein–protein-interaction-inhibitor-focused libraries to gain a better understanding of how these libraries were designed. The most common were ligand- and structure-based approaches, along with various filtering steps using molecular descriptors. Often, these methods were combined to obtain the final library. We recognized the abundance of targeted libraries on offer, complemented by the inclusion of analytical data; however, serious concerns had to be raised. Namely, vendors do not supply information on the library design or references to the primary literature. Few references to active compounds were provided when ligand-based design was used, and usually only protein classes or a general panel of targets were listed, along with a general reference to the methods, such as molecular docking for the structure-based design. No receptor data, docking protocols or even references to the applied molecular docking software (or other HTVS software), and no pharmacophore or filter design details were given. No detailed functional group or chemical space analyses were reported, and no specific orientation of the libraries toward the design of covalent or noncovalent inhibitors could be observed. All libraries contained pan-assay interference compounds (PAINS), rapid elimination of swill (REOS) compounds and aggregators, and all focused on the drug-like model, with the majority of compounds having a molecular mass of around 500 g/mol. These facts do not bode well for the use of the reviewed libraries in drug design and give commercial drug companies clear points to focus on and improve.


2020 ◽  
Author(s):  
Navneet Bung ◽  
Sowmya Ramaswamy Krishnan ◽  
Gopalakrishnan Bulusu ◽  
Arijit Roy

The novel SARS-CoV-2 is the source of the global COVID-19 pandemic, which has severely affected the health and economy of several countries. Multiple studies are in progress, employing diverse approaches to design novel therapeutics against the potential target proteins in SARS-CoV-2. One of the well-studied protein targets for coronaviruses is the chymotrypsin-like (3CL) protease, responsible for post-translational modifications of viral polyproteins essential for viral survival and replication in the host. There are ongoing attempts to repurpose existing viral protease inhibitors against the 3CL protease of SARS-CoV-2. Recent studies have proven the efficiency of artificial intelligence techniques in learning the known chemical space and generating novel small molecules. In this study, we employed deep neural network-based generative and predictive models for de novo design of new small molecules capable of inhibiting the 3CL protease. The generated small molecules were filtered and screened against the binding site of the 3CL protease structure of SARS-CoV-2. Based on the screening results and further analysis, we have identified 31 potential compounds as ideal candidates for further synthesis and testing against SARS-CoV-2. The generated small molecules were also compared with available natural products. Two of the generated small molecules showed high similarity to a plant natural product, Aurantiamide, which can be used for rapid testing during this time of crisis.


2020 ◽  
Author(s):  
Francesca Grisoni ◽  
Berend Huisman ◽  
Alexander Button ◽  
Michael Moret ◽  
Kenneth Atz ◽  
...  

Automation of the molecular design-make-test-analyze cycle speeds up the identification of hit and lead compounds for drug discovery. Using deep learning for computational molecular design and a customized microfluidics platform for on-chip compound synthesis, liver X receptor (LXR) agonists were generated from scratch. The computational pipeline was tuned to explore the chemical space defined by known LXRα agonists, and to suggest structural analogs of known ligands and novel molecular cores. To further the design of lead-like molecules and ensure compatibility with automated on-chip synthesis, this chemical space was confined to the set of virtual products obtainable from 17 different one-step reactions. Overall, 25 de novo generated compounds were successfully synthesized in flow via formation of sulfonamide, amide bond, and ester bond. First-pass in vitro activity screening of the crude reaction products in hybrid Gal4 reporter gene assays revealed 17 (68%) hits, with up to 60-fold LXR activation. The batch re-synthesis, purification, and re-testing of 14 of these compounds confirmed that 12 of them were potent LXRα or LXRβ agonists. These results support the utilization of the proposed design-make-test-analyze framework as a blueprint for automated drug design with artificial intelligence and miniaturized bench-top synthesis.


2020 ◽  
Author(s):  
Oky Hermansyah ◽  
Alhadi Bustamam ◽  
Arry Yanuar

Abstract

Background: Dipeptidyl Peptidase-4 (DPP-4) inhibitors are becoming essential drugs in the treatment of type 2 diabetes mellitus, but some classes of these drugs have side effects such as joint pain, which can become severe, and pancreatitis. These side effects are thought to be related to inhibition of the enzymes DPP-8 and DPP-9.

Objective: This study aims to find DPP-4 inhibitor hit compounds that are selective over the DPP-8 and DPP-9 enzymes. By building a virtual screening workflow using the Quantitative Structure-Activity Relationship (QSAR) method based on artificial intelligence (AI), millions of molecules from the database can be screened against the DPP-4 enzyme target in less time than with other screening methods.

Result: Five regression machine learning algorithms and four classification machine learning algorithms were used to build virtual screening workflows. The algorithm that qualified for the regression QSAR model was Support Vector Regression, with a predictive R² (R²pred) of 0.78, while the one that qualified for the classification QSAR model was Random Forest, with 92.21% accuracy. Virtual screening of more than 10 million molecules from the database yielded 2,716 hit compounds with pIC50 above 7.5. Molecular docking of several potential hit compounds against the DPP-4, DPP-8 and DPP-9 enzymes identified hit compound CH0002, which has a high inhibitory potential against the DPP-4 enzyme and low inhibition of the DPP-8 and DPP-9 enzymes.

Conclusion: This research produced DPP-4 inhibitor hit compounds that are potent against DPP-4 and selective over the DPP-8 and DPP-9 enzymes, so that they can be further developed in DPP-4 inhibitor discovery. The resulting virtual screening workflow can be applied to the discovery of hit compounds for other targets.

Keywords: Artificial Intelligence; DPP-4; KNIME; Machine Learning; QSAR; Virtual Screening
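To make the pIC50 cutoff of 7.5 concrete, the short sketch below converts between pIC50 and IC50 using the standard definition pIC50 = -log10(IC50 in mol/L). The function names are illustrative and not part of the published workflow.

```python
import math


def pic50_from_ic50_nM(ic50_nM: float) -> float:
    """pIC50 = -log10(IC50 in mol/L); the input is given in nanomolar."""
    return -math.log10(ic50_nM * 1e-9)


def ic50_nM_from_pic50(pic50: float) -> float:
    """Inverse conversion: IC50 in nanomolar for a given pIC50."""
    return 10 ** (-pic50) * 1e9


# A pIC50 of 7.5 corresponds to an IC50 of roughly 32 nM,
# so "pIC50 above 7.5" means predicted IC50 values below that concentration.
threshold_nM = ic50_nM_from_pic50(7.5)
```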


2020 ◽  
Author(s):  
Srilok Srinivasan ◽  
Rohit Batra ◽  
Henry Chan ◽  
Ganesh Kamath ◽  
Mathew J. Cherukara ◽  
...  

An extensive search for active therapeutic agents against SARS-CoV-2 is being conducted across the globe. Computational docking simulations have traditionally been used for in silico ligand design and remain a popular method of choice for high-throughput screening of therapeutic agents in the fight against COVID-19. Despite the vast chemical space (millions to billions of biomolecules) that can potentially be explored as therapeutic agents, we remain severely limited in the search for candidate compounds owing to the high computational cost of the ensemble docking simulations employed in traditional in silico ligand design. Here, we present a de novo molecular design strategy that leverages artificial intelligence to discover new therapeutic biomolecules against SARS-CoV-2. A Monte Carlo Tree Search algorithm, combined with a multi-task neural network (MTNN) surrogate model for expensive docking simulations and recurrent neural networks (RNN) for rollouts, is used to sample the exhaustive SMILES space of candidate biomolecules. Using Vina scores as the target objective to measure binding of therapeutic molecules to either the isolated spike protein (S-protein) of SARS-CoV-2 at its host receptor region or the S-protein:angiotensin-converting enzyme 2 (ACE2) receptor interface, we generate several (~100's) new biomolecules that outperform FDA (~1000's) and non-FDA (~million) biomolecules from existing databases. A transfer learning strategy is deployed to retrain the MTNN surrogate as new candidate molecules are identified; this iterative search-and-retrain strategy is shown to accelerate the discovery of desired candidates. We perform a detailed analysis using Lipinski's rules and also analyze the structural similarities between the various top-performing candidates. We split the molecules using a molecular fragmenting algorithm and identify the common chemical fragments and patterns; such information is important for identifying moieties that are responsible for improved performance. Although we focus on therapeutic biomolecules, our AI strategy is broadly applicable to the accelerated design and discovery of any chemical molecules with user-desired functionality.
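The Lipinski analysis mentioned above can be sketched as a simple rule-of-five check. This is a minimal illustration that assumes the four descriptors have already been computed by a cheminformatics toolkit; the `Descriptors` class, the function names, and the example values are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (assumed inputs, not computed here)."""
    mol_weight: float       # molecular mass in daltons
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of Lipinski's four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Common reading of the rule: at most one violation is tolerated."""
    return lipinski_violations(d) <= max_violations


# Hypothetical example values for two candidate molecules.
small_candidate = Descriptors(mol_weight=350.4, logp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large_candidate = Descriptors(mol_weight=812.0, logp=6.3, h_bond_donors=7, h_bond_acceptors=12)
```

Setting `max_violations=0` gives the stricter reading in which every criterion must hold.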

