Performance Studies on Distributed Virtual Screening

2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Jens Krüger ◽  
Richard Grunzke ◽  
Sonja Herres-Pawlis ◽  
Alexander Hoffmann ◽  
Luis de la Garza ◽  
...  

Virtual high-throughput screening (vHTS) is an invaluable method in modern drug discovery. It permits screening large datasets or databases of chemical structures for those that may bind to a drug target. Virtual screening is typically performed by docking code, which often runs sequentially. Processing of huge vHTS datasets can be parallelized by chunking the data because individual docking runs are independent of each other. The goal of this work is to find an optimal splitting that maximizes the speedup while accounting for overhead and the available cores on Distributed Computing Infrastructures (DCIs). We have conducted thorough performance studies accounting not only for the runtime of the docking itself, but also for structure preparation. Performance studies were conducted via the workflow-enabled science gateway MoSGrid (Molecular Simulation Grid). As input we used benchmark datasets for protein kinases. Our performance studies show that docking workflows can be made to scale almost linearly up to 500 concurrent processes distributed even over large DCIs, thus accelerating vHTS campaigns significantly.
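
The chunking idea described in this abstract can be illustrated with a minimal sketch. It is not the MoSGrid workflow: the function names, the per-chunk overhead parameter and the "waves over cores" cost model are all assumptions made for illustration only.

```python
import math


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous chunks of nearly equal size."""
    n_chunks = max(1, min(n_chunks, len(items)))
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def estimated_speedup(n_ligands, n_chunks, cores, t_dock=1.0, t_overhead=0.0):
    """Toy cost model (an assumption, not taken from the paper).

    Each chunk costs a fixed overhead plus t_dock per ligand, and
    chunks are processed in waves of at most `cores` at a time.
    """
    serial = n_ligands * t_dock
    largest_chunk = math.ceil(n_ligands / n_chunks)
    waves = math.ceil(n_chunks / cores)
    parallel = waves * (t_overhead + largest_chunk * t_dock)
    return serial / parallel


if __name__ == "__main__":
    ligands = [f"ligand_{i}" for i in range(10)]
    print([len(c) for c in split_into_chunks(ligands, 3)])
    print(estimated_speedup(1000, 10, 10))
    print(estimated_speedup(1000, 10, 10, t_overhead=100.0))
```

In this toy model the speedup equals the number of chunks only when overhead is zero and every chunk gets its own core. Any fixed per-chunk cost pulls it below linear, which is the trade-off the abstract refers to when it speaks of an optimal splitting.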

2019 ◽  
Author(s):  
Xinhao Li ◽  
Denis Fourches

Deep neural networks can directly learn from chemical structures without extensive, user-driven selection of descriptors in order to predict molecular properties/activities with high reliability. But these approaches typically require very large training sets to truly learn the best endpoint-specific structural features and ensure reasonable prediction accuracy. Even though large datasets are becoming the new normal in drug discovery, especially when it comes to high-throughput screening or metabolomics datasets, one should also consider smaller datasets with very challenging endpoints to model and forecast. Thus, it would be highly relevant to better utilize the tremendous compendium of unlabeled compounds from publicly-available datasets for improving the model performances for the user’s particular series of compounds. In this study, we propose the Molecular Prediction Model Fine-Tuning (MolPMoFiT) approach, an effective transfer learning method that can be applied to any QSPR/QSAR problem. A large-scale molecular structure prediction model is pre-trained using one million unlabeled molecules from ChEMBL in a self-supervised learning manner, and can then be fine-tuned on various QSPR/QSAR tasks for smaller chemical datasets with specific endpoints. Herein, the method is evaluated on three benchmark datasets (lipophilicity, HIV, and blood-brain barrier penetration). The results showed the method can achieve comparable or better prediction performances on all three datasets compared to state-of-the-art prediction techniques reported in the literature so far.


2012 ◽  
Vol 9 (77) ◽  
pp. 3196-3207 ◽  
Author(s):  
Pedro J. Ballester ◽  
Martina Mangold ◽  
Nigel I. Howard ◽  
Richard L. Marchese Robinson ◽  
Chris Abell ◽  
...  

One of the initial steps of modern drug discovery is the identification of small organic molecules able to inhibit a target macromolecule of therapeutic interest. A small proportion of these hits are further developed into lead compounds, which in turn may ultimately lead to a marketed drug. A screening protocol commonly used for this task is high-throughput screening (HTS). However, the performance of HTS against antibacterial targets has generally been unsatisfactory, with high costs and low rates of hit identification. Here, we present a novel computational methodology that is able to identify a high proportion of structurally diverse inhibitors by searching unusually large molecular databases in a time-, cost- and resource-efficient manner. This virtual screening methodology was tested prospectively on two versions of an antibacterial target (type II dehydroquinase from Mycobacterium tuberculosis and Streptomyces coelicolor), for which HTS has not provided satisfactory results and consequently practically all known inhibitors are derivatives of the same core scaffold. Overall, our protocols identified 100 new inhibitors, with calculated Ki ranging from 4 to 250 μM (confirmed hit rates are 60% and 62% against each version of the target). Most importantly, over 50 new active molecular scaffolds were discovered that underscore the benefits that a wide application of prospectively validated in silico screening tools is likely to bring to antibacterial hit identification.


2019 ◽  
Vol 16 (4) ◽  
pp. 417-426
Author(s):  
Vimee Raturi ◽  
Kumar Abhishek ◽  
Subhashis Jana ◽  
Subhendu Sekhar Bag ◽  
Vishal Trivedi

Background: The malaria parasite relies heavily on signal transduction pathways to control growth, the progression of its life cycle and stress tolerance for its survival. Unlike its kinases, Plasmodium's phosphatome is one of the smallest and least explored for identifying drug targets for clinical intervention. PF14_0660 is a putative protein encoded on chromosome 14 of the Plasmodium falciparum genome. Methods: Multiple sequence alignment of PF14_0660 with other known protein phosphatases indicates the presence of a phosphatase motif with specific residues essential for metal binding, catalysis and structural stability. PF14_0660 is a mixed α/β type of protein with several β-sheets and α-helices arranged to form a βαβαβα sub-structure. The surface properties of PF14_0660 are conserved with another phosphatase of this family, but diverge profoundly from the host protein tyrosine phosphatase. PF14_0660 was cloned and over-expressed, and the protein exhibits phosphatase activity in a dose-dependent manner. Docking of heterocyclic compounds from chemical libraries into the PF14_0660 active site identified several well-fitting candidate molecules. Results: Compounds PPinh 6, PPinh 7 and PPinh 5 exhibit antimalarial activity with IC50 values of 1.4 ± 0.2 µM, 3.8 ± 0.3 µM and 9.4 ± 0.6 µM, respectively. Compounds PPinh 6 and PPinh 7 inhibit intracellular PF14_0660 phosphatase activity and kill the parasite through the generation of reactive oxygen species. Conclusion: Hence, a combination of molecular modelling, virtual screening and biochemical study allowed us to explore the potential of PF14_0660 as a drug target for the design of anti-malarials.


2012 ◽  
Vol 17 (4) ◽  
pp. 535-541 ◽  
Author(s):  
Gregory J. Crowther ◽  
S. Arshiya Quadri ◽  
Benjamin J. Shannon-Alferes ◽  
Wesley C. Van Voorhis ◽  
Henry Rosen

More than 20% of bacterial proteins are noncytoplasmic, and most of these pass through the SecYEG channel en route to the periplasm, cell membrane, or surrounding environment. The Sec pathway, encompassing SecYEG and several associated proteins (SecA, SecB, YidC, SecDFYajC), is of interest as a potential drug target because it is distinct from targets of current drugs, is essential for bacterial growth, and exhibits dissimilarities between eukaryotes and bacteria that increase the likelihood of selectively inhibiting the microbial pathway. As a step toward validating the pathway as a drug target, we have adapted a mechanism-based whole-cell assay in a manner suitable for high-throughput screening (HTS). The assay uses an engineered strain of Escherichia coli that accumulates beta-galactosidase (β-gal) in its cytoplasm if translocation through SecYEG is blocked. The assay should facilitate rapid identification of compounds that specifically block the Sec pathway because widely toxic compounds and nonspecific protein synthesis inhibitors prevent β-gal production and thus do not register as hits. Testing of current antibiotics confirmed that they do not generally act through the Sec pathway. A mini-screen of 800 compounds indicated the assay’s readiness for larger screening projects.


2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Mamta Singh ◽  
Prabhakar Tiwari ◽  
Garima Arora ◽  
Sakshi Agarwal ◽  
Saqib Kidwai ◽  
...  

Inorganic polyphosphate (PolyP) plays an essential role in microbial stress adaptation, virulence and drug tolerance. The genome of Mycobacterium tuberculosis encodes two polyphosphate kinases (PPK-1, Rv2984 and PPK-2, Rv3232c) and two polyphosphatases (ppx-1, Rv0496 and ppx-2, Rv1026) for maintenance of intracellular PolyP levels. Microbial polyphosphate kinases constitute a molecular mechanism whereby microorganisms utilize PolyP as a phosphate donor for synthesis of ATP. In the present study we have constructed a ppk-2 mutant strain of M. tuberculosis and demonstrate that the PPK-2 enzyme contributes to its ability to cause disease in guinea pigs. We observed that guinea pigs infected with the ppk-2 mutant strain had significantly reduced bacterial loads and tissue pathology in comparison to guinea pigs infected with the wild type at later stages of infection. We also report that, in comparison to the wild type strain, the ppk-2 mutant strain was more tolerant to isoniazid and impaired for survival in THP-1 macrophages. In the present study we have standardized a luciferase-based assay system to identify chemical scaffolds that are non-cytotoxic and inhibit the M. tuberculosis PPK-2 enzyme. To the best of our knowledge this is the first study demonstrating the feasibility of high-throughput screening to obtain small molecule PPK-2 inhibitors.


2014 ◽  
Vol 70 (a1) ◽  
pp. C708-C708
Author(s):  
Cho Yeow Koh ◽  
Jasmine Nguyen ◽  
Sayaka Shibata ◽  
Zhongsheng Zhang ◽  
Ranae Ranade ◽  
...  

Infection by the protozoan parasite Trypanosoma brucei causes human African trypanosomiasis, commonly known as sleeping sickness. The disease is fatal without treatment; yet, current therapeutic options for the disease are inadequate due to toxicity, difficulty in administration and emerging resistance. Therefore, methionyl-tRNA synthetase of T. brucei (TbMetRS) is targeted for the development of new antitrypanosomal drugs. We have recently completed a high-throughput screening campaign against TbMetRS using a library of 364,131 compounds at The Scripps Research Institute Molecular Screening Center. Here we outline our strategy to integrate the power of crystal structures with high-throughput screening in a drug discovery project. We applied the rapid crystal soaking procedure to obtain structures of TbMetRS in complex with inhibitors reported earlier[1] to approximately 70 high-throughput screening hits. This resulted in more than 20 crystal structures of TbMetRS·hit complexes. These hits cover a large diversity of chemical structures with IC50 values between 200 nM and 10 µM. Based on the solved structures and existing knowledge drawn from other in-house inhibitors, the IC50 value of the most promising hit has been improved. Further development of the compounds into potent TbMetRS inhibitors with desirable pharmacokinetic properties is ongoing and will continue to benefit from information derived from crystal structures.


2021 ◽  
Author(s):  
Jeremy Feinstein ◽  
Ganesh Sivaraman ◽  
Kurt Picel ◽  
Brian Peters ◽  
Alvaro Vazquez-Mayagoitia ◽  
...  

In this article, we present our recent study on a computational methodology for predicting the toxicity of PFAS, known as “forever chemicals”, from their chemical structures through the evaluation of multiple machine learning methods. To address the scarcity of PFAS toxicity data, we investigated a deep “transfer learning” method that leverages toxicity information over the entire organic chemical domain. We also developed an uncertainty-informed workflow incorporating the SelectiveNet architecture, which can support future guidance of high-throughput screening with knowledge of chemical structures.


2018 ◽  
Author(s):  
Shengchao Liu ◽  
Moayad Alnammi ◽  
Spencer S. Ericksen ◽  
Andrew F. Voter ◽  
Gene E. Ananiev ◽  
...  

Virtual (computational) high-throughput screening provides a strategy for prioritizing compounds for experimental screens, but the choice of virtual screening algorithm depends on the dataset and evaluation strategy. We consider a wide range of ligand-based machine learning and docking-based approaches for virtual screening on two protein-protein interactions, PriA-SSB and RMI-FANCM, and present a strategy for choosing which algorithm is best for prospective compound prioritization. Our workflow identifies a random forest as the best algorithm for these targets over more sophisticated neural network-based models. The top 250 predictions from our selected random forest recover 37 of the 54 active compounds from a library of 22,434 new molecules assayed on PriA-SSB. We show that virtual screening methods that perform well in public datasets and synthetic benchmarks, like multi-task neural networks, may not always translate to prospective screening performance on a specific assay of interest.
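
To put the reported counts in context, the short sketch below computes two standard virtual-screening summaries from them: the recall of actives and the enrichment factor of the top-ranked selection. The abstract reports only the raw counts. The choice of these two metrics and the function names are assumptions made here, not results claimed by the authors.

```python
def recall(hits_in_top, total_actives):
    """Fraction of all actives that appear in the top-ranked selection."""
    return hits_in_top / total_actives


def enrichment_factor(hits_in_top, top_n, total_actives, library_size):
    """Hit rate in the top-ranked selection divided by the library-wide hit rate."""
    hit_rate_top = hits_in_top / top_n
    hit_rate_library = total_actives / library_size
    return hit_rate_top / hit_rate_library


if __name__ == "__main__":
    # Counts quoted in the abstract for PriA-SSB.
    hits, top_n, actives, library = 37, 250, 54, 22434
    print(f"recall of actives: {recall(hits, actives):.3f}")
    print(f"enrichment factor: {enrichment_factor(hits, top_n, actives, library):.1f}")
```

With these counts, about 69% of the actives fall inside roughly 1% of the library, which corresponds to an enrichment of about 61-fold over random selection.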


2020 ◽  
Author(s):  
Xinhao Li ◽  
Denis Fourches

Deep neural networks can directly learn from chemical structures without extensive, user-driven selection of descriptors in order to predict molecular properties/activities with high reliability. But these approaches typically require large training sets to learn the endpoint-specific structural features and ensure reasonable prediction accuracy. Even though large datasets are becoming the new normal in drug discovery, especially when it comes to high-throughput screening or metabolomics datasets, one should also consider smaller datasets with challenging endpoints to model and forecast. Thus, it would be highly relevant to better utilize the tremendous compendium of unlabeled compounds from publicly-available datasets for improving the model performances for the user’s particular series of compounds. In this study, we propose the Molecular Prediction Model Fine-Tuning (MolPMoFiT) approach, an effective transfer learning method based on self-supervised pre-training and task-specific fine-tuning for QSPR/QSAR modeling. A large-scale molecular structure prediction model is pre-trained using one million unlabeled molecules from ChEMBL in a self-supervised learning manner, and can then be fine-tuned on various QSPR/QSAR tasks for smaller chemical datasets with specific endpoints. Herein, the method is evaluated on four benchmark datasets (lipophilicity, FreeSolv, HIV, and blood-brain barrier penetration). The results showed the method can achieve strong performances for all four datasets compared to other state-of-the-art machine learning modeling techniques reported in the literature so far.

