Incorporating structural similarity into a scoring function to enhance the prediction of binding affinities

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Beihong Ji ◽  
Xibing He ◽  
Yuzhao Zhang ◽  
Jingchen Zhai ◽  
Viet Hoang Man ◽  
...  

Abstract: In this study, we developed a novel algorithm to improve the screening performance of an arbitrary docking scoring function by recalibrating the docking score of a query compound based on its structural similarity to a set of training compounds, at negligible extra computational cost. Two popular docking methods, Glide and AutoDock Vina, were adopted as the original scoring functions to be processed with our new algorithm, and similar improvements in performance were achieved. Predicted binding affinities were compared against experimental data from the ChEMBL and DUD-E databases. Eleven representative drug receptors from diverse drug target categories were used to evaluate the hybrid scoring function. The effects of four different fingerprints (FP2, FP3, FP4, and MACCS) and four different compound similarity effect (CSE) functions were explored. Encouragingly, the screening performance was significantly improved for all 11 drug targets, especially when CSE = S^4 (where S is the Tanimoto structural similarity) and the FP2 fingerprint were applied. The average predictive index (PI) values increased from 0.34 to 0.66 and from 0.39 to 0.71 for the Glide and AutoDock Vina scoring functions, respectively. To evaluate the performance of the calibration algorithm in drug lead identification, we also imposed an upper limit on the structural similarity to mimic the real scenario of screening diverse libraries, in which query ligands are general-purpose screening compounds that are not necessarily structurally similar to reference ligands. Encouragingly, we found that our hybrid scoring function still outperformed the original docking scoring function. The hybrid scoring function was further evaluated using external datasets for two systems, and we found that the PI values increased from 0.24 to 0.46 and from 0.14 to 0.42 for the A2AR and CFX systems, respectively.
In conclusion, our calibration algorithm can significantly improve virtual screening performance in both the drug lead optimization and identification phases at negligible computational cost.
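The idea of recalibrating a docking score by structural similarity can be illustrated with a minimal Python sketch. It is not the authors' published formula, which the abstract does not give. It assumes fingerprints are sets of on-bit indices, reads the best-performing CSE function as the fourth power of the Tanimoto similarity S, and applies a hypothetical nearest-neighbour correction. The names `tanimoto` and `recalibrate` are invented for this illustration.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recalibrate(dock_score: float, query_fp: set, training: list, power: int = 4) -> float:
    """Shift a docking score toward experiment using the most similar training compound.

    training: list of (fingerprint, docking_score, experimental_score) tuples.
    ASSUMPTION: the correction is S**power times the residual (experimental minus
    docking score) of the nearest training compound. This is illustrative only.
    """
    if not training:
        return dock_score
    # Pick the training compound with the highest Tanimoto similarity to the query.
    fp, train_dock, train_exp = max(training, key=lambda t: tanimoto(query_fp, t[0]))
    weight = tanimoto(query_fp, fp) ** power  # CSE weight; fades quickly as S drops
    return dock_score + weight * (train_exp - train_dock)
```

With `power=4`, a query that is identical to a training compound inherits that compound's full residual, while a query with S = 0.5 receives only 1/16 of it, so dissimilar compounds effectively keep their original docking score.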

2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Jamal Shamsara

Rescoring is a simple approach that could, in theory, improve the original docking results. In this study, AutoDock Vina was used as the docking engine, and three other scoring functions besides the original Vina scoring function, as well as their combinations as consensus scoring functions, were employed to explore the effect of rescoring on virtual screenings performed on diverse targets. Rescoring by DrugScore produced the largest number of cases with significant changes in screening power. Thus, the DrugScore results were used to build a simple model, based on two binding site descriptors, that could predict possible improvement by DrugScore rescoring. Furthermore, the screening power of all rescoring approaches, as well as of the original AutoDock Vina docking results, generally correlated with the Maximum Theoretical Shape Complementarity (MTSC) and the Maximum Distance from Center of Mass and all Alpha spheres (MDCMA). Therefore, it was suggested that, with a more complete set of binding site descriptors, it could be possible to find a robust relationship between binding site descriptors and the response to certain molecular docking programs and scoring functions. The results could be helpful for future research aiming to perform virtual screening using AutoDock Vina and/or rescoring using DrugScore.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Andrew T. McNutt ◽  
Paul Francoeur ◽  
Rishal Aggarwal ◽  
Tomohide Masuda ◽  
Rocco Meli ◽  
...  

Abstract: Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.0 release of the Gnina docking software, which utilizes an ensemble of convolutional neural networks (CNNs) as a scoring function. We also explore an array of parameter values for Gnina 1.0 to optimize docking performance and computational cost. Docking performance, as evaluated by the percentage of targets where the top pose is better than 2 Å root mean square deviation (Top1), is compared to AutoDock Vina scoring when utilizing explicitly defined binding pockets or whole protein docking. Gnina, utilizing a CNN scoring function to rescore the output poses, outperforms AutoDock Vina scoring on redocking and cross-docking tasks when the binding pocket is defined (Top1 increases from 58% to 73% and from 27% to 37%, respectively) and when the whole protein defines the binding pocket (Top1 increases from 31% to 38% and from 12% to 16%, respectively). The derived ensemble of CNNs generalizes to unseen proteins and ligands and produces scores that correlate well with the root mean square deviation to the known binding pose. We provide the 1.0 version of Gnina under an open source license for use as a molecular docking tool at https://github.com/gnina/gnina.
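The Top1 metric described above can be computed in a few lines. The following Python sketch assumes the RMSD of each target's top-ranked pose has already been calculated. The function name and the input layout (a dictionary keyed by target identifier) are hypothetical and not taken from the Gnina code base.

```python
def top1_success_rate(top_pose_rmsd_by_target: dict, cutoff: float = 2.0) -> float:
    """Percentage of targets whose top-ranked pose lies below `cutoff` Å RMSD."""
    if not top_pose_rmsd_by_target:
        raise ValueError("no targets supplied")
    # A target counts as a success only if its best-scored pose is strictly under the cutoff.
    hits = sum(1 for rmsd in top_pose_rmsd_by_target.values() if rmsd < cutoff)
    return 100.0 * hits / len(top_pose_rmsd_by_target)
```

Because only the top-ranked pose per target is inspected, the metric rewards a scoring function for ranking a near-native pose first, not merely for sampling one.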


2020 ◽  
Author(s):  
Rafael Blasco ◽  
Julio Coll

The non-structural protein 7 (nsp7) of Severe Acute Respiratory Syndrome (SARS) coronaviruses was selected as a new target to potentially interfere with viral replication. The nsp7s are among the most conserved, unique and small coronavirus proteins, having a critical yet intriguing participation in the replication of the long viral RNA genome after complexing with nsp8 and nsp12. Despite the difficulty of having no previously known binding pocket, two high-throughput virtual blind screenings of 158240 natural compounds > 400 Da by AutoDock Vina against nsp7.1ysy identified 655 leads displaying predicted binding affinities between 10 and 1100 nM. The leads were then screened against 14 available conformations of nsp7 by both the AutoDock Vina and seeSAR programs, which employ different binding score algorithms, to identify 20 consensus top-leads. Further in silico predictive analysis of physiological and toxicity ADMET criteria (chemical properties, absorption, metabolism, toxicity) narrowed the top-leads to a few drug-like ligands, many of them showing steroid-like structures. A final optimization, by searching for commercially available compounds structurally similar to the top drug-like ligand, yielded a collection of predicted novel ligands with ~100-fold higher affinity whose antiviral activity may be experimentally validated. Additionally, these novel nsp7-interacting ligands and/or their further optimized derivatives may offer new tools to investigate the intriguing role of nsp7 in the replication of coronaviruses.


2020 ◽  
Author(s):  
Rishal Aggarwal ◽  
David R. Koes

Docking algorithms are an essential part of the Structure Based Drug Design (SBDD) process as they aim to effectively identify the binding poses of chemical structures at the target site. These algorithms rely on scoring functions that evaluate the binding ability of a ligand conformation. Typically, scoring functions are designed to predict the binding affinity of various poses at the target site. In this work, we design a novel approach where the scoring function attempts to predict the Root Mean Square Deviation (RMSD) of a pose to the true binding pose. We show that a Convolutional Neural Network (CNN) can be trained to learn these RMSD values with high correlation between predicted and experimental values. Furthermore, we show that this scoring function can improve pose selection performance when used in combination with orthogonal scoring functions like AutoDock Vina.
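The quantity such a network would be trained to predict, the RMSD of a pose to the true binding pose, can be sketched as follows. This is a simplified illustration, not the authors' code. It assumes both poses list the same heavy atoms in the same order and share the receptor's coordinate frame, so no superposition is applied, and it ignores ligand symmetry. The function name is hypothetical.

```python
import math


def pose_rmsd(pose: list, reference: list) -> float:
    """In-place RMSD between two poses given as lists of (x, y, z) tuples.

    ASSUMPTION: identical atom ordering and a shared coordinate frame;
    symmetry-equivalent atom swaps are not corrected for.
    """
    if not pose or len(pose) != len(reference):
        raise ValueError("poses must be non-empty and have the same number of atoms")
    # Sum of squared coordinate differences over every atom and every axis.
    squared = sum(
        (p - r) ** 2
        for atom, ref_atom in zip(pose, reference)
        for p, r in zip(atom, ref_atom)
    )
    return math.sqrt(squared / len(pose))
```

For example, a two-atom pose displaced by exactly 1 Å along z from its reference has an RMSD of 1.0 Å, which would count as a correct pose under the commonly used 2 Å threshold.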


2021 ◽  
Author(s):  
Andrew McNutt ◽  
Paul Francoeur ◽  
Rishal Aggarwal ◽  
Tomohide Masuda ◽  
Rocco Meli ◽  
...  

Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.0 release of the Gnina docking software, which utilizes an ensemble of convolutional neural networks (CNNs) as a scoring function. We also explore an array of parameter values for Gnina 1.0 to optimize docking performance and computational cost. Docking performance, as evaluated by the percentage of targets where the top pose is better than 2 Å root mean square deviation (Top1), is compared to AutoDock Vina scoring when utilizing explicitly defined binding pockets or whole protein docking. Gnina, utilizing a CNN scoring function to rescore the output poses, outperforms AutoDock Vina scoring on redocking and cross-docking tasks when the binding pocket is defined (Top1 increases from 58% to 73% and from 27% to 37%, respectively) and when the whole protein defines the binding pocket (Top1 increases from 31% to 38% and from 12% to 16%, respectively). The derived ensemble of CNNs generalizes to unseen proteins and ligands and produces scores that correlate well with the root mean square deviation to the known binding pose. We provide the 1.0 version of Gnina under an open source license for use as a molecular docking tool at https://github.com/gnina/gnina.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7362 ◽  
Author(s):  
Haiping Zhang ◽  
Linbu Liao ◽  
Konda Mani Saravanan ◽  
Peng Yin ◽  
Yanjie Wei

Proteins interact with small molecules to modulate several important cellular functions. Many acute diseases have been cured by small molecules binding in the active site of a protein, either by inhibition or activation. Currently, there are several docking programs to estimate the binding position and binding orientation of a protein–ligand complex. Many scoring functions have been developed to estimate the binding strength and predict effective protein–ligand binding. However, the accuracy of current scoring functions is limited in several respects: the solvent effect, entropy effect, and multibody effect are largely ignored in traditional machine learning methods. In this paper, we propose a new deep neural network-based model named DeepBindRG to predict the binding affinity of a protein–ligand complex, which learns all of these effects, the binding mode, and specificity implicitly by learning protein–ligand interface contact information from a large protein–ligand dataset. During the initial data processing step, the critical interface information was preserved to make sure the input is suitable for the proposed deep learning model. When validated on three independent datasets, DeepBindRG achieves a root mean squared error (RMSE) of about 1.6–1.8 in pKa (−logKd or −logKi) and an R value of around 0.5–0.6, which is better than AutoDock Vina, whose RMSE is about 2.2–2.4 and whose R value is 0.42–0.57. We also explored the detailed reasons for the performance of DeepBindRG, especially for several cases where Vina failed. Furthermore, DeepBindRG performed better on four challenging datasets from the DUD-E database with no experimental protein–ligand complexes. The better performance of DeepBindRG compared with AutoDock Vina in predicting protein–ligand binding affinity indicates that the deep learning approach can greatly help the drug discovery process.
We also compare the performance of DeepBindRG with a 4D-based deep learning method, “pafnucy”; the advantages and limitations of both methods provide clues for improving deep learning-based protein–ligand prediction models in the future.
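The two figures of merit quoted for the affinity predictions, RMSE and R, can be reproduced with a short Python sketch. It assumes that R denotes the Pearson correlation coefficient, which the abstract does not state explicitly, and that predicted and experimental affinities are supplied as equal-length lists on the same −log scale. The function names are hypothetical.

```python
import math


def rmse(predicted: list, observed: list) -> float:
    """Root mean squared error between predicted and observed affinities."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted))


def pearson_r(x: list, y: list) -> float:
    """Pearson correlation coefficient (ASSUMED to be the abstract's 'R value')."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)
```

The two numbers answer different questions: RMSE measures the absolute error in log units, while R only measures how well the predictions rank and scale with experiment, so a model can improve on one without improving on the other.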


2020 ◽  
Author(s):  
Natasha Kamerlin ◽  
Mickaël G. Delcey ◽  
Sergio Manzetti ◽  
David van der Spoel

Thousands of anthropogenic chemicals are released into the environment each year, posing potential hazards to human and environmental health. Toxic chemicals may cause a variety of adverse health effects, triggering immediate symptoms or delayed effects over longer periods of time. It is thus crucial to develop methods that can rapidly screen and predict the toxicity of chemicals, to limit the potential harmful impacts of chemical pollutants. Computational methods are increasingly being used in toxicity predictions. Here, the method of molecular docking is assessed for screening the potential toxicity of a variety of xenobiotic compounds, including pesticides, pharmaceuticals, pollutants and toxins deriving from the chemical industry. The method predicts the binding energy of the pollutants to a set of carefully selected receptors, under the assumption that toxicity in many cases is related to interference with biochemical pathways. The strength of the applied method lies in its rapid generation of interaction maps between potential toxins and the targeted enzymes, which could quickly yield molecular-level information and insight into potential perturbation pathways, aiding in the prioritisation of chemicals for further tests. Two scoring functions are compared: AutoDock Vina and the machine-learning scoring function RF-Score-VS. The results are promising, though hampered by the accuracy of the scoring functions. The strengths and weaknesses of the docking protocol are discussed, as well as future directions for improving the accuracy for the purpose of toxicity predictions.


2020 ◽  
Author(s):  
Baldomero Imbernón ◽  
Antonio Serrano ◽  
Andrés Bueno-Crespo ◽  
José L Abellán ◽  
Horacio Pérez-Sánchez ◽  
...  

Motivation: Molecular docking methods are extensively used to predict the interaction between protein–ligand systems in terms of structure and binding affinity, through the optimization of a physics-based scoring function. However, the computational requirements of these simulations grow exponentially with: (i) the global optimization procedure, (ii) the number and degrees of freedom of molecular conformations generated and (iii) the mathematical complexity of the scoring function. Results: In this work, we introduce a novel molecular docking method named METADOCK 2, which incorporates several novel features, such as (i) a ligand-dependent blind docking approach that exhaustively scans the whole protein surface to detect novel allosteric sites, (ii) an optimization method to enable the use of a wide branch of metaheuristics and (iii) a heterogeneous implementation based on multicore CPUs and multiple graphics processing units. Two representative scoring functions implemented in METADOCK 2 are extensively evaluated in terms of computational performance and accuracy using several benchmarks (such as the well-known DUD) against AutoDock 4.2 and AutoDock Vina. Results place METADOCK 2 as an efficient and accurate docking methodology able to deal with complex systems where computational demands are staggering and which outperforms both AutoDock Vina and AutoDock 4. Availability and implementation: https://[email protected]/Baldoimbernon/metadock_2.git. Supplementary information: Supplementary data are available at Bioinformatics online.



