Overview of the SAMPL6 host-guest binding affinity prediction challenge

2018 ◽  
Author(s):  
Andrea Rizzi ◽  
Steven Murkli ◽  
John N. McNeill ◽  
Wei Yao ◽  
Matthew Sullivan ◽  
...  

Accurately predicting the binding affinities of small organic molecules to biological macromolecules can greatly accelerate drug discovery by reducing the number of compounds that must be synthesized to realize desired potency and selectivity goals. Unfortunately, the process of assessing the accuracy of current computational approaches to affinity prediction against binding data to biological macromolecules is frustrated by several challenges, such as slow conformational dynamics, multiple titratable groups, and the lack of high-quality blinded datasets. Over the last several SAMPL blind challenge exercises, host-guest systems have emerged as a practical and effective way to circumvent these challenges in assessing the predictive performance of current-generation quantitative modeling tools, while still providing systems capable of possessing tight binding affinities. Here, we present an overview of the SAMPL6 host-guest binding affinity prediction challenge, which featured three supramolecular hosts: octa-acid (OA), the closely related tetra-endo-methyl-octa-acid (TEMOA), and cucurbit[8]uril (CB8), along with 21 small organic guest molecules. A total of 119 entries were received from 10 participating groups employing a variety of methods that spanned from electronic structure and movable type calculations in implicit solvent to alchemical and potential of mean force strategies using empirical force fields with explicit solvent models. While empirical models tended to obtain better performance than first-principles methods, it was not possible to identify a single approach that consistently provided superior results across all host-guest systems and statistical metrics. Moreover, the accuracy of the methodologies generally displayed a substantial dependence on the system considered, emphasizing the need for host diversity in blind evaluations. Several entries exploited previous experimental measurements of similar host-guest systems in an effort to improve their physics-based predictions via some manner of rudimentary machine learning; while this strategy succeeded in reducing systematic errors, it did not correspond to an improvement in statistical correlation. Comparison to previous rounds of the host-guest binding free energy challenge highlights an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement regarding root mean square error over the past several challenge rounds. The data suggest that further refinement of force field parameters, as well as improved treatment of chemical effects (e.g., buffer salt conditions, protonation states), may be required to further enhance predictive accuracy.
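The finding that bias correction reduced systematic error without improving correlation is consistent with a general property: Pearson correlation does not change when a constant is added to every prediction. The following Python sketch illustrates this with made-up free energies and a hypothetical offset, assumed to have been learned from earlier data. None of the numbers come from the challenge, and the participants' actual corrections may have been more elaborate.

```python
import math

def rmse(pred, expt):
    """Root mean square error between predictions and experiment."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from first principles."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical binding free energies in kcal/mol (illustrative values only).
experiment = [-5.0, -6.2, -7.1, -8.4, -9.0]
predicted = [-8.1, -8.9, -10.5, -11.0, -12.6]

# Hypothetical systematic offset, assumed to come from earlier measurements.
offset = -3.0
corrected = [p - offset for p in predicted]

rmse_before = rmse(predicted, experiment)
rmse_after = rmse(corrected, experiment)
r_before = pearson_r(predicted, experiment)
r_after = pearson_r(corrected, experiment)

print(f"RMSE: {rmse_before:.2f} -> {rmse_after:.2f}")
print(f"Pearson r: {r_before:.3f} -> {r_after:.3f}")
```

In this toy case the RMSE drops because the shared offset is removed, while the correlation stays the same because the relative ordering and spacing of the predictions are untouched.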

2020 ◽  
Vol 21 (22) ◽  
pp. 8424
Author(s):  
Yongbeom Kwon ◽  
Woong-Hee Shin ◽  
Junsu Ko ◽  
Juyong Lee

Accurate prediction of the binding affinity of a protein-ligand complex is essential for efficient and successful rational drug design. Therefore, many binding affinity prediction methods have been developed. In recent years, as deep learning has matured, it has also been applied to affinity prediction. In this work, a new neural network model that predicts the binding affinity of a protein-ligand complex structure is developed. Our model predicts the binding affinity of a complex using an ensemble of multiple independently trained networks that consist of multiple channels of 3-D convolutional neural network layers. Our model was trained using the 3772 protein-ligand complexes from the refined set of the PDBbind-2016 database and tested using the core set of 285 complexes. The benchmark results show that the Pearson correlation coefficient between the binding affinities predicted by our model and the experimental data is 0.827, which is higher than that of state-of-the-art binding affinity prediction scoring functions. Additionally, our method ranks the relative binding affinities of possible multiple binders of a protein quite accurately, comparable to the other scoring functions. Last, we measured which structural information is critical for predicting binding affinity and found that the complementarity between the protein and ligand is most important.


2020 ◽  
Author(s):  
Yongbeom Kwon ◽  
Woong-Hee Shin ◽  
Junsu Ko ◽  
Juyong Lee

Accurate prediction of the binding affinity of a protein-ligand complex is essential for efficient and successful rational drug design. In this work, a new neural network model that predicts the binding affinity of a protein-ligand complex structure is developed. Our new model predicts the binding affinity of a complex using the ensemble of multiple independently trained networks that consist of multiple channels of 3D convolutional neural network layers. Our model was trained using the 3740 protein-ligand complexes from the refined set of the PDBbind database and tested using the 270 complexes from the core set. The benchmark results show that the correlation coefficient between the predicted binding affinities by our model and the experimental data is higher than 0.72, which is comparable with the state-of-the-art binding affinity prediction methods. In addition, our method also ranks the relative binding affinities of possible multiple binders of a protein quite accurately. Last, we measured which structural information is critical for predicting binding affinity.


2018 ◽  
Vol 32 (10) ◽  
pp. 937-963 ◽  
Author(s):  
Andrea Rizzi ◽  
Steven Murkli ◽  
John N. McNeill ◽  
Wei Yao ◽  
Matthew Sullivan ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jooyong Shim ◽  
Zhen-Yu Hong ◽  
Insuk Sohn ◽  
Changha Hwang

Identifying novel drug–target interactions (DTIs) plays an important role in drug discovery. Most of the computational methods developed for predicting DTIs use binary classification, whose goal is to determine whether or not a drug–target (DT) pair interacts. However, it is more meaningful, but also more challenging, to predict the binding affinity that describes the strength of the interaction between a DT pair. If the binding affinity is not sufficiently large, the drug may not be useful. Therefore, methods for predicting DT binding affinities are very valuable. The increase in novel public affinity data available in DT-related databases enables advanced deep learning techniques to be used to predict binding affinities. In this paper, we propose a similarity-based model that applies a 2-dimensional (2D) convolutional neural network (CNN) to the outer products between column vectors of two similarity matrices for the drugs and targets to predict DT binding affinities. To the best of our knowledge, this is the first application of a 2D CNN in similarity-based DT binding affinity prediction. The validation results on multiple public datasets show that the proposed model is an effective approach for DT binding affinity prediction and can be quite helpful in the drug development process.
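The input construction described above, an outer product between one column of the drug similarity matrix and one column of the target similarity matrix, can be sketched in a few lines of Python. The matrices, their sizes, and the helper names (`column`, `outer`, `pair_feature_map`) are hypothetical and chosen only for illustration; the paper's actual preprocessing and network are not reproduced here.

```python
def column(matrix, index):
    """Return column `index` of a matrix stored as a list of rows."""
    return [row[index] for row in matrix]

def outer(u, v):
    """Outer product of two vectors, giving a len(u) x len(v) matrix."""
    return [[a * b for b in v] for a in u]

# Hypothetical similarity matrices (symmetric, ones on the diagonal).
drug_sim = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.1],
    [0.5, 0.1, 1.0],
]
target_sim = [
    [1.0, 0.3],
    [0.3, 1.0],
]

def pair_feature_map(drug_index, target_index):
    """Build the 2D map for one drug-target pair from their similarity columns."""
    drug_profile = column(drug_sim, drug_index)
    target_profile = column(target_sim, target_index)
    return outer(drug_profile, target_profile)

# Feature map for drug 0 paired with target 1: a 3 x 2 matrix.
feature = pair_feature_map(0, 1)
for row in feature:
    print(row)
```

Each drug-target pair then yields a matrix with one row per drug and one column per target in the dataset, which is the kind of image-like input a 2D CNN can consume.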


2019 ◽  
Author(s):  
Guanglei Cui ◽  
Alan P. Graves ◽  
Eric S. Manas

Relative binding affinity prediction is a critical component in computer-aided drug design. A significant amount of effort has been dedicated to developing rapid and reliable in silico methods. However, robust assessment of their performance is still a complicated issue, as it requires a performance measure applicable in the prospective setting and, more importantly, a true null model that defines the expected performance of random prediction in an objective manner. Although many performance metrics, such as the correlation coefficient (r²), mean unsigned error (MUE), and root mean square error (RMSE), are frequently used in the literature, a true and non-trivial null model has yet to be identified. To address this problem, here we introduce an interval estimate as an additional measure, namely the prediction interval (PI), which can be estimated from the error distribution of the predictions. The benefits of using the interval estimate are that 1) it provides the uncertainty range in the predicted activities, which is important in prospective applications; and 2) a true null model with a well-defined PI can be established. We provide one such example, termed the Gaussian Random Affinity Model (GRAM), which is based on the empirical observation that the affinity change in a typical lead optimization effort tends to be normally distributed, N(0, s). Having an analytically defined PI that depends only on the variation in the activities, GRAM should in principle allow us to compare the performance of relative binding affinity prediction methods in a standard way, which is ultimately critical to measuring the progress made in algorithm development.
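As a rough illustration of estimating a prediction interval from an error distribution, the Python sketch below assumes the errors are approximately normal and uses a z-value of 1.96 for a nominal 95% interval. The data are invented, and the zero-change baseline is a simplified stand-in chosen for this example. It is not the paper's definition of GRAM.

```python
import statistics

def prediction_interval(errors, z=1.96):
    """Approximate interval for prediction errors, assuming normality.

    Uses the sample mean and sample standard deviation of the errors and
    ignores finite-sample corrections, so it is only a rough estimate.
    """
    center = statistics.mean(errors)
    spread = statistics.stdev(errors)
    return center - z * spread, center + z * spread

# Hypothetical relative affinity changes (e.g. in log units); not real data.
actual_change = [-1.2, 0.4, 0.9, -0.3, 1.5, -0.8, 0.1, -0.6]
predicted_change = [-0.9, 0.1, 1.2, -0.5, 1.0, -0.4, 0.3, -0.9]

# Errors of the hypothetical method.
method_errors = [p - a for p, a in zip(predicted_change, actual_change)]

# Stand-in baseline (an assumption of this sketch): always predict no change,
# so its errors depend only on the spread of the activities themselves.
baseline_errors = [0.0 - a for a in actual_change]

method_pi = prediction_interval(method_errors)
baseline_pi = prediction_interval(baseline_errors)
method_width = method_pi[1] - method_pi[0]
baseline_width = baseline_pi[1] - baseline_pi[0]

print(f"method PI width:   {method_width:.2f}")
print(f"baseline PI width: {baseline_width:.2f}")
```

A method whose interval is not clearly narrower than the baseline's adds little information beyond the spread of the activities, which is the kind of comparison a well-defined null model makes possible.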


2019 ◽  
Author(s):  
Mohammad Rezaei ◽  
Yanjun Li ◽  
Xiaolin Li ◽  
Chenglong Li

Introduction: The ability to discriminate among ligands binding to the same protein target in terms of their relative binding affinity lies at the heart of structure-based drug design. Any improvement in the accuracy and reliability of binding affinity prediction methods decreases the discrepancy between experimental and computational results.
Objectives: The primary objectives were to find the most relevant features affecting binding affinity prediction, to minimize manual feature engineering, and to improve the reliability of binding affinity prediction using efficient deep learning models with tuned hyperparameters.
Methods: The binding site of each target protein was represented as a grid box around its bound ligand. Both binary and distance-dependent occupancies were examined for how an atom affects its neighboring voxels in this grid. A combination of different features, including ANOLEA, ligand elements, and Arpeggio atom types, was used to represent the input. An efficient convolutional neural network (CNN) architecture, DeepAtom, was developed, trained, and tested on the PDBbind v2016 dataset. Additionally, an extended benchmark dataset was compiled to train and evaluate the models.
Results: The best DeepAtom model showed improved accuracy in binding affinity prediction on the PDBbind core subset (Pearson's R = 0.83) and outperformed recent state-of-the-art models in this field. In addition, when the DeepAtom model was trained on our proposed benchmark dataset, it yielded a higher correlation than the baseline, which confirms the value of our model.
Conclusions: The promising results for the predicted binding affinities are expected to pave the way for embedding deep learning models in virtual screening and rational drug design.
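The difference between binary and distance-dependent voxel occupancy can be sketched as follows. The abstract does not give the grid size, the atom radius, or the functional form of the distance dependence. The values below and the Gaussian decay are assumptions made for this example only, and `occupancy_grid` is a hypothetical helper, not part of DeepAtom.

```python
import math

def voxel_centers(center, n, spacing):
    """Yield (index, position) for an n x n x n grid centered on `center`."""
    half = (n - 1) / 2.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                position = (
                    center[0] + (i - half) * spacing,
                    center[1] + (j - half) * spacing,
                    center[2] + (k - half) * spacing,
                )
                yield (i, j, k), position

def occupancy_grid(atoms, center, n=5, spacing=1.0, radius=1.5, mode="binary"):
    """Map each voxel index to an occupancy value in [0, 1].

    mode="binary":   1.0 if any atom lies within `radius` of the voxel center.
    mode="gaussian": assumed smooth decay exp(-d^2 / (2 * radius^2)),
                     taking the strongest contribution over all atoms.
    """
    grid = {}
    for index, position in voxel_centers(center, n, spacing):
        value = 0.0
        for atom in atoms:
            d = math.dist(position, atom)
            if mode == "binary":
                contribution = 1.0 if d <= radius else 0.0
            else:
                contribution = math.exp(-(d * d) / (2.0 * radius * radius))
            value = max(value, contribution)
        grid[index] = value
    return grid

# One hypothetical atom at the grid center (coordinates in angstroms).
atoms = [(0.0, 0.0, 0.0)]
binary = occupancy_grid(atoms, center=(0.0, 0.0, 0.0), mode="binary")
smooth = occupancy_grid(atoms, center=(0.0, 0.0, 0.0), mode="gaussian")

print(binary[(2, 2, 2)], binary[(0, 0, 0)])
print(smooth[(2, 2, 2)], round(smooth[(0, 0, 0)], 4))
```

With binary occupancy, distant voxels carry no signal at all, whereas the smooth variant lets an atom contribute a small, decaying value to its neighbors. In a real pipeline there would typically be one such grid per feature channel.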

