Overview of the SAMPL6 host-guest binding affinity prediction challenge

Abstract: Accurately predicting the binding affinities of small organic molecules to biological macro-molecules can greatly accelerate drug discovery by reducing the number of compounds that must be synthesized to realize desired potency and selectivity goals. Unfortunately, the process of assessing the accuracy of current computational approaches to affinity prediction against binding data to biological macro-molecules is frustrated by several challenges, such as slow conformational dynamics, multiple titratable groups, and the lack of high-quality blinded datasets. Over the last several SAMPL blind challenge exercises, host-guest systems have emerged as a practical and effective way to circumvent these challenges in assessing the predictive performance of current-generation quantitative modeling tools, while still providing systems capable of possessing tight binding affinities. Here, we present an overview of the SAMPL6 host-guest binding affinity prediction challenge, which featured three supramolecular hosts: octa-acid (OA), the closely related tetra-endo-methyl-octa-acid (TEMOA), and cucurbit[8]uril (CB8), along with 21 small organic guest molecules. A total of 119 entries were received from 10 participating groups employing a variety of methods that spanned from electronic structure and movable type calculations in implicit solvent to alchemical and potential of mean force strategies using empirical force fields with explicit solvent models. While empirical models tended to obtain better performance than first-principles methods, it was not possible to identify a single approach that consistently provided superior results across all host-guest systems and statistical metrics. Moreover, the accuracy of the methodologies generally displayed a substantial dependence on the system considered, emphasizing the need for host diversity in blind evaluations.
Several entries exploited previous experimental measurements of similar host-guest systems in an effort to improve their physics-based predictions via some manner of rudimentary machine learning; while this strategy succeeded in reducing systematic errors, it did not correspond to an improvement in statistical correlation. Comparison to previous rounds of the host-guest binding free energy challenge highlights an overall improvement in the correlation obtained by the affinity predictions for OA and TEMOA systems, but a surprising lack of improvement regarding root mean square error over the past several challenge rounds. The data suggest that further refinement of force field parameters, as well as improved treatment of chemical effects (e.g., buffer salt conditions, protonation states), may be required to further enhance predictive accuracy.
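The observation that a correction can reduce systematic error without improving correlation can be illustrated with a minimal sketch. The free energies below are invented for illustration and are not SAMPL6 data, and the correction shown is a plain constant offset, which is only an assumed stand-in for whatever the participating groups actually did.

```python
import math

def rmse(pred, expt):
    """Root mean square error between predictions and experiment."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(pred))

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical binding free energies in kcal/mol (illustrative values only).
expt = [-5.0, -6.2, -7.1, -8.3, -9.0]
pred = [-7.4, -8.1, -9.9, -10.2, -11.8]

# Assumed correction: subtract the mean signed error from every prediction.
offset = sum(p - e for p, e in zip(pred, expt)) / len(pred)
corrected = [p - offset for p in pred]

print(f"RMSE before: {rmse(pred, expt):.2f}, after: {rmse(corrected, expt):.2f}")
print(f"Pearson r before: {pearson_r(pred, expt):.3f}, "
      f"after: {pearson_r(corrected, expt):.3f}")
```

Because Pearson's r is unchanged by adding a constant to every prediction, the offset lowers the RMSE while leaving the correlation where it was.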

AK-Score: Accurate Protein-Ligand Binding Affinity Prediction Using an Ensemble of 3D-Convolutional Neural Networks

International Journal of Molecular Sciences ◽ 10.3390/ijms21228424 ◽ 2020 ◽ Vol 21 (22) ◽ pp. 8424

Author(s): Yongbeom Kwon ◽ Woong-Hee Shin ◽ Junsu Ko ◽ Juyong Lee

Keyword(s): Neural Network ◽ Binding Affinity ◽ Pearson Correlation ◽ Complex Structure ◽ Rational Drug Design ◽ Scoring Functions ◽ Binding Affinities ◽ Ligand Complex ◽ Binding Affinity Prediction ◽ Affinity Prediction

Accurate prediction of the binding affinity of a protein-ligand complex is essential for efficient and successful rational drug design, and many binding affinity prediction methods have therefore been developed. In recent years, as deep learning has become more powerful, it has also been applied to affinity prediction. In this work, we develop a new neural network model that predicts the binding affinity of a protein-ligand complex structure. The model predicts the binding affinity of a complex using an ensemble of multiple independently trained networks, each consisting of multiple channels of 3D convolutional neural network layers. The model was trained on the 3772 protein-ligand complexes from the refined set of the PDBbind-2016 database and tested on the core set of 285 complexes. The benchmark results show that the Pearson correlation coefficient between the binding affinities predicted by our model and the experimental data is 0.827, which is higher than that of state-of-the-art binding affinity prediction scoring functions. Additionally, our method ranks the relative binding affinities of multiple possible binders of a protein quite accurately, with performance comparable to that of the other scoring functions. Last, we measured which structural information is critical for predicting binding affinity and found that the complementarity between the protein and ligand is most important.
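The ensemble idea can be sketched in a few lines. The abstract does not state how the individual network outputs are combined, so simple averaging is assumed here, and the three "models" are hypothetical stand-in functions rather than trained 3D convolutional networks.

```python
from statistics import mean

def ensemble_predict(models, complex_features):
    """Average the outputs of independently trained models (assumed rule)."""
    return mean([model(complex_features) for model in models])

# Hypothetical stand-ins for trained networks: each maps a feature
# vector to a predicted affinity (for example, pKd).
models = [
    lambda x: 6.0 + 0.10 * sum(x),
    lambda x: 6.4 + 0.05 * sum(x),
    lambda x: 5.6 + 0.15 * sum(x),
]

features = [1.0, 2.0, 3.0]  # illustrative input, not a real voxel grid
prediction = ensemble_predict(models, features)
print(f"ensemble prediction: {prediction:.2f}")
```

Averaging tends to damp the variance that comes from random initialization of the individual networks, which is the usual motivation for this kind of ensemble.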

AK-Score: Accurate Protein-Ligand Binding Affinity Prediction Using the Ensemble of 3D-Convolutional Neural Network

10.26434/chemrxiv.12015045 ◽ 2020

Author(s): Yongbeom Kwon ◽ Woong-Hee Shin ◽ Junsu Ko ◽ Juyong Lee

Keyword(s): Neural Network ◽ Convolutional Neural Network ◽ Binding Affinity ◽ Structural Information ◽ Complex Structure ◽ Rational Drug Design ◽ Binding Affinities ◽ Ligand Complex ◽ Binding Affinity Prediction ◽ Affinity Prediction

Accurate prediction of the binding affinity of a protein-ligand complex is essential for efficient and successful rational drug design. In this work, we develop a new neural network model that predicts the binding affinity of a protein-ligand complex structure. The model predicts the binding affinity of a complex using an ensemble of multiple independently trained networks, each consisting of multiple channels of 3D convolutional neural network layers. The model was trained on the 3740 protein-ligand complexes from the refined set of the PDBbind database and tested on the 270 complexes from the core set. The benchmark results show that the correlation coefficient between the binding affinities predicted by our model and the experimental data is higher than 0.72, which is comparable to that of state-of-the-art binding affinity prediction methods. In addition, our method ranks the relative binding affinities of multiple possible binders of a protein quite accurately. Last, we measured which structural information is critical for predicting binding affinity.

Overview of the SAMPL6 host–guest binding affinity prediction challenge

Journal of Computer-Aided Molecular Design ◽ 10.1007/s10822-018-0170-6 ◽ 2018 ◽ Vol 32 (10) ◽ pp. 937-963 ◽ Cited By ~ 50

Author(s): Andrea Rizzi ◽ Steven Murkli ◽ John N. McNeill ◽ Wei Yao ◽ Matthew Sullivan ◽ ...

Keyword(s): Binding Affinity ◽ Binding Affinity Prediction ◽ Affinity Prediction ◽ Guest Binding

Prediction of drug–target binding affinity using similarity-based convolutional neural network

Scientific Reports ◽ 10.1038/s41598-021-83679-y ◽ 2021 ◽ Vol 11 (1)

Author(s): Jooyong Shim ◽ Zhen-Yu Hong ◽ Insuk Sohn ◽ Changha Hwang

Keyword(s): Neural Network ◽ Convolutional Neural Network ◽ Binding Affinity ◽ Drug Target ◽ Binary Classification ◽ Binding Affinities ◽ Binding Affinity Prediction ◽ Affinity Prediction ◽ Target Binding ◽ Public Datasets

Abstract: Identifying novel drug–target interactions (DTIs) plays an important role in drug discovery. Most of the computational methods developed for predicting DTIs use binary classification, whose goal is to determine whether or not a drug–target (DT) pair interacts. However, it is more meaningful, but also more challenging, to predict the binding affinity that describes the strength of the interaction between a DT pair. If the binding affinity is not sufficiently large, such a drug may not be useful. Therefore, methods for predicting DT binding affinities are very valuable. The increase in novel public affinity data available in DT-related databases enables advanced deep learning techniques to be used to predict binding affinities. In this paper, we propose a similarity-based model that applies a 2-dimensional (2D) convolutional neural network (CNN) to the outer products between column vectors of two similarity matrices, one for the drugs and one for the targets, to predict DT binding affinities. To the best of our knowledge, this is the first application of a 2D CNN in similarity-based DT binding affinity prediction. The validation results on multiple public datasets show that the proposed model is an effective approach for DT binding affinity prediction and can be quite helpful in the drug development process.
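The input construction described above, an outer product between a column of the drug similarity matrix and a column of the target similarity matrix, can be sketched as follows. The similarity values and the helper names (`column`, `outer`, `pair_feature_map`) are made up for illustration; the abstract does not specify the similarity measures or the network that consumes the resulting map.

```python
def column(matrix, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in matrix]

def outer(u, v):
    """Outer product of two vectors as a len(u) x len(v) nested list."""
    return [[a * b for b in v] for a in u]

# Hypothetical similarity matrices: 3 drugs and 2 targets.
drug_sim = [
    [1.0, 0.3, 0.5],
    [0.3, 1.0, 0.2],
    [0.5, 0.2, 1.0],
]
target_sim = [
    [1.0, 0.4],
    [0.4, 1.0],
]

def pair_feature_map(drug_idx, target_idx):
    """2D map for one drug-target pair, usable as a CNN input channel."""
    return outer(column(drug_sim, drug_idx), column(target_sim, target_idx))

fmap = pair_feature_map(0, 1)
for row in fmap:
    print(row)
```

Each drug-target pair yields a map with one row per drug and one column per target, so a 2D CNN can treat it like a single-channel image.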

AK-Score: Accurate Protein-Ligand Binding Affinity Prediction Using the Ensemble of 3D-Convolutional Neural Network

10.26434/chemrxiv.12015045.v1 ◽ 2020

Author(s): Yongbeom Kwon ◽ Woong-Hee Shin ◽ Junsu Ko ◽ Juyong Lee

Keyword(s): Neural Network ◽ Convolutional Neural Network ◽ Binding Affinity ◽ Structural Information ◽ Complex Structure ◽ Rational Drug Design ◽ Binding Affinities ◽ Ligand Complex ◽ Binding Affinity Prediction ◽ Affinity Prediction

GRAM: A True Null Model for Relative Binding Affinity Predictions

10.26434/chemrxiv.9956474 ◽ 2019

Author(s): Guanglei Cui ◽ Alan P. Graves ◽ Eric S. Manas

Keyword(s): Binding Affinity ◽ Performance Metrics ◽ Null Model ◽ Performance Measure ◽ Interval Estimate ◽ Empirical Observation ◽ Binding Affinity Prediction ◽ Affinity Prediction ◽ Relative Binding Affinity ◽ Relative Binding

Relative binding affinity prediction is a critical component of computer-aided drug design. A significant amount of effort has been dedicated to developing rapid and reliable in silico methods. However, robust assessment of their performance remains a complicated issue, as it requires a performance measure applicable in the prospective setting and, more importantly, a true null model that defines the expected performance of random prediction in an objective manner. Although many performance metrics, such as the correlation coefficient (r²), mean unsigned error (MUE), and root mean square error (RMSE), are frequently used in the literature, a true and non-trivial null model has yet to be identified. To address this problem, we introduce an interval estimate as an additional measure, namely the prediction interval (PI), which can be estimated from the error distribution of the predictions. The benefits of using the interval estimate are that 1) it provides the uncertainty range of the predicted activities, which is important in prospective applications, and 2) a true null model with a well-defined PI can be established. We provide one such example, termed the Gaussian Random Affinity Model (GRAM), which is based on the empirical observation that the affinity changes in a typical lead optimization effort tend to be normally distributed, N(0, s). Having an analytically defined PI that depends only on the variation in the activities, GRAM should in principle allow us to compare the performance of relative binding affinity prediction methods in a standard way, which is ultimately critical to measuring the progress made in algorithm development.
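The interval-estimate idea can be sketched under the assumption that errors are normally distributed. The error values and the spread s = 1.5 below are invented. The abstract does not give GRAM's analytic PI formula, so the second call only shows the interval that covers 95% of draws from N(0, s) and should not be read as the authors' definition.

```python
from statistics import NormalDist, mean, stdev

def gaussian_interval(center, spread, coverage=0.95):
    """Central interval of a normal distribution with the given coverage."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return center - z * spread, center + z * spread

def prediction_interval(errors, coverage=0.95):
    """Interval estimated from observed prediction errors (normal assumption)."""
    return gaussian_interval(mean(errors), stdev(errors), coverage)

# Hypothetical signed errors (predicted minus measured), illustrative only.
errors = [0.4, -0.9, 1.1, -0.3, 0.6, -1.2, 0.2]
method_pi = prediction_interval(errors)

# Interval covering 95% of affinity changes drawn from N(0, s), s assumed.
null_pi = gaussian_interval(0.0, 1.5)

print("method interval:", tuple(round(v, 2) for v in method_pi))
print("N(0, s) interval:", tuple(round(v, 2) for v in null_pi))
```

A method whose interval is clearly narrower than the one implied by the spread of the activities is, in this simplified picture, adding information beyond that spread.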

Improving the Accuracy of Protein-Ligand Binding Affinity Prediction by Deep Learning Models: Benchmark and Model

10.26434/chemrxiv.9866912 ◽ 2019

Author(s): Mohammad Rezaei ◽ Yanjun Li ◽ Xiaolin Li ◽ Chenglong Li

Keyword(s): Deep Learning ◽ Drug Design ◽ Binding Affinity ◽ Benchmark Dataset ◽ Rational Drug Design ◽ Learning Models ◽ Structure Based Drug Design ◽ Binding Affinity Prediction ◽ Affinity Prediction ◽ Rational Drug

Introduction: The ability to discriminate among ligands binding to the same protein target in terms of their relative binding affinity lies at the heart of structure-based drug design. Any improvement in the accuracy and reliability of binding affinity prediction methods decreases the discrepancy between experimental and computational results. Objectives: The primary objectives were to find the most relevant features affecting binding affinity prediction, to minimize the use of manual feature engineering, and to improve the reliability of binding affinity prediction using efficient deep learning models by tuning the model hyperparameters. Methods: The binding site of each target protein was represented as a grid box around its bound ligand. Both binary and distance-dependent occupancies were examined for how an atom affects its neighboring voxels in this grid. A combination of different features, including ANOLEA, ligand elements, and Arpeggio atom types, was used to represent the input. An efficient convolutional neural network (CNN) architecture, DeepAtom, was developed, trained, and tested on the PDBbind v2016 dataset. Additionally, an extended benchmark dataset was compiled to train and evaluate the models. Results: The best DeepAtom model showed improved accuracy in binding affinity prediction on the PDBbind core subset (Pearson’s R=0.83) and outperformed recent state-of-the-art models in this field. In addition, when the DeepAtom model was trained on our proposed benchmark dataset, it yielded a higher correlation than the baseline, which confirms the value of our model. Conclusions: The promising results for the predicted binding affinities are expected to pave the way for embedding deep learning models in virtual screening and rational drug design.
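The difference between binary and distance-dependent voxel occupancy can be sketched on a small grid. The box size, resolution, radius, and the Gaussian decay used for the distance-dependent case are all assumptions chosen for illustration; the abstract does not give the functional form used in DeepAtom.

```python
import math
from itertools import product

def voxel_centers(box_size, resolution):
    """Centers of cubic voxels in a box centered at the origin."""
    n = int(round(box_size / resolution))
    half = box_size / 2.0
    axis = [-half + (i + 0.5) * resolution for i in range(n)]
    return list(product(axis, repeat=3))

def binary_occupancy(center, atom, radius):
    """1.0 if the voxel center lies within `radius` of the atom, else 0.0."""
    return 1.0 if math.dist(center, atom) <= radius else 0.0

def smooth_occupancy(center, atom, radius):
    """Assumed Gaussian decay with distance; not the published formula."""
    d = math.dist(center, atom)
    return math.exp(-(d * d) / (2.0 * radius * radius))

centers = voxel_centers(box_size=4.0, resolution=1.0)  # 4 x 4 x 4 grid
atom = (0.5, 0.5, 0.5)  # hypothetical atom placed on a voxel center
radius = 1.2            # illustrative radius in the same units as the grid

binary = [binary_occupancy(c, atom, radius) for c in centers]
smooth = [smooth_occupancy(c, atom, radius) for c in centers]

print("voxels switched on (binary):", int(sum(binary)))
print("largest smooth value:", max(smooth))
```

The binary scheme marks only the atom's own voxel and its six face neighbors here, while the smooth scheme assigns every voxel a value that falls off with distance.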
