Binding affinity prediction for protein-ligand complex using deep attention mechanism based on intermolecular interactions

Abstract Accurate prediction of protein-ligand binding affinity is important because it can lower the overall cost of drug discovery in structure-based drug design. For more accurate prediction, many classical scoring functions and machine learning-based methods have been developed. However, these techniques tend to have limitations, mainly resulting from a lack of sufficient interaction energy terms to describe complex interactions between proteins and ligands. Recent deep-learning techniques show strong potential to solve this problem, but the search for more efficient and appropriate deep-learning architectures and methods to represent protein-ligand complexes continues. In this study, we proposed a deep neural network for more accurate prediction of protein-ligand complex binding affinity. The proposed model has two important features: descriptor embeddings that contain embedded information about the local structures of a protein-ligand complex, and an attention mechanism for highlighting descriptors that are important for binding affinity prediction. The proposed model showed better performance on most benchmark datasets than existing binding affinity prediction models. Moreover, we confirmed that the attention mechanism was able to capture binding sites in a protein-ligand complex and that it contributed to improvement in predictive performance. Our code is available at https://github.com/Blue1993/BAPA. Author summary The initial step in drug discovery is to identify drug candidates for a target protein using a scoring function. Existing scoring functions, however, lack the ability to accurately predict the binding affinity of protein-ligand complexes. In this study, we proposed a deep learning-based approach to extract patterns from the local structures of protein-ligand complexes and to highlight the important local structures via an attention mechanism. The proposed model showed good performance on various benchmark datasets compared to existing models.

Binding affinity prediction for protein–ligand complex using deep attention mechanism based on intermolecular interactions

BMC Bioinformatics ◽

10.1186/s12859-021-04466-0 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Sangmin Seo ◽

Jonghwan Choi ◽

Sanghyun Park ◽

Jaegyoon Ahn

Keyword(s):

Deep Learning ◽

Binding Affinity ◽

Prediction Models ◽

Attention Mechanism ◽

Scoring Functions ◽

Ligand Complex ◽

Structure Based Drug Design ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

Proposed Model

Abstract Background Accurate prediction of protein–ligand binding affinity is important for lowering the overall cost of drug discovery in structure-based drug design. For accurate predictions, many classical scoring functions and machine learning-based methods have been developed. However, these techniques tend to have limitations, mainly resulting from a lack of sufficient energy terms to describe the complex interactions between proteins and ligands. Recent deep-learning techniques can potentially solve this problem. However, the search for more efficient and appropriate deep-learning architectures and methods to represent protein–ligand complexes is ongoing. Results In this study, we proposed a deep neural network model to improve the prediction accuracy of protein–ligand complex binding affinity. The proposed model has two important features: descriptor embeddings with information on the local structures of a protein–ligand complex, and an attention mechanism to highlight important descriptors for binding affinity prediction. The proposed model performed better than existing binding affinity prediction models on most benchmark datasets. Conclusions We confirmed that an attention mechanism can capture the binding sites in a protein–ligand complex to improve prediction performance. Our code is available at https://github.com/Blue1993/BAPA.
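
The idea of an attention mechanism that highlights important descriptors can be illustrated with a minimal sketch. This is generic dot-product attention pooling, not the architecture from the paper: the names `attention_pool`, `embeddings`, and `query`, as well as the toy vectors, are assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, query):
    """Score each descriptor embedding against a query vector (dot product),
    normalise the scores with softmax, and return the weights together with
    the weighted sum of the embeddings."""
    scores = [sum(e * q for e, q in zip(emb, query)) for emb in embeddings]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * emb[d] for w, emb in zip(weights, embeddings))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical toy input: three 2-dimensional descriptor embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # in a trained model this vector would be learned
weights, pooled = attention_pool(embeddings, query)
```

Descriptors whose embeddings align with the query receive larger weights, so the weights themselves can be inspected to see which local structures the model emphasises.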

AK-Score: Accurate Protein-Ligand Binding Affinity Prediction Using an Ensemble of 3D-Convolutional Neural Networks

International Journal of Molecular Sciences ◽

10.3390/ijms21228424 ◽

2020 ◽

Vol 21 (22) ◽

pp. 8424

Author(s):

Yongbeom Kwon ◽

Woong-Hee Shin ◽

Junsu Ko ◽

Juyong Lee

Keyword(s):

Neural Network ◽

Binding Affinity ◽

Pearson Correlation ◽

Complex Structure ◽

Rational Drug Design ◽

Scoring Functions ◽

Binding Affinities ◽

Ligand Complex ◽

Binding Affinity Prediction ◽

Affinity Prediction

Accurate prediction of the binding affinity of a protein-ligand complex is essential for efficient and successful rational drug design. Therefore, many binding affinity prediction methods have been developed. In recent years, as deep learning has become more powerful, it has also been applied to affinity prediction. In this work, a new neural network model that predicts the binding affinity of a protein-ligand complex structure is developed. Our model predicts the binding affinity of a complex using an ensemble of multiple independently trained networks, each consisting of multiple channels of 3-D convolutional neural network layers. Our model was trained using the 3772 protein-ligand complexes from the refined set of the PDBbind-2016 database and tested using the core set of 285 complexes. The benchmark results show that the Pearson correlation coefficient between the binding affinities predicted by our model and the experimental data is 0.827, which is higher than that of state-of-the-art binding affinity prediction scoring functions. Additionally, our method ranks the relative binding affinities of multiple possible binders of a protein quite accurately, comparably to other scoring functions. Last, we measured which structural information is critical for predicting binding affinity and found that the complementarity between the protein and ligand is most important.
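
As a rough illustration of the evaluation described above, the sketch below combines the outputs of several models and scores the result with the Pearson correlation coefficient. Taking the plain mean of the member predictions is an assumption (the abstract does not say how the networks are combined), and all numbers are invented toy values, not results from the paper.

```python
import math


def ensemble_predict(member_predictions):
    """Average per-complex predictions across models (assumed combination rule).

    member_predictions holds one list of predicted affinities per model,
    all in the same complex order."""
    n_models = len(member_predictions)
    return [sum(column) / n_models for column in zip(*member_predictions)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical predictions from three models for three complexes.
member_predictions = [
    [5.0, 6.0, 7.0],
    [5.4, 6.2, 7.4],
    [4.6, 5.8, 6.6],
]
experimental = [5.1, 6.3, 6.9]  # hypothetical measured affinities

ensemble = ensemble_predict(member_predictions)
r = pearson_r(ensemble, experimental)
```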

Learning from the ligand: using ligand-based features to improve binding affinity prediction

Bioinformatics ◽

10.1093/bioinformatics/btz665 ◽

2019 ◽

Cited By ~ 7

Author(s):

Fergus Boyles ◽

Charlotte M Deane ◽

Garrett M Morris

Keyword(s):

Machine Learning ◽

Binding Affinity ◽

Pearson Correlation ◽

Scoring Function ◽

Supplementary Information ◽

Scoring Functions ◽

Limited Information ◽

Ligand Complex ◽

Binding Affinity Prediction ◽

Affinity Prediction

Abstract Motivation Machine learning scoring functions for protein–ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein–ligand complex, with limited information about the chemical or topological properties of the ligand itself. Results We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest (RF) combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.836, 0.780 and 0.821 on the PDBbind 2007, 2013 and 2016 core sets, respectively, compared to 0.790, 0.746 and 0.814 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, an RF using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets. Availability and implementation Data and code to reproduce all the results are freely available at http://opig.stats.ox.ac.uk/resources. Supplementary information Supplementary data are available at Bioinformatics online.

Learning from the Ligand: Using Ligand-Based Features to Improve Binding Affinity Prediction

10.26434/chemrxiv.8174525.v1 ◽

2019 ◽

Cited By ~ 1

Author(s):

Fergus Boyles ◽

Charlotte M Deane ◽

Garrett Morris

Keyword(s):

Machine Learning ◽

Random Forest ◽

Binding Affinity ◽

Pearson Correlation ◽

Scoring Function ◽

Scoring Functions ◽

Limited Information ◽

Ligand Complex ◽

Binding Affinity Prediction ◽

Affinity Prediction

Machine learning scoring functions for protein-ligand binding affinity prediction have been found to consistently outperform classical scoring functions. Structure-based scoring functions for universal affinity prediction typically use features describing interactions derived from the protein-ligand complex, with limited information about the chemical or topological properties of the ligand itself. We demonstrate that the performance of machine learning scoring functions is consistently improved by the inclusion of diverse ligand-based features. For example, a Random Forest combining the features of RF-Score v3 with RDKit molecular descriptors achieved Pearson correlation coefficients of up to 0.831, 0.785, and 0.821 on the PDBbind 2007, 2013, and 2016 core sets respectively, compared to 0.790, 0.737, and 0.797 when using the features of RF-Score v3 alone. Excluding proteins and/or ligands that are similar to those in the test sets from the training set has a significant effect on scoring function performance, but does not remove the predictive power of ligand-based features. Furthermore, a Random Forest using only ligand-based features is predictive at a level similar to classical scoring functions, and it appears to be predicting the mean binding affinity of a ligand for its protein targets.

Learning from the Ligand: Using Ligand-Based Features to Improve Binding Affinity Prediction

10.26434/chemrxiv.8174525 ◽

2019 ◽

Cited By ~ 1

Author(s):

Fergus Boyles ◽

Charlotte M Deane ◽

Garrett Morris

Keyword(s):

Machine Learning ◽

Random Forest ◽

Binding Affinity ◽

Pearson Correlation ◽

Scoring Function ◽

Scoring Functions ◽

Limited Information ◽

Ligand Complex ◽

Binding Affinity Prediction ◽

Affinity Prediction

Improving the Accuracy of Protein-Ligand Binding Affinity Prediction by Deep Learning Models: Benchmark and Model

10.26434/chemrxiv.9866912 ◽

2019 ◽

Author(s):

Mohammad Rezaei ◽

Yanjun Li ◽

Xiaolin Li ◽

Chenglong Li

Keyword(s):

Deep Learning ◽

Drug Design ◽

Binding Affinity ◽

Benchmark Dataset ◽

Rational Drug Design ◽

Learning Models ◽

Structure Based Drug Design ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

Rational Drug

Introduction: The ability to discriminate among ligands binding to the same protein target in terms of their relative binding affinity lies at the heart of structure-based drug design. Any improvement in the accuracy and reliability of binding affinity prediction methods decreases the discrepancy between experimental and computational results. Objectives: The primary objectives were to find the most relevant features affecting binding affinity prediction, to minimize manual feature engineering, and to improve the reliability of binding affinity prediction using efficient deep learning models by tuning the model hyperparameters. Methods: The binding site of each target protein was represented as a grid box around its bound ligand. Both binary and distance-dependent occupancies were examined for how an atom affects its neighboring voxels in this grid. A combination of different features, including ANOLEA, ligand elements, and Arpeggio atom types, was used to represent the input. An efficient convolutional neural network (CNN) architecture, DeepAtom, was developed, trained, and tested on the PDBbind v2016 dataset. Additionally, an extended benchmark dataset was compiled to train and evaluate the models. Results: The best DeepAtom model showed improved accuracy in binding affinity prediction on the PDBbind core subset (Pearson's R = 0.83) and is better than recent state-of-the-art models in this field. In addition, when the DeepAtom model was trained on our proposed benchmark dataset, it yielded a higher correlation than the baseline, which confirms the value of our model. Conclusions: The promising results for the predicted binding affinities are expected to pave the way for embedding deep learning models in virtual screening and rational drug design.
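
The difference between binary and distance-dependent occupancy can be sketched on a small grid. The Gaussian-style decay used for the distance-dependent mode, the function names, and the grid parameters are all assumptions made for illustration; the abstract does not specify the functional form used by DeepAtom.

```python
import math


def voxel_centers(origin, spacing, n):
    """Yield (index, center) for every voxel of an n x n x n grid."""
    for i in range(n):
        for j in range(n):
            for k in range(n):
                center = tuple(
                    o + (idx + 0.5) * spacing
                    for o, idx in zip(origin, (i, j, k))
                )
                yield (i, j, k), center


def occupancy_grid(atoms, origin, spacing, n, radius, mode="binary"):
    """Map each voxel index to an occupancy value in [0, 1].

    binary:   1.0 if any atom lies within `radius` of the voxel center.
    distance: exp(-(d / radius)^2) for the closest-contributing atom
              (an assumed decay function, chosen only for this sketch)."""
    grid = {}
    for index, center in voxel_centers(origin, spacing, n):
        value = 0.0
        for atom in atoms:
            d = math.dist(center, atom)
            if mode == "binary":
                contribution = 1.0 if d <= radius else 0.0
            else:
                contribution = math.exp(-((d / radius) ** 2))
            value = max(value, contribution)
        grid[index] = value
    return grid


# Hypothetical single atom at the center of a 3 x 3 x 3 grid.
atoms = [(1.5, 1.5, 1.5)]
binary = occupancy_grid(atoms, (0.0, 0.0, 0.0), 1.0, 3, radius=1.0, mode="binary")
smooth = occupancy_grid(atoms, (0.0, 0.0, 0.0), 1.0, 3, radius=1.0, mode="distance")
```

In the binary grid the corner voxel is empty, whereas the distance-dependent grid still assigns it a small nonzero value, so the network sees how far each voxel is from the nearest atom rather than a hard cutoff.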

Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening

Wiley Interdisciplinary Reviews Computational Molecular Science ◽

10.1002/wcms.1225 ◽

2015 ◽

Vol 5 (6) ◽

pp. 405-424 ◽

Cited By ~ 101

Author(s):

Qurrat Ul Ain ◽

Antoniya Aleksandrova ◽

Florian D. Roessler ◽

Pedro J. Ballester

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Binding Affinity ◽

Scoring Functions ◽

Binding Affinity Prediction ◽

Affinity Prediction

Development and evaluation of a deep learning model for protein–ligand binding affinity prediction

Bioinformatics ◽

10.1093/bioinformatics/bty374 ◽

2018 ◽

Vol 34 (21) ◽

pp. 3666-3674 ◽

Cited By ~ 62

Author(s):

Marta M Stepniewska-Dziubinska ◽

Piotr Zielenkiewicz ◽

Pawel Siedlecki

Keyword(s):

Deep Learning ◽

Ligand Binding ◽

Binding Affinity ◽

Learning Model ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

Deep Learning Model

The Impact of Protein Structure and Sequence Similarity on the Accuracy of Machine-Learning Scoring Functions for Binding Affinity Prediction

Biomolecules ◽

10.3390/biom8010012 ◽

2018 ◽

Vol 8 (1) ◽

pp. 12 ◽

Cited By ~ 24

Author(s):

Hongjian Li ◽

Jiangjun Peng ◽

Yee Leung ◽

Kwong-Sak Leung ◽

Man-Hon Wong ◽

...

Keyword(s):

Machine Learning ◽

Protein Structure ◽

Binding Affinity ◽

Sequence Similarity ◽

Scoring Functions ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

The Impact

CSCORE: A SIMPLE YET EFFECTIVE SCORING FUNCTION FOR PROTEIN–LIGAND BINDING AFFINITY PREDICTION USING MODIFIED CMAC LEARNING ARCHITECTURE

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972001100577x ◽

2011 ◽

Vol 09 (supp01) ◽

pp. 1-14 ◽

Cited By ~ 20

Author(s):

XUCHANG OUYANG ◽

STEPHANUS DANIEL HANDOKO ◽

CHEE KEONG KWOH

Keyword(s):

Binding Affinity ◽

Scoring Function ◽

Binding Mode ◽

Computational Method ◽

Data Driven ◽

Machine Learning Techniques ◽

Ligand Docking ◽

Scoring Functions ◽

Binding Affinity Prediction ◽

Affinity Prediction

Protein–ligand docking is a computational method to identify the binding mode of a ligand and a target protein, and to predict the corresponding binding affinity using a scoring function. This method has great value in drug design. After decades of development, scoring functions can now typically identify the true binding mode, but the prediction of binding affinity remains a major problem. Here we present CScore, a data-driven scoring function using a modified Cerebellar Model Articulation Controller (CMAC) learning architecture, for accurate binding affinity prediction. The performance of CScore in terms of correlation between predicted and experimental binding affinities is benchmarked under different validation approaches. CScore achieves a prediction with R = 0.7668 and RMSE = 1.4540 when tested on an independent dataset. To the best of our knowledge, this result outperforms other scoring functions tested on the same dataset. The performance of CScore varies across clusters under the leave-cluster-out validation approach, but it still achieves a competitive result. Lastly, the target-specific CScore achieves an even better result, with R = 0.8237 and RMSE = 1.0872, when trained on a much smaller but more relevant dataset for each target. The large body of structural information on protein–ligand complexes and advances in machine learning techniques enable the data-driven approach to binding affinity prediction. CScore is capable of accurate binding affinity prediction. It is also shown that CScore performs better if sufficient and relevant data are provided. As the amount of publicly available structural data grows, further improvement of this scoring scheme can be expected.
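
A leave-cluster-out evaluation of the kind mentioned above can be sketched as follows: every complex carries a cluster label, and each cluster in turn is held out as the test set while the remaining clusters form the training set. The cluster labels, the helper names, and the toy numbers are hypothetical; how the paper defines its clusters is not stated in the abstract.

```python
import math
from collections import defaultdict


def leave_cluster_out_splits(cluster_ids):
    """Yield (cluster, train_indices, test_indices), holding out one cluster at a time."""
    by_cluster = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        by_cluster[cluster].append(index)
    for cluster, test_indices in by_cluster.items():
        train_indices = [
            i for i, other in enumerate(cluster_ids) if other != cluster
        ]
        yield cluster, train_indices, test_indices


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed affinities."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical cluster labels for five complexes.
cluster_ids = ["kinase", "kinase", "protease", "protease", "other"]
splits = list(leave_cluster_out_splits(cluster_ids))

# Hypothetical predictions and measurements for one held-out cluster.
error = rmse([1.0, 2.0], [1.0, 4.0])
```

Because no complex from the held-out cluster appears in training, the per-cluster error indicates how well a scoring function generalises to targets unlike those it was fitted on.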
