OnionNet-2: A Convolutional Neural Network Model for Predicting Protein-Ligand Binding Affinity Based on Residue-Atom Contacting Shells

Accurate prediction of the binding affinity between a protein and a ligand is an important step in drug discovery. Although many methods based on different assumptions and rules exist, the prediction performance for protein–ligand binding affinity is not yet satisfactory. This paper proposes a new cascade graph-based convolutional neural network architecture that handles non-Euclidean, irregular data. We represent the molecule as a graph and use a simple linear transformation to deal with the sparsity of the one-hot encoding of the original data. The first stage adopts an ARMA graph convolutional neural network to learn the characteristics of atomic space in the protein–ligand complex. In the second stage, a variant of the MPNN graph convolutional neural network is introduced with chemical bond information and interactive atomic features. Finally, the architecture passes through a global add pool and a fully connected layer, and outputs a single scalar value as the predicted binding affinity. Experiments on the PDBbind v2016 data set showed that our method is better than most current methods. Our method is also comparable to the state-of-the-art method on the data set, and is more intuitive and simple.
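The pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean-aggregation `graph_conv` layer is a simplified stand-in for the ARMA and MPNN layers, and the function names, layer sizes, and random weights are all hypothetical. It shows the dense linear transformation of one-hot atom types, two stacked graph convolutions, global add pooling, and a linear readout to one scalar.

```python
import numpy as np


def one_hot(indices, num_classes):
    """Encode integer atom types as sparse one-hot rows."""
    out = np.zeros((len(indices), num_classes))
    out[np.arange(len(indices)), indices] = 1.0
    return out


def graph_conv(x, adj, weight):
    """Simplified graph convolution (assumption: a stand-in for ARMA/MPNN).

    Each node averages its own features with those of its neighbours,
    then applies a linear map followed by a ReLU.
    """
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)  # node degrees, always >= 1
    return np.maximum((a_hat / deg) @ x @ weight, 0.0)


def init_params(rng, num_atom_types, hidden):
    """Random weights for illustration only (hypothetical sizes)."""
    return {
        "embed": rng.normal(size=(num_atom_types, hidden)),
        "w1": rng.normal(size=(hidden, hidden)),
        "w2": rng.normal(size=(hidden, hidden)),
        "out": rng.normal(size=hidden),
        "bias": 0.0,
    }


def predict_affinity(atom_types, adj, params):
    """One-hot -> linear transform -> two conv stages -> add pool -> scalar."""
    num_types = params["embed"].shape[0]
    x = one_hot(atom_types, num_types) @ params["embed"]  # dense node features
    h = graph_conv(x, adj, params["w1"])                  # first stage
    h = graph_conv(h, adj, params["w2"])                  # second stage
    pooled = h.sum(axis=0)                                # global add pool
    return float(pooled @ params["out"] + params["bias"])
```

Because the readout sums over nodes, the predicted value does not depend on the order in which the atoms are listed, which is one reason add pooling is a common choice for molecular graphs.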


Automated, Accurate, and Scalable Relative Protein-Ligand Binding Free Energy Calculations using Lambda Dynamics

10.26434/chemrxiv.12781310.v1 ◽

2020 ◽

Author(s):

E. Prabhu Raman ◽

Thomas J. Paul ◽

Ryan L. Hayes ◽

Charles L. Brooks III

Keyword(s):

Free Energy ◽

Ligand Binding ◽

Binding Affinity ◽

Binding Free Energy ◽

Computational Cost ◽

Combinatorial Libraries ◽

Free Energy Calculations ◽

Lead Optimization ◽

Efficient Estimation ◽

Lead Compound

Accurate predictions of changes to protein-ligand binding affinity in response to chemical modifications are of utility in small molecule lead optimization. Relative free energy perturbation (FEP) approaches are among the most widely used for this goal, but involve significant computational cost, which limits their application to small sets of compounds. Lambda dynamics, also rigorously based on the principles of statistical mechanics, provides a more efficient alternative. In this paper, we describe the development of a workflow to set up, execute, and analyze Multi-Site Lambda Dynamics (MSLD) calculations run on GPUs with CHARMm implemented in BIOVIA Discovery Studio and Pipeline Pilot. The workflow establishes a framework for setting up simulation systems for exploratory screening of modifications to a lead compound, enabling the calculation of relative binding affinities of combinatorial libraries. To validate the workflow, a diverse dataset of congeneric ligands for seven proteins with experimental binding affinity data is examined. A protocol is developed that iteratively and automatically fits biasing potentials to flatten the free energy landscape of any MSLD system, which enhances sampling and allows for efficient estimation of free energy differences. The protocol is first validated on a large number of ligand subsets that model diverse substituents, where it shows accurate and reliable performance. The scalability of the workflow is also tested by screening more than a hundred ligands modeled in a single system, which also resulted in accurate predictions. With a cumulative sampling time of 150 ns or less, the method results in average unsigned errors of under 1 kcal/mol in most cases for both small and large combinatorial libraries. For the multi-site systems examined, the method is estimated to be more than an order of magnitude more efficient than contemporary FEP applications. The results thus demonstrate the utility of the presented MSLD workflow to efficiently screen combinatorial libraries and explore chemical space around a lead compound, and thus are of utility in lead optimization.
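The two quantities the abstract reports, relative binding free energies and average unsigned errors, can be made concrete with a short sketch. It assumes the standard thermodynamic-cycle relation, in which the relative binding free energy of ligand B versus ligand A is the free energy of the A to B transformation in the bound state minus that in the unbound (solvated) state. The function names and the numbers in the usage note are hypothetical and are not taken from the paper.

```python
import numpy as np


def relative_binding_free_energy(dg_bound, dg_unbound):
    """ddG_bind(A->B) = dG(A->B, bound) - dG(A->B, unbound), in kcal/mol."""
    return dg_bound - dg_unbound


def average_unsigned_error(predicted, experimental):
    """Mean absolute deviation between predicted and experimental values."""
    predicted = np.asarray(predicted, dtype=float)
    experimental = np.asarray(experimental, dtype=float)
    if predicted.shape != experimental.shape:
        raise ValueError("predicted and experimental must have the same shape")
    return float(np.mean(np.abs(predicted - experimental)))
```

For example, with made-up values, `relative_binding_free_energy(-3.0, -2.0)` gives -1.0 kcal/mol, meaning the modification is predicted to improve binding, and `average_unsigned_error([1.0, -0.5], [0.5, 0.5])` gives 0.75 kcal/mol.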


Learning from Docked Ligands: Ligand-Based Features Rescue Structure-Based Scoring Functions When Trained On Docked Poses

10.26434/chemrxiv.13637756 ◽

2021 ◽

Author(s):

Fergus Boyles ◽

Charlotte M Deane ◽

Garrett Morris

Keyword(s):

Machine Learning ◽

Ligand Binding ◽

Crystal Structures ◽

Binding Affinity ◽

Scoring Function ◽

Scoring Functions ◽

Data Set ◽

Core Sets ◽

Strong Performance

Machine learning scoring functions for protein-ligand binding affinity have been found to consistently outperform classical scoring functions when trained and tested on crystal structures of bound protein-ligand complexes. However, it is less clear how these methods perform when applied to docked poses of complexes.

We explore how the use of docked, rather than crystallographic, poses for both training and testing affects the performance of machine learning scoring functions. Using the PDBbind Core Sets as benchmarks, we show that the performance of a structure-based machine learning scoring function trained and tested on docked poses is lower than that of the same scoring function trained and tested on crystallographic poses. We construct a hybrid scoring function by combining both structure-based and ligand-based features, and show that its ability to predict binding affinity using docked poses is comparable to that of purely structure-based scoring functions trained and tested on crystal poses. Despite strong performance on docked poses of the PDBbind Core Sets, we find that our hybrid scoring function fails to generalise to a new data set, demonstrating the need for improved scoring functions and additional validation benchmarks.

Code and data to reproduce our results are available from https://github.com/oxpig/learning-from-docked-poses.


OnionNet: a Multiple-Layer Intermolecular-Contact-Based Convolutional Neural Network for Protein–Ligand Binding Affinity Prediction

ACS Omega ◽

10.1021/acsomega.9b01997 ◽

2019 ◽

Vol 4 (14) ◽

pp. 15956-15965 ◽

Cited By ~ 12

Author(s):

Liangzhen Zheng ◽

Jingrong Fan ◽

Yuguang Mu

Keyword(s):

Neural Network ◽

Ligand Binding ◽

Convolutional Neural Network ◽

Binding Affinity ◽

Intermolecular Contact ◽

Binding Affinity Prediction ◽

Affinity Prediction ◽

Multiple Layer



An Analysis of Proteochemometric and Conformal Prediction Machine Learning Protein-Ligand Binding Affinity Models

10.26434/chemrxiv.11750544 ◽

2020 ◽

Author(s):

Conor Parks ◽

Zied Gaieb ◽

Rommie Amaro

Keyword(s):

Neural Network ◽

Random Forest ◽

Ligand Binding ◽

Binding Affinity ◽

Prediction Intervals ◽

Grand Challenge ◽

Feed Forward Neural Network ◽

Data Set ◽

Conformal Prediction ◽

External Test

Protein-ligand binding affinity is a key pharmacodynamic endpoint in drug discovery. Sole reliance on experimental design, make, and test cycles is costly and time-consuming, providing an opportunity for computational methods to assist. Herein, we present results comparing random forest and feed-forward neural network proteochemometric models for their ability to predict pIC50 measurements for held-out generic Bemis-Murcko scaffolds. In addition, we assess the ability of conformal prediction to provide calibrated prediction intervals in both a retrospective and a semi-prospective test, using the recently released Grand Challenge 4 data set as an external test set. Overall, random forest and deep neural network proteochemometric models show good retrospective performance but suffer in the semi-prospective setting. However, the conformal prediction intervals prove to be well calibrated both retrospectively and semi-prospectively, showing that they can be used to guide hit discovery and lead optimization campaigns.
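To illustrate what a calibrated prediction interval means here, the sketch below implements split (inductive) conformal prediction with absolute residuals as the nonconformity score. The abstract does not say which conformal variant the authors used, so this choice, the function name, and the parameter names are assumptions. Any point predictor, such as a random forest or a neural network, could supply the predictions.

```python
import numpy as np


def conformal_interval(cal_true, cal_pred, test_pred, alpha=0.1):
    """Split conformal intervals with target coverage 1 - alpha.

    cal_true, cal_pred: measured and predicted values (e.g. pIC50) on a
    held-out calibration set that was not used to train the model.
    test_pred: point predictions for new compounds.
    """
    scores = np.sort(np.abs(np.asarray(cal_true, dtype=float)
                            - np.asarray(cal_pred, dtype=float)))
    n = len(scores)
    # Rank of the calibration residual that gives finite-sample coverage.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the only valid interval is unbounded.
    q = scores[k - 1] if k <= n else np.inf
    test_pred = np.asarray(test_pred, dtype=float)
    return test_pred - q, test_pred + q
```

The interval width is the same for every compound in this simple variant. Normalised nonconformity scores, which scale the residual by an estimate of prediction difficulty, give compound-specific widths.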


One-dimensional deep learning inversion of electromagnetic induction data using convolutional neural network

Geophysical Journal International ◽

10.1093/gji/ggaa161 ◽

2020 ◽

Vol 222 (1) ◽

pp. 247-259 ◽

Cited By ~ 2

Author(s):

Davood Moghadas

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Electromagnetic Induction ◽

Computational Cost ◽

Single Step ◽

Accurate Estimation ◽

Well Performance ◽

Convolutional Network ◽

Data Set

Conventional geophysical inversion techniques suffer from several limitations, including computational cost, nonlinearity, non-uniqueness and dimensionality of the inverse problem. Successful inversion of geophysical data has been a major challenge for decades. Here, a novel approach based on deep learning (DL) inversion via a convolutional neural network (CNN) is proposed to instantaneously estimate subsurface electrical conductivity (σ) layering from electromagnetic induction (EMI) data. A fully convolutional network was trained on a large synthetic data set generated with a 1-D EMI forward model. The accuracy of the proposed approach was examined using several synthetic scenarios. Moreover, the trained network was used to find subsurface electromagnetic conductivity images (EMCIs) from EMI data measured along two transects from the Chicken Creek catchment (Brandenburg, Germany). Dipole–dipole electrical resistivity tomography data were measured as well to obtain reference subsurface σ distributions down to a depth of 6 m. The inversely estimated models were compared with their counterparts obtained from a spatially constrained deterministic algorithm used as a standard code. Theoretical simulations demonstrated good performance of the algorithm even in the presence of noise in the data. Moreover, application of the DL inversion to subsurface imaging of the Chicken Creek catchment demonstrated the accuracy and robustness of the proposed approach for EMI inversion. This approach returns the subsurface σ distribution directly from EMI data in a single step without any iterations. The proposed strategy considerably simplifies EMI inversion and allows for rapid and accurate estimation of subsurface EMCI from multiconfiguration EMI data.




Epilepsy Detection by Using Scalogram Based Convolutional Neural Network from EEG Signals

Brain Sciences ◽

10.3390/brainsci9050115 ◽

2019 ◽

Vol 9 (5) ◽

pp. 115 ◽

Cited By ~ 11

Author(s):

Ömer Türk ◽

Mehmet Siraç Özerdem

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Classification Performance ◽

Data Sets ◽

Great Success ◽

Success Rates ◽

Learning Networks ◽

Eeg Signals ◽

Data Set ◽

Conventional Methods

Studies based on electroencephalogram (EEG) signals are progressing rapidly, and new methods in the field allow brain-computer interfaces (BCI) and disease detection to reach useful success rates. Using these signals effectively, especially for disease detection, matters in terms of both time and cost. EEG studies currently rely on conventional methods as well as on deep learning networks, which have recently achieved great success. The main reason is that, in conventional methods, improving classification accuracy depends on substantial human effort: feature extraction is the most important step in EEG processing, and it is time-consuming and requires investigating many feature methods. There is therefore a need for methods that require no human effort in this step and can learn the features themselves. To that end, two-dimensional (2D) time-frequency scalograms were obtained in this study by applying the Continuous Wavelet Transform to EEG records from five different classes. A Convolutional Neural Network was used to learn the properties of these scalogram images, and its classification performance was compared with studies in the literature. The University of Bonn data set was used for this comparison. The data set consists of five sets of EEG records, labeled A, B, C, D, and E, that cover healthy subjects and epilepsy patients. In binary classification, the A-E and B-E data sets were classified with 99.50% accuracy and the A-D and B-D data sets with 100% accuracy; in ternary classification, A-D-E reached 99.00%; in quaternary classification, A-C-D-E reached 90.50% and B-C-D-E reached 91.50%; and in five-class classification, A-B-C-D-E reached 93.60%.
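The scalogram step described above can be sketched with plain NumPy. The abstract does not name the mother wavelet, so the complex Morlet wavelet, the parameter `w0`, and the function names below are assumptions for illustration. The sketch covers only the transform that turns a 1-D signal into a 2-D time-scale power image; the CNN classifier is not shown.

```python
import numpy as np


def morlet_kernel(scale, w0=6.0):
    """Complex Morlet wavelet sampled at unit spacing and dilated by `scale`."""
    half = int(np.ceil(4 * scale))             # cover about +/- 4 std devs
    t = np.arange(-half, half + 1) / scale     # dimensionless time
    psi = np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)
    return psi / np.sqrt(scale)                # 1/sqrt(scale) normalisation


def scalogram(signal, scales, w0=6.0):
    """Return |CWT|^2 with shape (len(scales), len(signal)).

    For the Morlet wavelet, the conjugated and time-reversed kernel equals
    the kernel itself, so the CWT correlation can be computed as a plain
    convolution.
    """
    signal = np.asarray(signal, dtype=float)
    rows = []
    for s in scales:
        kernel = morlet_kernel(s, w0)
        if len(kernel) > len(signal):
            raise ValueError("scale too large for the signal length")
        coeffs = np.convolve(signal, kernel, mode="same")
        rows.append(np.abs(coeffs) ** 2)
    return np.array(rows)
```

Each row of the result corresponds to one scale, and larger scales respond to lower frequencies. For a Morlet wavelet with `w0 = 6`, a sinusoid at `f` cycles per sample produces its strongest response near scale `w0 / (2 * pi * f)`. The resulting array can be rendered as an image and passed to an image classifier.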



A Cascade Graph Convolutional Network for Predicting Protein–Ligand Binding Affinity
