Improving ΔΔG predictions with a multi-task convolutional Siamese Network

Mapping Intimacies ◽

10.26434/chemrxiv-2021-vcmzz ◽

2021 ◽

Author(s):

Andrew McNutt ◽

David Koes

Keyword(s):

Free Energy ◽

Network Architecture ◽

Binding Free Energy ◽

Experimental Testing ◽

Predictive Performance ◽

Protein Family ◽

Siamese Network ◽

Pearson's R ◽

Model Training

The lead optimization phase of drug discovery refines an initial hit molecule for desired properties, especially potency. Synthesizing and experimentally testing the small perturbations made during this refinement can be costly and time consuming. Relative binding free energy (RBFE, also referred to as ∆∆G) methods estimate the change in binding free energy that follows a small change to a ligand scaffold. Here we propose and evaluate a Convolutional Neural Network (CNN) Siamese network for predicting the RBFE between two bound ligands. We show that our multi-task loss improves on a previous state-of-the-art Siamese network for RBFE prediction via increased regularization of the latent space. The Siamese network architecture is better suited to RBFE prediction than a standard CNN trained on the same data (Pearson’s R of 0.553 and 0.5, respectively). When evaluated on a left-out protein family, our CNN Siamese network shows variable RBFE predictive performance depending on the protein family being evaluated (Pearson’s R ranging from -0.44 to 0.97). RBFE prediction performance during generalization can be improved by injecting only a few examples (few-shot learning) from the evaluation dataset during model training.
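The evaluation metric quoted in this abstract can be illustrated with a minimal Python sketch. It assumes the sign convention ∆∆G = ∆G(B) − ∆G(A), which the abstract does not state, and the free energy values are hypothetical placeholders rather than data from the paper. It shows only how a Pearson's R between predicted and experimental ∆∆G values might be computed, not the authors' network or pipeline.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def relative_binding_free_energy(dg_a, dg_b):
    """ddG for changing ligand A into ligand B (assumed convention: dG(B) - dG(A))."""
    return dg_b - dg_a


# Hypothetical absolute binding free energies in kcal/mol (not from the paper).
experimental_dg = [-9.1, -8.4, -10.2, -7.6]
predicted_dg = [-8.8, -8.9, -9.7, -7.9]

# Every ordered pair of distinct ligands yields one ddG example.
pairs = [(i, j) for i in range(4) for j in range(4) if i != j]
exp_ddg = [relative_binding_free_energy(experimental_dg[i], experimental_dg[j]) for i, j in pairs]
pred_ddg = [relative_binding_free_energy(predicted_dg[i], predicted_dg[j]) for i, j in pairs]

r = pearson_r(exp_ddg, pred_ddg)
print(f"Pearson's R on {len(pairs)} ligand pairs: {r:.3f}")
```

Because R measures only linear association, a model can reach a high R while still being offset or mis-scaled in kcal/mol, so an error metric is usually reported alongside it.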

QSAR Modeling Based on Conformation Ensembles Using a Multi-Instance Learning Approach

10.26434/chemrxiv.13456277 ◽

2020 ◽

Author(s):

Dmitry V. Zankov ◽

Mariia Matveieva ◽

Aleksandra Nikonenko ◽

Ramil Nugmanov ◽

Alexandre Varnek ◽

...

Keyword(s):

Experimental Testing ◽

Predictive Performance ◽

Bioactive Molecules ◽

Learning Approach ◽

QSAR Modeling ◽

Practical Applications ◽

Single Instance ◽

Model Training ◽

Multiple Conformations ◽

3D Descriptors

Modern QSAR approaches have wide practical applications in drug discovery for screening potentially bioactive molecules before their experimental testing. Most models predicting the bioactivity of compounds are based on molecular descriptors derived from the 2D structure, and so lose explicit information about the spatial structure of molecules, which is important for protein-ligand recognition. The major problem in constructing models using 3D descriptors is the choice of a probable bioactive conformation, which affects the predictive performance. A multi-instance (MI) learning approach that considers multiple conformations during model training can be a reasonable solution to this problem. Here, we compared MI-QSAR with the classical single-instance QSAR (SI-QSAR) approach, where each molecule was encoded by either 2D descriptors or 3D descriptors derived from the single lowest-energy conformation. The calculations were carried out on a sample of 175 datasets extracted from the ChEMBL23 database. It was demonstrated that (i) MI-QSAR outperforms SI-QSAR in numerous cases and (ii) MI algorithms can automatically identify plausible bioactive conformations. An instance-attention based network can be applied to select the most important conformer, which was shown to correspond to the PDB conformer in 50-84% of molecules.
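The multi-instance idea in this abstract, in which a molecule is a bag of conformers and attention weights indicate the most important one, can be sketched in a few lines of Python. This is a simplified illustration and not the authors' network: the scoring function is a plain dot product with a fixed vector `attention_vector`, which in a real model would be learned, and the descriptor values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(bag, attention_vector):
    """Pool a bag of conformer descriptor vectors into one molecule-level vector.

    Each conformer gets a score (dot product with attention_vector); the softmax
    of the scores gives the weights used for the weighted sum of conformers.
    """
    scores = [sum(x * w for x, w in zip(instance, attention_vector)) for instance in bag]
    weights = softmax(scores)
    dim = len(bag[0])
    pooled = [sum(weights[k] * bag[k][d] for k in range(len(bag))) for d in range(dim)]
    return pooled, weights


# Hypothetical bag: three conformers of one molecule, three 3D descriptors each.
bag = [
    [0.2, 1.0, 0.5],
    [0.9, 0.1, 0.4],
    [0.3, 0.8, 0.7],
]
attention_vector = [1.5, -0.5, 0.2]  # hypothetical; learned during training in practice

pooled, weights = attention_pool(bag, attention_vector)
# The conformer with the largest attention weight is the candidate "key" conformer.
key_conformer = max(range(len(bag)), key=lambda k: weights[k])
print("attention weights:", [round(w, 3) for w in weights])
print("index of highest-weight conformer:", key_conformer)
```

The pooled vector would feed a downstream regressor or classifier, while the weights give a per-conformer importance that can be compared against a known bound conformation.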

CODEM Instrument

GeroPsych ◽

10.1024/1662-9647/a000100 ◽

2014 ◽

Vol 27 (1) ◽

pp. 23-31 ◽

Cited By ~ 4

Author(s):

Anne Kuemmel (This author contributed eq ◽

Julia Haberstroh (This author contributed ◽

Johannes Pantel

Keyword(s):

Convergent Validity ◽

Interrater Reliability ◽

Discriminant Validity ◽

Assessment Tool ◽

Intraclass Correlation ◽

Well Being ◽

Communication Behavior ◽

People With Dementia ◽

Pearson's R

Communication and communication behaviors in situational contexts are essential conditions for well-being and quality of life in people with dementia. Measuring methods, however, are limited. The CODEM instrument, a standardized observational communication behavior assessment tool, was developed and evaluated on the basis of the current state of research in dementia care and social-communicative behavior. Initially, interrater reliability was examined by means of videoratings (N = 10 people with dementia). Thereupon, six caregivers in six German nursing homes observed 69 residents suffering from dementia and used CODEM to rate their communication behavior. The interrater reliability of CODEM was excellent (mean κ = .79; intraclass correlation = .91). Statistical analysis indicated that CODEM had excellent internal consistency (Cronbach’s α = .95). CODEM also showed excellent convergent validity (Pearson’s R = .88) as well as discriminant validity (Pearson’s R = .63). Confirmatory factor analysis verified the two-factor solution of verbal/content aspects and nonverbal/relationship aspects. With regard to the severity of the disease, the content and relational aspects of communication exhibited different trends. CODEM proved to be a reliable, valid, and sensitive assessment tool for examining communication behavior in the field of dementia. CODEM also provides researchers a feasible examination tool for measuring effects of psychosocial intervention studies that strive to improve communication behavior and well-being in dementia.
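The internal-consistency statistic reported in this abstract, Cronbach's α, follows a standard formula: α = k/(k−1) · (1 − Σ item variances / variance of total scores), where k is the number of items. The Python sketch below applies that formula to hypothetical ratings; the data are placeholders and not CODEM scores.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of scores per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance(item) for item in item_scores]
    # Total score per respondent, summed over all items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    return k / (k - 1) * (1 - sum(item_variances) / variance(totals))


# Hypothetical ratings: three items scored for five respondents.
ratings = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 5],
]
print(f"Cronbach's alpha: {cronbach_alpha(ratings):.2f}")
```

Values near 1 indicate that the items vary together across respondents, which is the sense in which the abstract describes the internal consistency as excellent.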

Large-Scale Assessment of Binding Free Energy Calculations in Active Drug Discovery Projects

10.26434/chemrxiv.11364884.v2 ◽

2020 ◽

Cited By ~ 3

Author(s):

Christina Schindler ◽

Hannah Baumann ◽

Andreas Blum ◽

Dietrich Böse ◽

Hans-Peter Buchstaller ◽

...

Keyword(s):

Free Energy ◽

Drug Discovery ◽

Large Scale ◽

Active Drug ◽

Binding Free Energy ◽

Free Energy Calculations ◽

Energy Calculation ◽

Large Scale Assessment ◽

New Public ◽

Binding Free Energy Calculations

Here we present an evaluation of the binding affinity prediction accuracy of the free energy calculation method FEP+ on internal active drug discovery projects and on a large new public benchmark set.

Automated, Accurate, and Scalable Relative Protein-Ligand Binding Free Energy Calculations using Lambda Dynamics

10.26434/chemrxiv.12781310.v1 ◽

2020 ◽

Author(s):

E. Prabhu Raman ◽

Thomas J. Paul ◽

Ryan L. Hayes ◽

Charles L. Brooks III

Keyword(s):

Free Energy ◽

Ligand Binding ◽

Binding Affinity ◽

Binding Free Energy ◽

Computational Cost ◽

Combinatorial Libraries ◽

Free Energy Calculations ◽

Lead Optimization ◽

Efficient Estimation ◽

Lead Compound

Accurate predictions of changes to protein-ligand binding affinity in response to chemical modifications are of utility in small molecule lead optimization. Relative free energy perturbation (FEP) approaches are among the most widely used for this goal, but involve significant computational cost, which limits their application to small sets of compounds. Lambda dynamics, also rigorously based on the principles of statistical mechanics, provides a more efficient alternative. In this paper, we describe the development of a workflow to set up, execute, and analyze Multi-Site Lambda Dynamics (MSLD) calculations run on GPUs with CHARMm implemented in BIOVIA Discovery Studio and Pipeline Pilot. The workflow establishes a framework for setting up simulation systems for exploratory screening of modifications to a lead compound, enabling the calculation of relative binding affinities of combinatorial libraries. To validate the workflow, a diverse dataset of congeneric ligands for seven proteins with experimental binding affinity data is examined. A protocol is developed that iteratively and automatically tailors biasing potentials to flatten the free energy landscape of any MSLD system, which enhances sampling and allows for efficient estimation of free energy differences. The protocol is first validated on a large number of ligand subsets that model diverse substituents, where it shows accurate and reliable performance. The scalability of the workflow is also tested by screening more than a hundred ligands modeled in a single system, which also resulted in accurate predictions. With a cumulative sampling time of 150 ns or less, the method results in average unsigned errors of under 1 kcal/mol in most cases for both small and large combinatorial libraries. For the multi-site systems examined, the method is estimated to be more than an order of magnitude more efficient than contemporary FEP applications. The results thus demonstrate the utility of the presented MSLD workflow for efficiently screening combinatorial libraries and exploring chemical space around a lead compound, and thus its utility in lead optimization.
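The accuracy measure cited in this abstract, the average unsigned error, is the mean absolute difference between predicted and experimental values. A minimal Python sketch follows; the numbers are hypothetical and serve only to show the arithmetic.

```python
def average_unsigned_error(predicted, experimental):
    """Mean absolute difference between paired predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


# Hypothetical relative binding free energies in kcal/mol (not from the paper).
predicted = [0.4, -1.2, 0.9, -0.3]
experimental = [0.1, -0.8, 1.5, -0.9]
aue = average_unsigned_error(predicted, experimental)
print(f"AUE = {aue:.2f} kcal/mol")
```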

Application of the ESMACS Binding Free Energy Protocol to a Highly Varied Ligand Dataset: Lactate Dehydogenase A

10.26434/chemrxiv.8398055 ◽

2019 ◽

Author(s):

David Wright ◽

Fouad Husseini ◽

Shunzhou Wan ◽

Christophe Meyer ◽

Herman Van Vlijmen ◽

...

Keyword(s):

Free Energy ◽

Surface Area ◽

Binding Free Energy ◽

Normal Mode Analysis ◽

Binding Mode ◽

Accessible Surface Area ◽

Solvent Accessible Surface Area ◽

Mode Analysis ◽

Energy Calculation ◽

Accessible Surface

Here, we evaluate the performance of our range of ensemble simulation based binding free energy calculation protocols, called ESMACS (enhanced sampling of molecular dynamics with approximation of continuum solvent), for use in fragment based drug design scenarios. ESMACS is designed to generate reproducible binding affinity predictions from the widely used molecular mechanics Poisson-Boltzmann surface area (MMPBSA) approach. We study ligands designed to target two binding pockets in the lactate dehydrogenase A target protein, which vary in size, charge and binding mode. When comparing to experimental results, we obtain excellent statistical rankings across this highly diverse set of ligands. In addition, we investigate three approaches to account for entropic contributions not captured by standard MMPBSA calculations: (1) normal mode analysis, (2) weighted solvent accessible surface area (WSAS) and (3) variational entropy.

Association/dissociation Mechanisms of Intrinsically Disordered Region of Protein Beyond Conformational Selection and Induced Fit

10.26434/chemrxiv.9874379 ◽

2019 ◽

Author(s):

Duy Phuoc Tran ◽

Akio Kitao

Keyword(s):

Free Energy ◽

Transcriptional Activation ◽

Binding Free Energy ◽

Energy Structure ◽

Induced Fit ◽

Hydrogen Bond Formation ◽

Conformational Selection ◽

Intrinsically Disordered ◽

Intrinsically Disordered Region ◽

Dissociation Mechanisms

We investigate the association and dissociation mechanisms of a typical intrinsically disordered region (IDR), the transcriptional activation subdomain of tumor repressor protein p53 (TAD-p53), with murine double-minute clone 2 protein (MDM2). Using a combination of cycles of association and dissociation parallel cascade molecular dynamics, multiple standard MD simulations, and a Markov state model, we obtain, without prior knowledge, a lowest free energy structure of the MDM2/TAD-p53 complex that is very close to the crystal structure. This method also reproduces the experimentally measured standard binding free energy and the association and dissociation rate constants with an accumulated MD simulation cost of only 11.675 μs, even though actual dissociation occurs on the order of a second. Although a few complex intermediates with similar free energies exist, TAD-p53 first binds MDM2 predominantly (> 90% in flux) as the second lowest free energy intermediate, taking a form similar to one of the intermediate structures in its monomeric state. The mechanism of this step has a feature of conformational selection. In the second step, dehydration of the interface, formation of π-π stackings of the side-chains, and main-chain relaxation/hydrogen bond formation to complete the α-helix take place, showing features of induced fit. In addition, dehydration (dewetting) is a key process for the final relaxation around the complex interface. These results demonstrate a more fine-grained view of IDR association/dissociation beyond classical views of protein conformational change upon binding.

Automation of absolute protein-ligand binding free energy calculations for docking refinement and compound evaluation

Scientific Reports ◽

10.1038/s41598-020-80769-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Germano Heinzelmann ◽

Michael K. Gilson

Keyword(s):

Free Energy ◽

Early Stage ◽

Binding Free Energy ◽

Low Cost ◽

Free Energy Calculations ◽

Free Energies ◽

Energy Calculations ◽

Drug Candidates ◽

Binding Free Energy Calculations ◽

Binding Free Energies

Absolute binding free energy calculations with explicit solvent molecular simulations can provide estimates of protein-ligand affinities, and thus reduce the time and costs needed to find new drug candidates. However, these calculations can be complex to implement and perform. Here, we introduce the software BAT.py, a Python tool that invokes the AMBER simulation package to automate the calculation of binding free energies for a protein with a series of ligands. The software supports the attach-pull-release (APR) and double decoupling (DD) binding free energy methods, as well as the simultaneous decoupling-recoupling (SDR) method, a variant of double decoupling that avoids numerical artifacts associated with charged ligands. We report encouraging initial test applications of this software both to re-rank docked poses and to estimate overall binding free energies. We also show that it is practical to carry out these calculations cheaply by using graphical processing units in common machines that can be built for this purpose. The combination of automation and low cost positions this procedure to be applied in a relatively high-throughput mode and thus stands to enable new applications in early-stage drug discovery.

Pre-Planned and Non-Planned Agility in Patients Ongoing Rehabilitation after Knee Surgery: Design, Reliability and Validity of the Newly Developed Testing Protocols

Diagnostics ◽

10.3390/diagnostics11010146 ◽

2021 ◽

Vol 11 (1) ◽

pp. 146

Author(s):

Ivan Peric ◽

Miodrag Spasic ◽

Dario Novak ◽

Sergej Ostojic ◽

Damir Sekulic

Keyword(s):

Total Knee Arthroplasty ◽

Body Mass ◽

Knee Arthroplasty ◽

Arthroscopic Surgery ◽

The Body ◽

Clinical Population ◽

Pearson's R ◽

Total Knee ◽

Type Of Surgery

Background: Due to its association with the risk of falling and consequent injury, the importance of agility is widely recognized, but no study so far has examined the different facets of agility in an untrained/clinical population. The aim of this study was to evaluate the reliability, validity, and correlates of newly developed tests of non-planned agility (NPA) and pre-planned agility (PPA) in an untrained/clinical sample. Methods: The sample comprised 38 participants older than 40 years (22 females, age: 56.1 ± 17.3 years, height: 170.4 ± 10.8 cm, mass: 82.54 ± 14.79 kg) who were involved in a rehabilitation program following total knee arthroplasty and knee arthroscopy. Variables included age, gender, type of surgery, history of falls, anthropometrics/body composition, and newly developed tests of NPA and PPA. Results: The results showed high inter-testing reliability (ICC > 0.95, CV < 9%) and intra-testing reliability (ICC > 0.96, CV < 9%) of the newly developed tests. PPA and NPA were found to be valid in differentiating between age groups (>50 yrs. vs. <50 yrs.) and genders, with better performance in younger participants and males. Only NPA differentiated participants according to type of surgery, with better performance in those who had arthroscopic surgery than in those who had total knee arthroplasty. No differences in NPA and PPA were established between groups based on fall history. In females, the body mass (Pearson’s r = 0.58 and 0.59, p < 0.001) and body fatness (Pearson’s r = 0.64 and 0.66, p < 0.001) were negatively correlated, while the lean body mass (Pearson’s r = 0.70 and 0.68, p < 0.001) was positively correlated with PPA and NPA. The NPA and PPA were highly correlated (Pearson’s r = 0.98, p < 0.001). Conclusions: We found that the proposed tests are reliable when evaluating agility characteristics in an untrained/clinical population after knee arthroplasty/arthroscopy. Further evaluation of the specific validity of the proposed tests in other specific subsamples is warranted.

Binding free energy predictions in host-guest systems using Autodock4. A retrospective analysis on SAMPL6, SAMPL7 and SAMPL8 challenges

Journal of Computer-Aided Molecular Design ◽

10.1007/s10822-021-00388-4 ◽

2021 ◽

Author(s):

Lorenzo Casbarra ◽

Piero Procacci

Keyword(s):

Molecular Dynamics ◽

Free Energy ◽

Drug Discovery ◽

Binding Free Energy ◽

Cost Ratio ◽

Benefit Cost Ratio ◽

Docking Program ◽

Benefit Cost ◽

Overall Reliability ◽

Absolute Binding Free Energy

We systematically tested the Autodock4 docking program for absolute binding free energy predictions using the host-guest systems from the recent SAMPL6, SAMPL7 and SAMPL8 challenges. We found that Autodock4 behaves surprisingly well, outperforming in many instances expensive molecular dynamics or quantum chemistry techniques, with an extremely favorable benefit-cost ratio. Some interesting features of Autodock4 predictions are revealed, yielding valuable hints on the overall reliability of docking screening campaigns in drug discovery projects.
