Application of asymmetric statistical potentials to antibody–protein docking

Ryan Brenke; David R. Hall; Gwo-Yu Chuang; Stephen R. Comeau; Tanggis Bohnuud; Dmitri Beglov; Ora Schueler-Furman; Sandor Vajda; Dima Kozakov

doi:10.1093/bioinformatics/bts493

Bioinformatics ◽

10.1093/bioinformatics/bts493 ◽

2012 ◽

Vol 28 (20) ◽

pp. 2608-2614 ◽

Cited By ~ 86

Author(s):

Ryan Brenke ◽

David R. Hall ◽

Gwo-Yu Chuang ◽

Stephen R. Comeau ◽

Tanggis Bohnuud ◽

...

Keyword(s):

Protein Complexes ◽

Protein Docking ◽

Protein Antigen ◽

Supplementary Information ◽

Sequence Information ◽

New Class ◽

Knowledge Based ◽

Docking Program ◽

Antigen Protein ◽

Multi Body

Abstract Motivation: An effective docking algorithm for antibody–protein antigen complex prediction is an important first step toward design of biologics and vaccines. We have recently developed a new class of knowledge-based interaction potentials called Decoys as the Reference State (DARS) and incorporated DARS into the docking program PIPER based on the fast Fourier transform correlation approach. Although PIPER was the best performer in the latest rounds of the CAPRI protein docking experiment, it is much less accurate for docking antibody–protein antigen pairs than other types of complexes, in spite of incorporating sequence-based information on the location of the paratope. Analysis of antibody–protein antigen complexes has revealed an inherent asymmetry within these interfaces. Specifically, phenylalanine, tryptophan and tyrosine residues highly populate the paratope of the antibody but not the epitope of the antigen. Results: Since this asymmetry cannot be adequately modeled using a symmetric pairwise potential, we have removed the usual assumption of symmetry. Interaction statistics were extracted from antibody–protein complexes under the assumption that a particular atom on the antibody is different from the same atom on the antigen protein. The use of the new potential significantly improves the performance of docking for antibody–protein antigen complexes, even without any sequence information on the location of the paratope. We note that the asymmetric potential captures the effects of the multi-body interactions inherent to the complex environment in the antibody–protein antigen interface. Availability: The method is implemented in the ClusPro protein docking server, available at http://cluspro.bu.edu. Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
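The idea of dropping the symmetry assumption can be illustrated with a small sketch. It is not the PIPER or DARS implementation: the atom-type names, the toy contact counts, the pseudocount and the plain inverse-Boltzmann log-ratio (in units of kT) are all assumptions made for illustration. The only point it shows is that contacts are keyed by the ordered pair (antibody atom type, antigen atom type), so the pair (a, b) is never merged with (b, a).

```python
import math
from collections import Counter


def asymmetric_potential(native_contacts, decoy_contacts, pseudocount=1.0):
    """Return eps[(antibody_type, antigen_type)] = -ln(f_native / f_decoy).

    Keys are ORDERED pairs: (a, b) and (b, a) are kept as separate entries,
    which is what makes the potential asymmetric. Decoy contacts play the
    role of the reference state (hypothetical, simplified form).
    """
    nat = Counter(native_contacts)
    dec = Counter(decoy_contacts)
    pairs = set(nat) | set(dec)
    # The pseudocount keeps the log finite for pairs missing from one set.
    nat_total = sum(nat.values()) + pseudocount * len(pairs)
    dec_total = sum(dec.values()) + pseudocount * len(pairs)
    eps = {}
    for pair in pairs:
        f_nat = (nat[pair] + pseudocount) / nat_total
        f_dec = (dec[pair] + pseudocount) / dec_total
        eps[pair] = -math.log(f_nat / f_dec)
    return eps


# Toy counts (made up): tyrosine hydroxyl is frequent on the antibody side
# of native interfaces but rare on the antigen side.
native = ([("TYR_OH", "ASP_OD")] * 30
          + [("ASP_OD", "TYR_OH")] * 5
          + [("LEU_CD", "LEU_CD")] * 10)
decoys = ([("TYR_OH", "ASP_OD")] * 15
          + [("ASP_OD", "TYR_OH")] * 15
          + [("LEU_CD", "LEU_CD")] * 15)

eps = asymmetric_potential(native, decoys)
for pair in sorted(eps):
    print(pair, round(eps[pair], 3))
```

With these toy counts the antibody-Tyr/antigen-Asp contact comes out favourable (negative), while the mirrored antibody-Asp/antigen-Tyr contact comes out unfavourable; a symmetric potential would assign both the same value.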

UNRES-Dock—protein–protein and peptide–protein docking by coarse-grained replica-exchange MD simulations

Bioinformatics ◽

10.1093/bioinformatics/btaa897 ◽

2020 ◽

Cited By ~ 1

Author(s):

Paweł Krupa ◽

Agnieszka S Karczyńska ◽

Magdalena A Mozolewska ◽

Adam Liwo ◽

Cezary Czaplewski

Keyword(s):

Protein Complexes ◽

Md Simulations ◽

Protein Docking ◽

Conformational Space ◽

Coarse Grained ◽

Supplementary Information ◽

Replica Exchange ◽

Variable Degree ◽

Single Chain ◽

Simulation Speed

Abstract Motivation: The majority of the proteins in living organisms occur as homo- or hetero-multimeric structures. Although there are many tools to predict the structures of single-chain proteins or protein complexes with small ligands, peptide–protein and protein–protein docking is more challenging. In this work, we utilized multiplexed replica-exchange molecular dynamics (MREMD) simulations with the physics-based, heavily coarse-grained UNRES model, which provides more than a 1000-fold simulation speed-up compared with all-atom approaches, to predict structures of protein complexes. Results: We present a new protein–protein and peptide–protein docking functionality of the UNRES package, which includes a variable degree of conformational flexibility. The UNRES-Dock protocol was tested on a set of 55 complexes ranging in size from 43 to 587 amino-acid residues, showing that structures of the complexes can be predicted with good quality if the sampling of the conformational space is sufficient, especially for flexible peptide–protein systems. The automated protocol has been implemented in the standalone UNRES package and in the UNRES server. Availability and implementation: UNRES server: http://unres-server.chem.ug.edu.pl; UNRES package and data used in testing of UNRES-Dock: http://unres.pl. Supplementary information: Supplementary data are available at Bioinformatics online.
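For readers unfamiliar with replica exchange, the core step is a Metropolis test that decides whether two replicas running at different temperatures swap configurations. The sketch below shows that standard criterion only; it is not taken from the UNRES code, it ignores the multiplexing (several replicas per temperature) used in MREMD, and the function names and the choice of kcal/mol units are assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance probability for exchanging two replicas.

    e_i, e_j: potential energies (kcal/mol) of the replicas currently at
    temperatures t_i and t_j (K). Standard replica-exchange criterion:
    p = min(1, exp((beta_i - beta_j) * (e_i - e_j))).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(e_i, t_i, e_j, t_j, rng=random):
    """Return True if the exchange is accepted."""
    return rng.random() < swap_probability(e_i, t_i, e_j, t_j)


# The colder replica holds the higher energy: the swap is always accepted.
print(swap_probability(-90.0, 300.0, -100.0, 320.0))
# The colder replica already holds the lower energy: accepted only sometimes.
print(swap_probability(-100.0, 300.0, -90.0, 320.0))
```

Exchanges of this kind let low-temperature replicas escape local minima by borrowing configurations sampled at higher temperatures, which is why the quality of the predictions depends on how thoroughly the conformational space is sampled.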

InterPep2: global peptide–protein docking using interaction surface templates

Bioinformatics ◽

10.1093/bioinformatics/btaa005 ◽

2020 ◽

Vol 36 (8) ◽

pp. 2458-2465 ◽

Cited By ~ 2

Author(s):

Isak Johansson-Åkhe ◽

Claudio Mirabello ◽

Björn Wallner

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Structural Features ◽

Protein Docking ◽

Supplementary Information ◽

Peptide Ligand ◽

Protein Protein Interactions ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Improved Performance

Abstract Motivation: Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction methods exist, most are limited in performance or availability. Results: InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and template sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among the top 10, similar to another state-of-the-art template-based method, which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on par with leading methods. The extended InterPep2-Refined protocol correctly modeled 15 of these complexes within 4.0 Å LRMSD among the top 10, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation: The program is available from http://wallnerlab.org/InterPep2. Supplementary information: Supplementary data are available at Bioinformatics online.
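The success counts quoted above follow a common docking convention: a target counts as a hit if any of its ten best-ranked models lies within the LRMSD cutoff. The sketch below illustrates that bookkeeping and checks the "22% more" figure. The function names, the toy LRMSD values and the use of an inclusive cutoff (less than or equal to 4.0 Å) are assumptions, not details taken from InterPep2.

```python
def top_n_hit(lrmsd_by_rank, n=10, cutoff=4.0):
    """True if any of the first n ranked models has LRMSD <= cutoff (in Å)."""
    return any(value <= cutoff for value in lrmsd_by_rank[:n])


def count_successes(targets, n=10, cutoff=4.0):
    """Count targets with at least one acceptable model among the top n."""
    return sum(top_n_hit(ranked, n, cutoff) for ranked in targets.values())


def relative_gain(new, baseline):
    """Fractional improvement of `new` over `baseline`."""
    return (new - baseline) / baseline


# Toy example: LRMSD values (Å) listed in the order the method ranked its models.
toy_targets = {
    "target_A": [7.9, 3.2, 12.4],            # hit: rank 2 is within 4.0 Å
    "target_B": [9.1, 8.7, 6.0],             # miss
    "target_C": [15.0] * 10 + [1.5],         # miss: the good model is rank 11
}
print(count_successes(toy_targets))           # 1

# The figures quoted in the abstract: 22 near-native predictions versus 18.
print(f"{relative_gain(22, 18):.1%}")         # 22.2%
```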

iScore: a novel graph kernel-based function for scoring protein–protein docking models

Bioinformatics ◽

10.1093/bioinformatics/btz496 ◽

2019 ◽

Vol 36 (1) ◽

pp. 112-121 ◽

Cited By ~ 9

Author(s):

Cunliang Geng ◽

Yong Jung ◽

Nicolas Renaud ◽

Vasant Honavar ◽

Alexandre M J J Bonvin ◽

...

Keyword(s):

Protein Complexes ◽

Three Dimensional ◽

Protein Docking ◽

Graph Representation ◽

Supplementary Information ◽

Scoring Functions ◽

Computational Docking ◽

3D Structures ◽

Novel Approach ◽

Protein Interfaces

Abstract Motivation: Protein complexes play critical roles in many aspects of biological functions. Three-dimensional (3D) structures of protein complexes are critical for gaining insights into structural bases of interactions and their roles in the biomolecular pathways that orchestrate key cellular processes. Because of the expense and effort associated with experimental determinations of 3D protein complex structures, computational docking has evolved as a valuable tool to predict 3D structures of biomolecular complexes. Despite recent progress, reliably distinguishing near-native docking conformations from a large number of candidate conformations, the so-called scoring problem, remains a major challenge. Results: Here we present iScore, a novel approach to scoring docked conformations that combines HADDOCK energy terms with a score obtained using a graph representation of the protein–protein interfaces and a measure of evolutionary conservation. It achieves a scoring performance competitive with, or superior to, that of state-of-the-art scoring functions on two independent datasets: (i) docking software-specific models and (ii) the CAPRI score set generated by a wide variety of docking approaches (i.e. docking software-non-specific). iScore ranks among the top scoring approaches on the CAPRI score set (13 targets) when compared with the 37 scoring groups in CAPRI. The results demonstrate the utility of combining evolutionary, topological and energetic information for scoring docked conformations. This work represents the first successful application of graph kernels to protein interfaces for effective discrimination of near-native and non-native conformations of protein complexes. Availability and implementation: The iScore code is freely available from GitHub: https://github.com/DeepRank/iScore (DOI: 10.5281/zenodo.2630567). The docking models used are available from SBGrid: https://data.sbgrid.org/dataset/684. Supplementary information: Supplementary data are available at Bioinformatics online.

Integrating ab initio and template-based algorithms for protein–protein complex structure prediction

Bioinformatics ◽

10.1093/bioinformatics/btz623 ◽

2019 ◽

Vol 36 (3) ◽

pp. 751-757 ◽

Cited By ~ 1

Author(s):

Sweta Vangaveti ◽

Thom Vreven ◽

Yang Zhang ◽

Zhiping Weng

Keyword(s):

Protein Complex ◽

Structure Prediction ◽

Protein Complexes ◽

Complex Structure ◽

Protein Docking ◽

Supplementary Information ◽

Test Case ◽

Binding Modes ◽

Success Rates ◽

Template Free

Abstract Motivation: Template-based and template-free methods have both been widely used in predicting the structures of protein–protein complexes. Template-based modeling is effective when a reliable template is available, while template-free methods are required for predicting the binding modes or interfaces that have not been previously observed. Our goal is to combine the two methods to improve computational protein–protein complex structure prediction. Results: Here, we present a method to identify and combine high-confidence predictions of a template-based method (SPRING) with a template-free method (ZDOCK). Cross-validated using the protein–protein docking benchmark version 5.0, our method (ZING) achieved a success rate of 68.2%, outperforming SPRING and ZDOCK, with success rates of 52.1% and 35.9%, respectively, when the top 10 predictions were considered per test case. In conclusion, a statistics-based method that evaluates and integrates predictions from template-based and template-free methods is more successful than either method independently. Availability and implementation: ZING is available for download as a GitHub repository (https://github.com/weng-lab/ZING.git). Supplementary information: Supplementary data are available at Bioinformatics online.

ProteinFishing: a protein complex generator within the ModelX toolsuite

Bioinformatics ◽

10.1093/bioinformatics/btaa533 ◽

2020 ◽

Vol 36 (14) ◽

pp. 4208-4210

Author(s):

Damiano Cianferoni ◽

Leandro G Radusky ◽

Sarah A Head ◽

Luis Serrano ◽

Javier Delgado

Keyword(s):

Force Field ◽

Protein Interactions ◽

Protein Complex ◽

Interface Design ◽

Protein Complexes ◽

Interleukin 10 ◽

Supplementary Information ◽

For Profit ◽

Knowledge Based ◽

Command Line Tool

Abstract Summary: Accurate 3D modelling of protein–protein interactions (PPIs) is essential to compensate for the absence of experimentally determined complex structures. Here, we present a new set of commands within the ModelX toolsuite capable of generating atomic-level protein complexes suitable for interface design. Among these commands, the new tool ProteinFishing proposes known and/or putative alternative 3D PPIs for a given protein complex. The algorithm exploits backbone compatibility of protein fragments to generate mutually exclusive protein interfaces that are quickly evaluated with a knowledge-based statistical force field. Using interleukin-10-R2 co-crystallized with interferon-lambda-3, and a database of X-ray structures containing interleukin-10, this algorithm was able to generate interleukin-10-R2/interleukin-10 structural models in agreement with experimental data. Availability and implementation: ProteinFishing is a portable command-line tool included in the ModelX toolsuite, written in C++, that makes use of an SQL (tested for MySQL and MariaDB) relational database delivered with a template SQL dump called FishXDB. FishXDB contains the empty tables of ModelX fragments and the data used by the embedded statistical force field. ProteinFishing is compiled for Linux-64bit, MacOS-64bit and Windows-32bit operating systems. The software is released under a proprietary license and is distributed as an executable with its corresponding database dumps. It can be downloaded publicly at http://modelx.crg.es/. Licenses are freely available for academic users after registration on the website and are available under commercial license for for-profit organizations or companies. Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Text mining for modeling of protein complexes enhanced by machine learning

Bioinformatics ◽

10.1093/bioinformatics/btaa823 ◽

2020 ◽

Author(s):

Varsha D Badal ◽

Petras J Kundrotas ◽

Ilya A Vakser

Keyword(s):

Machine Learning ◽

Text Mining ◽

Protein Interactions ◽

Full Text ◽

Protein Complexes ◽

Protein Docking ◽

Supplementary Information ◽

Support Vector ◽

Learning Approaches ◽

Protein Protein Interactions

Abstract Motivation: Procedures for structural modeling of protein–protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein–protein interactions may generate such constraints. However, the absence of post-processing of the spotted residues reduced the usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. Results: We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to use the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. Availability: The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. Supplementary information: Supplementary data are available at Bioinformatics online.

pyDockEneRes: per-residue decomposition of protein–protein docking energy

Bioinformatics ◽

10.1093/bioinformatics/btz884 ◽

2019 ◽

Vol 36 (7) ◽

pp. 2284-2285 ◽

Cited By ~ 1

Author(s):

Miguel Romero-Durana ◽

Brian Jiménez-García ◽

Juan Fernández-Recio

Keyword(s):

Binding Affinity ◽

Protein Interactions ◽

Structural Model ◽

Protein Complexes ◽

Complex Structure ◽

Protein Docking ◽

Supplementary Information ◽

Scoring Functions ◽

Residue Decomposition ◽

Docking Energy

Abstract Motivation: Protein–protein interactions are key to understanding biological processes at the molecular level. As a complement to experimental characterization of protein interactions, computational docking methods have become useful tools for the structural and energetics modeling of protein–protein complexes. A key aspect of such algorithms is the use of scoring functions to evaluate the generated docking poses and try to identify the best models. When the scoring functions are based on energetic considerations, they can help not only to provide a reliable structural model for the complex, but also to describe energetic aspects of the interaction. This is the case for the scoring function used in pyDock, a combination of electrostatics, desolvation and van der Waals energy terms. Its correlation with experimental binding affinity values of protein–protein complexes was explored in the past, but the per-residue decomposition of the docking energy was never systematically analyzed. Results: Here, we present pyDockEneRes (pyDock Energy per-Residue), a web server that provides pyDock docking energy partitioned at the residue level, giving a much more detailed description of the docking energy landscape. Additionally, pyDockEneRes computes the contribution to the docking energy of the side-chain atoms. This fast approach can be applied to characterize a complex structure in order to identify energetically relevant residues (hot-spots) and estimate binding affinity changes upon mutation to alanine. Availability and implementation: The server does not require registration by the user and is freely accessible for academics at https://life.bsc.es/pid/pydockeneres. Supplementary information: Supplementary data are available at Bioinformatics online.
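One simple way to picture a per-residue decomposition is to take the pairwise receptor–ligand energy terms and give half of each term to each partner residue, so that the per-residue values add back up to the total. The sketch below does exactly that. It is an illustration under stated assumptions, not the pyDockEneRes algorithm: the half-and-half split, the residue labels and the energy values are hypothetical, and a non-pairwise term such as desolvation would need separate treatment.

```python
from collections import defaultdict


def per_residue_energy(pair_energies):
    """Split each receptor-ligand pair energy equally between its two residues.

    pair_energies maps (receptor_residue, ligand_residue) -> energy.
    The returned per-residue values sum to the total interaction energy.
    """
    per_res = defaultdict(float)
    for (rec_res, lig_res), energy in pair_energies.items():
        per_res[rec_res] += 0.5 * energy
        per_res[lig_res] += 0.5 * energy
    return dict(per_res)


# Made-up pairwise terms (arbitrary energy units) for a toy interface.
toy_pairs = {
    ("A:ARG45", "B:ASP12"): -3.2,
    ("A:ARG45", "B:GLU15"): -1.1,
    ("A:LEU80", "B:ASP12"): 0.4,
}

contrib = per_residue_energy(toy_pairs)
total = sum(toy_pairs.values())

# Rank residues from most to least favourable contribution.
for residue, energy in sorted(contrib.items(), key=lambda item: item[1]):
    print(f"{residue:10s} {energy:6.2f}")
print("total", round(total, 2))
```

Residues at the top of such a ranking are the natural hot-spot candidates; in this toy case the arginine dominates because it takes part in both favourable pairs.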

PatchMAN docking: Modeling peptide-protein interactions in the context of the receptor surface

10.1101/2021.09.02.458699 ◽

2021 ◽

Author(s):

Alisa Khramushin ◽

Tomer Tsaban ◽

Julia Varga ◽

Orly Avraham ◽

Ora Schueler-Furman

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Protein Structures ◽

Binding Pocket ◽

Protein Docking ◽

Peptide Sequence ◽

Sequence Information ◽

Peptide Docking ◽

Docking Approach ◽

Conformer Ensemble

Abstract Peptide docking can be perceived as a subproblem of protein-protein docking. However, due to the short length and flexible nature of peptides, many do not adopt one defined conformation prior to binding. Therefore, to tackle a peptide docking problem, not only the relative orientation between the two partners, but also the bound conformation of the peptide needs to be modeled. Traditional peptide-centered approaches use information about the peptide sequence to generate a representative conformer ensemble, which can then be rigid body docked to the receptor. Alternatively, one may look at this problem from the viewpoint of the receptor, namely that the protein surface defines the peptide bound conformation. We present PatchMAN (Patch-Motif AligNments), a novel peptide docking approach which uses structural motifs to map the receptor surface with backbone scaffolds extracted from protein structures. On a non-redundant set of protein-peptide complexes, starting from free receptor structures, PatchMAN successfully models and identifies near-native peptide-protein complexes in 62% / 81% within 2.5 Å / 5 Å RMSD, with corresponding sampling in 81% / 100% of the cases, outperforming other approaches. PatchMAN leverages the observation that structural units of peptides with their binding pocket can be found not only within interfaces, but also within monomers. We show that the conformation of the bound peptide is sampled based on the structural context of the receptor only, without taking into account any sequence information. Beyond peptide docking, this approach opens exciting new avenues to study principles of peptide-protein association, and to the design of new peptide binders.

SCRIBER: accurate and partner type-specific prediction of protein-binding residues from proteins sequences

Bioinformatics ◽

10.1093/bioinformatics/btz324 ◽

2019 ◽

Vol 35 (14) ◽

pp. i343-i353 ◽

Cited By ~ 10

Author(s):

Jian Zhang ◽

Lukasz Kurgan

Keyword(s):

Protein Binding ◽

Protein Interactions ◽

Rna Binding ◽

Protein Complexes ◽

Predictive Performance ◽

Protein Docking ◽

Supplementary Information ◽

Binding Residue ◽

Binding Residues ◽

The Cross

Abstract Motivation: Accurate prediction of protein-binding residues (PBRs) enhances understanding of molecular-level rules governing protein–protein interactions, helps protein–protein docking and facilitates annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results: We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: a comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize an innovative two-layer architecture where the first layer generates a prediction of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69% and our conservative estimates show that it is at least 3 times faster. We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation: SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/. Supplementary information: Supplementary data are available at Bioinformatics online.

Patch-DCA: improved protein interface prediction by utilizing structural information and clustering DCA scores

Bioinformatics ◽

10.1093/bioinformatics/btz791 ◽

2019 ◽

Cited By ~ 1

Author(s):

Amir Vajdi ◽

Kourosh Zarringhalam ◽

Nurit Haspel

Keyword(s):

Structural Information ◽

Protein Complexes ◽

Supplementary Information ◽

Evolutionary Information ◽

Sequence Information ◽

Protein Interface ◽

Residue Contacts ◽

Interface Prediction ◽

Sequential Information ◽

The Individual

Abstract Motivation: Over the past decade, there have been impressive advances in determining the 3D structures of protein complexes. However, there are still many complexes with unknown structures, even when the structures of the individual proteins are known. The advent of protein sequence information provides an opportunity to leverage evolutionary information to enhance the accuracy of protein–protein interface prediction. To this end, several statistical and machine learning methods have been proposed. In particular, direct coupling analysis has recently emerged as a promising approach for identification of protein contact maps from sequential information. However, the ability of these methods to detect protein–protein inter-residue contacts remains relatively limited. Results: In this work, we propose a method to integrate sequential and co-evolution information with structural and functional information to increase the performance of protein–protein interface prediction. Further, we present a post-processing clustering method that improves the average relative F1 score by 70% and 24% and the average relative precision by 80% and 36% in comparison with two state-of-the-art methods, PSICOV and GREMLIN. Availability and implementation: https://github.com/BioMLBoston/PatchDCA. Supplementary information: Supplementary data are available at Bioinformatics online.
