pyDockEneRes: per-residue decomposition of protein–protein docking energy

Miguel Romero-Durana; Brian Jiménez-García; Juan Fernández-Recio

doi:10.1093/bioinformatics/btz884

pyDockEneRes: per-residue decomposition of protein–protein docking energy

Bioinformatics ◽

10.1093/bioinformatics/btz884 ◽

2019 ◽

Vol 36 (7) ◽

pp. 2284-2285 ◽

Cited By ~ 1

Author(s):

Miguel Romero-Durana ◽

Brian Jiménez-García ◽

Juan Fernández-Recio

Keyword(s):

Binding Affinity ◽

Protein Interactions ◽

Structural Model ◽

Protein Complexes ◽

Complex Structure ◽

Protein Docking ◽

Supplementary Information ◽

Scoring Functions ◽

Residue Decomposition ◽

Docking Energy

Abstract Motivation Protein–protein interactions are key to understanding biological processes at the molecular level. As a complement to experimental characterization of protein interactions, computational docking methods have become useful tools for the structural and energetics modeling of protein–protein complexes. A key aspect of such algorithms is the use of scoring functions to evaluate the generated docking poses and try to identify the best models. When the scoring functions are based on energetic considerations, they can help not only to provide a reliable structural model for the complex, but also to describe energetic aspects of the interaction. This is the case for the scoring function used in pyDock, a combination of electrostatics, desolvation and van der Waals energy terms. Its correlation with experimental binding affinity values of protein–protein complexes was explored in the past, but the per-residue decomposition of the docking energy was never systematically analyzed. Results Here, we present pyDockEneRes (pyDock Energy per-Residue), a web server that provides pyDock docking energy partitioned at the residue level, giving a much more detailed description of the docking energy landscape. Additionally, pyDockEneRes computes the contribution to the docking energy of the side-chain atoms. This fast approach can be applied to characterize a complex structure in order to identify energetically relevant residues (hot-spots) and estimate binding affinity changes upon mutation to alanine. Availability and implementation The server does not require registration by the user and is freely accessible for academics at https://life.bsc.es/pid/pydockeneres. Supplementary information Supplementary data are available at Bioinformatics online.
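
As a rough illustration of what partitioning a pairwise docking energy at the residue level can look like, the sketch below sums a toy Coulomb-like term over receptor-ligand atom pairs and credits each pair energy to the residue on either side. The atom record layout, the `pair_energy` function, the backbone atom set and the example coordinates are all assumptions made for illustration; they are not the pyDock energy terms or the server's implementation.

```python
from collections import defaultdict
from math import dist

# Hypothetical atom record: (chain, residue_id, atom_name, charge, (x, y, z)).
BACKBONE = {"N", "CA", "C", "O"}  # assumed backbone atom names


def pair_energy(a, b):
    """Toy Coulomb-like term q1*q2/r; a stand-in, not the pyDock scoring function."""
    r = max(dist(a[4], b[4]), 1.0)  # clamp very short distances
    return a[3] * b[3] / r


def decompose(receptor, ligand):
    """Return (total, per_residue, side_chain) energies.

    Each pair energy is credited in full to the residue on either side, so the
    receptor residues sum to the total, and so do the ligand residues.
    """
    total = 0.0
    per_residue = defaultdict(float)
    side_chain = defaultdict(float)
    for a in receptor:
        for b in ligand:
            e = pair_energy(a, b)
            total += e
            for atom in (a, b):
                key = (atom[0], atom[1])
                per_residue[key] += e
                if atom[2] not in BACKBONE:
                    side_chain[key] += e  # side-chain atoms only
    return total, dict(per_residue), dict(side_chain)


# Made-up coordinates and charges for two receptor atoms and two ligand atoms.
receptor = [("A", 10, "CA", 0.1, (0.0, 0.0, 0.0)),
            ("A", 10, "NZ", 1.0, (1.0, 0.0, 0.0))]
ligand = [("B", 5, "OD1", -0.5, (4.0, 0.0, 0.0)),
          ("B", 6, "CA", 0.1, (0.0, 5.0, 0.0))]
total, per_residue, side_chain = decompose(receptor, ligand)
```

In this toy picture, residues with the most negative per-residue values would be the hot-spot candidates, and the side-chain share of a residue is the part that a mutation to alanine would mostly remove.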

InterPep2: global peptide–protein docking using interaction surface templates

Bioinformatics ◽

10.1093/bioinformatics/btaa005 ◽

2020 ◽

Vol 36 (8) ◽

pp. 2458-2465 ◽

Cited By ~ 2

Author(s):

Isak Johansson-Åkhe ◽

Claudio Mirabello ◽

Björn Wallner

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Structural Features ◽

Protein Docking ◽

Supplementary Information ◽

Peptide Ligand ◽

Protein Protein Interactions ◽

Intrinsically Disordered ◽

Intrinsically Disordered Regions ◽

Improved Performance

Abstract Motivation Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction methods exist, most are limited in performance or availability. Results InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ-score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and template sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among the top 10, similar to another state-of-the-art template-based method, which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on par with leading methods. The extended InterPep2-Refined protocol managed to correctly model 15 of these complexes within 4.0 Å LRMSD among the top 10, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation The program is available from: http://wallnerlab.org/InterPep2. Supplementary information Supplementary data are available at Bioinformatics online.
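
The "within 4.0 Å LRMSD among the top 10" criterion and the "22% more (22 versus 18)" figure can be made concrete with a short sketch. The function names and the per-target LRMSD values are hypothetical, and treating "within" as less than or equal to the cutoff is an assumption.

```python
def count_top_n_hits(lrmsd_by_target, n=10, cutoff=4.0):
    """Count targets whose first n ranked models include one at or below cutoff (in angstrom)."""
    return sum(
        1
        for ranked in lrmsd_by_target.values()
        if any(value <= cutoff for value in ranked[:n])
    )


def relative_gain(new, old):
    """Fractional increase of new over old."""
    return (new - old) / old


# Made-up LRMSD values per target, ordered by model rank.
ranked_lrmsd = {
    "target_a": [7.9, 3.2, 12.0],
    "target_b": [5.1, 6.4, 9.8],
    "target_c": [1.4],
}
hits = count_top_n_hits(ranked_lrmsd)
gain = relative_gain(22, 18)  # the "22 versus 18" comparison from the abstract
```

With the counts quoted in the abstract, `relative_gain(22, 18)` is 4/18, which rounds to the reported 22%.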

UEP: an open-source and fast classifier for predicting the impact of mutations in protein–protein complexes

Bioinformatics ◽

10.1093/bioinformatics/btaa708 ◽

2020 ◽

Author(s):

Pep Amengual-Rigo ◽

Juan Fernández-Recio ◽

Victor Guallar

Keyword(s):

Open Source ◽

Binding Affinity ◽

Protein Interactions ◽

Protein Complexes ◽

Selection Procedure ◽

Supplementary Information ◽

Computational Time ◽

Contact Potential ◽

Three Body ◽

The Impact

Abstract Motivation Single protein residue mutations may reshape the binding affinity of protein–protein interactions. Therefore, predicting their effects is of great interest in biotechnology and biomedicine. Unfortunately, the availability of experimental data on binding affinity changes upon mutation is limited, which hampers the development of new and more precise algorithms. Here, we propose UEP, a classifier for predicting beneficial and detrimental mutations in protein–protein complexes trained on interactome data. Results Despite the simplicity of the UEP algorithm, which is based on a three-body contact potential derived from interactome data, we report results competitive with the gold-standard methods in this field, with the advantage of being faster in terms of computational time. Moreover, we propose a consensus selection procedure that combines three predictors and showed higher classification accuracy in our benchmark: UEP, pyDock and EvoEF1/FoldX. Overall, we demonstrate that the analysis of interactome data allows predicting the impact of protein–protein mutations using UEP, a fast and reliable open-source code. Availability and implementation UEP algorithm can be found at: https://github.com/pepamengual/UEP. Supplementary information Supplementary data are available at Bioinformatics online.
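
The abstract does not spell out how the three predictors are combined. One simple possibility, shown here purely as an assumption, is to keep a mutation only when all predictors agree on its class. The function name and the label strings are hypothetical.

```python
def consensus_call(predictions):
    """Return the shared label if every predictor agrees, otherwise None.

    predictions maps a predictor name to 'beneficial' or 'detrimental'.
    Unanimous agreement is an assumed rule, not the published procedure.
    """
    labels = set(predictions.values())
    return next(iter(labels)) if len(labels) == 1 else None


agreed = consensus_call(
    {"UEP": "detrimental", "pyDock": "detrimental", "EvoEF1/FoldX": "detrimental"}
)
split = consensus_call(
    {"UEP": "beneficial", "pyDock": "detrimental", "EvoEF1/FoldX": "beneficial"}
)
```

A unanimous rule trades coverage for precision: fewer mutations receive a call, but each call is backed by every predictor.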

iScore: a novel graph kernel-based function for scoring protein–protein docking models

Bioinformatics ◽

10.1093/bioinformatics/btz496 ◽

2019 ◽

Vol 36 (1) ◽

pp. 112-121 ◽

Cited By ~ 9

Author(s):

Cunliang Geng ◽

Yong Jung ◽

Nicolas Renaud ◽

Vasant Honavar ◽

Alexandre M J J Bonvin ◽

...

Keyword(s):

Protein Complexes ◽

Three Dimensional ◽

Protein Docking ◽

Graph Representation ◽

Supplementary Information ◽

Scoring Functions ◽

Computational Docking ◽

3D Structures ◽

Novel Approach ◽

Protein Interfaces

Abstract Motivation Protein complexes play critical roles in many aspects of biological functions. Three-dimensional (3D) structures of protein complexes are critical for gaining insights into structural bases of interactions and their roles in the biomolecular pathways that orchestrate key cellular processes. Because of the expense and effort associated with experimental determinations of 3D protein complex structures, computational docking has evolved as a valuable tool to predict 3D structures of biomolecular complexes. Despite recent progress, reliably distinguishing near-native docking conformations from a large number of candidate conformations, the so-called scoring problem, remains a major challenge. Results Here we present iScore, a novel approach to scoring docked conformations that combines HADDOCK energy terms with a score obtained using a graph representation of the protein–protein interfaces and a measure of evolutionary conservation. It achieves a scoring performance competitive with, or superior to, that of state-of-the-art scoring functions on two independent datasets: (i) docking software-specific models and (ii) the CAPRI score set generated by a wide variety of docking approaches (i.e. docking software-non-specific). iScore ranks among the top scoring approaches on the CAPRI score set (13 targets) when compared with the 37 scoring groups in CAPRI. The results demonstrate the utility of combining evolutionary, topological and energetic information for scoring docked conformations. This work represents the first successful application of graph kernels to protein interfaces for effective discrimination of near-native and non-native conformations of protein complexes. Availability and implementation The iScore code is freely available from Github: https://github.com/DeepRank/iScore (DOI: 10.5281/zenodo.2630567). The docking models used are available from SBGrid: https://data.sbgrid.org/dataset/684. Supplementary information Supplementary data are available at Bioinformatics online.

Integrating ab initio and template-based algorithms for protein–protein complex structure prediction

Bioinformatics ◽

10.1093/bioinformatics/btz623 ◽

2019 ◽

Vol 36 (3) ◽

pp. 751-757 ◽

Cited By ~ 1

Author(s):

Sweta Vangaveti ◽

Thom Vreven ◽

Yang Zhang ◽

Zhiping Weng

Keyword(s):

Protein Complex ◽

Structure Prediction ◽

Protein Complexes ◽

Complex Structure ◽

Protein Docking ◽

Supplementary Information ◽

Test Case ◽

Binding Modes ◽

Success Rates ◽

Template Free

Abstract Motivation Template-based and template-free methods have both been widely used in predicting the structures of protein–protein complexes. Template-based modeling is effective when a reliable template is available, while template-free methods are required for predicting the binding modes or interfaces that have not been previously observed. Our goal is to combine the two methods to improve computational protein–protein complex structure prediction. Results Here, we present a method to identify and combine high-confidence predictions of a template-based method (SPRING) with a template-free method (ZDOCK). Cross-validated using the protein–protein docking benchmark version 5.0, our method (ZING) achieved a success rate of 68.2%, outperforming SPRING and ZDOCK, with success rates of 52.1% and 35.9% respectively, when the top 10 predictions were considered per test case. In conclusion, a statistics-based method that evaluates and integrates predictions from template-based and template-free methods is more successful than either method independently. Availability and implementation ZING is available for download as a Github repository (https://github.com/weng-lab/ZING.git). Supplementary information Supplementary data are available at Bioinformatics online.

Atomic-level evolutionary information improves protein-protein interface scoring

Bioinformatics ◽

10.1093/bioinformatics/btab254 ◽

2021 ◽

Author(s):

Chloé Quignot ◽

Pierre Granger ◽

Pablo Chacón ◽

Raphael Guerois ◽

Jessica Andreani

Keyword(s):

Success Rate ◽

Protein Interactions ◽

Protein Docking ◽

Atomic Level ◽

Supplementary Information ◽

Evolutionary Information ◽

General Strategy ◽

Scoring Functions ◽

Success Rates ◽

Novel Strategy

Abstract Motivation The crucial role of protein interactions and the difficulty in characterising them experimentally strongly motivate the development of computational approaches for structural prediction. Even when protein-protein docking samples correct models, current scoring functions struggle to discriminate them from incorrect decoys. The previous incorporation of conservation and coevolution information has shown promise for improving protein-protein scoring. Here, we present a novel strategy to integrate atomic-level evolutionary information into different types of scoring functions to improve their docking discrimination. Results We applied this general strategy to our residue-level statistical potential from InterEvScore and to two atomic-level scores, SOAP-PP and Rosetta interface score (ISC). Including evolutionary information from as few as ten homologous sequences improves the top 10 success rates of the individual atomic-level scores SOAP-PP and Rosetta ISC by 6 and 13.5 percentage points, respectively, on a large benchmark of 752 docking cases. The best individual homology-enriched score reaches a top 10 success rate of 34.4%. A consensus approach based on the complementarity between different homology-enriched scores further increases the top 10 success rate to 40%. Availability All data used for benchmarking and scoring results, as well as a Singularity container of the pipeline, are available at http://biodev.cea.fr/interevol/interevdata/. Supplementary information Supplementary data are available at Bioinformatics online.

Text mining for modeling of protein complexes enhanced by machine learning

Bioinformatics ◽

10.1093/bioinformatics/btaa823 ◽

2020 ◽

Author(s):

Varsha D Badal ◽

Petras J Kundrotas ◽

Ilya A Vakser

Keyword(s):

Machine Learning ◽

Text Mining ◽

Protein Interactions ◽

Full Text ◽

Protein Complexes ◽

Protein Docking ◽

Supplementary Information ◽

Support Vector ◽

Learning Approaches ◽

Protein Protein Interactions

Abstract Motivation Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, the absence of post-processing of the spotted residues reduced the usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. Results We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to use the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. Availability The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. Supplementary information Supplementary data are available at Bioinformatics online.

ProAffiMuSeq: sequence-based method to predict the binding free energy change of protein–protein complexes upon mutation using functional classification

Bioinformatics ◽

10.1093/bioinformatics/btz829 ◽

2019 ◽

Cited By ~ 1

Author(s):

Sherlyn Jemimah ◽

Masakazu Sekijima ◽

M Michael Gromiha

Keyword(s):

Free Energy ◽

Protein Interactions ◽

Large Scale ◽

Structural Information ◽

Protein Complexes ◽

Binding Free Energy ◽

Complex Structure ◽

External Validation ◽

Functional Class ◽

Supplementary Information

Abstract Motivation Protein–protein interactions are essential for the cell and mediate various functions. However, mutations can disrupt these interactions and may cause diseases. Currently available computational methods require a complex structure as input for predicting the change in binding affinity. Further, they have not included the functional class information for the protein–protein complex. To address this, we have developed a method, ProAffiMuSeq, which predicts the change in binding free energy using sequence-based features and functional class. Results Our method shows an average correlation between predicted and experimentally determined ΔΔG of 0.73 and mean absolute error (MAE) of 0.86 kcal/mol in 10-fold cross-validation and correlation of 0.75 with MAE of 0.94 kcal/mol in the test dataset. ProAffiMuSeq was also tested on an external validation set and showed results comparable to structure-based methods. Our method can be used for large-scale analysis of disease-causing mutations in protein–protein complexes without structural information. Availability and implementation Users can access the method at https://web.iitm.ac.in/bioinfo2/proaffimuseq/. Supplementary information Supplementary data are available at Bioinformatics online.
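
The two metrics quoted above, the correlation between predicted and experimental ΔΔG and the mean absolute error, can be computed as in the sketch below. It assumes the correlation is Pearson's r, which the abstract does not state, and the ΔΔG values are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def mean_absolute_error(xs, ys):
    """Average absolute difference, in the units of the inputs (here kcal/mol)."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up ΔΔG values in kcal/mol, not data from the study.
experimental = [0.5, 1.2, -0.3, 2.0]
predicted = [0.7, 0.9, 0.1, 1.6]
r = pearson_r(experimental, predicted)
mae = mean_absolute_error(experimental, predicted)
```

The two numbers answer different questions: a high correlation says the ranking of mutations is captured, while the MAE says how far individual predictions are from the measured values.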

SCRIBER: accurate and partner type-specific prediction of protein-binding residues from proteins sequences

Bioinformatics ◽

10.1093/bioinformatics/btz324 ◽

2019 ◽

Vol 35 (14) ◽

pp. i343-i353 ◽

Cited By ~ 10

Author(s):

Jian Zhang ◽

Lukasz Kurgan

Keyword(s):

Protein Binding ◽

Protein Interactions ◽

Rna Binding ◽

Protein Complexes ◽

Predictive Performance ◽

Protein Docking ◽

Supplementary Information ◽

Binding Residue ◽

Binding Residues ◽

The Cross

Abstract Motivation Accurate prediction of protein-binding residues (PBRs) enhances understanding of molecular-level rules governing protein–protein interactions, helps protein–protein docking and facilitates annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of protein partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: a comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize an innovative two-layer architecture where the first layer generates a prediction of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69% and our conservative estimates show that it is at least 3 times faster. We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/. Supplementary information Supplementary data are available at Bioinformatics online.

A knowledge-based scoring function to assess quaternary associations of proteins

Bioinformatics ◽

10.1093/bioinformatics/btaa207 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3739-3748

Author(s):

Abhilesh S Dhawanjewar ◽

Ankit A Roy ◽

Mallur S Madhusudhan

Keyword(s):

Protein Interactions ◽

Statistical Physics ◽

Binary Classification ◽

Scoring Function ◽

Protein Docking ◽

Supplementary Information ◽

Scoring Functions ◽

Biological Interactions ◽

Protein Protein Interactions ◽

Knowledge Based

Abstract Motivation The elucidation of all inter-protein interactions would significantly enhance our knowledge of cellular processes at a molecular level. Given the enormity of the problem and the expense and limitations of experimental methods, it is imperative that this problem is tackled computationally. In silico predictions of protein interactions entail sampling different conformations of the purported complex and then scoring these to assess interaction viability. In this study, we have devised a new scheme for scoring protein–protein interactions. Results Our method, PIZSA (Protein Interaction Z-Score Assessment), is a binary classification scheme for identification of native protein quaternary assemblies (binders/nonbinders) based on statistical potentials. The scoring scheme incorporates residue–residue contact preference on the interface with per residue-pair atomic contributions and accounts for clashes. PIZSA can accurately discriminate between native and non-native structural conformations from protein docking experiments and outperforms other contact-based potential scoring functions. The method has been extensively benchmarked and is among the top 6 methods, outperforming 31 other statistical, physics-based and machine learning scoring schemes. The PIZSA potentials can also distinguish crystallization artifacts from biological interactions. Availability and implementation PIZSA is implemented as a web server at http://cospi.iiserpune.ac.in/pizsa and can be downloaded as a standalone package from http://cospi.iiserpune.ac.in/pizsa/Download/Download.html. Supplementary information Supplementary data are available at Bioinformatics online.
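
The general idea of a Z-score based binary call can be sketched as follows. This is a generic illustration, not the PIZSA procedure: the background distribution, the convention that higher scores are better and the threshold value are all assumptions.

```python
from statistics import mean, pstdev


def z_score(score, background):
    """Number of standard deviations by which score departs from the background mean."""
    return (score - mean(background)) / pstdev(background)


def is_binder(score, background, threshold=1.0):
    """Binary call: binder if the Z-score reaches a (hypothetical) threshold."""
    return z_score(score, background) >= threshold


# Made-up interface scores for a background set of non-native arrangements.
background_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
z = z_score(5.0, background_scores)
call_high = is_binder(5.0, background_scores)
call_mid = is_binder(3.0, background_scores)
```

Expressing a raw score relative to a background makes values comparable across interfaces of different sizes, which is one reason Z-scores are a common choice for this kind of classification.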

Topology independent structural matching discovers novel templates for protein interfaces

10.1101/235812 ◽

2017 ◽

Author(s):

Claudio Mirabello ◽

Björn Wallner

Keyword(s):

Protein Interactions ◽

Protein Function ◽

Protein Complexes ◽

Structural Alignment ◽

Complex Structure ◽

Potential Interaction ◽

Supplementary Information ◽

Alignment Algorithms ◽

Protein Interfaces ◽

Wide Range

Abstract Motivation Protein-protein interactions (PPI) are essential for the function of the cellular machinery. The rapid growth of protein-protein complexes with known 3D structures offers a unique opportunity to study PPI to gain crucial insights into protein function and the causes of many diseases. In particular, it would be extremely useful to compare interaction surfaces of monomers, as this would enable the pinpointing of potential interaction surfaces based solely on the monomer structure, without the need to predict the complete complex structure. While there are many structural alignment algorithms for individual proteins, very few have been developed for protein interfaces, and none can align only the interface residues to other interfaces or surfaces of interacting monomer subunits in a topology independent (non-sequential) manner. Results We present InterComp, a method for topology and sequence-order independent structural comparisons. The method is general and can be applied to various structural comparison applications. By representing residues as independent points in space rather than as a sequence of residues, InterComp can be applied to a wide range of problems including: interface-surface comparisons, interface-interface comparisons and even comparisons of small molecule ligands. We demonstrate a use-case by applying InterComp to find similar protein interfaces on the surface of proteins. We show that InterComp pinpoints the correct interface for almost half of the targets (283 of 586) when considering the top 10 hits, and for 24% of the targets when considering only the top hit, even when no templates can be found with the already available sequence-order dependent methods like TM-align. Availability The program is available from: http://wallnerlab.org/[email protected] Supplementary information Supplementary data included in the pdf.
