InterPepRank: Assessment of Docked Peptide Conformations by a Deep Graph Network

2020 ◽  
Author(s):  
Isak Johansson-Åkhe ◽  
Claudio Mirabello ◽  
Björn Wallner

Abstract Motivation Peptide-protein interactions between a smaller or disordered peptide stretch and a folded receptor make up a large part of all protein-protein interactions. A common approach for modelling such interactions is to exhaustively sample the conformational space by fast Fourier transform docking, and then refine a top percentage of decoys. Commonly, methods capable of ranking the decoys for selection in short enough time for larger scale studies rely on first-principle energy terms such as electrostatics, van der Waals forces, or on pre-calculated statistical pairwise potentials. Results We present InterPepRank for peptide-protein complex scoring and ranking. InterPepRank is a machine-learning based method which encodes the structure of the complex as a graph, with physical pairwise interactions as edges and evolutionary and sequence features as nodes. The graph-network is trained to predict the LRMSD of decoys by using edge-conditioned graph convolutions on a large set of peptide-protein complex decoys. InterPepRank is tested on a massive independent test set with no targets sharing CATH annotation nor 30% sequence identity with any target in training or validation data. On this set, InterPepRank has a median AUC of 0.86 for finding coarse peptide-protein complexes with LRMSD<4Å. This is an improvement compared to other state-of-the-art ranking methods that have a median AUC of circa 0.69. When used to select decoys for refinement in a previously established peptide docking pipeline, InterPepRank improves the number of Medium and High quality models produced by 80% and 40%, respectively. Availability The program is available from: http://wallnerlab.org/InterPepRank Contact Björn Wallner [email protected] Supplementary information Supplementary data are available at BioRxiv online.
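The AUC figure quoted in this abstract measures how well a ranking separates acceptable decoys (LRMSD below 4 Å) from the rest. The following is a minimal sketch of a rank-based ROC AUC for one target, not the authors' evaluation code. The function name `roc_auc`, the toy scores, and the convention that a higher score means a better decoy are all assumptions made for illustration; a method that predicts LRMSD directly would need its predictions negated first.

```python
def roc_auc(scores, lrmsd, cutoff=4.0):
    """Rank-based ROC AUC for a single docking target.

    Returns the probability that a randomly chosen acceptable decoy
    (LRMSD < cutoff) receives a higher score than a randomly chosen
    unacceptable one; ties count as half a win.
    """
    pos = [s for s, d in zip(scores, lrmsd) if d < cutoff]
    neg = [s for s, d in zip(scores, lrmsd) if d >= cutoff]
    if not pos or not neg:
        raise ValueError("need at least one decoy on each side of the cutoff")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration: five decoys of one hypothetical target.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]   # assumed: higher score = better decoy
lrmsd = [1.5, 6.0, 3.2, 9.1, 12.0]   # ligand RMSD in angstroms
auc = roc_auc(scores, lrmsd)
print(f"AUC = {auc:.3f}")
```

A median AUC like the one reported would then be taken over the per-target values, for example with `statistics.median`.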

2020 ◽  
Vol 36 (8) ◽  
pp. 2458-2465 ◽  
Author(s):  
Isak Johansson-Åkhe ◽  
Claudio Mirabello ◽  
Björn Wallner

Abstract Motivation Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction-methods exist, most are limited in performance or availability. Results InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ-score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and templates sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among top10, similar to another state-of-the-art template-based method which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on-par with leading methods. The extended InterPep2-Refined protocol managed to correctly model 15 of these complexes within 4.0 Å LRMSD among top10, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation The program is available from: http://wallnerlab.org/InterPep2. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Author(s):  
Lindsey R. Pack ◽  
Leighton H. Daigh ◽  
Mingyu Chung ◽  
Tobias Meyer

Abstract Understanding the stability or binding affinity of protein complex members is important for understanding their regulation and roles in cells. While there are many biochemical methods to measure protein-protein interactions in vitro, these methods often rely on the ability to robustly purify components individually. Moreover, few methods have been developed to study protein complexes within live cells. Binding parameters for cyclin-dependent kinase (CDK) complexes have been challenging to measure due to difficulty expressing and purifying CDKs separately from activating cyclins. Here, we develop a method to measure off-rates of protein complex components in live cells. Our method relies on the stable tethering of CDK to the inner nuclear membrane (Figure 1), and the use of FRAP to measure the off-rate of soluble, fluorescently-tagged CDK binding proteins. We use this method to study dimeric CDK complexes, measuring the off-rates of cyclins or INK4 CDK inhibitor p16 from CDKs, and trimeric CDK complexes, measuring the off-rate of cyclins and CIP/KIP CDK inhibitors p21 and p27 when bound together.


2020 ◽  
Vol 36 (14) ◽  
pp. 4208-4210
Author(s):  
Damiano Cianferoni ◽  
Leandro G Radusky ◽  
Sarah A Head ◽  
Luis Serrano ◽  
Javier Delgado

Abstract Summary Accurate 3D modelling of protein–protein interactions (PPI) is essential to compensate for the absence of experimentally determined complex structures. Here, we present a new set of commands within the ModelX toolsuite capable of generating atomic-level protein complexes suitable for interface design. Among these commands, the new tool ProteinFishing proposes known and/or putative alternative 3D PPI for a given protein complex. The algorithm exploits backbone compatibility of protein fragments to generate mutually exclusive protein interfaces that are quickly evaluated with a knowledge-based statistical force field. Using interleukin-10-R2 co-crystallized with interferon-lambda-3, and a database of X-ray structures containing interleukin-10, this algorithm was able to generate interleukin-10-R2/interleukin-10 structural models in agreement with experimental data. Availability and implementation ProteinFishing is a portable command-line tool included in the ModelX toolsuite, written in C++, that makes use of an SQL (tested for MySQL and MariaDB) relational database delivered with a template SQL dump called FishXDB. FishXDB contains the empty tables of ModelX fragments and the data used by the embedded statistical force field. ProteinFishing is compiled for Linux-64bit, MacOS-64bit and Windows-32bit operating systems. The software is distributed under a proprietary license as an executable with its corresponding database dumps. It can be downloaded publicly at http://modelx.crg.es/. Licenses are freely available for academic users after registration on the website and are available under commercial license for for-profit organizations or companies. Contact [email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Varsha D Badal ◽  
Petras J Kundrotas ◽  
Ilya A Vakser

Abstract Motivation Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, absence of post-processing of the spotted residues reduced usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. Results We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to utilize the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. Availability The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. Supplementary information Supplementary data are available at Bioinformatics online.


2014 ◽  
Vol 169 ◽  
pp. 425-441 ◽  
Author(s):  
Guillaume Levieux ◽  
Guillaume Tiger ◽  
Stéphanie Mader ◽  
Jean-François Zagury ◽  
Stéphane Natkin ◽  
...  

Protein–protein interactions play a crucial role in biological processes. The goal of protein docking calculations is to predict, given two proteins of known structure, the conformation of the corresponding complex. Here, we present a new interactive protein docking system, Udock, that draws on the combined cognitive capabilities of its users. In Udock, the users tackle simplified representations of protein structures and explore the conformational space of protein–protein interfaces using a gamified interactive docking system with on-the-fly scoring. We assumed that, if given appropriate tools, a naïve user's cognitive capabilities could provide relevant data for (1) the prediction of correct interfaces in binary protein complexes and (2) the identification of the experimental interaction partner among a set of decoys. To explore this approach experimentally, we conducted a preliminary two-week playtest where the registered users could perform a cross-docking on a dataset comprising 4 binary protein complexes. The users explored almost all the surface of the proteins that were available in the dataset but favored certain regions that seemed more attractive as potential docking spots. These favored regions were located inside or near the experimental binding interface for 5 out of the 8 proteins in the dataset. For most of them, the best scores were obtained with the experimental partner. The alpha version of Udock is freely accessible at http://udock.fr.


2008 ◽  
Vol 06 (03) ◽  
pp. 435-466 ◽  
Author(s):  
HON NIAN CHUA ◽  
KANG NING ◽  
WING-KIN SUNG ◽  
HON WAI LEONG ◽  
LIMSOON WONG

Protein complexes are fundamental for understanding principles of cellular organization. As the sizes of protein–protein interaction (PPI) networks are increasing, accurate and fast protein complex prediction from these PPI networks can serve as a guide for biological experiments to discover novel protein complexes. However, it is not easy to predict protein complexes from PPI networks, especially in situations where the PPI network is noisy and still incomplete. Here, we study the use of indirect interactions between level-2 neighbors (level-2 interactions) for protein complex prediction. We know from previous work that proteins which do not interact but share interaction partners (level-2 neighbors) often share biological functions. We have proposed a method in which all direct and indirect interactions are first weighted using a topological weight (FS-Weight), which estimates the strength of functional association. Interactions with low weight are removed from the network, while level-2 interactions with high weight are introduced into the interaction network. Existing clustering algorithms can then be applied to this modified network. We have also proposed a novel algorithm that searches for cliques in the modified network and merges cliques to form clusters using a "partial clique merging" method. Experiments show that (1) the use of indirect interactions and topological weight to augment protein–protein interactions can be used to improve the precision of clusters predicted by various existing clustering algorithms; and (2) our complex-finding algorithm performs very well on interaction networks modified in this way. Since no other information except the original PPI network is used, our approach would be very useful for protein complex prediction, especially for prediction of novel protein complexes.
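The network modification described in this abstract (weight direct and level-2 interactions, drop low-weight edges, add high-weight level-2 edges) can be sketched as follows. This is an illustrative sketch only: the weight used here is a plain Jaccard overlap of interaction partners, a stand-in chosen for brevity and not the FS-Weight formula, and the function names, the threshold of 0.5, and the toy network are all hypothetical.

```python
from itertools import combinations


def neighbours(edges):
    """Build an adjacency map {protein: set of direct partners}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shared_partner_weight(adj, u, v):
    """Jaccard overlap of the partner sets of u and v (each including itself).

    Assumption: a simple stand-in for a topological weight, NOT FS-Weight.
    """
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / len(nu | nv)


def augment(edges, threshold=0.5):
    """Keep direct and level-2 pairs whose weight reaches the threshold."""
    adj = neighbours(edges)
    direct = {frozenset(e) for e in edges}
    kept = set()
    for u, v in combinations(sorted(adj), 2):
        pair = frozenset((u, v))
        # Level-2 neighbours: no direct edge, but at least one shared partner.
        is_level2 = pair not in direct and bool(adj[u] & adj[v])
        if (pair in direct or is_level2) and \
                shared_partner_weight(adj, u, v) >= threshold:
            kept.add((u, v))
    return kept


# Toy network invented for illustration.
toy_edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("D", "E")]
modified = augment(toy_edges, threshold=0.5)
print(sorted(modified))
```

In this toy network, A and B never interact directly but share both of their partners, so the pair is introduced as a level-2 edge, while the weakly supported direct edges A-D and B-D fall below the threshold and are removed. A clustering or clique-finding step would then run on the modified edge set.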


2019 ◽  
Vol 26 (21) ◽  
pp. 3890-3910 ◽  
Author(s):  
Branislava Gemovic ◽  
Neven Sumonja ◽  
Radoslav Davidovic ◽  
Vladimir Perovic ◽  
Nevena Veljkovic

Background: The significant number of protein-protein interactions (PPIs) discovered by harnessing concomitant advances in the fields of sequencing, crystallography, spectrometry and two-hybrid screening suggests astonishing prospects for remodelling drug discovery. The PPI space, which includes up to 650 000 entities, is a remarkable reservoir of potential therapeutic targets for every human disease. In order to allow modern drug discovery programs to leverage this, we should be able to discern complete PPI maps associated with a specific disorder and corresponding normal physiology. Objective: Here, we will review community available computational programs for predicting PPIs and web-based resources for storing experimentally annotated interactions. Methods: We compared the capacities of prediction tools: iLoops, Struct2Net, HOMCOS, COTH, PrePPI, InterPreTS and PRISM to predict recently discovered protein interactions. Results: We described sequence-based and structure-based PPI prediction tools and addressed their peculiarities. Additionally, since the usefulness of prediction algorithms critically depends on the quality and quantity of the experimental data they are built on, we extensively discussed community resources for protein interactions. We focused on the active and recently updated primary and secondary PPI databases, repositories specialized to the subject or species, as well as databases that include both experimental and predicted PPIs. Conclusion: PPI complexes are the basis of important physiological processes and therefore, possible targets for cell-penetrating ligands. Reliable computational PPI predictions can speed up new target discoveries through prioritization of therapeutically relevant protein–protein complexes for experimental studies.


2020 ◽  
Vol 27 (37) ◽  
pp. 6306-6355 ◽  
Author(s):  
Marian Vincenzi ◽  
Flavia Anna Mercurio ◽  
Marilisa Leone

Background: Many pathways regarding healthy cells and/or linked to disease onset and progression depend on large assemblies including multi-protein complexes. Protein-protein interactions may occur through a vast array of modules known as protein interaction domains (PIDs). Objective: This review concerns PIDs recognizing post-translationally modified peptide sequences and intends to provide the scientific community with state-of-the-art knowledge on their 3D structures, binding topologies and potential applications in the drug discovery field. Method: Several databases, such as the Pfam (Protein family), the SMART (Simple Modular Architecture Research Tool) and the PDB (Protein Data Bank), were searched to look for different domain families and gain structural information on protein complexes in which particular PIDs are involved. Recent literature on PIDs and related drug discovery campaigns was retrieved through PubMed and analyzed. Results and Conclusion: PIDs are rather versatile in their binding preferences. Many of them specifically recognize only particular amino acid stretches with post-translational modifications; a few others are able to interact with several post-translationally modified sequences or with unmodified ones. Many PIDs can be linked to different diseases including cancer. The tremendous amount of available structural data led to the structure-based design of several molecules targeting protein-protein interactions mediated by PIDs, including peptides, peptidomimetics and small compounds. More studies are needed to fully role out, among different families, PIDs that can be considered reliable therapeutic targets; however, attacking PIDs rather than catalytic domains of a particular protein may represent a route to obtain selective inhibitors.


Author(s):  
Rohan Dandage ◽  
Caroline M Berger ◽  
Isabelle Gagnon-Arsenault ◽  
Kyung-Mee Moon ◽  
Richard Greg Stacey ◽  
...  

Abstract Hybrids between species often show extreme phenotypes, including some that take place at the molecular level. In this study, we investigated the phenotypes of an interspecies diploid hybrid in terms of protein-protein interactions inferred from protein correlation profiling. We used two yeast species, Saccharomyces cerevisiae and Saccharomyces uvarum, which are interfertile yet have proteins diverged enough to be differentiated using mass spectrometry. Most of the protein-protein interactions are similar between hybrid and parents, and are consistent with the assembly of chimeric complexes, which we validated using an orthogonal approach for the prefoldin complex. We also identified instances of altered protein-protein interactions in the hybrid, for instance in complexes related to proteostasis and in mitochondrial protein complexes. Overall, this study uncovers the likely frequent occurrence of chimeric protein complexes with few exceptions, which may result from incompatibilities or imbalances between the parental proteins.

