Sequence-Based Prediction of Protein-Protein Binding Residues in Alpha-Helical Membrane Proteins

Author(s):  
Feng Xiao ◽  
Hong-Bin Shen
2017 ◽  
Vol 20 (4) ◽  
pp. 1250-1268 ◽  
Author(s):  
Jian Zhang ◽  
Zhiqiang Ma ◽  
Lukasz Kurgan

Abstract Proteins interact with a variety of molecules including proteins and nucleic acids. We review a comprehensive collection of over 50 studies that analyze and/or predict these interactions. While the majority of these studies address either solely protein–DNA or protein–RNA binding, only a few have a wider scope that covers both protein–protein and protein–nucleic acid binding. Our analysis reveals that binding residues are typically characterized by three hallmarks: relative solvent accessibility (RSA), evolutionary conservation and propensity of amino acids (AAs) for binding. Motivated by drawbacks of the prior studies, we perform a large-scale analysis to quantify and contrast the three hallmarks for DNA-, RNA-, protein- and (for the first time) multi-ligand-binding residues that interact with DNA and proteins, and with RNA and proteins. Results generated on a well-annotated data set of over 23 000 proteins show that conservation of binding residues is higher for nucleic acid- than protein-binding residues. Multi-ligand-binding residues are more conserved and have higher RSA than single-ligand-binding residues. We empirically show that each hallmark, even predicted RSA, discriminates between binding and nonbinding residues, and that combining them improves discriminatory power for each of the five types of interactions. Linear scoring functions that combine these hallmarks offer good predictive performance of residue-level propensity for binding and provide intuitive interpretation of predictions. Better understanding of these residue-level interactions will facilitate development of methods that accurately predict binding in the exponentially growing databases of protein sequences.
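
The abstract above mentions linear scoring functions that combine the three hallmarks (RSA, conservation, AA propensity). A minimal sketch of what such a function could look like follows. The weights, the bias, the `Residue` type and the propensity values in `PROPENSITY` are illustrative assumptions, not values or names taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    aa: str              # one-letter amino acid code
    rsa: float           # relative solvent accessibility, scaled to [0, 1]
    conservation: float  # evolutionary conservation, scaled to [0, 1]


# Hypothetical per-amino-acid binding propensities (illustrative values only).
PROPENSITY = {"R": 0.8, "K": 0.7, "W": 0.6, "L": 0.4, "G": 0.35, "A": 0.3}
DEFAULT_PROPENSITY = 0.5  # assumed fallback for amino acids not listed above


def binding_score(res, w_rsa=1.0, w_cons=1.0, w_prop=1.0, bias=0.0):
    """Weighted linear combination of the three hallmarks for one residue."""
    prop = PROPENSITY.get(res.aa, DEFAULT_PROPENSITY)
    return bias + w_rsa * res.rsa + w_cons * res.conservation + w_prop * prop


def rank_residues(residues, **weights):
    """Return (index, score) pairs sorted from highest to lowest score."""
    scored = [(i, binding_score(r, **weights)) for i, r in enumerate(residues)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    chain = [Residue("A", 0.1, 0.2), Residue("R", 0.9, 0.9)]
    for index, score in rank_residues(chain):
        print(index, round(score, 3))
```

Because the score is a plain weighted sum, each weight can be read directly as the contribution of one hallmark, which is one way to understand the "intuitive interpretation" the abstract refers to. In practice the weights would be fitted on annotated data rather than set by hand.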


Author(s):  
Alexandra M. Young ◽  
Amanda L. Gunn ◽  
Emily M. Hatch

Abstract Nuclear membrane rupture during interphase occurs in a variety of cell contexts, both healthy and pathological. Membrane ruptures can be rapidly repaired, but these mechanisms are still unclear. Here we show BAF, a nuclear envelope protein that shapes chromatin and recruits membrane proteins in mitosis, also facilitates nuclear membrane repair in interphase, in part through recruitment of the nuclear membrane proteins emerin and LEMD2 to rupture sites. Characterization of GFP-BAF accumulation at nuclear membrane rupture sites confirmed BAF is a fast, accurate, and persistent mark of nucleus rupture whose kinetics are partially dictated by membrane resealing. BAF depletion significantly delayed nuclear membrane repair, with a larger effect on longer ruptures. This phenotype could be rescued by GFP-BAF, but not by a BAF mutant lacking the LEM-protein binding domain. Depletion of the BAF interactors LEMD2 or emerin, and to a lesser extent lamin A/C, increased the duration of nucleus ruptures, consistent with LEM-protein binding being a key function of BAF during membrane repair. Overall our results suggest a model where BAF is critical for timely repair of large ruptures in the nuclear membrane, potentially by facilitating membrane attachment to the rupture site.


2021 ◽  
Vol 1 ◽  
Author(s):  
Guillaume Brysbaert ◽  
Marc F. Lensink

Residue interaction networks (RINs) describe a protein structure as a network of interacting residues. Central nodes in these networks, identified by centrality analyses, highlight those residues that play a role in the structure and function of the protein. However, little is known about the capability of such analyses to identify residues involved in the formation of macromolecular complexes. Here, we applied six different centrality measures to the RINs generated from the complexes of the SKEMPI 2 database of changes in protein–protein binding upon mutation in order to evaluate the capability of each of these measures to identify major binding residues. The analyses were performed with and without the crystallographic water molecules, in addition to the protein residues. We also investigated the use of a weight factor based on the inter-residue distances to improve the detection of these residues. We show that for the identification of major binding residues, closeness, degree, and PageRank result in good precision, whereas betweenness, eigenvector, and residue centrality analyses give a higher sensitivity. Including water in the analysis improves the sensitivity of all measures without losing precision. Applying weights only slightly raises the sensitivity of eigenvector centrality analysis. We finally show that a combination of multiple centrality analyses is the optimal approach to identify residues that play a role in protein–protein interaction.
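
To make the idea of centrality on a RIN concrete, the sketch below computes two of the measures named above, degree and closeness, on a toy unweighted network using only the Python standard library. The residue labels and contact list are invented for illustration, and the functions are generic textbook definitions, not the tooling used in the study.

```python
from collections import deque


def build_rin(contacts):
    """Build an undirected adjacency map from (residue, residue) contact pairs."""
    graph = {}
    for a, b in contacts:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Reachable-node count divided by the sum of shortest-path lengths (BFS)."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        reachable = len(dist) - 1
        result[source] = reachable / total if total > 0 else 0.0
    return result


if __name__ == "__main__":
    # Hypothetical interface: residue A10 contacts three residues of chain B.
    rin = build_rin([("A10", "B5"), ("A10", "B7"), ("A10", "B9")])
    print(degree_centrality(rin))
    print(closeness_centrality(rin))
```

In this toy network the hub residue A10 scores highest on both measures. Distance-based weighting and water nodes, both discussed in the abstract, are left out of the sketch.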


2019 ◽  
Vol 35 (14) ◽  
pp. i343-i353 ◽  
Author(s):  
Jian Zhang ◽  
Lukasz Kurgan

Abstract Motivation Accurate predictions of protein-binding residues (PBRs) enhance understanding of molecular-level rules governing protein–protein interactions, help protein–protein docking and facilitate annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of protein partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: a comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize an innovative two-layer architecture where the first layer generates a prediction of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69% and our conservative estimates show that it is at least 3 times faster.
We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation The SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Linus Mathias Scheibenreif ◽  
Maria Littmann ◽  
Christine Orengo ◽  
Burkhard Rost

Abstract Background The CATH database provides a hierarchical classification of protein domain structures including a sub-classification of superfamilies into functional families (FunFams). We analyzed the similarity of binding site annotations in these FunFams and incorporated FunFams into the prediction of protein binding residues. Results FunFam members agreed, on average, in 36.9±0.6% of their binding residue annotations. This constituted a 6.7-fold increase over randomly grouped proteins and a 1.2-fold increase (1.1-fold on the same dataset) over proteins with the same enzymatic function (identical Enzyme Commission, EC, number). Mapping de novo binding site prediction methods (BindPredict-CCS, BindPredict-CC) onto FunFam resulted in consensus predictions for those residues that were aligned and predicted alike (binding/non-binding) within a FunFam. This simple consensus increased the F1-score (for binding) 1.5-fold over the original prediction method. Variation of the threshold for how many proteins in the consensus prediction had to agree provided a convenient control of accuracy/precision and coverage/recall, e.g. reaching a precision as high as 60.8±0.4% for a stringent threshold. Conclusions The FunFams outperformed even the carefully curated EC numbers in terms of agreement of binding site residues. Additionally, we assume that our proof-of-principle through the prediction of protein binding residues will be relevant for many other solutions profiting from FunFams to infer functional information at the residue level.
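
The consensus step described above, where aligned residues are called binding only when enough family members agree, can be sketched as a per-column vote with an agreement threshold. This is a simplified illustration under stated assumptions: the input layout (one list per aligned member, with `None` for gaps), the `min_agreement` parameter and the function names are hypothetical and do not reproduce the authors' implementation.

```python
def consensus_binding(predictions, min_agreement=0.5):
    """Per-column vote over aligned members of one family.

    predictions: equal-length lists, one per aligned member; each entry is
    True (predicted binding), False (non-binding) or None (alignment gap).
    A column is called binding when the fraction of non-gap members that
    predict binding is at least min_agreement.
    """
    length = len(predictions[0])
    consensus = []
    for col in range(length):
        votes = [p[col] for p in predictions if p[col] is not None]
        if not votes:
            consensus.append(False)  # no aligned residue: no call
            continue
        consensus.append(sum(votes) / len(votes) >= min_agreement)
    return consensus


def precision_recall_f1(predicted, observed):
    """Precision, recall and F1 for the binding class."""
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    members = [
        [True, True, False, None],
        [True, False, False, True],
        [True, True, False, None],
    ]
    observed = [True, False, False, True]
    for threshold in (0.5, 1.0):
        calls = consensus_binding(members, min_agreement=threshold)
        print(threshold, calls, precision_recall_f1(calls, observed))
```

Raising `min_agreement` makes the consensus stricter, which mirrors the trade-off the abstract describes: a more stringent threshold tends to raise precision, typically at the cost of recall.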


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jie Sun ◽  
Xiaoran Roger Liu ◽  
Shuang Li ◽  
Peng He ◽  
Weikai Li ◽  
...  

Abstract Mass spectrometry-based footprinting can probe higher order structure of soluble proteins in their native states and serve as a complement to high-resolution approaches. Traditional footprinting approaches, however, are hampered for integral membrane proteins because their transmembrane regions are not accessible to solvent, and they contain hydrophobic residues that are generally unreactive with most chemical reagents. To address this limitation, we bond photocatalytic titanium dioxide (TiO2) nanoparticles to a lipid bilayer. Upon laser irradiation, the nanoparticles produce local concentrations of radicals that penetrate the lipid layer, which is made permeable by a simultaneous laser-initiated Paternò–Büchi reaction. This approach achieves footprinting for integral membrane proteins in liposomes, helps locate both ligand-binding residues in a transporter and ligand-induced conformational changes, and reveals structural aspects of proteins at the flexible unbound state. Overall, this approach proves effective in intramembrane footprinting and forges a connection between material science and biology.


2020 ◽  
Vol 36 (18) ◽  
pp. 4729-4738 ◽  
Author(s):  
Jian Zhang ◽  
Sina Ghadermarzi ◽  
Lukasz Kurgan

Abstract Motivation There are over 30 sequence-based predictors of the protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross-over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions). Results Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross-over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross-over structure- and disorder-annotated proteins and produces a relatively low amount of cross-predictions, offering an accurate alternative to predict PBRs.
Availability and implementation HybridPBRpred webserver, benchmark dataset and supplementary information are available at http://biomine.cs.vcu.edu/servers/hybridPBRpred/. Supplementary information Supplementary data are available at Bioinformatics online.

