Protein engineering in the big data era: harnessing near-redundant structural data

Author(s):  
Samuel Coulbourn Flores ◽  
Athanasios Alexiou ◽  
Anastasios Glaros

Abstract Motivation: Predicting the effect of mutations on protein-protein interactions is important for relating structure to function, as well as for in silico affinity maturation. The effect of mutations on protein-protein binding energy (ΔΔG) can be predicted by a variety of atomic simulation methods involving full or limited flexibility, and explicit or implicit solvent. Methods that consider only limited flexibility are naturally more economical, and many of them are quite accurate; however, their results depend on the atomic coordinate set used. In this work we perform a sequence- and structure-based search of the Protein Data Bank to find additional coordinate sets and repeat the calculation on each. Results: We increase precision and Positive Predictive Value, and decrease Root Mean Square Error, compared to using single structures. Given the ongoing growth of near-redundant structures in the Protein Data Bank, our method will only increase in applicability and accuracy. Availability: Public web server at biodesign.scilifelab.se
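
The abstract describes repeating a ΔΔG calculation on several coordinate sets and scoring the outcome by Root Mean Square Error and Positive Predictive Value. The following is a minimal sketch of that bookkeeping, not the authors' implementation: combining the per-structure values by a plain mean, the sign convention (ΔΔG < 0 taken as improved binding), and all numbers are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean


def aggregate_ddg(per_structure_ddg):
    """Combine ΔΔG values predicted for one mutation on several coordinate sets.

    Assumption: a plain mean; the abstract does not state how values are combined.
    """
    return mean(per_structure_ddg)


def rmse(predicted, experimental):
    """Root Mean Square Error between predicted and experimental ΔΔG values."""
    return sqrt(mean([(p - e) ** 2 for p, e in zip(predicted, experimental)]))


def positive_predictive_value(predicted, experimental, threshold=0.0):
    """Fraction of mutations predicted below `threshold` that are truly below it.

    Assumption: ΔΔG < threshold counts as a "positive" (affinity-improving) call.
    """
    true_pos = false_pos = 0
    for p, e in zip(predicted, experimental):
        if p < threshold:
            if e < threshold:
                true_pos += 1
            else:
                false_pos += 1
    if true_pos + false_pos == 0:
        return float("nan")  # no positive calls, so PPV is undefined
    return true_pos / (true_pos + false_pos)


# Hypothetical ΔΔG predictions (kcal/mol) for three mutations, each evaluated
# on three near-redundant structures, with hypothetical experimental values.
per_structure = {
    "mutA": [-1.2, -0.8, -1.0],
    "mutB": [0.5, 1.5, 1.0],
    "mutC": [-0.4, 0.6, -0.5],
}
experimental = {"mutA": -1.0, "mutB": 1.0, "mutC": 0.3}

names = sorted(per_structure)
ensemble = [aggregate_ddg(per_structure[n]) for n in names]
reference = [experimental[n] for n in names]
print("RMSE:", rmse(ensemble, reference))
print("PPV:", positive_predictive_value(ensemble, reference))
```

The same two metric functions can be applied to the predictions from a single structure to compare against the ensemble values.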

2020 ◽  
Author(s):  
Samuel Coulbourn Flores ◽  
Athanasios Alexiou ◽  
Anastasios Glaros

Abstract Motivation: Predicting the effect of mutations on protein-protein interactions is important for relating structure to function, as well as for in silico affinity maturation. The effect of mutations on protein-protein binding energy (ΔΔG) can be predicted by a variety of atomic simulation methods involving full or limited flexibility, and explicit or implicit solvent. Methods that consider only limited flexibility are naturally more economical, and many of them are quite accurate; however, their results depend on the atomic coordinate set used. In this work we perform a sequence- and structure-based search of the Protein Data Bank to find additional coordinate sets and repeat the calculation on each. Results: We increase precision and Positive Predictive Value, and decrease Root Mean Square Error, compared to using single structures. Given the ongoing growth of near-redundant structures in the Protein Data Bank, our method will only increase in applicability and accuracy. Availability: Public web server at biodesign.scilifelab.se


PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0257614
Author(s):  
Samuel Coulbourn Flores ◽  
Athanasios Alexiou ◽  
Anastasios Glaros

Predicting the effect of mutations on protein-protein interactions is important for relating structure to function, as well as for in silico affinity maturation. The effect of mutations on protein-protein binding energy (ΔΔG) can be predicted by a variety of atomic simulation methods involving full or limited flexibility, and explicit or implicit solvent. Methods that consider only limited flexibility are naturally more economical, and many of them are quite accurate; however, their results depend on the atomic coordinate set used. In this work we perform a sequence- and structure-based search of the Protein Data Bank to find additional coordinate sets and repeat the calculation on each. The method increases precision and Positive Predictive Value, and decreases Root Mean Square Error, compared to using single structures. Given the ongoing growth of near-redundant structures in the Protein Data Bank, our method will only increase in applicability and accuracy.


2009 ◽  
Vol 284 (24) ◽  
pp. 16369-16376 ◽  
Author(s):  
Xuebo Hu ◽  
Sungkwon Kang ◽  
Xiaoyue Chen ◽  
Charles B. Shoemaker ◽  
Moonsoo M. Jin

A quantitative in vivo method for detecting protein-protein interactions will enhance our understanding of protein interaction networks and facilitate affinity maturation as well as the design of new interaction pairs. We have developed a novel platform, dubbed “yeast surface two-hybrid (YS2H),” to enable quantitative measurement of pairwise protein interactions via the secretory pathway by expressing one protein (bait) anchored to the cell wall and the other (prey) in soluble form. In YS2H, the prey is either released outside of the cells or remains on the cell surface by virtue of its binding to the bait. The strength of their interaction is measured by antibody binding to the epitope tag appended to the prey or by direct readout of split green fluorescent protein (GFP) complementation. When two α-helices forming coiled coils were expressed as a pair of prey and bait, the amount of the prey in complex with the bait progressively decreased as the affinity changed from 100 pM to 10 μM. With the GFP complementation assay, we were able to discriminate a 6-log difference in binding affinities in the range of 100 pM to 100 μM. The affinity estimated from the level of antibody binding to fusion tags was in good agreement with that measured in solution using a surface plasmon resonance technique. In contrast, the level of GFP complementation increased linearly with the on-rate of coiled coil interactions, likely because of the irreversible nature of GFP reconstitution. Furthermore, we demonstrate the use of YS2H in exploring the nature of antigen recognition by antibodies and activation allostery in integrins, and in isolating heavy chain-only antibodies against botulinum neurotoxin.
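
As a quick arithmetic check of the "6-log" figure, the sketch below computes the number of decades spanned by the stated affinity range, reading its endpoints as dissociation constants of 100 pM and 100 μM. The function name is hypothetical and chosen only for this illustration.

```python
from math import log10

PICOMOLAR = 1e-12   # mol/L
MICROMOLAR = 1e-6   # mol/L


def decades_spanned(kd_tight, kd_weak):
    """Number of orders of magnitude between two dissociation constants."""
    return log10(kd_weak / kd_tight)


span = decades_spanned(100 * PICOMOLAR, 100 * MICROMOLAR)
print(f"{span:.1f} decades")
```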


Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 286 ◽  
Author(s):  
Eliza C. Martin ◽  
Octavina C. A. Sukarta ◽  
Laurentiu Spiridon ◽  
Laurentiu G. Grigore ◽  
Vlad Constantinescu ◽  
...  

Leucine-rich repeats (LRRs) belong to an archaic procaryal protein architecture that is widely involved in protein–protein interactions. In eukaryotes, LRR domains developed into key recognition modules in many innate immune receptor classes. Due to the high sequence variability imposed by recognition specificity, precise repeat delineation is often difficult, especially in plant NOD-like receptors (NLRs), which are notorious for showing far larger irregularities. To address this problem, we introduce LRRpredictor, a method based on an ensemble of estimators designed to better identify LRR motifs in general, but particularly adapted to handling more irregular LRR environments, thus helping to compensate for the scarcity of structural data on NLR proteins. The extrapolation capacity, tested on a set of annotated LRR domains from six immune receptor classes, shows the ability of LRRpredictor to recover all previously defined specific motif consensuses and to extend the LRR motif coverage over annotated LRR domains. This analysis confirms the increased variability of LRR motifs in plant and vertebrate NLRs when compared to extracellular receptors, consistent with previous studies. Hence, LRRpredictor is able to provide novel insights into the diversification of LRR domains and robust support for structure-informed analyses of LRRs in immune receptor functioning.


Author(s):  
Keisuke Arikawa

On the basis of robot kinematics, we have thus far developed a method for predicting the motion of proteins from their 3D structural data given in the Protein Data Bank (PDB data). In this method, proteins are modeled as serial manipulators constrained by springs and the structural compliance properties of the models are evaluated. We focus on localized instead of whole structures of proteins. Employing the same model used in our method of motion prediction, the motion properties of the localized structures and the relation between the motion properties of localized and whole structures are analyzed. First, we present a method for graphically expressing the deformation of objects with a complex shape, such as proteins, by approximating the shape as a rectangular prism with a mesh on its surface. We then formulate a method for comparing the motion properties of localized structures cleaved from the whole structure and those remaining in it by expressing the motion of the latter using the decomposed motion modes of the former according to the structural compliance. Finally, we show a method for evaluating the effect of a localized structure on the motion properties of proteins by applying forces to localized structures. In the formulations, we demonstrate applications as illustrative examples using the PDB data of a real protein.


2011 ◽  
Vol 40 (D1) ◽  
pp. D453-D460 ◽  
Author(s):  
A. R. Kinjo ◽  
H. Suzuki ◽  
R. Yamashita ◽  
Y. Ikegawa ◽  
T. Kudou ◽  
...  

2020 ◽  
Author(s):  
Mayank Baranwal ◽  
Abram Magner ◽  
Jacob Saldinger ◽  
Emine S. Turali-Emre ◽  
Shivani Kozarekar ◽  
...  

Abstract: Development of new methods for analysis of protein-protein interactions (PPIs) at molecular and nanometer scales gives insights into intracellular signaling pathways and will improve understanding of protein functions, as well as other nanoscale structures of biological and abiological origins. Recent advances in computational tools, particularly the ones involving modern deep learning algorithms, have been shown to complement experimental approaches for describing and rationalizing PPIs. However, most of the existing works on PPI prediction use protein-sequence information, and thus have difficulties in accounting for the three-dimensional organization of the protein chains. In this study, we address this problem and describe a PPI analysis method based on a graph attention network, named Struct2Graph, for identifying PPIs directly from the structural data of folded protein globules. Our method is capable of predicting PPIs with an accuracy of 98.89% on a balanced set consisting of an equal number of positive and negative pairs. On an unbalanced set with a ratio of 1:10 between positive and negative pairs, Struct2Graph achieves a five-fold cross-validation average accuracy of 99.42%. Moreover, unsupervised prediction of the interaction sites by Struct2Graph for phenol-soluble modulins is found to be in concordance with the previously reported binding sites for this family.
Author summary: PPIs are the central part of signal transduction, metabolic regulation, environmental sensing, and cellular organization. Despite their success, most strategies to decode PPIs use sequence-based approaches, which do not generalize to broader classes of chemical compounds of similar scale as proteins that are equally capable of forming complexes with proteins but are not based on amino acids, and thus lack an equivalent sequence-based representation. Here, we address the problem of PPI prediction using a first-of-its-kind, 3D-structure-based graph attention network (available at https://github.com/baranwa2/Struct2Graph). Beyond its excellent prediction performance, the novel mutual attention mechanism provides insights into likely interaction sites through its knowledge selection process in a completely unsupervised manner.
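
To make the idea of a graph derived from 3D structure concrete, here is a minimal sketch that links residues whose representative atoms lie within a distance cutoff. It is a generic contact-graph construction, not Struct2Graph's actual featurization or attention mechanism; the function name, the 8 Å cutoff, and the toy coordinates are assumptions.

```python
from math import dist


def contact_graph(coords, cutoff=8.0):
    """Adjacency list linking residues closer than `cutoff` (in Å).

    `coords` holds one (x, y, z) tuple per residue, e.g. its Cα position.
    The cutoff value is an illustrative assumption.
    """
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                graph[i].append(j)
                graph[j].append(i)
    return graph


# Toy coordinates: three residues along a chain and one distant residue.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_graph(toy_coords))
```

A graph neural network would then operate on such an adjacency structure together with per-node features; those parts are omitted here.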


2019 ◽  
Author(s):  
A.T. Balci ◽  
C. Gumeli ◽  
A. Hakouz ◽  
D. Yuret ◽  
O. Keskin ◽  
...  

Abstract Motivation: Protein–protein interactions are crucial in almost all biological processes. Proteins interact through their interfaces. It is important to determine how proteins interact through interfaces in order to understand protein binding mechanisms and to predict new protein-protein interactions. Results: We present DeepInterface, a deep learning-based method that predicts, for a given protein complex, whether the interface between the proteins of the complex is a true interface. The model is a 3-dimensional convolutional neural network; the positive datasets are obtained from all complexes in the Protein Data Bank, and the negative datasets consist of incorrect docking solutions (decoys). The model analyzes a given interface structure and outputs the probability of the given structure being an interface. The accuracy of the model for several interface data sets, including PIFACE, PPI4DOCK, and DOCKGROUND, is approximately 88% in the validation dataset and 75% in the test dataset. The method can be used to improve the accuracy of template-based PPI predictions.
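
A 3-dimensional convolutional network needs its input on a regular grid. The sketch below shows one common way to get there, binning atom coordinates into voxels centered on the centroid. It is an assumption-laden illustration, not DeepInterface's preprocessing: the function name, grid size, voxel width, and the use of plain atom counts as the only channel are all hypothetical.

```python
def voxelize(atoms, grid_size=8, voxel_width=2.0):
    """Count atoms per voxel in a cubic grid centered on the atoms' centroid.

    `atoms` is a non-empty list of (x, y, z) tuples in Å. Atoms falling
    outside the grid are dropped. Grid parameters are illustrative assumptions.
    """
    n = len(atoms)
    center = [sum(a[k] for a in atoms) / n for k in range(3)]
    half = grid_size * voxel_width / 2.0
    grid = [[[0] * grid_size for _ in range(grid_size)] for _ in range(grid_size)]
    for atom in atoms:
        # Shift so the grid spans [0, grid_size * voxel_width) on every axis.
        idx = [int((atom[k] - center[k] + half) // voxel_width) for k in range(3)]
        if all(0 <= i < grid_size for i in idx):
            grid[idx[0]][idx[1]][idx[2]] += 1
    return grid


# Toy input: three atoms placed symmetrically around the origin.
toy_atoms = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (-1.0, -1.0, -1.0)]
toy_grid = voxelize(toy_atoms)
print(sum(v for plane in toy_grid for row in plane for v in row), "atoms binned")
```

In practice each voxel would usually carry several channels (for example, one per atom type), which this single-channel sketch leaves out.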


2021 ◽  
Vol 15 ◽  
Author(s):  
Hale Yapici-Eser ◽  
Yunus Emre Koroglu ◽  
Ozgur Oztop-Cakmak ◽  
Ozlem Keskin ◽  
Attila Gursoy ◽  
...  

The first clinical symptoms reported for coronavirus disease 2019 (COVID-19) were those of respiratory failure; however, accumulating evidence also points to its presentation with neuropsychiatric symptoms, the exact mechanisms of which are not well known. By using a computational methodology, we aimed to explain the molecular paths of COVID-19-associated neuropsychiatric symptoms, based on the mimicry of the human protein interactions with SARS-CoV-2 proteins. Methods: The available structures of 11 of the 29 SARS-CoV-2 proteins were extracted from the Protein Data Bank. HMI-PRED (Host-Microbe Interaction PREDiction), a recently developed web server for structural PREDiction of protein-protein interactions (PPIs) between host and any microbial species, was used to find the “interface mimicry” through which the microbial proteins hijack host binding surfaces. Classification of the found interactions was conducted using the PANTHER Classification System. Results: Predicted human-SARS-CoV-2 protein interactions have been extensively compared with the literature. Based on the analysis of the molecular functions, cellular localizations and pathways related to human proteins, SARS-CoV-2 proteins are found to possibly interact with human proteins linked to synaptic vesicle trafficking, endocytosis, axonal transport, neurotransmission, growth factors, mitochondrial and blood-brain barrier elements, in addition to their peripheral interactions with proteins linked to thrombosis, inflammation and metabolic control. Conclusion: SARS-CoV-2-human protein interactions may lead to the development of delirium, psychosis, seizures, encephalitis, stroke, sensory impairments, peripheral nerve diseases, and autoimmune disorders. Our findings are also supported by previous in vivo and in vitro studies of other viruses. Further in vivo and in vitro studies using the proteins identified here could pave the way to new targets for both preventing and reversing neuropsychiatric presentations.


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0255167
Author(s):  
Vladimir Sladek ◽  
Yuta Yamamoto ◽  
Ryuhei Harada ◽  
Mitsuo Shoji ◽  
Yasuteru Shigeta ◽  
...  

The field of protein residue network (PRN) research has brought several useful methods and techniques for structural analysis of proteins and protein complexes. Many of these are ripe and ready to be used by the proteomics community outside of the PRN specialists. In this paper we present software, pyProGA, which collects an ensemble of (network) methods tailored towards the analysis of protein-protein interactions (PPI) and/or interactions of proteins with ligands of other types, e.g. nucleic acids, oligosaccharides, etc. In parallel, we propose the use of network differential analysis as a method to identify residues mediating key interactions between proteins. We use a model system to show that, in combination with other, already published methods also included in pyProGA, it can be used to make such predictions. This extended repertoire of methods also allows predictions to be cross-checked against one another, as we show here. In addition, the possibility of constructing PRN models from various kinds of input is so far a unique asset of our code. One can use structural data as defined in PDB files and/or data on residue pair interaction energies, obtained either from force-field parameters or from fragment molecular orbital (FMO) calculations. pyProGA is free open-source software available from https://gitlab.com/Vlado_S/pyproga.
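
The abstract does not spell out how the network differential analysis is computed, so the following is only a generic sketch of the idea of comparing two residue networks: given edge weights (for example, residue pair interaction energies) for two states, it sums the absolute weight change over each residue's edges. It does not use or reproduce the pyProGA API; the function name, residue labels, and energies are hypothetical.

```python
from collections import defaultdict


def differential_strength(network_a, network_b):
    """Per-residue sum of absolute edge-weight changes between two networks.

    Each network maps a (residue_1, residue_2) tuple to an edge weight.
    An edge missing from one network is treated as having weight 0.
    """
    change = defaultdict(float)
    for pair in set(network_a) | set(network_b):
        delta = abs(network_a.get(pair, 0.0) - network_b.get(pair, 0.0))
        for residue in pair:
            change[residue] += delta
    return dict(change)


# Hypothetical pair energies (kcal/mol): chain A in complex with chain B,
# and chain A alone.
in_complex = {("A:R10", "B:D55"): -12.0, ("A:R10", "A:L11"): -3.0}
isolated = {("A:R10", "A:L11"): -3.0}

scores = differential_strength(in_complex, isolated)
for residue, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(residue, score)
```

Residues with the largest change are the ones whose network environment differs most between the two states, which is the kind of signal one would inspect when looking for residues at an interface.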

