Coevolutive, Evolutive and Stochastic Information in Protein-Protein Interactions

2019 ◽  
Author(s):  
Miguel Andrade ◽  
Camila Pontes ◽  
Werner Treptow

ABSTRACT: Here, we investigate the contributions of coevolutive, evolutive and stochastic information in determining protein-protein interactions (PPIs) based on primary sequences of two interacting protein families A and B. Specifically, under the assumption that coevolutive information is imprinted on the interacting amino acids of two proteins, in contrast to other (evolutive and stochastic) sources spread over their sequences, we dissect those contributions in terms of compensatory mutations at physically-coupled and uncoupled amino acids of A and B. We find that physically-coupled amino acids at short-range distances store the largest per-contact mutual information content, with a significant fraction of that content resulting from coevolutive sources alone. The information stored in coupled amino acids is further shown to discriminate multi-sequence alignments (MSAs) with the largest expectation fraction of PPI matches – a conclusion that holds against various definitions of intermolecular contacts and binding modes. When compared to the informational content resulting from evolution at long-range interactions, the mutual information in physically-coupled amino acids is the strongest signal to distinguish PPIs derived from cospeciation and, likely, the only indication in the case of molecular coevolution in independent genomes, as the evolutive information must vanish for uncorrelated proteins.

SIGNIFICANCE: The problem of predicting protein-protein interactions (PPIs) based on multi-sequence alignments (MSAs) is not yet completely resolved. Previous studies took one or more sources of information into account without clarifying the isolated contributions of coevolutive, evolutive and stochastic information in resolving the problem. By benefiting from data sets made available in the sequence- and structure-rich era, we revisit the field to show that physically-coupled amino acids of proteins store the largest (per contact) information content to discriminate MSAs with the largest expectation fraction of PPI matches – a result that should guide new developments in the field, aiming at characterizing protein interactions in general.
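
The per-contact mutual information mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the estimator used, so the code below assumes the standard plug-in (frequency-count) estimate of mutual information between one alignment column of family A and one of family B, with no pseudocounts or finite-sample corrections. The function name `column_mutual_information` and the toy columns are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Plug-in mutual information (in bits) between two paired MSA columns.

    col_a and col_b hold the residue found at one position of protein A and
    one position of protein B, row by row, for the same paired sequences.
    This is an illustrative estimator, not the one used in the paper.
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")

    n = len(col_a)
    count_a = Counter(col_a)                # marginal counts for column A
    count_b = Counter(col_b)                # marginal counts for column B
    count_ab = Counter(zip(col_a, col_b))   # joint counts for residue pairs

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = (n_ab * n) / (count_a[a] * count_b[b])
        mi += p_ab * log2(ratio)
    return mi


# Toy example: compensatory substitutions (A<->D, K<->E) versus no coupling.
coupled = column_mutual_information("AAKK", "DDEE")
uncoupled = column_mutual_information("AAKK", "DEDE")
```

In the toy columns, a residue change in A is always accompanied by a change in B in the coupled case, so the estimate is high; in the uncoupled case the two columns vary independently and the estimate is zero. Real alignments would need corrections for phylogenetic and finite-size effects, which this sketch deliberately omits.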

Molecules ◽  
2020 ◽  
Vol 25 (8) ◽  
pp. 1841 ◽  
Author(s):  
Da Xu ◽  
Hanxiao Xu ◽  
Yusen Zhang ◽  
Wei Chen ◽  
Rui Gao

Identification of protein-protein interactions (PPIs) plays an essential role in the understanding of protein functions and cellular biological activities. However, the traditional experiment-based methods are time-consuming and laborious. Therefore, developing new reliable computational approaches has great practical significance for the identification of PPIs. In this paper, a novel prediction method is proposed for predicting PPIs using graph energy, named PPI-GE. Particularly, in the process of feature extraction, we designed two new feature extraction methods, the physicochemical graph energy based on the ionization equilibrium constant and isoelectric point and the contact graph energy based on the contact information of amino acids. The dipeptide composition method was used for order information of amino acids. After multi-information fusion, principal component analysis (PCA) was implemented for eliminating noise and a robust weighted sparse representation-based classification (WSRC) classifier was applied for sample classification. The prediction accuracies based on the five-fold cross-validation of the human, Helicobacter pylori (H. pylori), and yeast data sets were 99.49%, 97.15%, and 99.56%, respectively. In addition, in five independent data sets and two significant PPI networks, the comparative experimental results also demonstrate that PPI-GE obtained better performance than the compared methods.
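
As an illustration of the dipeptide composition feature named in this abstract, the following sketch computes the usual 400-dimensional vector of adjacent residue-pair frequencies. It is not the PPI-GE implementation: the function name `dipeptide_composition`, the residue ordering, and the choice to skip pairs containing non-standard residues are assumptions made for the example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a sequence.

    Each entry is the count of one ordered residue pair divided by the
    number of counted pairs. Pairs containing a non-standard residue are
    skipped (an assumption of this sketch).
    """
    seq = sequence.upper()
    features = [0.0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = DIPEPTIDE_INDEX.get(seq[i:i + 2])
        if idx is not None:
            features[idx] += 1.0
            total += 1
    if total > 0:
        features = [count / total for count in features]
    return features


# Toy usage: "ACAC" contains the pairs AC, CA, AC.
vec = dipeptide_composition("ACAC")
```

Because the vector has a fixed length regardless of sequence length, it can be concatenated with other fixed-length descriptors before a dimensionality-reduction step such as PCA.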


2021 ◽  
Author(s):  
Patrick Bryant ◽  
Gabriele Pozzati ◽  
Arne Elofsson

Predicting the structure of single-chain proteins is now close to being a solved problem due to the recent achievement of AlphaFold2 (AF2). However, predicting the structure of interacting protein chains is still a challenge. Here, we utilise AF2 to optimise a protocol for predicting the structure of heterodimeric protein complexes using only sequence information. We find that using the default AF2 protocol, 32% of the models in the Dockground test set can be modelled accurately. By tuning the input alignment and identifying the best model, we adjusted the performance to 43%. Our protocol uses MSAs generated by AF2 and MSAs paired on the organism level generated with HHblits. In a more extensive, more realistic, independent test set, the accuracy is 59%. In comparison, the alternative fold-and-dock method RoseTTAFold is only successful in 10% of the cases on this set and traditional docking methods 22%. However, for the traditional method, the performance would be lower if the bound form of both monomers was not known. The success is higher for bacterial protein pairs, pairs with large interaction areas consisting of helices or sheets, and many homologous sequences. We can distinguish acceptable (DockQ>0.23) from incorrect models with an AUC of 0.84 on the test set by analysing the predicted interfaces. At an error rate of 1%, 13% are acceptable (at a 10% error rate, 40% of the models are acceptable). All scripts and tools to run our protocol are freely available at: https://gitlab.com/ElofssonLab/FoldDock.
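
The separation of acceptable from incorrect models reported in this abstract is summarised by an AUC. A minimal sketch of how such a number can be computed is given below, assuming models are labelled acceptable when DockQ > 0.23 (the threshold stated in the abstract) and ranked by some interface-based confidence score. The score values, the function names, and the pairwise (Mann-Whitney) formulation are illustrative assumptions, not the authors' scripts.

```python
ACCEPTABLE_DOCKQ = 0.23  # threshold quoted in the abstract


def label_acceptable(dockq_values):
    """Label each model True if its DockQ exceeds the acceptability threshold."""
    return [d > ACCEPTABLE_DOCKQ for d in dockq_values]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (acceptable, incorrect) pairs in which the
    acceptable model has the higher score; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one acceptable and one incorrect model")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical DockQ values and confidence scores for four models.
dockq = [0.50, 0.10, 0.30, 0.05]
confidence = [0.9, 0.2, 0.4, 0.6]
auc = roc_auc(label_acceptable(dockq), confidence)
```

The quadratic pairwise loop is chosen for readability; for large test sets a rank-based computation gives the same value more efficiently.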


Toxins ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 290
Author(s):  
Caterina Peggion ◽  
Fiorella Tonello

Snake venom phospholipases A2 (PLA2s) have sequences and structures very similar to those of mammalian group I and II secretory PLA2s, but they possess many toxic properties, ranging from the inhibition of coagulation to the blockage of nerve transmission and the induction of muscle necrosis. The biological properties of these proteins are due not only to their enzymatic activity, but also to protein–protein interactions that are still unidentified. Here, we compare sequence alignments of snake venom and mammalian PLA2s, grouped according to their structure and biological activity, looking for differences that can explain their different behavior. This bioinformatics analysis revealed three distinct regions, two central and one C-terminal, with amino acid compositions that distinguish the different categories of PLA2s. In these regions, we identified short linear motifs (SLiMs), peptide modules involved in protein–protein interactions, that are conserved in mammalian but not in snake venom PLA2s, or vice versa. The different SLiM content of snake venom PLA2s with respect to mammalian PLA2s may result in the formation of protein–membrane complexes having a toxic activity, or in the formation of complexes whose activity cannot be blocked due to the lack of switches in the toxic PLA2s, such as the motif recognized by the prolyl isomerase Pin1.


Proteomes ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 16
Author(s):  
Shomeek Chowdhury ◽  
Stephen Hepper ◽  
Mudassir K. Lodi ◽  
Milton H. Saier ◽  
Peter Uetz

Glycolysis is regulated by numerous mechanisms including allosteric regulation, post-translational modification or protein-protein interactions (PPI). While glycolytic enzymes have been found to interact with hundreds of proteins, the impact of only some of these PPIs on glycolysis is well understood. Here we investigate which of these interactions may affect glycolysis in E. coli and possibly across numerous other bacteria, based on the stoichiometry of interacting protein pairs (from proteomic studies) and their conservation across bacteria. We present a list of 339 protein-protein interactions involving glycolytic enzymes but predict that ~70% of glycolytic interactors are not present in adequate amounts to have a significant impact on glycolysis. Finally, we identify a conserved but uncharacterized subset of interactions that are likely to affect glycolysis and deserve further study.


2021 ◽  
Author(s):  
Babu Sudhamalla ◽  
Anirban Roy ◽  
Soumen Barman ◽  
Jyotirmayee Padhan

The site-specific installation of light-activable crosslinker unnatural amino acids offers a powerful approach to trap transient protein-protein interactions both in vitro and in vivo. Herein, we engineer a bromodomain to...


2019 ◽  
Vol 21 (5) ◽  
pp. 1798-1805 ◽  
Author(s):  
Kai Yu ◽  
Qingfeng Zhang ◽  
Zekun Liu ◽  
Yimeng Du ◽  
Xinjiao Gao ◽  
...  

Abstract Protein lysine acetylation is an important molecular mechanism for regulating cellular processes and plays critical physiological and pathological roles in cancers and other diseases. Although a large number of acetylation sites have been identified through experimental identification and high-throughput proteomics techniques, their enzyme-specific regulation remains largely unknown. Here, we developed the deep learning-based protein lysine acetylation modification prediction (Deep-PLA) software for histone acetyltransferase (HAT)/histone deacetylase (HDAC)-specific acetylation prediction. Experimentally identified substrates and sites of several HATs and HDACs were curated from the literature to generate enzyme-specific data sets. We integrated various protein sequence features with a deep neural network and optimized the hyperparameters with particle swarm optimization, which achieved satisfactory performance. In comparisons based on cross-validations and testing data sets, the model outperformed previous studies. Meanwhile, we found that protein–protein interactions could enrich enzyme-specific acetylation regulatory relations and visualized this information in the Deep-PLA web server. Furthermore, a cross-cancer analysis of acetylation-associated mutations revealed that acetylation regulation was intensively disrupted by mutations in cancers and heavily implicated in the regulation of cancer signaling. These prediction and analysis results might provide helpful information to reveal the regulatory mechanism of protein acetylation in various biological processes and to promote research on the prognosis and treatment of cancers. Therefore, the Deep-PLA predictor and protein acetylation interaction networks could provide helpful information for studying the regulation of protein acetylation. The web server of Deep-PLA can be accessed at http://deeppla.cancerbio.info.


BMC Genomics ◽  
2019 ◽  
Vol 20 (S9) ◽  
Author(s):  
Xiaoshi Zhong ◽  
Rama Kaalia ◽  
Jagath C. Rajapakse

Abstract Background: Semantic similarity between Gene Ontology (GO) terms is a fundamental measure for many bioinformatics applications, such as determining functional similarity between genes or proteins. Most previous research exploited information content to estimate the semantic similarity between GO terms; recently, some research exploited word embeddings to learn vector representations for GO terms from a large-scale corpus. In this paper, we propose a novel method, named GO2Vec, that exploits graph embeddings to learn vector representations for GO terms from the GO graph. GO2Vec combines the information from both the GO graph and GO annotations, and its learned vectors can be applied to a variety of bioinformatics applications, such as calculating functional similarity between proteins and predicting protein-protein interactions. Results: We conducted two kinds of experiments to evaluate the quality of GO2Vec: (1) functional similarity between proteins on the Collaborative Evaluation of GO-based Semantic Similarity Measures (CESSM) dataset and (2) prediction of protein-protein interactions on the Yeast and Human datasets from the STRING database. Experimental results demonstrate the effectiveness of GO2Vec over the information content-based measures and the word embedding-based measures. Conclusion: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GO and GOA graphs. Our results also demonstrate that GO annotations provide useful information for computing the similarity between GO terms and between proteins.


BMC Cancer ◽  
2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Konstantinos Karakostis ◽  
Robin Fåhraeus

Abstract Structured RNA regulatory motifs exist from the prebiotic stages of the RNA world to the more complex eukaryotic systems. In cases where a functional RNA structure lies within the coding sequence, a selective pressure drives a parallel co-evolution of the RNA structure and the encoded peptide domain. The p53-MDM2 axis, describing the interactions between the p53 tumor suppressor and the MDM2 E3 ubiquitin ligase, serves as a particularly useful model revealing how secondary RNA structures have co-evolved along with corresponding interacting protein motifs, thus having an impact on protein–RNA and protein–protein interactions, and how such structures developed signal-dependent regulation in mammalian systems. The p53(BOX-I) RNA sequence binds the C-terminus of MDM2 and controls p53 synthesis, while the encoded peptide domain binds MDM2 and controls p53 degradation. The BOX-I peptide domain is also located within the p53 transcription activation domain. The folding of the p53 mRNA structure has evolved from temperature-regulated in pre-vertebrates to an ATM kinase signal-dependent pathway in mammalian cells. The protein–protein interaction evolved in vertebrates and became regulated by the same signaling pathway. At the same time as the protein–RNA and protein–protein interactions evolved, the p53 trans-activation domain progressed to become integrated into a range of cellular pathways. We discuss how a single synonymous mutation in BOX-I, p53(L22L), observed in a chronic lymphocytic leukaemia patient, prevents the activation of p53 following DNA damage. The concepts analysed and discussed in this review may serve as a conceptual mechanistic paradigm of the co-evolution and function of molecules having roles in cellular regulation or in the aetiology of genetic diseases, and of how synonymous mutations can affect the encoded protein.

