GO2Vec: transforming GO terms and proteins to vector representations via graph embeddings

Abstract Functional similarity between genes is widely used in many bioinformatics applications, including detecting molecular pathways, finding co-expressed genes, predicting protein-protein interactions, and prioritizing candidate genes. Methods for evaluating the functional similarity of genes are mostly based on the semantic similarity of gene ontology (GO) terms. Though there are hundreds of functional similarity measures available in the literature, none of them considers the enrichment of the GO terms by the querying gene pair. We propose a novel method to incorporate GO enrichment into the existing functional similarity measures. Our experiments show that the inclusion of gene enrichment significantly improves the performance of 44 widely used functional similarity measures, especially in the prediction of sequence homologies, gene expression correlations, and protein-protein interactions. Software availability: The software (Python code) and the evaluation of all the benchmark datasets (R script) are available at https://gitlab.com/liuwt/EnrichFunSim.

Graph embeddings on gene ontology annotations for protein–protein interaction prediction

BMC Bioinformatics ◽

10.1186/s12859-020-03816-8 ◽

2020 ◽

Vol 21 (S16) ◽

Author(s):

Xiaoshi Zhong ◽

Jagath C. Rajapakse

Keyword(s):

Gene Ontology ◽

Protein Interaction ◽

Structural Information ◽

Experimental Results ◽

Graph Embeddings ◽

Protein Protein Interaction ◽

PPI Networks ◽

Vector Representations ◽

PPI Prediction ◽

GO Terms

Abstract Background: Protein–protein interaction (PPI) prediction is an important task for understanding many bioinformatics functions and applications, such as predicting protein functions, gene-disease associations and disease-drug associations. However, many previous PPI prediction studies do not consider the missing and spurious interactions inherent in PPI networks. To address these two issues, we define two corresponding tasks, namely missing PPI prediction and spurious PPI prediction, and propose a method that employs graph embeddings to learn vector representations from constructed Gene Ontology Annotation (GOA) graphs and then uses the embedded vectors to carry out the two tasks. Our method leverages information from both term–term relations among GO terms and term–protein annotations between GO terms and proteins, and preserves both local and global structural information of the GO annotation graph. Results: We compare our method with methods based on information content (IC) and with one method based on word embeddings, in experiments on three PPI datasets from the STRING database. Experimental results demonstrate that our method is more effective than the compared methods. Conclusion: Our experimental results demonstrate the effectiveness of using graph embeddings to learn vector representations from undirected GOA graphs for our defined missing and spurious PPI tasks.
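The idea in this abstract (one undirected graph holding both term–term and term–protein edges, from which node vectors are derived and compared) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy identifiers (`GO:1`, `P1`, ...), the function names, and above all the use of random-walk co-occurrence counts as a stand-in for a trained graph embedding. It is not the authors' implementation, and the abstract does not say which embedding model they use.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical toy data: GO term-term relations and term-protein annotations.
TERM_TERM = [("GO:1", "GO:2"), ("GO:1", "GO:3"), ("GO:3", "GO:4")]
TERM_PROTEIN = [("GO:2", "P1"), ("GO:2", "P2"), ("GO:4", "P3")]


def build_goa_graph(term_term, term_protein):
    """Merge both edge types into one undirected adjacency map."""
    adj = defaultdict(set)
    for u, v in list(term_term) + list(term_protein):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def random_walks(adj, walks_per_node=20, walk_length=6, seed=0):
    """Uniform random walks started from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(sorted(adj[walk[-1]])))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks, window=2):
    """Stand-in for a learned embedding: count nodes seen within a window."""
    vecs = defaultdict(Counter)
    for walk in walks:
        for i, node in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[node][walk[j]] += 1
    return vecs


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


graph = build_goa_graph(TERM_TERM, TERM_PROTEIN)
walks = random_walks(graph)
vectors = cooccurrence_vectors(walks)
score_close = cosine(vectors["P1"], vectors["P2"])  # both annotated to GO:2
score_far = cosine(vectors["P1"], vectors["P3"])    # annotated to distant terms
```

In this toy graph, `P1` and `P2` share an annotation and so receive a higher score than `P1` and `P3`. A high score for an unobserved pair would be a candidate missing interaction, and a low score for an observed pair a candidate spurious one; in practice the count vectors would be replaced by vectors from a trained graph-embedding model.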

Target-Templated de novo Design of Macrocyclic D-/L-Peptides: Inhibitors of the PD-1/PD-L1 Interaction

10.26434/chemrxiv.11663337.v3 ◽

2020 ◽

Author(s):

Salvador Guardiola ◽

Monica Varese ◽

Xavier Roig ◽

Jesús Garcia ◽

Ernest Giralt

Keyword(s):

Protein Interactions ◽

Cyclic Peptides ◽

General Framework ◽

Large Scale ◽

De Novo ◽

Inhibitory Effect ◽

Original Text ◽

Protein Protein Interactions ◽

Retraction Notice ◽

Pharmaceutical Properties

NOTE: This preprint has been retracted by consensus from all authors. See the retraction notice in place above; the original text can be found under "Version 1", accessible from the version selector above. ------------------------------------------------------------------------ Peptides, together with antibodies, are among the most potent biochemical tools to modulate challenging protein-protein interactions. However, current structure-based methods are largely limited to natural peptides and are not suitable for designing target-specific binders with improved pharmaceutical properties, such as macrocyclic peptides. Here we report a general framework that leverages the computational power of Rosetta for large-scale backbone sampling and energy scoring, followed by side-chain composition, to design heterochiral cyclic peptides that bind to a protein surface of interest. To showcase the applicability of our approach, we identified two peptides (PD-i3 and PD-i6) that target PD-1, a key immune checkpoint, and work as protein ligand decoys. A comprehensive biophysical evaluation confirmed their binding mechanism to PD-1 and their inhibitory effect on the PD-1/PD-L1 interaction. Finally, elucidation of their solution structures by NMR served as validation of our de novo design approach. We anticipate that our results will provide a general framework for designing target-specific drug-like peptides.

Target-Templated de novo Design of Macrocyclic D-/L-Peptides: Inhibitors of the PD-1/PD-L1 Interaction

10.26434/chemrxiv.11663337 ◽

2020 ◽

Author(s):

Salvador Guardiola ◽

Monica Varese ◽

Xavier Roig ◽

Jesús Garcia ◽

Ernest Giralt

Keyword(s):

Protein Interactions ◽

Cyclic Peptides ◽

General Framework ◽

Large Scale ◽

De Novo ◽

Inhibitory Effect ◽

Original Text ◽

Protein Protein Interactions ◽

Retraction Notice ◽

Pharmaceutical Properties

NOTE: This preprint has been retracted by consensus from all authors. See the retraction notice in place above; the original text can be found under "Version 1", accessible from the version selector above. ------------------------------------------------------------------------ Peptides, together with antibodies, are among the most potent biochemical tools to modulate challenging protein-protein interactions. However, current structure-based methods are largely limited to natural peptides and are not suitable for designing target-specific binders with improved pharmaceutical properties, such as macrocyclic peptides. Here we report a general framework that leverages the computational power of Rosetta for large-scale backbone sampling and energy scoring, followed by side-chain composition, to design heterochiral cyclic peptides that bind to a protein surface of interest. To showcase the applicability of our approach, we identified two peptides (PD-i3 and PD-i6) that target PD-1, a key immune checkpoint, and work as protein ligand decoys. A comprehensive biophysical evaluation confirmed their binding mechanism to PD-1 and their inhibitory effect on the PD-1/PD-L1 interaction. Finally, elucidation of their solution structures by NMR served as validation of our de novo design approach. We anticipate that our results will provide a general framework for designing target-specific drug-like peptides.

Faculty Opinions recommendation of Comparative assessment of large-scale data sets of protein-protein interactions.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1006598.82257 ◽

2002 ◽

Author(s):

Rob Russell

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Comparative Assessment ◽

Data Sets ◽

Protein Protein Interactions ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

Short loop functional commonality identified in leukaemia proteome highlights crucial protein sub-networks

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab010 ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Sun Sook Chung ◽

Joseph C F Ng ◽

Anna Laddach ◽

N Shaun B Thomas ◽

Franca Fraternali

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Interaction Network ◽

Protein Protein Interactions ◽

Protein Protein Interaction ◽

PPI Networks ◽

Short Loop ◽

New Strategy ◽

Loop Network ◽

Protein Protein Interaction Network

Abstract Direct drug targeting of mutated proteins in cancer is not always possible and efficacy can be nullified by compensating protein–protein interactions (PPIs). Here, we establish an in silico pipeline to identify specific PPI sub-networks containing mutated proteins as potential targets, which we apply to mutation data of four different leukaemias. Our method is based on extracting cyclic interactions of a small number of proteins topologically and functionally linked in the Protein–Protein Interaction Network (PPIN), which we call short loop network motifs (SLM). We uncover a new property of PPINs named ‘short loop commonality’ to measure indirect PPIs occurring via common SLM interactions. This detects ‘modules’ of PPI networks enriched with annotated biological functions of proteins containing mutation hotspots, exemplified by FLT3 and other receptor tyrosine kinase proteins. We further identify functional dependency or mutual exclusivity of short loop commonality pairs in large-scale cellular CRISPR–Cas9 knockout screening data. Our pipeline provides a new strategy for identifying new therapeutic targets for drug discovery.
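The "cyclic interactions of a small number of proteins" described above can be illustrated by enumerating short simple cycles in an undirected interaction graph. The sketch below is an assumption-laden illustration, not the authors' pipeline: the function name `short_loops`, the `max_len` cutoff, and the toy edge list are hypothetical, and the abstract does not define how short loop commonality is scored, so that step is left out.

```python
from collections import defaultdict


def short_loops(edges, max_len=4):
    """Enumerate simple cycles with 3..max_len nodes in an undirected graph.

    Each cycle is reported once, as a tuple starting at its smallest node
    and oriented so that its second node is smaller than its last node.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)

    loops = set()

    def extend(path):
        start, last = path[0], path[-1]
        for nxt in adj[last]:
            if nxt == start and len(path) >= 3:
                if path[1] < path[-1]:  # keep one of the two orientations
                    loops.add(tuple(path))
            elif nxt > start and nxt not in path and len(path) < max_len:
                extend(path + [nxt])

    for node in sorted(adj):
        extend([node])
    return loops


# Hypothetical toy PPI network.
ppi_edges = [("A", "B"), ("B", "C"), ("C", "A"),
             ("C", "D"), ("D", "A"), ("D", "E")]
loops = short_loops(ppi_edges, max_len=4)
proteins_in_loops = {p for loop in loops for p in loop}
```

Here `E` interacts with `D` but lies on no cycle, so it is excluded from `proteins_in_loops`. Restricting the search to nodes larger than the start node keeps the enumeration free of duplicates without a separate canonicalisation pass.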

Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2016.00054 ◽

2016 ◽

Vol 3 ◽

Cited By ~ 8

Author(s):

Elise Delaforge ◽

Sigrid Milles ◽

Jie-rong Huang ◽

Denis Bouvier ◽

Malene Ringkjøbing Jensen ◽

...

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Protein Protein Interactions ◽

Domain Dynamics

A MapReduce-Based Parallel Random Forest Approach for Predicting Large-Scale Protein-Protein Interactions

Intelligent Computing Methodologies - Lecture Notes in Computer Science ◽

10.1007/978-3-030-60796-8_34 ◽

2020 ◽

pp. 400-407

Author(s):

Bo-Ya Ji ◽

Zhu-Hong You ◽

Long Yang ◽

Ji-Ren Zhou ◽

Peng-Wei Hu

Keyword(s):

Random Forest ◽

Protein Interactions ◽

Large Scale ◽

Protein Protein Interactions

Gene Functional Similarity Analysis by Definition-based Semantic Similarity Measurement of GO Terms

Advances in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-319-06483-3_18 ◽

2014 ◽

pp. 203-214 ◽

Cited By ~ 4

Author(s):

Ahmad Pesaranghader ◽

Ali Pesaranghader ◽

Azadeh Rezaei ◽

Danoosh Davoodi

Keyword(s):

Semantic Similarity ◽

Functional Similarity ◽

Similarity Measurement ◽

Similarity Analysis ◽

Semantic Similarity Measurement ◽

GO Terms

FrustratometeR: an R-package to compute Local frustration in protein structures, point mutants and MD simulations

10.1101/2020.11.26.400432 ◽

2020 ◽

Author(s):

Atilio O. Rausch ◽

Maria I. Freiberger ◽

Cesar O. Leonetti ◽

Diego M. Luna ◽

Leandro G. Radusky ◽

...

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Protein Structures ◽

MD Simulations ◽

R Package ◽

Protein Protein Interactions ◽

Large Scale Analysis ◽

Functional Aspects ◽

Catalytic Sites ◽

Polypeptide Chains

Once folded, natural protein molecules have few energetic conflicts within their polypeptide chains. Many protein structures do, however, contain regions where energetic conflicts remain after folding, i.e. they have highly frustrated regions. These regions, kept in place over evolutionary and physiological timescales, are related to several functional aspects of natural proteins such as protein-protein interactions, small ligand recognition, catalytic sites and allostery. Here we present FrustratometeR, an R package that easily computes local energetic frustration on a personal computer or a cluster. This package facilitates large-scale analysis of local frustration, point mutants and MD trajectories, allowing straightforward integration of local frustration analysis into pipelines for protein structural analysis. Availability and implementation: https://github.com/proteinphysiologylab/frustratometeR
