An interactome landscape of SARS-CoV-2 virus-human protein-protein interactions by protein sequence-based multi-label classifiers

Mapping Intimacies ◽

10.1101/2021.11.07.467640 ◽

2021 ◽

Author(s):

Ho-Joon Lee

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Smooth Muscle Contraction ◽

Histone H2a ◽

Protein Protein Interactions ◽

Proteomics Data ◽

Global Pandemic ◽

Novel Approach ◽

Human Proteins ◽

Significant Enrichment

The new coronavirus species, SARS-CoV-2, has caused an unprecedented global pandemic of COVID-19 since late December 2019. A comprehensive characterization of protein-protein interactions (PPIs) between SARS-CoV-2 and human cells is key to understanding the infection and preventing the disease. Here we present a novel approach to predicting virus-host PPIs with multi-label machine learning classifiers (random forests and XGBoost) that use amino acid composition profiles of virus and human proteins. Our models harness the large-scale Viruses.STRING database, with >80,000 virus-host PPIs and their evidence scores, for multi-level evidence prediction, which is distinct from the binary interaction prediction of previous studies. Our multi-label classifiers are based on 5 evidence levels binned from evidence scores. Our best model, XGBoost, achieves 74% AUC and 68% accuracy on average in 10-fold cross validation. The most important amino acids are cysteine and histidine. In addition, our model predicts higher evidence levels for experimental PPIs than for text mining-based PPIs. We then predict evidence levels of ~2,000 SARS-CoV-2 virus-human PPIs from public experimental proteomics data. Interactions with SARS-CoV-2 Nsp7b show high evidence. We also predict evidence levels of all ~550,000 pairwise PPIs between the SARS-CoV-2 and human proteomes to provide a draft virus-host interactome landscape for SARS-CoV-2 infection in humans in a comprehensive and unbiased way in silico. Most human proteins among the 140 highest-evidence predictions interact with SARS-CoV-2 Nsp7, Nsp1, and ORF14, with significant enrichment in the top 2 pathways of vascular smooth muscle contraction (CALD1, NPR2, CALML3) and Myc targets (CBX3, PES1). Our prediction also suggests that histone H2A components are targeted by multiple SARS-CoV-2 proteins.
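The feature and label construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0 to 1000 score range, and the equal-width bins are assumptions made here for illustration, since the abstract does not state how the 5 evidence levels were binned. The sequences in the demo are made-up fragments, not real proteins.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# profile has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_features(virus_seq, human_seq):
    """Concatenate the two composition profiles into one 40-value vector."""
    return aa_composition(virus_seq) + aa_composition(human_seq)


def evidence_level(score, n_levels=5, max_score=1000):
    """Bin an evidence score into levels 1..n_levels.

    Assumption: equal-width bins over [0, max_score]; the abstract
    does not give the actual bin edges.
    """
    if not 0 <= score <= max_score:
        raise ValueError("score outside the assumed range")
    width = max_score / n_levels
    return min(int(score // width) + 1, n_levels)


# Hypothetical sequence fragments and score, for illustration only.
features = pair_features("MKCCHHAL", "MSTNPKPQRK")
label = evidence_level(430)
```

A feature vector and level built this way could then be passed to any multi-class classifier, such as the random forest or XGBoost models named in the abstract.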

On the structure of protein–protein interaction networks

Biochemical Society Transactions ◽

10.1042/bst0311491 ◽

2003 ◽

Vol 31 (6) ◽

pp. 1491-1496 ◽

Cited By ~ 47

Author(s):

A. Thomas ◽

R. Cannings ◽

N.A.M. Monk ◽

C. Cannings

Keyword(s):

Power Law ◽

Protein Interactions ◽

Large Scale ◽

Underlying Structure ◽

Protein Protein Interactions ◽

Yeast Two Hybrid ◽

Protein Protein Interaction ◽

Human Proteins ◽

Approximate Power ◽

Two Hybrid

We present a simple model for the underlying structure of protein–protein pairwise interaction graphs that is based on the way in which proteins attach to each other in experiments such as yeast two-hybrid assays. We show that data on the interactions of human proteins lend support to this model. The frequency of the number of connections per protein under this model does not follow a power law, in contrast to the reported behaviour of data from large-scale yeast two-hybrid screens of yeast protein–protein interactions. Sampling sub-graphs from the underlying graphs generated with our model, in a way analogous to the sampling performed in large-scale yeast two-hybrid searches, gives degree distributions that differ subtly from the power law and that fit the observed data better than the power law itself. Our results show that the observation of approximate power law behaviour in a sampled sub-graph does not imply that the underlying graph follows a power law.
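The sampling argument in this abstract can be explored with a small sketch. The underlying graph below is a plain random graph used only as a stand-in, not the attachment model the authors propose, and the bait-based sampling rule, the detection probability, and all parameter values are assumptions chosen for illustration.

```python
import random
from collections import Counter


def random_graph(n, p, rng):
    """Stand-in underlying graph: each pair is linked with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < p]


def bait_sample(edges, n, n_baits, detect_prob, rng):
    """Keep only edges touching a chosen bait, each seen with detect_prob.

    This loosely mimics a two-hybrid screen in which only some proteins
    are used as baits and not every true interaction is detected.
    """
    baits = set(rng.sample(range(n), n_baits))
    observed = [(u, v) for (u, v) in edges
                if (u in baits or v in baits) and rng.random() < detect_prob]
    return baits, observed


def degree_distribution(edges):
    """Map each degree k to the number of nodes that have degree k."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


rng = random.Random(0)  # fixed seed so the run is repeatable
full = random_graph(200, 0.05, rng)
baits, seen = bait_sample(full, 200, 40, 0.5, rng)
full_dist = degree_distribution(full)
seen_dist = degree_distribution(seen)
```

Comparing `full_dist` with `seen_dist` shows how the degree distribution of a sampled sub-graph can differ in shape from that of the graph it was drawn from, which is the point the abstract makes about approximate power-law behaviour.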

A modular toolkit to inhibit proline-rich motif–mediated protein–protein interactions

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1422054112 ◽

2015 ◽

Vol 112 (16) ◽

pp. 5011-5016 ◽

Cited By ~ 25

Author(s):

Robert Opitz ◽

Matthias Müller ◽

Cédric Reuter ◽

Matthias Barone ◽

Arne Soicke ◽

...

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Focal Adhesions ◽

Protein Protein Interactions ◽

Proteomics Data ◽

General Applicability ◽

Genomics And Proteomics ◽

Domain Interactions ◽

Src Homology ◽

Recognition Motifs

Small-molecule competitors of protein–protein interactions are urgently needed for functional analysis of large-scale genomics and proteomics data. Particularly abundant, yet so far undruggable, targets include domains specialized in recognizing proline-rich segments, including Src-homology 3 (SH3), WW, GYF, and Drosophila enabled (Ena)/vasodilator-stimulated phosphoprotein (VASP) homology 1 (EVH1) domains. Here, we present a modular strategy to obtain an extendable toolkit of chemical fragments (ProMs) designed to replace pairs of conserved prolines in recognition motifs. As proof-of-principle, we developed a small, selective, peptidomimetic inhibitor of Ena/VASP EVH1 domain interactions. Highly invasive MDA MB 231 breast-cancer cells treated with this ligand showed displacement of VASP from focal adhesions, as well as from the front of lamellipodia, and strongly reduced cell invasion. General applicability of our strategy is illustrated by the design of an ErbB4-derived ligand containing two ProM-1 fragments, targeting the yes-associated protein 1 (YAP1)-WW domain with a fivefold higher affinity.

Target-Templated de novo Design of Macrocyclic D-/L-Peptides: Inhibitors of the PD-1/PD-L1 Interaction

10.26434/chemrxiv.11663337.v3 ◽

2020 ◽

Author(s):

Salvador Guardiola ◽

Monica Varese ◽

Xavier Roig ◽

Jesús Garcia ◽

Ernest Giralt

Keyword(s):

Protein Interactions ◽

Cyclic Peptides ◽

General Framework ◽

Large Scale ◽

De Novo ◽

Inhibitory Effect ◽

Original Text ◽

Protein Protein Interactions ◽

Retraction Notice ◽

Pharmaceutical Properties

NOTE: This preprint has been retracted by consensus from all authors. See the retraction notice in place above; the original text can be found under "Version 1", accessible from the version selector above.

Peptides, together with antibodies, are among the most potent biochemical tools to modulate challenging protein-protein interactions. However, current structure-based methods are largely limited to natural peptides and are not suitable for designing target-specific binders with improved pharmaceutical properties, such as macrocyclic peptides. Here we report a general framework that leverages the computational power of Rosetta for large-scale backbone sampling and energy scoring, followed by side-chain composition, to design heterochiral cyclic peptides that bind to a protein surface of interest. To showcase the applicability of our approach, we identified two peptides (PD-i3 and PD-i6) that target PD-1, a key immune checkpoint, and work as protein ligand decoys. A comprehensive biophysical evaluation confirmed their binding mechanism to PD-1 and their inhibitory effect on the PD-1/PD-L1 interaction. Finally, elucidation of their solution structures by NMR served as validation of our de novo design approach. We anticipate that our results will provide a general framework for designing target-specific drug-like peptides.

Target-Templated de novo Design of Macrocyclic D-/L-Peptides: Inhibitors of the PD-1/PD-L1 Interaction

10.26434/chemrxiv.11663337 ◽

2020 ◽

Author(s):

Salvador Guardiola ◽

Monica Varese ◽

Xavier Roig ◽

Jesús Garcia ◽

Ernest Giralt

Keyword(s):

Protein Interactions ◽

Cyclic Peptides ◽

General Framework ◽

Large Scale ◽

De Novo ◽

Inhibitory Effect ◽

Original Text ◽

Protein Protein Interactions ◽

Retraction Notice ◽

Pharmaceutical Properties

Faculty Opinions recommendation of Comparative assessment of large-scale data sets of protein-protein interactions.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1006598.82257 ◽

2002 ◽

Author(s):

Rob Russell

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Comparative Assessment ◽

Data Sets ◽

Protein Protein Interactions ◽

Large Scale Data ◽

Scale Data ◽

Large Scale Data Sets

Short loop functional commonality identified in leukaemia proteome highlights crucial protein sub-networks

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab010 ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Sun Sook Chung ◽

Joseph C F Ng ◽

Anna Laddach ◽

N Shaun B Thomas ◽

Franca Fraternali

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Interaction Network ◽

Protein Protein Interactions ◽

Protein Protein Interaction ◽

Ppi Networks ◽

Short Loop ◽

New Strategy ◽

Loop Network ◽

Protein Protein Interaction Network

Direct drug targeting of mutated proteins in cancer is not always possible and efficacy can be nullified by compensating protein–protein interactions (PPIs). Here, we establish an in silico pipeline to identify specific PPI sub-networks containing mutated proteins as potential targets, which we apply to mutation data of four different leukaemias. Our method is based on extracting cyclic interactions of a small number of proteins topologically and functionally linked in the Protein–Protein Interaction Network (PPIN), which we call short loop network motifs (SLM). We uncover a new property of PPINs named ‘short loop commonality’ to measure indirect PPIs occurring via common SLM interactions. This detects ‘modules’ of PPI networks enriched with annotated biological functions of proteins containing mutation hotspots, exemplified by FLT3 and other receptor tyrosine kinase proteins. We further identify functional dependency or mutual exclusivity of short loop commonality pairs in large-scale cellular CRISPR–Cas9 knockout screening data. Our pipeline provides a new strategy for identifying new therapeutic targets for drug discovery.
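The idea of extracting cyclic interactions among a small number of proteins can be sketched for the simplest case, loops of three proteins. This is an illustrative simplification and not the authors' pipeline: the abstract does not give the loop lengths used or the exact definition of short loop commonality, so the per-protein count below is only a rough proxy. The protein names `P1` to `P4` and their edges are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-interactions."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def three_protein_loops(edges):
    """Return every set of three mutually interacting proteins, sorted."""
    adj = build_adjacency(edges)
    loops = set()
    for u in adj:
        # Any two neighbours of u that also interact close a loop with u.
        for v, w in combinations(sorted(adj[u]), 2):
            if w in adj[v]:
                loops.add(tuple(sorted((u, v, w))))
    return sorted(loops)


def loops_per_protein(loops):
    """Count how many loops each protein takes part in."""
    return Counter(p for loop in loops for p in loop)


# Hypothetical toy network, for illustration only.
toy_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"),
             ("P3", "P4"), ("P2", "P4")]
toy_loops = three_protein_loops(toy_edges)
toy_counts = loops_per_protein(toy_loops)
```

In this toy network `P2` and `P3` each sit in both loops, which is the kind of shared loop membership that a commonality measure would pick up.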

Investigating the Role of Large-Scale Domain Dynamics in Protein-Protein Interactions

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2016.00054 ◽

2016 ◽

Vol 3 ◽

Cited By ~ 8

Author(s):

Elise Delaforge ◽

Sigrid Milles ◽

Jie-rong Huang ◽

Denis Bouvier ◽

Malene Ringkjøbing Jensen ◽

...

Keyword(s):

Protein Interactions ◽

Large Scale ◽

Protein Protein Interactions ◽

Domain Dynamics

A MapReduce-Based Parallel Random Forest Approach for Predicting Large-Scale Protein-Protein Interactions

Intelligent Computing Methodologies - Lecture Notes in Computer Science ◽

10.1007/978-3-030-60796-8_34 ◽

2020 ◽

pp. 400-407

Author(s):

Bo-Ya Ji ◽

Zhu-Hong You ◽

Long Yang ◽

Ji-Ren Zhou ◽

Peng-Wei Hu

Keyword(s):

Random Forest ◽

Protein Interactions ◽

Large Scale ◽

Protein Protein Interactions

A computational interactome and functional annotation for the human proteome

eLife ◽

10.7554/elife.18715 ◽

2016 ◽

Vol 5 ◽

Cited By ~ 32

Author(s):

José Ignacio Garzón ◽

Lei Deng ◽

Diana Murray ◽

Sagi Shapira ◽

Donald Petrey ◽

...

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Enrichment Analysis ◽

Human Proteome ◽

Gene Set Enrichment Analysis ◽

Protein Protein Interactions ◽

Functional Relationships ◽

Structural Relationships ◽

Human Proteins ◽

Validation Tests

We present a database, PrePPI (Predicting Protein-Protein Interactions), of more than 1.35 million predicted protein-protein interactions (PPIs). Of these at least 127,000 are expected to constitute direct physical interactions although the actual number may be much larger (~500,000). The current PrePPI, which contains predicted interactions for about 85% of the human proteome, is related to an earlier version but is based on additional sources of interaction evidence and is far larger in scope. The use of structural relationships allows PrePPI to infer numerous previously unreported interactions. PrePPI has been subjected to a series of validation tests including reproducing known interactions, recapitulating multi-protein complexes, analysis of disease associated SNPs, and identifying functional relationships between interacting proteins. We show, using Gene Set Enrichment Analysis (GSEA), that predicted interaction partners can be used to annotate a protein’s function. We provide annotations for most human proteins, including many annotated as having unknown function.

H2V, a Database for Human Proteins and Genes in Response to SARS-CoV-2, SARS-CoV, and MERS-CoV Infection

10.21203/rs.3.rs-61338/v1 ◽

2020 ◽

Author(s):

Nan Zhou ◽

Jinku Bao ◽

Yuping Ning

Keyword(s):

Differentially Expressed Genes ◽

Protein Interactions ◽

Viral Infections ◽

Antiviral Agents ◽

Differentially Expressed ◽

Easy Access ◽

Differentially Expressed Proteins ◽

Protein Protein Interactions ◽

Associated Proteins ◽

Human Proteins

The ongoing global COVID-19 pandemic is caused by SARS-CoV-2, a new coronavirus first discovered at the end of 2019. It has led to more than 10 million confirmed cases and more than 500,000 confirmed deaths across 216 countries by 1 July 2020, according to WHO statistics. SARS-CoV-2, SARS-CoV, and MERS-CoV are alike in killing people, impairing economies, and inflicting long-term impacts on society. However, no specific drug or vaccine has been approved as a cure for these viruses. Efforts to develop antiviral measures are hampered by insufficient understanding of human molecular responses to viral infections. In this study, we collected experimentally validated human proteins that interact with SARS-CoV-2 proteins, human proteins whose expression, translation and phosphorylation levels change significantly after SARS-CoV-2 or SARS-CoV infection, human proteins that correlate with COVID-19 severity, and human genes whose expression levels change significantly upon SARS-CoV-2 or MERS-CoV infection. A database, H2V, was then developed for easy access to these data. Currently H2V includes: 332 human-SARS-CoV-2 protein-protein interactions; 65 differentially expressed proteins, 232 differentially translated proteins, 1298 differentially phosphorylated proteins, 204 severity associated proteins, and 4012 differentially expressed genes responding to SARS-CoV-2 infection; 66 differentially expressed proteins responding to SARS-CoV infection; and 6981 differentially expressed genes responding to MERS-CoV infection. H2V can help to understand the cellular responses associated with SARS-CoV-2, SARS-CoV and MERS-CoV infection. It is expected to speed up the development of antiviral agents and to inform preparation for potential future coronavirus emergencies. Database URL: http://www.zhounan.org/h2v
