ProtFinder: finding subcellular locations of proteins using protein interaction networks

Interaction Networks ◽

Protein Subcellular Localization ◽

Human Protein Atlas ◽

Protein Subcellular Localization Prediction ◽

Localization Prediction ◽

Roc Score

Protein subcellular localization prediction plays a crucial role in improving our understanding of different diseases and consequently assists in building drug targeting and drug development pipelines. Proteins are known to co-exist at multiple subcellular locations, which makes the task of prediction extremely challenging. A protein interaction network is a graph that captures interactions between different proteins. It is safe to assume that if two proteins are interacting, they must share some subcellular locations. In this regard, we propose ProtFinder, the first deep learning-based model that exclusively relies on protein interaction networks to predict the multiple subcellular locations of proteins. We also integrate biological priors like the cellular component of Gene Ontology to make ProtFinder a more biology-aware intelligent system. ProtFinder is trained and tested using the STRING and BioPlex databases, whereas the annotations of proteins are obtained from the Human Protein Atlas. Our model gives an AUC-ROC score of 90.00% and an MCC score of 83.42% on a held-out set of proteins. We also apply ProtFinder to annotate proteins that currently do not have confident location annotations. We observe that ProtFinder is able to confirm some of these unreliable location annotations, while in some cases complementing the existing databases with novel location annotations.
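
The assumption that interacting proteins share locations can be illustrated with a simple neighbor-voting baseline. This is only a sketch of the underlying intuition, not ProtFinder's deep learning model; the function name `neighbor_vote`, the `min_fraction` threshold, and the toy protein identifiers are all hypothetical.

```python
from collections import Counter

def neighbor_vote(graph, known, protein, min_fraction=0.5):
    """Predict locations carried by at least min_fraction of annotated neighbors."""
    votes = Counter()
    annotated = 0
    for neighbor in graph.get(protein, ()):
        locations = known.get(neighbor)
        if locations:
            annotated += 1
            votes.update(set(locations))
    if annotated == 0:
        return set()  # no annotated neighbors, so no prediction
    return {loc for loc, n in votes.items() if n / annotated >= min_fraction}

# Toy interaction network (hypothetical identifiers and annotations).
graph = {"P1": ["P2", "P3", "P4"]}
known = {
    "P2": {"nucleus", "cytosol"},
    "P3": {"nucleus"},
    "P4": {"mitochondria"},
}
predicted = neighbor_vote(graph, known, "P1")
print(predicted)
```

Because the function returns a set, a protein can receive several locations at once, which matches the multi-label nature of the task described above.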

XlinkCyNET: a Cytoscape application for visualization of protein interaction networks based on cross-linking mass-spectrometry identifications

10.1101/2020.12.20.423654 ◽

2020 ◽

Author(s):

Diogo Borges Lima ◽

Ying Zhu ◽

Fan Liu

Keyword(s):

Mass Spectrometry ◽

Protein Interaction ◽

Large Scale ◽

Interaction Network ◽

Interaction Networks ◽

Cross Linking ◽

Protein Interaction Data ◽

Interaction Data ◽

Link Type

Software tools that allow visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a large selection of plugins for interpretation of protein interaction data. Chemical cross-linking coupled to mass spectrometry (XL-MS) is an increasingly important source for such interaction data, but there are currently no Cytoscape tools to analyze XL-MS results. In light of the suitability of the Cytoscape platform, but also to expand its toolbox, here we introduce XlinkCyNET, an open-source Cytoscape Java plugin for exploring large-scale XL-MS-based protein interaction networks. XlinkCyNET offers rapid and easy visualization of intra- and intermolecular cross-links and the locations of protein domains in a rectangular bar style, allowing subdomain-level interrogation of the interaction network. XlinkCyNET is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/xlinkcynet and at https://www.theliulab.com/software/xlinkcynet.

Discovering Network Motifs in Protein Interaction Networks

Biological Data Mining in Protein Interaction Networks ◽

10.4018/978-1-60566-398-2.ch008 ◽

2009 ◽

pp. 117-143 ◽

Cited By ~ 1

Author(s):

Raymond Wan ◽

Hiroshi Mamitsuka

Keyword(s):

Protein Interaction ◽

Graph Algorithms ◽

Undirected Graph ◽

Interaction Network ◽

Building Blocks ◽

Software Tools ◽

Interaction Networks ◽

Network Motifs

This chapter examines some of the available techniques for analyzing a protein interaction network (PIN) when depicted as an undirected graph. Within this graph, algorithms have been developed which identify “notable” smaller building blocks called network motifs. The authors examine these algorithms by dividing them into two broad categories based on two definitions of “notable”: (a) statistically-based methods and (b) frequency-based methods. They describe how these two classes of algorithms differ not only in terms of efficiency, but also in terms of the type of results that they report. Some publicly-available programs are demonstrated as part of their comparison. While most of the techniques are generic and were originally proposed for other types of networks, the focus of this chapter is on the application of these methods and software tools to PINs.
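
As a minimal sketch of the frequency-based view, the snippet below counts occurrences of one small motif, the triangle, in an undirected graph. It is an illustration only and does not reproduce any specific algorithm or program from the chapter; the function name `count_triangles` and the toy edge list are assumptions.

```python
from itertools import combinations

def count_triangles(edges):
    """Count triangles (3-node cliques) in an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    corner_hits = 0
    for node, neighbors in adj.items():
        # Two neighbors of a node that are themselves connected close a triangle.
        for a, b in combinations(neighbors, 2):
            if b in adj[a]:
                corner_hits += 1
    return corner_hits // 3  # each triangle is seen once from each of its 3 corners

# Toy PIN (hypothetical protein labels): triangles A-B-C and C-D-E.
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"), ("D", "E"), ("C", "E"), ("E", "F")]
triangle_count = count_triangles(edges)
print(triangle_count)
```

A statistically-based method would go one step further and compare such a count against counts obtained from randomized versions of the same network.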

Positive Selection and Centrality in the Yeast and Fly Protein-Protein Interaction Networks

BioMed Research International ◽

10.1155/2016/4658506 ◽

2016 ◽

Vol 2016 ◽

pp. 1-12 ◽

Cited By ~ 10

Author(s):

Sandip Chakraborty ◽

David Alvarez-Ponce

Keyword(s):

Positive Selection ◽

Protein Interaction ◽

Interaction Network ◽

Interaction Networks ◽

Protein Protein Interaction ◽

Protein Protein Interaction Networks ◽

Interspecific Comparisons ◽

High Degree

Proteins within a molecular network are expected to be subject to different selective pressures depending on their relative hierarchical positions. However, it is not obvious which genes within a network should be more likely to evolve under positive selection. On the one hand, only mutations at genes with a relatively high degree of control over adaptive phenotypes (such as those encoding highly connected proteins) are expected to be “seen” by natural selection. On the other hand, a high degree of pleiotropy at these genes is expected to hinder adaptation. Previous analyses of the human protein-protein interaction network have shown that genes under long-term, recurrent positive selection (as inferred from interspecific comparisons) tend to act at the periphery of the network. It is unknown, however, whether these trends apply to other organisms. Here, we show that long-term positive selection has preferentially targeted the periphery of the yeast interactome. Conversely, in flies, genes under positive selection encode significantly more connected and central proteins. These observations are not due to covariation of genes’ adaptability and centrality with confounding factors. Therefore, the distribution of proteins encoded by genes under recurrent positive selection across protein-protein interaction networks varies from one species to another.

Computational identification of signaling pathways in protein interaction networks

F1000Research ◽

10.12688/f1000research.7591.1 ◽

2015 ◽

Vol 4 ◽

pp. 1522

Author(s):

Angela U. Makolo ◽

Temitayo A. Olagunju

Keyword(s):

Signaling Pathways ◽

High Throughput ◽

Protein Interaction ◽

Interaction Network ◽

Interaction Networks ◽

Protein Interaction Data ◽

Interaction Data ◽

Protein Protein Interaction

Knowledge of signaling pathways is central to understanding the biological mechanisms of organisms, since it has been identified that in eukaryotic organisms the number of signaling pathways determines the number of ways the organism will react to external stimuli. Signaling pathways are studied using protein interaction networks constructed from protein-protein interaction data obtained from high-throughput experiments. However, these high-throughput methods are known to produce very high rates of false positive and false negative interactions. To construct a useful protein interaction network from this noisy data, computational methods are applied to validate the protein-protein interactions. In this study, a computational technique to identify signaling pathways from a protein interaction network constructed using validated protein-protein interaction data was designed. A weighted interaction graph of Saccharomyces cerevisiae was constructed. The weights were obtained using a Bayesian probabilistic network to estimate the posterior probability of interaction between two proteins given the gene expression measurement as biological evidence. Only interactions above a threshold were accepted for the network model. We were able to identify some pathway segments, one of which is a segment of the pathway that signals the start of the process of meiosis in S. cerevisiae.
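
The weighting and thresholding step can be sketched with a single application of Bayes' rule. This is a simplification under stated assumptions: the study uses a Bayesian probabilistic network, whereas the snippet below treats each candidate pair independently, and the prior, the likelihood values, the threshold, and the protein labels are all made-up illustrative numbers.

```python
def posterior_interaction(prior, lik_interact, lik_no_interact):
    """P(interaction | evidence) via Bayes' rule for a single candidate pair."""
    numerator = lik_interact * prior
    denominator = numerator + lik_no_interact * (1.0 - prior)
    return numerator / denominator

PRIOR = 0.1       # assumed prior probability that a random pair interacts
THRESHOLD = 0.5   # assumed cutoff for accepting an edge

# (protein pair) -> (P(evidence | interaction), P(evidence | no interaction))
candidates = {
    ("A", "B"): (0.9, 0.05),  # expression evidence strongly supports the pair
    ("A", "C"): (0.3, 0.40),  # evidence is uninformative or contrary
}

weights = {
    pair: posterior_interaction(PRIOR, l1, l0)
    for pair, (l1, l0) in candidates.items()
}
# Keep only the edges whose posterior exceeds the threshold.
accepted = {pair: w for pair, w in weights.items() if w > THRESHOLD}
print(accepted)
```

Here the first pair receives a posterior of 2/3 and is kept, while the second falls well below the cutoff and is dropped from the network model.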

The Implementation of Regularized Markov Clustering with Pigeon Inspired Optimization Algorithm in Analyzing the SARS-CoV-2 (COVID-19) Protein Interaction Network

Desimal Jurnal Matematika ◽

10.24042/djm.v3i3.6822 ◽

2020 ◽

Vol 3 (3) ◽

pp. 191-200

Author(s):

M. Syamsuddin Wisnubroto ◽

Marsudi Siburian ◽

Febri Dwi Irawati

Keyword(s):

Protein Interaction ◽

Optimization Algorithm ◽

Large Scale ◽

Clustering Algorithm ◽

Drug Research ◽

Interaction Network ◽

Interaction Networks ◽

Clustering Methods ◽

Markov Clustering

Proteins interact with other proteins, DNA, and other molecules, forming large-scale protein interaction networks; clustering methods are needed to make these networks easier to analyze. The Regularized Markov Clustering (RMCL) algorithm is an improvement of Markov Clustering (MCL) in which the expansion operation is replaced by new operations that update the flow distributions of each node. To reduce the weaknesses of RMCL optimization, the Pigeon Inspired Optimization (PIO) algorithm is used to replace the inflation parameters. The simulation results of the IPC SARS-CoV-2 (COVID-19) inflation parameters give 42 proteins as cluster centers and 8 protein pairs interacting with each other. The COVID-19 proteins that interact with 20 or more proteins are ORF8, NSP13, NSP7, M, N, ORF9C, NSP8, and NSP1. Their interactions might be used as targets for drug research.
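
For orientation, the snippet below sketches plain MCL, alternating expansion and inflation on a column-stochastic matrix. It does not implement the regularized update or the PIO-based choice of inflation parameters described in the abstract; the fixed `inflation` value, the iteration count, and the toy network are assumptions for illustration.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mcl(adjacency, inflation=2.0, iterations=30):
    """Plain Markov Clustering on a square 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then convert to a column-stochastic flow matrix.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)                                   # expansion
        m = normalize_columns([[x ** inflation for x in row]
                               for row in m])              # inflation
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return clusters

# Toy network: two disconnected triangles (nodes 0-2 and 3-5).
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adjacency)
print(sorted(sorted(c) for c in clusters))
```

In plain MCL a larger inflation value tends to produce more, smaller clusters, which is why the choice of this parameter is a natural target for an optimizer.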

PolarMapper: a computational tool for integrated visualization of protein interaction networks and mRNA expression data

Journal of The Royal Society Interface ◽

10.1098/rsif.2008.0407 ◽

2008 ◽

Vol 6 (39) ◽

pp. 881-896 ◽

Cited By ~ 10

Author(s):

Joana P. Gonçalves ◽

Mário Grãos ◽

André X.C.N. Valente

Keyword(s):

Mrna Expression ◽

Protein Interaction ◽

Interaction Network ◽

Interaction Networks ◽

System Level ◽

Heat Shock Gene ◽

Expression Data ◽

Mrna Expression Data

PolarMapper is a computational application for exposing the architecture of protein interaction networks. It facilitates the system-level analysis of mRNA expression data in the context of the underlying protein interaction network. Preliminary analysis of a human protein interaction network and comparison of the yeast oxidative stress and heat shock gene expression responses are addressed as case studies.

Individualized interactomes for network-based precision medicine in hypertrophic cardiomyopathy with implications for other clinical pathophenotypes

Nature Communications ◽

10.1038/s41467-021-21146-y ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Bradley A. Maron ◽

Rui-Sheng Wang ◽

Sergei Shevtsov ◽

Stavros G. Drakos ◽

Elena Arons ◽

...

Keyword(s):

Hypertrophic Cardiomyopathy ◽

Precision Medicine ◽

Protein Interaction ◽

Interaction Network ◽

Protein Protein Interaction Networks ◽

Janus Kinase ◽

Interaction Networks ◽

Patient Specific ◽

Protein Protein Interaction

Progress in precision medicine is limited by insufficient knowledge of transcriptomic or proteomic features in involved tissues that define pathobiological differences between patients. Here, myectomy tissue from patients with obstructive hypertrophic cardiomyopathy and heart failure is analyzed using RNA-Seq, and the results are used to develop individualized protein-protein interaction networks. From this approach, hypertrophic cardiomyopathy is distinguished from dilated cardiomyopathy based on the protein-protein interaction network pattern. Within the hypertrophic cardiomyopathy cohort, the patient-specific networks are variable in complexity, and enriched for 30 endophenotypes. The cardiac Janus kinase 2-Signal Transducer and Activator of Transcription 3-collagen 4A2 (JAK2-STAT3-COL4A2) expression profile informed by the networks was able to discriminate two hypertrophic cardiomyopathy patients with extreme fibrosis phenotypes. Patient-specific network features also associate with other important hypertrophic cardiomyopathy clinical phenotypes. These proof-of-concept findings introduce personalized protein-protein interaction networks (reticulotypes) for characterizing patient-specific pathobiology, thereby offering a direct strategy for advancing precision medicine.

The PathLinker app: Connect the dots in protein interaction networks

F1000Research ◽

10.12688/f1000research.9909.1 ◽

2017 ◽

Vol 6 ◽

pp. 58 ◽

Cited By ~ 12

Author(s):

Daniel P. Gil ◽

Jeffrey N. Law ◽

T. M. Murali

Keyword(s):

Transcription Factors ◽

Signaling Pathways ◽

Signaling Pathway ◽

Protein Interaction ◽

Interaction Network ◽

Interaction Networks ◽

Graph Theoretic ◽

Manual Curation

PathLinker is a graph-theoretic algorithm for reconstructing the interactions in a signaling pathway of interest. It efficiently computes multiple short paths within a background protein interaction network from the receptors to transcription factors (TFs) in a pathway. We originally developed PathLinker to complement manual curation of signaling pathways, which is slow and painstaking. The method can be used in general to connect any set of sources to any set of targets in an interaction network. The app presented here makes the PathLinker functionality available to Cytoscape users. We present an example where we used PathLinker to compute and analyze the network of interactions connecting proteins that are perturbed by the drug lovastatin.
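
The idea of connecting any set of sources to any set of targets can be sketched with a multi-source Dijkstra search that stops at the first target reached. This is a simplified illustration that returns only the single shortest path, whereas PathLinker computes multiple short paths; the function name, the edge costs, and the receptor and TF labels are hypothetical.

```python
import heapq

def shortest_source_to_target(edges, sources, targets):
    """Return (cost, path) of the cheapest path from any source to any target."""
    adj = {}
    for u, v, w in edges:  # directed, non-negative edge costs
        adj.setdefault(u, []).append((v, w))
    # Seeding the queue with every source acts like one virtual super-source.
    heap = [(0.0, s, (s,)) for s in sources]
    heapq.heapify(heap)
    settled = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node in targets:
            return cost, list(path)
        for nxt, w in adj.get(node, []):
            if nxt not in settled:
                heapq.heappush(heap, (cost + w, nxt, path + (nxt,)))
    return None  # no target is reachable from any source

# Toy background network (hypothetical receptors R*, intermediates, and TFs).
edges = [
    ("R1", "A", 1.0), ("A", "TF1", 1.0),
    ("R1", "B", 0.5), ("B", "C", 0.5), ("C", "TF1", 2.0),
    ("R2", "TF2", 5.0),
]
result = shortest_source_to_target(edges, {"R1", "R2"}, {"TF1", "TF2"})
print(result)
```

Extending this sketch to several paths would require a k-shortest-paths procedure, which is beyond what is shown here.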

RNP-MaP: In-cell analysis of protein interaction networks defines functional hubs in RNA

10.1101/2020.02.07.939108 ◽

2020 ◽

Cited By ~ 2

Author(s):

Chase A. Weidmann ◽

Anthony M. Mustoe ◽

Parth B. Jariwala ◽

J. Mauro Calabrese ◽

Kevin M. Weeks

Keyword(s):

Protein Interaction ◽

Protein Interactions ◽

Noncoding Rna ◽

Interaction Network ◽

Interaction Networks ◽

E Region ◽

Nucleotide Resolution ◽

Map Data ◽

In Cells

RNAs interact with networks of proteins to form complexes (RNPs) that govern many biological processes, but these networks are currently impossible to examine in a comprehensive way. We developed a live-cell chemical probing strategy for mapping protein interaction networks in any RNA with single-nucleotide resolution. This RNP-MaP strategy (RNP network analysis by mutational profiling) simultaneously detects binding by and cooperative interactions involving multiple proteins with single RNA molecules. RNP-MaP revealed that two structurally related, but sequence-divergent noncoding RNAs, RNase P and RMRP, share nearly identical RNP networks and, further, that protein interaction network hubs identify function-critical sites in these RNAs. RNP-MaP identified numerous protein interaction networks within the XIST long noncoding RNA that are conserved between mouse and human RNAs and distinguished communities of proteins that network together on XIST. RNP-MaP data show that the Xist E region is densely networked by protein interactions and that PTBP1, MATR3, and TIA1 proteins each interface with the XIST E region via two distinct interaction modes; and we find that the XIST E region is sufficient to mediate RNA foci formation in cells. RNP-MaP will enable discovery and mechanistic analysis of protein interaction networks across any RNA in cells.

Super.Complex: A supervised machine learning pipeline for molecular complex detection in protein-interaction networks

10.1101/2021.06.22.449395 ◽

2021 ◽

Author(s):

Meghana Venkata Palukuri ◽

Edward M Marcotte

Keyword(s):

Machine Learning ◽

Community Detection ◽

Protein Interaction ◽

Protein Complexes ◽

Fitness Function ◽

Interaction Network ◽

Interaction Networks ◽

Supervised Machine Learning

Protein complexes can be computationally identified from protein-interaction networks with community detection methods, suggesting new multi-protein assemblies. Most community detection algorithms tend to be un- or semi-supervised and assume that communities are dense network subgraphs, which is not always true, as protein complexes can exhibit diverse network topologies. The few existing supervised machine learning methods are serial and can potentially be improved in terms of accuracy and scalability by using better-suited machine learning models and by using parallel algorithms, respectively. Here, we present Super.Complex, a distributed supervised machine learning pipeline for community detection in networks. Super.Complex learns a community fitness function from known communities using an AutoML method and applies this fitness function to detect new communities. A heuristic local search algorithm finds maximally scoring communities with epsilon-greedy and pseudo-metropolis criteria, and an embarrassingly parallel implementation can be run on a computer cluster for scaling to large networks. In order to evaluate Super.Complex, we propose three new measures for the still outstanding issue of comparing sets of learned and known communities. On a yeast protein-interaction network, Super.Complex outperforms 6 other supervised and 4 unsupervised methods. Application of Super.Complex to a human protein-interaction network with ~8k nodes and ~60k edges yields 1,028 protein complexes, with 234 complexes linked to SARS-CoV-2, with 111 uncharacterized proteins present in 103 learned complexes. Super.Complex is generalizable and can be used in different applications of community detection, with the ability to improve results by incorporating domain-specific features. Learned community characteristics can also be transferred from existing applications to detect communities in a new application with no known communities. 
Code and interactive visualizations of learned human protein complexes are freely available at: https://sites.google.com/view/supercomplex/super-complex-v3-0