ProtFinder: finding subcellular locations of proteins using protein interaction networks

2022 ◽  
Author(s):  
Aayush Grover ◽  
Laurent Gatto

Protein subcellular localization prediction plays a crucial role in improving our understanding of different diseases and consequently assists in building drug targeting and drug development pipelines. Proteins are known to co-exist at multiple subcellular locations, which makes the task of prediction extremely challenging. A protein interaction network is a graph that captures interactions between different proteins. It is safe to assume that if two proteins are interacting, they must share some subcellular locations. In this regard, we propose ProtFinder, the first deep learning-based model that relies exclusively on protein interaction networks to predict the multiple subcellular locations of proteins. We also integrate biological priors like the cellular component of Gene Ontology to make ProtFinder a more biology-aware intelligent system. ProtFinder is trained and tested using the STRING and BioPlex databases, whereas the annotations of proteins are obtained from the Human Protein Atlas. Our model gives an AUC-ROC score of 90.00% and an MCC score of 83.42% on a held-out set of proteins. We also apply ProtFinder to annotate proteins that currently do not have confident location annotations. We observe that ProtFinder is able to confirm some of these unreliable location annotations, while in some cases complementing the existing databases with novel location annotations.
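The assumption that interacting proteins share locations can be illustrated with a much simpler baseline than the deep learning model described above. The sketch below is not ProtFinder: it is a hypothetical neighbour-vote rule, and the function name `predict_locations`, the `min_fraction` cutoff, and the toy proteins and annotations are all made up for illustration.

```python
from collections import Counter

def predict_locations(network, known, protein, min_fraction=0.5):
    """Multi-label neighbour vote: keep every location carried by at least
    min_fraction of the protein's annotated interaction partners."""
    neighbours = [n for n in network.get(protein, ()) if n in known]
    if not neighbours:
        return set()  # no annotated partner, so nothing to propagate
    # known[n] is a set, so each neighbour votes at most once per location
    votes = Counter(loc for n in neighbours for loc in known[n])
    return {loc for loc, count in votes.items()
            if count / len(neighbours) >= min_fraction}

# Toy interaction network and annotations (invented, not from STRING or HPA)
network = {"P1": {"P2", "P3", "P4"}}
known = {
    "P2": {"nucleus"},
    "P3": {"nucleus", "cytosol"},
    "P4": {"nucleus", "cytosol"},
}

print(predict_locations(network, known, "P1"))
```

Raising `min_fraction` makes the rule stricter: a location is then predicted only when nearly all annotated partners agree on it.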

2020 ◽  
Author(s):  
Diogo Borges Lima ◽  
Ying Zhu ◽  
Fan Liu

Software tools that allow visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a large selection of plugins for interpretation of protein interaction data. Chemical cross-linking coupled to mass spectrometry (XL-MS) is an increasingly important source for such interaction data, but there are currently no Cytoscape tools to analyze XL-MS results. Given the suitability of the Cytoscape platform, and to expand its toolbox, we introduce XlinkCyNET, an open-source Cytoscape Java plugin for exploring large-scale XL-MS-based protein interaction networks. XlinkCyNET offers rapid and easy visualization of intra- and intermolecular cross-links and the locations of protein domains in a rectangular bar style, allowing subdomain-level interrogation of the interaction network. XlinkCyNET is freely available from the Cytoscape app store: http://apps.cytoscape.org/apps/xlinkcynet and at https://www.theliulab.com/software/xlinkcynet.


Author(s):  
Raymond Wan ◽  
Hiroshi Mamitsuka

This chapter examines some of the available techniques for analyzing a protein interaction network (PIN) when depicted as an undirected graph. Within this graph, algorithms have been developed which identify “notable” smaller building blocks called network motifs. The authors examine these algorithms by dividing them into two broad categories based on two definitions of “notable”: (a) statistically-based methods and (b) frequency-based methods. They describe how these two classes of algorithms differ not only in terms of efficiency, but also in terms of the type of results that they report. Some publicly available programs are demonstrated as part of their comparison. While most of the techniques are generic and were originally proposed for other types of networks, the focus of this chapter is on the application of these methods and software tools to PINs.
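As a minimal illustration of the frequency-based idea, the sketch below counts occurrences of one small motif, the triangle, in an undirected graph. It is not one of the programs compared in the chapter; the helper names and the toy edge list are assumptions made for this example. A statistically-based method would additionally compare such a count against counts in randomized networks, which is not shown here.

```python
from itertools import combinations

def build_adjacency(edges):
    """Turn an undirected edge list into a symmetric adjacency mapping."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def count_triangles(adjacency):
    """Count 3-node cliques. Every triangle is seen once from each of its
    three corners, hence the final division by 3."""
    seen = 0
    for neighbours in adjacency.values():
        for a, b in combinations(sorted(neighbours), 2):
            if b in adjacency[a]:
                seen += 1
    return seen // 3

# Toy PIN: one triangle (A, B, C) plus a pendant protein D
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
adjacency = build_adjacency(edges)
print(count_triangles(adjacency))
```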


2016 ◽  
Vol 2016 ◽  
pp. 1-12 ◽  
Author(s):  
Sandip Chakraborty ◽  
David Alvarez-Ponce

Proteins within a molecular network are expected to be subject to different selective pressures depending on their relative hierarchical positions. However, it is not obvious which genes within a network should be more likely to evolve under positive selection. On one hand, only mutations at genes with a relatively high degree of control over adaptive phenotypes (such as those encoding highly connected proteins) are expected to be “seen” by natural selection. On the other hand, a high degree of pleiotropy at these genes is expected to hinder adaptation. Previous analyses of the human protein-protein interaction network have shown that genes under long-term, recurrent positive selection (as inferred from interspecific comparisons) tend to act at the periphery of the network. It is unknown, however, whether these trends apply to other organisms. Here, we show that long-term positive selection has preferentially targeted the periphery of the yeast interactome. Conversely, in flies, genes under positive selection encode significantly more connected and central proteins. These observations are not due to covariation of genes’ adaptability and centrality with confounding factors. Therefore, the distribution of proteins encoded by genes under recurrent positive selection across protein-protein interaction networks varies from one species to another.


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 1522
Author(s):  
Angela U. Makolo ◽  
Temitayo A. Olagunju

The knowledge of signaling pathways is central to understanding the biological mechanisms of organisms since it has been identified that in eukaryotic organisms, the number of signaling pathways determines the number of ways the organism will react to external stimuli. Signaling pathways are studied using protein interaction networks constructed from protein-protein interaction data obtained from high-throughput experiments. However, these high-throughput methods are known to produce very high rates of false positive and negative interactions. To construct a useful protein interaction network from this noisy data, computational methods are applied to validate the protein-protein interactions. In this study, a computational technique to identify signaling pathways from a protein interaction network constructed using validated protein-protein interaction data was designed. A weighted interaction graph of Saccharomyces cerevisiae was constructed. The weights were obtained using a Bayesian probabilistic network to estimate the posterior probability of interaction between two proteins given the gene expression measurement as biological evidence. Only interactions above a threshold were accepted for the network model. We were able to identify some pathway segments, one of which is a segment of the pathway that signals the start of the process of meiosis in S. cerevisiae.
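The weighting and thresholding step can be sketched with plain Bayes' rule over two hypotheses (interaction or no interaction). This is a simplification and not the Bayesian network used in the study: the prior, the likelihood values, the threshold, and the protein names below are invented for illustration.

```python
def posterior_interaction(prior, lik_interact, lik_no_interact):
    """P(interaction | evidence) from Bayes' rule with two hypotheses."""
    numerator = prior * lik_interact
    evidence = numerator + (1.0 - prior) * lik_no_interact
    return numerator / evidence if evidence else 0.0

def build_weighted_graph(candidates, prior, threshold):
    """Keep only candidate edges whose posterior exceeds the threshold;
    the posterior becomes the edge weight."""
    graph = {}
    for (a, b), (lik_interact, lik_no_interact) in candidates.items():
        weight = posterior_interaction(prior, lik_interact, lik_no_interact)
        if weight > threshold:
            graph[(a, b)] = weight
    return graph

# Hypothetical candidate pairs with made-up likelihoods of the observed
# expression evidence: (P(evidence | interact), P(evidence | no interact))
candidates = {
    ("ProtA", "ProtB"): (0.8, 0.2),
    ("ProtA", "ProtC"): (0.3, 0.6),
}
graph = build_weighted_graph(candidates, prior=0.5, threshold=0.5)
print(graph)
```

With these toy numbers only the first pair survives the cutoff, since its posterior is 0.8 while the second pair falls to about 0.33.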


2020 ◽  
Vol 3 (3) ◽  
pp. 191-200
Author(s):  
M. Syamsuddin Wisnubroto ◽  
Marsudi Siburian ◽  
Febri Dwi Irawati

Proteins interact with other proteins, DNA, and other molecules, forming large-scale protein interaction networks, and clustering methods are needed to make these networks easier to analyze. The regularized Markov clustering algorithm (RMCL) is an improvement of MCL in which the expansion operation is replaced by a new operation that updates the flow distribution of each node. To reduce the weaknesses of the RMCL optimization, the Pigeon-Inspired Optimization algorithm (PIO) is used to replace the inflation parameters. The simulation results for the IPC SARS-CoV-2 (COVID-19) inflation parameters give 42 proteins as cluster centers and 8 protein pairs interacting with each other. The COVID-19 proteins that interact with 20 or more proteins are ORF8, NSP13, NSP7, M, N, ORF9C, NSP8, and NSP1. Their interactions might be used as targets for drug research.
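The two operations mentioned above can be sketched in plain Python. This is a hedged illustration, not the authors' implementation: it assumes the regularize step multiplies the current flow matrix by the column-normalized adjacency matrix (with self-loops added), it uses a fixed inflation parameter `r` where the paper tunes it with PIO, and the toy graph is invented.

```python
def normalize_columns(matrix):
    """Scale each column to sum to 1 (column-stochastic flow matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]

def inflate(matrix, r):
    """Inflation: raise every entry to the power r, then renormalize,
    which strengthens strong flows and weakens weak ones."""
    return normalize_columns([[x ** r for x in row] for row in matrix])

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rmcl_attractors(adjacency, r=2.0, iterations=20):
    """Alternate regularize and inflate; return, for each node, the index
    of the row that holds the most flow in that node's column."""
    n = len(adjacency)
    with_loops = [[adjacency[i][j] + (1.0 if i == j else 0.0)
                   for j in range(n)] for i in range(n)]
    canonical = normalize_columns(with_loops)
    flow = canonical
    for _ in range(iterations):
        flow = matmul(flow, canonical)  # regularize (assumed form)
        flow = inflate(flow, r)
    return [max(range(n), key=lambda i: flow[i][j]) for j in range(n)]

# Toy network: two disconnected triangles, nodes 0-2 and nodes 3-5
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = rmcl_attractors(adjacency)
print(labels)
```

Nodes that share an attractor index are read as one cluster. Because flow never crosses between disconnected components, the two triangles here cannot end up in the same cluster.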


2008 ◽  
Vol 6 (39) ◽  
pp. 881-896 ◽  
Author(s):  
Joana P. Gonçalves ◽  
Mário Grãos ◽  
André X.C.N. Valente

PolarMapper is a computational application for exposing the architecture of protein interaction networks. It facilitates the system-level analysis of mRNA expression data in the context of the underlying protein interaction network. Preliminary analysis of a human protein interaction network and comparison of the yeast oxidative stress and heat shock gene expression responses are addressed as case studies.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Bradley A. Maron ◽  
Rui-Sheng Wang ◽  
Sergei Shevtsov ◽  
Stavros G. Drakos ◽  
Elena Arons ◽  
...  

Progress in precision medicine is limited by insufficient knowledge of transcriptomic or proteomic features in involved tissues that define pathobiological differences between patients. Here, myectomy tissue from patients with obstructive hypertrophic cardiomyopathy and heart failure is analyzed using RNA-Seq, and the results are used to develop individualized protein-protein interaction networks. From this approach, hypertrophic cardiomyopathy is distinguished from dilated cardiomyopathy based on the protein-protein interaction network pattern. Within the hypertrophic cardiomyopathy cohort, the patient-specific networks are variable in complexity, and enriched for 30 endophenotypes. The cardiac Janus kinase 2-Signal Transducer and Activator of Transcription 3-collagen 4A2 (JAK2-STAT3-COL4A2) expression profile informed by the networks was able to discriminate two hypertrophic cardiomyopathy patients with extreme fibrosis phenotypes. Patient-specific network features also associate with other important hypertrophic cardiomyopathy clinical phenotypes. These proof-of-concept findings introduce personalized protein-protein interaction networks (reticulotypes) for characterizing patient-specific pathobiology, thereby offering a direct strategy for advancing precision medicine.


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 58 ◽  
Author(s):  
Daniel P. Gil ◽  
Jeffrey N. Law ◽  
T. M. Murali

PathLinker is a graph-theoretic algorithm for reconstructing the interactions in a signaling pathway of interest. It efficiently computes multiple short paths within a background protein interaction network from the receptors to transcription factors (TFs) in a pathway. We originally developed PathLinker to complement manual curation of signaling pathways, which is slow and painstaking. The method can be used in general to connect any set of sources to any set of targets in an interaction network. The app presented here makes the PathLinker functionality available to Cytoscape users. We present an example where we used PathLinker to compute and analyze the network of interactions connecting proteins that are perturbed by the drug lovastatin.
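The idea of computing multiple short paths from a set of sources to a set of targets can be sketched with a best-first enumeration of loop-free paths. This is not PathLinker's algorithm or the Cytoscape app's API; the function name, the rule that a path ends at the first target it reaches, and the toy weighted network are assumptions made for illustration. The enumeration is exponential in the worst case and only suitable for small graphs.

```python
import heapq

def k_shortest_paths(graph, sources, targets, k):
    """Return up to k loop-free source-to-target paths in order of
    non-decreasing total cost. Edge weights must be non-negative."""
    targets = set(targets)
    heap = [(0.0, [s]) for s in sorted(sources)]
    heapq.heapify(heap)
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node in targets:
            found.append((cost, path))  # assumed: stop at the first target
            continue
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in path:  # keep paths loop-free
                heapq.heappush(heap, (cost + weight, path + [neighbour]))
    return found

# Toy directed network: receptor R, intermediates A and B, factor TF
graph = {
    "R": {"A": 1, "B": 2},
    "A": {"TF": 1, "B": 1},
    "B": {"TF": 1},
}
paths = k_shortest_paths(graph, sources=["R"], targets=["TF"], k=2)
for cost, path in paths:
    print(cost, " -> ".join(path))
```

Because partial paths are popped in order of cost and weights are non-negative, a complete path is never reported before a cheaper one.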


Author(s):  
Chase A. Weidmann ◽  
Anthony M. Mustoe ◽  
Parth B. Jariwala ◽  
J. Mauro Calabrese ◽  
Kevin M. Weeks

RNAs interact with networks of proteins to form complexes (RNPs) that govern many biological processes, but these networks are currently impossible to examine in a comprehensive way. We developed a live-cell chemical probing strategy for mapping protein interaction networks in any RNA with single-nucleotide resolution. This RNP-MaP strategy (RNP network analysis by mutational profiling) simultaneously detects binding by and cooperative interactions involving multiple proteins with single RNA molecules. RNP-MaP revealed that two structurally related, but sequence-divergent noncoding RNAs, RNase P and RMRP, share nearly identical RNP networks and, further, that protein interaction network hubs identify function-critical sites in these RNAs. RNP-MaP identified numerous protein interaction networks within the XIST long noncoding RNA that are conserved between mouse and human RNAs and distinguished communities of proteins that network together on XIST. RNP-MaP data show that the Xist E region is densely networked by protein interactions and that PTBP1, MATR3, and TIA1 proteins each interface with the XIST E region via two distinct interaction modes; and we find that the XIST E region is sufficient to mediate RNA foci formation in cells. RNP-MaP will enable discovery and mechanistic analysis of protein interaction networks across any RNA in cells.


2021 ◽  
Author(s):  
Meghana Venkata Palukuri ◽  
Edward M Marcotte

Protein complexes can be computationally identified from protein-interaction networks with community detection methods, suggesting new multi-protein assemblies. Most community detection algorithms tend to be un- or semi-supervised and assume that communities are dense network subgraphs, which is not always true, as protein complexes can exhibit diverse network topologies. The few existing supervised machine learning methods are serial and can potentially be improved in terms of accuracy and scalability by using better-suited machine learning models and by using parallel algorithms, respectively.  Here, we present Super.Complex, a distributed supervised machine learning pipeline for community detection in networks. Super.Complex learns a community fitness function from known communities using an AutoML method and applies this fitness function to detect new communities. A heuristic local search algorithm finds maximally scoring communities with epsilon-greedy and pseudo-metropolis criteria, and an embarrassingly parallel implementation can be run on a computer cluster for scaling to large networks. In order to evaluate Super.Complex, we propose three new measures for the still outstanding issue of comparing sets of learned and known communities. On a yeast protein-interaction network, Super.Complex outperforms 6 other supervised and 4 unsupervised methods. Application of Super.Complex to a human protein-interaction network with ~8k nodes and ~60k edges yields 1,028 protein complexes, with 234 complexes linked to SARS-CoV-2, with 111 uncharacterized proteins present in 103 learned complexes. Super.Complex is generalizable and can be used in different applications of community detection, with the ability to improve results by incorporating domain-specific features. Learned community characteristics can also be transferred from existing applications to detect communities in a new application with no known communities. 
Code and interactive visualizations of learned human protein complexes are freely available at: https://sites.google.com/view/supercomplex/super-complex-v3-0 .
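A local search that grows a community under an epsilon-greedy rule can be sketched as follows. This is not the Super.Complex code: edge density stands in for the learned community fitness function, the pseudo-metropolis criterion and the parallel implementation are not shown, and the function names, stopping rule, and toy graph are assumptions made for this example.

```python
import random

def density(nodes, adjacency):
    """Fraction of possible edges present among the given nodes."""
    n = len(nodes)
    if n < 2:
        return 0.0
    # every internal edge is counted from both endpoints, so halve it
    internal = sum(1 for u in nodes for v in adjacency[u] if v in nodes) / 2
    return internal / (n * (n - 1) / 2)

def grow_community(seed, adjacency, fitness=density, epsilon=0.1,
                   max_steps=50, rng=None):
    """Grow a community from a seed node. With probability epsilon try a
    random neighbouring node, otherwise the best-scoring one; stop as soon
    as the chosen addition would lower the fitness."""
    rng = rng or random.Random(0)
    community = {seed}
    for _ in range(max_steps):
        frontier = sorted({v for u in community for v in adjacency[u]}
                          - community)
        if not frontier:
            break
        if rng.random() < epsilon:
            candidate = rng.choice(frontier)  # explore
        else:
            candidate = max(frontier,
                            key=lambda v: fitness(community | {v}, adjacency))
        if fitness(community | {candidate}, adjacency) < fitness(community, adjacency):
            break
        community.add(candidate)
    return community

# Toy network: a triangle (A, B, C) with a pendant node D attached to C
adjacency = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(sorted(grow_community("A", adjacency, epsilon=0.0)))
```

Setting `epsilon=0.0` makes the search purely greedy and deterministic; a supervised pipeline would pass a learned scoring function as `fitness` in place of `density`.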

