Identifying Protein Complexes Based on Multiple Topological Structures in PPI Networks

Abstract Background Protein complexes are the cornerstones of many biological processes: proteins assemble into various types of molecular machinery that perform a vast array of biological functions. A single protein may belong to multiple protein complexes, yet most existing protein complex detection algorithms cannot identify overlapping complexes. To solve this problem, a novel algorithm for identifying overlapping protein complexes is proposed. Results In this paper, a new clustering algorithm based on an overlay network chain in quotient space, denoted ONCQS, is proposed to detect overlapping protein complexes in weighted PPI networks. In the quotient space, a multilevel overlay network is constructed by using maximal complete subgraphs to mine overlapping protein complexes. GO annotation data are used to weight the PPI network. The overlay network chain in quotient space is calculated according to the compatibility relation, and the protein complexes are contained in the last level of the overlay network. Experiments were carried out on four PPI databases, and ONCQS was compared with five other state-of-the-art methods for identifying protein complexes. Conclusions We applied ONCQS to four PPI databases (DIP, Gavin, Krogan and MIPS). The results show that it is superior to five existing algorithms (MCODE, MCL, CORE, ClusterONE and COACH) in detecting overlapping protein complexes.
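The abstract above mentions maximal complete subgraphs (maximal cliques) as the building block for mining overlapping complexes. The sketch below is not the ONCQS algorithm itself; it only illustrates, on a made-up toy network, how maximal cliques can be enumerated with the standard Bron-Kerbosch procedure (with pivoting) and how one protein can end up in more than one clique. The helper names `build_adjacency` and `maximal_cliques` and the protein labels are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of interactions."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj: Dict[str, Set[str]]) -> List[Set[str]]:
    """Enumerate all maximal cliques using Bron-Kerbosch with pivoting."""
    cliques: List[Set[str]] = []

    def expand(r: Set[str], p: Set[str], x: Set[str]) -> None:
        if not p and not x:
            cliques.append(set(r))  # r cannot be extended: it is maximal
            return
        # Pick the pivot with the most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return cliques


# Toy PPI network (illustrative only): two triangles sharing protein "C".
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("C", "E"), ("D", "E")]
cliques = maximal_cliques(build_adjacency(toy_edges))
sorted_cliques = sorted(sorted(c) for c in cliques)

# Proteins that appear in more than one maximal clique.
overlapping = {v for c in cliques for v in c
               if sum(v in other for other in cliques) > 1}
print(sorted_cliques, overlapping)
```

In this toy network, protein "C" belongs to both cliques, which is the kind of overlap that non-overlapping clustering methods cannot represent.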

Functional geometry of protein interactomes

Bioinformatics ◽

10.1093/bioinformatics/btz146 ◽

2019 ◽

Vol 35 (19) ◽

pp. 3727-3734 ◽

Cited By ~ 2

Author(s):

Noël Malod-Dognin ◽

Nataša Pržulj

Keyword(s):

Functional Organization ◽

Protein Complexes ◽

Simplicial Complexes ◽

Higher Order ◽

Biological Information ◽

Supplementary Information ◽

Data Types ◽

PPI Networks ◽

Functional Geometry ◽

Better Than

Abstract Motivation Protein–protein interactions (PPIs) are usually modeled as networks. These networks have extensively been studied using graphlets, small induced subgraphs capturing the local wiring patterns around nodes in networks. They revealed that proteins involved in similar functions tend to be similarly wired. However, such simple models can only represent pairwise relationships and cannot fully capture the higher-order organization of protein interactomes, including protein complexes. Results To model the multi-scale organization of these complex biological systems, we utilize simplicial complexes from computational geometry. The question is how to mine these new representations of protein interactomes to reveal additional biological information. To address this, we define simplets, a generalization of graphlets to simplicial complexes. By using simplets, we define a sensitive measure of similarity between simplicial complex representations that allows for clustering them according to their data types better than clustering them by using other state-of-the-art measures, e.g. spectral distance, or facet distribution distance. We model human and baker’s yeast protein interactomes as simplicial complexes that capture PPIs and protein complexes as simplices. On these models, we show that our newly introduced simplet-based methods cluster proteins by function better than the clustering methods that use the standard PPI networks, uncovering the new underlying functional organization of the cell. We demonstrate the existence of the functional geometry in the protein interactome data and the superiority of our simplet-based methods to effectively mine for new biological information hidden in the complexity of the higher-order organization of protein interactomes. Availability and implementation Codes and datasets are freely available at http://www0.cs.ucl.ac.uk/staff/natasa/Simplets/. 
Supplementary information Supplementary data are available at Bioinformatics online.
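As a rough illustration of what it means to model PPIs and protein complexes as simplices, the sketch below builds a small simplicial complex from made-up pairwise interactions (edges) and one made-up protein complex (a higher-dimensional simplex). It closes the set under taking faces and then lists the facets, i.e. the simplices not contained in any larger one. It does not implement simplets; the function names and protein labels are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set


def simplicial_closure(simplices: Iterable[Iterable[str]]) -> Set[FrozenSet[str]]:
    """Return every non-empty face of the given simplices."""
    faces: Set[FrozenSet[str]] = set()
    for simplex in simplices:
        vertices = sorted(set(simplex))
        for k in range(1, len(vertices) + 1):
            faces.update(frozenset(c) for c in combinations(vertices, k))
    return faces


def facets(faces: Set[FrozenSet[str]]) -> Set[FrozenSet[str]]:
    """Keep only simplices that are not a proper subset of another simplex."""
    return {f for f in faces if not any(f < g for g in faces)}


# Illustrative input: three binary interactions and one three-protein complex.
ppis = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
complexes = [("P1", "P2", "P3")]

faces = simplicial_closure(list(ppis) + list(complexes))
top_simplices = facets(faces)
print(len(faces), sorted(sorted(f) for f in top_simplices))
```

Note that the closure adds the edge P1-P3 even though it was not listed as a binary interaction, because every face of a simplex must itself belong to the complex.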

A Survey of Computational Methods for Protein Complexes Prediction Based on Static PPI Networks

Software Engineering and Applications ◽

10.12677/sea.2018.73018 ◽

2018 ◽

Vol 07 (03) ◽

pp. 151-159

Author(s):

杨于

Keyword(s):

Computational Methods ◽

Protein Complexes ◽

PPI Networks

Identifying Protein Complexes Method Based on Time-Sequenced Association and Ant Colony Clustering in Dynamic PPI Networks

2016 IEEE 16th International Conference on Bioinformatics and Bioengineering (BIBE) ◽

10.1109/bibe.2016.28 ◽

2016 ◽

Cited By ~ 1

Author(s):

Cuicui Yang ◽

Junzhong Ji ◽

Jiawei Lv

Keyword(s):

Protein Complexes ◽

Ant Colony ◽

PPI Networks ◽

Ant Colony Clustering

Unsupervised methods for finding protein complexes from PPI networks

Network Modeling Analysis in Health Informatics and Bioinformatics ◽

10.1007/s13721-015-0080-7 ◽

2015 ◽

Vol 4 (1) ◽

Cited By ~ 4

Author(s):

Pooja Sharma ◽

Hasin A. Ahmed ◽

Swarup Roy ◽

Dhruba K. Bhattacharyya

Keyword(s):

Protein Complexes ◽

PPI Networks ◽

Unsupervised Methods

A Method Based on Local Density and Random Walks for Complexes Detection in Protein Interaction Networks

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720010005191 ◽

2010 ◽

Vol 08 (supp01) ◽

pp. 47-62 ◽

Cited By ~ 6

Author(s):

Liang Yu ◽

Lin Gao ◽

Kui Li

Keyword(s):

Random Walks ◽

Protein Interaction ◽

Local Density ◽

Protein Complexes ◽

Biological Significance ◽

Protein Protein Interaction ◽

PPI Networks ◽

Comprehensive Comparison ◽

Attachment Proteins ◽

Two Stages

In this paper, we present a method based on local density and random walks (LDRW) for detecting core-attachment complexes in protein-protein interaction (PPI) networks, whether weighted or unweighted. Our LDRW method consists of two stages. First, it finds all the protein-complex cores based on the local density of subnetworks. Then it uses random walks with restarts to find the attachment proteins of each detected core and form complexes. We evaluate the effectiveness of our method using two different yeast PPI networks and validate the biological significance of the predicted protein complexes using known complexes in the Munich Information Center for Protein Sequences (MIPS) and Gene Ontology (GO) databases. We also perform a comprehensive comparison between our method and other existing methods. The results show that our method finds more protein complexes with high biological significance and achieves a significant improvement over the compared methods. Furthermore, our method is able to identify biologically significant overlapping protein complexes.
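The second stage described above relies on random walks with restarts. The abstract does not give the parameters or the exact scoring rule, so the following is only a minimal sketch of the general technique under stated assumptions: the walk restarts at the core proteins with probability `restart` (0.5 here, an arbitrary choice), edge weights are treated as transition strengths, and the toy network and function names are hypothetical. Every node in the toy network has at least one neighbour; an isolated node would simply lose its probability mass in this simplified version.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[str, Dict[str, float]]


def build_weighted_adjacency(edges: Iterable[Tuple[str, str, float]]) -> Graph:
    """Build an undirected weighted adjacency map (use weight 1.0 if unweighted)."""
    adj: Graph = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def random_walk_with_restart(adj: Graph, seeds: Set[str], restart: float = 0.5,
                             tol: float = 1e-10, max_iter: int = 1000) -> Dict[str, float]:
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence."""
    nodes = sorted(adj)
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {v: restart * p0[v] for v in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                continue  # simplification: mass on isolated nodes is dropped
            for v, w in adj[u].items():
                nxt[v] += (1.0 - restart) * p[u] * w / total
        diff = sum(abs(nxt[v] - p[v]) for v in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy network: core {A, B}; C touches both core proteins, D and E trail off.
toy = build_weighted_adjacency([("A", "B", 1.0), ("A", "C", 1.0), ("B", "C", 1.0),
                                ("C", "D", 1.0), ("D", "E", 1.0)])
scores = random_walk_with_restart(toy, seeds={"A", "B"})
ranked_candidates = sorted((v for v in scores if v not in {"A", "B"}),
                           key=scores.get, reverse=True)
print(ranked_candidates)
```

Proteins outside the core can then be ranked by their steady-state score; how many of them are kept as attachments is a design decision the abstract does not specify.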

Identifying Protein Complexes from PPI Networks Using GO Semantic Similarity

2011 IEEE International Conference on Bioinformatics and Biomedicine ◽

10.1109/bibm.2011.52 ◽

2011 ◽

Cited By ~ 1

Author(s):

Jian Wang ◽

Dong Xie ◽

Hongfei Lin ◽

Zhihao Yang ◽

Yijia Zhang

Keyword(s):

Semantic Similarity ◽

Protein Complexes ◽

PPI Networks

Entropy-Based Graph Clustering of PPI Networks for Predicting Overlapping Functional Modules of Proteins

Entropy ◽

10.3390/e23101271 ◽

2021 ◽

Vol 23 (10) ◽

pp. 1271

Author(s):

Hoyeon Jeong ◽

Yoonbee Kim ◽

Yi-Sue Jung ◽

Dae Ryong Kang ◽

Young-Rae Cho

Keyword(s):

Protein Interactions ◽

Protein Complexes ◽

Clustering Algorithms ◽

Graph Clustering ◽

Functional Modules ◽

Protein Protein Interactions ◽

Overlapping Clusters ◽

Novel Proteins ◽

PPI Networks ◽

Function Modules

Functional modules can be predicted using genome-wide protein–protein interactions (PPIs) from a systematic perspective. Various graph clustering algorithms have been applied to PPI networks for this task. In particular, the detection of overlapping clusters is necessary because a protein can be involved in multiple functions under different conditions. Graph entropy (GE) is a novel metric for assessing the quality of clusters in a large, complex network. In this study, the unweighted and weighted GE algorithms are evaluated to validate their ability to predict functional modules. To measure clustering accuracy, the clustering results are compared to protein complexes and Gene Ontology (GO) annotations as references. We demonstrate that the GE algorithm is more accurate at detecting overlapping clusters than other competing methods. Moreover, we confirm the biological feasibility of the proteins that occur most frequently in the set of identified clusters. Finally, novel proteins for the additional annotation of GO terms are revealed.
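The abstract does not state the formula for graph entropy, so the sketch below assumes one common unweighted formulation: for a candidate cluster, each vertex contributes the binary entropy of the fraction of its neighbours that lie inside the cluster, and the graph entropy is the sum over all vertices. Lower values indicate a cluster whose boundary is less ambiguous. The function names and the toy network are hypothetical.

```python
import math
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of interactions."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def vertex_entropy(adj: Dict[str, Set[str]], cluster: Set[str], v: str) -> float:
    """Binary entropy of the inner/outer split of v's neighbours (assumed form)."""
    neighbours = adj[v]
    if not neighbours:
        return 0.0
    p_in = len(neighbours & cluster) / len(neighbours)
    p_out = 1.0 - p_in
    return -sum(p * math.log2(p) for p in (p_in, p_out) if p > 0)


def graph_entropy(adj: Dict[str, Set[str]], cluster: Set[str]) -> float:
    """Sum of vertex entropies over every vertex in the network."""
    return sum(vertex_entropy(adj, cluster, v) for v in adj)


# Toy network: two triangles joined by the single edge C-D.
toy = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"),
                       ("D", "E"), ("D", "F"), ("E", "F"), ("C", "D")])
entropy_triangle = graph_entropy(toy, {"A", "B", "C"})
entropy_pair = graph_entropy(toy, {"A", "B"})
print(entropy_triangle, entropy_pair)
```

On this toy network the full triangle scores lower than the incomplete pair, which is why a greedy search that adds or removes vertices to reduce the entropy tends to settle on dense, well-delimited clusters.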

ClusterM: a scalable algorithm for computational prediction of conserved protein complexes across multiple protein interaction networks

BMC Genomics ◽

10.1186/s12864-020-07010-1 ◽

2020 ◽

Vol 21 (S10) ◽

Author(s):

Yijie Wang ◽

Hyundoo Jeong ◽

Byung-Jun Yoon ◽

Xiaoning Qian

Keyword(s):

Protein Interaction ◽

Protein Sequence ◽

De Novo ◽

Homo Sapiens ◽

Protein Complexes ◽

Sequence Similarity ◽

Computational Prediction ◽

Scalable Algorithm ◽

PPI Networks ◽

Multiple Protein

Abstract Background Current computational methods for identifying conserved protein complexes across multiple Protein-Protein Interaction (PPI) networks suffer from a lack of explicit modeling of the desired topological properties within conserved protein complexes, as well as from limited scalability. Results To overcome those issues, we propose a scalable algorithm, ClusterM, for identifying conserved protein complexes across multiple PPI networks through the integration of network topology and protein sequence similarity information. ClusterM overcomes the computational barrier of previous methods, whose complexity escalates exponentially with the number of PPI networks, and it is able to detect conserved protein complexes with both topological separability and cohesive protein sequence conservation. On two independent compendiums of PPI networks from Saccharomyces cerevisiae (Sce, yeast), Drosophila melanogaster (Dme, fruit fly), Caenorhabditis elegans (Cel, worm), and Homo sapiens (Hsa, human), we demonstrate that ClusterM outperforms other state-of-the-art algorithms by a significant margin and is able to identify de novo conserved protein complexes across four species that are missed by existing algorithms. Conclusions ClusterM can better capture the desired topological property of a typical conserved protein complex, which is densely connected within the complex while being well separated from the rest of the network. Furthermore, our experiments show that ClusterM is highly scalable and efficient when analyzing multiple PPI networks.
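The topological property named in the conclusions (dense inside, well separated outside) can be made concrete with two generic graph measures. The sketch below is not ClusterM's objective function, which the abstract does not specify; it only computes internal edge density and conductance for a candidate cluster on a made-up single network. All names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of interactions."""
    adj: Dict[str, Set[str]] = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def internal_density(adj: Dict[str, Set[str]], cluster: Set[str]) -> float:
    """Fraction of possible within-cluster edges that are present."""
    n = len(cluster)
    if n < 2:
        return 0.0
    internal_edges = sum(len(adj[v] & cluster) for v in cluster) / 2
    return internal_edges / (n * (n - 1) / 2)


def conductance(adj: Dict[str, Set[str]], cluster: Set[str]) -> float:
    """Cut size divided by the smaller of the two side volumes (lower is better)."""
    cut = sum(len(adj[v] - cluster) for v in cluster)
    vol_in = sum(len(adj[v]) for v in cluster)
    vol_out = sum(len(adj[v]) for v in adj if v not in cluster)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


# Toy network: two triangles joined by the single edge C-D.
toy = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"),
                       ("D", "E"), ("D", "F"), ("E", "F"), ("C", "D")])
candidate = {"A", "B", "C"}
density = internal_density(toy, candidate)
separation = conductance(toy, candidate)
print(density, separation)
```

A cluster with density close to 1 and conductance close to 0 matches the description of a typical complex given above.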
