Detecting overlapping protein complexes in weighted PPI network based on overlay network chain in quotient space

2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Jie Zhao ◽  
Xiujuan Lei

Abstract Background: Protein complexes are the cornerstones of many biological processes; proteins gather to form various types of molecular machinery that perform a vast array of biological functions. A protein may belong to multiple protein complexes, yet most existing protein complex detection algorithms cannot detect overlapping complexes. To solve this problem, a novel algorithm for identifying overlapping protein complexes is proposed. Results: A new clustering algorithm based on an overlay network chain in quotient space, denoted ONCQS, is proposed to detect overlapping protein complexes in weighted PPI networks. In the quotient space, a multilevel overlay network is constructed by using maximal complete subgraphs to mine overlapping protein complexes. GO annotation data is used to weight the PPI network. The overlay network chain in quotient space is calculated according to the compatibility relation, and the protein complexes are contained in the last level of the overlay network. Experiments were carried out on four PPI databases, and ONCQS was compared with five other state-of-the-art methods for identifying protein complexes. Conclusions: ONCQS was applied to four PPI databases (DIP, Gavin, Krogan and MIPS). The results show that it outperforms five existing algorithms (MCODE, MCL, CORE, ClusterONE and COACH) in detecting overlapping protein complexes.
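The abstract names maximal complete subgraphs (maximal cliques) as the building block for mining overlapping complexes. The following is a minimal sketch of that building block only, not of ONCQS itself: it enumerates maximal cliques with the basic Bron-Kerbosch recursion on a toy graph. The function name `maximal_cliques`, the toy edges and the protein labels are assumptions made for illustration; the edge weighting and the quotient-space overlay chain are not modelled here.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting).

    adj maps each node to the set of its neighbours (undirected graph).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it, x: already explored
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy PPI graph (hypothetical): two triangles that share protein "C".
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("C", "E"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques = maximal_cliques(adj)
# Proteins that appear in more than one maximal clique are the overlap.
shared = {n for n in adj if sum(n in c for c in cliques) > 1}
```

In this toy graph, protein "C" belongs to both cliques, which is the kind of overlap that partition-style clustering cannot represent.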

2020 ◽  
Author(s):  
Ping Kong ◽  
Wei Liu

Abstract Background: Escherichia coli has been at the center of microbial research for decades, making it a standard microorganism for studying molecular mechanisms. Molecular complexes, operons and functional modules are important molecular functional domains of Escherichia coli. Most previous studies focused on detecting E. coli protein complexes with experimental methods. In contrast, relatively few studies have predicted protein complexes, and especially functional modules, of E. coli from large-scale proteomic data. Identifying protein complexes and functional modules of E. coli is crucial to reveal principles of cellular organizations, processes and functions. Results: In this study, the protein complexes and functional modules of two high-quality binary interaction datasets of E. coli are each predicted by an efficient edge clustering algorithm (ELPA) for complex biological networks. According to the gold standard protein complexes and function annotations provided by the EcoCyc dataset, the experimental results show that most topological modules predicted in the two datasets match very well with the real protein complexes, cellular processes and biological functions. Analysis of the corresponding complexes and functional modules shows that all predicted protein complexes are fully covered by one or more functional modules. Furthermore, we compared the results of ELPA with a well-known node clustering algorithm (MCL) on the same PPI network of E. coli, and found that ELPA outperforms MCL in terms of matching with gold standard complexes. Conclusions: We surmise that the topological modules of the PPI network detected by ELPA fit well with real protein complexes and functional units. In most predicted topological modules, the protein complexes and corresponding functional modules are highly overlapping. ELPA is an effective tool to predict protein complexes and functional modules in PPI networks of E. coli.


2015 ◽  
Vol 13 (02) ◽  
pp. 1571001 ◽  
Author(s):  
Chern Han Yong ◽  
Limsoon Wong

Protein interactions and complexes behave in a dynamic fashion, but this dynamism is not captured by interaction screening technologies, and not preserved in protein–protein interaction (PPI) networks. The analysis of static interaction data to derive dynamic protein complexes leads to several challenges, of which we identify three. First, many proteins participate in multiple complexes, leading to overlapping complexes embedded within highly-connected regions of the PPI network. This makes it difficult to accurately delimit the boundaries of such complexes. Second, many condition- and location-specific PPIs are not detected, leading to sparsely-connected complexes that cannot be picked out by clustering algorithms. Third, the majority of complexes are small complexes (made up of two or three proteins), which are extra sensitive to the effects of extraneous edges and missing co-complex edges. We show that many existing complex-discovery algorithms have trouble predicting such complexes, and show that our insight into the disparity between the static interactome and dynamic protein complexes can be used to improve the performance of complex discovery.


2014 ◽  
Vol 2014 ◽  
pp. 1-12
Author(s):  
Jun Ren ◽  
Wei Zhou ◽  
Jianxin Wang

Much evidence has demonstrated that protein complexes are overlapping and hierarchically organized in PPI networks. Meanwhile, the large size of PPI networks requires complex detection methods to have low time complexity. Up to now, few methods can identify overlapping and hierarchical protein complexes in a PPI network quickly. In this paper, a novel method, called MCSE, is proposed based on λ-module and “seed-expanding.” First, it chooses seeds as essential PPIs or edges with high edge clustering values. Then, it identifies protein complexes by expanding each seed to a λ-module. MCSE is suitable for large PPI networks because of its low time complexity. MCSE can identify overlapping protein complexes naturally because a protein can be visited by different seeds. MCSE uses the parameter λ_th to control the range of seed expanding and can detect a hierarchical organization of protein complexes by tuning the value of λ_th. Experimental results on S. cerevisiae show that this hierarchical organization is similar to that of known complexes in the MIPS database. The experimental results also show that MCSE outperforms other previous competing algorithms, such as CPM, CMC, Core-Attachment, Dpclus, HC-PIN, MCL, and NFC, in terms of functional enrichment and matching with known protein complexes.
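The seed-selection step can be illustrated with a small sketch that ranks edges by an edge clustering value. The abstract does not state the formula, so the definition used here is an assumption: the squared number of shared members of the two closed neighbourhoods divided by the product of the neighbourhood sizes. The names `edge_clustering_value` and `ranked`, and the toy graph, are hypothetical. The λ-module expansion is not shown.

```python
def edge_clustering_value(adj, u, v):
    """Assumed definition: |Nu & Nv|^2 / (|Nu| * |Nv|), with closed neighbourhoods."""
    nu = adj[u] | {u}  # neighbours of u plus u itself
    nv = adj[v] | {v}  # neighbours of v plus v itself
    common = len(nu & nv)
    return common * common / (len(nu) * len(nv))


# Toy PPI graph (hypothetical): triangle A-B-C with a pendant protein D on C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Candidate seeds: edges sorted from highest to lowest edge clustering value.
ranked = sorted(edges, key=lambda e: edge_clustering_value(adj, *e), reverse=True)
```

Under this definition, the edge inside the triangle whose endpoints share their entire neighbourhood ranks first, and the pendant edge ranks last, so dense regions are tried as seeds before sparse ones.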


2012 ◽  
Vol 2012 ◽  
pp. 1-11 ◽  
Author(s):  
Md. Altaf-Ul-Amin ◽  
Masayoshi Wada ◽  
Shigehiko Kanaya

This paper presents an algorithm called DPClusO for partitioning simple graphs into overlapping modules, that is, clusters constrained by density and periphery tracking. The major advantages of DPClusO over the related and previously published algorithm DPClus are shorter running time and ensuring coverage, that is, each node goes to at least one module. DPClusO is a general-purpose clustering algorithm and useful for finding overlapping cohesive groups in a simple graph for any type of application. This work shows that the modules generated by DPClusO from several PPI networks of yeast with high-density constraint match with more known complexes compared to some other recently published complex generating algorithms. Furthermore, the biological significance of the high density modules has been demonstrated by comparing their P values in the context of Gene Ontology (GO) terms with those of the randomly generated modules having the same size, distribution, and zero density. As a consequence, it was also learnt that a PPI network is a combination of mainly high-density and star-like modules.
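The density constraint mentioned above can be made concrete with a short sketch. It assumes the usual graph-density definition (internal edges divided by the maximum possible number of edges among the module's nodes); the abstract does not spell the formula out, and the periphery tracking of DPClusO is not modelled. The function name `density` and the toy modules are hypothetical.

```python
from itertools import combinations


def density(adj, nodes):
    """Fraction of possible node pairs inside `nodes` that are connected."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = sum(
        1 for u, v in combinations(sorted(nodes), 2) if v in adj.get(u, set())
    )
    return internal / (n * (n - 1) / 2)


# Toy graph (hypothetical): a triangle A-B-C and a star with hub H.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("H", "X"), ("H", "Y"), ("H", "Z")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

triangle_density = density(adj, {"A", "B", "C"})
star_density = density(adj, {"H", "X", "Y", "Z"})
```

The triangle is fully connected, while the star-like module has only half of its possible edges, which is why a high density threshold would keep the first and reject the second.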


2021 ◽  
Vol 29 (2) ◽  
Author(s):  
Soheir Noori ◽  
Nabeel Al-A’araji ◽  
Eman Al-Shamery

Defining protein complexes by analysing protein–protein interaction (PPI) networks is a crucial task in understanding the principles of a biological cell. In the last few decades, researchers have proposed numerous methods to explore the topological structure of a PPI network to detect dense protein complexes. In this paper, overlapping protein complexes with different densities are predicted within an acceptable execution time using a seed-expanding model and the topological structure of the PPI network (SETS). SETS depends on the relation between the seed and its neighbours. The algorithm was compared with six algorithms on six datasets: five for yeast and one for human. The results showed that SETS outperformed the other algorithms in terms of F-measure, coverage rate and the number of complexes that have high similarity with real complexes.


Genes ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 177 ◽  
Author(s):  
Xiujuan Lei ◽  
Siguo Wang ◽  
Fang-Xiang Wu

Essential proteins are critical to the development and survival of cells. Identifying and analyzing essential proteins is vital to understand the molecular mechanisms of living cells and design new drugs. With the development of high-throughput technologies, many protein–protein interaction (PPI) data are available, which facilitates the study of essential proteins at the network level. Up to now, although various computational methods have been proposed, the prediction precision still needs to be improved. In this paper, we propose a novel method, named HSEP, that applies Hyperlink-Induced Topic Search (HITS) on weighted PPI networks to detect essential proteins. First, an original undirected PPI network is transformed into a bidirectional PPI network. Then, both biological information and network topological characteristics are taken into account to weight the PPI network. The biological information includes gene expression data, Gene Ontology (GO) annotation and subcellular localization. The edge clustering coefficient is used as the network topological characteristic to measure the closeness of two connected nodes. We conducted experiments on two species, namely Saccharomyces cerevisiae and Drosophila melanogaster, and the experimental results show that HSEP outperformed some state-of-the-art essential protein detection techniques.
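The two steps described above, turning an undirected network into a bidirectional one and running HITS on weighted edges, can be sketched as follows. This is a generic weighted HITS power iteration, not the HSEP implementation: the edge weights are made-up numbers standing in for the combined biological and topological weights, and the names `to_bidirectional`, `hits` and the protein labels are assumptions for illustration.

```python
import math


def to_bidirectional(undirected):
    """Turn {(u, v): w} undirected edges into directed edges in both directions."""
    directed = {}
    for (u, v), w in undirected.items():
        directed[(u, v)] = w
        directed[(v, u)] = w
    return directed


def hits(weights, iterations=50):
    """Weighted HITS by power iteration; returns (hub, authority) score dicts."""
    nodes = sorted({n for edge in weights for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of v: weighted sum of the hub scores of nodes pointing to v.
        new_auth = {n: 0.0 for n in nodes}
        for (u, v), w in weights.items():
            new_auth[v] += w * hub[u]
        # Hub of u: weighted sum of the authority scores of nodes u points to.
        new_hub = {n: 0.0 for n in nodes}
        for (u, v), w in weights.items():
            new_hub[u] += w * new_auth[v]
        # Normalise both score vectors to unit Euclidean length.
        for scores in (new_auth, new_hub):
            norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
        hub, auth = new_hub, new_auth
    return hub, auth


# Toy weighted PPI network (hypothetical weights): triangle plus a weak pendant edge.
undirected = {("P1", "P2"): 1.0, ("P1", "P3"): 1.0, ("P2", "P3"): 1.0, ("P1", "P4"): 0.5}
weights = to_bidirectional(undirected)
hub, auth = hits(weights)
top_protein = max(auth, key=auth.get)
```

Because every edge is mirrored, the hub and authority vectors converge to the same ranking here; in this toy network the best-connected protein "P1" comes out on top.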


2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Qiguo Dai ◽  
Maozu Guo ◽  
Yingjie Guo ◽  
Xiaoyan Liu ◽  
Yang Liu ◽  
...  

A protein complex, formed by a group of physically interacting proteins, plays a crucial role in cell activities. Great effort has been made to computationally identify protein complexes from protein-protein interaction (PPI) networks. However, the accuracy of the prediction is still far from satisfactory, because the topological structures of protein complexes in the PPI network are too complicated. This paper proposes a novel optimization framework, named PLSMC, to detect complexes from a PPI network. The method is based on the fact that if two proteins are in a common complex, they are likely to be interacting. PLSMC employs this relation to determine complexes by a penalized least squares method. PLSMC is applied to several public yeast PPI networks and compared with several state-of-the-art methods. The results indicate that PLSMC outperforms the other methods. In particular, complexes predicted by PLSMC match known complexes with higher accuracy than those of other methods. Furthermore, the predicted complexes have high functional homogeneity.


2017 ◽  
Author(s):  
Dong Li ◽  
Zexuan Zhu ◽  
Zhisong Pan ◽  
Guyu Hu ◽  
Shan He

Abstract Active module identification has received much attention due to its ability to reveal regulatory and signaling mechanisms of a given cellular response. Most existing algorithms identify active modules by extracting connected nodes with high activity scores from a graph. These algorithms do not consider other topological properties such as community structure, which may correspond to functional units. In this paper, we propose an active module identification algorithm based on a novel objective function, which considers both network topology and node activity. This objective is formulated as a constrained quadratic programming problem, which is convex and can be solved by iterative methods. Furthermore, the framework is extended to multilayer dynamic PPI networks. Empirical results on single-layer and multilayer PPI networks show the effectiveness of the proposed algorithms. Availability: The package and code for reproducing all results and figures are available at https://github.com/fairmiracle/ModuleExtraction.

