A Least Square Method Based Model for Identifying Protein Complexes in Protein-Protein Interaction Network

Protein complexes, formed by groups of physically interacting proteins, play a crucial role in cell activities. Great effort has been made to identify protein complexes computationally from protein-protein interaction (PPI) networks. However, prediction accuracy is still far from satisfactory because the topological structures of protein complexes in the PPI network are too complicated. This paper proposes a novel optimization framework, named PLSMC, to detect complexes from PPI networks. The method is based on the observation that two proteins in a common complex are likely to interact. PLSMC uses this relation to determine complexes by a penalized least squares method. PLSMC is applied to several public yeast PPI networks and compared with several state-of-the-art methods. The results indicate that PLSMC outperforms the other methods. In particular, complexes predicted by PLSMC match known complexes with higher accuracy than those predicted by other methods. Furthermore, the predicted complexes have high functional homogeneity.


Nonessential-Nonhub Proteins in the Protein-Protein Interaction Network

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.934.159 ◽

2014 ◽

Vol 934 ◽

pp. 159-164

Author(s):

Yun Yuan Dong ◽

Xian Chun Zhang

Keyword(s):

Protein Interaction ◽

Interaction Network ◽

Clustering Coefficient ◽

Centrality Measures ◽

Ppi Network ◽

Protein Protein Interaction ◽

Ppi Networks ◽

Comparison Results ◽

A Cell ◽

High Degree

Protein-protein interaction (PPI) networks provide a simplified overview of the web of interactions that take place inside a cell. According to the centrality-lethality rule, hub proteins (proteins with high degree) tend to be essential in the PPI network. There are also many low-degree proteins in the PPI network, and they differ in lethality. Some of them are essential (essential-nonhub proteins), and the others are not (nonessential-nonhub proteins). To explain why nonessential-nonhub proteins are not essential, we propose a new measure, n-iep (the number of essential neighbors), and compare nonessential-nonhub proteins with essential-nonhub proteins from topological, evolutionary and functional views. The comparison results show that there are statistical differences between nonessential-nonhub proteins and essential-nonhub proteins in centrality measures, clustering coefficient, evolutionary rate and the number of essential neighbors. These differences are the reasons why nonessential-nonhub proteins lack lethality.
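
The n-iep measure is described in the abstract only as the number of essential neighbors of a protein. A minimal Python sketch of that count follows, assuming an undirected PPI network given as an edge list and a known set of essential proteins. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)
    return adj


def n_iep(adj, essential):
    """Count, for every protein, how many of its neighbors are essential."""
    return {p: sum(1 for q in nbrs if q in essential) for p, nbrs in adj.items()}


# Toy network (hypothetical data, for illustration only).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
essential = {"A", "C"}

adj = build_adjacency(edges)
scores = n_iep(adj, essential)
```

In this toy network, protein B has two essential neighbors (A and C), while E has none.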


SETS: A Seed-Dense-Expanding Model-Based Topological Structure for the Prediction of Overlapping Protein Complexes

Pertanika Journal of Science and Technology ◽

10.47836/pjst.29.2.35 ◽

2021 ◽

Vol 29 (2) ◽

Author(s):

Soheir Noori ◽

Nabeel Al-A’araji ◽

Eman Al-Shamery

Keyword(s):

Protein Interaction ◽

Execution Time ◽

Topological Structure ◽

Protein Complexes ◽

Biological Cell ◽

Ppi Network ◽

High Similarity ◽

Protein Protein Interaction ◽

Ppi Networks ◽

F Measure

Defining protein complexes by analysing protein–protein interaction (PPI) networks is a crucial task in understanding the principles of a biological cell. In the last few decades, researchers have proposed numerous methods that explore the topological structure of a PPI network to detect dense protein complexes. In this paper, overlapping protein complexes with different densities are predicted within an acceptable execution time using a seed-expanding model and the topological structure of the PPI network (SETS). SETS depends on the relation between the seed and its neighbours. The algorithm was compared with six algorithms on six datasets: five for yeast and one for human. The results showed that SETS outperformed the other algorithms in terms of F-measure, coverage rate and the number of complexes that have high similarity with real complexes.


CLUSTERING ALGORITHMS FOR DETECTING FUNCTIONAL MODULES IN PROTEIN INTERACTION NETWORKS

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720009004023 ◽

2009 ◽

Vol 07 (01) ◽

pp. 217-242 ◽

Cited By ~ 24

Author(s):

LIN GAO ◽

PENG-GANG SUN ◽

JIA SONG

Keyword(s):

Protein Interaction ◽

Protein Complexes ◽

Clustering Algorithms ◽

Interaction Network ◽

Future Research ◽

Functional Modules ◽

Sources Of Information ◽

Protein Protein Interaction ◽

Metabolic Functions ◽

Ppi Networks

Protein–Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. When studying the workings of a biological cell, it is useful to be able to detect known and predict still undiscovered protein complexes within the cell's PPI networks. Such predictions may be used as an inexpensive tool to direct biological experiments. The increasing amount of available PPI data necessitates a fast, accurate approach to biological complex identification. Because of the importance of this task in the study of protein interaction networks, different models and algorithms have been proposed for identifying functional modules in PPI networks. In this paper, we review some representative algorithms, focusing on the algorithms underlying the approaches and how the algorithms relate to each other. In particular, a comparison is given based on the properties of the algorithms. Since the PPI network is noisy and still incomplete, some methods that consider additional properties for preprocessing and purifying PPI data are presented. We also discuss the functional annotation and validation of protein complexes. Finally, new progress and future research directions are discussed from the computational viewpoint.


A Novel Method for Identifying Essential Proteins Based on Non-negative Matrix Tri-Factorization

Frontiers in Genetics ◽

10.3389/fgene.2021.709660 ◽

2021 ◽

Vol 12 ◽

Author(s):

Zhihong Zhang ◽

Meiping Jiang ◽

Dongjie Wu ◽

Wang Zhang ◽

Wei Yan ◽

...

Keyword(s):

Protein Interaction ◽

Protein Interactions ◽

Negative Impact ◽

False Negative ◽

Interaction Network ◽

Biological Information ◽

Ppi Network ◽

Essential Proteins ◽

Protein Protein Interaction ◽

Ppi Networks

Identification of essential proteins is very important for understanding the basic requirements to sustain a living organism. In recent years, there has been increasing interest in using computational methods to predict essential proteins based on protein–protein interaction (PPI) networks or on the fusion of multiple sources of biological information. However, existing PPI data are known to contain false negatives and false positives. Fusing multiple sources of biological information can reduce the influence of false data in PPI networks, but it inevitably introduces more noise at the same time. In this article, we propose a novel non-negative matrix tri-factorization (NMTF)-based model (NTMEP) to predict essential proteins. First, a weighted PPI network is established using only the topological features of the network, so as to avoid introducing more noise. To reduce the influence of false data in the PPI network on the identification of essential proteins, the NMTF technique, a widely used recommendation algorithm, is applied to reconstruct an optimized PPI network with more potential protein–protein interactions. Then, we use the PageRank algorithm to compute the final ranking score of each protein, with the initial scores calculated from subcellular localization and homologous information. Extensive experiments on publicly available datasets indicate that our NTMEP model performs better in predicting essential proteins than the state-of-the-art methods. In this investigation, we demonstrate that introducing non-negative matrix tri-factorization can effectively improve the quality of the protein–protein interaction network and thereby reduce the negative impact of noise on the prediction. This finding also provides a new perspective for other applications based on protein–protein interaction networks.
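
The ranking step can be pictured as a PageRank-style iteration on a weighted network in which a prior score per protein takes the place of the uniform restart vector. The Python sketch below is an illustration under stated assumptions and is not the NTMEP formulation, which the abstract does not give. The damping factor `alpha`, the normalization by node strength, the function name `rank_proteins`, and the toy weights are all hypothetical. Edge weights are assumed to be strictly positive.

```python
def rank_proteins(weights, prior, alpha=0.85, iters=200, tol=1e-10):
    """PageRank-style ranking on an undirected weighted network.

    weights: dict mapping (u, v) -> positive edge weight
    prior:   dict mapping node -> non-negative initial score (assumed prior)
    """
    nodes = sorted(prior)
    total = sum(prior.values())
    base = {n: prior[n] / total for n in nodes}  # normalized restart vector

    nbrs = {n: {} for n in nodes}
    for (u, v), w in weights.items():
        nbrs[u][v] = w
        nbrs[v][u] = w
    strength = {n: sum(nbrs[n].values()) for n in nodes}

    score = dict(base)
    for _ in range(iters):
        new = {}
        for n in nodes:
            # Each neighbor m passes on its score in proportion to edge weight.
            incoming = sum(score[m] * w / strength[m] for m, w in nbrs[n].items())
            new[n] = (1 - alpha) * base[n] + alpha * incoming
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


# Toy star network with a uniform prior (hypothetical data).
weights = {("A", "B"): 1.0, ("B", "C"): 1.0, ("B", "D"): 1.0}
prior = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
scores = rank_proteins(weights, prior)
```

With a uniform prior the central protein B ranks highest. A non-uniform prior, for example one derived from localization or homology evidence, would shift the ranking toward proteins with stronger prior support.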


Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1603992113 ◽

2016 ◽

Vol 113 (18) ◽

pp. 4976-4981 ◽

Cited By ~ 118

Author(s):

Arunachalam Vinayagam ◽

Travis E. Gibson ◽

Ho-Joon Lee ◽

Bahar Yilmazel ◽

Charles Roesel ◽

...

Keyword(s):

Protein Interaction ◽

Drug Targets ◽

Interaction Network ◽

Disease Genes ◽

Ppi Network ◽

Protein Protein Interaction ◽

Disease States ◽

Ppi Networks ◽

Human Viruses ◽

Potential Drug Targets

The protein–protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as “indispensable,” “neutral,” or “dispensable,” which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network’s control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets.
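
The classification described above can be sketched in Python. In structural controllability theory, the minimum number of driver nodes of a directed network with N nodes equals max(N - |M|, 1), where M is a maximum matching of the network. The sketch computes that number with a simple augmenting-path matching and labels each node by how the driver count changes when the node is removed. The function names and the four-node toy network are hypothetical, and the recursive matching is only meant for small graphs, not for a network of the size analysed in the paper.

```python
def max_matching(nodes, edges):
    """Size of a maximum matching in the bipartite view of a directed graph
    (left side = edge sources, right side = edge targets)."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    matched_to = {}  # target node -> source node currently matched to it

    def augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_to or augment(matched_to[v], seen):
                matched_to[v] = u
                return True
        return False

    return sum(augment(u, set()) for u in nodes)


def driver_count(nodes, edges):
    """Minimum number of driver nodes: unmatched nodes, but at least one."""
    return max(len(nodes) - max_matching(nodes, edges), 1)


def classify(nodes, edges):
    """Label each node by the effect of its removal on the driver count."""
    base = driver_count(nodes, edges)
    labels = {}
    for p in nodes:
        rest = [n for n in nodes if n != p]
        kept = [(u, v) for u, v in edges if p not in (u, v)]
        nd = driver_count(rest, kept)
        if nd > base:
            labels[p] = "indispensable"
        elif nd < base:
            labels[p] = "dispensable"
        else:
            labels[p] = "neutral"
    return labels


# Toy directed network (hypothetical): A -> B, B -> C, B -> D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("B", "D")]
labels = classify(nodes, edges)
```

In the toy network, removing B raises the driver count from 2 to 3 (indispensable), removing C or D lowers it to 1 (dispensable), and removing A leaves it unchanged (neutral).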


Network Embedding the Protein–Protein Interaction Network for Human Essential Genes Identification

Genes ◽

10.3390/genes11020153 ◽

2020 ◽

Vol 11 (2) ◽

pp. 153 ◽

Cited By ~ 3

Author(s):

Wei Dai ◽

Qi Chang ◽

Wei Peng ◽

Jiancheng Zhong ◽

Yongjiang Li

Keyword(s):

Protein Interaction ◽

Essential Gene ◽

Interaction Network ◽

Essential Genes ◽

Svm Classifier ◽

Sequence Information ◽

Network Embedding ◽

Ppi Network ◽

Protein Protein Interaction ◽

Ppi Networks

Essential genes are a group of genes that are indispensable for cell survival and cell fertility. Studying human essential genes not only helps scientists reveal the underlying biological mechanisms of a human cell but also guides disease treatment. Recently, the publication of human essential gene data has made it possible for researchers to train a machine-learning classifier on features of the known human essential genes and to use the classifier to predict new human essential genes. Previous studies have found that the essentiality of genes is closely related to their properties in the protein–protein interaction (PPI) network. In this work, we propose a novel supervised method to predict human essential genes by network embedding of the PPI network. Our approach runs a biased random walk on the network to obtain the network context of each node. Then, the node pairs are input into an artificial neural network to learn representation vectors that maximally preserve the network structure and the properties of the nodes in the network. Finally, the features are fed into an SVM classifier to predict human essential genes. The prediction results on two human PPI networks show that our method achieves better performance than methods that use either genes' sequence information or genes' centrality properties in the network as input features. Moreover, it also outperforms methods that represent the PPI network with other previous approaches.
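
The abstract does not specify how the random walk is biased. As one possible reading, the Python sketch below uses a node2vec-style second-order walk, in which a return parameter `p` and an in-out parameter `q` reweight the next step, and then collects (center, context) node pairs within a window. The choice of node2vec-style bias, the parameter values, the function names, and the toy network are all assumptions made for illustration.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, seed=0):
    """Generate one node2vec-style biased walk of up to `length` nodes."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break  # dead end: no neighbors to move to
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))  # first step is unbiased
            continue
        prev = walk[-2]
        # Unnormalized weights: 1/p to return, 1 to stay near prev, 1/q to move away.
        weights = [
            1.0 / p if x == prev else 1.0 if x in adj[prev] else 1.0 / q
            for x in nbrs
        ]
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def context_pairs(walk, window=2):
    """Collect (center, context) pairs for nodes within `window` steps."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        pairs.extend((center, walk[j]) for j in range(lo, hi) if j != i)
    return pairs


# Toy undirected network as an adjacency map (hypothetical data).
adj = {"A": {"B", "C"}, "B": {"A", "C", "D"}, "C": {"A", "B"}, "D": {"B"}}
walk = biased_walk(adj, "A", length=6, p=0.5, q=2.0)
pairs = context_pairs(walk, window=2)
```

A small `p` makes the walk return to the previous node more often, and a large `q` keeps it close to the starting neighborhood. The resulting pairs are the kind of input that a skip-gram style network could consume to learn node vectors.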


Decoding the molecular mechanism of parthenocarpy in Musa spp. through protein–protein interaction network

Scientific Reports ◽

10.1038/s41598-021-93661-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Suthanthiram Backiyarani ◽

Rajendran Sasikala ◽

Simeon Sharmiladevi ◽

Subbaraya Uma

Keyword(s):

Candidate Genes ◽

Protein Interaction ◽

Target Genes ◽

Interaction Network ◽

Enrichment Analysis ◽

Auxin Signaling ◽

Pathway Enrichment Analysis ◽

Ppi Network ◽

Protein Protein Interaction ◽

The Difference

Banana, one of the most important staple fruits among global consumers, is highly sterile owing to natural parthenocarpy. Identification of the genetic factors responsible for parthenocarpy would help conventional breeders to improve the seeded accessions. We have constructed a protein–protein interaction (PPI) network by mining differentially expressed genes and the genes used for transgenic studies with respect to parthenocarpy. Based on the topological and pathway enrichment analysis of proteins in the PPI network, 12 candidate genes were shortlisted. By further validating these candidate genes in seeded and seedless accessions of Musa spp., we put forward MaAGL8, MaMADS16, MaGH3.8, MaMADS29, MaRGA1, MaEXPA1, MaGID1C, MaHK2 and MaBAM1 as possible target genes in the study of natural parthenocarpy. In contrast, the expression profile of MaACLB-2 and MaZEP is anticipated to highlight the difference between artificially induced and natural parthenocarpy. By exploring the PPI of validated genes from the network, we postulated a putative pathway that brings insights into the significance of the cytokinin-mediated CLAVATA (CLV)–WUSCHEL (WUS) signaling pathway, in addition to gibberellin-mediated auxin signaling, in parthenocarpy. Our analysis is the first attempt to identify candidate genes and to hypothesize a putative mechanism that bridges the gaps in understanding natural parthenocarpy through a PPI network.


Spectral clustering for detecting protein complexes in protein–protein interaction (PPI) networks

Mathematical and Computer Modelling ◽

10.1016/j.mcm.2010.06.015 ◽

2010 ◽

Vol 52 (11-12) ◽

pp. 2066-2074 ◽

Cited By ~ 25

Author(s):

Guimin Qin ◽

Lin Gao

Keyword(s):

Protein Interaction ◽

Spectral Clustering ◽

Protein Complexes ◽

Protein Protein Interaction ◽

Ppi Networks


Systems-level cancer gene identification from protein interaction network topology applied to melanogenesis-related functional genomics data

Journal of The Royal Society Interface ◽

10.1098/rsif.2009.0192 ◽

2009 ◽

Vol 7 (44) ◽

pp. 423-437 ◽

Cited By ~ 58

Author(s):

Tijana Milenković ◽

Vesna Memišević ◽

Anand K. Ganesan ◽

Nataša Pržulj

Keyword(s):

Protein Interaction ◽

Network Topology ◽

Interaction Network ◽

Biological Processes ◽

Cancer Gene ◽

Cancer Genes ◽

Clustering Methods ◽

Ppi Network ◽

Graph Theoretic ◽

Ppi Networks

Many real-world phenomena have been described in terms of large networks. Networks have been invaluable models for the understanding of biological systems. Since proteins carry out most biological processes, we focus on analysing protein–protein interaction (PPI) networks. Proteins interact to perform a function. Thus, PPI networks reflect the interconnected nature of biological processes and analysing their structural properties could provide insights into biological function and disease. We have already demonstrated, by using a sensitive graph theoretic method for comparing topologies of node neighbourhoods called ‘graphlet degree signatures’, that proteins with similar surroundings in PPI networks tend to perform the same functions. Here, we explore whether the involvement of genes in cancer suggests the similarity of their topological ‘signatures’ as well. By applying a series of clustering methods to proteins' topological signature similarities, we demonstrate that the obtained clusters are significantly enriched with cancer genes. We apply this methodology to identify novel cancer gene candidates, validating 80 per cent of our predictions in the literature. We also validate predictions biologically by identifying cancer-related negative regulators of melanogenesis identified in our siRNA screen. This is encouraging, since we have done this solely from PPI network topology. We provide clear evidence that PPI network structure around cancer genes is different from the structure around non-cancer genes. Understanding the underlying principles of this phenomenon is an open question, with a potential for increasing our understanding of complex diseases.


A Non-negative Matrix Factorization Based Method for Identifying Essential Proteins

10.21203/rs.3.rs-537545/v1 ◽

2021 ◽

Author(s):

Zhihong Zhang ◽

Sai Hu ◽

Wei Yan ◽

Bihai Zhao ◽

Lei Wang

Keyword(s):

Protein Interaction ◽

Matrix Factorization ◽

Biological Data ◽

Protein Domain ◽

Biological Information ◽

Ppi Network ◽

Essential Proteins ◽

Protein Protein Interaction ◽

Ppi Networks ◽

Non Negative Matrix Factorization

Background: Identification of essential proteins is very important for understanding the basic requirements to sustain a living organism. In recent years, various computational methods have been proposed to identify essential proteins based on protein-protein interaction (PPI) networks. However, there is reliable evidence that a large number of false negatives and false positives exist in PPI data. Therefore, it is necessary to reduce the influence of false data on the accuracy of essential protein prediction by integrating multi-source biological information with PPI networks. Results: In this paper, we proposed a model based on non-negative matrix factorization and multiple sources of biological information (NDM) for identifying essential proteins. The first stage in this process was to construct a weighted PPI network by combining information on protein domains, protein complexes and the topological characteristics of the original PPI network. Then, the non-negative matrix factorization technique was used to reconstruct an optimized PPI network with more complete edge weights. In the final stage, the ranking score of each protein was computed by the PageRank algorithm, in which the initial scores were calculated from homologous and subcellular localization information. To verify the effectiveness of the NDM method, we compared NDM with other state-of-the-art essential protein prediction methods. The comparison of the results obtained from the different methods indicated that our NDM model performs better in predicting essential proteins. Conclusion: Employing non-negative matrix factorization and integrating multi-source biological data can effectively improve the quality of the PPI network, which led to improved performance in essential protein identification. This will also provide a new perspective for other predictions based on protein-protein interaction networks.
