Identifying protein complexes from protein–protein interaction networks based on the gene expression profile and core-attachment approach

Author(s):  
Soheir Noori ◽  
Nabeel Al-A’Araji ◽  
Eman Al-Shamery

Defining protein complexes in the cell is important for understanding the mechanisms of cellular processes, because complexes perform many of the molecular functions in these processes. Most proposed algorithms predict a complex as a dense region of a Protein–Protein Interaction (PPI) network. Others weight the network using gene expression or gene ontology (GO). These approaches, however, eliminate proteins, and their edges, for which no gene expression data are available, which can lead to the loss of important topological relations. To address these limitations, this study proposes a method based on the Gene Expression and Core-Attachment (GECA) approach. GECA identifies core proteins using common-neighbor techniques and biological information. Moreover, GECA improves the attachment technique by adding proteins that have low closeness to the core but whose gene expression is highly similar to that of the core proteins. GECA was compared with several existing methods and achieved the highest F-measure on most datasets. The evaluation of complexes predicted by GECA shows high biological significance.
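The attachment step described in this abstract can be illustrated with a minimal Python sketch. It is not the GECA implementation: the closeness measure (the fraction of core proteins a candidate interacts with), the use of Pearson correlation for expression similarity, the function names, and both thresholds are assumptions chosen for illustration. A candidate with no expression profile is still evaluated on topology instead of being dropped from the network.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)


def closeness(graph, protein, core):
    """Assumed closeness: fraction of core proteins adjacent to `protein`."""
    return len(graph[protein] & core) / len(core)


def select_attachments(graph, expression, core,
                       min_closeness=0.5, min_similarity=0.8):
    """Pick attachment proteins for one core (illustrative thresholds).

    graph: dict mapping protein -> set of interacting proteins
    expression: dict mapping protein -> list of expression values;
                proteins may be missing from this dict
    """
    candidates = set().union(*(graph[c] for c in core)) - core
    core_profiles = [expression[c] for c in core if c in expression]
    attachments = set()
    for p in candidates:
        # Topological rule: well connected to the core.
        if closeness(graph, p, core) >= min_closeness:
            attachments.add(p)
            continue
        # Expression rule: low closeness, but co-expressed with the core.
        profile = expression.get(p)
        if profile is None or not core_profiles:
            continue  # no expression data: topology alone decided
        mean_sim = sum(pearson(profile, cp) for cp in core_profiles) / len(core_profiles)
        if mean_sim >= min_similarity:
            attachments.add(p)
    return attachments


# Toy network: D is topologically close to the core; E is not, but is co-expressed.
toy_graph = {
    "A": {"B", "C", "D", "E"}, "B": {"A", "C", "D"}, "C": {"A", "B", "F"},
    "D": {"A", "B"}, "E": {"A"}, "F": {"C"},
}
toy_expression = {
    "A": [1.0, 2.0, 3.1], "B": [2.0, 4.0, 6.0], "C": [1.0, 2.0, 2.9],
    "E": [1.0, 2.0, 3.0], "F": [3.0, 2.0, 1.0],
}
toy_attachments = select_attachments(toy_graph, toy_expression, {"A", "B", "C"})
```

In the toy network, D passes the closeness rule, E is admitted only through the expression rule, and F fails both.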

2011 ◽  
Vol 135-136 ◽  
pp. 602-608
Author(s):  
Ya Meng ◽  
Xue Qun Shang ◽  
Miao Miao ◽  
Miao Wang

Mining functional modules with biological significance has attracted much attention recently. However, protein-protein interaction (PPI) networks and other biological data generally carry uncertainties caused by noise, incompleteness, and inaccuracy. In this paper, we focus on PPI data with uncertainties to explore interesting protein complexes. We extend several known graph concepts and use them to develop a depth-first algorithm that mines protein complexes in a simple uncertain graph. Our experiments use protein complexes from the MIPS database as the standard for assessing the results. Experimental results indicate that our algorithm performs well in terms of coverage and precision. The results are also assessed against Gene Ontology (GO) annotations, and the evaluation shows that the proteins within most of the mined complexes are highly similar. Finally, several experiments test the scalability of our algorithm, and their results are reported.


2010 ◽  
Vol 08 (supp01) ◽  
pp. 47-62 ◽  
Author(s):  
LIANG YU ◽  
LIN GAO ◽  
KUI LI

In this paper, we present a method based on local density and random walks (LDRW) for detecting core-attachment complexes in protein-protein interaction (PPI) networks, whether weighted or unweighted. Our LDRW method consists of two stages. First, it finds all the protein-complex cores based on the local density of subnetworks. Then it uses random walks with restarts to find the attachment proteins of each detected core and form complexes. We evaluate the effectiveness of our method using two different yeast PPI networks and validate the biological significance of the predicted protein complexes using known complexes in the Munich Information Center for Protein Sequence (MIPS) and Gene Ontology (GO) databases. We also perform a comprehensive comparison between our method and other existing methods. The results show that our method finds more protein complexes with high biological significance and obtains a significant improvement. Furthermore, our method is able to identify biologically significant overlapping protein complexes.
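The second stage mentioned in this abstract relies on random walks with restart. The following Python sketch shows the generic technique by power iteration; it is not the LDRW code. The restart probability, the convergence tolerance, the function name, and the choice to restart uniformly over the core proteins are all assumptions. Unweighted networks are represented by giving every edge weight 1.0.

```python
def random_walk_with_restart(graph, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at `seeds`.

    graph: dict mapping node -> dict of neighbor -> positive edge weight
    seeds: non-empty set of nodes (for example, one detected core)
    """
    if not seeds:
        raise ValueError("seeds must not be empty")
    nodes = list(graph)
    # Restart distribution: uniform over the seed (core) proteins.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            total = sum(graph[u].values())
            if total == 0:
                # Isolated node: return its probability mass to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in graph[u].items():
                # Move along each edge in proportion to its weight.
                nxt[v] += (1 - restart) * p[u] * w / total
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Path A - B - C - D with core {A, B}: C should score higher than D.
path_graph = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
rwr_scores = random_walk_with_restart(path_graph, {"A", "B"})
```

Non-core proteins can then be ranked by their score, and those above some cutoff treated as attachment candidates; how the cutoff is chosen is left open here.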


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Min Li ◽  
Weijie Chen ◽  
Jianxin Wang ◽  
Fang-Xiang Wu ◽  
Yi Pan

Identification of protein complexes from protein-protein interaction networks has become a key problem for understanding cellular life in the postgenomic era. Many computational methods have been proposed for identifying protein complexes. Up to now, the existing computational methods have mostly been applied to static PPI networks. However, proteins and their interactions are dynamic in reality. Identifying dynamic protein complexes is more meaningful and challenging. In this paper, a novel algorithm, named DPC, is proposed to identify dynamic protein complexes by integrating PPI data and gene expression profiles. Following the Core-Attachment assumption, proteins that are always active in the molecular cycle are regarded as core proteins. The protein-complex cores are identified from these always-active proteins by detecting dense subgraphs. Final protein complexes are extended from the protein-complex cores by adding attachments based on a topological character of “closeness” and dynamic meaning. The protein complexes produced by DPC contain two parts: a static core expressed throughout the molecular cycle and short-lived dynamic attachments. DPC was applied to data from Saccharomyces cerevisiae, and the experimental results show that DPC outperforms CMC, MCL, SPICi, HC-PIN, COACH, and Core-Attachment based on validation against known complexes and hF-measures.
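The split between always-active core candidates and short-lived proteins can be sketched in Python as follows. This is an illustration only: the abstract does not state how activity is decided, so the single fixed expression threshold and the function name are assumptions.

```python
def split_by_activity(expression, threshold):
    """Separate always-active proteins from transiently active ones.

    expression: dict mapping protein -> list of expression levels,
                one value per time point of the cycle
    threshold:  assumed fixed activity cutoff (hypothetical parameter)
    """
    always_active = set()
    transient = {}
    for protein, profile in expression.items():
        active_times = [t for t, level in enumerate(profile) if level >= threshold]
        if profile and len(active_times) == len(profile):
            always_active.add(protein)  # candidate for a static core
        elif active_times:
            transient[protein] = active_times  # candidate dynamic attachment
    return always_active, transient


toy_profiles = {"P1": [5.0, 6.0, 7.0], "P2": [1.0, 6.0, 1.0], "P3": [0.0, 0.0, 0.0]}
core_candidates, dynamic_candidates = split_by_activity(toy_profiles, threshold=4.0)
```

Dense subgraphs would then be searched only among the always-active proteins, with the transient proteins considered as attachments at the time points where they are active.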


2019 ◽  
Vol 2019 ◽  
pp. 1-17 ◽  
Author(s):  
Jinxiong Zhang ◽  
Cheng Zhong ◽  
Hai Xiang Lin ◽  
Mian Wang

Identification of protein complexes is very important for revealing the underlying mechanisms of biological processes. Many computational methods have been developed to identify protein complexes from static protein-protein interaction (PPI) networks. Recently, researchers have begun to consider the dynamics of protein-protein interactions. Dynamic PPI networks are closer to reality in the cell system. It is expected that more protein complexes can be accurately identified from dynamic PPI networks. In this paper, we use the undulating degree above the base level of gene expression, instead of the gene expression level, to construct dynamic temporal PPI networks. We then convert the dynamic temporal PPI networks into dynamic Temporal Interval Protein Interaction Networks (TI-PINs) and propose a novel method to accurately identify more protein complexes from the constructed TI-PINs. Because they preserve continuous interactions within a temporal interval, the constructed TI-PINs contain more dynamic information for accurately identifying more protein complexes. Our proposed identification method uses multisource biological data to judge whether the joint colocalization condition, the joint coexpression condition, and the expanding cluster condition are satisfied; this ensures that the identified protein complexes have the features of colocalization, coexpression, and functional homogeneity. The experimental results on yeast data sets demonstrate that the constructed TI-PINs yield better identification of protein complexes than five existing dynamic PPI networks, and that our proposed identification method finds more protein complexes accurately than four other methods.


2020 ◽  
Vol 27 (33) ◽  
pp. 5530-5542
Author(s):  
Xiaoqing Ye ◽  
Gang Chen ◽  
Jia Jin ◽  
Binzhong Zhang ◽  
Yinda Wang ◽  
...  

Mixed Lineage Leukemia 1 (MLL1), an important member of the Histone Methyltransferase (HMT) family, is capable of catalyzing mono-, di-, and trimethylation of Histone 3 lysine 4 (H3K4). The optimal catalytic activity of MLL1 requires the formation of a core complex consisting of MLL1, WDR5, RbBP5, and ASH2L. The Protein-Protein Interaction (PPI) between WDR5 and MLL1 plays an important role in abnormal gene expression during tumorigenesis, and disrupting this interaction may have potential for the treatment of leukemia harboring MLL1 fusion proteins. In this review, we summarize recent progress in the development of inhibitors targeting the MLL1-WDR5 interaction.


2021 ◽  
Vol 20 ◽  
pp. 153303382098329
Author(s):  
Yujie Weng ◽  
Wei Liang ◽  
Yucheng Ji ◽  
Zhongxian Li ◽  
Rong Jia ◽  
...  

Human epidermal growth factor receptor 2 (HER2)+ breast cancer is considered the most dangerous type of breast cancer. Herein, we used bioinformatics methods to identify potential key genes in HER2+ breast cancer to enable its diagnosis, treatment, and prognosis prediction. Datasets of HER2+ breast cancer and normal tissue samples retrieved from the Gene Expression Omnibus and The Cancer Genome Atlas databases were analyzed for differentially expressed genes using R software. The identified differentially expressed genes were subjected to gene ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses, followed by construction of protein-protein interaction networks using the STRING database to identify key genes. The genes were further validated via survival and differential gene expression analyses. We identified 97 upregulated and 106 downregulated genes that were primarily associated with processes such as mitosis, protein kinase activity, cell cycle, and the p53 signaling pathway. Visualization of the protein-protein interaction network identified 10 key genes (CCNA2, CDK1, CDC20, CCNB1, DLGAP5, AURKA, BUB1B, RRM2, TPX2, and MAD2L1), all of which were upregulated. Survival analysis using PROGgeneV2 showed that CDC20, CCNA2, DLGAP5, RRM2, and TPX2 are prognosis-related key genes in HER2+ breast cancer. A nomogram showed that high expression of RRM2, DLGAP5, and TPX2 was positively associated with the risk of death. TPX2, which has not previously been reported in HER2+ breast cancer, was associated with breast cancer development, progression, and prognosis and is therefore a potential key gene. It is hoped that this study can provide a new method for the diagnosis and treatment of HER2+ breast cancer.

