Towards the identification of protein complexes and functional modules by integrating PPI network and gene expression data

2012 ◽  
Vol 13 (1) ◽  
Author(s):  
Min Li ◽  
Xuehong Wu ◽  
Jianxin Wang ◽  
Yi Pan
BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zihao Zhao ◽  
Wenjun Xu ◽  
Aiwen Chen ◽  
Yueyue Han ◽  
Shengrong Xia ◽  
...  

Abstract Background The study of protein complexes and protein functional modules has become an important way to understand the mechanisms and organization of life activities. Clustering algorithms that analyze the information contained in protein-protein interaction (PPI) networks are an effective way to explore the characteristics of protein functional modules. Results This paper studies the problems of low recognition efficiency and noise in the overlapping structure of protein functional modules, based on the topological characteristics of the PPI network. It develops ECTG, a protein functional module recognition method based on topological features and gene expression data for protein complex identification. Conclusions By using the similarity of gene expression patterns, the algorithm can effectively remove the noise introduced when topological characteristic values are calculated in the PPI network, and it makes proper use of the information hidden in the gene expression data. The experimental results show that the ECTG algorithm can detect protein functional modules better.
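The abstract does not spell out how ECTG combines topology with expression similarity, so the following is only a minimal sketch of the general idea of using co-expression to discard noisy PPI edges. The Pearson correlation measure, the `min_corr` cutoff, and the function names are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0.
    return cov / (sx * sy) if sx and sy else 0.0

def filter_edges(edges, expression, min_corr=0.5):
    """Keep PPI edges whose endpoints are co-expressed (hypothetical cutoff)."""
    kept = []
    for u, v in edges:
        if u in expression and v in expression:
            if pearson(expression[u], expression[v]) >= min_corr:
                kept.append((u, v))
    return kept

# Toy data: P1 and P2 rise together, P3 moves in the opposite direction.
expression = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],
    "P3": [4.0, 3.0, 2.0, 1.0],
}
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
kept = filter_edges(edges, expression)
```

In this toy network only the P1-P2 interaction survives, since the other two edges connect proteins with anti-correlated profiles. A clustering step would then run on the filtered network.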


2007 ◽  
Vol 8 (1) ◽  
pp. 408 ◽  
Author(s):  
Ioannis A Maraziotis ◽  
Konstantina Dimitrakopoulou ◽  
Anastasios Bezerianos

2020 ◽  
Author(s):  
Jiancheng Zhong ◽  
Chao Tang ◽  
Wei Peng ◽  
Minzhu Xie ◽  
Yusui Sun ◽  
...  

Abstract Background: Some proposed methods for identifying essential proteins achieve better results by using biological information. Gene expression data is generally used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein identification. Therefore, we propose an essential protein identification method based on gene expression and PPI network data that calculates the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network. Our experiments show that the method can improve the accuracy of essential protein prediction. Results: In this paper, we propose a new measure named JDC, which is based on PPI network data and gene expression data. The JDC method offers a dynamic threshold method to binarize gene expression data. After that, it combines the degree centrality and the Jaccard similarity index to calculate the JDC score for each protein in the PPI network. We benchmark the JDC method on four organisms and evaluate it using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that JDC performs better than DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We also compare JDC with the NF-PIN and TS-PIN methods, which predict essential proteins through active PPI networks constructed from dynamic gene expression. Conclusions: We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods given the same input. The main ideas behind JDC are as follows: (1) Essential proteins generally form densely connected clusters in the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of a protein depends on the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network.


2021 ◽  
Vol 16 ◽  
Author(s):  
Yuanyuan Chen ◽  
Xiaodan Fan ◽  
Cong Pian

Aims: The aim of this article was to find functional (or disease-relevant) modules using gene expression data. Background: Biotechnological developments are leading to a rapid increase in the volume of transcriptome data and thus driving the growth of interactome data. This has made it possible to perform transcriptomic analysis by integrating interactome data. Because genes neither exist nor operate in isolation, and instead participate in biological networks, interactomics is as important as expression profiles. Objective: We constructed a network-based method that uses gene expression data to identify functional (or disease-relevant) modules. Method: We used energy minimization with graph cuts, integrating gene interaction networks under the 'guilt by association' principle. Result: Our method performs well in an independent simulation experiment and is able to identify strongly disease-relevant modules in real experiments. It finds important functional modules associated with two subtypes of lymphoma in a lymphoma microarray dataset. Moreover, the method can identify the biological subnetworks and most of the genes associated with Duchenne muscular dystrophy. Conclusion: We successfully adapted energy minimization with graph cuts to identify functionally important genes from genomic data by integrating gene interaction networks.
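To make the 'guilt by association' energy concrete, here is a small sketch of the kind of objective such a method could minimize: a per-gene data cost plus a penalty for every network edge whose endpoints receive different labels. The cost values, the `smoothness` weight, and the function names are assumptions for illustration. The minimizer below is plain brute-force enumeration, which is only feasible for tiny examples. It is not the graph cuts algorithm the abstract refers to, which finds the same minimum efficiently for this class of energies.

```python
from itertools import product

def energy(labels, unary, edges, smoothness=1.0):
    """Data cost per gene plus a penalty for each edge with differing labels."""
    data_cost = sum(unary[g][labels[g]] for g in labels)
    pair_cost = sum(smoothness for u, v in edges if labels[u] != labels[v])
    return data_cost + pair_cost

def minimize_by_enumeration(unary, edges, smoothness=1.0):
    """Try every 0/1 labeling (exponential; stands in for graph cuts here)."""
    genes = sorted(unary)
    best_labels, best_energy = None, float("inf")
    for assignment in product((0, 1), repeat=len(genes)):
        labels = dict(zip(genes, assignment))
        e = energy(labels, unary, edges, smoothness)
        if e < best_energy:
            best_labels, best_energy = labels, e
    return best_labels, best_energy

# Hypothetical costs: (cost of label 0 = not relevant, cost of label 1 = relevant).
unary = {"g1": (2.0, 0.2), "g2": (0.9, 1.1), "g3": (0.1, 2.0)}
edges = [("g1", "g2")]  # g2 interacts with the strongly relevant gene g1

with_network, e_with = minimize_by_enumeration(unary, edges, smoothness=1.0)
without_network, e_without = minimize_by_enumeration(unary, edges, smoothness=0.0)
```

On expression evidence alone (`smoothness=0.0`), the borderline gene g2 is labeled not relevant. Once the interaction with g1 is penalized for disagreement, g2 is pulled into the relevant module, which is the behavior the 'guilt by association' assumption is meant to produce.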


2020 ◽  
Author(s):  
Jiancheng Zhong ◽  
Chao Tang ◽  
Wei Peng ◽  
Minzhu Xie ◽  
Yusui Sun ◽  
...  

Abstract Background: Some proposed methods for identifying essential proteins achieve better results by using biological information. Gene expression data is generally used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein identification. Therefore, we propose an essential protein identification method, based on gene expression and PPI network data, that calculates the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network. Our experiments show that our method can improve the accuracy of essential protein prediction. Results: In this paper, we propose a new measure, named JDC, based on PPI network data and gene expression data. The JDC method offers a dynamic threshold method to binarize gene expression data. After that, it combines the degree centrality and the Jaccard similarity index to calculate the JDC score for each protein in the PPI network. We perform experiments on yeast and E. coli data and evaluate our method using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that JDC performs better than DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We also compare JDC with the NF-PIN and TS-PIN methods, which predict essential proteins from active PPI networks constructed with dynamic gene expression. Conclusions: We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods. The main ideas behind JDC are as follows: (1) Essential proteins generally form densely connected clusters in the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of a protein depends on the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jiancheng Zhong ◽  
Chao Tang ◽  
Wei Peng ◽  
Minzhu Xie ◽  
Yusui Sun ◽  
...  

Abstract Background Some proposed methods for identifying essential proteins achieve better results by using biological information. Gene expression data is generally used to identify essential proteins. However, gene expression data is prone to fluctuations, which may affect the accuracy of essential protein identification. Therefore, we propose an essential protein identification method based on gene expression and PPI network data that calculates the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network. Our experiments show that the method can improve the accuracy of essential protein prediction. Results In this paper, we propose a new measure named JDC, which is based on PPI network data and gene expression data. The JDC method offers a dynamic threshold method to binarize gene expression data. After that, it combines the degree centrality and the Jaccard similarity index to calculate the JDC score for each protein in the PPI network. We benchmark the JDC method on four organisms and evaluate it using ROC analysis, modular analysis, jackknife analysis, overlapping analysis, top analysis, and accuracy analysis. The results show that JDC performs better than DC, IC, EC, SC, BC, CC, NC, PeC, and WDC. We also compare JDC with the NF-PIN and TS-PIN methods, which predict essential proteins through active PPI networks constructed from dynamic gene expression. Conclusions We demonstrate that the new centrality measure, JDC, is more efficient than state-of-the-art prediction methods given the same input. The main ideas behind JDC are as follows: (1) Essential proteins generally form densely connected clusters in the PPI network. (2) Binarizing gene expression data can screen out fluctuations in gene expression profiles. (3) The essentiality of a protein depends on the similarity of the "active" and "inactive" states of gene expression in a cluster of the PPI network.
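The three ideas listed in the abstract can be illustrated with a short sketch. The abstract gives neither the exact dynamic threshold nor the exact way degree centrality and Jaccard similarity are combined, so both are assumptions here: the threshold is taken as mean plus `k` standard deviations of each gene's own profile, and the score is the sum of Jaccard similarities between a protein's binarized profile and those of its neighbors, so that it grows with both degree and co-activity. All function names are hypothetical.

```python
from statistics import mean, pstdev

def binarize(profile, k=0.5):
    """Mark a time point active (1) when it exceeds mean + k * std (assumed rule)."""
    threshold = mean(profile) + k * pstdev(profile)
    return [1 if x > threshold else 0 for x in profile]

def jaccard(a, b):
    """Jaccard index of the active time points of two binary profiles."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def jdc_like_scores(edges, expression, k=0.5):
    """Sum of neighbor Jaccard similarities per protein (illustrative score)."""
    states = {p: binarize(v, k) for p, v in expression.items()}
    neighbors = {p: set() for p in expression}
    for u, v in edges:
        if u in neighbors and v in neighbors and u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {
        p: sum(jaccard(states[p], states[q]) for q in nbrs)
        for p, nbrs in neighbors.items()
    }

# Toy data: A and B are active at the same time points, C at the opposite ones,
# and D is flat, so it is never marked active.
expression = {
    "A": [1, 5, 5, 1],
    "B": [1, 6, 6, 1],
    "C": [5, 1, 1, 5],
    "D": [1, 1, 1, 1],
}
edges = [("A", "B"), ("A", "C"), ("A", "D")]
scores = jdc_like_scores(edges, expression)
```

Protein A has three neighbors, but only the co-active neighbor B contributes to its score, which shows how binarized expression can discount interactions that plain degree centrality would count in full.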


2021 ◽  
Vol 22 (S1) ◽  
Author(s):  
Fangfang Zhu ◽  
Jiang Li ◽  
Juan Liu ◽  
Wenwen Min

Abstract Background Since genes involved in the same biological modules usually present correlated expression profiles, many computational methods have been proposed to identify gene functional modules based on expression profile data. Recently, the Sparse Singular Value Decomposition (SSVD) method has been proposed to bicluster gene expression data to identify gene modules. However, this model can only handle gene expression data in which no gene interaction information is integrated. Ignoring prior gene interaction information may make the identified gene modules hard to interpret biologically. Results In this paper, we develop a Sparse Network-regularized SVD (SNSVD) method that integrates a prior gene interaction network, taken from a protein-protein interaction network, with gene expression data to identify underlying gene functional modules. The results on a set of simulated data show that SNSVD is more effective than the traditional SVD-based methods. Further experimental results on real cancer genomic data show that most co-expressed modules are not only significantly enriched in GO/KEGG pathways, but also correspond to dense sub-networks in the prior gene interaction network. In addition, we use our method to identify ten differentially co-expressed miRNA-gene modules by integrating matched miRNA and mRNA expression data of breast cancer from The Cancer Genome Atlas (TCGA). Several important breast cancer related miRNA-gene modules are discovered. Conclusions All the results demonstrate that SNSVD can overcome the drawbacks of SSVD and capture more biologically relevant functional modules by incorporating a prior gene interaction network. These identified functional modules may provide a new perspective for understanding the diagnosis, occurrence and progression of cancer.


Author(s):  
Debahuti Mishra ◽  
Kailash Shaw ◽  
Sashikala Mishra ◽  
Amiya Kumar Rath ◽  
Milu Acharya
