Identifying Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks

Identifying protein complexes from protein-protein interaction (PPI) networks has become a key problem for understanding cellular life in the postgenomic era. Many computational methods have been proposed for this task, but most existing methods are applied to static PPI networks. In reality, proteins and their interactions are dynamic, and identifying dynamic protein complexes is both more meaningful and more challenging. In this paper, a novel algorithm named DPC is proposed to identify dynamic protein complexes by integrating PPI data and gene expression profiles. Following the Core-Attachment assumption, proteins that are always active in the molecular cycle are regarded as core proteins. Protein-complex cores are identified from these always-active proteins by detecting dense subgraphs. Final protein complexes are extended from the cores by adding attachments based on a topological measure of “closeness” and on dynamic meaning. The protein complexes produced by DPC contain two parts: a static core expressed throughout the molecular cycle and short-lived dynamic attachments. DPC was applied to data from Saccharomyces cerevisiae, and the experimental results show that DPC outperforms CMC, MCL, SPICi, HC-PIN, COACH, and Core-Attachment in terms of matching with known complexes and hF-measures.
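The core-attachment idea described in this abstract can be illustrated with a minimal Python sketch. It is not the DPC algorithm itself: the abstract does not specify how activity, density, or "closeness" are computed, so the fixed expression threshold, the density cutoff, the closeness ratio, and all function names below are assumptions made for illustration only.

```python
from itertools import combinations


def always_active(expression, threshold):
    """Proteins whose expression exceeds `threshold` at every time point (assumed rule)."""
    return {p for p, series in expression.items() if all(v > threshold for v in series)}


def density(nodes, edges):
    """Fraction of possible node pairs that are connected by an edge."""
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for a, b in combinations(sorted(nodes), 2) if frozenset((a, b)) in edges)
    return 2.0 * present / (n * (n - 1))


def find_complexes(ppi_edges, expression, threshold=0.5, min_density=0.7, closeness=0.5):
    """Return (core, attachments) pairs; attachments map protein -> active time points."""
    edges = {frozenset(e) for e in ppi_edges if len(set(e)) == 2}  # drop self-loops
    adj = {}
    for e in edges:
        a, b = tuple(e)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    candidates = always_active(expression, threshold)
    seen, complexes = set(), []
    for start in sorted(candidates):
        if start in seen:
            continue
        # Connected component restricted to always-active proteins.
        comp, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend((adj.get(u, set()) & candidates) - comp)
        seen |= comp
        # Keep the component as a core only if it is dense enough (assumed criterion).
        if len(comp) < 2 or density(comp, edges) < min_density:
            continue
        attachments = {}
        for p in sorted(adj.keys() - comp):
            if p not in expression:
                continue
            # "Closeness" here: share of core members that p interacts with (assumption).
            if len(adj[p] & comp) / len(comp) >= closeness:
                active_at = [t for t, v in enumerate(expression[p]) if v > threshold]
                if active_at:
                    attachments[p] = active_at
        complexes.append((comp, attachments))
    return complexes


# Toy data (hypothetical): three time points per protein.
ppi = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "A"), ("D", "B"), ("E", "C")]
expression = {
    "A": [0.9, 0.8, 0.7],
    "B": [0.9, 0.9, 0.9],
    "C": [0.8, 0.6, 0.9],
    "D": [0.1, 0.9, 0.2],
    "E": [0.9, 0.1, 0.1],
}
complexes = find_complexes(ppi, expression)
```

In this toy example, A, B and C form the static core, D is attached only at the time point where it is active, and E is rejected because it touches too few core members.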

Identifying dynamic protein complexes based on gene expression profiles and PPI networks

2013 IEEE International Conference on Bioinformatics and Biomedicine ◽

10.1109/bibm.2013.6732610 ◽

2013 ◽

Author(s):

Min Li ◽

Weijie Chen ◽

Jianxin Wang ◽

Fang-Xiang Wu ◽

Yi Pan

Keyword(s):

Gene Expression ◽

Protein Complexes ◽

Expression Profiles ◽

Gene Expression Profiles ◽

PPI Networks

Integrating network topology, gene expression data and GO annotation information for protein complex prediction

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972001950001x ◽

2019 ◽

Vol 17 (01) ◽

pp. 1950001 ◽

Cited By ~ 3

Author(s):

Wei Zhang ◽

Jia Xu ◽

Yuanyuan Li ◽

Xiufen Zou

Keyword(s):

Gene Expression ◽

Protein Interaction ◽

Protein Interaction Network ◽

Protein Complex ◽

Complex Disease ◽

Protein Complexes ◽

Expression Profiles ◽

Interaction Network ◽

PPI Networks ◽

Annotation Information

The prediction of protein complexes based on the protein interaction network is a fundamental task for the understanding of cellular life as well as the mechanisms underlying complex disease. A great number of methods have been developed to predict protein complexes based on protein–protein interaction (PPI) networks in recent years. However, because the high-throughput data obtained from experimental biotechnology are incomplete and usually contain a large number of spurious interactions, most network-based protein complex identification methods are sensitive to the reliability of the PPI network. In this paper, we propose a new method, Identification of Protein Complex based on Refined Protein Interaction Network (IPC-RPIN), which integrates the topology, gene expression profiles and GO functional annotation information to predict protein complexes from the reconstructed networks. To demonstrate the performance of the IPC-RPIN method, we evaluated IPC-RPIN on three PPI networks of Saccharomyces cerevisiae and compared it with four state-of-the-art methods. The simulation results show that IPC-RPIN achieved a better result than the other methods on most of the measurements and is able to discover small protein complexes, which have traditionally been neglected.

A Novel Core-Attachment-Based Method to Identify Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks

PROTEOMICS ◽

10.1002/pmic.201800129 ◽

2019 ◽

Vol 19 (5) ◽

pp. 1800129 ◽

Cited By ~ 1

Author(s):

Qianghua Xiao ◽

Ping Luo ◽

Min Li ◽

Jianxin Wang ◽

Fang-Xiang Wu

Keyword(s):

Gene Expression ◽

Protein Complexes ◽

Expression Profiles ◽

Gene Expression Profiles ◽

PPI Networks

Identification of Lung-Cancer-Related Genes with the Shortest Path Approach in a Protein-Protein Interaction Network

BioMed Research International ◽

10.1155/2013/267375 ◽

2013 ◽

Vol 2013 ◽

pp. 1-8 ◽

Cited By ~ 8

Author(s):

Bi-Qing Li ◽

Jin You ◽

Lei Chen ◽

Jian Zhang ◽

Ning Zhang ◽

...

Keyword(s):

Gene Expression ◽

Lung Cancer ◽

Shortest Path ◽

Protein Interaction ◽

Expression Profiles ◽

Shortest Paths ◽

Gene Expression Profiles ◽

Cancer Genes ◽

PPI Network ◽

Protein Protein Interaction

Lung cancer is one of the leading causes of cancer mortality worldwide. The main types of lung cancer are small cell lung cancer (SCLC) and nonsmall cell lung cancer (NSCLC). In this work, a computational method was proposed for identifying lung-cancer-related genes with a shortest path approach in a protein-protein interaction (PPI) network. Based on the PPI data from STRING, a weighted PPI network was constructed. 54 NSCLC- and 84 SCLC-related genes were retrieved from associated KEGG pathways. Then the shortest paths between each pair of these 54 NSCLC genes and 84 SCLC genes were obtained with Dijkstra’s algorithm. Finally, all the genes on the shortest paths were extracted, and 25 and 38 shortest path genes with a permutation P value less than 0.05 for NSCLC and SCLC, respectively, were selected for further analysis. Some of the shortest path genes have been reported to be related to lung cancer. Intriguingly, the candidate genes we identified from the PPI network contained more cancer genes than those identified from the gene expression profiles. Furthermore, these genes possessed more functional similarity with the known cancer genes than those identified from the gene expression profiles. This study proved the efficiency of the proposed method and showed promising results.
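The shortest path step described in this abstract can be sketched in Python as follows. This is an illustrative sketch only: the abstract does not state how edge weights were derived from STRING scores, so the weights are assumed here to be costs (lower means a stronger interaction), the gene names are hypothetical, and the permutation test used for filtering is not reproduced.

```python
import heapq
from itertools import combinations


def build_graph(weighted_edges):
    """Build an undirected adjacency map {node: {neighbor: cost}}."""
    graph = {}
    for a, b, w in weighted_edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def dijkstra_path(graph, source, target):
    """Return one lowest-cost path from source to target, or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def shortest_path_genes(graph, seeds):
    """Count how often each non-seed gene lies on a shortest path between two seeds."""
    counts = {}
    for a, b in combinations(sorted(seeds), 2):
        path = dijkstra_path(graph, a, b)
        if path is None:
            continue
        for gene in path[1:-1]:
            if gene not in seeds:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


# Toy network (hypothetical): S1..S3 are known disease genes, X and Y are candidates.
graph = build_graph([
    ("S1", "X", 1.0), ("X", "S2", 1.0), ("S1", "S2", 5.0),
    ("S2", "Y", 1.0), ("Y", "S3", 1.0), ("S2", "S3", 1.5),
])
candidates = shortest_path_genes(graph, {"S1", "S2", "S3"})
```

Here X is picked up because it lies on the cheapest routes out of S1, while Y is not, since the direct S2 to S3 edge is cheaper than the detour through Y.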

GENE FUNCTION PREDICTION BY A COMBINED ANALYSIS OF GENE EXPRESSION DATA AND PROTEIN-PROTEIN INTERACTION DATA

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001612 ◽

2005 ◽

Vol 03 (06) ◽

pp. 1371-1389 ◽

Cited By ~ 11

Author(s):

GUANGHUA XIAO ◽

WEI PAN

Keyword(s):

Gene Expression ◽

Protein Interaction ◽

Gene Function ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Function Prediction ◽

Protein Interaction Data ◽

Interaction Data ◽

Gene Function Prediction ◽

Protein Protein Interaction

Prediction of biological functions of genes is an important issue in basic biology research and has applications in drug discovery and gene therapy. Previous studies have shown that either gene expression data or protein-protein interaction data alone can be used for predicting gene functions. In particular, clustering gene expression profiles has been widely used for gene function prediction. In this paper, we first propose a new method for gene function prediction using protein-protein interaction data, which will facilitate combining prediction results based on clustering gene expression profiles. We then propose a new method to combine the prediction results based on either source of data by weighting on the evidence provided by each. Using protein-protein interaction data downloaded from the GRID database and published gene expression profiles from 300 microarray experiments for the yeast S. cerevisiae, we show that this new combined analysis provides improved predictive performance over that of using either data source alone in a cross-validated analysis of the MIPS gene annotations. Finally, we propose a logistic regression method that is flexible enough to combine information from any number of data sources while maintaining computational feasibility.

Identification of Differential Gene Expression and Related Signaling Pathways In SARS-COV-2 Infected Human Cells

10.21203/rs.3.rs-794267/v1 ◽

2021 ◽

Author(s):

Tian-Ao Xie ◽

Hou-He Li ◽

Zu-En Lin ◽

Xiao-Ye Lin ◽

Xin Meng ◽

...

Keyword(s):

Gene Expression ◽

Vaccine Development ◽

Virus Disease ◽

Expression Profiles ◽

Biological Significance ◽

Gene Expression Profiles ◽

Gene Expression Omnibus ◽

Human Cells ◽

Hub Genes ◽

Protein Protein Interaction

Abstract Background: The Corona Virus Disease 2019 (COVID-19) pandemic poses a serious public health threat to the survival and health of people all over the world. We analyzed related mRNA data and gene expression profiles of human cell lines infected with SARS-CoV-2 obtained from GEO (GSE148729), using bioinformatics tools. Differentially expressed genes (DEGs) of human cells infected with SARS-CoV-2 were identified. Method: The GSE148729 datasets were downloaded from the Gene Expression Omnibus (GEO) database. To explore the biological significance of the DEGs, Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment of the DEGs were performed. Protein-protein interaction (PPI) networks of the DEGs were constructed by using the STRING database. The hub genes were selected using the Cytoscape software, and a t-test was performed to validate the hub genes. Result: A total of 1241 DEGs were screened, including 1049 up-regulated genes and 192 down-regulated genes. In addition, 10 hub genes were obtained from the PPI network, among which the expression levels of CXCL2, Etv7, and HIST1H2BG were found to be statistically significant. Conclusion: Bioinformatics analysis reveals genes and cellular pathways that are significantly altered in SARS-CoV-2 infected cells. This helps to guide further clinical study of SARS-CoV-2 and provides new perspectives for vaccine development.

Screening of Hub Genes Associated with Lung Adenocarcinoma by Integrated Bioinformatic Analysis

10.21203/rs.3.rs-580657/v1 ◽

2021 ◽

Author(s):

Zimeng Wei ◽

Min Zhao ◽

Linnan Zang

Keyword(s):

Gene Expression ◽

Lung Adenocarcinoma ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Enrichment Analysis ◽

Bioinformatic Analysis ◽

Analysis Tool ◽

Hub Genes ◽

KEGG Analysis ◽

PPI Networks

Abstract Background: Lung adenocarcinoma (LUAD) is the main histological subtype of lung cancer. The molecular mechanism underlying LUAD is not yet clearly defined, and elucidating it in detail would be of great significance for clinical diagnosis and treatment. Methods: Gene expression profiles were retrieved from the Gene Expression Omnibus (GEO) database, and the common differentially expressed genes (DEGs) were identified with the online GEO2R analysis tool. Subsequently, enrichment analysis of the functions and signaling pathways of the DEGs in LUAD was performed by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. The protein-protein interaction (PPI) networks of the DEGs were established through the Search Tool for the Retrieval of Interacting Genes (STRING) database, and hub genes were screened with the CytoHubba plug-in in Cytoscape. Afterwards, we examined the expression of hub genes in LUAD and other cancers via the GEPIA, Oncomine and HPA databases. Finally, Kaplan-Meier plotter was used to analyze the prognostic value of the hub genes. Results: 74 up-regulated and 238 down-regulated DEGs were identified. KEGG analysis revealed that the up-regulated DEGs were mainly enriched in protein digestion and absorption, whereas the down-regulated DEGs were primarily enriched in cell adhesion molecules. Subsequently, 9 hub genes (KIAA0101, CDCA7, TOP2A, CDC20, ASPM, TPX2, CENPF, UBE2T and ECT2) were identified and showed higher expression in both LUAD and other cancers. Finally, all these hub genes were found to be significantly related to the prognosis of LUAD (p < 0.05). Conclusions: Our results screened out the hub genes and pathways related to the development and prognosis of LUAD, which could provide new insight for future molecularly targeted therapy and prognosis evaluation of LUAD.

Supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by omega-3 polyunsaturated fatty acids

Current Chinese Science ◽

10.2174/2210298102666220112114505 ◽

2022 ◽

Vol 02 ◽

Author(s):

Sergey Shityakov ◽

Jane Pei-Chen Chang ◽

Ching-Fang Sun ◽

David Ta-Wei Guu ◽

Thomas Dandekar ◽

...

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Protein Interaction ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Interaction Network ◽

Supervised Machine Learning ◽

Omega 3 ◽

Protein Protein Interaction ◽

Protein Protein Interaction Network

Background: Omega-3 polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic (EPA) and docosahexaenoic (DHA) acids, have beneficial effects on human health, but their effect on gene expression in elderly individuals (age ≥ 65) is largely unknown. To examine this, gene expression profiles were analyzed in healthy subjects (n = 96) at baseline and after 26 weeks of supplementation with EPA+DHA to determine up-regulated and down-regulated differentially expressed genes (DEGs) triggered by PUFAs. The protein-protein interaction (PPI) networks were constructed by mapping these DEGs to a human interactome and linking them to the specific pathways. Objective: This study aimed to implement supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by PUFAs. Methods: The transcriptional profile of GSE12375 was obtained from the Gene Expression Omnibus database, which is based on the Affymetrix NuGO array. The probe cell intensity data were converted into gene expression values, and background correction was performed by the multi-array average algorithm. The LIMMA (Linear Models for Microarray Data) algorithm was implemented to identify relevant DEGs at baseline and after 26 weeks of supplementation with a p-value < 0.05. The DAVID web server was used to identify and construct the enriched KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. Finally, machine learning (ML) models, including logistic regression, naïve Bayes, and deep neural networks, were constructed for the analyzed DEGs associated with the specific pathways. Results: The results revealed that up-regulated DEGs were associated with neurotrophin/MAPK signaling, whereas the down-regulated DEGs were linked to cancer, acute myeloid leukemia, and long-term depression pathways. Additionally, ML approaches were able to cluster the EPA/DHA-treated and control groups, with logistic regression performing the best. Conclusion: Overall, this study highlights the pivotal changes in DEGs induced by PUFAs and provides the rationale for the implementation of ML algorithms as predictive models for this type of biomedical data.

Novel Insight Into Glycosaminoglycan Biosynthesis Based on Gene Expression Profiles

Frontiers in Cell and Developmental Biology ◽

10.3389/fcell.2021.709018 ◽

2021 ◽

Vol 9 ◽

Author(s):

Yi-Fan Huang ◽

Shuji Mizumoto ◽

Morihisa Fujita

Keyword(s):

Gene Expression ◽

Chondroitin Sulfate ◽

Dermatan Sulfate ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Sulfate Proteoglycan ◽

Expression Levels ◽

Core Proteins ◽

Dermatan Sulfate Proteoglycan ◽

Insight Into

Glycosaminoglycans (GAGs), including chondroitin sulfate, dermatan sulfate, heparan sulfate, and keratan sulfate, but not hyaluronan, which is a free polysaccharide, are covalently attached to core proteins to form proteoglycans. More than 50 gene products are involved in the biosynthesis of GAGs. We recently developed a comprehensive glycosylation mapping tool, GlycoMaple, for visualization and estimation of glycan structures based on gene expression profiles. Using this tool, the expression levels of GAG biosynthetic genes were analyzed in various human tissues as well as tumor tissues. In brain and pancreatic tumors, the pathways for biosynthesis of chondroitin and dermatan sulfate were predicted to be upregulated. In breast cancerous tissues, the pathways for biosynthesis of chondroitin and dermatan sulfate were predicted to be up- and down-regulated, respectively, which is consistent with biochemical findings published in the literature. In addition, the expression levels of the chondroitin sulfate-proteoglycan versican and the dermatan sulfate-proteoglycan decorin were up- and down-regulated, respectively. These findings from GlycoMaple analysis may provide new insight into GAG profiles in various human diseases, including cancerous tumors as well as neurodegenerative disease.

Integrated analysis of lncRNA–miRNA–mRNA ceRNA network and the potential prognosis indicators in sarcomas

BMC Medical Genomics ◽

10.1186/s12920-021-00918-x ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Lu Gao ◽

Yu Zhao ◽

Xuelei Ma ◽

Ling Zhang

Keyword(s):

Gene Expression ◽

Survival Analysis ◽

Gene Prediction ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Gene Expression Omnibus ◽

Integrated Analysis ◽

PPI Network ◽

Protein Protein Interaction ◽

ceRNA Network

Abstract Background: Competitive endogenous RNA (ceRNA) networks have revealed a new mechanism of interaction between RNAs and play crucial roles in multiple biological processes and the development of neoplasms. They might serve as diagnostic and prognostic markers as well as therapeutic targets. Methods: In this work, we identified differentially expressed mRNAs (DEGs), lncRNAs (DELs) and miRNAs (DEMs) in sarcomas by comparing the gene expression profiles between sarcoma and normal muscle samples in Gene Expression Omnibus (GEO) datasets. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were applied to investigate the primary functions of the overlapped DEGs. Then, lncRNA-miRNA and miRNA-mRNA interactions were predicted, and the ceRNA regulatory network was constructed using Cytoscape software. In addition, protein–protein interaction (PPI) network analysis and survival analysis were performed. Results: A total of 1296 DEGs were identified in sarcoma samples by combining the GO and KEGG enrichment analyses, 338 DELs were discovered after the probes were reannotated, and 36 DEMs were ascertained through intersecting two different expression miRNA sets. Further, through target gene prediction, a lncRNA–miRNA–mRNA ceRNA network that contained 113 mRNAs, 69 lncRNAs and 29 miRNAs was constructed. The PPI network identified the six most significant hub proteins. Survival analysis revealed that seven mRNAs, four miRNAs and one lncRNA were associated with overall survival of sarcoma patients. Conclusions: Overall, we constructed a ceRNA network in sarcomas, which might provide insights for further research on the molecular mechanism and potential prognostic biomarkers.