Integrating Gene Expression Data, Protein Interaction Data, and Ontology-Based Literature Searches

Predicting the biological functions of genes is an important problem in basic biology research, with applications in drug discovery and gene therapy. Previous studies have shown that either gene expression data or protein-protein interaction data alone can be used to predict gene functions. In particular, clustering of gene expression profiles has been widely used for gene function prediction. In this paper, we first propose a new method for gene function prediction using protein-protein interaction data, which facilitates combination with prediction results based on clustering gene expression profiles. We then propose a new method that combines the prediction results from the two data sources by weighting the evidence provided by each. Using protein-protein interaction data downloaded from the GRID database and published gene expression profiles from 300 microarray experiments for the yeast S. cerevisiae, we show that this combined analysis provides improved predictive performance over using either data source alone in a cross-validated analysis of the MIPS gene annotations. Finally, we propose a logistic regression method that is flexible enough to combine information from any number of data sources while maintaining computational feasibility.
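The abstract does not give the form of the logistic regression model, so the following is only a minimal sketch of the general idea: per-source evidence scores for a gene are combined into one probability of having a given function. The two features (an expression-based score and an interaction-based score), the toy data, and the names `fit_logistic` and `predict_proba` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    features: list of tuples, one score per data source (hypothetical).
    labels:   1 if the gene is annotated with the function, else 0.
    """
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict_proba(x, weights, bias):
    # Probability that a gene with source scores x has the function.
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Toy, made-up data: (expression-based score, interaction-based score).
toy_scores = [(0.9, 0.8), (0.8, 0.3), (0.2, 0.9), (0.7, 0.7),
              (0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (0.4, 0.2)]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(toy_scores, toy_labels)
```

Adding a further data source in this sketch only means appending one more score to each feature tuple, which is the kind of flexibility the abstract attributes to a logistic regression formulation.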

CGI: a new approach for prioritizing genes by combining gene expression and protein–protein interaction data

Bioinformatics ◽

10.1093/bioinformatics/btl569 ◽

2006 ◽

Vol 23 (2) ◽

pp. 215-221 ◽

Cited By ~ 76

Author(s):

Xiaotu Ma ◽

Hyunju Lee ◽

Li Wang ◽

Fengzhu Sun

Keyword(s):

Gene Expression ◽

Protein Interaction ◽

Protein Interaction Data ◽

Interaction Data ◽

New Approach ◽

Protein Protein Interaction

Construction and application of dynamic protein interaction network based on time course gene expression data

PROTEOMICS ◽

10.1002/pmic.201200277 ◽

2013 ◽

Vol 13 (2) ◽

pp. 301-312 ◽

Cited By ~ 97

Author(s):

Jianxin Wang ◽

Xiaoqing Peng ◽

Min Li ◽

Yi Pan

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Protein Interaction ◽

Protein Interaction Network ◽

Time Course ◽

Interaction Network ◽

Expression Data

Extracting compact representation of knowledge from gene expression data for protein-protein interaction

International Journal of Data Mining and Bioinformatics ◽

10.1504/ijdmb.2017.085711 ◽

2017 ◽

Vol 17 (4) ◽

pp. 279

Author(s):

Haohan Wang ◽

Aman Gupta ◽

Ming Xu

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Protein Interaction ◽

Compact Representation ◽

Expression Data ◽

Protein Protein Interaction ◽

Representation Of Knowledge

Growing functional modules from a seed protein via integration of protein interaction and gene expression data

BMC Bioinformatics ◽

10.1186/1471-2105-8-408 ◽

2007 ◽

Vol 8 (1) ◽

pp. 408 ◽

Cited By ~ 57

Author(s):

Ioannis A Maraziotis ◽

Konstantina Dimitrakopoulou ◽

Anastasios Bezerianos

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Protein Interaction ◽

Seed Protein ◽

Functional Modules ◽

Expression Data

Protein-Protein Interaction Networks Comparison Between Paediatric Neuroblastoma Cancer and Glioblastoma Multiforme Cancer with Gene Expression Data

2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer) ◽

10.1109/icter51097.2020.9325492 ◽

2020 ◽

Author(s):

S.P.B.M Senadheera ◽

A.R. Weerasinghe

Keyword(s):

Gene Expression ◽

Glioblastoma Multiforme ◽

Gene Expression Data ◽

Protein Interaction ◽

Protein Interaction Networks ◽

Interaction Networks ◽

Expression Data ◽

Protein Protein Interaction ◽

Protein Protein Interaction Networks

Gene expression and protein–protein interaction data for identification of colon cancer related genes using f-information measures

Natural Computing ◽

10.1007/s11047-015-9485-6 ◽

2015 ◽

Vol 15 (3) ◽

pp. 449-463 ◽

Cited By ~ 8

Author(s):

Sushmita Paul ◽

Pradipta Maji

Keyword(s):

Gene Expression ◽

Colon Cancer ◽

Protein Interaction ◽

Protein Interaction Data ◽

Interaction Data ◽

Protein Protein Interaction ◽

Information Measures

A Cross-Entropy-based Method for Essential Protein Identification in Yeast Protein–Protein Interaction Network

Current Bioinformatics ◽

10.2174/1574893615999201116210840 ◽

2020 ◽

Vol 15 ◽

Author(s):

Weimiao Sun ◽

Lei Wang ◽

Jiaxin Peng ◽

Zhen Zhang ◽

Tingrui Pei ◽

...

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Protein Interaction ◽

Prediction Accuracy ◽

Cross Entropy ◽

Expression Data ◽

Essential Proteins ◽

Protein Protein Interaction ◽

Simulation Results ◽

Initial Score

Background: Research has shown that essential proteins play important roles in the development and survival of organisms. Because of the high costs of traditional biological experiments, several computational prediction methods based on known protein-protein interactions (PPIs) have been recently proposed to detect essential proteins. Objective: Here, a novel prediction model called IoMCD is proposed to identify essential proteins by combining known PPIs with a variety of biological information about proteins, including gene expression data and homologous information of proteins. Methods: Compared to the traditional state-of-the-art prediction models, IoMCD involves two kinds of weights that are obtained, respectively, by extracting topological features of proteins from the original known protein–protein interaction (PPI) networks and calculating the Pearson correlation coefficients (PCCs) between the gene expression data of proteins. Based on these two kinds of weights and adopting a cross-entropy method, a unique weight is assigned to each protein. Subsequently, the homologous information of proteins is used to calculate an initial score for each protein. Finally, based on the unique weights and initial score of proteins, an iterative method is designed to measure the essentialities of proteins. Results: Intensive experiments were performed, and simulation results showed that the prediction accuracy of IoMCD, based on the dataset downloaded from the DIP and Gavin databases, was 92.16% and 89.71%, respectively, in the top 1% of the predicted essential proteins. Conclusion: Both simulation results demonstrated that IoMCD can achieve excellent prediction accuracy and could be an effective method for essential protein prediction.
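One of the two kinds of weights described in the Methods is the Pearson correlation coefficient between the expression profiles of interacting proteins. The following is a minimal sketch of that single step only, under the assumption that each PPI edge receives the PCC of its two endpoints; the cross-entropy combination and the iterative scoring of IoMCD are not reproduced here. The names `pearson` and `weight_edges` and the toy profiles are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        # A constant profile has no defined correlation; use 0 by convention.
        return 0.0
    return cov / (sd_x * sd_y)

def weight_edges(edges, expression):
    """Attach the PCC of the two endpoint profiles to each PPI edge."""
    return {(u, v): pearson(expression[u], expression[v]) for u, v in edges}

# Toy, made-up expression profiles over four time points.
toy_expression = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.0],   # rises together with P1
    "P3": [4.0, 3.0, 2.0, 1.0],   # falls as P1 rises
}
toy_edges = [("P1", "P2"), ("P1", "P3")]

edge_weights = weight_edges(toy_edges, toy_expression)
```

In this toy network the P1-P2 edge gets a weight close to 1 and the P1-P3 edge a weight close to -1, so co-expressed interaction partners are separated from anti-correlated ones.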
