Integrating Gene Expression Data, Protein Interaction Data, and Ontology-Based Literature Searches

Author(s):  
Panos Dafas ◽  
Alexander Kozlenkov ◽  
Alan Robinson ◽  
Michael Schroeder
2005 ◽  
Vol 03 (06) ◽  
pp. 1371-1389 ◽  
Author(s):  
GUANGHUA XIAO ◽  
WEI PAN

Prediction of biological functions of genes is an important issue in basic biology research and has applications in drug discovery and gene therapy. Previous studies have shown that either gene expression data or protein-protein interaction data alone can be used for predicting gene functions. In particular, clustering gene expression profiles has been widely used for gene function prediction. In this paper, we first propose a new method for gene function prediction using protein-protein interaction data, which will facilitate combining prediction results based on clustering gene expression profiles. We then propose a new method to combine the prediction results based on either source of data by weighting the evidence provided by each. Using protein-protein interaction data downloaded from the GRID database and published gene expression profiles from 300 microarray experiments for the yeast S. cerevisiae, we show that this new combined analysis provides improved predictive performance over that of using either data source alone in a cross-validated analysis of the MIPS gene annotations. Finally, we propose a logistic regression method that is flexible enough to combine information from any number of data sources while maintaining computational feasibility.
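
The idea of combining evidence from several data sources with logistic regression can be illustrated with a minimal sketch. This is not the authors' implementation: the per-gene scores, the toy labels, and the function names `fit_logistic` and `predict` are all hypothetical, and plain batch gradient descent stands in for whatever fitting procedure the paper uses. Each row holds one score per data source (here, an expression-based score and an interaction-based score), so adding a source only adds a column.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit [bias, w_1, ..., w_k] by batch gradient descent on the log-loss."""
    k = len(rows[0])
    n = len(rows)
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
            err = p - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w

def predict(w, x):
    """Probability that a gene has the function, given its per-source scores."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

# Hypothetical toy data: (expression_score, interaction_score) per gene,
# with label 1 if the gene is annotated with the function of interest.
rows = [(0.9, 0.8), (0.8, 0.3), (0.2, 0.9), (0.1, 0.2),
        (0.3, 0.1), (0.7, 0.7), (0.2, 0.3), (0.6, 0.9)]
labels = [1, 1, 1, 0, 0, 1, 0, 1]

weights = fit_logistic(rows, labels)
strong_evidence = predict(weights, (0.9, 0.9))
weak_evidence = predict(weights, (0.1, 0.1))
```

The fitted coefficients play the role of per-source weights: a source whose score separates annotated from unannotated genes more cleanly receives a larger coefficient. In practice the fit would be evaluated by cross-validation, as the abstract describes.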


2007 ◽  
Vol 8 (1) ◽  
pp. 408 ◽  
Author(s):  
Ioannis A Maraziotis ◽  
Konstantina Dimitrakopoulou ◽  
Anastasios Bezerianos

2020 ◽  
Vol 15 ◽  
Author(s):  
Weimiao Sun ◽  
Lei Wang ◽  
Jiaxin Peng ◽  
Zhen Zhang ◽  
Tingrui Pei ◽  
...  

Background: Research has shown that essential proteins play important roles in the development and survival of organisms. Because of the high costs of traditional biological experiments, several computational prediction methods based on known protein-protein interactions (PPIs) have been recently proposed to detect essential proteins. Objective: Here, a novel prediction model called IoMCD is proposed to identify essential proteins by combining known PPIs with a variety of biological information about proteins, including gene expression data and homologous information of proteins. Methods: Compared to the traditional state-of-the-art prediction models, IoMCD involves two kinds of weights that are obtained, respectively, by extracting topological features of proteins from the original known protein–protein interaction (PPI) networks and calculating the Pearson correlation coefficients (PCCs) between the gene expression data of proteins. Based on these two kinds of weights and adopting a cross-entropy method, a unique weight is assigned to each protein. Subsequently, the homologous information of proteins is used to calculate an initial score for each protein. Finally, based on the unique weights and initial score of proteins, an iterative method is designed to measure the essentialities of proteins. Results: Intensive experiments were performed, and simulation results showed that the prediction accuracy of IoMCD, based on the dataset downloaded from the DIP and Gavin databases, was 92.16% and 89.71%, respectively, in the top 1% of the predicted essential proteins. Conclusion: Both simulation results demonstrated that IoMCD can achieve excellent prediction accuracy and could be an effective method for essential protein prediction.
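
Two ingredients named in the Methods can be sketched in a few lines: the Pearson correlation coefficient between the expression profiles of interacting proteins, and an iterative update that starts from an initial per-protein score. The sketch below is an assumption-laden illustration, not IoMCD itself: the abstract does not give the update rule, the topological weights, or the cross-entropy step, so a generic neighbor-averaging propagation with a hypothetical damping factor `alpha` is swapped in, and the protein names, expression values, and initial scores are invented toy data.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_a * var_b)

def edge_weights(edges, expr):
    """Weight each PPI edge by the PCC of its endpoints, clipped at zero."""
    return {(u, v): max(0.0, pearson(expr[u], expr[v])) for u, v in edges}

def iterate_scores(edges, weights, init, alpha=0.5, iters=50):
    """Blend each protein's initial score with the weighted mean of its neighbors."""
    nbrs = {p: {} for p in init}
    for u, v in edges:
        nbrs[u][v] = weights[(u, v)]
        nbrs[v][u] = weights[(u, v)]
    scores = dict(init)
    for _ in range(iters):
        updated = {}
        for p in scores:
            total = sum(nbrs[p].values())
            spread = (sum(w * scores[q] for q, w in nbrs[p].items()) / total
                      if total > 0 else 0.0)
            updated[p] = (1 - alpha) * init[p] + alpha * spread
        scores = updated
    return scores

# Hypothetical toy network and expression profiles.
expr = {"A": [1.0, 2.0, 3.0, 4.0], "B": [2.0, 4.1, 6.2, 7.9],
        "C": [4.0, 3.0, 2.5, 1.0], "D": [3.9, 3.1, 2.0, 1.2]}
edges = [("A", "B"), ("B", "C"), ("C", "D")]
init = {"A": 0.9, "B": 0.2, "C": 0.1, "D": 0.6}  # stand-in for homology-based scores

weights = edge_weights(edges, expr)
scores = iterate_scores(edges, weights, init)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Clipping negative correlations to zero is a design choice made here so that the weights can be normalized into a weighted mean; a method could equally rescale the PCC to the interval from 0 to 1. Proteins are then ranked by their final score, and the top fraction is taken as the predicted essential set.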

