Detection of Changes in Transitive Associations by Shortest-path Analysis of Protein Interaction Networks Integrated with Gene Expression Profiles

Lung cancer is one of the leading causes of cancer mortality worldwide. The main types of lung cancer are small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). In this work, a computational method was proposed for identifying lung-cancer-related genes with a shortest-path approach in a protein-protein interaction (PPI) network. Based on the PPI data from STRING, a weighted PPI network was constructed. 54 NSCLC- and 84 SCLC-related genes were retrieved from associated KEGG pathways. Then the shortest paths between each pair of these 54 NSCLC genes and 84 SCLC genes were obtained with Dijkstra's algorithm. Finally, all the genes on the shortest paths were extracted, and 25 and 38 shortest-path genes with a permutation P value less than 0.05 were selected for NSCLC and SCLC, respectively, for further analysis. Some of the shortest-path genes have been reported to be related to lung cancer. Intriguingly, the candidate genes identified from the PPI network contained more cancer genes than those identified from the gene expression profiles. Furthermore, these genes showed more functional similarity with the known cancer genes than those identified from the gene expression profiles. This study demonstrated the effectiveness of the proposed method and showed promising results.
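The shortest-path procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy graph, the edge costs, the function names, and the permutation scheme (resampling random seed sets of the same size and comparing how often each gene lies on a shortest path) are all assumptions made for the example. In particular, the abstract does not say how STRING scores are converted to edge weights, nor whether paths were computed within each seed set or across both.

```python
import heapq
import itertools
import random
from collections import Counter


def dijkstra_path(graph, source, target):
    """Return the lowest-cost path from source to target, or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def path_gene_counts(graph, seeds):
    """Count how often each non-seed gene lies inside a shortest path between two seeds."""
    seed_set = set(seeds)
    counts = Counter()
    for a, b in itertools.combinations(sorted(seed_set), 2):
        path = dijkstra_path(graph, a, b)
        if path:
            counts.update(g for g in path[1:-1] if g not in seed_set)
    return counts


def permutation_p_values(graph, seeds, n_perm=200, rng=None):
    """Empirical P value per candidate gene, using random seed sets of equal size (assumed scheme)."""
    rng = rng or random.Random(0)
    observed = path_gene_counts(graph, seeds)
    nodes = sorted(graph)
    exceed = Counter()
    for _ in range(n_perm):
        random_seeds = rng.sample(nodes, len(set(seeds)))
        permuted = path_gene_counts(graph, random_seeds)
        for gene, count in observed.items():
            if permuted.get(gene, 0) >= count:
                exceed[gene] += 1
    return {g: (exceed[g] + 1) / (n_perm + 1) for g in observed}


# Hypothetical toy network: keys are genes, values map neighbours to edge costs
# (lower cost = stronger interaction; the conversion from STRING scores is assumed).
graph = {
    "A": {"X": 0.1, "B": 1.0},
    "X": {"A": 0.1, "B": 0.1},
    "B": {"X": 0.1, "A": 1.0, "Y": 0.2},
    "Y": {"B": 0.2, "C": 0.2},
    "C": {"Y": 0.2},
}
seeds = ["A", "B", "C"]  # stand-ins for known disease genes

counts = path_gene_counts(graph, seeds)
p_values = permutation_p_values(graph, seeds, n_perm=50)
```

Genes whose P value falls below a chosen threshold (0.05 in the abstract) would be kept as candidates. On a network this small the P values are not meaningful; the example only shows the mechanics.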


Gene expression profiles and protein-protein interaction networks in amyotrophic lateral sclerosis patients with C9orf72 mutation

Orphanet Journal of Rare Diseases ◽

10.1186/s13023-016-0531-y ◽

2016 ◽

Vol 11 (1) ◽

Cited By ~ 24

Author(s):

Meena Kumari Kotni ◽

Mingzhu Zhao ◽

Dong-Qing Wei

Keyword(s):

Gene Expression ◽

Amyotrophic Lateral Sclerosis ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Protein Interaction Networks ◽

Interaction Networks ◽

Protein Protein Interaction ◽

C9orf72 Mutation ◽

Protein Protein Interaction Networks ◽

Lateral Sclerosis


The Approximability of Shortest Path-Based Graph Orientations of Protein–Protein Interaction Networks

Journal of Computational Biology ◽

10.1089/cmb.2013.0064 ◽

2013 ◽

Vol 20 (12) ◽

pp. 945-957 ◽

Cited By ~ 4

Author(s):

Dima Blokh ◽

Danny Segev ◽

Roded Sharan

Keyword(s):

Shortest Path ◽

Protein Interaction ◽

Protein Interaction Networks ◽

Interaction Networks ◽

Protein Protein Interaction ◽

Protein Protein Interaction Networks


GENE FUNCTION PREDICTION BY A COMBINED ANALYSIS OF GENE EXPRESSION DATA AND PROTEIN-PROTEIN INTERACTION DATA

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001612 ◽

2005 ◽

Vol 03 (06) ◽

pp. 1371-1389 ◽

Cited By ~ 11

Author(s):

GUANGHUA XIAO ◽

WEI PAN

Keyword(s):

Gene Expression ◽

Protein Interaction ◽

Gene Function ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Function Prediction ◽

Protein Interaction Data ◽

Interaction Data ◽

Gene Function Prediction ◽

Protein Protein Interaction

Prediction of the biological functions of genes is an important issue in basic biology research and has applications in drug discovery and gene therapy. Previous studies have shown that either gene expression data or protein-protein interaction data alone can be used for predicting gene functions. In particular, clustering gene expression profiles has been widely used for gene function prediction. In this paper, we first propose a new method for gene function prediction using protein-protein interaction data, which will facilitate combining prediction results based on clustering gene expression profiles. We then propose a new method to combine the prediction results based on either source of data by weighting on the evidence provided by each. Using protein-protein interaction data downloaded from the GRID database and published gene expression profiles from 300 microarray experiments for the yeast S. cerevisiae, we show that this new combined analysis provides improved predictive performance over that of using either data source alone in a cross-validated analysis of the MIPS gene annotations. Finally, we propose a logistic regression method that is flexible enough to combine information from any number of data sources while maintaining computational feasibility.
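The idea of combining predictions from two data sources by weighting the evidence from each can be illustrated with a small sketch. The abstract does not specify the weighting scheme, so the weighted average below, the function name `combine_predictions`, the weights, and the example scores are all hypothetical; this is not the method proposed in the paper.

```python
def combine_predictions(expr_scores, ppi_scores, expr_weight, ppi_weight):
    """Weighted average of per-function scores from two sources.

    A source with no score for a function is left out of that function's
    average. Weights are assumed to be positive.
    """
    combined = {}
    for func in set(expr_scores) | set(ppi_scores):
        weighted_sum = 0.0
        total_weight = 0.0
        if func in expr_scores:
            weighted_sum += expr_weight * expr_scores[func]
            total_weight += expr_weight
        if func in ppi_scores:
            weighted_sum += ppi_weight * ppi_scores[func]
            total_weight += ppi_weight
        combined[func] = weighted_sum / total_weight
    return combined


# Hypothetical scores for one gene: expression clustering vs. interaction data.
expr_scores = {"ribosome biogenesis": 0.6, "DNA repair": 0.3}
ppi_scores = {"DNA repair": 0.9}

combined = combine_predictions(expr_scores, ppi_scores, expr_weight=1.0, ppi_weight=3.0)
best_function = max(combined, key=combined.get)
```

Here the interaction data are trusted three times as much as the expression clusters, so the function supported by both sources ends up ranked first. In practice the weights would have to be estimated, for example by cross-validation.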


Weighted edge based clustering to identify protein complexes in protein–protein interaction networks incorporating gene expression profile

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2016.10.001 ◽

2016 ◽

Vol 65 ◽

pp. 69-79 ◽

Cited By ~ 17

Author(s):

Seketoulie Keretsu ◽

Rosy Sarmah

Keyword(s):

Gene Expression ◽

Gene Expression Profile ◽

Expression Profile ◽

Protein Interaction ◽

Protein Complexes ◽

Protein Interaction Networks ◽

Interaction Networks ◽

Protein Protein Interaction ◽

Protein Protein Interaction Networks ◽

Edge Based


Supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by omega-3 polyunsaturated fatty acids

Current Chinese Science ◽

10.2174/2210298102666220112114505 ◽

2022 ◽

Vol 02 ◽

Author(s):

Sergey Shityakov ◽

Jane Pei-Chen Chang ◽

Ching-Fang Sun ◽

David Ta-Wei Guu ◽

Thomas Dandekar ◽

...

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Protein Interaction ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Interaction Network ◽

Supervised Machine Learning ◽

Omega 3 ◽

Protein Protein Interaction ◽

Protein Protein Interaction Network

Background: Omega-3 polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic (EPA) and docosahexaenoic (DHA) acids, have beneficial effects on human health, but their effect on gene expression in elderly individuals (age ≥ 65) is largely unknown. To examine this, gene expression profiles were analyzed in healthy subjects (n = 96) at baseline and after 26 weeks of supplementation with EPA+DHA to determine up-regulated and down-regulated differentially expressed genes (DEGs) triggered by PUFAs. The protein-protein interaction (PPI) networks were constructed by mapping these DEGs to a human interactome and linking them to the specific pathways. Objective: This study aimed to implement supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by PUFAs. Methods: The transcriptional profile of GSE12375, which is based on the Affymetrix NuGO array, was obtained from the Gene Expression Omnibus database. The probe cell intensity data were converted into gene expression values, and background correction was performed by the multi-array average algorithm. The LIMMA (Linear Models for Microarray Data) algorithm was implemented to identify relevant DEGs at baseline and after 26 weeks of supplementation with a p-value < 0.05. The DAVID web server was used to identify and construct the enriched KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. Finally, machine learning (ML) models, including logistic regression, naïve Bayes, and deep neural networks, were constructed for the analyzed DEGs associated with the specific pathways. Results: The results revealed that up-regulated DEGs were associated with neurotrophin/MAPK signaling, whereas the down-regulated DEGs were linked to cancer, acute myeloid leukemia, and long-term depression pathways. Additionally, ML approaches were able to cluster the EPA/DHA-treated and control groups, with logistic regression performing the best. Conclusion: Overall, this study highlights the pivotal changes in DEGs induced by PUFAs and provides the rationale for the implementation of ML algorithms as predictive models for this type of biomedical data.


A deep learning model based on sparse auto-encoder for prioritizing cancer-related genes and drug target combinations

Carcinogenesis ◽

10.1093/carcin/bgz044 ◽

2019 ◽

Vol 40 (5) ◽

pp. 624-632

Author(s):

Ji-Wei Chang ◽

Yuduan Ding ◽

Muhammad Tahir ul Qamar ◽

Yin Shen ◽

Junxiang Gao ◽

...

Keyword(s):

Gene Expression ◽

Deep Learning ◽

Protein Expression ◽

Protein Interaction ◽

Targeted Therapies ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Expression Data ◽

Related Proteins ◽

Deep Learning Model

Prioritization of cancer-related genes from gene expression profiles and proteomic data is vital to improving targeted-therapy research. Although computational approaches have been complementing high-throughput biological experiments in the understanding of human diseases, it remains a big challenge to accurately discover cancer-related proteins/genes via automatic learning from large-scale protein/gene expression data and protein–protein interaction data. Most of the existing methods are based on network construction combined with gene expression profiles, which ignore the diversity between normal samples and disease cell lines. In this study, we introduced a deep learning model based on a sparse auto-encoder to learn the specific characteristics of protein interactions in cancer cell lines integrated with protein expression data. The model showed the ability to identify cancer-related proteins/genes from different input protein expression profiles by extracting the characteristics of protein interaction information, and it could also predict cancer-related protein combinations. Compared with other reported methods, including differential expression and network-based methods, our model achieved the highest area under the curve value (>0.8) in predicting cancer-related genes. Our study prioritized ~500 high-confidence cancer-related genes; among these genes, 211 already known cancer drug targets were found, which supported the accuracy of our method. The above results indicated that the proposed auto-encoder model could computationally prioritize candidate proteins/genes involved in cancer and improve targeted-therapy research.
