Supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by omega-3 polyunsaturated fatty acids

2022 ◽  
Vol 02 ◽  
Author(s):  
Sergey Shityakov ◽  
Jane Pei-Chen Chang ◽  
Ching-Fang Sun ◽  
David Ta-Wei Guu ◽  
Thomas Dandekar ◽  
...  

Background: Omega-3 polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic (EPA) and docosahexaenoic (DHA) acids, have beneficial effects on human health, but their effect on gene expression in elderly individuals (age ≥ 65) is largely unknown. To examine this, gene expression profiles were analyzed in healthy subjects (n = 96) at baseline and after 26 weeks of supplementation with EPA+DHA to determine the up-regulated and down-regulated differentially expressed genes (DEGs) triggered by PUFAs. Protein-protein interaction (PPI) networks were constructed by mapping these DEGs to a human interactome and linking them to specific pathways. Objective: This study aimed to implement supervised machine learning models and protein-protein interaction network analysis of gene expression profiles induced by PUFAs. Methods: The transcriptional profile GSE12375, generated on the Affymetrix NuGO array, was obtained from the Gene Expression Omnibus database. The probe cell intensity data were converted into gene expression values, and background correction was performed with the multi-array average algorithm. The LIMMA (Linear Models for Microarray Data) algorithm was used to identify relevant DEGs between baseline and 26 weeks of supplementation at a p-value < 0.05. The DAVID web server was used to identify and construct the enriched KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. Finally, machine learning (ML) models, including logistic regression, naïve Bayes, and deep neural networks, were constructed for the analyzed DEGs associated with the specific pathways. Results: The up-regulated DEGs were associated with neurotrophin/MAPK signaling, whereas the down-regulated DEGs were linked to cancer, acute myeloid leukemia, and long-term depression pathways. Additionally, ML approaches were able to cluster the EPA/DHA-treated and control groups, with logistic regression performing the best. Conclusion: Overall, this study highlights the pivotal changes in DEGs induced by PUFAs and provides the rationale for the implementation of ML algorithms as predictive models for this type of biomedical data.
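To make the classification step in the abstract above more concrete, the following is a minimal sketch of a logistic regression that separates treated from control samples using DEG expression values. It is not the authors' pipeline: the abstract does not name the software used, so everything here (the function names `fit_logistic` and `predict`, the learning rate, the epoch count, and the toy expression values) is an assumption made for illustration. The model is fitted by plain batch gradient descent using only the Python standard library.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss.

    X: list of samples, each a list of (centered) expression values.
    y: list of labels, 0 = control, 1 = EPA/DHA-treated (hypothetical coding).
    """
    n_features = len(X[0])
    n_samples = len(X)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n_samples for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def predict(X, w, b):
    """Assign label 1 when the predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


# Made-up, centered expression values for two hypothetical DEGs.
X_toy = [
    [-1.0, -0.5], [-0.8, -1.2], [-1.5, -0.3],  # control samples
    [1.0, 0.7], [1.2, 0.4], [0.6, 1.1],        # treated samples
]
y_toy = [0, 0, 0, 1, 1, 1]

w_toy, b_toy = fit_logistic(X_toy, y_toy)
labels_toy = predict(X_toy, w_toy, b_toy)
```

The sketch scores the model on its own training samples only to keep it short; comparing classifiers as the study did would need held-out data or cross-validation.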

2020 ◽  
Author(s):  
Sergey Shityakov ◽  
Jane Pei-Chen Chang ◽  
Ching-Fang Sun ◽  
David Ta-Wei Guu ◽  
Thomas Dandekar ◽  
...  

Background: Omega-3 polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic (EPA) and docosahexaenoic (DHA) acids, have beneficial effects on human health, but their effect on gene expression in elderly individuals (age ≥ 65) is largely unknown. To examine this, gene expression profiles were analyzed in healthy subjects (n = 96) at baseline and after 26 weeks of supplementation with EPA+DHA to determine the up-regulated and down-regulated differentially expressed genes (DEGs) triggered by PUFAs. Protein-protein interaction networks were constructed by mapping these DEGs to a human interactome and linking them to specific pathways. Results: The up-regulated DEGs were associated with neurotrophin/MAPK signaling, whereas the down-regulated DEGs were linked to cancer, acute myeloid leukemia, and long-term depression pathways. Additionally, machine learning (ML) approaches were able to cluster the EPA/DHA-treated and control groups, with the logistic regression algorithm performing the best. Conclusion: Overall, this study highlights the pivotal changes in DEGs induced by PUFAs and provides the rationale for the implementation of ML algorithms as predictive models for this type of biomedical data.


2018 ◽  
Vol 6 (4) ◽  
pp. 129-140
Author(s):  
Zhi-Jian Li ◽  
Xing-Ling Sui ◽  
Xue-Bo Yang ◽  
Wen Sun

To reveal the biology of acute myeloid leukemia (AML), we compared gene-expression profiles between normal hematopoietic cells from 38 healthy donors and leukemic blasts (LBs) from 26 AML patients. We defined the comparison of LB and unselected BM as experiment 1, LB and CD34+ isolated from BM as experiment 2, LB and unselected PB as experiment 3, and LB and CD34+ isolated from PB as experiment 4. Then, a protein–protein interaction network of DEGs was constructed to identify critical genes. Regulatory impact factors were used to identify critical transcription factors from the differential co-expression network, which was constructed by reanalyzing the microarray profile from the perspective of differential co-expression. Gene ontology enrichment was performed to extract biological meaning. Comparing the number of DEGs obtained in the four experiments showed that the cells did not tend toward differentiation and that CD34+ cells were more similar to cancer stem cells. Based on the protein–protein interaction network, CREBBP, F2RL1, MCM2, and TP53 were the key genes in experiments 1, 2, 3, and 4, respectively. From the gene ontology analysis, we found that immune response was the most common term across the four experiments. Our results might provide a platform for determining the pathology and therapy of AML.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Zhengqing Zhu ◽  
Lei Zhong ◽  
Ronghang Li ◽  
Yuzhe Liu ◽  
Xiangrun Chen ◽  
...  

Osteoarthritis (OA) is a common cause of morbidity and disability worldwide. However, the pathogenesis of OA is unclear. Therefore, this study was conducted to characterize the pathogenesis and implicated genes of OA. The gene expression profiles of GSE82107 and GSE55235 were downloaded from the Gene Expression Omnibus database. Altogether, 173 differentially expressed genes including 68 upregulated genes and 105 downregulated genes in patients with OA were selected based on the criteria of ∣log fold‐change∣>1 and an adjusted p value < 0.05. Protein-protein interaction network analysis showed that FN1, COL1A1, IGF1, SPP1, TIMP1, BGN, COL5A1, MMP13, CLU, and SDC1 are the top ten genes most closely related to OA. Quantitative reverse transcription-polymerase chain reaction showed that the expression levels of COL1A1, COL5A1, TIMP1, MMP13, and SDC1 were significantly increased in OA. This study provides clues for the molecular mechanism and specific biomarkers of OA.
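The selection rule quoted in the abstract above (∣log fold-change∣ > 1 and adjusted p value < 0.05) can be sketched as a simple filter. This is an illustration only: the `GeneStat` record, the `select_degs` function, the gene labels, and all numbers are hypothetical, and the sketch assumes the fold-changes and adjusted p values have already been computed by a differential-expression tool.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str       # gene label (placeholder names below)
    log_fc: float   # log fold-change, disease vs. control
    adj_p: float    # multiple-testing-adjusted p value


def select_degs(stats, lfc_cutoff=1.0, p_cutoff=0.05):
    """Split genes into up- and down-regulated DEGs.

    A gene passes when |log_fc| > lfc_cutoff and adj_p < p_cutoff.
    """
    up = [s.gene for s in stats
          if s.adj_p < p_cutoff and s.log_fc > lfc_cutoff]
    down = [s.gene for s in stats
            if s.adj_p < p_cutoff and s.log_fc < -lfc_cutoff]
    return up, down


# Made-up statistics for illustration only.
toy_stats = [
    GeneStat("GENE_A", 2.3, 0.001),    # strong and significant: up
    GeneStat("GENE_B", -1.7, 0.01),    # strong and significant: down
    GeneStat("GENE_C", 0.4, 0.0001),   # significant but small effect
    GeneStat("GENE_D", 3.0, 0.2),      # large effect but not significant
]

up_genes, down_genes = select_degs(toy_stats)
```

Requiring both conditions keeps genes whose change is large and statistically supported, which is why `GENE_C` and `GENE_D` are excluded in the toy data.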


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
Bi-Qing Li ◽  
Jin You ◽  
Lei Chen ◽  
Jian Zhang ◽  
Ning Zhang ◽  
...  

Lung cancer is one of the leading causes of cancer mortality worldwide. The main types of lung cancer are small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC). In this work, a computational method was proposed for identifying lung-cancer-related genes with a shortest path approach in a protein-protein interaction (PPI) network. Based on the PPI data from STRING, a weighted PPI network was constructed. 54 NSCLC- and 84 SCLC-related genes were retrieved from associated KEGG pathways. Then the shortest paths between each pair of these 54 NSCLC genes and 84 SCLC genes were obtained with Dijkstra’s algorithm. Finally, all the genes on the shortest paths were extracted, and 25 and 38 shortest path genes with a permutation P value less than 0.05 for NSCLC and SCLC, respectively, were selected for further analysis. Some of the shortest path genes have been reported to be related to lung cancer. Intriguingly, the candidate genes we identified from the PPI network contained more cancer genes than those identified from the gene expression profiles. Furthermore, these genes possessed more functional similarity with the known cancer genes than those identified from the gene expression profiles. This study proved the efficiency of the proposed method and showed promising results.
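The shortest path step described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`build_graph`, `dijkstra_path`, `shortest_path_genes`), the node labels, and the integer edge costs are all hypothetical, and the abstract does not say how STRING scores were turned into edge weights. The permutation test mentioned in the abstract is not covered.

```python
import heapq
from itertools import combinations


def build_graph(edges):
    """Build an undirected weighted graph as a dict of dicts."""
    graph = {}
    for a, b, cost in edges:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def dijkstra_path(graph, source, target):
    """Return one lowest-cost path from source to target, or None."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if target not in dist:
        return None
    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def shortest_path_genes(graph, seeds):
    """Collect non-seed genes lying on shortest paths between seed pairs."""
    found = set()
    for a, b in combinations(sorted(seeds), 2):
        path = dijkstra_path(graph, a, b)
        if path:
            found.update(path[1:-1])  # interior nodes only
    return found - set(seeds)


# Toy network: S1..S3 stand in for known disease genes, X and Y for others.
toy_graph = build_graph([
    ("S1", "X", 1), ("X", "S2", 1), ("S1", "S2", 9),
    ("S2", "Y", 2), ("Y", "S3", 2), ("S2", "S3", 3),
])
candidates = shortest_path_genes(toy_graph, {"S1", "S2", "S3"})
```

In the toy network, `X` is picked up because the route through it is cheaper than the direct S1 to S2 edge, while `Y` is not, because the direct S2 to S3 edge is cheaper than the detour.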


2005 ◽  
Vol 03 (06) ◽  
pp. 1371-1389 ◽  
Author(s):  
GUANGHUA XIAO ◽  
WEI PAN

Prediction of biological functions of genes is an important issue in basic biology research and has applications in drug discoveries and gene therapies. Previous studies have shown either gene expression data or protein-protein interaction data alone can be used for predicting gene functions. In particular, clustering gene expression profiles has been widely used for gene function prediction. In this paper, we first propose a new method for gene function prediction using protein-protein interaction data, which will facilitate combining prediction results based on clustering gene expression profiles. We then propose a new method to combine the prediction results based on either source of data by weighting on the evidence provided by each. Using protein-protein interaction data downloaded from the GRID database, published gene expression profiles from 300 microarray experiments for the yeast S. cerevisiae, we show that this new combined analysis provides improved predictive performance over that of using either data source alone in a cross-validated analysis of the MIPS gene annotations. Finally, we propose a logistic regression method that is flexible enough to combine information from any number of data sources while maintaining computational feasibility.


2019 ◽  
Author(s):  
Guangxin Yan ◽  
Zhaoyu Liu

Hepatocellular carcinoma is one of the most common tumors in the world and has a high mortality rate. This study elucidates the mechanism underlying the development of hepatocellular carcinoma (HCC). The HCC gene expression profiles (GSE54238, GSE84004) were downloaded from Gene Expression Omnibus for comprehensive analysis. A total of 359 genes were identified, of which 195 were upregulated and 164 were downregulated. Analysis of the condensed results showed that “extracellular allotrope” is a substantially enriched term. “Cell cycle”, “metabolic pathway” and “DNA replication” are three significantly enriched Kyoto Encyclopedia of Genes and Genomes pathways. Subsequently, a protein-protein interaction network was constructed. The most important module in the protein-protein interaction network was selected for pathway enrichment analysis. The results showed that CCNA2, PLK1, CDC20, UBE2C and AURKA were identified as central genes, and the expression of these five hub genes in liver cancer was significantly increased in The Cancer Genome Atlas. Univariate regression analysis was also performed to show that the overall survival and disease-free survival of patients in the high expression group were longer than in the expression group. In addition, genes in important modules are mainly involved in “cell cycle”, “DNA replication” and “oocyte meiosis” signaling pathways. Finally, through upstream miRNA analysis, mir-300 and mir-381-3p were found to coregulate CCNA2, AURKA and UBE2C. These results provide a set of targets that can help researchers to further elucidate the underlying mechanism of liver cancer.


2021 ◽  
Author(s):  
Nikoleta Vavouraki ◽  
James E. Tomkins ◽  
Eleanna Kara ◽  
Henry Houlden ◽  
John Hardy ◽  
...  

The Hereditary Spastic Paraplegias are a group of neurodegenerative diseases characterized by spasticity and weakness in the lower body. Despite the identification of causative mutations in over 70 genes, the molecular aetiology remains unclear. Due to the combination of genetic diversity and variable clinical presentation, the Hereditary Spastic Paraplegias are a strong candidate for protein-protein interaction network analysis as a tool to understand disease mechanism(s) and to aid functional stratification of phenotypes. In this study, experimentally validated human protein-protein interactions were used to create a protein-protein interaction network based on the causative Hereditary Spastic Paraplegia genes. Network evaluation as a combination of both topological analysis and functional annotation led to the identification of core proteins in putative shared biological processes such as intracellular transport and vesicle trafficking. The application of machine learning techniques suggested a functional dichotomy linked with distinct sets of clinical presentations, suggesting there is scope to further classify conditions currently described under the same umbrella term of Hereditary Spastic Paraplegias based on specific molecular mechanisms of disease.

