Identification of Molecular Biomarkers Associated With Stroke Progression Using WGCNA and SVM-RFE

Author(s):  
Si-tong Liu ◽  
You Zhang ◽  
Xin-gui Wu ◽  
Chang-xing Lu ◽  
Qi-Ping Hu

Abstract Background: Stroke is the second most common cause of death worldwide and the leading cause of long-term severe disability; neurological impairment, especially of motor function, can worsen within hours of stroke onset. So far, there are no established, reliable biomarkers for stroke prognosis. Early detection of such biomarkers is of great importance for clinical intervention and for preventing clinical deterioration. Methods: The GSE119121 dataset was retrieved from the Gene Expression Omnibus (GEO) database, and weighted gene co-expression network analysis (WGCNA) was conducted to identify the key modules that could regulate disease progression. Functional enrichment analysis was then conducted to study the biological functions of the key module genes. The GSE16561 dataset was further analyzed with the support vector machine-recursive feature elimination (SVM-RFE) algorithm to identify the top genes regulating disease progression. The association of the hub genes revealed by WGCNA with disease progression was assessed using receiver operating characteristic (ROC) curve analysis. Subsequently, functional enrichment of the hub genes was performed using gene set variation analysis (GSVA), which transforms changes at the gene level into changes at the pathway level to identify the biological function of each sample. Finally, the expression level of the hub genes in the rat MCAO infarction model was measured using RT-qPCR for validation. Results: WGCNA revealed four hub genes: DEGS1, HSDL2, ST8SIA4 and STK3. GSVA showed that the hub genes were involved in stroke progression by regulating the p53 signaling pathway, the PI3K signaling pathway, and the inflammatory response pathway. RT-qPCR indicated that the expression of the four hub genes was significantly increased in the rat MCAO model. Conclusion: Several genes, such as DEGS1, HSDL2, ST8SIA4 and STK3, were identified and associated with the progression of the disease. Moreover, it was hypothesized that these genes may be involved in stroke progression by regulating the p53 signaling pathway, the PI3K signaling pathway, and the inflammatory response pathway. These genes have potential prognostic value and may serve as biomarkers for predicting stroke progression. Early identification of patients at risk of progression is essential to prevent clinical deterioration, and these findings provide a reference for future research.
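
The SVM-RFE step named in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the tiny subgradient-descent trainer stands in for a real linear SVM solver, the gene names and expression values are hypothetical, and the hyperparameters (`lam`, `lr`, `epochs`) are arbitrary assumptions. The idea it shows is the recursive loop: fit a linear SVM, drop the feature with the smallest squared weight, and refit on what remains.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit (w, b) by full-batch subgradient descent on the L2-regularised hinge loss.

    Stand-in for a proper SVM solver; labels must be +1 or -1.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge gradient
                for j in range(d):
                    gw[j] -= yi * xi[j] / n
                gb -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def svm_rfe(X, y, names, keep=1):
    """Rank features by recursively eliminating the smallest squared SVM weight."""
    active = list(range(len(names)))  # column indices still in the model
    eliminated = []
    while len(active) > keep:
        Xa = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(Xa, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        eliminated.append(names[active.pop(worst)])
    # Survivors first, then the rest from last eliminated to first eliminated.
    return [names[j] for j in active] + eliminated[::-1]


# Hypothetical toy data: columns are gene_a, gene_b, gene_c.
names = ["gene_a", "gene_b", "gene_c"]
X = [
    [2.0, 0.5, 0.1], [1.8, 0.3, -0.1], [2.2, 0.6, 0.1], [1.9, 0.4, -0.1],
    [-2.0, -0.5, 0.1], [-1.9, -0.3, -0.1], [-2.1, -0.6, 0.1], [-2.0, -0.4, -0.1],
]
y = [1, 1, 1, 1, -1, -1, -1, -1]

ranking = svm_rfe(X, y, names)
print(ranking)
```

In practice the features would usually be standardised first, because the weight-based ranking is sensitive to feature scale.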

PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257343
Author(s):  
Shaoshuo Li ◽  
Baixing Chen ◽  
Hao Chen ◽  
Zhen Hua ◽  
Yang Shao ◽  
...  

Objectives Smoking is a significant independent risk factor for postmenopausal osteoporosis, leading to genome variations in postmenopausal smokers. This study investigates potential biomarkers and molecular mechanisms of smoking-related postmenopausal osteoporosis (SRPO). Materials and methods The GSE13850 microarray dataset was downloaded from Gene Expression Omnibus (GEO). Gene modules associated with SRPO were identified using weighted gene co-expression network analysis (WGCNA), protein-protein interaction (PPI) analysis, and pathway and functional enrichment analyses. Feature genes were selected using two machine learning methods: support vector machine-recursive feature elimination (SVM-RFE) and random forest (RF). The diagnostic efficiency of the selected genes was assessed by gene expression analysis and receiver operating characteristic curve. Results Eight highly conserved modules were detected in the WGCNA network, and the genes in the module that was strongly correlated with SRPO were used for constructing the PPI network. A total of 113 hub genes were identified in the core network using topological network analysis. Enrichment analysis results showed that hub genes were closely associated with the regulation of RNA transcription and translation, ATPase activity, and immune-related signaling. Six genes (HNRNPC, PFDN2, PSMC5, RPS16, TCEB2, and UBE2V2) were selected as genetic biomarkers for SRPO by integrating the feature selection of SVM-RFE and RF. Conclusion The present study identified potential genetic biomarkers and provided a novel insight into the underlying molecular mechanism of SRPO.


2021 ◽  
Author(s):  
Jian Lin ◽  
Yuanhua Lu ◽  
Bizhou Wang ◽  
Ping Jiao ◽  
Jie Ma

Abstract Background Type 1 diabetes mellitus (T1DM) is a chronic autoimmune disease caused by severe loss of pancreatic β cells. Immune cells are key mediators of β cell destruction. This study investigated the role of immune cells and immune-related genes in the occurrence and development of T1DM. Methods The raw gene expression profiles of samples from 12 T1DM patients and 10 normal controls were obtained from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were identified with the limma package in R. The least absolute shrinkage and selection operator (LASSO) and support vector machine (SVM) methods were used to screen the hub genes. The CIBERSORT algorithm was used to identify immune cells whose distribution differed between T1DM and normal samples. Correlations between the hub genes and immune cells were analyzed with the Spearman method, and gene-GO-BP and gene-pathway interaction networks were constructed with the Cytoscape plug-in ClueGO. Receiver operating characteristic (ROC) curves were used to assess the diagnostic value of the genes in T1DM. Results Fifty immune-related DEGs were obtained between the T1DM and normal samples and were further screened to obtain 5 hub genes. CIBERSORT analysis revealed that the distribution of plasma cells, resting mast cells, resting NK cells and neutrophils differed significantly between T1DM and normal samples. Natural cytotoxicity triggering receptor 3 (NCR3) was significantly related to activated NK cells, M0 macrophages, monocytes, resting NK cells, and resting memory CD4+ T cells. Moreover, tumor necrosis factor (TNF) was significantly associated with naive B cells and naive CD4+ T cells. NCR3 [area under the curve (AUC) = 0.918] showed higher accuracy than TNF (AUC = 0.763) in the diagnosis of T1DM. Conclusions The immune-related genes NCR3 and TNF and immune cells (NK cells) may play a vital regulatory role in the occurrence and development of T1DM, which may provide new ideas and potential targets for the immunotherapy of diabetes mellitus (DM).
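
The AUC values quoted above summarise how well a single gene's expression separates patients from controls. As a hedged illustration only, the sketch below computes AUC as the fraction of case/control pairs in which the case has the higher score, with ties counted as half; this is one standard way to obtain the area under the ROC curve. The function name and the expression values are hypothetical and are not data from the study.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the share of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical expression values for one gene.
cases = [5.1, 6.3, 4.8, 7.0]
controls = [4.0, 5.0, 4.8, 3.9]

auc = roc_auc(cases, controls)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is the scale on which the NCR3 and TNF values are reported.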


2021 ◽  
Author(s):  
Teng-di Fan ◽  
Di-kai Bei ◽  
Song-wei Li

Abstract Objective: To design a weighted co-expression network and build gene expression signature-based nomogram (GESBN) models for predicting the likelihood of bone metastasis in breast cancer (BC) patients. Methods: Dataset GSE124647 was used as the training set, and GSE14020 as the validation set. In the training cohort, the limma package in R was used to obtain differentially expressed genes (DEGs) between BC patients without and with bone metastasis, and these DEGs were used for functional enrichment analysis. After weighted gene co-expression network analysis (WGCNA), univariate Cox regression and Kaplan-Meier plotter analyses were performed to screen potential prognosis-related genes. Then, GESBN models were constructed and evaluated. Further, the expression levels of the genes in the models were explored in the training set and validated in GSE14020. Finally, the prognostic value of the hub genes in BC was explored. Results: A total of 1858 DEGs were obtained. WGCNA showed that the blue module was most significantly related to bone metastasis and prognosis. After survival analyses, GJA1, SLC24A3, ITGBL1, and SLC44A1 were used to construct a GESBN model for overall survival, while GJA1, IGFBP6, MDFI, TGFBI, ANXA2, and SLC24A3 were used to build a GESBN model for progression-free survival. Kaplan-Meier plotter and receiver operating characteristic analyses indicated that the models had reliable predictive ability. In addition, GJA1, IGFBP6, ITGBL1, SLC44A1, and TGFBI expression differed significantly between the two groups in both GSE124647 and GSE14020. The hub genes had a significant impact on patient prognosis. Conclusion: Both the four-gene signature and the six-gene signature could accurately predict patient prognosis, which may provide novel treatment insights for BC bone metastasis.


2019 ◽  
Author(s):  
Jiaqi Zhang ◽  
Xue Wang ◽  
Lin Xu ◽  
Zedan Zhang ◽  
Fengyun Wang ◽  
...  

Abstract Objectives: To reveal the molecular mechanisms of ulcerative colitis (UC) and provide potential biomarkers for UC gene therapy. Methods: We downloaded the GSE87473 microarray dataset from the Gene Expression Omnibus (GEO) and identified the differentially expressed genes (DEGs) between UC samples and normal samples. Then, a module partition analysis was performed based on weighted gene co-expression network analysis (WGCNA), followed by pathway and functional enrichment analyses. Furthermore, we investigated the hub genes. Finally, data validation was performed to ensure the reliability of the hub genes. Results: Between the UC group and the normal group, 988 DEGs were identified. The DEGs were clustered into 5 modules using WGCNA. These DEGs were mainly enriched in functions such as the immune response, the inflammatory response and chemotaxis, and in KEGG pathways such as cytokine-cytokine receptor interaction, the chemokine signaling pathway, and complement and coagulation cascades. The hub genes, including dual oxidase maturation factor 2 (DUOXA2), serum amyloid A (SAA) 1 and SAA2, TNFAIP3-interacting protein 3 (TNIP3), C-X-C motif chemokine (CXCL1), solute carrier family 6 member 14 (SLC6A14) and complement decay-accelerating factor (CD antigen CD55), were revealed as potential tissue biomarkers for UC diagnosis or treatment. Conclusions: This study provides supportive evidence that DUOXA2, A-SAA, TNIP3, CXCL1, SLC6A14 and CD55 might be used as potential biomarkers for tissue biopsy of UC, especially SLC6A14 and CD55, which may be new targets for UC gene therapy. Moreover, the DUOX2/DUOXA2, NF-κB/TNIP3 and CXCL1/CXCR2 pathways might play an important role in the progression of UC through the chemokine signaling pathway and inflammatory response.
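
The WGCNA module partition mentioned above starts from a weighted adjacency between genes. For an unsigned network this is the absolute Pearson correlation raised to a soft-thresholding power, a_ij = |cor(x_i, x_j)|^β. The sketch below illustrates only that first step, under stated assumptions: the gene names and expression profiles are hypothetical, β = 6 is an illustrative choice rather than a value reported in the abstract, and the later steps (topological overlap, clustering into modules) are omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: |correlation| ** beta for every gene pair."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical profiles across four samples.
expr = {
    "gene_x": [1.0, 2.0, 3.0, 4.0],
    "gene_y": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with gene_x
    "gene_z": [1.0, 3.0, 2.0, 4.0],  # correlation 0.8 with gene_x
}

adjacency = soft_threshold_adjacency(expr)
print(adjacency[("gene_x", "gene_z")])
```

Raising the correlation to a power keeps strong co-expression links and pushes weak ones towards zero, so a correlation of 0.8 becomes a much smaller edge weight.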


2020 ◽  
Author(s):  
Xi Pan ◽  
Jian-Hao Liu

Abstract Background Nasopharyngeal carcinoma (NPC) is a heterogeneous carcinoma whose underlying molecular mechanisms of tumor initiation, progression, and migration remain largely unclear. The purpose of the present study was to identify key biomarkers and small-molecule drugs for NPC screening, diagnosis, and therapy via gene expression profile analysis. Methods Raw microarray data of NPC were retrieved from the Gene Expression Omnibus (GEO) database and analyzed to screen for potential differentially expressed genes (DEGs). The key modules associated with histology grade and tumor stage were identified using weighted correlation network analysis (WGCNA). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of genes in the key module were performed to identify potential mechanisms. Candidate hub genes were selected based on the criteria of module membership (MM) and high connectivity. We then used receiver operating characteristic (ROC) curves to evaluate the diagnostic value of the hub genes. The Connectivity Map database was further used to screen for small-molecule drugs targeting the hub genes. Results A total of 430 DEGs were identified based on two GEO datasets. WGCNA identified the green gene module as the key module for the tumor stage of NPC. Functional enrichment analysis revealed that genes in the green module were enriched in regulation of the cell cycle, the p53 signaling pathway, and cell part morphogenesis. Furthermore, four DEG-related hub genes in the green module were taken as the final hub genes. ROC analysis revealed that the final four hub genes had high areas under the curve, suggesting that these hub genes may be diagnostic biomarkers for NPC. Meanwhile, we identified several small-molecule drugs as potential therapeutic candidates for NPC. Conclusions Our research identified four potential prognostic biomarkers and several candidate small-molecule drugs for NPC, which may contribute new insights for NPC therapy.


2020 ◽  
Vol 25 (1) ◽  
Author(s):  
Xue Jiang ◽  
Zhijie Xu ◽  
Yuanyuan Du ◽  
Hongyu Chen

Abstract Background Immunoglobulin A nephropathy (IgAN) is the most common primary glomerulopathy worldwide. However, the molecular events underlying IgAN remain to be fully elucidated. This study aimed to identify novel biomarkers of IgAN through bioinformatics analysis and elucidate the possible molecular mechanism. Methods Based on the microarray datasets GSE93798 and GSE37460 downloaded from the Gene Expression Omnibus database, the differentially expressed genes (DEGs) between IgAN samples and normal controls were identified. Using the DEGs, we further performed a series of functional enrichment analyses. Protein–protein interaction (PPI) networks of the DEGs were constructed using the STRING online search tool and were visualized using Cytoscape. Next, hub genes and the most important module among the DEGs were identified, and the Biological Networks Gene Ontology tool (BiNGO) was used to elucidate the molecular mechanism of IgAN. Results In total, 148 DEGs were identified, comprising 53 upregulated genes and 95 downregulated genes. Gene Ontology (GO) analysis indicated that the DEGs for IgAN were mainly enriched in extracellular exosome, extracellular region and extracellular space, fibroblast growth factor stimulus, inflammatory response, and innate immunity. Module analysis showed that genes in the most significant module of the PPI network were mainly associated with innate immune response, integrin-mediated signaling pathway and inflammatory response. The top 10 hub genes identified in the PPI network could clearly distinguish the IgAN and control groups in monocyte and tissue samples. We finally identified the integrin subunit beta 2 (ITGB2) and Fc fragment of IgE receptor Ig (FCER1G) genes, which may play important roles in the development of IgAN. Conclusions We identified key genes along with the pathways that were most closely related to IgAN initiation and progression. Our results provide a more detailed molecular mechanism for the development of IgAN and novel candidate gene targets of IgAN.


2020 ◽  
Vol 12 (18) ◽  
pp. 7365
Author(s):  
Taejung Park ◽  
Chayoung Kim

The current study seeks to identify variables that affect the career decision-making of high school graduates with respect to the choice of university (re-)entrance in South Korea, where education has great importance as a tool for self-cultivation and social prestige. For pattern recognition, we applied a support vector machine with recursive feature elimination (SVM-RFE) to large-scale survey data on Korean college candidates. Based on the SVM-RFE analysis results, new enrollers were mostly affected by the mesosystems of interactions with parents, while re-enrollers were affected by the macrosystems of social awareness as well as by individual-system estimates of their own talent and aptitude. By predicting the variables that affect high school graduates' preparation for university re-entrance, some survey questions provide information on why they make their university choice based on interactions with their parents or acquaintances. Along with these empirical results, implications for future research are also presented.


Author(s):  
Chengzhang Li ◽  
Jiucheng Xu

Background: Hepatocellular carcinoma (HCC) is a major threat to public health. However, few effective therapeutic strategies exist. We aimed to identify potential therapeutic target genes of HCC by analyzing three gene expression profiles. Methods: The gene expression profiles were analyzed with GEO2R, an interactive web tool for gene differential expression analysis, to identify common differentially expressed genes (DEGs). Functional enrichment analyses were then conducted, followed by construction of a protein-protein interaction (PPI) network with the common DEGs. The PPI network was employed to identify hub genes, and the expression level of the hub genes was validated by mining the Oncomine database. Survival analysis was carried out to assess the prognostic value of the hub genes in HCC patients. Results: A total of 51 common up-regulated DEGs and 201 down-regulated DEGs were obtained after gene differential expression analysis of the profiles. Functional enrichment analyses indicated that these common DEGs are linked to a series of cancer events. We finally identified 10 hub genes, six of which (OIP5, ASPM, NUSAP1, UBE2C, CCNA2, and KIF20A) are reported as novel HCC hub genes. Mining the Oncomine database validated that the hub genes have significantly higher expression in HCC samples than in normal samples (t-test, p < 0.05). Survival analysis indicated that overexpression of the hub genes is associated with a significant reduction (p < 0.05) in survival time in HCC patients. Conclusions: We identified six novel HCC hub genes that might be therapeutic targets for the development of drugs for some HCC patients.


Author(s):  
JUANA CANUL-REICH ◽  
LAWRENCE O. HALL ◽  
DMITRY B. GOLDGOF ◽  
JOHN N. KORECKI ◽  
STEVEN ESCHRICH

Gene-expression microarray datasets often consist of a limited number of samples with a large number of gene-expression measurements, usually on the order of thousands. Therefore, dimensionality reduction is critical prior to any classification task. In this work, the iterative feature perturbation method (IFP), an embedded gene selector, is introduced and applied to four microarray cancer datasets: colon cancer, leukemia, Moffitt colon cancer, and lung cancer. We compare results obtained by IFP to those of support vector machine-recursive feature elimination (SVM-RFE) and the t-test as a feature filter, using a linear support vector machine as the base classifier. The intersection of the gene sets selected by the three methods across the four datasets was analyzed. Additional experiments included an initial pre-selection of the top 200 genes based on their p values. IFP and SVM-RFE were then applied to the reduced feature sets. These results showed up to 3.32% average performance improvement for IFP across the four datasets. A statistical analysis (using the Friedman/Holm test) for both scenarios showed that the highest accuracies came from the t-test as a filter in experiments without gene pre-selection. IFP and SVM-RFE had greater classification accuracy after gene pre-selection. Analysis showed that the t-test is a good gene selector for microarray data. IFP and SVM-RFE showed performance improvement on a dataset reduced by the t-test. The IFP approach resulted in comparable or superior average class accuracy when compared to SVM-RFE on three of the four datasets. The same or similar accuracies can be obtained with different sets of genes.
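
The t-test filter used above as a pre-selection step can be sketched as follows. This is an illustration under assumptions, not the authors' code: it uses the two-sample Student t statistic with pooled variance and ranks genes by its absolute value, which gives the same ordering as the two-sided p value when every gene is measured on the same samples. Gene names and values are hypothetical, and each group is assumed to have non-zero variance.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample Student t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def top_genes_by_t(expr, labels, k):
    """Return the k genes with the largest absolute t statistic between classes."""
    scores = {}
    for gene, values in expr.items():
        a = [v for v, lab in zip(values, labels) if lab == 1]
        b = [v for v, lab in zip(values, labels) if lab == 0]
        scores[gene] = abs(t_statistic(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data: three tumour samples (1) and three normal samples (0).
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "g1": [5.0, 5.2, 4.8, 1.0, 1.2, 0.8],  # strongly different between classes
    "g2": [3.0, 3.5, 2.5, 2.0, 2.5, 1.5],  # moderately different
    "g3": [1.0, 2.0, 3.0, 1.1, 2.1, 2.9],  # essentially unchanged
}

selected = top_genes_by_t(expr, labels, k=2)
print(selected)
```

A wrapper such as SVM-RFE or IFP would then be run only on the genes returned by the filter, which is the two-stage setup the abstract describes with the top 200 genes.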


2021 ◽  
Author(s):  
Pejman Morovat ◽  
Saman Morovat ◽  
Arash M. Ashrafi ◽  
Shahram Teimourian

Abstract Hepatocellular carcinoma (HCC) is one of the most prevalent cancers worldwide, with a high mortality rate, poor treatment outcomes, and a molecular basis that is still unknown. Gene expression appears to play a pivotal role in the pathogenesis of the disease. Circular RNAs (circRNAs) can interact with microRNAs (miRNAs) to regulate gene expression in various malignancies by acting as competing endogenous RNAs (ceRNAs). However, the potential pathogenic roles of the circRNA/miRNA/mRNA ceRNA network in HCC are unclear. In this study, first, HCC circRNA expression data were obtained from three Gene Expression Omnibus microarray datasets (GSE164803, GSE94508, GSE97332), and the differentially expressed circRNAs (DECs) were identified using the R limma package. Also, the liver hepatocellular carcinoma (LIHC) miRNA and mRNA sequence data were retrieved from TCGA, and differentially expressed miRNAs (DEMIs) and mRNAs (DEGs) were determined using the R DESeq2 package. Second, the CSCD website was used to uncover the binding sites of miRNAs on DECs. The DECs' potential target miRNAs were revealed by intersecting the miRNAs predicted by CSCD with the downregulated DEMIs. Third, related genes were uncovered by intersecting the target genes predicted by the miRWalk and TargetScan online tools with the upregulated DEGs. The ceRNA network was then built using the Cytoscape software. The functional enrichment and the overall survival time of these potential target genes were analyzed, and a PPI network was constructed in the STRING database. Network visualization was performed with Cytoscape, and ten hub genes were detected using the CytoHubba plugin. Four DECs (hsa_circ_0000520, hsa_circ_0008616, hsa_circ_0070934, hsa_circ_0004315) were obtained, and six miRNAs (hsa-miR-542-5p, hsa-miR-326, hsa-miR-511-5p, hsa-miR-195-5p, hsa-miR-214-3p, and hsa-miR-424-5p) regulated by these DECs were identified. Then, 543 overlapping genes regulated by the six miRNAs were predicted. Functional enrichment analysis showed that these genes are mostly associated with cancer regulation functions. Ten hub genes (TTK, AURKB, KIF20A, KIF23, CEP55, CDC6, DTL, NCAPG, CENPF, PLK4) were screened from the PPI network of the 204 survival-related genes. KIF20A, NCAPG, TTK, PLK4, and CDC6 were selected as having the most significant p-values. In the end, a circRNA-miRNA-mRNA regulatory axis was established for the five final selected hub genes. This study suggests a potential pathogenic role for the obtained network and proposes that two DECs (hsa_circ_0070934 and hsa_circ_0004315) may be important prognostic factors for HCC.

