Analysis of potential genetic biomarkers and molecular mechanism of smoking-related postmenopausal osteoporosis using weighted gene co-expression network analysis and machine learning

PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257343
Author(s):  
Shaoshuo Li ◽  
Baixing Chen ◽  
Hao Chen ◽  
Zhen Hua ◽  
Yang Shao ◽  
...  

Objectives Smoking is a significant independent risk factor for postmenopausal osteoporosis, leading to genome variations in postmenopausal smokers. This study investigates potential biomarkers and molecular mechanisms of smoking-related postmenopausal osteoporosis (SRPO). Materials and methods The GSE13850 microarray dataset was downloaded from Gene Expression Omnibus (GEO). Gene modules associated with SRPO were identified using weighted gene co-expression network analysis (WGCNA), protein-protein interaction (PPI) analysis, and pathway and functional enrichment analyses. Feature genes were selected using two machine learning methods: support vector machine-recursive feature elimination (SVM-RFE) and random forest (RF). The diagnostic efficiency of the selected genes was assessed by gene expression analysis and receiver operating characteristic curve. Results Eight highly conserved modules were detected in the WGCNA network, and the genes in the module that was strongly correlated with SRPO were used for constructing the PPI network. A total of 113 hub genes were identified in the core network using topological network analysis. Enrichment analysis results showed that hub genes were closely associated with the regulation of RNA transcription and translation, ATPase activity, and immune-related signaling. Six genes (HNRNPC, PFDN2, PSMC5, RPS16, TCEB2, and UBE2V2) were selected as genetic biomarkers for SRPO by integrating the feature selection of SVM-RFE and RF. Conclusion The present study identified potential genetic biomarkers and provided a novel insight into the underlying molecular mechanism of SRPO.
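The integration step mentioned in the abstract, where genes chosen by both SVM-RFE and RF are retained, can be sketched as a set intersection. This is a minimal illustration under assumptions, not the authors' pipeline: the function name `consensus_genes` and the placeholder identifiers are hypothetical, and the abstract does not state how the two selections were combined beyond "integrating" them.

```python
def consensus_genes(svm_rfe_selected, rf_selected):
    """Return genes picked by both methods, keeping the SVM-RFE order."""
    rf_set = set(rf_selected)
    return [gene for gene in svm_rfe_selected if gene in rf_set]


# Placeholder identifiers (hypothetical, not genes from the study).
svm_rfe_selected = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
rf_selected = ["GENE_D", "GENE_B", "GENE_E"]

shared = consensus_genes(svm_rfe_selected, rf_selected)
print(shared)
```

Taking the intersection is the most conservative way to combine two selectors; a union or a rank-based vote would be alternative choices.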

2020 ◽  
Author(s):  
Xi Pan ◽  
Jian-Hao Liu

Abstract Background Nasopharyngeal carcinoma (NPC) is a heterogeneous carcinoma whose underlying molecular mechanisms of tumor initiation, progression, and migration remain largely unclear. The purpose of the present study was to identify key biomarkers and small-molecule drugs for NPC screening, diagnosis, and therapy via gene expression profile analysis. Methods Raw microarray data of NPC were retrieved from the Gene Expression Omnibus (GEO) database and analyzed to screen for potential differentially expressed genes (DEGs). The key modules associated with histology grade and tumor stage were identified using weighted correlation network analysis (WGCNA). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of genes in the key module were performed to identify potential mechanisms. Candidate hub genes were obtained based on the criteria of module membership (MM) and high connectivity. Receiver operating characteristic (ROC) curves were then used to evaluate the diagnostic value of the hub genes. The Connectivity Map database was further used to screen for small-molecule drugs related to the hub genes. Results A total of 430 DEGs were identified based on two GEO datasets. The green gene module was considered the key module for the tumor stage of NPC by WGCNA. Functional enrichment analysis revealed that genes in the green module were enriched in regulation of the cell cycle, the p53 signaling pathway, and cell part morphogenesis. Furthermore, four DEG-related hub genes in the green module were considered the final hub genes. ROC analysis revealed that the four final hub genes had high areas under the curve, suggesting that they may be diagnostic biomarkers for NPC. Meanwhile, we screened out several small-molecule drugs that may provide potential therapeutic options for NPC. Conclusions Our research identified four potential prognostic biomarkers and several candidate small-molecule drugs for NPC, which may contribute new insights for NPC therapy.
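The area under the ROC curve used above to judge diagnostic value can be computed directly as the probability that a randomly chosen case scores higher than a randomly chosen control. The sketch below assumes binary labels (1 = case, 0 = control) and a single gene's expression as the score; the function name `roc_auc` and the toy values are hypothetical and unrelated to the study's data.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy expression values for one gene (hypothetical).
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(auc)  # 0.75
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation of cases from controls.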


2020 ◽  
Author(s):  
Junhong Li ◽  
Yang Zhai ◽  
Peng Wu ◽  
Yueqiang Hu ◽  
Wei Chen ◽  
...  

Abstract BACKGROUND: Microarray-based gene expression profiling is widely used in biomedical research. Weighted gene co-expression network analysis (WGCNA) links microarray data directly to clinical traits and identifies rules for predicting pathological stage and prognosis of disease. WGCNA is useful in understanding many biological processes. Stroke is a common disease worldwide; however, the molecular mechanisms of its pathogenesis are largely unknown. The aim of this study was to construct gene co-expression networks for identification of key modules and hub genes associated with stroke pathogenesis. METHODS: Gene microarray expression profiles of stroke samples were retrieved from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were screened with the limma package in R. WGCNA was used to construct scale-free gene co-expression networks to explore the associations between gene sets and clinical features, and to identify key modules and hub genes. Subsequently, functional enrichment analyses were performed. Further, receiver operating characteristic (ROC) curve analysis was carried out to validate expression of hub genes, and literature validation was performed as well. RESULTS: A total of 11,747 most variant genes were used for co-expression network construction. Pink and yellow modules were significantly correlated with stroke pathogenesis. Functional enrichment analysis showed that the pink module was mainly involved in regulation of neuron regeneration and repair of DNA damage. The yellow module, on the other hand, was mainly enriched in ion transport system dysfunction, which was correlated with neuron death. A total of eight hub genes (PRR11, NEDD9, Notch2, RUNX1-IT1, ANP32A-IT1, ASTN2, SAMHD1 and STIM1) were identified and validated at the transcriptional level and through existing literature. CONCLUSION: The eight hub genes (PRR11, NEDD9, Notch2, RUNX1-IT1, ANP32A-IT1, ASTN2, SAMHD1 and STIM1) identified in the study are potential biomarkers and therapeutic targets for effective diagnosis and treatment of stroke.
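The core of a weighted co-expression network is a soft-thresholded adjacency: the absolute Pearson correlation between two genes raised to a power beta, which suppresses weak correlations and pushes the network toward a scale-free topology. The sketch below shows the unsigned form; the toy expression values, the gene names, and the choice beta = 6 are illustrative assumptions, since the abstract does not report the power used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def soft_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            adj[gi][gj] = a
            adj[gj][gi] = a
    return adj


def connectivity(adj):
    """Connectivity of a gene = sum of its adjacencies to all other genes."""
    return {g: sum(neighbors.values()) for g, neighbors in adj.items()}


# Toy expression matrix: gene -> values across four samples (hypothetical).
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
adj = soft_adjacency(expr, beta=6)
k = connectivity(adj)
```

In the unsigned form a strong negative correlation (g1 versus g3 here) counts as a strong connection; a signed network would treat it differently.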


2020 ◽  
Author(s):  
Jinbao Yin ◽  
Chen Lin ◽  
Meng Jiang ◽  
Xinbing Tang ◽  
Danlin Xie ◽  
...  

Abstract Background Breast cancer is a highly prevalent tumor disease worldwide, and further elucidation of the molecular mechanisms of its occurrence, development and prognosis remains an urgent need. Identifying hub genes involved in its pathogenesis and progression can potentially help to unveil its mechanism and provide novel diagnostic and prognostic markers for breast cancer. Methods In this study, we systematically integrated multiple bioinformatic methods, including robust rank aggregation (RRA), functional enrichment analysis, protein-protein interaction (PPI) network construction and analysis, weighted gene co-expression network analysis (WGCNA), ROC and Kaplan-Meier analyses, DNA methylation analyses and genomic mutation analyses, GSEA and GSVA, based on ten mRNA datasets to identify and investigate novel hub genes involved in breast cancer. In parallel, RNA in situ detection technology was applied to validate those novel hub genes. Results EZH2 was recognized as a key gene by PPI network analysis. CENPL, ISG20L2, LSM4 and MRPL3 were identified as four novel hub genes through WGCNA and literature search. Among those five hub genes, many studies on EZH2 in breast cancer have been reported, but no studies have addressed the roles of CENPL, ISG20L2, MRPL3 and LSM4 in breast cancer. These four novel hub genes were up-regulated in breast cancer tissues and associated with tumor progression. ROC and Kaplan-Meier analyses indicated that all four hub genes showed good diagnostic performance and prognostic value for breast cancer. The preliminary analysis revealed that these four novel hub genes are potential candidates for further exploring the molecular mechanism of breast cancer. Conclusion We identified four novel hub genes (CENPL, ISG20L2, MRPL3, and LSM4) that likely play key roles in the molecular mechanism of occurrence and development of breast cancer. These hub genes are promising candidate diagnostic biomarkers and prognosis predictors for breast cancer, and their exact functional mechanisms in breast cancer deserve further in-depth study.
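The Kaplan-Meier analysis used above to assess prognostic value estimates a survival curve as a running product: at each time with at least one event, the survival probability is multiplied by one minus the fraction of at-risk patients who had the event. The sketch below is a minimal estimator under assumptions; the function name `kaplan_meier` and the toy follow-up data are hypothetical, and grouping patients by high versus low hub-gene expression (not shown) would happen before calling it.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an event.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data (hypothetical): the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Censored patients leave the at-risk set without lowering the curve, which is what distinguishes this estimator from a plain fraction of survivors.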


2021 ◽  
Vol 7 ◽  
Author(s):  
Tao Yan ◽  
Shijie Zhu ◽  
Miao Zhu ◽  
Chunsheng Wang ◽  
Changfa Guo

Background: Atrial fibrillation (AF) is the most common tachyarrhythmia in the clinic, leading to high morbidity and mortality. Although many studies on AF have been conducted, the molecular mechanism of AF has not been fully elucidated. This study was designed to explore the molecular mechanism of AF using integrative bioinformatics analysis and provide new insights into the pathophysiology of AF. Methods: The GSE115574 dataset was downloaded, and CIBERSORT was applied to estimate the relative proportions of 22 kinds of immune cells. Differentially expressed genes (DEGs) were identified with the limma package in R. Weighted gene correlation network analysis (WGCNA) was performed to cluster DEGs into different modules and explore relationships between modules and immune cell types. Functional enrichment analysis was performed on DEGs in the significant module, and hub genes were identified based on the protein-protein interaction (PPI) network. Hub genes were then verified using quantitative real-time polymerase chain reaction (qRT-PCR). Results: A total of 2,350 DEGs were identified and clustered into eleven modules using WGCNA. The magenta module, with 246 genes, was identified as the key module, having the highest correlation coefficient with M1 macrophages. Three hub genes (CTSS, CSF2RB, and NCF2) were identified. Verification using three other datasets and qRT-PCR demonstrated that the expression levels of these three genes were significantly higher in patients with AF than in patients with sinus rhythm (SR), which was consistent with the bioinformatic analysis. Conclusion: Three novel genes identified using comprehensive bioinformatics analysis may play crucial roles in the pathophysiological mechanism of AF, providing potential therapeutic targets and new insights into the treatment and early detection of AF.
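Choosing the key module by its correlation with an immune cell type, as described above, amounts to correlating each module's summary profile with the per-sample cell fraction and taking the strongest. The sketch below assumes the module summary profiles (module eigengenes in WGCNA) have already been computed; the function name `best_module`, the module labels, and all values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def best_module(module_profiles, trait):
    """Return the module whose profile has the largest |r| with the trait."""
    corrs = {name: pearson(profile, trait)
             for name, profile in module_profiles.items()}
    top = max(corrs, key=lambda name: abs(corrs[name]))
    return top, corrs


# Per-sample immune cell fraction and module profiles (all hypothetical).
cell_fraction = [0.1, 0.2, 0.3, 0.4]
module_profiles = {
    "module_1": [1.0, 2.0, 3.0, 4.0],
    "module_2": [2.0, 1.0, 2.0, 1.0],
}
top, corrs = best_module(module_profiles, cell_fraction)
print(top)
```

Ranking by the absolute value keeps strongly negative associations in play; ranking by the signed coefficient would be the alternative if only positive associations are of interest.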


2021 ◽  
Vol 18 (6) ◽  
pp. 8997-9015
Author(s):  
Ahmed Hammad ◽  
Mohamed Elshaer ◽  
Xiuwen Tang ◽  
...  

Colorectal cancer (CRC) is one of the most common malignancies worldwide. Biomarker discovery is critical to improving CRC diagnosis, and machine learning offers a new platform to study the etiology of CRC for this purpose. Therefore, the current study aimed to perform integrated bioinformatics and machine learning analyses to explore novel biomarkers for CRC prognosis. In this study, we acquired gene expression microarray data from the Gene Expression Omnibus (GEO) database. The GSE103512 microarray expression dataset was downloaded and integrated. Subsequently, differentially expressed genes (DEGs) were identified and functionally analyzed via Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). Furthermore, protein-protein interaction (PPI) network analysis was conducted using the STRING database and Cytoscape software to identify hub genes; the hub genes were then subjected to support vector machine (SVM), receiver operating characteristic (ROC) curve and survival analyses to explore their diagnostic values. Meanwhile, TCGA transcriptomics data in the Gene Expression Profiling Interactive Analysis (GEPIA) database and the pathology data presented in the Human Protein Atlas (HPA) database were used to verify our transcriptomic analyses. A total of 105 DEGs were identified in this study. Functional enrichment analysis showed that these genes were significantly enriched in biological processes related to cancer progression. Thereafter, PPI network analysis identified a total of 10 significant hub genes. The ROC curve was used to predict the potential application of biomarkers in CRC diagnosis, with the area under the ROC curve (AUC) of these genes exceeding 0.92, suggesting that this risk classifier can discriminate between CRC patients and normal controls. Moreover, the prognostic values of these hub genes were confirmed by survival analyses using different CRC patient cohorts. Our results demonstrated that these 10 differentially expressed hub genes could be used as potential biomarkers for CRC diagnosis.
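Hub genes in a PPI network are commonly picked by a topological score over the interaction graph. The sketch below ranks proteins by degree, that is, the number of distinct interaction partners. Degree is an assumption made here for illustration, since the abstract does not say which topological measure was applied in Cytoscape; the function name `top_hubs` and the edge list are hypothetical.

```python
from collections import Counter


def top_hubs(edges, k):
    """Rank nodes of an undirected interaction list by degree; return top k."""
    degree = Counter()
    seen = set()
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        pair = frozenset((a, b))
        if pair in seen:
            continue  # ignore duplicate or reversed edges
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    # Sort by degree (descending), then by name for a stable order.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy interaction list (hypothetical protein identifiers).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P2")]
hubs = top_hubs(edges, 2)
print(hubs)
```

Other centrality measures, such as betweenness or closeness, can rank the same network differently, so the measure should be reported alongside the hub list.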


2022 ◽  
Author(s):  
Si-tong Liu ◽  
You Zhang ◽  
Xin-gui Wu ◽  
Chang-xing Lu ◽  
Qi-Ping Hu

Abstract Background: Stroke is the second most common cause of death worldwide and the leading cause of long-term severe disability, with neurological impairment, particularly of motor function, worsening within hours after stroke onset. So far, there are no established and reliable biomarkers for the prognosis of stroke. Early detection of biomarkers that can predict stroke progression is of great importance for clinical intervention and prevention of clinical deterioration. Methods: The GSE119121 dataset was retrieved from the Gene Expression Omnibus (GEO) database, and weighted gene co-expression network analysis (WGCNA) was conducted to identify the key modules that could regulate disease progression. Moreover, functional enrichment analysis was conducted to study the biological functions of the key module genes. The GSE16561 dataset was further analyzed with the support vector machine-recursive feature elimination (SVM-RFE) algorithm to identify the top genes regulating disease progression. The hub genes revealed by WGCNA were associated with disease progression using receiver operating characteristic (ROC) curve analysis. Subsequently, functional enrichment of the hub genes was performed using gene set variation analysis (GSVA). The changes at the gene level were transformed into changes at the pathway level to identify the biological function of each sample. Finally, the expression levels of the hub genes in the rat middle cerebral artery occlusion (MCAO) infarction model were measured using RT-qPCR for validation. Results: WGCNA revealed four hub genes: DEGS1, HSDL2, ST8SIA4 and STK3. The GSVA results showed that the hub genes were involved in stroke progression by regulating the p53 signaling pathway, the PI3K signaling pathway, and the inflammatory response pathway. The RT-qPCR results indicated that the expression of the four hub genes was significantly increased in the rat model of MCAO. Conclusion: Several genes, such as DEGS1, HSDL2, ST8SIA4 and STK3, were identified and associated with the progression of the disease. Moreover, it was hypothesized that these genes may be involved in the progression of stroke by regulating the p53 signaling pathway, the PI3K signaling pathway, and the inflammatory response pathway, respectively. These genes have potential prognostic value and may serve as biomarkers for predicting stroke progression. The early identification of patients at risk of progression is essential to prevent clinical deterioration and provides a reference for future research.
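Recursive feature elimination, the wrapper used with the SVM above, repeatedly fits a linear model, drops the feature with the smallest squared weight, and refits on what remains. The sketch below shows only that elimination loop. To stay dependency-free it swaps in a class-mean difference as the weight vector in place of a trained linear SVM, so it is not SVM-RFE itself; the names `centroid_weights` and `rfe` and the toy matrix are hypothetical.

```python
def centroid_weights(X, y, features):
    """Stand-in for linear SVM weights: class-mean difference per feature."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    return {
        f: sum(r[f] for r in pos) / len(pos) - sum(r[f] for r in neg) / len(neg)
        for f in features
    }


def rfe(X, y, n_keep, weight_fn=centroid_weights):
    """Drop the lowest-weight feature one at a time until n_keep remain."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        weights = weight_fn(X, y, remaining)  # refit on surviving features
        worst = min(remaining, key=lambda f: weights[f] ** 2)
        remaining.remove(worst)
        eliminated.append(worst)
    return remaining, eliminated


# Toy data: rows are samples, columns are features (hypothetical values).
X = [[1.0, 5.0, 0.0], [2.0, 5.0, 1.0], [8.0, 5.0, 0.0], [9.0, 5.0, 1.0]]
y = [0, 0, 1, 1]
kept, dropped = rfe(X, y, n_keep=1)
print(kept, dropped)
```

With a real SVM the weights change after every refit because features interact; passing a different `weight_fn` is where such a model would plug in.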


Author(s):  
Qingchun Liang ◽  
Qin Zhou ◽  
Jinhe Li ◽  
Zhugui Chen ◽  
Zhihao Zhang ◽  
...  

Abstract Acute lung injury (ALI) is an inflammatory pulmonary disease that can easily develop into serious acute respiratory distress syndrome, which has high morbidity and mortality. However, the molecular mechanism of ALI remains unclear, and few molecular biomarkers for diagnosis and treatment have been identified. In this study, we aimed to identify novel molecular biomarkers using a bioinformatics approach. Gene expression data were obtained from the Gene Expression Omnibus database, co-expressed differentially expressed genes (CoDEGs) were identified using R software, and further functional enrichment analyses were conducted using the online tool Database for Annotation, Visualization, and Integrated Discovery. A protein–protein interaction network was established using the STRING database and Cytoscape software. A lipopolysaccharide (LPS)-induced ALI mouse model was constructed and verified. The hub genes were screened and validated in vivo. The transcription factors (TFs) and miRNAs associated with the hub genes were predicted using the NetworkAnalyst database. In total, 71 CoDEGs were screened and found to be mainly involved in cytokine–cytokine receptor interactions and in the tumor necrosis factor and malaria signaling pathways. Animal experiments showed that the lung injury score, bronchoalveolar lavage fluid protein concentration, and wet-to-dry weight ratio were higher in the LPS group than in the control group. Real-time polymerase chain reaction analysis indicated that most of the hub genes, such as colony-stimulating factor 2 (Csf2), were overexpressed in the LPS group. A total of 20 TFs, including nuclear respiratory factor 1 (NRF1), and two miRNAs were predicted to be regulators of the hub genes. In summary, Csf2 may serve as a novel diagnostic and therapeutic target for ALI. NRF1 and mmu-mir-122-5p may be key regulators in the development of ALI.


Biomolecules ◽  
2020 ◽  
Vol 10 (9) ◽  
pp. 1207
Author(s):  
Dongmei Ai ◽  
Yuduo Wang ◽  
Xiaoxin Li ◽  
Hongfei Pan

An effective feature extraction method is key to improving the accuracy of a prediction model. From the Gene Expression Omnibus (GEO) database, we obtained microarray gene expression data covering 13,487 genes for 238 colorectal cancer (CRC) and normal samples. Twelve gene modules were obtained by weighted gene co-expression network analysis (WGCNA) on 173 samples. By calculating the Pearson correlation coefficient (PCC) between the characteristic genes of each module and colorectal cancer, we obtained a key module that was highly correlated with CRC. We screened hub genes from the key module by considering module membership, gene significance, and intramodular connectivity. We selected 10 hub genes as one type of feature for the classifier. We applied a variational autoencoder (VAE) to 1159 genes with significantly different expression and mapped the data into a 10-dimensional representation, as another type of feature for the cancer classifier. The two types of features were applied to a support vector machine (SVM) classifier for CRC. The accuracy was 0.9692 with an AUC of 0.9981. The result shows the high accuracy of the two-step feature extraction method, which combines hub genes obtained by WGCNA with a 10-dimensional representation obtained by the VAE.
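Screening hub genes by module membership, gene significance, and intramodular connectivity, as described above, can be sketched as a threshold filter followed by a ranking. The cutoffs (0.8 and 0.2), the function name `screen_hubs`, and the per-gene statistics below are hypothetical; the abstract does not report the thresholds that were used.

```python
def screen_hubs(stats, mm_cutoff=0.8, gs_cutoff=0.2, top_k=10):
    """Keep genes passing both cutoffs, ranked by intramodular connectivity.

    stats maps gene -> {"mm": module membership,
                        "gs": gene significance,
                        "k_within": intramodular connectivity}.
    """
    passing = [
        gene for gene, s in stats.items()
        if abs(s["mm"]) >= mm_cutoff and abs(s["gs"]) >= gs_cutoff
    ]
    passing.sort(key=lambda gene: (-stats[gene]["k_within"], gene))
    return passing[:top_k]


# Toy per-gene statistics for one module (hypothetical values).
stats = {
    "GENE_A": {"mm": 0.90, "gs": 0.5, "k_within": 12.0},
    "GENE_B": {"mm": 0.95, "gs": 0.1, "k_within": 20.0},
    "GENE_C": {"mm": 0.85, "gs": 0.3, "k_within": 15.0},
    "GENE_D": {"mm": 0.50, "gs": 0.6, "k_within": 30.0},
}
hub_genes = screen_hubs(stats)
print(hub_genes)
```

The expression values of the selected hub genes would then form one feature block, to be concatenated with the 10-dimensional latent representation before training the classifier.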


2019 ◽  
Vol 39 (7) ◽  
Author(s):  
Yadong Wu ◽  
Feng Liu ◽  
Siyang Luo ◽  
Xinhai Yin ◽  
Dengqi He ◽  
...  

Abstract Breast cancer (BC) is the leading cause of cancer-related death in women worldwide. Gene expression profiling of human BCs has been studied previously. However, co-expression analysis of BC cell lines is still lacking. The aim of the study was to identify key pathways and hub genes that may serve as biomarkers for BC and to uncover potential molecular mechanisms using weighted correlation network analysis. We analyzed microarray data of BC cell lines (GSE48213) listed in the Gene Expression Omnibus database. Gene co-expression networks were constructed with the weighted correlation network analysis algorithm and used to explore the biological functions of hub modules. Meanwhile, Gene Ontology and KEGG pathway analyses were performed using the Cytoscape plug-in ClueGO. The network of the key module was also constructed using Cytoscape. A total of 5000 genes were selected, and 28 modules of co-expressed genes were identified from the gene co-expression network, one of which was found to be significantly associated with a subtype of BC lines. Functional enrichment analysis revealed that the brown module was mainly involved in the autophagy, spliceosome, and mitophagy pathways, the black module was mainly enriched in the colorectal cancer and pancreatic cancer pathways, and genes in the midnightblue module played critical roles in the ribosome and regulation of lipolysis in adipocytes pathways. Three hub genes, CBR3, SF3B6, and RHPN1, may play an important role in the development and malignancy of the disease. The findings of the present study could improve our understanding of the molecular pathogenesis of breast cancer.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Min Li ◽  
Wenye Zhu ◽  
Chu Wang ◽  
Yuanyuan Zheng ◽  
Shibo Sun ◽  
...  

Abstract Background Asthma is a heterogeneous disease that can be divided into four inflammatory phenotypes: eosinophilic asthma (EA), neutrophilic asthma (NA), mixed granulocytic asthma (MGA), and paucigranulocytic asthma (PGA). While research has mainly focused on EA and NA, the understanding of PGA is limited. In this study, we aimed to identify underlying mechanisms and hub genes of PGA. Methods Based on a dataset from the Gene Expression Omnibus (GEO), weighted gene coexpression network analysis (WGCNA), differentially expressed gene (DEG) analysis and protein–protein interaction (PPI) network analysis were conducted to construct a gene network and to identify key gene modules and hub genes. Functional enrichment analyses were performed to investigate the biological processes, pathways and immune status of PGA. The hub genes were validated in a separate dataset. Results Compared to non-PGA, PGA had a different gene expression pattern, in which 449 genes were differentially expressed. One gene module significantly associated with PGA was identified. Genes in the intersection between the DEGs and the module most relevant to PGA were mainly enriched in inflammation and immune response regulation. Single-sample Gene Set Enrichment Analysis (ssGSEA) suggested decreased immune infiltration and function in PGA. Finally, six hub genes of PGA were identified, including ADCY2, CXCL1, FPRL1, GPR109B, GPR109A and ADCY3, which were validated in a separate dataset, GSE137268. Conclusions Our study characterized distinct gene expression patterns, biological processes and immune status of PGA and identified hub genes, which may improve the understanding of the underlying mechanism and provide potential therapeutic targets for PGA.

