Integrated Bioinformatics and Machine Learning Algorithms Analyses Highlight Related Pathways and Genes Associated with Alzheimer's Disease

2021 ◽  
Vol 17 ◽  
Author(s):  
Hui Zhang ◽  
Qidong Liu ◽  
Xiaoru Sun ◽  
Yaru Xu ◽  
Yiling Fang ◽  
...  

Background: The pathophysiology of Alzheimer's disease (AD) is still not fully understood. Objective: This study aimed to explore the differentially expressed key genes in AD and build a predictive model for diagnosis and treatment. Methods: Gene expression data from the entorhinal cortex of AD, asymptomatic AD, and control samples in the GEO database were analyzed to explore the relevant pathways and key genes in the progression of AD. Differentially expressed genes between AD and the other two groups in the module were selected to identify biological mechanisms in AD through KEGG and PPI network analysis in Metascape. Genes with a high connectivity degree in the PPI network analysis were then selected to build a predictive model using different machine learning algorithms, and model performance was tested with five-fold cross-validation to select the best-fitting model. Results: A total of 20 co-expression gene clusters were identified after the network was constructed. Module 1 (in black) and module 2 (in royal blue) were most positively and most negatively correlated with AD, respectively. A total of 565 genes in module 1 and 215 genes in module 2 overlapped in the two lists of differentially expressed genes. They were enriched in the G protein-coupled receptor signaling pathway, immune-related processes, and other processes. Eleven genes were screened using lasso logistic regression and were considered to play an important role in predicting AD samples. The model built by the support vector machine algorithm with these 11 genes showed the best performance. Conclusion: These results shed light on the diagnosis and treatment of AD.
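The five-fold cross-validation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the nearest-centroid classifier, the single-feature toy data, and all function names are hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_val_accuracy(fit, predict, X, y, k=5, seed=0):
    """Mean held-out accuracy over k folds for a fit/predict pair."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k

# Hypothetical stand-in model: nearest class centroid on one feature.
def fit_centroid(X, y):
    centroids = {}
    for label in set(y):
        values = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = sum(values) / len(values)
    return centroids

def predict_centroid(model, x):
    return min(model, key=lambda label: abs(model[label] - x))

# Toy data (invented): one expression-like feature, two classes.
X = [0.1, 0.2, 0.3, 0.4, 0.5, 2.1, 2.2, 2.3, 2.4, 2.5]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
accuracy = cross_val_accuracy(fit_centroid, predict_centroid, X, y, k=5)
print(f"mean 5-fold accuracy: {accuracy:.2f}")
```

To compare several candidate algorithms, the same `cross_val_accuracy` call would be repeated with each fit/predict pair and the best-scoring model kept.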

PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11203
Author(s):  
Dingyu Chen ◽  
Chao Li ◽  
Yan Zhao ◽  
Jianjiang Zhou ◽  
Qinrong Wang ◽  
...  

Aim: Helicobacter pylori cytotoxin-associated protein A (CagA) is an important virulence factor known to induce gastric cancer development. However, the cause and the underlying molecular events of CagA induction remain unclear. Here, we applied integrated bioinformatics to identify the key genes involved in the process of CagA-induced gastric epithelial cell inflammation and canceration to comprehend the potential molecular mechanisms involved. Materials and Methods: AGS cells were transfected with pcDNA3.1 and pcDNA3.1::CagA for 24 h. The transfected cells were subjected to transcriptome sequencing to obtain the expressed genes. Differentially expressed genes (DEG) with adjusted P value < 0.05 and |logFC| > 2 were screened, and the R package was applied for gene ontology (GO) enrichment and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. The protein–protein interaction (PPI) network of the DEG was constructed using the STRING Cytoscape application, which was used for visual analysis to create the key function networks and identify the key genes. Next, the Kaplan–Meier plotter survival analysis tool was employed to analyze the survival associated with the key genes derived from the PPI network. Further analysis of key gene expression in gastric cancer and normal tissues was performed based on The Cancer Genome Atlas (TCGA) database and RT-qPCR verification. Results: After transfection, the AGS cells took on a hummingbird-shaped morphology and the level of CagA phosphorylation increased. Transcriptomics identified 6882 DEG, of which 4052 were upregulated and 2830 were downregulated, among which q-value < 0.05, FC > 2, and FC under the condition of ≤2. Accordingly, 1062 DEG were screened, of which 594 were upregulated and 468 were downregulated. The DEG participated in a total of 151 biological processes, 56 cell components, and 40 molecular functions. The KEGG pathway analysis revealed that the DEG were involved in 21 pathways.
The PPI network analysis revealed three highly interconnected clusters. In addition, 30 DEG with the highest degree were analyzed in the TCGA database. As a result, 12 DEG were found to be highly expressed in gastric cancer, while seven DEG were related to the poor prognosis of gastric cancer. RT-qPCR verification results showed that Helicobacter pylori CagA caused up-regulation of BPTF, caspase3, CDH1, CTNNB1, and POLR2A expression. Conclusion: The current comprehensive analysis provides new insights for exploring the effect of CagA in human gastric cancer, which could help us understand the molecular mechanism underlying the occurrence and development of gastric cancer caused by Helicobacter pylori.
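The DEG screening step in this abstract can be sketched as a simple threshold filter. The sketch assumes the stated criteria are adjusted P value < 0.05 and an absolute log fold change above 2; the gene names and numbers are invented toy values, and `filter_degs` is a hypothetical helper, not part of any package the authors used.

```python
def filter_degs(results, p_cutoff=0.05, lfc_cutoff=2.0):
    """Split genes passing both thresholds into up- and downregulated lists.

    results maps gene name -> (log fold change, adjusted P value).
    """
    up, down = [], []
    for gene, (log_fc, adj_p) in results.items():
        if adj_p < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return sorted(up), sorted(down)

# Toy input (invented values, for illustration only).
toy_results = {
    "gene_a": (3.1, 0.001),   # passes, upregulated
    "gene_b": (-2.6, 0.010),  # passes, downregulated
    "gene_c": (2.4, 0.200),   # fails the adjusted P threshold
    "gene_d": (0.8, 0.001),   # fails the fold-change threshold
}
up, down = filter_degs(toy_results)
print("up:", up, "down:", down)
```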


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7968 ◽  
Author(s):  
Jingwei Liu ◽  
Weixin Liu ◽  
Hao Li ◽  
Qiuping Deng ◽  
Meiqi Yang ◽  
...  

Background: As the most frequently occurring tumor in the biliary tract, cholangiocarcinoma (CCA) is mainly characterized by late diagnosis and poor outcome. It is therefore urgent to identify specific genes and pathways associated with its progression and prognosis. Materials and Methods: The differentially expressed genes in The Cancer Genome Atlas were analyzed to build the co-expression network by weighted gene co-expression network analysis (WGCNA). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted for the selected genes. Module–clinical trait relationships were analyzed to explore the association with clinicopathological parameters. Log-rank tests and Cox regression were used to identify the prognosis-related genes. Results: The modules most related to CCA development were the tan module, containing 181 genes, and the salmon module, containing 148 genes. GO analysis suggested enrichment terms of digestion, hormone transport and secretion, epithelial cell proliferation, signal release, fibroblast activation, response to acid chemical, Wnt, and nicotinamide adenine dinucleotide phosphate metabolism. KEGG analysis demonstrated 15 significantly altered pathways, including glutathione metabolism, Wnt, central carbon metabolism, mTOR, pancreatic secretion, protein digestion, axon guidance, retinol metabolism, insulin secretion, salivary secretion, and fat digestion. The key genes SOX2, KIT, PRSS56, WNT9A, SLC4A4, PRRG4, PANX2, PIR, RASSF8, MFSD4A, INS, RNF39, IL1R2, CST1, and PPP3CA might be potential prognostic markers for CCA, of which RNF39 and PRSS56 also showed significant correlation with clinical stage. Discussion: Differentially expressed genes and key modules contributing to CCA development were identified by WGCNA. Our results offer novel insights into the etiology, prognosis, and treatment of CCA.
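The log-rank test named in the methods compares survival between two groups, for example patients with high versus low expression of a gene. The following is a minimal standard-library sketch of the two-group log-rank chi-square statistic; the survival times are invented, and the function name is hypothetical. A real analysis would normally rely on an established survival package.

```python
def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times_* are follow-up times; events_* are 1 for an observed event
    and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        return 0.0
    return (observed_a - expected_a) ** 2 / variance

# Toy data (invented): group A fails early, group B fails late.
chi2 = logrank_chi2([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"log-rank chi-square: {chi2:.2f}")
```

A statistic above roughly 3.84 corresponds to p < 0.05 for a chi-square distribution with one degree of freedom.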


2021 ◽  
Vol 12 ◽  
Author(s):  
Ying Peng ◽  
Cheng Peng ◽  
Zheng Fang ◽  
Gang Chen

Endometriosis, a common disease that presents as polymorphism, invasiveness, and extensiveness, with clinical manifestations including dysmenorrhea, infertility, and menstrual abnormalities, seriously affects quality of life in women. To date, its underlying etiological mechanism of action and the associated regulatory genes remain unclear. This study aimed to identify molecular markers and elucidate mechanisms underlying the development and progression of endometriosis. Specifically, we downloaded five microarray expression datasets, namely, GSE11691, GSE23339, GSE25628, GSE7305, and GSE105764, from the Gene Expression Omnibus (GEO) database. These datasets, obtained from endometriosis tissues alongside normal controls, were subjected to in-depth bioinformatics analysis to identify differentially expressed genes (DEGs), followed by analysis of their function and pathways via gene ontology (GO) and KEGG pathway enrichment analyses. Moreover, we constructed a protein–protein interaction (PPI) network to explore the hub genes and modules, and then applied the machine learning algorithms support vector machine-recursive feature elimination (SVM-RFE) and least absolute shrinkage and selection operator (LASSO) analysis to identify key genes. Furthermore, we adopted the CIBERSORTx algorithm to estimate levels of immune cell infiltration, while the Connectivity Map (CMap) database was used to identify potential therapeutic drugs in endometriosis. As a result, a total of 423 DEGs, comprising 233 upregulated and 190 downregulated genes, were identified. In addition, a total of 1,733 PPIs were obtained from the PPI network. The DEGs were mainly enriched in immune-related mechanisms. Furthermore, machine learning and LASSO algorithms identified three key genes, namely, apelin receptor (APLNR), C–C motif chemokine ligand 21 (CCL21), and Fc fragment of IgG receptor IIa (FCGR2A).
Furthermore, 16 small molecular compounds associated with endometriosis treatment were identified, and their mechanism of action was also revealed. Taken together, the findings of this study provide new insights into the molecular factors regulating occurrence and progression of endometriosis and its underlying mechanism of action. The identified therapeutic drugs and molecular markers may have clinical significance in early diagnosis of endometriosis.
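Hub genes in a PPI network are commonly ranked by degree, meaning the number of distinct interaction partners. The sketch below shows that ranking on an invented edge list; the abstract does not state which centrality measure was used, so degree is an assumption here, and the gene labels and `top_hub_genes` helper are hypothetical.

```python
from collections import Counter

def top_hub_genes(edges, top_n=3):
    """Rank genes by degree in an undirected interaction network.

    Duplicate and reversed edges are counted once; self-loops are ignored.
    Ties are broken alphabetically so the output is deterministic.
    """
    unique_pairs = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for pair in unique_pairs:
        degree.update(pair)
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Toy edge list (invented). ("g3", "g2") repeats ("g2", "g3") in reverse.
toy_edges = [
    ("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
    ("g2", "g3"), ("g3", "g2"), ("g4", "g5"),
]
hubs = top_hub_genes(toy_edges, top_n=2)
print(hubs)
```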


2020 ◽  
Author(s):  
Nida Fatima

Abstract Background: Preoperative prognostication of clinical and surgical outcome in patients with neurosurgical diseases can improve risk stratification and thus guide the implementation of targeted treatment to minimize adverse events. Therefore, the author aims to highlight the development and validation of predictive models determining neurosurgical outcomes through machine learning algorithms using logistic regression. Methods: Logistic regression (enter, backward, and forward) and the least absolute shrinkage and selection operator (LASSO) method for selecting variables from a chosen database can lead to multiple candidate models. The final model, with its set of predictive variables, must be selected based on clinical knowledge and numerical results. Results: The predictive model that performs best on discrimination, calibration, Brier score, and decision curve analysis must be selected to develop machine learning algorithms. Logistic regression should be compared with the LASSO model. Usually, for big databases, the predictive model selected through logistic regression gives a higher area under the curve (AUC) than the LASSO model. The predictive probability derived from the best model could be uploaded to an open-access web application that patients and surgeons worldwide can easily use to make a risk assessment. Conclusions: Machine learning algorithms provide promising results for the prediction of outcomes following cranial and spinal surgery. These algorithms can provide useful factors for patient counselling, assessing peri-operative risk factors, and predicting post-operative outcomes after neurosurgery.
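Of the model-selection criteria listed in this abstract, the Brier score is the simplest to show concretely: it is the mean squared difference between predicted probabilities and observed binary outcomes, with lower values being better. The sketch below compares two candidate models on invented predictions; the model names and numbers are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Toy data (invented): observed outcomes and two models' predicted risks.
outcomes = [1, 0, 1, 0]
model_a_probs = [0.9, 0.1, 0.8, 0.3]  # confident and mostly right
model_b_probs = [0.6, 0.4, 0.5, 0.5]  # hedges toward 0.5

score_a = brier_score(outcomes, model_a_probs)
score_b = brier_score(outcomes, model_b_probs)
print(f"model A: {score_a:.4f}, model B: {score_b:.4f}")
```

Here the more confident, well-calibrated model A obtains the lower score, so it would be preferred on this criterion alone.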


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e17000-e17000
Author(s):  
Yimin Li ◽  
Mei Lan ◽  
Xinhao Peng ◽  
Zijian Zhang ◽  
Jin Yi Lang

Background: Cervical cancer is the fourth most frequently diagnosed malignancy affecting women worldwide. However, effective prognostic biomarkers for accurately identifying high-risk patients are still limited. Here, we provide a co-expression network and machine learning-based signature to predict the survival of cervical cancer. Methods: Utilizing expression profiles from The Cancer Genome Atlas datasets, we identified differentially expressed genes (DEGs) and the most significant module by differential expression analysis and Weighted Gene Co-expression Network Analysis, respectively. The candidate genes were obtained by combining both results. The prognostic classifier was then constructed by LASSO Cox regression analysis and validated in the testing set. Finally, survival receiver operating characteristic and Cox proportional hazards analyses were used to assess the performance of prognostic prediction. Results: We identified 190 DEGs between cervical squamous cell cancer (CSCC) and normal samples in the purple module. Next, we built an 8-mRNA-based signature and determined an optimal cutoff value with a sensitivity of 0.889 and a specificity of 0.785. Patients were classified into high-risk and low-risk groups with significantly different overall survival (training set: p < 0.0001; testing set: p = 0.039). Furthermore, the prognostic classifier was an independent and powerful prognostic biomarker for OS (HR = 7.05, 95% CI: 2.52-19.71, p < 0.001). Conclusions: The prognostic classifier is a promising predictor for CSCC patients. The novel co-expression network and machine learning-based strategy described in the study may have broad application in precision medicine.
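The abstract reports an optimal cutoff together with its sensitivity and specificity but does not say how the cutoff was chosen. One common choice, assumed here purely for illustration, is to maximize Youden's index (sensitivity + specificity - 1) over the observed risk scores. The scores, labels, and function names below are invented.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as high risk and compare with 0/1 labels."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff_youden(scores, labels):
    """Return the observed score that maximizes Youden's index."""
    def youden(cutoff):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden)

# Toy data (invented): risk scores and event labels (1 = event).
risk_scores = [0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95]
events = [0, 0, 0, 1, 0, 1, 1, 1, 1]

cutoff = best_cutoff_youden(risk_scores, events)
sens, spec = sensitivity_specificity(risk_scores, events, cutoff)
print(f"cutoff={cutoff}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```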


2021 ◽  
Vol 9 ◽  
Author(s):  
Qinlin Shi ◽  
Bo Tang ◽  
Yanping Li ◽  
Yonglin Li ◽  
Tao Lin ◽  
...  

Objective: Wilms tumor (WT) is a common malignant solid tumor in children. Many tumor biomarkers have been reported; however, few targetable molecular mechanisms have been defined in WT. This study aimed to identify the oncogene in WT and explore the potential mechanisms. Methods: Differentially expressed genes (DEGs) were identified in three independent RNA-seq datasets downloaded from The Cancer Genome Atlas data portal and the Gene Expression Omnibus database (GSE66405 and GSE73209). The common DEGs were then subjected to Gene Ontology enrichment analysis, protein–protein interaction (PPI) network analysis, and gene set enrichment analysis. The protein expression levels of the hub gene were analyzed by immunohistochemical analysis and Western blotting in 60 WT samples. Univariate Kaplan–Meier analysis for overall survival was performed, and the log-rank test was utilized. A small interfering RNA targeting cell division cycle 20 (CDC20) was transfected into G401 and SK-NEP-1 cell lines. The Cell Counting Kit-8 assay and wound healing assay were used to observe the changes in cell proliferation and migration after transfection. Flow cytometry was used to detect the effect on the cell cycle. Western blotting was conducted to study the changes in related functional proteins. Results: We identified 44 upregulated and 272 downregulated DEGs common to the three independent RNA-seq datasets. Gene and pathway enrichment analyses of the regulatory networks involving hub genes suggested that cell cycle changes are crucial in WT. The top 15 highly connected genes were found by PPI network analysis. Furthermore, one candidate diagnostic biomarker for WT, CDC20, was detected, and its high expression predicted poor prognosis of WT patients. Moreover, the area under the curve value obtained by receiver operating characteristic curve analysis from paired WT samples was 0.9181.
Finally, we found that the suppression of CDC20 inhibited proliferation and migration and resulted in G2/M phase arrest in WT cells. The mechanism may involve increased protein levels of securin, cyclin B1, and cyclin A. Conclusion: Our results suggest that CDC20 could serve as a candidate diagnostic and prognostic biomarker for WT, and suppression of CDC20 may be a potential approach for the prevention and treatment of WT.
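The area under the ROC curve reported in this abstract has a direct probabilistic reading: it is the chance that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, with ties counted as half. The sketch below computes it that way on invented expression values; it does not reproduce the study's data or its 0.9181 result.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Toy data (invented): marker expression in tumor (1) and normal (0) samples.
expression = [5.1, 6.3, 4.0, 7.2, 3.9, 4.5, 2.8, 3.1]
is_tumor = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(expression, is_tumor)
print(f"AUC = {auc:.4f}")
```

Fifteen of the sixteen tumor-normal pairs are ordered correctly in this toy set, which gives an AUC of 0.9375.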

