Breast Cancer Case Identification Based on Deep Learning and Bioinformatics Analysis

2021 ◽  
Vol 12 ◽  
Author(s):  
Dongfang Jia ◽  
Cheng Chen ◽  
Chen Chen ◽  
Fangfang Chen ◽  
Ningrui Zhang ◽  
...  

Understanding the molecular mechanisms of breast cancer (BC) can provide deeper insight into BC pathology. This study reviewed existing technologies for diagnosing BC, such as mammography, ultrasound, magnetic resonance imaging (MRI), computed tomography (CT), and positron emission tomography (PET), and summarized their disadvantages. The purpose of this article is to use gene expression profiles from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) to classify BC samples and normal samples. The proposed method overcomes some of the shortcomings of traditional diagnostic methods: it can diagnose BC more rapidly, with high sensitivity, and without radiation. The study first selected the genes most relevant to cancer through weighted gene co-expression network analysis (WGCNA) and differential expression analysis (DEA). It then used a protein–protein interaction (PPI) network to screen 23 hub genes. Finally, it used a support vector machine (SVM), decision tree (DT), Bayesian network (BN), artificial neural network (ANN), and the convolutional neural networks CNN-LeNet and CNN-AlexNet to process the expression levels of the 23 hub genes. For gene expression profiles, the ANN model performed best in classifying cancer samples. Averaged over ten runs, the accuracy was 97.36% (±0.34%), the F1 value was 0.8535 (±0.0260), the sensitivity was 98.32% (±0.32%), the specificity was 89.59% (±3.53%), and the AUC was 0.99. In summary, this method effectively distinguishes cancer samples from normal samples and offers reasonable new ideas for the early diagnosis of cancer.
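The metrics reported above (accuracy, F1, sensitivity, specificity) all derive from a binary confusion matrix. The following is a minimal sketch of those definitions, not the authors' evaluation code. The labels and predictions are made-up toy values, and treating "cancer" (label 1) as the positive class is an assumption; the abstract does not say which class its F1 value refers to.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity and F1 as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on positives
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy example (hypothetical labels): 1 = cancer sample, 0 = normal sample.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With imbalanced data, where tumor samples far outnumber normal ones, accuracy and sensitivity can be high while specificity lags, which is why the abstract reports these figures separately.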

Author(s):  
Bong-Hyun Kim ◽  
Kijin Yu ◽  
Peter C W Lee

Motivation: Cancer classification based on gene expression profiles has provided insight into the causes of cancer and cancer treatment. Recently, machine learning-based approaches have been attempted in downstream cancer analysis to address the large differences in gene expression values, as determined by single-cell RNA sequencing (scRNA-seq). Results: We designed cancer classifiers that can identify 21 types of cancers and normal tissues based on bulk RNA-seq as well as scRNA-seq data. Training was performed with 7398 cancer samples and 640 normal samples from 21 tumors and normal tissues in TCGA based on the 300 most significant genes expressed in each cancer. Then, we compared neural network (NN), support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF) methods. The NN performed consistently better than the other methods. We further applied our approach to scRNA-seq data transformed by kNN smoothing and found that our model successfully classified cancer types and normal samples. Availability and implementation: Cancer classification by neural network. Supplementary information: Supplementary data are available at Bioinformatics online.
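Of the classifiers compared above, k-nearest neighbors is the simplest to illustrate. The sketch below shows plain kNN classification by majority vote over Euclidean distance. It is an illustration only, not the authors' pipeline, and it is not the kNN smoothing step used for scRNA-seq data. The two-gene expression vectors and class labels are hypothetical toy values.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training samples (Euclidean distance)."""
    # Pair each training sample's distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical expression vectors (two genes per sample) and labels.
train_X = [
    (1.0, 0.2), (0.9, 0.1), (1.1, 0.3),   # tumor-like profiles
    (0.1, 1.0), (0.2, 0.9), (0.0, 1.1),   # normal-like profiles
]
train_y = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

pred_a = knn_predict(train_X, train_y, (1.0, 0.25))
pred_b = knn_predict(train_X, train_y, (0.1, 0.95))
print(pred_a, pred_b)
```

An odd `k` avoids tied votes in a two-class setting; with 21 cancer types, ties would need an explicit rule.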


Author(s):  
Chengzhang Li ◽  
Jiucheng Xu

Background: Hepatocellular carcinoma (HCC) is a major threat to public health. However, few effective therapeutic strategies exist. We aimed to identify potential therapeutic target genes for HCC by analyzing three gene expression profiles. Methods: The gene expression profiles were analyzed with GEO2R, an interactive web tool for differential gene expression analysis, to identify common differentially expressed genes (DEGs). Functional enrichment analyses were then conducted, followed by construction of a protein-protein interaction (PPI) network from the common DEGs. The PPI network was used to identify hub genes, and the expression levels of the hub genes were validated by data mining the Oncomine database. Survival analysis was carried out to assess the prognostic value of the hub genes in HCC patients. Results: A total of 51 common up-regulated DEGs and 201 down-regulated DEGs were obtained after differential expression analysis of the profiles. Functional enrichment analyses indicated that these common DEGs are linked to a series of cancer events. We finally identified 10 hub genes, six of which (OIP5, ASPM, NUSAP1, UBE2C, CCNA2, and KIF20A) are reported as novel HCC hub genes. Data mining the Oncomine database validated that the hub genes are expressed at significantly higher levels in HCC samples than in normal samples (t-test, p < 0.05). Survival analysis indicated that overexpression of the hub genes is associated with a significant reduction (p < 0.05) in survival time in HCC patients. Conclusions: We identified six novel HCC hub genes that might be therapeutic targets for the development of drugs for some HCC patients.
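Finding "common DEGs" across several expression profiles amounts to filtering each profile by fold change and adjusted p-value, then intersecting the resulting gene sets. The sketch below illustrates that step under stated assumptions. The cutoffs (|log2FC| ≥ 1, adjusted p < 0.05) are common defaults rather than values given in the abstract, and the gene names and statistics are invented placeholders.

```python
def split_degs(table, lfc_cut=1.0, p_cut=0.05):
    """Split one profile into up- and down-regulated gene sets.

    `table` maps gene -> (log2 fold change, adjusted p-value).
    The cutoffs are illustrative assumptions, not the study's values.
    """
    up = {g for g, (lfc, p) in table.items() if p < p_cut and lfc >= lfc_cut}
    down = {g for g, (lfc, p) in table.items() if p < p_cut and lfc <= -lfc_cut}
    return up, down


def common_degs(tables, lfc_cut=1.0, p_cut=0.05):
    """Intersect up/down DEG sets across a non-empty list of profiles."""
    ups, downs = zip(*(split_degs(t, lfc_cut, p_cut) for t in tables))
    return set.intersection(*ups), set.intersection(*downs)


# Hypothetical per-profile results (placeholder gene names and statistics).
profile_1 = {"GENE_A": (2.1, 0.001), "GENE_B": (-1.8, 0.01),
             "GENE_C": (1.5, 0.2), "GENE_D": (1.2, 0.03)}
profile_2 = {"GENE_A": (1.7, 0.004), "GENE_B": (-2.2, 0.002),
             "GENE_C": (1.9, 0.01), "GENE_D": (0.4, 0.01)}
profile_3 = {"GENE_A": (1.3, 0.02), "GENE_B": (-1.1, 0.04),
             "GENE_D": (1.6, 0.01)}

common_up, common_down = common_degs([profile_1, profile_2, profile_3])
print(sorted(common_up), sorted(common_down))
```

Keeping up- and down-regulated genes in separate sets ensures that a gene counted as "common" changes in the same direction in every profile.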


2020 ◽  
Author(s):  
Xing Chen ◽  
Junjie Zheng ◽  
Min ling Zhuo ◽  
Ailong Zhang ◽  
Zhenhui You

Background: Breast cancer (BRCA) is the most common malignancy among women worldwide and has high mortality. Radiotherapy is a prevalent therapy for BRCA, but its effectiveness is heterogeneous among patients. Methods: We proposed to develop a gene expression-based signature for predicting BRCA radiotherapy sensitivity. Gene expression profiles of BRCA samples from The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) were obtained and used as the training and independent testing datasets, respectively. Differentially expressed genes (DEGs) in BRCA tumor samples compared with their paracancerous samples in the training set were identified using the edgeR Bioconductor package, followed by dimensionality reduction with an autoencoder and univariate Cox regression analysis to screen DEGs with significant prognostic value in patients previously treated with radiation. LASSO Cox regression was applied to select the optimal genes for constructing the radiotherapy sensitivity prediction signature. Results: 603 DEGs were obtained in BRCA tumor samples, seven of which were retained after univariate Cox regression analysis. LASSO Cox regression analysis finally retained six genes, on which the radiotherapy sensitivity prediction model was built. The signature proved robust in both the training and independent testing sets and was an independent marker for BRCA radiotherapy sensitivity prediction. Conclusions: This study should help guide therapy selection and clinical decisions for BRCA patients.


2020 ◽  
Vol 40 (5) ◽  
Author(s):  
Xinhua Liu ◽  
Yonglin Peng ◽  
Ju Wang

Breast cancer is a common malignant tumor among women whose prognosis is largely determined by the timing and accuracy of diagnosis. Here we propose to identify a robust DNA methylation-based, breast cancer-specific diagnostic signature. Genome-wide DNA methylation and gene expression profiles of breast cancer patients, along with their adjacent normal tissues, from The Cancer Genome Atlas (TCGA) were obtained as the training set. CpGs with significantly higher methylation levels in breast cancer than in adjacent normal tissues, in ten other common cancers from TCGA, and in healthy breast tissues from the Gene Expression Omnibus (GEO) were retained for logistic regression analysis. Another independent breast cancer DNA methylation dataset from GEO was used as the testing set. Many CpGs were hypermethylated in breast cancer samples compared with adjacent normal tissues, and their methylation tended to be negatively correlated with gene expression. Eight CpGs located at RIIAD1, ENPP2, ESPN, and ETS1 were finally retained. The diagnostic model was reliable in separating BRCA from normal samples. In addition, the chromatin accessibility status of RIIAD1, ENPP2, ESPN, and ETS1 differed greatly between the MCF-7 and MDA-MB-231 cell lines. In conclusion, the present study should be helpful for the early and accurate diagnosis of breast cancer.


2020 ◽  
Author(s):  
Seokhyun Yoon ◽  
Hye Sung Won ◽  
Keunsoo Kang ◽  
Kexin Qiu ◽  
Woong June Park ◽  
...  

The cost of next-generation sequencing technologies is rapidly declining, making RNA-seq-based gene expression profiling (GEP) an affordable technique for predicting receptor expression status and intrinsic subtypes in breast cancer (BRCA) patients. Based on the expression levels of co-expressed genes, GEP-based receptor-status prediction can classify clinical subtypes more accurately than immunohistochemistry (IHC) can. Using The Cancer Genome Atlas (TCGA) BRCA and METABRIC datasets, we identified common predictor genes found in both datasets and performed receptor-status prediction based on these genes. By assessing the survival outcomes of patients classified using GEP- or IHC-based receptor status, we compared the prognostic value of the two methods. We found that GEP-based hormone receptor (HR) prediction provided higher concordance with the intrinsic subtypes and a stronger association with treatment outcomes than did IHC-based HR status. GEP-based prediction improved the identification of patients who could benefit from hormone therapy, even among patients with non-luminal BRCA. We also confirmed that non-matching subgroup classification affected the survival of BRCA patients and that this could be largely overcome by GEP-based receptor-status prediction. In conclusion, GEP-based prediction provides more reliable classification of HR status, improving therapeutic decision making for breast cancer patients.


Diagnostics ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 726
Author(s):  
Hoang Dang Khoa Ta ◽  
Wan-Chun Tang ◽  
Nam Nhut Phan ◽  
Gangga Anuraga ◽  
Sz-Ying Hou ◽  
...  

Breast cancer (BRCA) is one of the most complex diseases and involves several biological processes. Members of the L-antigen (LAGE) family participate in the development of various cancers, but their expression and prognostic value in breast cancer remain to be clarified. High-throughput methods for exploring disease progression mechanisms might play a pivotal role in the development of novel therapeutics. Therefore, gene expression profiles and clinical data of LAGE family members were acquired from the cBioPortal database, followed by verification using the Oncomine and The Cancer Genome Atlas (TCGA) databases. In addition, the Kaplan-Meier method was applied to explore correlations between the expression of LAGE family members and the prognoses of breast cancer patients. MetaCore, ClueGO, and CluePedia were used to comprehensively study the transcript expression signatures of LAGEs and their co-expressed genes, together with LAGE-related signal transduction pathways in BRCA. The results indicated that LAGE3 messenger (m)RNA expression was higher in BRCA tissues than in normal tissues and was also associated with the stage of BRCA patients. Kaplan-Meier plots showed that overexpression of LAGE1, LAGE2A, LAGE2B, and LAGE3 was highly correlated with poor survival in most types of breast cancer. Functional enrichment analyses indicated that LAGE family genes were significantly associated with the cell cycle, focal adhesion, and extracellular matrix (ECM) receptor interactions. Collectively, the expression levels of LAGE family members were related to adverse clinicopathological factors and prognoses of BRCA patients; therefore, LAGEs have the potential to serve as prognosticators for BRCA patients.


2021 ◽  
Author(s):  
Gang Chen ◽  
Mingwei Yu ◽  
Jianqiao Cao ◽  
Huishan Zhao ◽  
Yuanping Dai ◽  
...  

Background: Breast cancer (BC) is a malignancy with a high incidence among women worldwide, and there is an urgent need to identify significant biomarkers and molecular therapy methods. Methods: A total of 58 normal tissues and 203 cancer tissues were collected from three Gene Expression Omnibus (GEO) gene expression profiles, and the differentially expressed genes (DEGs) were identified. Subsequently, Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were performed. Additionally, hub genes were screened by constructing a protein-protein interaction (PPI) network. Then, we explored the prognostic value and molecular mechanisms of these hub genes using Kaplan-Meier (KM) curves and Gene Set Enrichment Analysis (GSEA). Results: 42 up-regulated and 82 down-regulated DEGs were screened out from the GEO datasets. GO and KEGG pathway analysis revealed that the DEGs were mainly related to the cell cycle and cell proliferation. Furthermore, 12 hub genes (FN1, AURKA, CCNB1, BUB1B, PRC1, TPX2, NUSAP1, TOP2A, KIF20A, KIF2C, RRM2, ASPM) with a high degree were selected, 11 of which were significantly correlated with the prognosis of patients with BC. GSEA revealed correlations with the KEGG_CELL_CYCLE and HALLMARK_P53_PATHWAY gene sets. Conclusion: This study identified 11 key genes as potential prognostic biomarkers for BC on the basis of integrated bioinformatics analysis. This finding will improve our knowledge of BC progression and mechanisms.
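Selecting hub genes "with a high degree" means ranking the nodes of the PPI network by their number of distinct interaction partners. The following sketch shows that ranking on a toy edge list. The protein names `P1` to `P5` and their interactions are hypothetical placeholders, and real analyses typically add a confidence cutoff on interaction scores, which is omitted here.

```python
from collections import defaultdict


def node_degrees(edges):
    """Return {node: number of distinct neighbours} for an undirected graph.

    Duplicate edges and self-loops are ignored.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def top_hubs(edges, n=2):
    """Return the n highest-degree nodes; ties are broken alphabetically."""
    degree = node_degrees(edges)
    return sorted(degree, key=lambda g: (-degree[g], g))[:n]


# Hypothetical interaction list; ("P2", "P1") duplicates ("P1", "P2").
edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P4", "P5"), ("P2", "P1"),
]

degrees = node_degrees(edges)
hubs = top_hubs(edges, n=2)
print(degrees, hubs)
```

Degree is only one centrality measure; the alphabetical tie-break is an arbitrary choice made here to keep the output deterministic.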


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10468
Author(s):  
Kai Zhang ◽  
Kuikui Jiang ◽  
Ruoxi Hong ◽  
Fei Xu ◽  
Wen Xia ◽  
...  

Background: Tamoxifen resistance in breast cancer is an unsolved problem in clinical practice. The aim of this study was to determine the potential mechanisms of tamoxifen resistance through bioinformatics analysis. Methods: Gene expression profiles of tamoxifen-resistant MCF-7/TR and MCF-7 cells were acquired from the Gene Expression Omnibus dataset GSE26459, and differentially expressed genes (DEGs) were detected with R software. We conducted Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses using the Database for Annotation, Visualization and Integrated Discovery. A protein–protein interaction (PPI) network was generated, and we analyzed hub genes in the network with the Search Tool for the Retrieval of Interacting Genes database. Finally, we used siRNAs to silence the target genes and conducted the MTS assay. Results: We identified 865 DEGs, 399 of which were upregulated. GO analysis indicated that most genes are related to telomere organization, extracellular exosomes, and binding-related items for protein heterodimerization. PPI network construction revealed that the top 10 hub genes (ACLY, HSPD1, PFAS, GART, TXN, HSPH1, HSPE1, IRAS, TRAP1, and ATIC) might be associated with tamoxifen resistance. Consistently, RT-qPCR analysis indicated that the expression of these 10 genes was increased in MCF-7/TR cells compared with MCF-7 cells. Four hub genes (TXN, HSPD1, HSPH1 and ATIC) were related to overall survival in patients who received tamoxifen. In addition, knockdown of HSPH1 by siRNA may reduce the growth of MCF-7/TR cells, with a trend close to significance (P = 0.07), indicating that upregulation of HSPH1 may play a role in tamoxifen resistance. Conclusion: This study revealed a number of critical hub genes that might serve as therapeutic targets in tamoxifen-resistant breast cancer and provided potential directions for uncovering the mechanisms of tamoxifen resistance.


2019 ◽  
Vol 39 (9) ◽  
Author(s):  
Keling Liu ◽  
Qingmei Fu ◽  
Yao Liu ◽  
Chenhong Wang

Preeclampsia (PE) is a disorder of pregnancy that is characterised by hypertension and a significant amount of proteinuria beginning after 20 weeks of pregnancy. It is closely associated with high maternal morbidity, mortality, maternal organ dysfunction or foetal growth restriction. Therefore, it is necessary to identify early and novel diagnostic biomarkers of PE. In the present study, we performed a multi-step integrative bioinformatics analysis of microarray data to identify hub genes as diagnostic biomarkers of PE. Using gene expression profiles from the Gene Expression Omnibus (GEO) dataset GSE60438, a total of 268 dysregulated genes were identified, including 131 up- and 137 down-regulated differentially expressed genes (DEGs). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses suggested that the DEGs were significantly enriched in disease-related biological processes (BPs) such as hormone activity, immune response, steroid hormone biosynthesis, metabolic pathways, and other signalling pathways. Using the STRING database, we established a protein–protein interaction (PPI) network based on the above DEGs. Module analysis and identification of hub genes were performed to screen a total of 17 significant hub genes. A support vector machine (SVM) model was used to assess the potential of these biomarkers for PE diagnosis; it achieved an area under the receiver operating characteristic (ROC) curve (AUC) of 0.958 in the training set and 0.834 in the test set, suggesting that this risk classifier discriminates well between PE patients and control samples. Our results demonstrated that these 17 differentially expressed hub genes can be used as potential biomarkers for the diagnosis of PE.
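The AUC values quoted above can be read as the probability that a randomly chosen patient sample receives a higher classifier score than a randomly chosen control sample. The sketch below computes AUC directly from that pairwise definition. The labels and scores are hypothetical toy values standing in for a classifier's decision values, and nothing here reproduces the study's model.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = patient, 0 = control; scores are made-up outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in sample count, which is fine for small cohorts; a rank-based computation gives the same value more efficiently on large datasets. A drop in AUC from training set to test set, as in the abstract, is the usual sign that a model fits its training data more closely than unseen data.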


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Baojie Wu ◽  
Shuyi Xi

Background: This study aimed to explore and identify key genes and signaling pathways that contribute to the progression of cervical cancer, in order to improve prognosis. Methods: Three gene expression profiles (GSE63514, GSE64217 and GSE138080) were screened and downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed genes (DEGs) were screened using the GEO2R and Venn diagram tools. Then, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed. Gene set enrichment analysis (GSEA) was performed to analyze the three gene expression profiles. Moreover, a protein–protein interaction (PPI) network of the DEGs was constructed, and functional enrichment analysis was performed. On this basis, hub genes from critical PPI subnetworks were explored with Cytoscape software. The expression of these genes in tumors was verified, and survival analysis of potential prognostic genes from critical subnetworks was conducted. Functional annotation, multiple gene comparison and dimensionality reduction in candidate genes indicated the clinical significance of potential targets. Results: A total of 476 DEGs were screened: 253 upregulated genes and 223 downregulated genes. DEGs were enriched in 22 biological processes, 16 cellular components and 9 molecular functions in precancerous lesions and cervical cancer. DEGs were mainly enriched in 10 KEGG pathways. Through intersection analysis and data mining, 3 key KEGG pathways and related core genes were revealed by GSEA. Moreover, a PPI network of 476 DEGs was constructed, hub genes from 12 critical subnetworks were explored, and a total of 14 potential molecular targets were obtained. Conclusions: These findings advance the understanding of the molecular mechanisms of cervical cancer and of clinically relevant molecular targets for it.

