An Improved, Assay Platform Agnostic, Absolute Single Sample Breast Cancer Subtype Classifier

Cancers ◽  
2020 ◽  
Vol 12 (12) ◽  
pp. 3506
Author(s):  
Mi-kyoung Seo ◽  
Soonmyung Paik ◽  
Sangwoo Kim

While intrinsic molecular subtypes provide important biological classification of breast cancer, the subtype assignment of individuals is influenced by assay technology and study cohort composition. We sought to develop a platform-independent absolute single-sample subtype classifier based on a minimal number of genes. Pairwise ratios for subtype-specific differentially expressed genes from un-normalized expression data from 432 breast cancer (BC) samples of The Cancer Genome Atlas (TCGA) were used as inputs for machine learning. The subtype classifier with the fewest number of genes and maximal classification power was selected during cross-validation. The final model was evaluated on 5816 samples from 10 independent studies profiled with four different assay platforms. Upon cross-validation within the TCGA cohort, a random forest classifier (MiniABS) with 11 genes achieved the best accuracy of 88.2%. Applying MiniABS to five validation sets of RNA-seq and microarray data showed an average accuracy of 85.15% (vs. 77.72% for Absolute Intrinsic Molecular Subtype (AIMS)). Only MiniABS could be applied to five low-throughput datasets, showing an average accuracy of 87.93%. The MiniABS can absolutely subtype BC using the raw expression levels of only 11 genes, regardless of assay platform, with higher accuracy than existing methods.
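
The abstract above feeds pairwise gene ratios from un-normalized expression data into a classifier. The following is a minimal sketch of that feature step only, not the authors' implementation: the function name, the log2 transform, and the placeholder gene names are assumptions made for illustration, and the random forest itself is omitted. It shows why within-sample ratios are attractive for a platform-independent classifier: multiplying every value in a sample by the same factor leaves the features unchanged.

```python
import math
from itertools import combinations


def pairwise_log_ratios(expression):
    """Return log2 ratios for every gene pair within a single sample.

    `expression` maps gene name -> raw (un-normalized) expression value.
    All values must be strictly positive; handling of zeros (for example
    with a pseudocount) is left out of this sketch.
    """
    if any(value <= 0 for value in expression.values()):
        raise ValueError("expression values must be strictly positive")
    genes = sorted(expression)
    return {
        f"{a}/{b}": math.log2(expression[a] / expression[b])
        for a, b in combinations(genes, 2)
    }


# Hypothetical sample with placeholder gene names (not the MiniABS genes).
sample = {"GENE_A": 120.0, "GENE_B": 30.0, "GENE_C": 60.0}
features = pairwise_log_ratios(sample)

# The same sample measured on a scale ten times larger.
scaled_sample = {gene: value * 10 for gene, value in sample.items()}
scaled_features = pairwise_log_ratios(scaled_sample)
```

Because each feature depends only on the ratio of two genes within one sample, no cohort-level normalization is needed before classification. With 11 genes there are at most 55 distinct pairs.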

2021 ◽  
Vol 11 ◽  
Author(s):  
Xiaoxiao Zhong ◽  
Jun Li ◽  
Xin Wu ◽  
Xianrui Wu ◽  
Lin Hu ◽  
...  

We aimed to identify a signature comprising N6-methyladenosine (m6A)-related long non-coding RNAs (lncRNAs) and molecular subtypes associated with breast cancer (BRCA). We obtained data of BRCA samples from The Cancer Genome Atlas database. The m6A-related lncRNA prognostic signature (m6A-LPS), which included 10 lncRNAs previously identified as prognostic m6A-related lncRNAs, was constructed using integrated bioinformatics analysis and then validated. A risk score based on the m6A-LPS was established and confirmed differences in survival between high-risk and low-risk groups. Three distinct genotypes were identified, whose characteristics included features of the tumor immune microenvironment in each subtype. Our results indicated that patients in Cluster 2 might have a worse prognostic outcome than those in other clusters. The three genotypes and the risk subgroups were each enriched in different biological processes and pathways. We then constructed a competing endogenous RNA network based on the prognostic m6A-related lncRNAs. Finally, we validated the expression levels of target lncRNAs in 72 clinical samples. In summary, the m6A-LPS and the potentially novel genotype may provide a theoretical basis for further study of the molecular mechanism of BRCA and may provide novel insights into precision medicine.


2018 ◽  
Vol 50 (3) ◽  
pp. 658-669 ◽  
Author(s):  
Ki-Tae Hwang ◽  
Kwangsoo Kim ◽  
Ji Hyun Chang ◽  
Sohee Oh ◽  
Young A Kim ◽  
...  

2020 ◽  
Vol 22 (Supplement_2) ◽  
pp. ii28-ii28
Author(s):  
Alvaro Alvarado ◽  
Kaleab Tessema ◽  
Kunal Patel ◽  
Riki Kawaguchi ◽  
Richard Everson ◽  
...  

Despite efforts to gain a deeper understanding of its molecular architecture, glioblastoma (GBM) remains uniformly fatal. While genome-based molecular subtyping has revealed that GBMs may be parsed into several molecularly distinct categories, this insight has yielded little progress towards extending patient survival. In particular, the great phenotypic heterogeneity of GBM, both inter- and intratumoral, has hindered therapeutic efforts. To this end, we interrogated tumor samples using a pathway-based approach to resolve tumoral heterogeneity. Gene set enrichment analysis (GSEA) was applied to gene expression data to provide an overview of each sample, which can be compared to other samples by generating sample clusters based on overall patterns of enrichment. The Cancer Genome Atlas (TCGA) samples were clustered using the canonical and oncogenic signatures; in both cases the clustering was distinct from the previously reported molecular subtypes, and the clusters were informative of patient survival. We also analyzed single-cell RNA sequencing datasets and uniformly found two clusters of cells enriched for cell cycle regulation and survival pathways. We validated our approach by generating gene lists from common elements found in the top contributing gene sets for a particular cluster and testing the top targets in appropriate patient-derived gliomasphere lines. Samples enriched for cell cycle-related gene sets showed a decrease in sphere formation capacity when E2F1, our top target, was silenced and when treated with fulvestrant and calcitriol, which were identified as potential drugs targeting this gene list. Conversely, no changes were observed in samples not enriched for this gene list. Finally, we interrogated spatial heterogeneity and found higher enrichment of the proliferative signature in contrast-enhancing compared with non-enhancing regions. Our studies relate inter- and intratumoral heterogeneity to critical cellular pathways dysregulated in GBM, with the ultimate goal of establishing a pipeline for patient- and tumor-specific precision medicine.


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
Dongquan Chen ◽  
Yufeng Li ◽  
Lizhong Wang ◽  
Kai Jiao

Breast cancer (BC) is the second most common cancer diagnosed in American women and is also the second leading cause of cancer death in women. Research has focused heavily on BC metastasis. Multiple signaling pathways have been implicated in regulating BC metastasis. Our knowledge of the regulation of BC metastasis is, however, far from complete. Identification of new factors during metastasis is an essential step towards future therapy. Our labs have focused on Semaphorin 6D (SEMA6D), which has been implicated in immune responses, heart development, and neurogenesis. It is of interest to characterize the SEMA6D-related genomic expression profile and its implications for clinical outcome. In this study, we examined the public datasets of breast invasive carcinoma from The Cancer Genome Atlas (TCGA). We analyzed the expression of SEMA6D along with its related genes, their functions, pathways, and potential as copredictors for BC patients’ survival. We found a 6-gene expression profile that can be used as such a predictor. Our study provides evidence for the first time that breast invasive carcinoma may contain a subtype based on SEMA6D expression. The expression of the SEMA6D gene may play an important role in promoting patient survival, especially among triple-negative breast cancer patients.


2018 ◽  
Vol 11 ◽  
pp. 1-11 ◽  
Author(s):  
Chundi Gao ◽  
Huayao Li ◽  
Jing Zhuang ◽  
HongXiu Zhang ◽  
Kejia Wang ◽  
...  

2020 ◽  
Author(s):  
Zhuoran Xu ◽  
Akanksha Verma ◽  
Uska Naveed ◽  
Samuel Bakhoum ◽  
Pegah Khosravi ◽  
...  

Chromosomal instability (CIN) is a hallmark of human cancer that involves mis-segregation of chromosomes during mitosis, leading to aneuploidy and genomic copy number heterogeneity. CIN is a prognostic marker in a variety of cancers, yet gold-standard experimental assessment of chromosome mis-segregation is difficult in the routine clinical setting. As a result, CIN status is not readily testable for cancer patients in such settings. On the other hand, histopathological examination, the gold standard for cancer diagnosis and grading, is ubiquitously available. In this study, we sought to explore whether CIN status can be predicted using hematoxylin and eosin (H&E) histology in breast cancer patients. Specifically, we examined whether CIN, defined using a genomic aneuploidy burden approach, can be predicted using a deep learning-based model. We applied transfer learning on convolutional neural network (CNN) models to extract histological features and trained a multilayer perceptron (MLP) after aggregating patch features obtained from whole slide images. When applied to a breast cancer cohort of 1,010 patients (Training set: n=858 patients, Test set: n=152 patients) from The Cancer Genome Atlas (TCGA), in which 485 patients have high CIN status, our model accurately classified CIN status, achieving an area under the curve (AUC) of 0.822 with 81.2% sensitivity and 68.7% specificity in the test set. Patch-level predictions of CIN status suggested intra-tumor spatial heterogeneity within slides. Moreover, the presence of patches with a high predicted CIN score within an entire slide was more predictive of clinical outcome than the average CIN score of the slide, underscoring the clinical importance of spatial heterogeneity. Overall, we demonstrated the ability of deep learning methods to predict CIN status based on histopathology slide images. Our model is not breast cancer subtype specific, and the method can potentially be extended to other cancer types.
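
The contrast drawn above between a slide's average CIN score and the presence of high-scoring patches can be illustrated with a short sketch. This is not the authors' pipeline: the function name, the 0.5 threshold, and the patch scores are hypothetical, chosen only to show how the two slide-level summaries can disagree.

```python
from statistics import mean


def summarize_slide(patch_scores, high_threshold=0.5):
    """Summarize per-patch predicted CIN scores for one slide.

    `patch_scores` is a list of scores in [0, 1], one per image patch.
    `high_threshold` is an assumed cut-off, not a value from the study.
    """
    if not patch_scores:
        raise ValueError("a slide needs at least one patch score")
    return {
        # Average score over all patches of the slide.
        "mean_score": mean(patch_scores),
        # Highest single-patch score on the slide.
        "max_score": max(patch_scores),
        # True if any patch reaches the assumed threshold.
        "has_high_patch": any(s >= high_threshold for s in patch_scores),
    }


# Hypothetical slide: mostly low-scoring patches with one high-scoring region.
summary = summarize_slide([0.1, 0.2, 0.9, 0.1])
```

Here the mean score stays low even though one patch scores high, so a summary based only on the average would hide the spatially confined high-CIN region.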


2019 ◽  
Vol 26 (1) ◽  
pp. 31-46 ◽  
Author(s):  
Eva Baxter ◽  
Karolina Windloch ◽  
Greg Kelly ◽  
Jason S Lee ◽  
Frank Gannon ◽  
...  

Up to 80% of endometrial and breast cancers express oestrogen receptor alpha (ERα). Unlike in breast cancer, anti-oestrogen therapy has had limited success in endometrial cancer, raising the possibility that oestrogen has different effects in the two cancers. We investigated the role of oestrogen in endometrial and breast cancers using data from The Cancer Genome Atlas (TCGA) in conjunction with cell line studies. Using phosphorylation of ERα (ERα-pSer118) as a marker of transcriptional activation of ERα in TCGA datasets, we found that genes associated with ERα-pSer118 were predominantly unique to each tumour type and had distinct regulators. We present data on the alternative and novel roles played by SMAD3, CREB-pSer133 and particularly XBP1 in oestrogen signalling in endometrial and breast cancer.


2021 ◽  
Author(s):  
Rada Tazhitdinova ◽  
Alexander V Timoshenko

Purpose: This study aimed to assess the functional associations between genes of the glycobiological landscape encoding galectins and O-GlcNAc cycle enzymes in the context of breast cancer biology and clinical applications. Methods: An in silico analysis of the breast cancer data from The Cancer Genome Atlas was conducted comparing expression, pairwise correlations, and prognostic value for 17 genes encoding galectins, O-GlcNAc cycle enzymes, and cell stemness-related transcription factors. Results: Multiple general and breast cancer subtype-specific differences in galectin/O-GlcNAc genetic landscape markers were observed and classified. Specifically, LGALS12 was found to be significantly downregulated in breast cancer tissues across all subtypes while LGALS2 and GFPT1 showed potential as prognostic markers. Remarkably, there was an overall loss of both correlation strength and correlation relationship between expression of galectin/O-GlcNAc landscape genes in the breast cancer samples versus normal tissues. Six gene pairs (GFPT1/LGALS1, GFPT1/LGALS3, GFPT1/LGALS12, GFPT1/KLF4, OGT/LGALS12, and OGT/KLF4) were found to be potential diagnostic markers for breast cancer. Conclusions: These findings indicate that the glycobiological landscape of breast cancer underwent significant remodeling, which might be associated with switching galectin gene regulation within a framework of O-GlcNAc homeostasis.
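
The comparison of pairwise correlations between tumour and normal tissue described above can be sketched as follows. The choice of Pearson correlation, the function names, the placeholder gene names, and the toy values are all assumptions for illustration; the abstract does not state which correlation measure was used, and no real expression data appear here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


def correlation_by_group(normal, tumour, gene_a, gene_b):
    """Return (r_normal, r_tumour) for one gene pair.

    `normal` and `tumour` map gene name -> list of expression values,
    one value per sample in that group.
    """
    return (
        pearson(normal[gene_a], normal[gene_b]),
        pearson(tumour[gene_a], tumour[gene_b]),
    )


# Toy values with placeholder gene names; not taken from any dataset.
normal = {"GENE_X": [1.0, 2.0, 3.0, 4.0], "GENE_Y": [2.0, 4.0, 6.0, 8.0]}
tumour = {"GENE_X": [1.0, 2.0, 3.0, 4.0], "GENE_Y": [3.0, 1.0, 4.0, 2.0]}
r_normal, r_tumour = correlation_by_group(normal, tumour, "GENE_X", "GENE_Y")
```

In this toy case the pair is perfectly correlated in the normal group and uncorrelated in the tumour group, the kind of loss of correlation strength the abstract reports at the level of the whole gene panel.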

