Deep Learning Features Encode Interpretable Morphologies within Histological Images

Author(s):  
Ali Foroughi pour ◽  
Brian White ◽  
Jonghanne Park ◽  
Todd Sheridan ◽  
Jeffrey Chuang

Abstract Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g. by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture), which we show can be construed as abstract morphological genes (“mones”) with strong independent associations to biological phenotypes. We observe that many mones are specific to individual cancer types, while others are found in multiple cancers especially from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC=97.1%±2.8% for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC=99.2%±0.12%). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics. In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer. 
Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values.
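
The abstract's claim that linear classifiers on CNN-derived features separate tumor from normal tissue can be illustrated with a minimal sketch. It assumes a plain logistic regression fitted by batch gradient descent; the abstract does not say how the authors fitted their linear models. The two-dimensional "mone" vectors, the labels, and the function names `train_linear_classifier` and `predict` are hypothetical and exist only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_linear_classifier(features, labels, lr=0.5, steps=2000):
    """Fit a logistic regression (a linear classifier) by batch gradient descent."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this sample under the current linear model.
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(x, weights, bias):
    """Return 1 (tumor) if the linear score is positive, else 0 (normal)."""
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

# Synthetic stand-ins for per-slide feature vectors (NOT data from the paper).
toy_features = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3),   # label 0: normal
                (0.9, 0.8), (1.0, 0.7), (0.8, 1.0)]   # label 1: tumor
toy_labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_linear_classifier(toy_features, toy_labels)
predictions = [predict(x, weights, bias) for x in toy_features]
```

Because the decision is a weighted sum of features, each learned weight can be read as the contribution of one feature to the prediction. That is the property the abstract relies on when it links linearity to interpretability.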

2021 ◽  
Author(s):  
Ali Foroughi Pour ◽  
Brian White ◽  
Jonghanne Park ◽  
Todd B. Sheridan ◽  
Jeffrey H. Chuang

Abstract Convolutional neural networks (CNNs) are revolutionizing digital pathology by enabling machine learning-based classification of a variety of phenotypes from hematoxylin and eosin (H&E) whole slide images (WSIs), but the interpretation of CNNs remains difficult. Most studies have considered interpretability in a post hoc fashion, e.g. by presenting example regions with strongly predicted class labels. However, such an approach does not explain the biological features that contribute to correct predictions. To address this problem, here we investigate the interpretability of H&E-derived CNN features (the feature weights in the final layer of a transfer-learning-based architecture), which we show can be construed as abstract morphological genes (“mones”) with strong independent associations to biological phenotypes. We observe that many mones are specific to individual cancer types, while others are found in multiple cancers especially from related tissue types. We also observe that mone-mone correlations are strong and robustly preserved across related cancers. Importantly, linear mone-based classifiers can very accurately separate 38 distinct classes (19 tumor types and their adjacent normals, AUC=97.1% ± 2.8% for each class prediction), and linear classifiers are also highly effective for universal tumor detection (AUC=99.2% ± 0.12%). This linearity provides evidence that individual mones or correlated mone clusters may be associated with interpretable histopathological features or other patient characteristics. In particular, the statistical similarity of mones to gene expression values allows integrative mone analysis via expression-based bioinformatics approaches. We observe strong correlations between individual mones and individual gene expression values, notably mones associated with collagen gene expression in ovarian cancer. 
Mone-expression comparisons also indicate that immunoglobulin expression can be identified using mones in colon adenocarcinoma and that immune activity can be identified across multiple cancer types, and we verify these findings by expert histopathological review. Our work demonstrates that mones provide a morphological H&E decomposition that can be effectively associated with diverse phenotypes, analogous to the interpretability of transcription via gene expression values.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1420-D1430
Author(s):  
Dongqing Sun ◽  
Jin Wang ◽  
Ya Han ◽  
Xin Dong ◽  
Jun Ge ◽  
...  

Abstract Cancer immunotherapy targeting co-inhibitory pathways by checkpoint blockade shows remarkable efficacy in a variety of cancer types. However, only a minority of patients respond to treatment due to the stochastic heterogeneity of tumor microenvironment (TME). Recent advances in single-cell RNA-seq technologies enabled comprehensive characterization of the immune system heterogeneity in tumors but posed computational challenges on integrating and utilizing the massive published datasets to inform immunotherapy. Here, we present Tumor Immune Single Cell Hub (TISCH, http://tisch.comp-genomics.org), a large-scale curated database that integrates single-cell transcriptomic profiles of nearly 2 million cells from 76 high-quality tumor datasets across 27 cancer types. All the data were uniformly processed with a standardized workflow, including quality control, batch effect removal, clustering, cell-type annotation, malignant cell classification, differential expression analysis and functional enrichment analysis. TISCH provides interactive gene expression visualization across multiple datasets at the single-cell level or cluster level, allowing systematic comparison between different cell-types, patients, tissue origins, treatment and response groups, and even different cancer-types. In summary, TISCH provides a user-friendly interface for systematically visualizing, searching and downloading gene expression atlas in the TME from multiple cancer types, enabling fast, flexible and comprehensive exploration of the TME.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Gaojianyong Wang ◽  
Dimitris Anastassiou

Abstract Analysis of large gene expression datasets from biopsies of cancer patients can identify co-expression signatures representing particular biomolecular events in cancer. Some of these signatures involve genomically co-localized genes resulting from the presence of copy number alterations (CNAs), for which analysis of the expression of the underlying genes provides valuable information about their combined role as oncogenes or tumor suppressor genes. Here we focus on the discovery and interpretation of such signatures that are present in multiple cancer types due to driver amplifications and deletions in particular regions of the genome after doing a comprehensive analysis combining both gene expression and CNA data from The Cancer Genome Atlas.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e14553-e14553
Author(s):  
Gordon Vansant ◽  
Adam Jendrisak ◽  
Ramsay Sutton ◽  
Sarah Orr ◽  
David Lu ◽  
...  

e14553 Background: Different cancer subtypes can often be effectively treated with similar Rx classes (e.g. platinum or taxane Rx). Yet within a disease, patient benefit from therapy can be variable. The origins of precision medicine derive from pathologic sub-stratification to guide therapy (e.g. SCLC vs. NSCLC). Using the Epic Sciences platform, we performed FPC analysis of ~100,000 single CTCs from multiple indications and sought to utilize high resolution digital pathology and machine learning to index metastatic cancers for the purpose of improving our understanding of therapy response and precision medicine. Methods: 92,300 CTCs that underwent FCP analysis (single cell digital pathology features of cellular and sub-cellular morphometrics) were collected from prostate (1,641 pts, 70,747 CTCs), breast (268 pts, 8,718 CTCs), NSCLC (110 pts, 1,884 CTCs), SCLC (141 pts, 8,872 CTCs) and bladder (65 pts, 2,079 CTCs) cancer pts. After pre-processing the raw data, a training set was balanced by sampling the same number of CTCs from each indication. K-means clustering was applied to the training set, and the optimized number of clusters was determined using the elbow approach. After generating the clusters on the training set, the cluster centers were extracted from k-means and used to train a k-Nearest Neighbor (k-NN) classifier to predict the cluster assignment for the remaining CTCs (test set). Results: The optimized number of clusters was 9. The percentage and characteristics of CTCs in each indication are listed below. BCa CTCs were more enriched in cluster c1, which had higher CK expression, while SCLC and some mCRPC CTCs shared the small cell features (c5). Conclusions: Heterogeneous CTC phenotypic subtypes were observed across multiple indications. Each indication harbored subtype heterogeneity and shared clusters with other disease subtypes. Analysis relating patient cluster subtypes to prognosis and therapy benefit is ongoing. Analysis linking CTC subtypes to genotypes (by single cell sequencing) and to patient survival across multiple indications is also ongoing. [Table: see text]
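
The Methods of the abstract above describe a pipeline that can be sketched roughly as follows: balance the training set across indications, run k-means, inspect within-cluster error over several values of k (the elbow approach), then assign the remaining cells to the nearest cluster center. This is a minimal sketch under stated assumptions. The two-dimensional feature vectors are synthetic, all function names are hypothetical, and farthest-point initialization is used only to make the example deterministic, since the abstract does not specify an initialization. Nearest-center assignment is taken here as equivalent to a k-NN classifier with k = 1 trained on the centers, which is also an assumption.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_center(point, centers):
    """Index of the closest center (1-NN over the cluster centers)."""
    return min(range(len(centers)), key=lambda i: squared_distance(point, centers[i]))

def init_centers(points, k):
    """Farthest-point initialization (an assumption; not stated in the abstract)."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        centers.append(tuple(farthest))
    return centers

def kmeans(points, k, iterations=50):
    """Lloyd's algorithm: alternate assignment and mean-update steps."""
    centers = init_centers(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def inertia(points, centers):
    """Within-cluster sum of squared distances, the quantity plotted for the elbow."""
    return sum(squared_distance(p, centers[nearest_center(p, centers)]) for p in points)

def balanced_sample(cells_by_indication, n_per_indication, seed=0):
    """Draw the same number of cells from every indication."""
    rng = random.Random(seed)
    sample = []
    for cells in cells_by_indication.values():
        sample.extend(rng.sample(cells, n_per_indication))
    return sample

# Synthetic morphology features (NOT data from the abstract).
toy = {
    "indication_a": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1)],
    "indication_b": [(5.1, 5.0), (4.9, 5.2), (5.2, 4.8), (0.1, 0.1)],
}
train = balanced_sample(toy, n_per_indication=3)
elbow_curve = {k: inertia(train, kmeans(train, k)) for k in (1, 2, 3)}
centers = kmeans(train, k=2)
labels = [nearest_center(p, centers) for p in [(0.0, 0.0), (5.0, 5.0)]]
```

In practice the elbow is read off a plot of `elbow_curve` as the value of k beyond which the error stops dropping sharply. The sketch leaves that choice to the reader rather than automating it.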


2012 ◽  
Vol 11 ◽  
pp. CIN.S9037 ◽  
Author(s):  
Bill Andreopoulos ◽  
Dimitris Anastassiou

Gene expression profiling has provided insights into different cancer types and revealed tissue-specific expression signatures. Alterations in microRNA expression contribute to the pathogenesis of many types of human diseases. Few studies have integrated all levels of gene expression, miRNA and methylation to uncover correlations between these data types. We performed an integrated profiling to discover instances of miRNAs associated with a gene expression and DNA methylation signature across multiple cancer types. Using data from The Cancer Genome Atlas (TCGA), we revealed a concordant gene expression and methylation signature associated with the microRNA hsa-miR-142 across the same samples. In all cancer types examined, we found a signature of co-expression of a gene set R and methylated sites M, which correlate positively (M+) or negatively (M–) with the expression of hsa-miR-142. The set R consistently contains many genes, such as TRAF3IP3, NCKAP1L, CD53, LAPTM5, PTPRC, EVI2B, DOCK2, LCP2, CYBB and FYB. The signature is preserved across glioblastoma, ovarian, breast, colon, kidney, lung, uterine and rectum cancer. There is 28% overlap of methylation sites in M between glioblastoma (GBM) and ovarian cancer. There is 60% overlap of genes in R between GBM and ovarian (P = 1.3e−11). Most of the genes in R are known to be expressed in lymphocytes and haematopoietic stem cells, while M reflects membrane proteins involved in cell-cell adhesion functions. We speculate that the hsa-miR-142 associated signature may signal haematopoietic-specific processes and an accumulation of methylation events triggering a progressive loss of cell-cell adhesion. We also observed that GBM samples belonging to the proneural subtype tend to have underexpressed hsa-miR-142 and R genes, hypomethylated M+ and hypermethylated M–, while the mesenchymal samples have the opposite profile.


2006 ◽  
Vol 22 (8) ◽  
pp. 950-958 ◽  
Author(s):  
J.-Y. Koo ◽  
I. Sohn ◽  
S. Kim ◽  
J. W. Lee

PLoS ONE ◽  
2010 ◽  
Vol 5 (10) ◽  
pp. e13696 ◽  
Author(s):  
Kun Xu ◽  
Juan Cui ◽  
Victor Olman ◽  
Qing Yang ◽  
David Puett ◽  
...  

2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Guoshu Bi ◽  
Jiaqi Liang ◽  
Yuansheng Zheng ◽  
Runmei Li ◽  
Mengnan Zhao ◽  
...  

Abstract Background Tumor invasiveness reflects many biological changes associated with tumorigenesis, progression, metastasis, and drug resistance. Therefore, we performed a systematic assessment of invasiveness-related molecular features across multiple human cancers. Materials and methods Multi-omics data, including gene expression, miRNA, DNA methylation, and somatic mutation, from approximately 10,000 patients across 30 cancer types from The Cancer Genome Atlas, Gene Expression Omnibus, PRECOG, and our institution were included in this study. Results Based on a robust gene signature, we established an invasiveness score and found that the score was significantly associated with worse prognosis in almost all cancers. Then, we identified common invasiveness-associated dysregulated molecular features between high- and low-invasiveness score groups across multiple cancers, and investigated their mutual interfering relationships, thereby determining whether the dysregulation of invasiveness-related genes was caused by abnormal promoter methylation or miRNA expression. We also analyzed the correlations between the drug sensitivity data from cancer cell lines and the expression level of 685 invasiveness-related genes differentially expressed in at least ten cancer types. An integrated analysis of the correlations among invasiveness-related genetic features and drug response was conducted in esophageal carcinoma patients to outline the complicated regulatory mechanism of tumor invasiveness status in multiple dimensions. Moreover, functional enrichment suggests the invasiveness score might serve as a predictive biomarker for cancer patients receiving immunotherapy. Conclusion Our pan-cancer study provides a comprehensive atlas of tumor invasiveness and may guide more precise therapeutic strategies for tumor patients.

