Development of a Novel Deep Transfer Learning Framework to Characterize Inter- and Intra-Tumor Heterogeneity in Myeloma Patients

Background: Clonal heterogeneity is a known issue in multiple myeloma (MM), and the emergence of drug-resistant clones is responsible for the incurability of the disease. Multiple studies of bulk CD138+ bone marrow samples have attempted to stratify MM patients into smaller, more distinct risk groups based on molecular phenotypes. Recently, single cell RNA sequencing (scRNA-seq) has been applied in MM to identify cell clones. This raises a new question: can we classify patients with scRNA-seq data guided by previously defined subtypes, and how do the single cell results correspond with the patient-level classification?

Methods: We developed a novel deep transfer learning framework to predict MM subtypes in patients profiled by scRNA-seq, based on patient classifications from microarray data. While scRNA-seq batch correction has been intensively studied using transfer learning, there has been less work on similar comparisons between scRNA-seq and patient-level data. To address this issue, we used domain adaptation, a specific transfer learning approach, to combine scRNA-seq profiles and patient-level microarray data within a multitask learning framework. Figure 1 illustrates our computational framework. Its aim is to classify both cells and patients (with scRNA-seq data) according to patient-level classifications derived from previous gene expression profiling studies of MM. Specifically, we adopted the 10-subtype classification derived from microarray data [1]. Each patient with scRNA-seq data was summarized as a single vector by averaging gene counts across all of that patient's cells. Gene expression profiling data (both scRNA-seq and microarray) for MM patients from multiple studies were input into a transfer learning network consisting of 5 hidden layers. The maximum mean discrepancy (MMD) between the scRNA-seq patients and the microarray patients was calculated at the last hidden layer to integrate the datasets.
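Two of the steps described above, averaging a patient's cells into one vector and measuring the MMD between two groups of patients, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state which kernel was used, so the Gaussian (RBF) kernel, the bandwidth `sigma`, the biased MMD estimator, and the function names `pseudobulk`, `rbf_kernel`, and `mmd2` are all assumptions made for illustration. The random arrays stand in for last-hidden-layer representations.

```python
import numpy as np


def pseudobulk(counts):
    """Average a (cells x genes) count matrix into one patient-level vector."""
    counts = np.asarray(counts, dtype=float)
    return counts.mean(axis=0)


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between rows of a (n x d) and b (m x d).

    The kernel choice and bandwidth are assumptions, not taken from the abstract.
    """
    sq_dist = (
        (a ** 2).sum(axis=1)[:, None]
        + (b ** 2).sum(axis=1)[None, :]
        - 2.0 * (a @ b.T)
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k_xx = rbf_kernel(x, x, sigma).mean()
    k_yy = rbf_kernel(y, y, sigma).mean()
    k_xy = rbf_kernel(x, y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 200 cells x 30 genes for one scRNA-seq patient.
    cells = rng.poisson(lam=2.0, size=(200, 30))
    print("patient vector shape:", pseudobulk(cells).shape)
    # Hypothetical hidden-layer embeddings for two groups of patients.
    emb_scrna = rng.normal(size=(10, 8))
    emb_array = rng.normal(loc=1.0, size=(60, 8))
    print("squared MMD:", mmd2(emb_scrna, emb_array, sigma=4.0))
```

In a domain adaptation setting, a term like `mmd2` would typically be added to the classification loss so that the two platforms are pushed toward a shared representation; how the terms were weighted in this study is not given in the abstract.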
The datasets in this study are summarized in Table 1. Two microarray datasets (GSE19784, GSE2658) and one scRNA-seq dataset (GSE117156) were obtained from the NCBI Gene Expression Omnibus. IUSM data were generated locally. One microarray dataset and one scRNA-seq dataset were used for training and testing. GSE19784 was split into 80% training and 20% testing. GSE117156, because of its smaller sample size (11 patients), was split into 90% training and 10% testing. We ran 20 rounds of random cross validation using TensorFlow on a GTX1080 GPU. After each round of cross validation, the expression profiles of patients and single cells from all datasets (GSE19784, GSE117156, GSE2658, IUSM) were input into the trained model to produce low-dimensional representations and predictions for each training, testing, and validation sample.

Results: Our model was able to identify signals in the expression profiles of both patient-level and single cell data. The patient classification labels were consistently reproduced in a held-out test set of patients, as well as in a validation cohort consisting of microarray data from 559 MM patients (GSE2658) and scRNA-seq data from 4 MM patients from IUSM (Figure 2). These results show that the model can learn the subtypes across multiple datasets and platforms. After training, the 4 IUSM patients tended to cluster with their individual CD138+ cells, while GSE2658 patients still maintained some separation between MM subtype clusters (Figure 3). The single cells from our cohort of 4 patients did not necessarily classify to the same subtype as their patient.

Conclusions: A domain adaptive classifier can be trained across scRNA-seq and bulk gene expression profiling data from MM patients to integrate data and transfer knowledge. These models showed that single cells within a patient do not necessarily match the patient-level molecular characteristics. Not surprisingly, similar results have been found in other cancer types [2].
As our novel framework is further refined and more patients are sequenced, we expect more unique insights into both inter- and intra-tumor MM heterogeneity.

References:
1. Broyl A, Hose D, Lokhorst H, et al. Gene expression profiling for molecular classification of multiple myeloma in newly diagnosed patients. Blood. 2010;116(14):2543-2553.
2. Patel AP, Tirosh I, Trombetta JJ, et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science. 2014;344(6190):1396-1401.

Disclosures: Abonour: Celgene: Consultancy, Research Funding; BMS: Consultancy; Takeda: Consultancy, Research Funding; Janssen: Consultancy, Research Funding. Roodman: Amgen: Membership on an entity's Board of Directors or advisory committees.

Gene Expression Profiling of Mycosis Fungoides in Early and Tumor Stage—A Proof-of-Concept Study Using Laser Capture/Single Cell Microdissection and NanoString Analysis

Cells ◽ 10.3390/cells10113190 ◽ 2021 ◽ Vol 10 (11) ◽ pp. 3190

Author(s): Justine Lai ◽ Jing Li ◽ Robert Gniadecki ◽ Raymond Lai

Keyword(s): Gene Expression ◽ Single Cell ◽ Gene Expression Profiling ◽ Mycosis Fungoides ◽ Expression Profiling ◽ Single Cells ◽ Tumor Stage ◽ Proof Of Concept ◽ Opposite Pattern ◽ Laser Capture

A subset of patients with mycosis fungoides (MF) progress to the tumor stage, which correlates with a worse clinical outcome. The molecular events driving this progression are not well understood. To identify the key molecular drivers, we performed gene expression profiling (GEP) using NanoString. Ten formalin-fixed, paraffin-embedded skin biopsies from six patients (six non-tumor and four tumor MF) were included; both non-tumor and tumor samples were available for three patients. Laser capture/single cell microdissection of epidermotropic MF cells was used for non-tumor cases. We found that the RNA extracted from 700–800 single cells was consistently sufficient for GEP, provided that multiplexed target enrichment amplification was used. Unsupervised hierarchical analysis revealed clustering of non-tumor and tumor cases. Many of the most upregulated or downregulated genes are implicated in the PI3K, RAS, cell cycle/apoptosis, and MAPK pathways. Two of the targets, HMGA1 and PTPN11 (which encodes SHP2), were validated using immunohistochemistry. HMGA1 was positive in six out of six non-tumor MF samples and negative in five out of five tumor MF samples. The opposite pattern was seen with SHP2. Our study provides proof of concept that single-cell microdissection/GEP can be applied to archival tissues. Some of the gene targets we identified might be key drivers of MF disease progression.

Diagnostic Evidence GAuge of Single cells (DEGAS): A transfer learning framework to infer impressions of cellular and patient phenotypes between patients and single cells

10.1101/2020.06.16.142984 ◽ 2020

Author(s): Travis S. Johnson ◽ Christina Y. Yu ◽ Zhi Huang ◽ Siwen Xu ◽ Tongxin Wang ◽ ...

Keyword(s): Deep Learning ◽ Single Cell ◽ Transfer Learning ◽ Single Cells ◽ Molecular Data ◽ Neuron Loss ◽ Learning Framework ◽ Patient Level ◽ Wide Range ◽ Cell Data

Abstract: With the rapid advance of single cell sequencing techniques, single cell molecular data are accumulating quickly. However, a sound approach for properly integrating single cell data with the large amount of existing patient-level disease data is lacking. To address this need, we propose DEGAS (Diagnostic Evidence GAuge of Single cells), a novel deep transfer learning framework that allows cellular and clinical information, including cell types, disease risk, and patient subtypes, to be cross-mapped between single cell and patient data, provided they share at least one common type of molecular data. We call such transferable information “impressions”, which are generated by the deep learning models learned in the DEGAS framework. Using eight datasets from a wide range of diseases, including Glioblastoma Multiforme (GBM), Alzheimer’s Disease (AD), and Multiple Myeloma (MM), we demonstrate the feasibility and broad applications of DEGAS in cross-mapping clinical and cellular information across disparate single cell and patient-level transcriptomic datasets. Specifically, we correctly mapped clinically known GBM patient subtypes onto single cell data. We also identified previously known neuron loss in AD brains, then mapped the “impression” of AD risk onto single cell data. Furthermore, we discovered novel differences in excitatory and inhibitory neuron loss in AD data. From the exploratory MM data, we identified differences in the malignancy of different CD138+ cellular subtypes based on “impressions” of relapse information transferred from MM patients. Through this work, we demonstrate that DEGAS is a powerful framework for cross-inferring cellular and patient-level characteristics. It not only unites single cell and patient-level transcriptomic data by identifying their latent links with a deep learning approach, but can also prioritize both patient subtypes and cellular subtypes for precision medicine.
