Direct Detection of Expressed Mutations in AML Cells Using Single Cell RNA-Sequencing, and Its Impact on Defining Sources of Expression Heterogeneity

Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 1314-1314
Author(s):  
Allegra A Petti ◽  
Stephen R Williams ◽  
Christopher A Miller ◽  
Ian T Fiddes ◽  
David Chen ◽  
...  

Abstract Background. Acute Myeloid Leukemia (AML) is genetically and epigenetically heterogeneous. Most AML samples display clonal heterogeneity at presentation, which evolves with therapeutic interventions. To better understand the epigenetic consequences of clonal heterogeneity, we are using single-cell RNA-sequencing (scRNA-seq) to characterize expression heterogeneity in AML. To date, scRNA-seq has had limited utility in applications where it is essential to link transcriptional heterogeneity to genetic variation, because it has been difficult to identify specific mutations in individual cells using scRNA-seq data alone. To address this limitation, we developed an approach to use scRNA-seq data to identify expressed mutations in individual AML cells, and link these variants to the expression heterogeneity in the same samples. Methods. We generated duplicate cDNA libraries for each of 5 cryopreserved bone marrow samples from adult patients with de novo AML, using the 10x Genomics Chromium Single Cell 5' Gene Expression workflow for Single Cell RNA Sequencing. Single cell libraries were sequenced to yield a median of 20,474 cells per sample, and 192,427 reads per cell. Transcript alignment, counting, and inter-library normalization were performed using the Cell Ranger pipeline (10x Genomics). The Seurat R package was used for further normalization, filtering, principal component analysis, clustering, and t-SNE visualization. A nearest-neighbor algorithm was developed to assign each cell in the data set to the most transcriptionally similar hematopoietic lineage. For each case, we performed whole genome sequencing (WGS) to identify germline and somatic variants, and define clonal architecture. We then developed bioinformatic methods to determine which cells harbor these mutations, assign those cells to mutationally-defined subclones, and link mutations to defined expression clusters. Results. 
WGS identified 25-56 coding mutations per sample; we were able to identify 22%-46% of these mutations in at least one cell in the scRNA-seq data, including point mutations (e.g. DNMT3A, U2AF1, TP53, IDH1, IDH2, SRSF2, CEBPA, and others) and indels (e.g. FLT3-ITD, NPMc). Although the libraries were 5' biased, expressed mutations could be identified at long distances from the 5' end of transcripts; for example, an expressed DNMT3A R882H mutation (2.646 kb from the initiating codon) was easily detected (Fig 1c). The frequency of detected mutations in the single-cell data varied widely (range: 1-1564 cells; median: 11 cells), and as expected, depended heavily on the expression level of the gene, and the size of the clone containing the mutation. Regardless, a median of 1378 cells (6.7%) had at least one identifiable mutation in the 5 samples. Using these data, we were able to 1) distinguish AML cells from normal cells in bone marrow samples (Fig 1a/b), 2) identify major subclones within the AML samples (Fig 1c/d), and 3) identify mutation-specific and subclone-specific expression profiles. In 2 samples with mutationally-defined subclones (one with a CEBPA R142fs mutation, and the other with a GATA2 R361C mutation), subclone-specific gene expression profiles were clearly detected in the scRNA-seq data, and could be directly associated with cells containing the mutant transcription factors. In the case with the subclonal GATA2 R361C mutation, cells with that mutation were restricted to a subset of expression clusters (Fig 1d). In this subset, we identified an expression signature that is supported by pre-existing knowledge of the GATA2/SPI1 transcriptional regulatory circuit. In addition, we observed that expression heterogeneity frequently occurs independent of mutations defined by specific subclones. For instance, the GATA2 R361C subclone contained additional heterogeneity (5 independent expression clusters) that could not be accounted for by mutations (Fig 1a/d).
Moreover, the other 3 cases exhibited extensive expression heterogeneity within the AML cells that was not explained by genetically defined subclones. In sum, scRNA-seq data, when adapted to detect mutations, has dramatically improved our understanding of the expression heterogeneity of AML, which arises from two main sources: 1) cell-type composition of the sample, and 2) expression variation among the AML cells themselves (caused by both mutation-associated and mutation-independent factors). Disclosures Williams: 10x Genomics: Employment, Equity Ownership. Fiddes: 10x Genomics: Employment, Equity Ownership. Church: 10x Genomics: Employment, Equity Ownership.
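The Methods mention a nearest-neighbor algorithm that assigns each cell to the most transcriptionally similar hematopoietic lineage, without giving its details. The following is a minimal sketch of one way such an assignment could work, assuming Pearson correlation as the similarity measure and a single averaged reference profile per lineage. The function names, lineage labels, and expression values are all hypothetical and are not taken from the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant vector has no defined correlation; treat it as "no similarity".
    return cov / (sd_x * sd_y) if sd_x > 0 and sd_y > 0 else 0.0


def assign_lineage(cell, references):
    """Return the name of the reference profile that correlates best with the cell."""
    return max(references, key=lambda name: pearson(cell, references[name]))


# Hypothetical reference profiles over four marker genes (illustrative values only).
references = {
    "monocyte": [9.0, 1.0, 0.5, 0.2],
    "T_cell": [0.3, 8.0, 0.4, 0.1],
    "erythroid": [0.2, 0.3, 7.5, 6.0],
}
cell = [7.2, 0.8, 1.1, 0.0]
lineage = assign_lineage(cell, references)
print(lineage)
```

A real pipeline would likely compare cells in a reduced space (for example principal components) and against many reference cells rather than one averaged profile; this sketch only shows the assignment step.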

Science ◽  
2020 ◽  
Vol 371 (6531) ◽  
pp. eaba5257 ◽  
Author(s):  
Anna Kuchina ◽  
Leandra M. Brettner ◽  
Luana Paleologu ◽  
Charles M. Roco ◽  
Alexander B. Rosenberg ◽  
...  

Single-cell RNA sequencing (scRNA-seq) has become an essential tool for characterizing gene expression in eukaryotes, but current methods are incompatible with bacteria. Here, we introduce microSPLiT (microbial split-pool ligation transcriptomics), a high-throughput scRNA-seq method for Gram-negative and Gram-positive bacteria that can resolve heterogeneous transcriptional states. We applied microSPLiT to >25,000 Bacillus subtilis cells sampled at different growth stages, creating an atlas of changes in metabolism and lifestyle. We retrieved detailed gene expression profiles associated with known, but rare, states such as competence and prophage induction and also identified unexpected gene expression states, including the heterogeneous activation of a niche metabolic pathway in a subpopulation of cells. MicroSPLiT paves the way to high-throughput analysis of gene expression in bacterial communities that are otherwise not amenable to single-cell analysis, such as natural microbiota.
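Split-pool methods label cells by passing them through several rounds of barcoding, so the number of possible barcode combinations grows exponentially with the number of rounds. The sketch below estimates how often two cells would receive the same combination by chance. The plate size (96 wells) and the number of rounds (3) are assumptions chosen for illustration and are not stated in the abstract; only the cell count (about 25,000) comes from the text.

```python
def barcode_combinations(wells_per_round, rounds):
    """Number of distinct barcode combinations after the given number of rounds."""
    return wells_per_round ** rounds


def expected_collision_fraction(n_cells, n_barcodes):
    """Expected fraction of cells sharing a barcode with at least one other cell,
    assuming every cell draws its combination uniformly and independently."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Assumed layout: 96 wells per round, 3 rounds (hypothetical, for illustration).
combos = barcode_combinations(96, 3)
fraction = expected_collision_fraction(25_000, combos)
print(combos, round(fraction, 4))
```

Adding a round, or using more wells per round, lowers the collision fraction quickly, which is why combinatorial barcoding scales to large cell numbers without isolating individual cells.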


2020 ◽  
Vol 36 (13) ◽  
pp. 4021-4029
Author(s):  
Hyundoo Jeong ◽  
Zhandong Liu

Abstract Summary Single-cell RNA sequencing technology provides a novel means to analyze the transcriptomic profiles of individual cells. The technique is vulnerable, however, to a type of noise called dropout effects, which lead to zero-inflated distributions in the transcriptome profile and reduce the reliability of the results. Single-cell RNA sequencing data, therefore, need to be carefully processed before in-depth analysis. Here, we describe a novel imputation method that reduces dropout effects in single-cell sequencing. We construct a cell correspondence network and adjust gene expression estimates based on transcriptome profiles for the local subnetwork of cells of the same type. We comprehensively evaluated this method, called PRIME (PRobabilistic IMputation to reduce dropout effects in Expression profiles of single-cell sequencing), on synthetic and eight real single-cell sequencing datasets and verified that it improves the quality of visualization and accuracy of clustering analysis and can discover gene expression patterns hidden by noise. Availability and implementation The source code for the proposed method is freely available at https://github.com/hyundoo/PRIME. Supplementary information Supplementary data are available at Bioinformatics online.
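To make the idea of adjusting expression estimates from a local neighborhood of similar cells concrete, here is a deliberately simplified sketch. It is not the PRIME algorithm: it assumes plain Euclidean distance, a fixed number of neighbors `k`, and it simply replaces zeros with the neighbor mean. All names and values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def impute_zeros(cells, k=2):
    """Replace each zero entry with the mean of that gene in the k nearest cells."""
    imputed = []
    for i, cell in enumerate(cells):
        # Rank all other cells by distance and keep the k closest.
        neighbors = sorted(
            (j for j in range(len(cells)) if j != i),
            key=lambda j: euclidean(cell, cells[j]),
        )[:k]
        new_cell = list(cell)
        for g, value in enumerate(cell):
            if value == 0 and neighbors:
                new_cell[g] = sum(cells[j][g] for j in neighbors) / len(neighbors)
        imputed.append(new_cell)
    return imputed


cells = [
    [5.0, 0.0, 1.0],  # second gene looks like a dropout
    [5.2, 3.0, 1.1],
    [4.8, 3.4, 0.9],
    [0.5, 0.0, 9.0],  # a different cell type where the zero may be real
]
result = impute_zeros(cells, k=2)
print(result[0])
```

Note that this naive version also fills the zero in the last cell from dissimilar neighbors, which illustrates why restricting the neighborhood to cells of the same type, as the abstract describes, matters.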


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 2317-2317
Author(s):  
Naoki Watanabe ◽  
Shouguo Gao ◽  
Sachiko Kajigaya ◽  
Carrie Diamond ◽  
Lemlem Alemu ◽  
...  

Deficiency of adenosine deaminase 2 (DADA2) is a rare autosomal recessive disease caused by loss-of-function mutations in the ADA2 gene. DADA2 typically presents in childhood and is characterized by vasculopathy, stroke, inflammation, and immunodeficiency as well as hematologic manifestations, such as bone marrow failure and lymphoproliferation. The ADA2 protein is predominantly expressed in stimulated monocytes, dendritic cells and macrophages. ADA2 increases in the setting of inflammation and/or infection. ADA2 has been reported to have a critical role in maintaining the balance between M1 (pro-inflammatory) and M2 (anti-inflammatory) macrophages. Macrophages of DADA2 patients are polarized towards the M1 subset. DADA2 pathogenesis is not well characterized. To elucidate molecular mechanisms in DADA2, we analyzed the gene expression profile of CD14+ monocytes derived from peripheral blood using single cell RNA sequencing (scRNA-seq). Blood was collected from DADA2 patients and age- and sex-matched healthy donors; all patients were studied in a registered research protocol (clinicaltrials.gov NCT00071045). Samples were obtained from 14 DADA2 patients and 6 healthy donors; median age of the DADA2 patients was 23 years (range, 5 - 57 years). Among the 14 patients, 7 had hematological phenotypes: 5 lymphopenia, 3 neutropenia, 3 thrombocytopenia, and 2 with hypocellular bone marrow histology. Low serum immunoglobulins and cutaneous findings were frequent. Nine of the 14 patients had been treated with TNF inhibitors (etanercept and adalimumab). Mutations were distributed throughout the ADA2 gene; although two siblings had the same mutation, even they showed poor genotype-phenotype correlation. Monocytes were isolated by immunomagnetic positive selection with the EasySep™ positive CD14 selection kit II, then subjected to scRNA-seq using Single Cell 3' Reagent Kits v2 (10X Genomics). Libraries for scRNA-seq were sequenced on the HiSeq-3000 instrument.
Based on scRNA-seq data, we could classify monocytes into three populations by conventional flow cytometric criteria using cell surface protein expression imputed from scRNA-seq: CD14++CD16- classical, CD14++CD16+ intermediate, and CD14+CD16++ nonclassical monocytes (Figure A). CD16 expression was higher in DADA2 patients than in healthy donors (Figure B). The proportion of nonclassical monocytes among total monocytes was significantly higher in DADA2 patients than in healthy donors (Figure C). Comparing gene expression in each monocyte subtype between DADA2 patients and healthy donors, we found 215, 237, and 267 upregulated genes in classical, intermediate, and nonclassical monocytes, respectively (at a threshold of avg_logFC > 0.2). Approximately 35% of upregulated genes were shared among the three monocyte subtypes of DADA2 patients, including immune response genes such as IFITM1, IFITM2, IFITM3, and C3AR1 (Figure D). Pathways common to the subtypes were associated with immune function, such as interferon alpha/beta signaling and interferon gamma signaling. Genes specific to classical and intermediate monocytes accounted for less than 10% of all upregulated genes. Distinctively, the NF-κB pathway was upregulated in nonclassical monocytes, which might contribute to the pathogenesis of DADA2 as an inflammatory disease. Overall, each monocyte subtype of DADA2 patients showed upregulation of immune response gene sets compared to controls. DADA2 patients have increased numbers of nonclassical monocytes, which may contribute to the immune dysregulation and increased inflammation observed in the disease. Disclosures No relevant conflicts of interest to declare.
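The upregulated-gene counts above depend on the avg_logFC > 0.2 cutoff. As a rough illustration of how such a filter behaves, the sketch below computes a log fold-change of group means and keeps genes above the threshold. The exact definition of avg_logFC used in the study is not given in the abstract, so the formula here (natural log of mean plus a pseudocount of 1) is an assumption, and the gene names and values are placeholders.

```python
import math


def avg_log_fc(group_a, group_b, pseudocount=1.0):
    """Natural-log fold-change of mean expression, group_a relative to group_b.
    The pseudocount keeps the logarithm defined when a mean is zero."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log(mean_a + pseudocount) - math.log(mean_b + pseudocount)


def upregulated(expr_a, expr_b, threshold=0.2):
    """Genes whose log fold-change (a versus b) exceeds the threshold."""
    return sorted(
        gene for gene in expr_a
        if avg_log_fc(expr_a[gene], expr_b[gene]) > threshold
    )


# Placeholder per-cell expression values for three hypothetical genes.
patients = {"gene_1": [4.0, 5.0, 6.0], "gene_2": [3.0, 3.0, 3.0], "gene_3": [1.0, 1.0, 1.0]}
controls = {"gene_1": [2.0, 2.0, 2.0], "gene_2": [3.0, 3.0, 3.0], "gene_3": [2.0, 2.0, 2.0]}
hits = upregulated(patients, controls)
print(hits)
```

A fold-change cutoff alone says nothing about statistical significance; in practice it is combined with a test and multiple-testing correction.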


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 1882-1882 ◽  
Author(s):  
Samuel A Danziger ◽  
Mark McConnell ◽  
Jake Gockley ◽  
Mary Young ◽  
Adam Rosenthal ◽  
...  

Abstract Introduction The multiple myeloma (MM) tumor microenvironment (TME) strongly influences patient outcomes as evidenced by the success of immunomodulatory therapies. To develop precision immunotherapeutic approaches, it is essential to identify and enumerate TME cell types and understand their dynamics. Methods We estimated the population of immune and other non-tumor cell types during the course of MM treatment at a single institution using gene expression of paired CD138-selected bone marrow aspirates and whole bone marrow (WBM) core biopsies from 867 samples of 436 newly diagnosed MM patients collected at 5 time points: pre-treatment (N=354), post-induction (N=245), post-transplant (N=83), post-consolidation (N=51), and post-maintenance (N=134). Expression profiles from the aspirates were used to infer the transcriptome contribution of immune and stromal cells in the WBM array data. Unsupervised clustering of these non-tumor gene expression profiles across all time points was performed using the R package ConsensusClusterPlus with Bayesian Information Criterion (BIC) to select the number of clusters. Individual cell types in these TMEs were estimated using the DCQ algorithm and a gene expression signature matrix based on the published LM22 leukocyte matrix (Newman et al., 2015) augmented with 5 bone marrow- and myeloma-specific cell types. Results Our deconvolution approach accurately estimated percent tumor cells in the paired samples compared to estimates from microscopy and flow cytometry (PCC = 0.63, RMSE = 9.99%). TME clusters built on gene expression data from all 867 samples resulted in 5 unsupervised clusters covering 91% of samples. While the fraction of patients in each cluster changed during treatment, no new TME clusters emerged as treatment progressed. These clusters were associated with progression free survival (PFS) (p-Val = 0.020) and overall survival (OS) (p-Val = 0.067) when measured in pre-transplant samples. 
The most striking outcomes were represented by Cluster 5 (N = 106) characterized by a low innate to adaptive cell ratio and shortened patient survival (Figure 1, 2). This cluster had worse outcomes than others (estimated mean PFS = 58 months compared to 71+ months for other clusters, p-Val = 0.002; estimated mean OS = 105 months compared with 113+ months for other clusters, p-Val = 0.040). Compared to other immune clusters, the adaptive-skewed TME of Cluster 5 is characterized by low granulocyte populations and high antigen-presenting, CD8 T, and B cell populations. As might be expected, this cluster was also significantly enriched for ISS3 and GEP70 high risk patients, as well as Del1p, Del1q, t12;14, and t14;16. Importantly, this TME persisted even when the induction therapy significantly reduced the tumor load (Table 1). At post-induction, outcomes for the 69 / 245 patients in Cluster 5 remain significantly worse (estimated mean PFS = 56 months compared to 71+ months for other clusters, p-Val = 0.004; estimated mean OS = 100 months compared to 121+ months for other clusters, p-Val = 0.002). The analysis of on-treatment samples showed that the number of patients in Cluster 5 decreases from 30% before treatment to 12% after transplant, and of the 63 patients for whom we have both pre-treatment and post-transplant samples, 18/20 of the Cluster 5 patients moved into other immune clusters; 13 into Cluster 4. The non-5 clusters (with better PFS and OS overall) had higher amounts of granulocytes and lower amounts of CD8 T cells. Some clusters (1 and 4) had increased natural killer (NK) cells and decreased dendritic cells, while other clusters (2 and 3) had increased adipocytes and increases in M2 macrophages (Cluster 2) or NK cells (Cluster 3). Taken together, the gain of granulocytes and adipocytes was associated with improved outcome, while increases in the adaptive immune compartment were associated with poorer outcome.
Conclusions We identified distinct clusters of patient TMEs from bulk transcriptome profiles by computationally estimating the CD138- fraction of TMEs. Our findings identified differential immune and stromal compositions in patient clusters with opposing clinical outcomes and tracked membership in those clusters during treatment. Adding this layer of TME to the analysis of myeloma patient baseline and on-treatment samples enables us to formulate biological hypotheses and may eventually guide therapeutic interventions to improve outcomes for patients. Disclosures Danziger: Celgene Corporation: Employment, Equity Ownership. McConnell: Celgene Corporation: Employment. Gockley: Celgene Corporation: Employment. Young: Celgene Corporation: Employment, Equity Ownership. Schmitz: Celgene Corporation: Employment, Equity Ownership. Reiss: Celgene Corporation: Employment, Equity Ownership. Davies: MMRF: Honoraria; Celgene: Consultancy, Honoraria, Membership on an entity's Board of Directors or advisory committees; Amgen: Consultancy, Membership on an entity's Board of Directors or advisory committees; TRM Oncology: Honoraria; Abbvie: Consultancy; ASH: Honoraria; Takeda: Consultancy, Membership on an entity's Board of Directors or advisory committees; Janssen: Consultancy, Honoraria. Copeland: Celgene Corporation: Employment, Equity Ownership. Fox: Celgene Corporation: Employment, Equity Ownership. Fitch: Celgene Corporation: Employment, Equity Ownership. Newhall: Celgene Corporation: Employment, Equity Ownership.
Barlogie: Celgene: Consultancy, Research Funding; Dana Farber Cancer Institute: Other: travel stipend; Multiple Myeloma Research Foundation: Other: travel stipend; International Workshop on Waldenström's Macroglobulinemia: Other: travel stipend; Millenium: Consultancy, Research Funding; European School of Haematology- International Conference on Multiple Myeloma: Other: travel stipend; ComtecMed- World Congress on Controversies in Hematology: Other: travel stipend; Myeloma Health, LLC: Patents & Royalties: Co-inventor of patents and patent applications related to use of GEP in cancer medicine licensed to Myeloma Health, LLC. Trotter: Celgene Research SL (Spain), part of Celgene Corporation: Employment, Equity Ownership. Hershberg: Celgene Corporation: Employment, Equity Ownership, Patents & Royalties. Dervan: Celgene Corporation: Employment, Equity Ownership. Ratushny: Celgene Corporation: Employment, Equity Ownership. Morgan: Takeda: Consultancy, Honoraria; Bristol-Myers Squibb: Consultancy, Honoraria; Celgene: Consultancy, Honoraria, Research Funding; Janssen: Research Funding.
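The Methods describe using expression profiles from the CD138-selected aspirates to infer the non-tumor contribution in the paired whole bone marrow data. The sketch below shows the simplest version of that idea, a two-component linear mixture solved by least squares. It is not the authors' procedure and not the DCQ algorithm; the function names and all profile values are hypothetical.

```python
def tumor_fraction(bulk, tumor, non_tumor):
    """Least-squares f in bulk ~= f * tumor + (1 - f) * non_tumor, clipped to [0, 1]."""
    diff = [t - n for t, n in zip(tumor, non_tumor)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("tumor and non-tumor profiles are identical")
    f = sum((b - n) * d for b, n, d in zip(bulk, non_tumor, diff)) / denom
    return min(1.0, max(0.0, f))


def non_tumor_signal(bulk, tumor, f):
    """Expression left in the bulk sample after removing the tumor contribution."""
    return [b - f * t for b, t in zip(bulk, tumor)]


# Hypothetical three-gene profiles; the bulk sample is a 25% / 75% mixture.
tumor = [10.0, 0.0, 2.0]
non_tumor = [0.0, 8.0, 4.0]
bulk = [2.5, 6.0, 3.5]

f = tumor_fraction(bulk, tumor, non_tumor)
residual = non_tumor_signal(bulk, tumor, f)
print(f, residual)
```

In the abstract's setting the non-tumor profile is the unknown, so a real analysis would estimate the tumor fraction from other information and then subtract; the sketch assumes both reference profiles are available in order to keep the arithmetic visible.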


2020 ◽  
Author(s):  
Weimiao Wu ◽  
Qile Dai ◽  
Yunqing Liu ◽  
Xiting Yan ◽  
Zuoheng Wang

Abstract Single-cell RNA sequencing provides an opportunity to study gene expression at single-cell resolution. However, prevalent dropout events result in high data sparsity and noise that may obscure downstream analyses. We propose a novel method, G2S3, that imputes dropouts by borrowing information from adjacent genes in a sparse gene graph learned from gene expression profiles across cells. We applied G2S3 and other existing methods to seven single-cell datasets to compare their performance. Our results demonstrated that G2S3 is superior in recovering true expression levels, identifying cell subtypes, improving differential expression analyses, and recovering gene regulatory relationships, especially for mildly expressed genes.
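Borrowing information from adjacent genes in a gene graph can be pictured as one smoothing step over a weighted adjacency matrix. The sketch below assumes the graph is already given as a row-normalized weight matrix with self-loops; learning the sparse graph is the core of G2S3 and is not attempted here. The matrix and expression values are hypothetical.

```python
def smooth_over_gene_graph(weights, expression):
    """One smoothing step: each gene's profile becomes a weighted average of the
    profiles of its neighbors in the gene graph (rows of `weights` sum to 1).
    `expression` is a genes x cells matrix given as a list of rows."""
    n_genes = len(expression)
    n_cells = len(expression[0])
    return [
        [
            sum(weights[g][h] * expression[h][c] for h in range(n_genes))
            for c in range(n_cells)
        ]
        for g in range(n_genes)
    ]


# Hypothetical graph: genes 0 and 1 are neighbors, gene 2 is isolated.
weights = [
    [0.5, 0.5, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
# Gene 0 has a zero in the second cell while its neighbor gene 1 is expressed there.
expression = [
    [4.0, 0.0],
    [2.0, 2.0],
    [0.0, 6.0],
]
smoothed = smooth_over_gene_graph(weights, expression)
print(smoothed)
```

The zero for gene 0 in the second cell is replaced by a value borrowed from its graph neighbor, while the isolated gene is left unchanged.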


Author(s):  
Meichen Dong ◽  
Aatish Thennavan ◽  
Eugene Urrutia ◽  
Yun Li ◽  
Charles M Perou ◽  
...  

Abstract Recent advances in single-cell RNA sequencing (scRNA-seq) enable characterization of transcriptomic profiles with single-cell resolution and circumvent averaging artifacts associated with traditional bulk RNA sequencing (RNA-seq) data. Here, we propose SCDC, a deconvolution method for bulk RNA-seq that leverages cell-type specific gene expression profiles from multiple scRNA-seq reference datasets. SCDC adopts an ENSEMBLE method to integrate deconvolution results from different scRNA-seq datasets that are produced in different laboratories and at different times, implicitly addressing the problem of batch-effect confounding. SCDC is benchmarked against existing methods using both in silico generated pseudo-bulk samples and experimentally mixed cell lines, whose known cell-type compositions serve as ground truths. We show that SCDC outperforms existing methods with improved accuracy of cell-type decomposition under both settings. To illustrate how the ENSEMBLE framework performs in complex tissues under different scenarios, we further apply our method to a human pancreatic islet dataset and a mouse mammary gland dataset. SCDC returns results that are more consistent with experimental designs and that reproduce more significant associations between cell-type proportions and measured phenotypes.
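The ENSEMBLE step combines cell-type proportion estimates obtained from different scRNA-seq references. A simple way to picture such a combination is a weighted average in which references that reconstruct the bulk sample poorly receive less weight. The sketch below uses inverse-error weights, which is an assumption made for illustration and not SCDC's actual weight-selection procedure; the cell-type names, estimates, and error values are hypothetical.

```python
def ensemble_proportions(estimates, errors):
    """Combine per-reference proportion estimates using inverse-error weights.

    estimates: list of dicts mapping cell type -> estimated proportion
    errors:    list of positive reconstruction errors, one per reference
    """
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    weights = [w / total for w in inverse]
    combined = {
        cell_type: sum(w * est[cell_type] for w, est in zip(weights, estimates))
        for cell_type in estimates[0]
    }
    return weights, combined


# Two hypothetical references that disagree; the first fits the bulk data better.
estimates = [
    {"alpha": 0.6, "beta": 0.4},
    {"alpha": 0.4, "beta": 0.6},
]
errors = [1.0, 3.0]
weights, combined = ensemble_proportions(estimates, errors)
print(weights, combined)
```

Because the weights sum to one and each input sums to one, the combined proportions also sum to one without any renormalization.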


Gut ◽  
2020 ◽  
pp. gutjnl-2019-320368 ◽  
Author(s):  
Min Zhang ◽  
Shuofeng Hu ◽  
Min Min ◽  
Yanli Ni ◽  
Zheng Lu ◽  
...  

Objective Tumour heterogeneity represents a major obstacle to accurate diagnosis and treatment in gastric adenocarcinoma (GA). Here, we report a systematic transcriptional atlas to delineate molecular and cellular heterogeneity in GA using single-cell RNA sequencing (scRNA-seq). Design We performed unbiased transcriptome-wide scRNA-seq analysis on 27 677 cells from 9 tumour and 3 non-tumour samples. Analysis results were validated using large-scale histological assays and bulk transcriptomic datasets. Results Our integrative analysis of tumour cells identified five cell subgroups with distinct expression profiles. A panel of differentiation-related genes reveals a high diversity of differentiation degrees within and between tumours. Low differentiation degrees can predict poor prognosis in GA. Among them, three subgroups exhibited different differentiation grades, which corresponded well to histopathological features of Lauren’s subtypes. Interestingly, the other two subgroups displayed unique transcriptome features. One subgroup expressing chief-cell markers (eg, LIPF and PGC) and RNF43 with Wnt/β-catenin signalling pathway activated is consistent with the previously described entity fundic gland-type GA (chief cell-predominant, GA-FG-CCP). We further confirmed the presence of GA-FG-CCP in two public bulk datasets using transcriptomic profiles and histological images. The other subgroup specifically expressed immune-related signature genes (eg, LY6K and major histocompatibility complex class II) with Epstein-Barr virus infection. In addition, we also analysed non-malignant epithelium and provided molecular evidence for a potential transition from gastric chief cells into MUC6+TFF2+ spasmolytic polypeptide expressing metaplasia. Conclusion Altogether, our study offers a valuable resource for deciphering gastric tumour heterogeneity, which will provide assistance for precision diagnosis and prognosis.


2020 ◽  
Author(s):  
Min Feng ◽  
Junming Xia ◽  
Shigang Fei ◽  
Xiong Wang ◽  
Yaohong Zhou ◽  
...  

Abstract A wide range of hemocyte types exist in insects but a full definition of the different subclasses is not yet established. The current knowledge of the classification of silkworm hemocytes mainly comes from morphology rather than specific markers, so our understanding of the detailed classification, hemocyte lineage and functions of silkworm hemocytes is very incomplete. Bombyx mori nucleopolyhedrovirus (BmNPV) is a representative member of the baculoviruses, which are major pathogens that specifically infect silkworms and cause serious losses in the sericulture industry. Here, we performed single-cell RNA sequencing (scRNA-seq) of silkworm hemocytes in BmNPV and mock-infected larvae to comprehensively identify silkworm hemocyte subsets and determined specific molecular and cellular characteristics in each hemocyte subset before and after viral infection. A total of 19 cell clusters and their potential marker genes were identified in silkworm hemocytes. Among these hemocyte clusters, clusters 0, 1, 2, 5 and 9 might be granulocytes (GR); clusters 14 and 17 were predicted as plasmatocytes (PL); cluster 18 was tentatively identified as spherulocytes (SP); and clusters 7 and 11 could possibly correspond to oenocytoids (OE). In addition, all of the hemocyte clusters were infected by BmNPV and some infected cells carried a high viral load in silkworm larvae at 3 days post infection (dpi). Interestingly, BmNPV infection can cause severe and diverse changes in gene expression in hemocytes. Cells belonging to the infection group were mainly located at the early stage of the pseudotime trajectories. Furthermore, we found that BmNPV infection suppresses the immune response in the major hemocyte types. In summary, our scRNA-seq analysis revealed the diversity of silkworm hemocytes and provided a rich resource of gene expression profiles for a systems-level understanding of their functions in the uninfected condition and as a response to BmNPV.


2017 ◽  
Author(s):  
Simone Rizzetto ◽  
Auda A. Eltahla ◽  
Peijie Lin ◽  
Rowena Bull ◽  
Andrew R. Lloyd ◽  
...  

Abstract Single cell RNA sequencing (scRNA-seq) has shown great potential in measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq allowed the characterisation of transcript sequence diversity of functionally relevant sub-populations of T cells, and notably the identification of the full length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, such as RNA library capture, cell quality, and sequencing output have been suggested to affect the quality of scRNA-seq data, but these factors have not been systematically examined. We studied the effect of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 publicly available scRNA-seq datasets, and simulation-based analyses. Gene expression was characterised by an increased number of unique genes identified with short read lengths (<50 bp), but these featured higher technical variability compared to profiles from longer reads. TCRαβ were detected in 1,027 cells (79%), with a success rate between 81% and 100% for datasets with at least 250,000 (PE) reads of length >50 bp. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.
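The effect of sequencing depth is commonly explored by downsampling reads and checking how many genes remain detectable. The sketch below does this with independent per-read sampling on a toy count table. It is an illustration of the general idea rather than the simulation used in the study, and the gene names and counts are hypothetical.

```python
import random


def downsample_counts(counts, fraction, seed=0):
    """Keep each read independently with probability `fraction` (binomial thinning)."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    return {
        gene: sum(1 for _ in range(n) if rng.random() < fraction)
        for gene, n in counts.items()
    }


def genes_detected(counts):
    """Number of genes with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


# Hypothetical read counts for one cell: a few abundant genes and many rare ones.
counts = {"gene_high_1": 500, "gene_high_2": 300, "gene_mid": 40,
          "gene_low_1": 3, "gene_low_2": 2, "gene_low_3": 1}

for fraction in (1.0, 0.5, 0.1):
    sub = downsample_counts(counts, fraction)
    print(fraction, sum(sub.values()), genes_detected(sub))
```

Lowly expressed genes are the first to disappear as depth falls, which is one reason rare transcripts such as individual T cell receptor chains need a minimum number of reads to be recovered.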

