Identifying the Landscape of Intratumoral Microbes via a Single Cell Transcriptomic Analysis

Author(s):  
Welles Robinson ◽  
Fiorella Schischlik ◽  
E. Michael Gertz ◽  
Alejandro A. Schäffer ◽  
Eytan Ruppin

Abstract Microbial taxa that are differentially abundant between cell types are likely to be intracellular. Here we describe a new computational pipeline called CSI-Microbes (computational identification of Cell type Specific Intracellular Microbes) that aims to identify such putative intracellular species from single cell RNA-seq data in a given tumor sample. CSI-Microbes also includes additional steps that can be applied to filter out microbial contaminants from the bona fide microbial residents of cells in the patients. We first test and validate CSI-Microbes on a dataset of immune cells deliberately infected with Salmonella. We then apply CSI-Microbes to identify intracellular microbes in breast cancer and melanoma. We identify Streptomyces as differentially abundant in the tumor cells of one breast cancer sample. We further identify three bacterial genera and four fungal genera that are differentially abundant and hence likely to be intracellular in the tumor cells in melanoma samples. No cell type specific bacteria were identified in our analysis of brain tumor samples. In sum, CSI-Microbes offers a new way to identify likely intracellular microbes living within specific cell populations in malignant tumors, markedly extending previous studies aimed at inferring microbial abundance from bulk tumor expression data.
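
As an illustration only, the notion of a taxon being "differentially abundant between cell types" can be sketched with a permutation test on per-cell microbial read counts. The abstract does not state which statistical test CSI-Microbes uses, so the permutation test, the helper name `permutation_pvalue`, and the toy counts below are all assumptions made for this sketch and not the authors' method.

```python
import random


def permutation_pvalue(group_a, group_b, n_permutations=10000, seed=0):
    """One-sided permutation p-value for mean(group_a) > mean(group_b).

    Hypothetical helper for illustration; not part of CSI-Microbes.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign cells to the two groups.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up per-cell read counts for a single microbial genus.
tumor_cells = [12, 9, 15, 11, 14, 10, 13, 8]
immune_cells = [0, 1, 0, 2, 0, 1, 0, 0]

p_value = permutation_pvalue(tumor_cells, immune_cells)
print(f"one-sided permutation p-value: {p_value:.4f}")
```

A small p-value here would flag the genus as enriched in tumor cells relative to immune cells. A real analysis would also need to normalize for sequencing depth and correct for testing many taxa.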

2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Wei Lin ◽  
Pawan Noel ◽  
Erkut H. Borazanci ◽  
Jeeyun Lee ◽  
Albert Amini ◽  
...  

Abstract Background Solid tumors such as pancreatic ductal adenocarcinoma (PDAC) comprise not just tumor cells but also a microenvironment with which the tumor cells constantly interact. Detailed characterization of the cellular composition of the tumor microenvironment is critical to the understanding of the disease and treatment of the patient. Single-cell transcriptomics has been used to study the cellular composition of different solid tumor types including PDAC. However, almost all of those studies used primary tumor tissues. Methods In this study, we employed a single-cell RNA sequencing technology to profile the transcriptomes of individual cells from dissociated primary tumors or metastatic biopsies obtained from patients with PDAC. Unsupervised clustering analysis as well as a new supervised classification algorithm, SuperCT, was used to identify the different cell types within the tumor tissues. The expression signatures of the different cell types were then compared between primary tumors and metastatic biopsies. The expression levels of the cell type-specific signature genes were also correlated with patient survival using public datasets. Results Our single-cell RNA sequencing analysis revealed distinct cell types in primary and metastatic PDAC tissues including tumor cells, endothelial cells, cancer-associated fibroblasts (CAFs), and immune cells. The cancer cells showed high inter-patient heterogeneity, whereas the stromal cells were more homogenous across patients. Immune infiltration varied significantly from patient to patient, with the majority of the immune cells being macrophages and exhausted lymphocytes. We found that the tumor cellular composition was an important factor in defining the PDAC subtypes. Furthermore, the expression levels of cell type-specific markers for EMT+ cancer cells, activated CAFs, and endothelial cells were significantly associated with patient survival.
Conclusions Taken together, our work identifies significant heterogeneity in cellular compositions of PDAC tumors and between primary tumors and metastatic lesions. Furthermore, the cellular composition was an important factor in defining PDAC subtypes and significantly correlated with patient outcome. These findings provide valuable insights on the PDAC microenvironment and could potentially inform the management of PDAC patients.


2019 ◽  
Author(s):  
Pawel F. Przytycki ◽  
Katherine S. Pollard

Single-cell and bulk genomics assays have complementary strengths and weaknesses, and alone neither strategy can fully capture regulatory elements across the diversity of cells in complex tissues. We present CellWalker, a method that integrates single-cell open chromatin (scATAC-seq) data with gene expression (RNA-seq) and other data types using a network model that simultaneously improves cell labeling in noisy scATAC-seq and annotates cell-type specific regulatory elements in bulk data. We demonstrate CellWalker’s robustness to sparse annotations and noise using simulations and combined RNA-seq and ATAC-seq in individual cells. We then apply CellWalker to the developing brain. We identify cells transitioning between transcriptional states, resolve enhancers to specific cell types, and observe that autism and other neurological traits can be mapped to specific cell types through their enhancers.


2021 ◽  
Author(s):  
Gulden Olgun ◽  
Vishaka Gopalan ◽  
Sridhar Hannenhalli

MicroRNAs (miRNAs) are critical in development, homeostasis, and diseases, including cancer. However, our understanding of miRNA function at cellular resolution is thwarted by the inability of the standard single cell RNA-seq protocols to capture miRNAs. Here we introduce a machine learning tool, miRSCAPE, to infer miRNA expression in a sample from its RNA-seq profile. We establish miRSCAPE's accuracy separately in 10 tissues comprising ~10,000 tumor and normal bulk samples and demonstrate that miRSCAPE accurately infers cell type-specific miRNA activities (predicted vs observed fold-difference correlation ~ 0.81) in two independent datasets where miRNA profiles of specific cell types are available (HEK-GBM, Kidney-Breast-Skin). When trained on human hematopoietic cancers, miRSCAPE can identify active miRNAs in 8 hematopoietic cell lines in mouse with reasonable accuracy (auROC = 0.67). Finally, we apply miRSCAPE to infer miRNA activities in scRNA clusters in pancreatic and lung cancers, as well as in 56 cell types in the Human Cell Landscape (HCL). Across the board, miRSCAPE recapitulates and provides a refined view of known miRNA biology. miRSCAPE is freely available and promises to substantially expand our understanding of gene regulatory networks at cellular resolution.


Author(s):  
Jiebiao Wang ◽  
Kathryn Roeder ◽  
Bernie Devlin

Abstract When assessed over a large number of samples, bulk RNA sequencing provides reliable data for gene expression at the tissue level. Single-cell RNA sequencing (scRNA-seq) deepens those analyses by evaluating gene expression at the cellular level. Both data types lend insights into disease etiology. With current technologies, however, scRNA-seq data are known to be noisy. Moreover, constrained by costs, scRNA-seq data are typically generated from a relatively small number of subjects, which limits their utility for some analyses, such as identification of gene expression quantitative trait loci (eQTLs). To address these issues while maintaining the unique advantages of each data type, we develop a Bayesian method (bMIND) to integrate bulk and scRNA-seq data. With a prior derived from scRNA-seq data, we propose to estimate sample-level cell-type-specific (CTS) expression from bulk expression data. The CTS expression enables large-scale sample-level downstream analyses, such as detecting CTS differentially expressed genes (DEGs) and eQTLs. Through simulations, we demonstrate that bMIND improves the accuracy of sample-level CTS expression estimates and power to discover CTS-DEGs when compared to existing methods. To further our understanding of two complex phenotypes, autism spectrum disorder and Alzheimer’s disease, we apply bMIND to gene expression data of relevant brain tissue to identify CTS-DEGs. Our results complement findings for CTS-DEGs obtained from snRNA-seq studies, replicating certain DEGs in specific cell types while nominating other novel genes in those cell types. Finally, we calculate CTS-eQTLs for eleven brain regions by analyzing GTEx V8 data, creating a new resource for biological insights.


BMC Biology ◽  
2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Elin Lundin ◽  
Chenglin Wu ◽  
Albin Widmark ◽  
Mikaela Behm ◽  
Jens Hjerling-Leffler ◽  
...  

Abstract Background Adenosine-to-inosine (A-to-I) RNA editing is a process that contributes to the diversification of proteins that has been shown to be essential for neurotransmission and other neuronal functions. However, the spatiotemporal and diversification properties of RNA editing in the brain are largely unknown. Here, we applied in situ sequencing to distinguish between edited and unedited transcripts in distinct regions of the mouse brain at four developmental stages, and investigate the diversity of the RNA landscape. Results We analyzed RNA editing at codon-altering sites using in situ sequencing at single-cell resolution, in combination with the detection of individual ADAR enzymes and specific cell type marker transcripts. This approach revealed cell-type-specific regulation of RNA editing of a set of transcripts, and developmental and regional variation in editing levels for many of the targeted sites. We found increasing editing diversity throughout development, which arises through regional- and cell type-specific regulation of ADAR enzymes and target transcripts. Conclusions Our single-cell in situ sequencing method has proved useful to study the complex landscape of RNA editing and our results indicate that this complexity arises due to distinct mechanisms of regulating individual RNA editing sites, acting both regionally and in specific cell types.


2021 ◽  
Author(s):  
Kai Kang ◽  
Caizhi David Huang ◽  
Yuanyuan Li ◽  
David M. Umbach ◽  
Leping Li

Abstract Background Biological tissues consist of heterogenous populations of cells. Because gene expression patterns from bulk tissue samples reflect the contributions from all cells in the tissue, understanding the contribution of individual cell types to the overall gene expression in the tissue is fundamentally important. We recently developed a computational method, CDSeq, that can simultaneously estimate both sample-specific cell-type proportions and cell-type-specific gene expression profiles using only bulk RNA-Seq counts from multiple samples. Here we present an R implementation of CDSeq (CDSeqR) with significant performance improvement over the original implementation in MATLAB and with a new function to aid interpretation of deconvolution outcomes. The R package would be of interest for the broader R community. Result We developed a novel strategy to substantially improve computational efficiency in both speed and memory usage. In addition, we designed and implemented a new function for annotating CDSeq-estimated cell types using publicly available single-cell RNA sequencing (scRNA-seq) data (single-cell data from 20 major organs are included in the R package). This function allows users to readily interpret and visualize the CDSeq-estimated cell types. We carried out additional validations of the CDSeqR software with in silico and in vitro mixtures and with real experimental data including RNA-seq data from the Cancer Genome Atlas (TCGA) and The Genotype-Tissue Expression (GTEx) project. Conclusions The existing bulk RNA-seq repositories, such as TCGA and GTEx, provide enormous resources for better understanding changes in transcriptomics and human diseases. They are also potentially useful for studying cell-cell interactions in the tissue microenvironment. However, bulk level analyses neglect tissue heterogeneity and hinder investigation in a cell-type-specific fashion.
The CDSeqR package can be viewed as providing in silico single-cell dissection of bulk measurements. It enables researchers to gain cell-type-specific information from bulk RNA-seq data.
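
The premise that a bulk measurement "reflects the contributions from all cells in the tissue" can be made concrete with a small forward model. The sketch below is not the CDSeq algorithm and does not use the CDSeqR API; it only assumes, for illustration, that bulk expression is a proportion-weighted sum of cell-type-specific profiles. The function name `mix_bulk_expression` and all numbers are hypothetical.

```python
def mix_bulk_expression(proportions, profiles):
    """Return bulk expression per gene as a proportion-weighted sum.

    proportions: one fraction per cell type (expected to sum to 1).
    profiles: one list of per-gene expression values per cell type.
    Hypothetical helper for illustration; not part of CDSeqR.
    """
    if len(proportions) != len(profiles):
        raise ValueError("need exactly one profile per cell type")
    n_genes = len(profiles[0])
    return [
        sum(p * profile[g] for p, profile in zip(proportions, profiles))
        for g in range(n_genes)
    ]


# Two made-up cell types measured on two genes, mixed in equal parts.
bulk = mix_bulk_expression([0.5, 0.5], [[10, 0], [0, 20]])
print(bulk)
```

Deconvolution methods work in the opposite direction: given only bulk values across many samples, they estimate the proportions, the profiles, or both.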


2021 ◽  
Author(s):  
Lisa Maria Steinheuer ◽  
Sebastian Canzler ◽  
Jörg Hackermüller

Abstract Gene correlation network inference from single-cell transcriptomics data potentially allows unprecedented insights into cell type-specific regulatory programs. ScRNA-seq data is severely affected by dropout, which significantly hampers and restrains current downstream analysis. Although newly developed tools are capable of dealing with sparse data, no appropriate single-cell network inference workflow has been established. A potential way to end this deadlock is the application of data imputation methods, which have already proved useful in specific contexts of single-cell data analysis, e.g., recovering cell clusters. In order to infer cell-type specific networks, two prerequisites must be met: the identification of cluster-specific cell-types and the network inference itself. Here, we propose a benchmarking framework to investigate both objectives. By using suitable reference data with inherent correlation structure, six representative imputation tools and appropriate evaluation measures, we were able to systematically infer the impact of data imputation on network inference. Major network structures were found to be preserved in low dropout data sets. For moderately sparse data sets, DCA was able to recover gene correlation structures, although systematically introducing higher correlation values. No imputation tool was able to recover true signals from high dropout data. However, by using an additional biological data set we could show that cell-cell correlation by means of specific marker gene expression was not compromised through data imputation. Our analysis showed that network inference is feasible for low and moderately sparse data sets by using the unimputed and DCA-prepared data, respectively. High sparsity data, on the other hand, still pose a major problem since current imputation techniques are not able to facilitate network inference.
The annotation of cluster-specific cell-types as a prerequisite is not hampered by data imputation, but the power of imputation tools to restore the deeply hidden correlation structures is still not sufficient.


2019 ◽  
Vol 21 (Supplement_6) ◽  
pp. vi64-vi64
Author(s):  
Tinyi Chu ◽  
Edward Rice ◽  
Hans Salamanca ◽  
Zhong Wang ◽  
Sharon Longo ◽  
...  

Abstract Glioblastoma is among the most heterogeneous malignancies, making it difficult to identify clinically-relevant interactions between tumor cells and their supportive tumor microenvironment. Moreover, whether the heterogeneity of tumor cells is reflected by changes in the composition of the tumor microenvironment remains poorly defined. To further understand the cellular heterogeneity of GBM, we used our previously validated chromatin run-on and sequencing (ChRO-seq) method to analyze 61 GBMs from a retrospective cohort of patients banked at the State University of New York (Upstate Medical Center) between 1987 and 2007 (characteristics: Male:Female ratio= 2:1; median age at diagnosis= 59 years; median KPS=80; median overall survival= 343 days). We developed a new Bayesian statistical model that uses transcription at cell-type specific enhancers to identify the cellular composition of the tumor microenvironment in each patient. We validated our tool using simulations and scATAC-seq data from the same specimens, showing large improvements in sensitivity and accuracy compared with CIBERSORT. Integrative analysis of cellular composition and matching clinical data revealed correlations between the presence of specific cell types in the tumor mass and clinical variables. Finally, our analysis allowed us to identify transcription factors (e.g., NF-kB, C/EBPB) that control gene expression changes, revealing which cell types are controlled by each transcription factor in the GBM microenvironment. Our study uncovers new insights into the cellular heterogeneity of GBM and its impact on clinical progression and survival.


2018 ◽  
Author(s):  
Jianhua Yin ◽  
Zhisheng Li ◽  
Chen Yan ◽  
Enhao Fang ◽  
Ting Wang ◽  
...  

Abstract The tumor microenvironment is composed of numerous cell types, including tumor, immune and stromal cells. Cancer cells interact with the tumor microenvironment to suppress anticancer immunity. In this study, we molecularly dissected the tumor microenvironment of breast cancer by single-cell RNA-seq. We profiled the breast cancer tumor microenvironment by analyzing the single-cell transcriptomes of 52,163 cells from the tumor tissues of 15 breast cancer patients. The tumor cells and immune cells from individual patients were analyzed simultaneously at the single-cell level. This study explores the diversity of the cell types in the tumor microenvironment and provides information on the mechanisms of escape from clearance by immune cells in breast cancer.
One Sentence Summary Landscape of tumor cells and immune cells in breast cancer by single cell RNA-seq


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Pawel F. Przytycki ◽  
Katherine S. Pollard

Abstract Single-cell and bulk genomics assays have complementary strengths and weaknesses, and alone neither strategy can fully capture regulatory elements across the diversity of cells in complex tissues. We present CellWalker, a method that integrates single-cell open chromatin (scATAC-seq) data with gene expression (RNA-seq) and other data types using a network model that simultaneously improves cell labeling in noisy scATAC-seq and annotates cell type-specific regulatory elements in bulk data. We demonstrate CellWalker’s robustness to sparse annotations and noise using simulations and combined RNA-seq and ATAC-seq in individual cells. We then apply CellWalker to the developing brain. We identify cells transitioning between transcriptional states, resolve regulatory elements to cell types, and observe that autism and other neurological traits can be mapped to specific cell types through their regulatory elements.

