R code and downstream analysis objects for the scRNA-seq atlas of human breast spanning normal, preneoplastic and tumorigenic states

2021 ◽  
Author(s):  
Yunshun Chen ◽  
Bhupinder Pal ◽  
Geoffrey J Lindeman ◽  
Jane E Visvader ◽  
Gordon K Smyth

Breast cancer is a common and highly heterogeneous disease. Understanding the cellular diversity of the mammary gland and its surrounding micro-environment across different states can provide insight into cancer development in the human breast. Recently, a large-scale single-cell RNA expression atlas of the human breast was constructed, spanning normal, preneoplastic and tumorigenic states. Single-cell expression profiles of nearly 430,000 cells were obtained from 69 distinct surgical tissue specimens from 55 patients. This article extends the study by providing downstream processed R data objects, complete cell annotation and R code to reproduce all the analyses. Details of all the bioinformatic analyses that produced the results described in the study are provided.

GigaScience ◽  
2020 ◽  
Vol 9 (12) ◽  
Author(s):  
Matthew D Young ◽  
Sam Behjati

Abstract Background Droplet-based single-cell RNA sequence analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that confounds the biological interpretation of single-cell transcriptomic data. Results We demonstrate that contamination from this "soup" of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that seamlessly integrate with existing downstream analysis tools. Applying this method to several datasets using multiple droplet sequencing technologies, we demonstrate that its application improves biological interpretation of otherwise misleading data, as well as improving quality control metrics. Conclusions We present SoupX, a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. This tool has broad applicability, and its application can improve the biological utility of existing and future datasets.
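The idea of estimating "background-corrected" profiles can be illustrated with a toy calculation. The sketch below is not SoupX's API or its actual algorithm; it assumes a simplified model in which the soup profile is pooled from droplets presumed to be empty, a known contamination fraction `rho` of each cell's counts comes from the soup, and the expected soup counts are subtracted gene by gene. The function names, the example numbers, and the value of `rho` are all hypothetical.

```python
def estimate_soup_profile(empty_droplets):
    """Pool counts from droplets assumed to be empty into per-gene fractions."""
    n_genes = len(empty_droplets[0])
    totals = [sum(droplet[g] for droplet in empty_droplets) for g in range(n_genes)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def subtract_background(cell_counts, soup_profile, rho):
    """Remove the expected soup counts from one cell, clipping at zero.

    rho is the assumed fraction of the cell's counts that came from the soup.
    """
    n_counts = sum(cell_counts)
    return [
        max(observed - rho * n_counts * fraction, 0.0)
        for observed, fraction in zip(cell_counts, soup_profile)
    ]


# Three genes; the first one dominates the (hypothetical) cell-free RNA.
empty = [[8, 1, 1], [12, 3, 0], [20, 4, 1]]
soup = estimate_soup_profile(empty)

cell = [10, 50, 40]
corrected = subtract_background(cell, soup, rho=0.1)
print(soup)
print(corrected)
```

In this toy model the gene that dominates the soup loses most of its counts in the cell, while genes that are rare in the soup are barely changed. A real tool also has to estimate the contamination fraction itself, which this sketch takes as given.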


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Botao Fa ◽  
Ting Wei ◽  
Yuan Zhou ◽  
Luke Johnston ◽  
Xin Yuan ◽  
...  

Abstract Single cell RNA sequencing (scRNA-seq) is a powerful tool in detailing the cellular landscape within complex tissues. Large-scale single cell transcriptomics provide both opportunities and challenges for identifying rare cells playing crucial roles in development and disease. Here, we develop GapClust, a light-weight algorithm to detect rare cell types from ultra-large scRNA-seq datasets with state-of-the-art speed and memory efficiency. Benchmarking on diverse experimental datasets demonstrates the superior performance of GapClust compared to other recently proposed methods. When applied to an intestine dataset and a 68k PBMC dataset, GapClust identifies the tuft cells and a previously unrecognised subtype of monocyte, respectively.


2021 ◽  
Author(s):  
Yidi Deng ◽  
Jarny Choi ◽  
Kim-Anh Le Cao

Characterizing the molecular identity of a cell is an essential step in single cell RNA-sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data based on bulk reference atlases. Prior to projection, single cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single cell profiling that will facilitate downstream analysis of scRNA-seq data.
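Pseudo-bulk aggregation, one of the two transformations mentioned above, can be sketched in a few lines. This is a generic illustration and not Sincast's implementation or API: it assumes a cells-by-genes count matrix given as nested lists and one hypothetical group label per cell, and it sums the counts of all cells that share a label so that each group yields a single, less sparse profile.

```python
def pseudo_bulk(counts, labels):
    """Sum per-gene counts over all cells that share a label.

    counts: list of per-cell count vectors (cells x genes).
    labels: one group label per cell (e.g. a cluster or sample ID).
    """
    if len(counts) != len(labels):
        raise ValueError("need exactly one label per cell")
    pooled = {}
    for profile, label in zip(counts, labels):
        if label not in pooled:
            pooled[label] = [0] * len(profile)
        pooled[label] = [a + b for a, b in zip(pooled[label], profile)]
    return pooled


# Four cells, three genes, two hypothetical clusters.
cells = [[0, 3, 1], [2, 0, 0], [1, 1, 0], [0, 0, 4]]
clusters = ["A", "A", "B", "B"]
bulk = pseudo_bulk(cells, clusters)
print(bulk)
```

Each pooled profile contains fewer zeros than the individual cells it was built from, which is what makes it more comparable to a bulk sample. The cost is that variation between cells within a group is lost.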


Cancers ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1250
Author(s):  
Guangchun Han ◽  
Ansam Sinjab ◽  
Kieko Hara ◽  
Warapen Treekitkarnmongkol ◽  
Patrick Brennan ◽  
...  

The novel coronavirus SARS-CoV-2 is the causative agent of the COVID-19 pandemic. Severely symptomatic COVID-19 is associated with lung inflammation, pneumonia, and respiratory failure, thereby raising concerns of elevated risk of COVID-19-associated mortality among lung cancer patients. Angiotensin-converting enzyme 2 (ACE2) is the major receptor for SARS-CoV-2 entry into lung cells. The single-cell expression landscape of ACE2 and other SARS-CoV-2-related genes in pulmonary tissues of lung cancer patients remains unknown. We sought to delineate single-cell expression profiles of ACE2 and other SARS-CoV-2-related genes in pulmonary tissues of lung adenocarcinoma (LUAD) patients. We examined the expression levels and cellular distribution of ACE2 and SARS-CoV-2-priming proteases TMPRSS2 and TMPRSS4 in 5 LUADs and 14 matched normal tissues by single-cell RNA-sequencing (scRNA-seq) analysis. scRNA-seq of 186,916 cells revealed epithelial-specific expression of ACE2, TMPRSS2, and TMPRSS4. Analysis of 70,030 LUAD- and normal-derived epithelial cells showed that ACE2 levels were highest in normal alveolar type 2 (AT2) cells and that TMPRSS2 was expressed in 65% of normal AT2 cells. Conversely, the expression of TMPRSS4 was highest and most frequently detected (75%) in lung cells with malignant features. ACE2-positive cells co-expressed genes implicated in lung pathobiology, including COPD-associated HHIP, and the scavengers CD36 and DMBT1. Notably, the viral scavenger DMBT1 was significantly positively correlated with ACE2 expression in AT2 cells. We describe normal and tumor lung epithelial populations that express SARS-CoV-2 receptor and proteases, as well as major host defense genes, thus comprising potential treatment targets for COVID-19 particularly among lung cancer patients.


2019 ◽  
Vol 20 (S24) ◽  
Author(s):  
Yu Zhang ◽  
Changlin Wan ◽  
Pengcheng Wang ◽  
Wennan Chang ◽  
Yan Huo ◽  
...  

Abstract Background Various statistical models have been developed to model single-cell RNA-seq expression profiles, capture their multimodality, and conduct differential gene expression tests. However, for expression data generated by different experimental designs and platforms, there is currently a lack of capability to determine the most appropriate statistical model. Results We developed an R package, namely Multi-Modal Model Selection (M3S), for gene-wise selection of the most appropriate multi-modality statistical model and downstream analysis, useful for single-cell or large-scale bulk tissue transcriptomic data. M3S features (1) gene-wise selection of the most parsimonious model, among the 11 most commonly used ones, that best fits the expression distribution of the gene, (2) parameter estimation for the selected model, and (3) a differential gene expression test based on the selected model. Conclusion A comprehensive evaluation suggested that M3S can accurately capture multimodality in simulated and real single-cell data. An open-source package is available through GitHub at https://github.com/zy26/M3S.


Science ◽  
2016 ◽  
Vol 352 (6282) ◽  
pp. 183-185
Author(s):  
L. M. Zahn

GigaScience ◽  
2020 ◽  
Vol 9 (10) ◽  
Author(s):  
Francesca Pia Caruso ◽  
Luciano Garofano ◽  
Fulvio D'Angelo ◽  
Kai Yu ◽  
Fuchou Tang ◽  
...  

ABSTRACT Background Single-cell RNA sequencing is the reference technique for characterizing the heterogeneity of the tumor microenvironment. The composition of the various cell types making up the microenvironment can significantly affect the way in which the immune system activates cancer rejection mechanisms. Understanding the cross-talk signals between immune cells and cancer cells is of fundamental importance for the identification of immuno-oncology therapeutic targets. Results We present a novel method, single-cell Tumor–Host Interaction tool (scTHI), to identify significantly activated ligand–receptor interactions across clusters of cells from single-cell RNA sequencing data. We apply our approach to uncover the ligand–receptor interactions in glioma using 6 publicly available human glioma datasets encompassing 57,060 gene expression profiles from 71 patients. By leveraging this large-scale collection we show that unexpected cross-talk partners are highly conserved across different datasets in the majority of the tumor samples. This suggests that shared cross-talk mechanisms exist in glioma. Conclusions Our results provide a complete map of the active tumor–host interaction pairs in glioma that can be therapeutically exploited to reduce the immunosuppressive action of the microenvironment in brain tumor.


2017 ◽  
Author(s):  
Lipin Loo ◽  
Jeremy M. Simon ◽  
Eric S. McCoy ◽  
Jesse K. Niehaus ◽  
Mark J. Zylka

We generated a single-cell transcriptomic catalog of the developing mouse cerebral cortex that includes numerous classes of neurons, progenitors, and glia, their proliferation, migration, and activation states, and their relatedness within and across timepoints. Cell expression profiles stratified neurological disease-associated genes into distinct subtypes. Complex neurodevelopmental processes can be reconstructed with single-cell transcriptomics data, permitting a deeper understanding of cortical development and the cellular origins of brain diseases.


2021 ◽  
Vol 12 ◽  
Author(s):  
Melanie A. Brennan ◽  
Adam Z. Rosenthal

Clonal bacterial populations exhibit various forms of heterogeneity, including co-occurrence of cells with different morphological traits, biochemical properties, and gene expression profiles. This heterogeneity is prevalent in a variety of environments. For example, the productivity of large-scale industrial fermentations and virulence of infectious diseases are shaped by cell population heterogeneity and have a direct impact on human life. Because this heterogeneity is important to understand, multiple methods of examining single-cell heterogeneity have been developed. Traditionally, fluorescent reporters or probes are used to examine a specific gene of interest, providing a useful but inherently biased approach. In contrast, single-cell RNA sequencing (scRNA-seq) is an agnostic approach to examine heterogeneity and has been successfully applied to eukaryotic cells. Unfortunately, the widely used eukaryotic scRNA-seq methods present difficulties when applied to bacteria. Specifically, bacteria have a cell wall which makes eukaryotic lysis methods incompatible, bacterial mRNA has a shorter half-life and lower copy numbers, and isolating an individual bacterial species from a mixed community is difficult. Recent work has demonstrated that these technical hurdles can be overcome, providing valuable insight into factors influencing microbial heterogeneity. This perspective describes the emerging microbial scRNA-seq toolkit. We outline the benefit of these new tools in elucidating numerous scientific questions in microbiological studies and offer insight about the possible rules that govern the segregation of traits in individual microbial cells.

