Efficient inference of single cell expression profiles with overlapping pooling and compressed sensing

Mapping Intimacies ◽

10.1101/338319 ◽

2018 ◽

Author(s):

Xingzhao Wen ◽

Weiqiang Xu ◽

Xiao Sun ◽

Jing Tu ◽

Zuhong Lu

Keyword(s):

Compressed Sensing ◽

Single Cell ◽

Expression Profile ◽

Expression Profiles ◽

Single Cells ◽

Group Testing ◽

Cell Types ◽

Original Data ◽

Computational Framework ◽

Cell Expression

Summary: Plate-based single cell RNA-Seq (scRNA-seq) methods can detect a comprehensive gene expression profile but suffer from the high library cost of each single cell. Although massively parallel scRNA-seq techniques reduce cost significantly, these approaches lose sensitivity for gene detection. Inspired by group testing and compressed sensing, we designed a computational framework to close the gap between sensitivity and library cost. In our framework, single cells were assigned to many overlapping pools. The expression profile of each pool was then obtained using a plate-based sequencing approach. The expression profiles of all single cells were recovered from the pool expression and the overlapping pooling design. The inferred expression profiles were highly consistent with the original data in both accuracy and cell type identification. A parallel computing scheme was designed to speed up processing of very large numbers of single cells, and elastic net regression was combined with compressed sensing to adapt automatically to both sparsely and densely expressed genes.
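The pooling-and-recovery idea in this summary can be illustrated with a minimal sketch for a single gene. It is not the authors' implementation: the pool sizes, the penalty weights `l1` and `l2`, the random design, and every function name below are assumptions made for illustration, and the solver is a plain proximal-gradient loop for an elastic net objective rather than whatever solver the paper used. Recovery from fewer pools than cells is not guaranteed to be exact.

```python
import random

def make_pooling_design(n_cells, n_pools, pools_per_cell, seed=0):
    """Binary design matrix: design[p][c] == 1 if cell c is placed in pool p."""
    rng = random.Random(seed)
    design = [[0] * n_cells for _ in range(n_pools)]
    for cell in range(n_cells):
        # Each cell goes into several distinct pools, so pools overlap.
        for pool in rng.sample(range(n_pools), pools_per_cell):
            design[pool][cell] = 1
    return design

def pool_expression(design, cell_expr):
    """Expression measured in each pool: the sum over its member cells."""
    return [sum(a * x for a, x in zip(row, cell_expr)) for row in design]

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def objective(design, pooled, x, l1, l2):
    """Elastic net objective: squared error plus L1 and L2 penalties."""
    resid = [p - q for p, q in zip(pool_expression(design, x), pooled)]
    return (0.5 * sum(r * r for r in resid)
            + l1 * sum(abs(v) for v in x)
            + 0.5 * l2 * sum(v * v for v in x))

def elastic_net_recover(design, pooled, l1=0.05, l2=0.01, n_iter=2000):
    """Estimate per-cell expression of one gene from pooled measurements."""
    n_pools, n_cells = len(design), len(design[0])
    # For a binary design, the number of ones bounds the largest eigenvalue
    # of A^T A, so this step size keeps the iteration from diverging.
    step = 1.0 / (sum(sum(row) for row in design) + l2)
    x = [0.0] * n_cells
    for _ in range(n_iter):
        resid = [p - q for p, q in zip(pool_expression(design, x), pooled)]
        grad = [sum(design[p][c] * resid[p] for p in range(n_pools)) + l2 * x[c]
                for c in range(n_cells)]
        x = [soft_threshold(xc - step * g, step * l1) for xc, g in zip(x, grad)]
    return x

# Hypothetical example: 12 cells, 8 pools, one sparsely expressed gene.
design = make_pooling_design(n_cells=12, n_pools=8, pools_per_cell=3)
truth = [0.0] * 12
truth[2], truth[9] = 4.0, 1.5
pooled = pool_expression(design, truth)
estimate = elastic_net_recover(design, pooled)
```

Here 8 pooled libraries stand in for 12 single-cell libraries. How closely `estimate` matches `truth` depends on the design, the sparsity of the gene, and the penalty weights, which is why the sketch makes no claim about accuracy.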

Sincast: a computational framework to predict cell identities in single cell transcriptomes using bulk atlases as references

10.1101/2021.11.07.467660 ◽

2021 ◽

Author(s):

Yidi Deng ◽

Jarny Choi ◽

Kim-Anh Le Cao

Keyword(s):

Single Cell ◽

Expression Profiles ◽

Single Cells ◽

Computational Framework ◽

Cell Identity ◽

Single Cell Profiling ◽

A Cell ◽

Downstream Analysis ◽

Cell Expression ◽

Cell Data

Characterizing the molecular identity of a cell is an essential step in single cell RNA-sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data based on bulk reference atlases. Prior to projection, single cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single cell profiling that will facilitate downstream analysis of scRNA-seq data.
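As a rough illustration of the pseudo-bulk aggregation mentioned above, the sketch below sums raw counts over cells that share a label and rescales each summed profile to counts per million. The grouping by label, the normalisation choice, and the function names are assumptions for illustration; the abstract does not describe how Sincast performs the aggregation.

```python
def pseudo_bulk(counts, labels):
    """Sum per-cell count vectors within each label.

    counts: list of per-cell lists, one count per gene (same gene order).
    labels: one group label per cell, for example a cluster identifier.
    """
    totals = {}
    for cell_counts, label in zip(counts, labels):
        if label not in totals:
            totals[label] = [0] * len(cell_counts)
        totals[label] = [t + c for t, c in zip(totals[label], cell_counts)]
    return totals

def counts_per_million(profile):
    """Rescale a summed profile so that its entries add up to one million."""
    total = sum(profile)
    if total == 0:
        return [0.0 for _ in profile]
    return [1e6 * value / total for value in profile]

# Hypothetical toy data: three cells, three genes, two groups.
toy_counts = [[1, 0, 3], [2, 0, 1], [0, 5, 5]]
toy_labels = ["a", "a", "b"]
toy_bulk = {label: counts_per_million(profile)
            for label, profile in pseudo_bulk(toy_counts, toy_labels).items()}
```

Summing over many cells fills in genes that are zero in individual sparse profiles, which is the property that makes the aggregated profile more comparable to a bulk sample.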

Genomic Architecture of Cells in Tissues (GeACT): Study of Human Mid-gestation Fetus

10.1101/2020.04.12.038000 ◽

2020 ◽

Author(s):

Feng Tian ◽

Fan Zhou ◽

Xiang Li ◽

Wenping Ma ◽

Honggui Wu ◽

...

Keyword(s):

Transcription Factors ◽

Single Cell ◽

Human Cell ◽

Expression Profiles ◽

Single Cells ◽

Cell Types ◽

List Type ◽

Cell Type ◽

Genomic Architecture ◽

Gene Modules

Summary: By circumventing cellular heterogeneity, single cell omics have now been widely utilized for cell typing in human tissues, culminating in the undertaking of the Human Cell Atlas, aimed at characterizing all human cell types. However, more important is the probing of gene regulatory networks, the underlying chromatin architecture and the critical transcription factors for each cell type. Here we report the Genomic Architecture of Cells in Tissues (GeACT), a comprehensive genomic database that collectively addresses the above needs, with the goal of understanding the functional genome in action. GeACT was made possible by our novel single-cell RNA-seq (MALBAC-DT) and ATAC-seq (METATAC) methods of high detectability and precision. We exemplified GeACT by first studying representative organs in the human mid-gestation fetus. In particular, correlated gene modules (CGMs) are observed and found to be cell-type-dependent. We linked gene expression profiles to the underlying chromatin states, and found the key transcription factors for representative CGMs. Highlights: Genomic Architecture of Cells in Tissues (GeACT) data for human mid-gestation fetus; determining correlated gene modules (CGMs) in different cell types by MALBAC-DT; measuring chromatin open regions in single cells with high detectability by METATAC; integrating transcriptomics and chromatin accessibility to reveal key TFs for a CGM.

Self-reporting transposons enable simultaneous readout of gene expression and transcription factor binding in single cells

10.1101/538553 ◽

2019 ◽

Cited By ~ 3

Author(s):

Arnav Moudgil ◽

Michael N. Wilkinson ◽

Xuhua Chen ◽

June He ◽

Alex J. Cammack ◽

...

Keyword(s):

Gene Expression ◽

Transcription Factor ◽

Single Cell ◽

Binding Sites ◽

Expression Profiles ◽

Single Cells ◽

Gene Expression Profiles ◽

Cell Types ◽

Specific Cell

Abstract: In situ measurements of transcription factor (TF) binding are confounded by cellular heterogeneity and represent averaged profiles in complex tissues. Single cell RNA-seq (scRNA-seq) is capable of resolving different cell types based on gene expression profiles, but no technology exists to directly link specific cell types to the binding pattern of TFs in those cell types. Here, we present self-reporting transposons (SRTs) and their use in single cell calling cards (scCC), a novel assay for simultaneously capturing gene expression profiles and mapping TF binding sites in single cells. First, we show how the genomic locations of SRTs can be recovered from mRNA. Next, we demonstrate that SRTs deposited by the piggyBac transposase can be used to map the genome-wide localization of the TFs SP1, through a direct fusion of the two proteins, and BRD4, through its native affinity for piggyBac. We then present the scCC method, which maps SRTs from scRNA-seq libraries, thus enabling concomitant identification of cell types and TF binding sites in those same cells. As a proof-of-concept, we show recovery of cell type-specific BRD4 and SP1 binding sites from cultured cells. Finally, we map Brd4 binding sites in the mouse cortex at single cell resolution, thus establishing a new technique for studying TF biology in situ.

Comprehensive single cell transcriptional profiling of a multicellular organism by combinatorial indexing

10.1101/104844 ◽

2017 ◽

Cited By ~ 8

Author(s):

Junyue Cao ◽

Jonathan S. Packer ◽

Vijay Ramani ◽

Darren A. Cusanovich ◽

Chau Huynh ◽

...

Keyword(s):

Single Cell ◽

Expression Profiles ◽

Single Cells ◽

Transcriptional Profiling ◽

Cost Effective ◽

Cell Types ◽

Cell Capture ◽

Rna Seq ◽

Major Step ◽

C Elegans

Abstract: Conventional methods for profiling the molecular content of biological samples fail to resolve heterogeneity that is present at the level of single cells. In the past few years, single cell RNA sequencing has emerged as a powerful strategy for overcoming this challenge. However, its adoption has been limited by a paucity of methods that are at once simple to implement and cost effective to scale massively. Here, we describe a combinatorial indexing strategy to profile the transcriptomes of large numbers of single cells or single nuclei without requiring the physical isolation of each cell (Single cell Combinatorial Indexing RNA-seq or sci-RNA-seq). We show that sci-RNA-seq can be used to efficiently profile the transcriptomes of tens of thousands of single cells per experiment, and demonstrate that we can stratify cell types from these data. Key advantages of sci-RNA-seq over contemporary alternatives such as droplet-based single cell RNA-seq include sublinear cost scaling, a reliance on widely available reagents and equipment, the ability to concurrently process many samples within a single workflow, compatibility with methanol fixation of cells, cell capture based on DNA content rather than cell size, and the flexibility to profile either cells or nuclei. As a demonstration of sci-RNA-seq, we profile the transcriptomes of 42,035 single cells from C. elegans at the L2 stage, effectively 50-fold “shotgun cellular coverage” of the somatic cell composition of this organism at this stage. We identify 27 distinct cell types, including rare cell types such as the two distal tip cells of the developing gonad, estimate consensus expression profiles and define cell-type specific and selective genes. Given that C. elegans is the only organism with a fully mapped cellular lineage, these data represent a rich resource for future methods aimed at defining cell types and states. They will advance our understanding of developmental biology, and constitute a major step towards a comprehensive, single-cell molecular atlas of a whole animal.
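The combinatorial indexing idea can be sketched with a little arithmetic: each round of barcoding multiplies the number of distinguishable barcode combinations, and two cells are confused only if they receive the same combination in every round. The sketch below is a simplified model, not the sci-RNA-seq protocol. The plate size of 96 wells, the two rounds, the assumption that cells land in wells uniformly at random, and the function names are all assumptions for illustration; the abstract gives none of these details.

```python
import random

def n_barcode_combinations(wells_per_round):
    """Number of distinct barcode combinations after all indexing rounds."""
    total = 1
    for wells in wells_per_round:
        total *= wells
    return total

def expected_collision_fraction(n_cells, n_combinations):
    """Expected fraction of cells sharing their combination with another cell.

    Assumes each cell independently receives a uniformly random combination.
    """
    if n_cells <= 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)

def assign_barcodes(n_cells, wells_per_round, seed=0):
    """Give every cell one well index per round; the tuple is its identity."""
    rng = random.Random(seed)
    return [tuple(rng.randrange(wells) for wells in wells_per_round)
            for _ in range(n_cells)]

# Hypothetical layout: two rounds of 96 wells each.
layout = [96, 96]
combos = n_barcode_combinations(layout)
collision_at_500_cells = expected_collision_fraction(500, combos)
cell_barcodes = assign_barcodes(500, layout)
```

Under this model, adding a round multiplies the number of combinations while adding only one more plate of reagents, which is one way to read the sublinear cost scaling mentioned in the abstract.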

Single-cell expression profiling reveals dynamic flux of cardiac stromal, vascular and immune cells in health and injury

eLife ◽

10.7554/elife.43882 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 82

Author(s):

Nona Farbehi ◽

Ralph Patrick ◽

Aude Dorison ◽

Munira Xaymardan ◽

Vaibhao Janbandhu ◽

...

Keyword(s):

Single Cell ◽

Immune Cells ◽

Interstitial Cell ◽

Single Cells ◽

Cardiac Injury ◽

Cell Types ◽

Cell Lineages ◽

Heart Repair ◽

Cell Expression ◽

Cell Community

Besides cardiomyocytes (CM), the heart contains numerous interstitial cell types which play key roles in heart repair, regeneration and disease, including fibroblast, vascular and immune cells. However, a comprehensive understanding of this interactive cell community is lacking. We performed single-cell RNA-sequencing of the total non-CM fraction and enriched (Pdgfra-GFP+) fibroblast lineage cells from murine hearts at days 3 and 7 post-sham or myocardial infarction (MI) surgery. Clustering of >30,000 single cells identified >30 populations representing nine cell lineages, including a previously undescribed fibroblast lineage trajectory present in both sham and MI hearts leading to a uniquely activated cell state defined in part by a strong anti-WNT transcriptome signature. We also uncovered novel myofibroblast subtypes expressing either pro-fibrotic or anti-fibrotic signatures. Our data highlight non-linear dynamics in myeloid and fibroblast lineages after cardiac injury, and provide an entry point for deeper analysis of cardiac homeostasis, inflammation, fibrosis, repair and regeneration.

GapClust is a light-weight approach distinguishing rare cells from voluminous single cell expression profiles

Nature Communications ◽

10.1038/s41467-021-24489-8 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Botao Fa ◽

Ting Wei ◽

Yuan Zhou ◽

Luke Johnston ◽

Xin Yuan ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

State Of The Art ◽

Expression Profiles ◽

Cell Types ◽

Superior Performance ◽

Light Weight ◽

Memory Efficiency ◽

Rare Cells ◽

Cell Expression

Abstract: Single cell RNA sequencing (scRNA-seq) is a powerful tool for detailing the cellular landscape within complex tissues. Large-scale single cell transcriptomics provides both opportunities and challenges for identifying rare cells that play crucial roles in development and disease. Here, we develop GapClust, a light-weight algorithm to detect rare cell types from ultra-large scRNA-seq datasets with state-of-the-art speed and memory efficiency. Benchmarking on diverse experimental datasets demonstrates the superior performance of GapClust compared to other recently proposed methods. When applied to an intestine dataset and a 68k PBMC dataset, GapClust identifies the tuft cells and a previously unrecognised subtype of monocyte, respectively.

Isolating and Cryo-Preserving Pig Skin Cells for Single Cell RNA Sequencing Study

10.1101/2021.01.31.429035 ◽

2021 ◽

Author(s):

Li Han ◽

Carlos P Jara ◽

Ou Wang ◽

Sandra Thibivilliers ◽

Rafał K. Wóycicki ◽

...

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Skin Diseases ◽

Expression Profiles ◽

Single Cells ◽

Gene Expression Profiles ◽

Cell Aggregation ◽

Cell Types ◽

Skin Cell ◽

Single Cell Rna Sequencing

Abstract: The architecture and physiology of pig skin are similar to those of human skin. Thus, the pig model is valuable for studying skin biology and testing therapeutics for skin diseases. Single-cell RNA sequencing (scRNA-Seq) technology allows quantitative analysis of cell types, cell states, signaling, and the receptor-ligand interactome at single-cell resolution and at high throughput. scRNA-Seq has been used to study mouse and human skin. However, studies of pig skin with scRNA-Seq are still rare. Here we describe a robust method for isolating and cryo-preserving pig single cells for scRNA-Seq. We show that pig skin can be efficiently dissociated into single cells with high cell viability using the Miltenyi Human Whole Skin Dissociation kit and the Miltenyi gentleMACS Dissociator. We also show that the resulting single cells can be cryopreserved using DMSO without causing additional cell death, cell aggregation, or changes in gene expression profiles. Using the developed protocol, we were able to identify all the major skin cell types. The protocol and results from this study will be valuable for the skin research community.

A universal approach for integrating super large-scale single-cell transcriptomes by exploring gene rankings

10.1101/2021.08.23.457305 ◽

2021 ◽

Author(s):

Hongru Shen ◽

Xilin Shen ◽

Mengyao Feng ◽

Dan Wu ◽

Chao Zhang ◽

...

Keyword(s):

Single Cell ◽

Large Scale ◽

Single Cells ◽

Gene Interaction ◽

Cell Types ◽

Specific Cell ◽

Expression Data ◽

Gene Interaction Networks ◽

Universal Approach ◽

Cell Expression

Advances in single-cell RNA sequencing have led to an exponential accumulation of single-cell expression data. However, there is still a lack of tools that can integrate this ever-growing body of single-cell expression data. Here, we present iSEEEK, a universal approach for integrating super large-scale single-cell expression data by exploring the expression rankings of top-expressing genes. We developed iSEEEK with 13.7 million single cells. We demonstrated the efficiency of iSEEEK on canonical single-cell downstream tasks using five heterogeneous datasets encompassing human and mouse samples. iSEEEK achieved good clustering performance when benchmarked against well-annotated cell labels. In addition, iSEEEK could transfer the knowledge learned from large-scale expression data to new datasets that were not involved in its development. iSEEEK enables identification of gene-gene interaction networks that are characteristic of specific cell types. Our study presents a simple yet effective method to integrate super large-scale single-cell transcriptomes and would facilitate translational single-cell research from bench to bedside.
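One way to picture "expression rankings of top-expressing genes" is to turn each cell's expression vector into an ordered list of gene names, highest expression first, truncated to a fixed length. The sketch below shows only that conversion. The cutoff `top_k`, the tie-breaking by gene name, the removal of unexpressed genes, and the gene names are assumptions for illustration; the abstract does not state how iSEEEK builds or consumes its rankings.

```python
def top_gene_ranking(expression, top_k):
    """Return up to top_k gene names ordered by decreasing expression.

    expression: dict mapping gene name to an expression value for one cell.
    Genes with zero expression are dropped; ties are broken alphabetically
    so that the output is deterministic.
    """
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    expressed.sort(key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in expressed[:top_k]]

# Hypothetical cell with four genes.
toy_cell = {"GENE_A": 5, "GENE_B": 0, "GENE_C": 9, "GENE_D": 5}
toy_ranking = top_gene_ranking(toy_cell, top_k=3)
```

Because only the order of genes is kept, a representation of this kind does not change when a cell's values are rescaled by a positive constant, which may be part of what makes rankings attractive for combining datasets produced at different depths.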

Single cell transcriptome atlas of mouse mammary epithelial cells across development

Breast Cancer Research ◽

10.1186/s13058-021-01445-4 ◽

2021 ◽

Vol 23 (1) ◽

Author(s):

Bhupinder Pal ◽

Yunshun Chen ◽

Michael J. G. Milevskiy ◽

François Vaillant ◽

Lexie Prokopuk ◽

...

Keyword(s):

Epithelial Cells ◽

Single Cell ◽

Developmental Stages ◽

Mammary Epithelial Cells ◽

Expression Profiles ◽

Single Cells ◽

Chromatin Accessibility ◽

Mammary Epithelial ◽

Cell Transcriptome ◽

Single Cell Transcriptome

Abstract: Background: Heterogeneity within the mouse mammary epithelium and potential lineage relationships have been recently explored by single-cell RNA profiling. To further understand how cellular diversity changes during mammary ontogeny, we profiled single cells from nine different developmental stages spanning late embryogenesis, early postnatal, prepuberty, adult, mid-pregnancy, late-pregnancy, and post-involution, as well as the transcriptomes of micro-dissected terminal end buds (TEBs) and subtending ducts during puberty. Methods: The single cell transcriptomes of 132,599 mammary epithelial cells from 9 different developmental stages were determined on the 10x Genomics Chromium platform, and integrative analyses were performed to compare specific time points. Results: The mammary rudiment at E18.5 closely aligned with the basal lineage, while prepubertal epithelial cells exhibited lineage segregation but to a less differentiated state than their adult counterparts. Comparison of micro-dissected TEBs versus ducts showed that luminal cells within TEBs harbored intermediate expression profiles. Ductal basal cells exhibited increased chromatin accessibility of luminal genes compared to their TEB counterparts, suggesting that lineage-specific chromatin is established within the subtending ducts during puberty. An integrative analysis of five stages spanning the pregnancy cycle revealed distinct stage-specific profiles and the presence of cycling basal, mixed-lineage, and 'late' alveolar intermediates in pregnancy. Moreover, a number of intermediates were uncovered along the basal-luminal progenitor cell axis, suggesting a continuum of alveolar-restricted progenitor states. Conclusions: This extended single cell transcriptome atlas of mouse mammary epithelial cells provides the most complete coverage for mammary epithelial cells during morphogenesis to date. Together with chromatin accessibility analysis of TEB structures, it represents a valuable framework for understanding developmental decisions within the mouse mammary gland.

484 Bioturing browser: interactively explore public single cell sequencing data

Journal for ImmunoTherapy of Cancer ◽

10.1136/jitc-2020-sitc2020.0484 ◽

2020 ◽

Vol 8 (Suppl 3) ◽

pp. A520-A520

Author(s):

Son Pham ◽

Tri Le ◽

Tan Phan ◽

Minh Pham ◽

Huy Nguyen ◽

...

Keyword(s):

Single Cell ◽

Immune Cell ◽

Expression Profiles ◽

Meta Analysis ◽

Cell Types ◽

Sequencing Data ◽

Single Cell Sequencing ◽

Data Formats ◽

Cancer Types ◽

Cell Data

Background: Single-cell sequencing technology has opened an unprecedented ability to interrogate cancer. It reveals significant insights into intratumoral heterogeneity, metastasis, and therapeutic resistance, which facilitates target discovery and validation in cancer treatment. With rapid advancements in throughput and strategies, a particular immuno-oncology study can produce multi-omics profiles for several thousands of individual cells. This overflow of single-cell data poses formidable challenges, including standardizing data formats across studies, performing reanalysis of individual datasets, and meta-analysis. Methods: N/A. Results: We present BioTuring Browser, an interactive platform for accessing and reanalyzing published single-cell omics data. The platform currently hosts a curated database of more than 10 million cells from 247 projects, covering more than 120 immune cell types and subtypes, and 15 different cancer types. All data are processed and annotated with standardized labels of cell types, diseases, therapeutic responses, etc., to be instantly accessed and explored in a uniform visualization and analytics interface. Based on this massive curated database, BioTuring Browser supports searching for similar expression profiles, querying a target across datasets and automatic cell type annotation. The platform supports single-cell RNA-seq, CITE-seq and TCR-seq data. BioTuring Browser is now available for download at www.bioturing.com. Conclusions: N/A.