A deep generative model for multi-view profiling of single-cell RNA-seq and ATAC-seq data

2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Gaoyang Li ◽  
Shaliu Fu ◽  
Shuguang Wang ◽  
Chenyu Zhu ◽  
Bin Duan ◽  
...  

Abstract Here, we present a multi-modal deep generative model, the single-cell Multi-View Profiler (scMVP), which is designed for handling sequencing data that simultaneously measure gene expression and chromatin accessibility in the same cell, including SNARE-seq, sci-CAR, Paired-seq, SHARE-seq, and Multiome from 10X Genomics. scMVP generates common latent representations for dimensionality reduction, cell clustering, and developmental trajectory inference and generates separate imputations for differential analysis and cis-regulatory element identification. scMVP can help mitigate data sparsity issues with imputation and accurately identify cell groups for different joint profiling techniques with common latent embedding, and we demonstrate its advantages on several realistic datasets.

2019 ◽  
Author(s):  
Bin Li ◽  
Young Li ◽  
Kun Li ◽  
Lianbang Zhu ◽  
Qiaoni Yu ◽  
...  

Abstract The development of sequencing technologies has promoted the survey of genome-wide chromatin accessibility at single-cell resolution; however, comprehensive analysis of single-cell epigenomic profiles remains a challenge. Here, we introduce an accessibility pattern-based epigenomic clustering (APEC) method, which classifies each individual cell by groups of accessible regions with synergistic signal patterns termed “accessons”. By integrating with other analytical tools, this Python-based APEC package greatly improves the accuracy of unsupervised single-cell clustering for many different public data sets. APEC also predicts gene expression, identifies significantly differentially enriched motifs, discovers super enhancers, and projects pseudotime trajectories. Furthermore, we adopted a fluorescent tagmentation-based single-cell ATAC-seq technique (ftATAC-seq) to investigate the per-cell regulome dynamics of mouse thymocytes. Associated with ftATAC-seq, APEC revealed a detailed epigenomic heterogeneity of thymocytes, characterized the developmental trajectory and predicted the regulators that control the stages of the maturation process. Overall, this work illustrates a powerful approach to study single-cell epigenomic heterogeneity and regulome dynamics.


Blood ◽  
2020 ◽  
Vol 136 (7) ◽  
pp. 845-856 ◽  
Author(s):  
Qin Zhu ◽  
Peng Gao ◽  
Joanna Tober ◽  
Laura Bennett ◽  
Changya Chen ◽  
...  

Abstract Hematopoietic stem and progenitor cells (HSPCs) in the bone marrow are derived from a small population of hemogenic endothelial (HE) cells located in the major arteries of the mammalian embryo. HE cells undergo an endothelial to hematopoietic cell transition, giving rise to HSPCs that accumulate in intra-arterial clusters (IAC) before colonizing the fetal liver. To examine the cell and molecular transitions between endothelial (E), HE, and IAC cells, and the heterogeneity of HSPCs within IACs, we profiled ∼40 000 cells from the caudal arteries (dorsal aorta, umbilical, vitelline) of 9.5 days post coitus (dpc) to 11.5 dpc mouse embryos by single-cell RNA sequencing and single-cell assay for transposase-accessible chromatin sequencing. We identified a continuous developmental trajectory from E to HE to IAC cells, with identifiable intermediate stages. The intermediate stage most proximal to HE, which we term pre-HE, is characterized by increased accessibility of chromatin enriched for SOX, FOX, GATA, and SMAD motifs. A developmental bottleneck separates pre-HE from HE, with RUNX1 dosage regulating the efficiency of the pre-HE to HE transition. A distal candidate Runx1 enhancer exhibits high chromatin accessibility specifically in pre-HE cells at the bottleneck, but loses accessibility thereafter. Distinct developmental trajectories within IAC cells result in 2 populations of CD45+ HSPCs; an initial wave of lymphomyeloid-biased progenitors, followed by precursors of hematopoietic stem cells (pre-HSCs). This multiomics single-cell atlas significantly expands our understanding of pre-HSC ontogeny.


Author(s):  
Xin Chen ◽  
Zhaowei Yang ◽  
Wanqiu Chen ◽  
Yongmei Zhao ◽  
Andrew Farmer ◽  
...  

Abstract Single-cell RNA sequencing (scRNA-seq) is developing rapidly, and investigators seeking to use this technology are left with a variety of options for both experimental platform and bioinformatics methods. There is an urgent need for scRNA-seq reference datasets for benchmarking of different scRNA-seq platforms and bioinformatics methods. To be broadly applicable, these should be generated from renewable, well-characterized reference samples and processed in multiple centers across different platforms. Here we present a benchmarking scRNA-seq dataset that includes 20 scRNA-seq datasets acquired either as mixtures or as individual samples from two biologically distinct cell lines for which a large amount of multi-platform whole genome sequencing data are also available. These scRNA-seq datasets were generated from multiple popular platforms across four sequencing centers. Our benchmark datasets provide a resource that we believe will have great value for the single-cell community by serving as a reference dataset for evaluating various bioinformatics methods for scRNA-seq analyses, including but not limited to data preprocessing, imputation, normalization, clustering, batch correction, and differential analysis.


GigaScience ◽  
2019 ◽  
Vol 8 (10) ◽  
Author(s):  
Yun-Ching Chen ◽  
Abhilash Suresh ◽  
Chingiz Underbayev ◽  
Clare Sun ◽  
Komudi Singh ◽  
...  

Abstract Background: In single-cell RNA-sequencing analysis, clustering cells into groups and differentiating cell groups by differentially expressed (DE) genes are 2 separate steps for investigating cell identity. However, the ability to differentiate between cell groups could be affected by clustering. This interdependency often creates a bottleneck in the analysis pipeline, requiring researchers to repeat these 2 steps multiple times by setting different clustering parameters to identify a set of cell groups that are more differentiated and biologically relevant. Findings: To accelerate this process, we have developed IKAP, an algorithm that identifies major cell groups and improves the differentiation of cell groups by systematically tuning parameters for clustering. We demonstrate that, with default parameters, IKAP successfully identifies major cell types such as T cells, B cells, natural killer cells, and monocytes in 2 peripheral blood mononuclear cell datasets and recovers major cell types in a previously published mouse cortex dataset. These major cell groups identified by IKAP present more distinguishing DE genes compared with cell groups generated by different combinations of clustering parameters. We further show that cell subtypes can be identified by recursively applying IKAP within identified major cell types, thereby delineating cell identities in a multi-layered ontology. Conclusions: By tuning the clustering parameters to identify major cell groups, IKAP greatly improves the automation of single-cell RNA-sequencing analysis to produce distinguishing DE genes and refine cell ontology using single-cell RNA-sequencing data.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Wenbao Yu ◽  
Yasin Uzun ◽  
Qin Zhu ◽  
Changya Chen ◽  
Kai Tan

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Xin Chen ◽  
Zhaowei Yang ◽  
Wanqiu Chen ◽  
Yongmei Zhao ◽  
Andrew Farmer ◽  
...  

Abstract Single-cell RNA sequencing (scRNA-seq) is developing rapidly, and investigators seeking to use this technology are left with a variety of options for both experimental platform and bioinformatics methods. There is an urgent need for scRNA-seq reference datasets for benchmarking of different scRNA-seq platforms and bioinformatics methods. To be broadly applicable, these should be generated from renewable, well-characterized reference samples and processed in multiple centers across different platforms. Here we present a benchmark scRNA-seq dataset that includes 20 scRNA-seq datasets acquired either as mixtures or as individual samples from two biologically distinct cell lines for which a large amount of multi-platform whole genome sequencing data are also available. These scRNA-seq datasets were generated from multiple popular platforms across four sequencing centers. We believe the datasets we describe here will provide a resource that meets this need by allowing evaluation of various bioinformatics methods for scRNA-seq analyses, including but not limited to data preprocessing, imputation, normalization, clustering, batch correction, and differential analysis.


2017 ◽  
Author(s):  
Jason D Buenrostro ◽  
M Ryan Corces ◽  
Beijing Wu ◽  
Alicia N Schep ◽  
Caleb A Lareau ◽  
...  

Abstract Normal human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While epigenomic landscapes of this process have been explored in immunophenotypically-defined populations, the single-cell regulatory variation that defines hematopoietic differentiation has been hidden by ensemble averaging. We generated single-cell chromatin accessibility landscapes across 8 populations of immunophenotypically-defined human hematopoietic cell types. Using bulk chromatin accessibility profiles to scaffold our single-cell data analysis, we constructed an epigenomic landscape of human hematopoiesis and characterized epigenomic heterogeneity within phenotypically sorted populations to find epigenomic lineage-bias toward different developmental branches in multipotent stem cell states. We identify and isolate sub-populations within classically-defined granulocyte-macrophage progenitors (GMPs) and use ATAC-seq and RNA-seq to confirm that GMPs are epigenomically and transcriptomically heterogeneous. Furthermore, we identified transcription factors and cis-regulatory elements linked to changes in chromatin accessibility within cellular populations and across a continuous myeloid developmental trajectory, and observe that relatively simple TF motif dynamics give rise to a broad diversity of accessibility dynamics at cis-regulatory elements. Overall, this work provides a template for exploration of complex regulatory dynamics in primary human tissues at the ultimate level of granular specificity – the single cell. One Sentence Summary: Single cell chromatin accessibility reveals a high-resolution, continuous landscape of regulatory variation in human hematopoiesis.


Author(s):  
Jeffrey M. Granja ◽  
M. Ryan Corces ◽  
Sarah E. Pierce ◽  
S. Tansu Bagdatli ◽  
Hani Choudhry ◽  
...  

Abstract The advent of large-scale single-cell chromatin accessibility profiling has accelerated our ability to map gene regulatory landscapes, but has outpaced the development of robust, scalable software to rapidly extract biological meaning from these data. Here we present a software suite for single-cell analysis of regulatory chromatin in R (ArchR; www.ArchRProject.com) that enables fast and comprehensive analysis of single-cell chromatin accessibility data. ArchR provides an intuitive, user-focused interface for complex single-cell analyses including doublet removal, single-cell clustering and cell type identification, robust peak set generation, cellular trajectory identification, DNA element to gene linkage, transcription factor footprinting, mRNA expression level prediction from chromatin accessibility, and multi-omic integration with scRNA-seq. Enabling the analysis of over 1.2 million single cells within 8 hours on a standard Unix laptop, ArchR is a comprehensive analytical suite for end-to-end analysis of single-cell chromatin accessibility data that will accelerate the understanding of gene regulation at the resolution of individual cells.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Boying Gong ◽  
Yun Zhou ◽  
Elizabeth Purdom

Abstract A growing number of single-cell sequencing platforms enable joint profiling of multiple omics from the same cells. We present , a novel method that not only allows for analyzing the data from joint-modality platforms, but provides a coherent framework for the integration of multiple datasets measured on different modalities. We demonstrate its performance on multi-modality data of gene expression and chromatin accessibility and illustrate the integration abilities of by jointly analyzing this multi-modality data with single-cell RNA-seq and ATAC-seq datasets.

