Nonparametric Interrogation of Transcriptional Regulation in Single-Cell RNA and Chromatin Accessibility Multiomic Data

Author(s):  
Yuriko Harigaya ◽  
Zhaojun Zhang ◽  
Hongpan Zhang ◽  
Chongzhi Zang ◽  
Nancy Zhang ◽  
...  

Abstract: Epigenetic control of gene expression is highly cell-type- and context-specific. Yet, despite its complexity, gene regulatory logic can be broken down into modular components consisting of a transcription factor (TF) activating or repressing the expression of a target gene through its binding to a cis-regulatory region. Recent advances in joint profiling of transcription and chromatin accessibility with single-cell resolution offer unprecedented opportunities to interrogate such regulatory logic. Here, we propose a nonparametric approach, TRIPOD, to detect and characterize three-way relationships between a TF, its target gene, and the accessibility of the TF’s binding site, using single-cell RNA and ATAC multiomic data. We apply TRIPOD to interrogate cell-type-specific regulatory logic in peripheral blood mononuclear cells and contrast our results to detections from enhancer databases, cis-eQTL studies, ChIP-seq experiments, and TF knockdown/knockout studies. We then apply TRIPOD to mouse embryonic brain data during neurogenesis and gliogenesis and identify known and novel putative regulatory relationships, validated by ChIP-seq and PLAC-seq. Finally, we demonstrate TRIPOD on SHARE-seq data of differentiating mouse hair follicle cells and identify lineage-specific regulation supported by histone marks for gene activation and super-enhancer annotations.

2021 ◽  
Author(s):  
Yuriko Harigaya ◽  
Zhaojun Zhang ◽  
Hongpan Zhang ◽  
Chongzhi Zang ◽  
Nancy R Zhang ◽  
...  

Epigenetic control of gene expression is highly cell-type- and context-specific. Yet, despite its complexity, gene regulatory logic can be broken down into modular components consisting of a transcription factor (TF) activating or repressing the expression of a target gene through its binding to a cis-regulatory region. Recent advances in joint profiling of transcription and chromatin accessibility with single-cell resolution offer unprecedented opportunities to interrogate such regulatory logic. Here, we propose a nonparametric approach, TRIPOD, to detect and characterize three-way relationships between a TF, its target gene, and the accessibility of the TF's binding site, using single-cell RNA and ATAC multiomic data. We apply TRIPOD to interrogate cell-type-specific regulatory logic in peripheral blood mononuclear cells and contrast our results to detections from enhancer databases, cis-eQTL studies, ChIP-seq experiments, and TF knockdown/knockout studies. We then apply TRIPOD to mouse embryonic brain data during neurogenesis and gliogenesis and identify known and novel putative regulatory relationships, validated by ChIP-seq and PLAC-seq. Finally, we demonstrate TRIPOD on SHARE-seq data of differentiating mouse hair follicle cells and identify lineage-specific regulation supported by histone marks for gene activation and super-enhancer annotations.


2021 ◽  
Vol 12 ◽  
Author(s):  
Zhe Cui ◽  
Ya Cui ◽  
Yan Gao ◽  
Tao Jiang ◽  
Tianyi Zang ◽  
...  

Single-cell Assay for Transposase-Accessible Chromatin sequencing (scATAC-seq) has been widely used to profile genome-wide chromatin accessibility in thousands of individual cells. However, compared with single-cell RNA-seq, the peaks of scATAC-seq are much sparser due to the lower copy numbers (diploid in humans) and the inherent missing signals, which makes it more challenging to classify cell types based on specifically expressed genes or other canonical markers. Here, we present svmATAC, a support vector machine (SVM)-based method for accurately identifying cell types in scATAC-seq datasets by enhancing peak signal strength and imputing signals through patterns of co-accessibility. We applied svmATAC to several scATAC-seq datasets from human immune cells, human hematopoietic system cells, and peripheral blood mononuclear cells. The benchmark results showed that svmATAC is free of literature-based markers and robust across datasets in different libraries and platforms. The source code of svmATAC is available at https://github.com/mrcuizhe/svmATAC under the MIT license.


2017 ◽  
Author(s):  
Hyun Min Kang ◽  
Meena Subramaniam ◽  
Sasha Targ ◽  
Michelle Nguyen ◽  
Lenka Maliskova ◽  
...  

Droplet-based single-cell RNA-sequencing (dscRNA-seq) has enabled rapid, massively parallel profiling of transcriptomes from tens of thousands of cells. Multiplexing samples for single cell capture and library preparation in dscRNA-seq would enable cost-effective designs of differential expression and genetic studies while avoiding technical batch effects, but its implementation remains challenging. Here, we introduce an in-silico algorithm demuxlet that harnesses natural genetic variation to discover the sample identity of each cell and identify droplets containing two cells. These capabilities enable multiplexed dscRNA-seq experiments where cells from unrelated individuals are pooled and captured at higher throughput than standard workflows. To demonstrate the performance of demuxlet, we sequenced 3 pools of peripheral blood mononuclear cells (PBMCs) from 8 lupus patients. Given genotyping data for each individual, demuxlet correctly recovered the sample identity of > 99% of singlets, and identified doublets at rates consistent with previous estimates. In PBMCs, we demonstrate the utility of multiplexed dscRNA-seq in two applications: characterizing cell type specificity and inter-individual variability of cytokine response from 8 lupus patients and mapping genetic variants associated with cell type specific gene expression from 23 donors. Demuxlet is fast, accurate, scalable and could be extended to other single cell datasets that incorporate natural or synthetic DNA barcodes.


Author(s):  
Ting Luo ◽  
Fengping Zheng ◽  
Kang Wang ◽  
Yong Xu ◽  
Huixuan Xu ◽  
...  

Abstract: Background: Immune aberrations in end-stage renal disease (ESRD) are characterized by systemic inflammation and immune deficiency. The mechanistic understanding of this phenomenon remains limited. Methods: We generated 12,981 and 9,578 single-cell transcriptomes of peripheral blood mononuclear cells (PBMCs) that were pooled from 10 healthy volunteers and 10 patients with ESRD by single-cell RNA sequencing. Unsupervised clustering and annotation analyses were performed to cluster and identify cell types. The analysis of hallmark pathway and regulon activity was performed in the main cell types. Results: We identified 14 leukocytic clusters that corresponded to six known PBMC types. The comparison of cells from ESRD patients and healthy individuals revealed multiple changes in biological processes. We noticed an ESRD-related increase in inflammation response, complement cascade and cellular metabolism, as well as a strong decrease in activity related to cell cycle progression in relevant cell types in ESRD. Furthermore, a list of cell type-specific candidate transcription factors (TFs) driving the ESRD-associated transcriptome changes was identified. Conclusions: We generated a distinctive, high-resolution map of ESRD-derived PBMCs. These results revealed cell type-specific ESRD-associated pathways and TFs. Notably, the pooled sample analysis limits the generalization of our results. The generation of larger single-cell datasets will complement the current map and drive advances in therapies that manipulate immune cell function in ESRD.


2021 ◽  
Vol 12 ◽  
Author(s):  
Haiyan Yu ◽  
Xiaoping Hong ◽  
Hongwei Wu ◽  
Fengping Zheng ◽  
Zhipeng Zeng ◽  
...  

Objective: Systemic lupus erythematosus (SLE) is a complex autoimmune disease, and various immune cells are involved in the initiation, progression, and regulation of SLE. Our goal was to reveal the chromatin accessibility landscape of peripheral blood mononuclear cells (PBMCs) in SLE patients at single-cell resolution and identify the transcription factors (TFs) that may drive abnormal immune responses. Methods: The assay for transposase accessible chromatin in single-cell sequencing (scATAC-seq) method was applied to map the landscape of active regulatory DNA in immune cells from SLE patients at single-cell resolution, followed by clustering, peak annotation and motif analysis of PBMCs in SLE. Results: Peripheral blood mononuclear cells were robustly clustered based on their types without using antibodies. We identified twenty patterns of TF activation that drive abnormal immune responses in SLE patients. Then, we observed ten genes that were highly associated with SLE pathogenesis by altering T cell activity. Finally, we found 12 key TFs regulating the above six genes (CD83, ELF4, ITPKB, RAB27A, RUNX3, and ZMIZ1) that may be related to SLE disease pathogenesis and were significantly enriched in SLE patients (p < 0.05, FC > 2). With qPCR experiments on CD83, ELF4, RUNX3, and ZMIZ1 in B cells, we observed a significant difference in the expression of genes (ELF4, RUNX3, and ZMIZ1), which were regulated by seven TFs (EWSR1-FLI1, MAF, MAFA, NFIB, NR2C2 (var. 2), TBX4, and TBX5). Meanwhile, the seven TFs showed highly accessible binding sites in SLE patients. Conclusions: These results confirm the importance of using single-cell sequencing to uncover the real features of immune cells in SLE patients, reveal key TFs in SLE-PBMCs, and provide foundational insights relevant for epigenetic therapy.


2019 ◽  
Author(s):  
Florian Wagner

Abstract: Clustering of cells by cell type is arguably the most common and repetitive task encountered during the analysis of single-cell RNA-Seq data. However, as popular clustering methods operate largely independently of visualization techniques, the fine-tuning of clustering parameters can be unintuitive and time-consuming. Here, I propose Galapagos, a simple and effective clustering workflow based on t-SNE and DBSCAN that does not require a gene selection step. In practice, Galapagos only involves the fine-tuning of two parameters, which is straightforward, as clustering is performed directly on the t-SNE visualization results. Using peripheral blood mononuclear cells as a model tissue, I validate the effectiveness of Galapagos in different ways. First, I show that Galapagos generates clusters corresponding to all main cell types present. Then, I demonstrate that the t-SNE results are robust to parameter choices and initialization points. Next, I employ a simulation approach to show that clustering with Galapagos is accurate and robust to the high levels of technical noise present. Finally, to demonstrate Galapagos’ accuracy on real data, I compare clustering results to true cell type identities established using CITE-Seq data. In this context, I also provide an example of the primary limitation of Galapagos, namely the difficulty of resolving related cell types in cases where t-SNE fails to clearly separate the cells. Galapagos helps to make clustering scRNA-Seq data more intuitive and reproducible, and can be implemented in most programming languages with only a few lines of code.
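
The idea of clustering directly on a two-dimensional embedding can be illustrated with a minimal sketch. This is not the Galapagos implementation: the coordinates below are hypothetical stand-ins for t-SNE output, the function `dbscan` is a plain brute-force version written for illustration, and treating `eps` and `min_samples` as the two tuned parameters is an assumption, since the abstract does not name them.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Brute-force DBSCAN: O(n^2) distance checks, fine for a small demo.
    """
    n = len(points)
    # Neighbourhood of each point within radius eps (a point is its own neighbour).
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is also core: keep expanding
    return labels


# Hypothetical 2-D coordinates standing in for a t-SNE embedding:
# two tight groups of cells and one isolated cell.
embedding = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
labels = dbscan(embedding, eps=0.3, min_samples=3)
```

Because the clustering operates on the same coordinates that are plotted, the effect of changing `eps` or `min_samples` can be checked visually against the embedding, which is the kind of intuition the abstract describes.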


2018 ◽  
Author(s):  
Anja Mezger ◽  
Sandy Klemm ◽  
Ishminder Mann ◽  
Kara Brower ◽  
Alain Mir ◽  
...  

We have developed a high-throughput single-cell ATAC-seq (assay for transposition of accessible chromatin) method to measure physical access to DNA in whole cells. Our approach integrates fluorescence imaging and addressable reagent deposition across a massively parallel (5184) nano-well array, yielding a nearly 20-fold improvement in throughput (up to ~1800 cells/chip, 4-5 hour on-chip processing time) and cost (~98¢ per cell) compared to prior microfluidic implementations. We applied this method to measure regulatory variation in Peripheral Blood Mononuclear Cells (PBMCs) and show robust, de novo clustering of single cells by hematopoietic cell type.


2021 ◽  
Vol 17 (11) ◽  
pp. e1009548
Author(s):  
Qunlun Shen ◽  
Shihua Zhang

With the rapid accumulation of biological omics datasets, decoding the underlying relationships of cross-dataset genes becomes an important issue. Previous studies have attempted to identify differentially expressed genes across datasets. However, it is hard for them to detect interrelated ones. Moreover, existing correlation-based algorithms can only measure the relationship between genes within a single dataset or two multi-modal datasets from the same samples. It is still unclear how to quantify the strength of association of the same gene across two biological datasets with different samples. To this end, we propose Approximate Distance Correlation (ADC) to select interrelated genes with statistical significance across two different biological datasets. ADC first obtains the k most correlated genes for each target gene as its approximate observations, and then calculates the distance correlation (DC) for the target gene across two datasets. ADC repeats this process for all genes and then performs the Benjamini-Hochberg adjustment to control the false discovery rate. We demonstrate the effectiveness of ADC with simulation data and four real applications to select highly interrelated genes across two datasets. These four applications include 21 cancer RNA-seq datasets of different tissues; six single-cell RNA-seq (scRNA-seq) datasets of mouse hematopoietic cells across six different cell types along the hematopoietic cell lineage; five scRNA-seq datasets of pancreatic islet cells across five different technologies; and coupled single-cell ATAC-seq (scATAC-seq) and scRNA-seq data of peripheral blood mononuclear cells (PBMCs). Extensive results demonstrate that ADC is a powerful tool to uncover interrelated genes with strong biological implications and is scalable to large-scale datasets. Moreover, the number of such genes can serve as a metric to measure the similarity between two datasets, which could characterize the relative difference of diverse cell types and technologies.
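
Two building blocks named in this abstract, distance correlation and the Benjamini-Hochberg adjustment, can be sketched in a few lines. This is a generic illustration, not the ADC method itself: it does not select the k most correlated genes or derive p-values, and the function names, the toy vectors, and the p-values are all hypothetical.

```python
from math import sqrt


def _centred_distances(v):
    """Pairwise |v_i - v_j| matrix, double-centred (row, column, grand mean)."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D sequences."""
    n = len(x)
    a, b = _centred_distances(x), _centred_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j]
                   for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)                             # squared distance covariance
    dvar = sqrt(mean_product(a, a) * mean_product(b, b))   # product of distance std devs
    return sqrt(max(dcov2, 0.0) / dvar) if dvar > 0 else 0.0


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy inputs (hypothetical): y is an exact linear function of x.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0 * value + 1.0 for value in x]
dc_linear = distance_correlation(x, y)

# Hypothetical per-gene p-values to adjust.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Unlike Pearson correlation, distance correlation is zero in the population only under independence, which is presumably why it suits the detection of non-linear relationships between genes.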


2021 ◽  
Author(s):  
Emily Stephenson ◽  
Gary Reynolds ◽  
Rachel A. Botting ◽  
Fernando J. Calero-Nieto ◽  
...  

Abstract: Analysis of human blood immune cells provides insights into the coordinated response to viral infections such as severe acute respiratory syndrome coronavirus 2, which causes coronavirus disease 2019 (COVID-19). We performed single-cell transcriptome, surface proteome and T and B lymphocyte antigen receptor analyses of over 780,000 peripheral blood mononuclear cells from a cross-sectional cohort of 130 patients with varying severities of COVID-19. We identified expansion of nonclassical monocytes expressing complement transcripts (CD16+C1QA/B/C+) that sequester platelets and were predicted to replenish the alveolar macrophage pool in COVID-19. Early, uncommitted CD34+ hematopoietic stem/progenitor cells were primed toward megakaryopoiesis, accompanied by expanded megakaryocyte-committed progenitors and increased platelet activation. Clonally expanded CD8+ T cells and an increased ratio of CD8+ effector T cells to effector memory T cells characterized severe disease, while circulating follicular helper T cells accompanied mild disease. We observed a relative loss of IgA2 in symptomatic disease despite an overall expansion of plasmablasts and plasma cells. Our study highlights the coordinated immune response that contributes to COVID-19 pathogenesis and reveals discrete cellular components that can be targeted for therapy.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Ailu Chen ◽  
Maria P. Diaz-Soto ◽  
Miguel F. Sanmamed ◽  
Taylor Adams ◽  
Jonas C. Schupp ◽  
...  

Abstract: Background: Asthma has been associated with impaired interferon response. Multiple cell types have been implicated in such response impairment and may be responsible for asthma immunopathology. However, existing models to study the immune response in asthma are limited by bulk profiling of cells. Our objective was to characterize a model of peripheral blood mononuclear cells (PBMCs) of patients with severe asthma (SA) and its response to the TLR3 agonist Poly I:C using two single-cell methods. Methods: Two complementary single-cell methods, DropSeq for single-cell RNA sequencing (scRNA-Seq) and mass cytometry (CyTOF), were used to profile PBMCs of SA patients and healthy controls (HC). Poly I:C-stimulated and unstimulated cells were analyzed in this study. Results: PBMCs (n = 9414) from five SA (n = 6099) and three HC (n = 3315) were profiled using scRNA-Seq. Six main cell subsets, namely CD4+ T cells, CD8+ T cells, natural killer (NK) cells, B cells, dendritic cells (DCs), and monocytes, were identified. CD4+ T cells were the main cell type in SA and demonstrated a pro-inflammatory profile characterized by increased JAK1 expression. Following Poly I:C stimulation, PBMCs from SA had a robust induction of interferon pathways compared with HC. CyTOF profiling of Poly I:C-stimulated and unstimulated PBMCs (n = 160,000) from the same individuals (SA = 5; HC = 3) demonstrated higher CD8+ and CD8+ effector T cells in SA at baseline, followed by a decrease of CD8+ effector T cells after Poly I:C stimulation. Conclusions: Single-cell profiling of an in vitro model using PBMCs in patients with SA identified activation of pro-inflammatory pathways at baseline and strong response to Poly I:C, as well as quantitative changes in CD8+ effector cells. Thus, transcriptomic and cell quantitative changes are associated with immune cell heterogeneity in this model to evaluate interferon responses in severe asthma.

