Unsupervised manifold alignment for single-cell multi-omics data

Author(s):  
Ritambhara Singh ◽  
Pinar Demetci ◽  
Giancarlo Bonora ◽  
Vijay Ramani ◽  
Choli Lee ◽  
...  

Abstract Integrating single-cell measurements that capture different properties of the genome is vital to extending our understanding of genome biology. This task is challenging due to the lack of a shared axis across datasets obtained from different types of single-cell experiments. For most such datasets, we lack corresponding information among the cells (samples) and the measurements (features). In this scenario, unsupervised algorithms that are capable of aligning single-cell experiments are critical to learning an in silico co-assay that can help draw correspondences among the cells. Maximum mean discrepancy-based manifold alignment (MMD-MA) is such an unsupervised algorithm. Without requiring correspondence information, it can align single-cell datasets from different modalities in a common shared latent space, showing promising results on simulations and a small-scale single-cell experiment with 61 cells. However, it is essential to explore the applicability of this method to larger single-cell experiments with thousands of cells so that it can be of practical interest to the community. In this paper, we apply MMD-MA to two recent datasets that measure transcriptome and chromatin accessibility in ~2000 single cells. To scale the runtime of MMD-MA to a more substantial number of cells, we extend the original implementation to run on GPUs. We also introduce a method to automatically select one of the user-defined parameters, thus reducing the hyperparameter search space. We demonstrate that the proposed extensions allow MMD-MA to accurately align state-of-the-art single-cell experiments.
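
To make the quantity in the method's name concrete, the following is a minimal NumPy sketch of a (biased) squared maximum mean discrepancy estimate between two sets of points that already live in a shared latent space. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, and the function names `gaussian_kernel` and `mmd_squared` are assumptions chosen for illustration, and the GPU extension and parameter selection mentioned in the abstract are not shown.

```python
import numpy as np


def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd_squared(x, y, sigma=1.0):
    """Biased estimate of squared MMD between samples x and y.

    x and y are (n_cells, n_latent_dims) arrays; sigma is an assumed
    kernel bandwidth, not a value taken from the paper.
    """
    k_xx = gaussian_kernel(x, x, sigma).mean()
    k_yy = gaussian_kernel(y, y, sigma).mean()
    k_xy = gaussian_kernel(x, y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


# Synthetic illustration only: two "modalities" embedded in 2 latent dimensions.
rng = np.random.default_rng(0)
latent_a = rng.normal(size=(200, 2))
latent_b_aligned = rng.normal(size=(200, 2))           # same distribution
latent_b_shifted = rng.normal(loc=3.0, size=(200, 2))  # poorly aligned

print(mmd_squared(latent_a, latent_b_aligned))
print(mmd_squared(latent_a, latent_b_shifted))
```

A smaller value indicates that the two embedded point clouds are harder to tell apart as distributions, which is why a term of this kind can serve as an alignment criterion when no cell-to-cell correspondences are available.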

2021 ◽  
Author(s):  
Ziqi Zhang ◽  
Chengkai Yang ◽  
Xiuwei Zhang

Single cell multi-omics studies allow researchers to understand cell differentiation and development mechanisms in a more comprehensive manner. Single cell ATAC-sequencing (scATAC-seq) measures the chromatin accessibility of cells, and computational methods have been proposed to integrate scATAC-seq with scRNA-seq data of cells from the same cell types. This computational task is particularly challenging when the two modalities are not profiled from the same cells. Some existing methods first transform the scATAC-seq data into scRNA-seq data and integrate two scRNA-seq datasets, but how to perform the transformation is still a difficult problem. In addition, most of the existing methods to integrate scRNA-seq and scATAC-seq data focus on preserving distinct cell clusters before and after the integration, and it is not clear whether these methods can preserve the continuous trajectories for cells from continuous development or differentiation processes. We propose scDART, a scalable deep learning framework that embeds the two data modalities of single cells, scRNA-seq and scATAC-seq data, into a shared low-dimensional latent space while preserving cell trajectory structures. Furthermore, scDART learns a nonlinear function represented by a neural network encoding the cross-modality relationship simultaneously when learning the latent space representations of the integrated dataset. We test scDART on both real and simulated datasets, and compare it with the state-of-the-art methods. We show that scDART is able to integrate scRNA-seq and scATAC-seq data well while preserving the continuous cell trajectories. scDART also predicts scRNA-seq data accurately from the scATAC-seq data using the neural network module that represents cross-modality relationships.


2021 ◽  
Vol 23 (1) ◽  
Author(s):  
Bhupinder Pal ◽  
Yunshun Chen ◽  
Michael J. G. Milevskiy ◽  
François Vaillant ◽  
Lexie Prokopuk ◽  
...  

Abstract Background Heterogeneity within the mouse mammary epithelium and potential lineage relationships have been recently explored by single-cell RNA profiling. To further understand how cellular diversity changes during mammary ontogeny, we profiled single cells from nine different developmental stages spanning late embryogenesis, early postnatal, prepuberty, adult, mid-pregnancy, late-pregnancy, and post-involution, as well as the transcriptomes of micro-dissected terminal end buds (TEBs) and subtending ducts during puberty. Methods The single cell transcriptomes of 132,599 mammary epithelial cells from 9 different developmental stages were determined on the 10x Genomics Chromium platform, and integrative analyses were performed to compare specific time points. Results The mammary rudiment at E18.5 closely aligned with the basal lineage, while prepubertal epithelial cells exhibited lineage segregation but to a less differentiated state than their adult counterparts. Comparison of micro-dissected TEBs versus ducts showed that luminal cells within TEBs harbored intermediate expression profiles. Ductal basal cells exhibited increased chromatin accessibility of luminal genes compared to their TEB counterparts suggesting that lineage-specific chromatin is established within the subtending ducts during puberty. An integrative analysis of five stages spanning the pregnancy cycle revealed distinct stage-specific profiles and the presence of cycling basal, mixed-lineage, and 'late' alveolar intermediates in pregnancy. Moreover, a number of intermediates were uncovered along the basal-luminal progenitor cell axis, suggesting a continuum of alveolar-restricted progenitor states. Conclusions This extended single cell transcriptome atlas of mouse mammary epithelial cells provides the most complete coverage for mammary epithelial cells during morphogenesis to date. 
Together with chromatin accessibility analysis of TEB structures, it represents a valuable framework for understanding developmental decisions within the mouse mammary gland.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Shengquan Chen ◽  
Guanao Yan ◽  
Wenyu Zhang ◽  
Jinzhao Li ◽  
Rui Jiang ◽  
...  

Abstract The recent advancements in single-cell technologies, including single-cell chromatin accessibility sequencing (scCAS), have enabled profiling the epigenetic landscapes for thousands of individual cells. However, the characteristics of scCAS data, including high dimensionality, high degree of sparsity and high technical variation, make the computational analysis challenging. Reference-guided approaches, which utilize the information in existing datasets, may facilitate the analysis of scCAS data. Here, we present RA3 (Reference-guided Approach for the Analysis of single-cell chromatin Accessibility data), which utilizes the information in massive existing bulk chromatin accessibility and annotated scCAS data. RA3 simultaneously models (1) the shared biological variation among scCAS data and the reference data, and (2) the unique biological variation in scCAS data that identifies distinct subpopulations. We show that RA3 achieves superior performance when used on several scCAS datasets, and on references constructed using various approaches. Altogether, these analyses demonstrate the wide applicability of RA3 in analyzing scCAS data.


Author(s):  
Pinar Demetci ◽  
Rebecca Santorella ◽  
Björn Sandstede ◽  
William Stafford Noble ◽  
Ritambhara Singh

Abstract Data integration of single-cell measurements is critical for understanding cell development and disease, but the lack of correspondence between different types of measurements makes such efforts challenging. Several unsupervised algorithms can align heterogeneous single-cell measurements in a shared space, enabling the creation of mappings between single cells in different data domains. However, these algorithms require hyperparameter tuning for high-quality alignments, which is difficult in an unsupervised setting without correspondence information for validation. We present Single-Cell alignment using Optimal Transport (SCOT), an unsupervised learning algorithm that uses Gromov Wasserstein-based optimal transport to align single-cell multi-omics datasets. We compare the alignment performance of SCOT with state-of-the-art algorithms on four simulated and two real-world datasets. SCOT performs on par with state-of-the-art methods but is faster and requires tuning fewer hyperparameters. Furthermore, we provide an algorithm for SCOT to use Gromov Wasserstein distance to guide the parameter selection. Thus, unlike previous methods, SCOT aligns well without using any orthogonal correspondence information to pick the hyperparameters. Our source code and scripts for replicating the results are available at https://github.com/rsinghlab/SCOT.
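
As a rough illustration of the Gromov-Wasserstein idea the abstract refers to, the sketch below evaluates the square-loss Gromov-Wasserstein objective for a given coupling between two datasets, using only each dataset's own pairwise distances. It is not SCOT's code and it does not solve for the optimal coupling: the Euclidean distances, the square loss, and the function names `pairwise_distances` and `gw_cost` are assumptions made for this example.

```python
import numpy as np


def pairwise_distances(x):
    """Euclidean distance matrix between the rows of x."""
    sq = np.sum(x**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.sqrt(np.maximum(d2, 0.0))


def gw_cost(c1, c2, coupling):
    """Square-loss Gromov-Wasserstein objective for a fixed coupling.

    Computes sum_{i,j,k,l} (c1[i,k] - c2[j,l])**2 * T[i,j] * T[k,l]
    by expanding the square, which avoids building a 4-index tensor.
    c1 is (n, n), c2 is (m, m), coupling T is (n, m).
    """
    p = coupling.sum(axis=1)  # marginal over cells of dataset 1
    q = coupling.sum(axis=0)  # marginal over cells of dataset 2
    term1 = p @ (c1**2) @ p
    term2 = q @ (c2**2) @ q
    cross = np.sum((c1 @ coupling @ c2.T) * coupling)
    return term1 + term2 - 2.0 * cross


# Synthetic illustration: the second dataset is a shuffled copy of the first,
# so its intra-dataset geometry is identical up to a relabeling of cells.
rng = np.random.default_rng(0)
n = 6
x = rng.normal(size=(n, 3))
perm = rng.permutation(n)
y = x[perm]

c1 = pairwise_distances(x)
c2 = pairwise_distances(y)

matched = np.zeros((n, n))
matched[perm, np.arange(n)] = 1.0 / n      # pairs each cell with its copy
uniform = np.full((n, n), 1.0 / (n * n))   # uninformative coupling

print(gw_cost(c1, c2, matched))
print(gw_cost(c1, c2, uniform))
```

Because the objective compares distances within each dataset rather than features across datasets, it can be evaluated without any shared features or known correspondences, which is what makes a quantity of this kind usable as an unsupervised guide for choosing hyperparameters.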


eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Elliott Swanson ◽  
Cara Lord ◽  
Julian Reading ◽  
Alexander T Heubeck ◽  
Palak C Genge ◽  
...  

Single-cell measurements of cellular characteristics have been instrumental in understanding the heterogeneous pathways that drive differentiation, cellular responses to signals, and human disease. Recent advances have allowed paired capture of protein abundance and transcriptomic state, but a lack of epigenetic information in these assays has left a missing link to gene regulation. Using the heterogeneous mixture of cells in human peripheral blood as a test case, we developed a novel scATAC-seq workflow that increases signal-to-noise and allows paired measurement of cell surface markers and chromatin accessibility: integrated cellular indexing of chromatin landscape and epitopes, called ICICLE-seq. We extended this approach using a droplet-based multiomics platform to develop a trimodal assay that simultaneously measures transcriptomics (scRNA-seq), epitopes, and chromatin accessibility (scATAC-seq) from thousands of single cells, which we term TEA-seq. Together, these multimodal single-cell assays provide a novel toolkit to identify type-specific gene regulation and expression grounded in phenotypically defined cell types.


2019 ◽  
Vol 66 (1) ◽  
pp. 217-228 ◽  
Author(s):  
Daniel Zucha ◽  
Peter Androvic ◽  
Mikael Kubista ◽  
Lukas Valihrach

Abstract BACKGROUND Recent advances allowing quantification of RNA from single cells are revolutionizing biology and medicine. Currently, almost all single-cell transcriptomic protocols rely on reverse transcription (RT). However, RT is recognized as a known source of variability, particularly with low amounts of RNA. Recently, several new reverse transcriptases (RTases) with the potential to decrease the loss of information have been developed, but knowledge of their performance is limited. METHODS We compared the performance of 11 RTases in quantitative reverse transcription PCR (RT-qPCR) on single-cell and 100-cell bulk templates, using 2 priming strategies: a conventional mixture of random hexamers with oligo(dT)s and a reduced concentration of oligo(dT)s mimicking common single-cell RNA-sequencing protocols. Depending on their performance, 2 RTases were further tested in a high-throughput single-cell experiment. RESULTS All tested RTases demonstrated high precision (R2 > 0.9445). The most pronounced differences were found in their ability to capture rare transcripts (0%–90% reaction positivity rate) and in their absolute reaction yield (7.3%–137.9%). RTase performance and reproducibility were compared with Z scores. The 2 best-performing enzymes were Maxima H− and SuperScript IV. The validity of the obtained results was confirmed in a follow-up single-cell model experiment. The better-performing enzyme (Maxima H−) increased the sensitivity of the single-cell experiment and improved resolution in the clustering analysis over the commonly used RTase (SuperScript II). CONCLUSIONS Our comprehensive comparison of 11 RTases in low RNA input conditions identified 2 best-performing enzymes. Our results provide a point of reference for the improvement of current single-cell quantification protocols.


2020 ◽  
Author(s):  
Junpeng Zhang ◽  
Lin Liu ◽  
Taosheng Xu ◽  
Wu Zhang ◽  
Chunwen Zhao ◽  
...  

Abstract Background Existing computational methods for studying miRNA regulation are mostly based on bulk miRNA and mRNA expression data. However, bulk data only allows the analysis of miRNA regulation regarding a group of cells, rather than the miRNA regulation unique to individual cells. Recent advances in single-cell miRNA-mRNA co-sequencing technology have opened a way for investigating miRNA regulation at the single-cell level. However, as single-cell miRNA-mRNA co-sequencing data is just emerging and currently only available at small scale, there is a strong need for novel methods to exploit existing single-cell data for the study of cell-specific miRNA regulation. Results In this work, we propose a new method, CSmiR (Cell-Specific miRNA regulation), to use single-cell miRNA-mRNA co-sequencing data to identify miRNA regulatory networks at the resolution of individual cells. We apply CSmiR to the miRNA-mRNA co-sequencing data in 19 K562 single cells to identify cell-specific miRNA-mRNA regulatory networks and to understand miRNA regulation in each K562 single cell. By analyzing the obtained cell-specific miRNA-mRNA regulatory networks, we observe that the miRNA regulation in each K562 single cell is unique. Moreover, we conduct a detailed analysis of the cell-specific miRNA regulation associated with the miR-17/92 family as a case study. Finally, through exploring the cell-cell similarity matrix characterized by cell-specific miRNA regulation, CSmiR provides a novel strategy for clustering single cells to help understand cell-cell crosstalk. Conclusions To the best of our knowledge, CSmiR is the first method to explore miRNA regulation at a single-cell resolution level, and we believe that it can be a useful method to enhance the understanding of cell-specific miRNA regulation.


2021 ◽  
Author(s):  
Florian Wimmers ◽  
Michele Donato ◽  
Alex Kuo ◽  
Tal Ashuach ◽  
Shakti Gupta ◽  
...  

Emerging evidence indicates a fundamental role for the epigenome in immunity. Here, we used a systems biology approach to map the epigenomic and transcriptional landscape of immunity to influenza vaccination in humans at the single-cell level. Vaccination against seasonal influenza resulted in persistently reduced H3K27ac in monocytes and myeloid dendritic cells, which was associated with impaired cytokine responses to TLR stimulation. Single cell ATAC-seq analysis of 120,305 single cells revealed an epigenomically distinct subcluster of monocytes with reduced chromatin accessibility at AP-1-targeted loci after vaccination. Similar effects were also observed in response to vaccination with the AS03-adjuvanted H5N1 pandemic influenza vaccine. However, this vaccine also stimulated persistently increased chromatin accessibility at loci targeted by interferon response factors (IRFs). This was associated with elevated expression of antiviral genes and type 1 IFN production and heightened resistance to infection with the heterologous viruses Zika and Dengue. These results demonstrate that influenza vaccines stimulate persistent epigenomic remodeling of the innate immune system. Notably, AS03-adjuvanted vaccination remodeled the epigenome of myeloid cells to confer heightened resistance against heterologous viruses, revealing its potentially unappreciated role as an epigenetic adjuvant.


Author(s):  
Elliott Swanson ◽  
Cara Lord ◽  
Julian Reading ◽  
Alexander T. Heubeck ◽  
Adam K. Savage ◽  
...  

Abstract Single-cell measurements of cellular characteristics have been instrumental in understanding the heterogeneous pathways that drive differentiation, cellular responses to extracellular signals, and human disease states. scATAC-seq has been particularly challenging due to the large size of the human genome and processing artefacts resulting from DNA damage that are an inherent source of background signal. Downstream analysis and integration of scATAC-seq with other single-cell assays is complicated by the lack of clear phenotypic information linking chromatin state and cell type. Using the heterogeneous mixture of cells in human peripheral blood as a test case, we developed a novel scATAC-seq workflow that increases the signal-to-noise ratio and allows simultaneous measurement of cell surface markers: Integrated Cellular Indexing of Chromatin Landscape and Epitopes (ICICLE-seq). We extended this approach using a droplet-based multiomics platform to develop a trimodal assay to simultaneously measure Transcriptomic state (scRNA-seq), cell surface Epitopes, and chromatin Accessibility (scATAC-seq) from thousands of single cells, which we term TEA-seq. Together, these multimodal single-cell assays provide a novel toolkit to identify type-specific gene regulation and expression grounded in phenotypically defined cell types.


2020 ◽  
Author(s):  
Shengquan Chen ◽  
Guanao Yan ◽  
Wenyu Zhang ◽  
Jinzhao Li ◽  
Rui Jiang ◽  
...  

Abstract The recent advancements in single-cell technologies, including single-cell chromatin accessibility sequencing (scCAS), have enabled profiling the epigenetic landscapes for thousands of individual cells. However, the characteristics of scCAS data, including high dimensionality, high degree of sparsity and high technical variation, make the computational analysis challenging. Reference-guided approaches, which utilize the information in existing datasets, may facilitate the analysis of scCAS data. We present RA3 (Reference-guided Approach for the Analysis of single-cell chromatin Accessibility data), which utilizes the information in massive existing bulk chromatin accessibility and annotated scCAS data. RA3 simultaneously models (1) the shared biological variation among scCAS data and the reference data, and (2) the unique biological variation in scCAS data that identifies distinct subpopulations. We show that RA3 achieves superior performance in many scCAS datasets. We also present several approaches to construct the reference data to demonstrate the wide applicability of RA3.

