Whole Animal Multiplexed Single-Cell RNA-Seq Reveals Plasticity of Clytia Medusa Cell Types

2021
Author(s):
Tara Chari
Brandon Weissbourd
Jase Gehring
Anna Ferraioli
Lucas Leclère
...

Abstract. We present an organism-wide, transcriptomic cell atlas of the hydrozoan medusa Clytia hemisphaerica, and determine how its component cell types respond to starvation. Utilizing multiplexed scRNA-seq, in which individual animals were indexed and pooled from control and perturbation conditions into a single sequencing run, we avoid artifacts from batch effects and are able to discern shifts in cell state in response to organismal perturbations. This work serves as a foundation for future studies of development, function, and plasticity in a genetically tractable jellyfish species. Moreover, we introduce a powerful workflow for high-resolution, whole animal, multiplexed single-cell genomics (WHAM-seq) that is readily adaptable to other traditional or non-traditional model organisms.

2020
Author(s):
Mohit Goyal
Guillermo Serrano
Ilan Shomorony
Mikel Hernaez
Idoia Ochoa

Abstract. Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, hence avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable and is more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
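The rejection step described above can be illustrated with a minimal sketch. This is not JIND's implementation: the function name `classify_with_rejection`, the threshold values, and the use of `-1` as the "rejected" label are all assumptions made for illustration. It only shows the general idea of comparing each cell's top predicted probability with a threshold specific to the predicted cell type.

```python
import numpy as np

def classify_with_rejection(probs, thresholds, reject_label=-1):
    """Assign each cell its most probable type, or reject it.

    probs:      (n_cells, n_types) predicted class probabilities.
    thresholds: (n_types,) per-type confidence thresholds (hypothetical values).
    """
    probs = np.asarray(probs, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    pred = probs.argmax(axis=1)   # most probable type per cell
    conf = probs.max(axis=1)      # confidence of that prediction
    # Keep the prediction only if it clears the threshold of the predicted type.
    return np.where(conf >= thresholds[pred], pred, reject_label)

# Toy example: three cells, three cell types.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.40, 0.35, 0.25],
                  [0.10, 0.30, 0.60]])
thresholds = np.array([0.8, 0.5, 0.5])
labels = classify_with_rejection(probs, thresholds)
```

In this toy input the second cell is rejected because its top probability (0.40) falls below the threshold for its predicted type (0.8), while the other two cells keep their predicted labels.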


Author(s):  
Massimo Andreatta
Santiago J Carmona

Abstract. Summary: STACAS is a computational method for the identification of integration anchors in the Seurat environment, optimized for the integration of single-cell (sc) RNA-seq datasets that share only a subset of cell types. We demonstrate that by (i) correcting batch effects while preserving relevant biological variability across datasets, (ii) filtering aberrant integration anchors with a quantitative distance measure and (iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. Availability and implementation: Source code and R package available at https://github.com/carmonalab/STACAS; Docker image available at https://hub.docker.com/repository/docker/mandrea1/stacas_demo.


2020
Author(s):
Songwei Ge
Haohan Wang
Amir Alavi
Eric Xing
Ziv Bar-Joseph

Abstract. Dimensionality reduction is an important first step in the analysis of single-cell RNA-seq (scRNA-seq) data. In addition to enabling the visualization of the profiled cells, such representations are used by many downstream analysis methods, ranging from pseudo-time reconstruction to clustering to alignment of scRNA-seq data from different experiments, platforms, and labs. Both supervised and unsupervised methods have been proposed to reduce the dimension of scRNA-seq data. However, all methods to date are sensitive to batch effects. When batches correlate with cell types, as is often the case, their impact can lead to representations that are batch specific rather than cell type specific. To overcome this, we developed a domain adversarial neural network model for learning a reduced dimension representation of scRNA-seq data. The adversarial model tries to simultaneously optimize two objectives. The first is the accuracy of cell type assignment and the second is the inability to distinguish the batch (domain). We tested the method by using the resulting representation to align several different datasets. As we show, by overcoming batch effects our method was able to correctly separate cell types, improving on several prior methods suggested for this task. Analysis of the top features used by the network indicates that by taking the batch impact into account, the reduced representation is much better able to focus on key genes for each cell type.
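The two objectives named above are commonly combined in the generic domain-adversarial formulation sketched below. This is the standard textbook form, not necessarily the exact loss used by the authors. The symbols are assumptions introduced here: $\theta_f$ for the shared encoder parameters, $\theta_y$ for the cell type classifier, $\theta_d$ for the batch (domain) discriminator, and $\lambda$ for a trade-off weight.

```latex
E(\theta_f, \theta_y, \theta_d)
  = \mathcal{L}_y(\theta_f, \theta_y) - \lambda \, \mathcal{L}_d(\theta_f, \theta_d),
\qquad
(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d),
\qquad
\hat{\theta}_d = \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)
```

Here $\mathcal{L}_y$ is the cell type classification loss and $\mathcal{L}_d$ is the batch classification loss. The discriminator tries to minimize $\mathcal{L}_d$, while the encoder is pushed to make batches indistinguishable and still support accurate cell type assignment.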


2017
Author(s):
Trygve E. Bakken
Rebecca D. Hodge
Jeremy M. Miller
Zizhen Yao
Thuc N. Nguyen
...

Abstract. Transcriptional profiling of complex tissues by RNA-sequencing of single nuclei presents some advantages over whole cell analysis. It enables unbiased cellular coverage, avoids cell isolation-based transcriptional effects, and can be applied to archived frozen specimens. Using a well-matched pair of single-nucleus RNA-seq (snRNA-seq) and single-cell RNA-seq (scRNA-seq) SMART-Seq v4 datasets from mouse visual cortex, we demonstrate that similarly high-resolution clustering of closely related neuronal types can be achieved with both methods if intronic sequences are included in nuclear RNA-seq analysis. More transcripts are detected in individual whole cells (∼11,000 genes) than nuclei (∼7,000 genes), but the majority of genes have similar detection across cells and nuclei. We estimate that the nuclear proportion of total cellular mRNA varies from 20% to over 50% for large and small pyramidal neurons, respectively. Together, these results illustrate the high information content of nuclear RNA for characterization of cellular diversity in brain tissues.


Author(s):  
Massimo Andreatta
Santiago J. Carmona

Abstract. Computational tools for the integration of single-cell transcriptomics data are designed to correct batch effects between technical replicates or different technologies applied to the same population of cells. However, they have inherent limitations when applied to heterogeneous sets of data with moderate overlap in cell states or sub-types. STACAS is a package for the identification of integration anchors in the Seurat environment, optimized for the integration of datasets that share only a subset of cell types. We demonstrate that by i) correcting batch effects while preserving relevant biological variability across datasets, ii) filtering aberrant integration anchors with a quantitative distance measure, and iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. We anticipate that the algorithm will be a useful tool for the construction of comprehensive single-cell atlases by integration of the growing amount of single-cell data becoming available in public repositories. Code availability: R package: https://github.com/carmonalab/STACAS; Docker image: https://hub.docker.com/repository/docker/mandrea1/stacas_demo


2019
Author(s):
Xiangjie Li
Yafei Lyu
Jihwan Park
Jingxiao Zhang
Dwight Stambolian
...

Single-cell RNA sequencing (scRNA-seq) can characterize cell types and states through unsupervised clustering, but the ever increasing number of cells imposes computational challenges. We present an unsupervised deep embedding algorithm for single-cell clustering (DESC) that iteratively learns cluster-specific gene expression signatures and cluster assignment. DESC significantly improves clustering accuracy across various datasets and is capable of removing complex batch effects while maintaining true biological variations.


2021
Author(s):
Olga Borisovna Botvinnik
Pranathi Vemuri
N. Tessa Pierce Ward
Phoenix Aja Logan
Saba Nafees
...

Single-cell RNA-seq (scRNA-seq) is a powerful tool for cell type identification but is not readily applicable to organisms without well-annotated reference genomes. Of the approximately 10 million animal species predicted to exist on Earth, >99.9% do not have any submitted genome assembly. To enable scRNA-seq for the vast majority of animals on the planet, here we introduce the concept of "k-mer homology," combining biochemical synonyms in degenerate protein alphabets with uniform data subsampling via MinHash into a pipeline called Kmermaid, to directly detect similar cell types across species from transcriptomic data without the need for a reference genome. Underpinning Kmermaid is the tool Orpheum, a memory-efficient method for extracting high-confidence protein-coding sequences from RNA-seq data. After validating Kmermaid using datasets from human and mouse lung, we applied it to the Chinese horseshoe bat (Rhinolophus sinicus), where we propagated cellular compartment labels at high fidelity. Our pipeline provides a high-throughput tool that enables analyses of transcriptomic data across divergent species' transcriptomes in a genome- and gene annotation-agnostic manner. Thus, the combination of Kmermaid and Orpheum identifies cell type-specific sequences that may be missing from genome annotations and empowers molecular cellular phenotyping for novel model organisms and species.
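The combination of a degenerate protein alphabet with MinHash subsampling can be sketched as follows. This is not Kmermaid's code: the two-letter hydrophobic/polar grouping in `HYDROPHOBIC`, the use of SHA-1 as the hash, and all function names are assumptions chosen for illustration. The sketch re-encodes protein sequences into the reduced alphabet, collects k-mers, keeps the smallest hash values as a sketch, and estimates Jaccard similarity from two sketches.

```python
import hashlib

# Hypothetical hydrophobic/polar grouping, for illustration only.
HYDROPHOBIC = set("ACFGILMPVWY")

def reduce_alphabet(protein):
    """Re-encode a protein as 'h' (hydrophobic) or 'p' (polar) letters."""
    return "".join("h" if aa in HYDROPHOBIC else "p" for aa in protein.upper())

def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_sketch(items, size):
    """Keep the `size` smallest 64-bit hash values of the items."""
    hashes = sorted(
        int.from_bytes(hashlib.sha1(x.encode()).digest()[:8], "big")
        for x in items
    )
    return set(hashes[:size])

def jaccard_estimate(sketch_a, sketch_b, size):
    """Bottom-s Jaccard estimate from two MinHash sketches."""
    union = sorted(sketch_a | sketch_b)[:size]
    if not union:
        return 0.0
    shared = sum(1 for h in union if h in sketch_a and h in sketch_b)
    return shared / len(union)

# Two toy peptides that differ in residues but share a hydrophobicity pattern.
a = kmers(reduce_alphabet("ALKDVLKE"), k=3)
b = kmers(reduce_alphabet("VIRELMRD"), k=3)
similarity = jaccard_estimate(minhash_sketch(a, 50), minhash_sketch(b, 50), 50)
```

The two toy peptides have no residues in common at any position, yet they map to the same reduced string, so their k-mer sets and sketches coincide. This is the sense in which a degenerate alphabet can expose similarity that exact sequence matching would miss.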


2022
Author(s):
Chenfei Wang
Pengfei Ren
Xiaoying Shi
Xin Dong
Zhiguang Yu
...

Abstract. The rapid accumulation of single-cell RNA-seq data has provided rich resources to characterize various human cell types. Cell type annotation is a critical step in analyzing single-cell RNA-seq data. However, accurate cell type annotation based on public references is challenging due to inconsistent annotations, batch effects, and poor characterization of rare cell types. Here, we introduce SELINA (single cELl identity NAvigator), an integrative annotation transferring framework for automatic cell type annotation. SELINA optimizes the annotation for minority cell types by synthetic minority over-sampling, removes batch effects among reference datasets using a multiple-adversarial domain adaptation network (MADA), and fits the query data with reference data using an autoencoder. Finally, SELINA affords a comprehensive and uniform reference atlas with 1.7 million cells covering 230 major human cell types. We demonstrated the robustness and superiority of SELINA in most human tissues compared to existing methods. SELINA provides a one-stop solution for human single-cell RNA-seq data annotation, with the potential to extend to other species.
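Synthetic minority over-sampling, mentioned above, can be illustrated with a small SMOTE-style sketch. This is not SELINA's code: the function name `smote_like_oversample`, its parameters, and the toy data are assumptions. The idea shown is the generic one: each synthetic cell is a random interpolation between a minority-class cell and one of its nearest minority-class neighbours.

```python
import numpy as np

def smote_like_oversample(X_minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between neighbours.

    X_minority: (n, d) expression profiles of the rare cell type.
    k:          number of nearest neighbours to choose from (capped at n - 1).
    """
    X = np.asarray(X_minority, dtype=float)
    n = len(X)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances; a sample is never its own neighbour.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    # Pick a base sample and one of its neighbours for each synthetic sample.
    base = rng.integers(0, n, size=n_new)
    partner = neighbours[base, rng.integers(0, k, size=n_new)]
    # Interpolate at a random position along the segment base -> partner.
    gap = rng.random((n_new, 1))
    return X[base] + gap * (X[partner] - X[base])

# Toy rare cell type with four cells and two features.
rare = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 3.0]])
synthetic = smote_like_oversample(rare, n_new=10)
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the range of the observed data for that cell type.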


2019
Author(s):
Jun Xu
Caitlin Falconer
Quan Nguyen
Joanna Crawford
Brett D. McKinnon
...

Abstract. A variety of experimental and computational methods have been developed to demultiplex samples from pooled individuals in a single-cell RNA sequencing (scRNA-Seq) experiment, but these require either adding information (such as hashtag barcodes) or measuring information (such as genotypes) prior to pooling. We introduce scSplit, which utilises genetic differences inferred from scRNA-Seq data alone to demultiplex pooled samples. scSplit also extracts a minimal set of high confidence presence/absence genotypes in each cluster which can be used to map clusters to original samples. Using a range of simulated, merged individual-sample as well as pooled multi-individual scRNA-Seq datasets, we show that scSplit is highly accurate and concordant with demuxlet predictions. Furthermore, scSplit predictions are highly consistent with the known truth in a cell-hashing dataset. We also show that multiplexed scRNA-Seq can be used to reduce batch effects caused by technical biases. scSplit is ideally suited to samples for which external genome-wide genotype data cannot be obtained (for example non-model organisms), or for which it is impossible to obtain unmixed samples directly, such as mixtures of genetically distinct tumour cells, or mixed infections. scSplit is available at: https://github.com/jon-xu/scSplit


2021
Vol 4 (6)
pp. e202001004
Author(s):
Almut Lütge
Joanna Zyprych-Walczak
Urszula Brykczynska Kunzmann
Helena L Crowell
Daniela Calini
...

A key challenge in single-cell RNA-sequencing (scRNA-seq) data analysis is batch effects that can obscure the biological signal of interest. Although there are various tools and methods to correct for batch effects, their performance can vary. Therefore, it is important to understand how batch effects manifest in order to adjust for them. Here, we systematically explore batch effects across various scRNA-seq datasets according to magnitude, cell type specificity, and complexity. We developed a cell-specific mixing score (cms) that quantifies mixing of cells from multiple batches. By considering distance distributions, the score is able to detect local batch bias as well as differentiate between unbalanced batches and systematic differences between cells of the same cell type. We compare metrics in scRNA-seq data using real and synthetic datasets and, although these metrics target the same question and are used interchangeably, we find differences in scalability, sensitivity, and ability to handle differentially abundant cell types. We find that cell-specific metrics outperform cell type–specific and global metrics and recommend them for both method benchmarks and batch exploration.

