Axes of inter-sample variability among transcriptional neighborhoods reveal disease associated cell states in single-cell data

2021 ◽  
Author(s):  
Yakir A Reshef ◽  
Laurie Rumker ◽  
Joyce B Kang ◽  
Aparna Nathan ◽  
Megan B Murray ◽  
...  

As single-cell datasets grow in sample size, there is a critical need to characterize cell states that vary across samples and associate with sample attributes like clinical phenotypes. Current statistical approaches typically map cells to cell-type clusters and examine sample differences through that lens alone. Here we present covarying neighborhood analysis (CNA), an unbiased method to identify cell populations of interest with greater flexibility and granularity. CNA characterizes dominant axes of variation across samples by identifying groups of very small regions in transcriptional space, termed neighborhoods, that covary in abundance across samples, suggesting shared function or regulation. CNA can then rigorously test for associations between any sample-level attribute and the abundances of these covarying neighborhood groups. We show in simulation that CNA enables more powerful and accurate identification of disease-associated cell states than a cluster-based approach. When applied to published datasets, CNA captures a Notch activation signature in rheumatoid arthritis, redefines monocyte populations expanded in sepsis, and identifies a previously undiscovered T-cell population associated with progression to active tuberculosis.
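
The core idea described above (find a dominant axis of neighborhood abundance that varies across samples, then test it against a sample attribute) can be illustrated with a minimal sketch. This is not the CNA implementation: the function names, the power-iteration step, and the permutation test are assumptions chosen for illustration, and the input is assumed to be a precomputed samples-by-neighborhoods abundance matrix.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def dominant_sample_scores(abundance, n_iter=200, seed=0):
    """Per-sample scores on the first principal axis of a
    samples-by-neighborhoods abundance matrix (hypothetical helper).

    Uses power iteration on the column-centered matrix, so no
    third-party linear algebra library is needed.
    """
    n_samples, n_nbhd = len(abundance), len(abundance[0])
    means = [sum(row[j] for row in abundance) / n_samples for j in range(n_nbhd)]
    x = [[row[j] - means[j] for j in range(n_nbhd)] for row in abundance]
    rng = random.Random(seed)
    # Random start avoids beginning exactly orthogonal to the top axis.
    v = [rng.gauss(0.0, 1.0) for _ in range(n_nbhd)]
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]
        w = [sum(scores[i] * x[i][j] for i in range(n_samples)) for j in range(n_nbhd)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return [sum(a * b for a, b in zip(row, v)) for row in x]


def permutation_pvalue(scores, attribute, n_perm=1000, seed=0):
    """Permutation p-value for |correlation| between sample scores
    and a sample-level attribute (an assumed, simplified test)."""
    rng = random.Random(seed)
    observed = abs(pearson(scores, attribute))
    shuffled = list(attribute)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(scores, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Given an abundance matrix with one row per sample and a list of case/control labels, `permutation_pvalue(dominant_sample_scores(abundance), labels)` returns a small value when the leading axis of covarying neighborhoods tracks the labels. The sign of the scores is arbitrary, which is why the test uses the absolute correlation.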

eLife ◽  
2021 ◽  
Vol 10 ◽  
Author(s):  
Alexander J Tarashansky ◽  
Jacob M Musser ◽  
Margarita Khariton ◽  
Pengyang Li ◽  
Detlev Arendt ◽  
...  

Comparing single-cell transcriptomic atlases from diverse organisms can elucidate the origins of cellular diversity and assist the annotation of new cell atlases. Yet, comparison between distant relatives is hindered by complex gene histories and diversifications in expression programs. Previously, we introduced the self-assembling manifold (SAM) algorithm to robustly reconstruct manifolds from single-cell data (Tarashansky et al., 2019). Here, we build on SAM to map cell atlas manifolds across species. This new method, SAMap, identifies homologous cell types with shared expression programs across distant species within phyla, even in complex examples where homologous tissues emerge from distinct germ layers. SAMap also finds many genes with more similar expression to their paralogs than their orthologs, suggesting paralog substitution may be more common in evolution than previously appreciated. Lastly, comparing species across animal phyla, spanning mouse to sponge, reveals ancient contractile and stem cell families, which may have arisen early in animal evolution.


2021 ◽  
Author(s):  
Mariia Bilous ◽  
Loc Tran ◽  
Chiara Cianciaruso ◽  
Santiago J Carmona ◽  
Mikael J Pittet ◽  
...  

Single-cell RNA sequencing (scRNA-seq) technologies offer unique opportunities for exploring heterogeneous cell populations. However, in-depth single-cell transcriptomic characterization of complex tissues often requires profiling tens to hundreds of thousands of cells. Such large numbers of cells represent an important hurdle for downstream analyses, interpretation and visualization. Here we develop a network-based coarse-graining framework where highly similar cells are merged into super-cells. We demonstrate that super-cells not only preserve but often improve the results of downstream analyses including visualization, clustering, differential expression, cell type annotation, gene correlation, imputation, RNA velocity and data integration. By capitalizing on the redundancy inherent to scRNA-seq data, super-cells significantly facilitate and accelerate the construction and interpretation of single-cell atlases, as demonstrated by the integration of 1.46 million cells from COVID-19 patients in less than two hours on a standard desktop.
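
As a rough illustration of the coarse-graining idea above, the sketch below builds a k-nearest-neighbor graph over cells, merges the most similar connected cells until roughly one group per `gamma` cells remains, and averages each group into a super-cell profile. The merging rule (shortest graph edges first, via union-find) is an assumption standing in for whatever community detection the authors use, and `gamma`, `k`, and the function names are hypothetical.

```python
import math


def knn_edges(cells, k):
    """Sorted (distance, i, j) edges linking each cell to its k nearest cells."""
    edges = set()
    for i, ci in enumerate(cells):
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(cells) if j != i
        )
        for d, j in dists[:k]:
            edges.add((d, min(i, j), max(i, j)))
    return sorted(edges)


def build_super_cells(cells, gamma, k=5):
    """Merge similar cells into about len(cells) / gamma super-cells.

    Returns (members, profiles): member index lists and the averaged
    expression profile of each super-cell.
    """
    n = len(cells)
    target = max(1, round(n / gamma))
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    n_groups = n
    for _, i, j in knn_edges(cells, k):
        if n_groups <= target:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            n_groups -= 1

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    members = sorted(groups.values())
    n_genes = len(cells[0])
    profiles = [
        [sum(cells[i][g] for i in grp) / len(grp) for g in range(n_genes)]
        for grp in members
    ]
    return members, profiles
```

Because merging only follows graph edges, the number of super-cells can stay above the target when the k-nearest-neighbor graph is disconnected. Downstream analyses would then run on `profiles` instead of the full cell matrix.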


2020 ◽  
Author(s):  
Jinjin Tian ◽  
Jiebiao Wang ◽  
Kathryn Roeder

Abstract Motivation Gene-gene co-expression networks (GCN) are of biological interest for the useful information they provide for understanding gene-gene interactions. The advent of single cell RNA-sequencing allows us to examine more subtle gene co-expression occurring within a cell type. Many imputation and denoising methods have been developed to deal with the technical challenges observed in single cell data; meanwhile, several simulators have been developed for benchmarking and assessing these methods. Most of these simulators, however, either do not incorporate gene co-expression or generate co-expression in an inconvenient manner. Results Therefore, with the focus on gene co-expression, we propose a new simulator, ESCO, which adopts the idea of the copula to impose gene co-expression, while preserving the highlights of available simulators, which perform well for simulation of gene expression marginally. Using ESCO, we assess the performance of imputation methods on GCN recovery and find that imputation generally helps GCN recovery when the data are not too sparse, and the ensemble imputation method works best among leading methods. In contrast, imputation fails to help in the presence of an excessive fraction of zero counts, where simple data aggregating methods are a better choice. These findings are further verified with mouse and human brain cell data. Availability The ESCO implementation is available as R package SplatterESCO (https://github.com/JINJINT/SplatterESCO). Contact [email protected]
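
To make the copula idea concrete, here is a minimal sketch of how a Gaussian copula can impose co-expression on two genes while leaving each gene's marginal count distribution unchanged. It is not the ESCO code: the Poisson marginals, the two-gene restriction, and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def poisson_quantile(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < u and pmf > 0.0:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def simulate_coexpressed_counts(n_cells, rho, lam_a, lam_b, seed=0):
    """Draw (gene_a, gene_b) counts per cell with copula correlation rho.

    Steps: correlated standard normals -> uniforms via the normal CDF
    -> counts via each gene's own Poisson quantile function.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    counts = []
    for _ in range(n_cells):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Clamp away from 0 and 1 so the quantile search always terminates.
        u1 = min(max(std_normal.cdf(z1), 1e-12), 1.0 - 1e-12)
        u2 = min(max(std_normal.cdf(z2), 1e-12), 1.0 - 1e-12)
        counts.append((poisson_quantile(u1, lam_a), poisson_quantile(u2, lam_b)))
    return counts


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)
```

Calling `simulate_coexpressed_counts(2000, 0.8, 5.0, 20.0)` gives counts whose means stay near 5 and 20 while the two genes are positively correlated; setting `rho` to 0 removes the co-expression without touching the marginals. The correlation observed in the counts is generally somewhat lower than `rho`, since `rho` applies on the latent normal scale.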


2020 ◽  
Vol 36 (11) ◽  
pp. 3585-3587
Author(s):  
Lin Wang ◽  
Francisca Catalan ◽  
Karin Shamardani ◽  
Husam Babikir ◽  
Aaron Diaz

Abstract Summary Single-cell data are being generated at an accelerating pace. How best to project data across single-cell atlases is an open problem. We developed a boosted learner that overcomes the greatest challenge with status quo classifiers: low sensitivity, especially when dealing with rare cell types. By comparing novel and published data from distinct scRNA-seq modalities that were acquired from the same tissues, we show that this approach preserves cell-type labels when mapping across diverse platforms. Availability and implementation https://github.com/diazlab/ELSA Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.


Rheumatology ◽  
2021 ◽  
Author(s):  
Barbora Schonfeldova ◽  
Kristina Zec ◽  
Irina A Udalova

Abstract Despite extensive research, there is still no treatment that would lead to remission in all patients with rheumatoid arthritis as our understanding of the affected site, the synovium, is still incomplete. Recently, single-cell technologies helped to decipher the cellular heterogeneity of the synovium; however, certain synovial cell populations, such as endothelial cells or peripheral neurons, remain to be profiled on a single-cell level. Furthermore, associations between certain cellular states and inflammation were found; whether these cells cause the inflammation remains to be answered. Similarly, cellular zonation and interactions between individual effectors in the synovium are yet to be fully determined. A deeper understanding of cell signalling and interactions in the synovium is crucial for a better design of therapeutics with the goal of complete remission in all patients.


2021 ◽  
Author(s):  
Jinyue Liao ◽  
Hoi Ching Suen ◽  
Shitao Rao ◽  
Alfred Chun Shui Luk ◽  
Ruoyu Zhang ◽  
...  

Abstract Spermatogenesis depends on an orchestrated series of developing events in germ cells and full maturation of the somatic microenvironment. To date, the majority of efforts to study cellular heterogeneity in testis has been focused on single-cell gene expression rather than the chromatin landscape shaping gene expression. To advance our understanding of the regulatory programs underlying testicular cell types, we analyzed single-cell chromatin accessibility profiles in more than 25,000 cells from developing mouse testis. We showed that scATAC-Seq allowed us to deconvolve distinct cell populations and identify cis-regulatory elements (CREs) underlying cell type specification. We identified sets of transcription factors associated with cell type-specific accessibility, revealing novel regulators of cell fate specification and maintenance. Pseudotime reconstruction revealed detailed regulatory dynamics coordinating the sequential developmental progressions of germ cells and somatic cells. These high-resolution data also revealed putative stem cells within the Sertoli and Leydig cell populations. Further, we defined candidate target cell types and genes of several GWAS signals, including those associated with testosterone levels and coronary artery disease. Collectively, our data provide a blueprint of the ‘regulon’ of the mouse male germline and supporting somatic cells.


2021 ◽  
Author(s):  
Guangyuan Li ◽  
Song Baobao ◽  
H. L Grimes ◽  
V. B. Surya Prasath ◽  
Nathan L Salomonis

Hundreds of bioinformatics approaches now exist to define cellular heterogeneity from single-cell genomics data. Reconciling conflicts between diverse methods, algorithm settings, annotations or modalities has the potential to clarify which populations are real and establish reusable reference atlases. Here, we present a customizable computational strategy called scTriangulate, which leverages cooperative game theory to intelligently mix-and-match clustering solutions from different resolutions, algorithms, reference atlases, or multi-modal measurements. This algorithm relies on a series of robust statistical metrics for cluster stability that work across molecular modalities to identify high-confidence integrated annotations. When applied to annotations from diverse competing cell atlas projects, this approach is able to resolve conflicts and determine the validity of controversial cell population predictions. Tested with scRNA-Seq, CITE-Seq (RNA + surface ADT), multiome (RNA + ATAC), and TEA-Seq (RNA + surface ADT + ATAC), this approach identifies highly stable and reproducible, known and novel cell populations, while excluding clusters defined by technical artifacts (i.e., doublets). Importantly, we find that distinct cell populations are frequently attributed with features from different modalities (RNA, ATAC, ADT) in the same assay, highlighting the importance of multimodal analysis in cluster determination. As it is flexible, this approach can be updated with new user-defined statistical metrics to alter the decision engine and customized to new measures of stability for different measures of cellular activity.


2017 ◽  
Vol 3 (1) ◽  
pp. 46 ◽  
Author(s):  
Elham Azizi ◽  
Sandhya Prabhakaran ◽  
Ambrose Carr ◽  
Dana Pe'er

Single-cell RNA-seq gives access to gene expression measurements for thousands of cells, allowing discovery and characterization of cell types. However, the data is noise-prone due to experimental errors and cell type-specific biases. Current computational approaches for analyzing single-cell data involve a global normalization step which introduces incorrect biases and spurious noise and does not resolve missing data (dropouts). This can lead to misleading conclusions in downstream analyses. Moreover, a single normalization removes important cell type-specific information. We propose a data-driven model, BISCUIT, that iteratively normalizes and clusters cells, thereby separating noise from interesting biological signals. BISCUIT is a Bayesian probabilistic model that learns cell-specific parameters to intelligently drive normalization. This approach displays superior performance to global normalization followed by clustering in both synthetic and real single-cell data compared with previous methods, and allows easy interpretation and recovery of the underlying structure and cell types.


2019 ◽  
Vol 21 (Supplement_6) ◽  
pp. vi64-vi64
Author(s):  
Robert Suter ◽  
Vasileios Stathias ◽  
Anna Jermakowicz ◽  
Alexa Semonche ◽  
Michael Ivan ◽  
...  

Abstract Glioblastoma (GBM) remains the most common adult brain tumor, with poor survival expectations, and no new therapeutic modalities approved in the last decade. Our laboratories have recently demonstrated that the integration of a transcriptional disease signature obtained from The Cancer Genome Atlas’ GBM dataset with transcriptional cell drug-response signatures in the LINCS L1000 dataset yields possible combinatorial therapeutics. Considering the extreme intra-tumor heterogeneity associated with the disease, we hypothesize that the utilization of single-cell RNA-sequencing (scRNA-seq) of patient tumors will further strengthen our predictive model by providing insight on the unique transcriptomes of the cellular niches present within these tumors, and into the transcriptional dynamics of these same cellular niches. By sequencing single-cell transcriptomes from recurrent GBM tumors resected from patients at the University of Miami, and integrating our datasets with previously published scRNA-seq data from primary GBM tumors, we are able to gain additional insight into the differences between these clinical distinctions. We have analyzed the differential expression of kinases both across and within distinct cell populations of primary and recurrent GBM tumors. This transcriptional map of kinase expression represents the heterogeneity of potential targets within individual tumors and between recurrent and primary GBM. Additionally, by generating disease signatures unique to each cellular population, and integrating these with transcriptional drug-response signatures from LINCS, we are able to predict compounds to target specific cell populations within GBM tumors. Additional computational techniques such as RNA velocity analysis and cell cycle scoring elucidate temporal insights to further prioritize these cell-type specific therapeutics, and reveal the intra-cellular dynamics present within these tumors.
Collectively, our studies suggest that we have developed a novel omics pipeline based on the single cell RNA-sequencing of individual GBM cells that addresses intra-tumor heterogeneity, and may lead to novel therapeutic combinations for the treatment of this incurable disease.


Author(s):  
Michael A. Skinnider ◽  
Jordan W. Squair ◽  
Claudia Kathe ◽  
Mark A. Anderson ◽  
Matthieu Gautier ◽  
...  