Inclusion of processed cell metadata improves single cell sequencing analysis reproducibility and accessibility

2020 ◽  
Author(s):  
Sidhant Puntambekar ◽  
Jay R. Hesselberth ◽  
Kent A. Riemondy ◽  
Rui Fu

Abstract
Single cell RNA sequencing provides an unprecedented view of cellular diversity of biological systems. Thousands of scRNA-seq datasets have been generated, providing a wealth of biological data on the diversity of cell types across different organisms, developmental stages, and disease states. But while a tremendous number of publications and datasets have been generated using this technology, we found that a minority (< 25%) of studies provide sufficient information to enable direct reuse of their data for further studies. This problem is common across journals, data repositories, and publication dates. The lack of appropriate information not only hinders exploration and knowledge transfer of reported data, but also makes reproducing the original study prohibitively difficult and/or time-consuming. Correcting this problem is not easy, but we encourage investigators, reviewers, journals, and data repositories to take steps to improve their standards and ensure proper documentation of these valuable datasets.

2020 ◽  
Author(s):  
Zhenyi Wang ◽  
Yanjie Zhong ◽  
Zhaofeng Ye ◽  
Lang Zeng ◽  
Yang Chen ◽  
...  

Distinguishing cell types and cell states is one of the fundamental questions in single-cell studies. Meanwhile, exploring the lineage relations among cells and finding the path and critical points in the cell fate transition are also of great importance. Existing unsupervised clustering methods and lineage trajectory reconstruction methods often face several challenges, such as clustering data of arbitrary shapes, tracking precise trajectories, and identifying critical points. Certain adaptive landscape approaches, which construct a pseudo-energy landscape of the dynamical system, may be used to explore such problems. Thus, we propose the Markov hierarchical clustering algorithm (MarkovHC), which reconstructs a multi-scale pseudo-energy landscape by exploiting the underlying metastability structure in an exponentially perturbed Markov chain. A Markov process describes the random walk of a hypothetically traveling cell in the corresponding pseudo-energy landscape over possible gene expression states. Technically, MarkovHC integrates the tasks of cell classification, trajectory reconstruction, and critical point identification in a single theoretical framework consistent with topological data analysis (TDA). In addition to the algorithm development and simulation tests, we also applied MarkovHC to diverse types of real biological data: single-cell RNA-Seq data, cytometry data, and single-cell ATAC-Seq data. Remarkably, when applied to single-cell RNA-Seq data of human ESC-derived progenitor cells, MarkovHC not only identified known cell types but also discovered new cell types and stages. In addition, when MarkovHC was used to analyze single-cell RNA-Seq data of human preimplantation embryos in early development, the hierarchical structure of the lineage trajectories was faithfully reconstituted. Furthermore, the critical points representing important stage transitions were also identified by MarkovHC from early gastric cancer data.
In summary, these results demonstrate that MarkovHC is a powerful tool based on rigorous metastability theory to explore hierarchical structures of biological data, to identify a cell sub-population (basin) and a critical point (stage transition), and to track a lineage trajectory (differentiation path).


Author(s):  
Zhen Miao ◽  
Michael S. Balzer ◽  
Ziyuan Ma ◽  
Hongbo Liu ◽  
Junnan Wu ◽  
...  

Abstract
Determining the epigenetic program that generates unique cell types in the kidney is critical for understanding cell-type heterogeneity during tissue homeostasis and injury response. Here, we profiled open chromatin and gene expression in developing and adult mouse kidneys at single cell resolution. We show critical reliance of gene expression on distal regulatory elements (enhancers). We define key cell type-specific transcription factors and major gene-regulatory circuits for kidney cells. Dynamic chromatin and expression changes during nephron progenitor differentiation demonstrated that podocyte commitment occurs early and is associated with sustained Foxl1 expression. Renal tubule cells followed a more complex differentiation, where Hnf4a was associated with proximal and Tfap2b with distal fate. Mapping single nucleotide variants associated with human kidney disease identified critical cell types, developmental stages, genes, and regulatory mechanisms. We provide a global single cell resolution view of chromatin accessibility of kidney development. The dataset is available via interactive public websites.


GigaScience ◽  
2019 ◽  
Vol 8 (10) ◽  
Author(s):  
Yun-Ching Chen ◽  
Abhilash Suresh ◽  
Chingiz Underbayev ◽  
Clare Sun ◽  
Komudi Singh ◽  
...  

Abstract
Background: In single-cell RNA-sequencing analysis, clustering cells into groups and differentiating cell groups by differentially expressed (DE) genes are 2 separate steps for investigating cell identity. However, the ability to differentiate between cell groups could be affected by clustering. This interdependency often creates a bottleneck in the analysis pipeline, requiring researchers to repeat these 2 steps multiple times by setting different clustering parameters to identify a set of cell groups that are more differentiated and biologically relevant.
Findings: To accelerate this process, we have developed IKAP, an algorithm that identifies major cell groups and improves the differentiation of cell groups by systematically tuning parameters for clustering. We demonstrate that, with default parameters, IKAP successfully identifies major cell types such as T cells, B cells, natural killer cells, and monocytes in 2 peripheral blood mononuclear cell datasets and recovers major cell types in a previously published mouse cortex dataset. These major cell groups identified by IKAP present more distinguishing DE genes compared with cell groups generated by different combinations of clustering parameters. We further show that cell subtypes can be identified by recursively applying IKAP within identified major cell types, thereby delineating cell identities in a multi-layered ontology.
Conclusions: By tuning the clustering parameters to identify major cell groups, IKAP greatly improves the automation of single-cell RNA-sequencing analysis to produce distinguishing DE genes and refine cell ontology using single-cell RNA-sequencing data.


2020 ◽  
pp. ASN.2020070930
Author(s):  
Christian Hinze ◽  
Nikos Karaiskos ◽  
Anastasiya Boltengagen ◽  
Katharina Walentin ◽  
Klea Redo ◽  
...  

Background: Single-cell transcriptomes from dissociated tissues provide insights into cell types and their gene expression and may harbor additional information on spatial position and the local microenvironment. The kidney’s cells are embedded into a gradient of increasing tissue osmolality from the cortex to the medulla, which may alter their transcriptomes and provide cues for spatial reconstruction.
Methods: Single-cell or single-nuclei mRNA sequencing of dissociated mouse kidneys and of dissected cortex, outer, and inner medulla, to represent the corticomedullary axis, was performed. Computational approaches predicted the spatial ordering of cells along the corticomedullary axis and quantitated expression levels of osmo-responsive genes. In situ hybridization validated computational predictions of spatial gene-expression patterns. The strategy was used to compare single-cell transcriptomes from wild-type mice to those of mice with a collecting duct–specific knockout of the transcription factor grainyhead-like 2 (Grhl2CD−/−), which display reduced renal medullary osmolality.
Results: Single-cell transcriptomics from dissociated kidneys provided sufficient information to approximately reconstruct the spatial position of kidney tubule cells and to predict corticomedullary gene expression. Spatial gene expression in the kidney changes gradually and osmo-responsive genes follow the physiologic corticomedullary gradient of tissue osmolality. Single-nuclei transcriptomes from Grhl2CD−/− mice indicated a flattened expression gradient of osmo-responsive genes compared with control mice, consistent with their physiologic phenotype.
Conclusions: Single-cell transcriptomics from dissociated kidneys facilitated the prediction of spatial gene expression along the corticomedullary axis and quantitation of osmotically regulated genes, allowing the prediction of a physiologic phenotype.


2020 ◽  
Vol 6 (45) ◽  
pp. eabc4773
Author(s):  
Tengjiao Zhang ◽  
Yichi Xu ◽  
Kaoru Imai ◽  
Teng Fei ◽  
Guilin Wang ◽  
...  

Progressive unfolding of gene expression cascades underlies diverse embryonic lineage development. Here, we report a single-cell RNA sequencing analysis of the complete and invariant embryonic cell lineage of the tunicate Ciona savignyi from fertilization to the onset of gastrulation. We reconstructed a developmental landscape of 47 cell types over eight cell cycles in the wild-type embryo and identified eight fate transformations upon fibroblast growth factor (FGF) inhibition. For most FGF-dependent asymmetric cell divisions, the bipotent mother cell displays the gene signature of the default daughter fate. In convergent differentiation of the two notochord lineages, we identified additional gene pathways parallel to the master regulator T/Brachyury. Last, we showed that the defined Ciona cell types can be matched to E6.5-E8.5 stage mouse cell types and display conserved expression of a limited number of transcription factors. This study provides a high-resolution single-cell dataset to understand chordate early embryogenesis and cell lineage differentiation.


2021 ◽  
Author(s):  
Pere Catala ◽  
Nathalie Groen ◽  
Jasmin A Dehnen ◽  
Eduardo Soares ◽  
Arianne JH van Velthoven ◽  
...  

The cornea is the clear window that lets light into the eye. It is composed of five layers: epithelium, Bowman layer, stroma, Descemet membrane and endothelium. The maintenance of its structure and transparency is determined by the functions of the different cell types populating each layer. Attempts to regenerate corneal tissue and understand disease conditions require knowledge of how cell profiles vary across this heterogeneous tissue. We performed single-cell transcriptomic profiling of 19,472 cells isolated from eight healthy donor corneas. Our analysis delineates the heterogeneity of the corneal layers by identifying cell populations and revealing cell states that contribute to preserving corneal homeostasis. We found that expression of CAV1, CXCL14, HOMER3 and CPVL was exclusive to the corneal epithelial limbal stem cell niche, that CKS2, STMN1 and UBE2C were exclusively expressed in highly proliferative transit amplifying cells, and that NNMT was exclusively expressed by stromal keratocytes. Overall, this research provides a basis to improve current primary cell expansion protocols, to profile corneal disease states in the future, to help guide pluripotent stem cells into different corneal lineages, and to understand how engineered substrates affect corneal cells to improve regenerative therapies.


BMC Biology ◽  
2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Elin Lundin ◽  
Chenglin Wu ◽  
Albin Widmark ◽  
Mikaela Behm ◽  
Jens Hjerling-Leffler ◽  
...  

Abstract
Background: Adenosine-to-inosine (A-to-I) RNA editing is a process that contributes to the diversification of proteins and has been shown to be essential for neurotransmission and other neuronal functions. However, the spatiotemporal and diversification properties of RNA editing in the brain are largely unknown. Here, we applied in situ sequencing to distinguish between edited and unedited transcripts in distinct regions of the mouse brain at four developmental stages, and investigated the diversity of the RNA landscape.
Results: We analyzed RNA editing at codon-altering sites using in situ sequencing at single-cell resolution, in combination with the detection of individual ADAR enzymes and specific cell type marker transcripts. This approach revealed cell-type-specific regulation of RNA editing of a set of transcripts, and developmental and regional variation in editing levels for many of the targeted sites. We found increasing editing diversity throughout development, which arises through regional- and cell type-specific regulation of ADAR enzymes and target transcripts.
Conclusions: Our single-cell in situ sequencing method has proved useful to study the complex landscape of RNA editing, and our results indicate that this complexity arises due to distinct mechanisms of regulating individual RNA editing sites, acting both regionally and in specific cell types.


2019 ◽  
Author(s):  
Benjamin DeMeo ◽  
Bonnie Berger

Abstract
Single-cell RNA-sequencing (scRNA-seq) has grown massively in scale since its inception, presenting substantial analytic and computational challenges. Even simple downstream analyses, such as dimensionality reduction and clustering, require days of runtime and hundreds of gigabytes of memory for today’s largest datasets. In addition, current methods often favor common cell types, and miss salient biological features captured by small cell populations. Here we present Hopper, a single-cell toolkit that both speeds up the analysis of single-cell datasets and highlights their transcriptional diversity by intelligent subsampling, or sketching. Hopper realizes the optimal polynomial-time approximation of the Hausdorff distance between the full and downsampled dataset, ensuring that each cell is well-represented by some cell in the sample. Unlike prior sketching methods, Hopper adds points iteratively and allows for additional sampling from regions of interest, enabling fast and targeted multi-resolution analyses. In a dataset of over 1.3 million mouse brain cells, we detect a cluster of just 64 macrophages expressing inflammatory genes (0.004% of the full dataset) from a Hopper sketch containing just 5,000 cells, and several other small but biologically interesting immune cell populations invisible to analysis of the full data. On an even larger dataset consisting of ~2 million developing mouse organ cells, we show even representation of important cell types in small sketch sizes, in contrast with prior sketching methods. By condensing transcriptional information encoded in large datasets, Hopper grants the individual user with a laptop the same analytic capabilities as a large consortium.
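The abstract does not name the algorithm behind the sketch. One standard technique that fits the description (points added iteratively, with a bound on the Hausdorff distance between sketch and full dataset) is greedy farthest-first traversal, the classic 2-approximation for the k-center problem. The following is a minimal illustration under that assumption, not Hopper's actual implementation; the function name `farthest_first_sketch` and the toy two-dimensional "cells" are hypothetical.

```python
import math
import random


def farthest_first_sketch(points, k, start=0):
    """Greedy farthest-first traversal (illustrative, not Hopper's code).

    Returns the indices of k chosen points and the largest distance from
    any point to its nearest chosen point.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # nearest[i] = distance from point i to its closest chosen point so far
    nearest = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        # The next sketch member is the point that is currently worst represented.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < nearest[i]:
                nearest[i] = d
    # Because the sketch is a subset of the data, max(nearest) equals the
    # Hausdorff distance between the sketch and the full dataset.
    return chosen, max(nearest)


# Toy data: one large population and one rare population of 5 "cells".
rng = random.Random(0)
common = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
rare = [(rng.gauss(50, 0.1), rng.gauss(50, 0.1)) for _ in range(5)]
cells = common + rare

sketch, radius = farthest_first_sketch(cells, k=10)
rare_in_sketch = [i for i in sketch if i >= len(common)]
print(f"sketch size: {len(sketch)}, rare cells kept: {len(rare_in_sketch)}")
```

Uniform random sampling of 10 out of 1,005 cells would usually miss the rare population entirely, whereas the greedy rule picks a rare cell as soon as it is the farthest point from the current sketch. This version costs O(nk) distance computations, so it only illustrates the idea at toy scale.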


2018 ◽  
Author(s):  
Brian S. Clark ◽  
Genevieve L. Stein-O’Brien ◽  
Fion Shiau ◽  
Gabrielle H. Cannon ◽  
Emily Davis ◽  
...  

Summary
Precise temporal control of gene expression in neuronal progenitors is necessary for correct regulation of neurogenesis and cell fate specification. However, the extensive cellular heterogeneity of the developing CNS has posed a major obstacle to identifying the gene regulatory networks that control these processes. To address this, we used single cell RNA-sequencing to profile ten developmental stages encompassing the full course of retinal neurogenesis. This allowed us to comprehensively characterize changes in gene expression that occur during initiation of neurogenesis, changes in developmental competence, and specification and differentiation of each of the major retinal cell types. These data identify transitions in gene expression between early and late-stage retinal progenitors, as well as a classification of neurogenic progenitors. We identify here the NFI family of transcription factors (Nfia, Nfib, and Nfix) as genes with enriched expression within late RPCs, and show they are regulators of bipolar interneuron and Müller glia specification and the control of proliferative quiescence.


Author(s):  
Ritchie Ho ◽  
Michael J. Workman ◽  
Pranav Mathkar ◽  
Kathryn Wu ◽  
Kevin J. Kim ◽  
...  

Summary
Induced pluripotent stem cell (iPSC)-derived neural cultures from amyotrophic lateral sclerosis (ALS) patients can reflect disease phenotypes targetable by treatments. However, widely used differentiation protocols produce mixtures of progenitors, neurons, glia, and other cells at various developmental stages and rostrocaudal neural tube segments. Here we present a methodology using single-cell RNA sequencing analysis to distinguish cell type expression in C9orf72 ALS, sporadic ALS, control, and genome-edited cultures across multiple subjects, experiments, and commercial platforms. Combinations of HOX and developmental gene expression with global clustering classified rostrocaudal, progenitor, and mantle zone fates. This demonstrated that iPSC-differentiated cells recapitulate fetal hindbrain and spinal cord development and resolved early, reproducible, and motor neuron-specific signatures of familial and sporadic ALS. These signatures include downregulated ELAVL3 expression, which persists into disease endstages. Single-cell analysis thus yielded predictive ALS markers in other human and mouse models that were otherwise undiscovered through bulk omics assays.

