Single Cell App: An App for Single Cell RNA-sequencing Data Visualization, Comparison and Discovery

2021 ◽  
Author(s):  
Mathew G. Lewsey ◽  
Changyu Yi ◽  
Oliver Berkowitz ◽  
Felipe Ayora ◽  
Maurice Bernado ◽  
...  

Summary The Single Cell App is a cloud-based application that allows visualisation and comparison of scRNA-seq data and is scalable according to use. After pre-processing, users upload their own or publicly available scRNA-seq datasets to visualise them in a web browser. The data can be viewed in two colour modes, Cluster (representing cell identity) and Values (level of expression), and queried using keywords or gene identification number(s). Using the app to compare four different studies, we determined that some genes frequently used as cell-type markers are in fact study specific. Phosphate transporter and hormone response genes were investigated with the app as examples. This showed that the apparent cell-specific expression of PHO1;H3 differed between GFP-tagging and scRNA-seq studies. Although some phosphate transporter genes were induced by protoplasting, they retained cell specificity, indicating cell-specific stress responses (i.e. to protoplasting). Examination of the cell specificity of hormone response genes revealed that 132 hormone-responsive genes display restricted expression and that the jasmonate response gene TIFY8 is expressed in endodermal cells, which differs from previous reports. It also appears that JAZ repressors have cell-type-specific functions. These differences, identified using the Single Cell App, highlight the need for resources that enable researchers to find common and different patterns of cell-specific gene expression. Thus, the Single Cell App enables researchers to form new hypotheses, perform comparative studies and easily re-use data from this emerging technology, providing novel avenues to crop improvement.

2021 ◽  
Author(s):  
Dongshunyi Li ◽  
Jun Ding ◽  
Ziv Bar-Joseph

One of the first steps in the analysis of single cell RNA-Sequencing (scRNA-Seq) data is the assignment of cell types. While a number of supervised methods have been developed for this, in most cases such assignment is performed by first clustering cells in a low-dimensional space and then assigning cell types to the different clusters. To overcome noise and to improve cell type assignments we developed UNIFAN, a neural network method that simultaneously clusters and annotates cells using known gene sets. UNIFAN combines a low-dimensional representation of all genes with cell-specific gene set activity scores to determine the clustering. We applied UNIFAN to human and mouse scRNA-Seq datasets from several different organs. As we show, by using knowledge of gene sets, UNIFAN greatly outperforms prior methods developed for clustering scRNA-Seq data. The gene sets assigned by UNIFAN to different clusters provide strong evidence for the cell type represented by each cluster, making annotation easier.


2021 ◽  
Author(s):  
Zi-Hang Wen ◽  
Jeremy L. Langsam ◽  
Lu Zhang ◽  
Wenjun Shen ◽  
Xin Zhou

Abstract Single-cell RNA-seq (scRNA-seq) offers opportunities to study gene expression of tens of thousands of single cells simultaneously, to investigate cell-to-cell variation, and to reconstruct cell-type-specific gene regulatory networks. Recovering dropout events in a sparse gene expression matrix for scRNA-seq data is a long-standing matrix completion problem. We introduce Bfimpute, a Bayesian factorization imputation algorithm that reconstructs two latent matrices, one for genes and one for cells, to impute the final gene expression matrix within each cell group, with or without the aid of cell type labels or bulk data. Bfimpute achieves better accuracy than six other publicly notable scRNA-seq imputation methods on simulated and real scRNA-seq data, as measured by several different evaluation metrics. Bfimpute can also flexibly integrate any gene- or cell-related information that users provide to increase performance. Availability: Bfimpute is implemented in R and is freely available at https://github.com/maiziezhoulab/Bfimpute.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Rongxin Fang ◽  
Sebastian Preissl ◽  
Yang Li ◽  
Xiaomeng Hou ◽  
Jacinta Lucero ◽  
...  

Abstract Identification of the cis-regulatory elements controlling cell-type specific gene expression patterns is essential for understanding the origin of cellular diversity. Conventional assays that map regulatory elements via open chromatin analysis of primary tissues are hindered by sample heterogeneity. Single cell analysis of accessible chromatin (scATAC-seq) can overcome this limitation. However, the high level of noise in each single cell profile and the large volume of data pose unique computational challenges. Here, we introduce SnapATAC, a software package for analyzing scATAC-seq datasets. SnapATAC dissects cellular heterogeneity in an unbiased manner and maps the trajectories of cellular states. Using the Nyström method, SnapATAC can process data from up to a million cells. Furthermore, SnapATAC incorporates existing tools into a comprehensive package for analyzing single cell ATAC-seq datasets. As a demonstration of its utility, SnapATAC is applied to 55,592 single-nucleus ATAC-seq profiles from the mouse secondary motor cortex. The analysis reveals ~370,000 candidate regulatory elements in 31 distinct cell populations in this brain region and infers candidate cell-type specific transcriptional regulators.


Author(s):  
Zilong Zhang ◽  
Feifei Cui ◽  
Chen Lin ◽  
Lingling Zhao ◽  
Chunyu Wang ◽  
...  

Abstract Single-cell RNA sequencing (scRNA-seq) has enabled us to study biological questions at the single-cell level. Currently, many analysis tools are available to better utilize these relatively noisy data. In this review, we summarize the most widely used methods for critical downstream analysis steps (i.e. clustering, trajectory inference, cell-type annotation and integrating datasets). The advantages and limitations are comprehensively discussed, and we provide suggestions for choosing proper methods in different situations. We hope this paper will be useful for scRNA-seq data analysts and bioinformatics tool developers.


2020 ◽  
Author(s):  
Nil Aygün ◽  
Angela L. Elwell ◽  
Dan Liang ◽  
Michael J. Lafferty ◽  
Kerry E. Cheek ◽  
...  

Summary Interpretation of the function of non-coding risk loci for neuropsychiatric disorders and brain-relevant traits via gene expression and alternative splicing is mainly performed in bulk post-mortem adult tissue. However, genetic risk loci are enriched in regulatory elements of cells present during neocortical differentiation, and regulatory effects of risk variants may be masked by heterogeneity in bulk tissue. Here, we map e/sQTLs and allele specific expression in primary human neural progenitors (n=85) and their sorted neuronal progeny (n=74). Using colocalization and TWAS, we uncover cell-type specific regulatory mechanisms underlying risk for these traits.


2021 ◽  
Author(s):  
Yun Zhang ◽  
Brian Aevermann ◽  
Rohan Gala ◽  
Richard H. Scheuermann

Reference cell type atlases powered by single cell transcriptomic profiling technologies have become available to study cellular diversity at a granular level. We present FR-Match for matching query datasets to reference atlases with robust and accurate performance for identifying novel cell types and non-optimally clustered cell types in the query data. This approach shows excellent performance for cross-platform, cross-sample type, cross-tissue region, and cross-data modality cell type matching.


1987 ◽  
Vol 7 (9) ◽  
pp. 3185-3193
Author(s):  
K Inokuchi ◽  
A Nakayama ◽  
F Hishinuma

The MF alpha 1 gene of Saccharomyces cerevisiae, a major structural gene for mating pheromone alpha factor, is an alpha-specific gene whose expression is regulated by the mating-type locus. To study the role of sequences upstream of MF alpha 1 in its expression and regulation, we generated two sets of promoter deletions: upstream deletions and internal deletions. By analyzing these deletions, we have identified a TATA box and two closely related, tandemly arranged upstream activation sites as necessary elements for MF alpha 1 expression. Two upstream activation sites were located ca. 300 and 250 base pairs upstream of the MF alpha 1 transcription start points, which were also determined in this study. Each site contained a homologous 22-base-pair sequence, and both sites were required for maximum transcription level. The distance between the upstream activation sites and the transcription start points could be altered without causing loss of transcription efficiency, and the sites were active in either orientation with respect to the coding region. These elements conferred cell type-specific expression on a heterologous promoter. Analysis with host mating-type locus mutants indicates that these sequences are the sites through which the MAT alpha 1 product exerts its action to activate the MF alpha 1 gene. Homologous sequences with these elements were found in other alpha-specific genes, MF alpha 2 and STE3, and may mediate activation of this set of genes by MAT alpha 1.


GigaScience ◽  
2019 ◽  
Vol 8 (9) ◽  
Author(s):  
Luca Alessandrì ◽  
Francesca Cordero ◽  
Marco Beccuti ◽  
Maddalena Arigoni ◽  
Martina Olivero ◽  
...  

Abstract Background Single-cell RNA sequencing is essential for investigating cellular heterogeneity and highlighting cell subpopulation-specific signatures. Single-cell sequencing applications have spread from conventional RNA sequencing to epigenomics, e.g., ATAC-seq. Many related algorithms and tools have been developed, but few computational workflows provide analysis flexibility while also achieving functional (i.e., information about the data and the tools used are saved as metadata) and computational reproducibility (i.e., a real image of the computational environment used to generate the data is stored) through a user-friendly environment. Findings rCASC is a modular workflow providing an integrated analysis environment (from count generation to cell subpopulation identification) exploiting Docker containerization to achieve both functional and computational reproducibility in data analysis. Hence, rCASC provides preprocessing tools to remove low-quality cells and/or specific bias, e.g., cell cycle. Subpopulation discovery can instead be achieved using different clustering techniques based on different distance metrics. Cluster quality is then estimated through the new metric "cell stability score" (CSS), which describes the stability of a cell in a cluster as a consequence of a perturbation induced by removing a random set of cells from the cell population. CSS provides better cluster robustness information than the silhouette metric. Moreover, rCASC's tools can identify cluster-specific gene signatures. Conclusions rCASC is a modular workflow with new features that could help researchers define cell subpopulations and detect subpopulation-specific markers. It uses Docker for ease of installation and to achieve a computation-reproducible analysis. A Java GUI is provided to welcome users without computational skills in R.
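The idea behind a perturbation-based stability score such as CSS can be sketched in a few lines. The following is only an illustrative sketch under stated assumptions, not rCASC's implementation: the toy clustering function `gap_cluster`, the Jaccard comparison, and the parameters `n_perm`, `drop_frac`, and `min_jaccard` are all hypothetical choices made for this example. For each perturbation a random set of cells is removed, the remaining cells are re-clustered, and a cell counts as stable if its new cluster largely matches its original cluster.

```python
import random


def gap_cluster(values, gap=1.0):
    """Toy 1-D clustering (hypothetical stand-in for a real method):
    sort the values and start a new cluster at every gap larger than `gap`."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    current = 0
    for prev, idx in zip(order, order[1:]):
        if values[idx] - values[prev] > gap:
            current += 1
        labels[idx] = current
    return labels


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cell_stability_scores(values, cluster_fn, n_perm=50, drop_frac=0.1,
                          min_jaccard=0.8, seed=0):
    """Assumed scoring rule: fraction of perturbations in which a cell's
    new cluster overlaps its reference cluster with Jaccard >= min_jaccard."""
    rng = random.Random(seed)
    n = len(values)
    ref = cluster_fn(values)            # reference clustering on all cells
    stable = [0] * n                    # times each cell was judged stable
    kept_count = [0] * n                # times each cell survived removal
    n_drop = max(1, int(n * drop_frac))
    for _ in range(n_perm):
        dropped = set(rng.sample(range(n), n_drop))
        kept = [i for i in range(n) if i not in dropped]
        new = cluster_fn([values[i] for i in kept])
        # Group the surviving cells by reference label and by new label.
        ref_groups, new_groups = {}, {}
        for pos, i in enumerate(kept):
            ref_groups.setdefault(ref[i], set()).add(i)
            new_groups.setdefault(new[pos], set()).add(i)
        for pos, i in enumerate(kept):
            kept_count[i] += 1
            if jaccard(ref_groups[ref[i]], new_groups[new[pos]]) >= min_jaccard:
                stable[i] += 1
    return [s / k if k else float("nan") for s, k in zip(stable, kept_count)]


# Two well-separated groups: removing one cell should not change the clusters.
separated = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
scores = cell_stability_scores(separated, gap_cluster)
```

In this sketch a chain of evenly spaced cells that only just forms one cluster scores lower than a compact group, because removing a single linking cell splits the cluster.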


Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 240 ◽  
Author(s):  
Prashant N. M. ◽  
Hongyu Liu ◽  
Pavlos Bousounis ◽  
Liam Spurr ◽  
Nawaf Alomran ◽  
...  

With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10× Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAF_RNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAF_RNA estimation from current scRNA-seq datasets and shows that the 3′-based library generation protocol of 10× Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
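The expressed variant allele fraction described above is, in its simplest form, the share of reads at a locus that carry the variant allele, reported only when the locus meets a minimum read threshold. The sketch below illustrates that arithmetic. The function names and the `low`/`high` cutoffs used to separate mono- from bi-allelic expression are assumptions made for illustration; the abstract does not state the cutoffs the authors used.

```python
def vaf_rna(n_var, n_ref, min_reads=3):
    """Expressed variant allele fraction at one SNV locus in one cell.

    n_var / n_ref are counts of unique reads carrying the variant and
    reference allele. Returns None when coverage is below `min_reads`.
    """
    total = n_var + n_ref
    if total < min_reads:
        return None
    return n_var / total


def classify_expression(vaf, low=0.1, high=0.9):
    """Label a locus by its allele fraction.

    The `low`/`high` cutoffs are hypothetical, chosen only for this example.
    """
    if vaf is None:
        return "insufficient coverage"
    return "bi-allelic" if low <= vaf <= high else "mono-allelic"


# The same locus assessed at the two thresholds mentioned in the abstract.
at_three = classify_expression(vaf_rna(2, 2, min_reads=3))
at_ten = classify_expression(vaf_rna(2, 2, min_reads=10))
```

Raising `min_reads` excludes low-coverage loci, where a few sampled reads can by chance all come from one allele; this is one plausible reason the bi-allelic share rises with the threshold.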


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Ciara H. O’Flanagan ◽  
Kieran R. Campbell ◽  
Allen W. Zhang ◽  
Farhia Kabeer ◽  
...  

Abstract Background Single-cell RNA sequencing (scRNA-seq) is a powerful tool for studying complex biological systems, such as tumor heterogeneity and tissue microenvironments. However, the sources of technical and biological variation in primary solid tumor tissues and patient-derived mouse xenografts for scRNA-seq are not well understood. Results We use low temperature (6 °C) protease and collagenase (37 °C) to identify the transcriptional signatures associated with tissue dissociation across a diverse scRNA-seq dataset comprising 155,165 cells from patient cancer tissues, patient-derived breast cancer xenografts, and cancer cell lines. We observe substantial variation in standard quality control metrics of cell viability across conditions and tissues. From the contrast between tissue protease dissociation at 37 °C or 6 °C, we observe that collagenase digestion results in a stress response. We derive a core gene set of 512 heat shock and stress response genes, including FOS and JUN, induced by collagenase (37 °C), which are minimized by dissociation with a cold active protease (6 °C). While induction of these genes was highly conserved across all cell types, cell type-specific responses to collagenase digestion were observed in patient tissues. Conclusions The method and conditions of tumor dissociation influence cell yield and transcriptome state and are both tissue- and cell-type dependent. Interpretation of stress pathway expression differences in cancer single-cell studies, including components of surface immune recognition such as MHC class I, may be especially confounded. We define a core set of 512 genes that can assist with the identification of such effects in dissociated scRNA-seq experiments.

