Rapid single cell evaluation of human disease and disorder targets using REVEAL: SingleCell™

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Namit Kumar ◽  
Ryan Golhar ◽  
Kriti Sen Sharma ◽  
James L. Holloway ◽  
Srikant Sarangi ◽  
...  

Abstract
Background: Single-cell (sc) sequencing performs unbiased profiling of individual cells and enables evaluation of less prevalent cellular populations that are often missed by bulk sequencing. However, the scale and complexity of sc datasets pose a great challenge to their use, and this problem is further exacerbated when working with the larger datasets typically generated by consortium efforts. As the scale of single-cell datasets continues to increase exponentially, there is an unmet technological need for database platforms that can evaluate key biological hypotheses by querying extensive single-cell datasets. Large single-cell datasets like the Human Cell Atlas and the COVID-19 Cell Atlas (collections of annotated sc datasets from various human organs) are excellent resources for profiling target genes involved in human diseases and disorders, ranging from oncology and auto-immunity to infectious diseases like COVID-19, caused by the SARS-CoV-2 virus. SARS-CoV-2 infections have led to a worldwide pandemic with massive loss of life and more than 7 million cases. To infect human cells, the virus uses ACE2 and TMPRSS2, key viral-entry-associated proteins expressed in those cells. Evaluating the expression profile of key genes in large single-cell datasets can facilitate testing for diagnostics, therapeutics, and vaccine targets, as the world struggles to cope with the on-going spread of COVID-19 infections.
Main body: In this manuscript we describe REVEAL: SingleCell, which enables storage, retrieval, and rapid query of single-cell datasets inclusive of millions of cells. The array-native database described here enables selecting and analyzing cells across multiple studies. Cells can be selected using individual metadata tags, more complex hierarchical ontology filtering, and gene expression threshold ranges, including co-expression of multiple genes. The tags on selected cells can be further evaluated for testing biological hypotheses. One such example is identifying the most prevalent cell type annotation tag on returned cells. We used REVEAL: SingleCell to evaluate the expression of key SARS-CoV-2 entry-associated genes, and queried the current database (2.2 million cells, 32 projects) to obtain the results in < 60 s. We show that COVID-19-associated genes are expressed in multiple tissue types, which in part explains the multi-organ involvement in infected patients observed worldwide during the on-going COVID-19 pandemic.
Conclusion: In this paper, we introduce the REVEAL: SingleCell database, which addresses immediate needs for SARS-CoV-2 research and has the potential to be used more broadly for many precision medicine applications. We used the REVEAL: SingleCell database as a reference to ask questions relevant to drug development and precision medicine regarding cell type and co-expression for genes that encode proteins necessary for SARS-CoV-2 to enter and reproduce in cells.
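
The selection workflow described in the abstract above (metadata tags, expression threshold ranges, co-expression of several genes, then the most prevalent annotation tag on the returned cells) can be sketched in plain Python. This is only an illustration under stated assumptions, not the REVEAL: SingleCell API: the record layout, the `select_cells` and `most_prevalent_tag` helpers, and every value in the toy data are hypothetical.

```python
from collections import Counter

# Toy cell records. Field names and expression values are invented for illustration.
CELLS = [
    {"study": "A", "tissue": "lung", "cell_type": "AT2", "expr": {"ACE2": 2.0, "TMPRSS2": 3.1}},
    {"study": "A", "tissue": "lung", "cell_type": "AT2", "expr": {"ACE2": 1.2, "TMPRSS2": 0.0}},
    {"study": "B", "tissue": "gut", "cell_type": "enterocyte", "expr": {"ACE2": 4.0, "TMPRSS2": 2.2}},
    {"study": "B", "tissue": "gut", "cell_type": "enterocyte", "expr": {"ACE2": 3.3, "TMPRSS2": 1.5}},
    {"study": "B", "tissue": "gut", "cell_type": "goblet", "expr": {"ACE2": 0.0, "TMPRSS2": 2.9}},
]


def select_cells(cells, tags=None, thresholds=None):
    """Return cells that match every metadata tag and every expression range.

    tags:       dict mapping a metadata field to its required value.
    thresholds: dict mapping a gene to an inclusive (low, high) range;
                listing several genes requires co-expression of all of them.
    """
    tags = tags or {}
    thresholds = thresholds or {}
    selected = []
    for cell in cells:
        # Skip the cell if any requested metadata tag does not match.
        if any(cell.get(field) != value for field, value in tags.items()):
            continue
        # Keep the cell only if every gene falls inside its range;
        # a gene missing from the record is treated as zero expression.
        if all(low <= cell["expr"].get(gene, 0.0) <= high
               for gene, (low, high) in thresholds.items()):
            selected.append(cell)
    return selected


def most_prevalent_tag(cells, field):
    """Return (tag, count) for the most common value of `field`, or None if empty."""
    counts = Counter(cell[field] for cell in cells)
    return counts.most_common(1)[0] if counts else None


# Cells co-expressing both entry-associated genes at or above an arbitrary cutoff of 1.0.
at_least_one = (1.0, float("inf"))
coexpressing = select_cells(CELLS, thresholds={"ACE2": at_least_one, "TMPRSS2": at_least_one})
print(len(coexpressing), most_prevalent_tag(coexpressing, "cell_type"))
```

A real array-native database would evaluate such filters inside the storage engine rather than by looping over records; the loop here only makes the selection logic explicit.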

2020 ◽  
Author(s):  
Namit Kumar ◽  
Ryan Golhar ◽  
Kriti Sen Sharma ◽  
James L Holloway ◽  
Srikant Sarangi ◽  
...  

Abstract Single-cell (sc) sequencing performs unbiased profiling of individual cells and enables evaluation of less prevalent cellular populations that are often missed by bulk sequencing. However, the scale and complexity of sc datasets pose a great challenge to their use, and this problem is further exacerbated when working with the larger datasets typically generated by consortium efforts. As the scale of single-cell datasets continues to increase exponentially, there is an unmet technological need for database platforms that can evaluate key biological hypotheses by querying extensive single-cell datasets. Large single-cell datasets like the Human Cell Atlas and the COVID-19 Cell Atlas (collections of annotated sc datasets from various human organs) are excellent resources for profiling target genes involved in human diseases and disorders, ranging from oncology and auto-immunity to infectious diseases like COVID-19, caused by the SARS-CoV-2 virus. SARS-CoV-2 infections have led to a worldwide pandemic with massive loss of life and more than 7 million cases. To infect human cells, the virus uses ACE2 and TMPRSS2, key viral-entry-associated proteins expressed in those cells. Evaluating the expression profile of key genes in large single-cell datasets can facilitate testing for diagnostics, therapeutics, and vaccine targets, as the world struggles to cope with the on-going spread of COVID-19 infections. In this manuscript we describe REVEAL: SingleCell, which enables storage, retrieval, and rapid query of single-cell datasets inclusive of millions of cells. The analytical database described here enables selecting and analyzing cells across multiple studies. Cells can be selected using individual metadata tags, more complex hierarchical ontology filtering, and gene expression threshold ranges, including co-expression of multiple genes. The tags on selected cells can be further evaluated for testing biological hypotheses.
One such example is identifying the most prevalent cell type annotation tag on returned cells. We used REVEAL: SingleCell to evaluate the expression of key SARS-CoV-2 entry-associated genes, and queried the current database (2.2 million cells, 32 projects) to obtain the results in < 60 seconds. We show that COVID-19-associated genes are expressed in multiple tissue types, which in part explains the multi-organ involvement in infected patients observed worldwide during the on-going COVID-19 pandemic.


2020 ◽  
Author(s):  
Miao Yu ◽  
Armen Abnousi ◽  
Yanxiao Zhang ◽  
Guoqiang Li ◽  
Lindsay Lee ◽  
...  

Single cell Hi-C (scHi-C) analysis has been increasingly used to map the chromatin architecture in diverse tissue contexts, but computational tools to define chromatin contacts at high resolution from scHi-C data are still lacking. Here, we describe SnapHiC, a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. We benchmark SnapHiC against HiCCUPS, a common tool for mapping chromatin contacts in bulk Hi-C data, using scHi-C data from 742 mouse embryonic stem cells. We further demonstrate its utility by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells. We uncover cell-type-specific chromatin loops and predict putative target genes for non-coding sequence variants associated with neuropsychiatric disorders. Our results suggest that SnapHiC could facilitate the analysis of cell-type-specific chromatin architecture and gene regulatory programs in complex tissues.


2019 ◽  
Author(s):  
Joshua Chiou ◽  
Chun Zeng ◽  
Zhang Cheng ◽  
Jee Yun Han ◽  
Michael Schlichting ◽  
...  

Abstract Genetic risk variants for complex, multifactorial diseases are enriched in cis-regulatory elements. Single cell epigenomic technologies create new opportunities to dissect cell type-specific mechanisms of risk variants, yet this approach has not been widely applied to disease-relevant tissues. Given the central role of pancreatic islets in type 2 diabetes (T2D) pathophysiology, we generated accessible chromatin profiles from 14.2k islet cells and identified 13 cell clusters including multiple alpha, beta and delta cell clusters which represented hormone-producing and signal-responsive cell states. We cataloged 244,236 islet cell type accessible chromatin sites and identified transcription factors (TFs) underlying both lineage- and state-specific regulation. We measured the enrichment of T2D and glycemic trait GWAS for the accessible chromatin profiles of single cells, which revealed heterogeneity in the effects of beta cell states and TFs on fasting glucose and T2D risk. We further used machine learning to predict the cell type-specific regulatory function of genetic variants, and single cell co-accessibility to link distal sites to putative cell type-specific target genes. We localized 239 fine-mapped T2D risk signals to islet accessible chromatin, and further prioritized variants at these signals with predicted regulatory function and co-accessibility with target genes. At the KCNQ1 locus, the causal T2D variant rs231361 had predicted effects on an enhancer with beta cell-specific, long-range co-accessibility to the insulin promoter, and deletion of this enhancer reduced insulin gene and protein expression in human embryonic stem cell-derived beta cells. Our findings provide a cell type- and state-resolved map of gene regulation in human islets, illuminate likely mechanisms of T2D risk at hundreds of loci, and demonstrate the power of single cell epigenomics for interpreting complex disease genetics.


2020 ◽  
Vol 48 (W1) ◽  
pp. W275-W286 ◽  
Author(s):  
Anjun Ma ◽  
Cankun Wang ◽  
Yuzhou Chang ◽  
Faith H Brennan ◽  
Adam McDermaid ◽  
...  

Abstract A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSRs), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable construction of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible from https://bmbl.bmi.osumc.edu/iris3/ with no login requirement.


2019 ◽  
Author(s):  
Ashley G. Anderson ◽  
Ashwinikumar Kulkarni ◽  
Matthew Harper ◽  
Genevieve Konopka

Abstract The striatum is a critical forebrain structure for integrating cognitive, sensory, and motor information from diverse brain regions into meaningful behavioral output. However, the transcriptional mechanisms that underlie striatal development and organization at single-cell resolution remain unknown. Here, we show that Foxp1, a transcription factor strongly linked to autism and intellectual disability, regulates organizational features of striatal circuitry in a cell-type-dependent fashion. Using single-cell RNA-sequencing, we examine the cellular diversity of the early postnatal striatum and find that cell-type-specific deletion of Foxp1 in striatal projection neurons alters the cellular composition and neurochemical architecture of the striatum. Importantly, using this approach, we identify the non-cell autonomous effects produced by disrupting Foxp1 in one cell-type and the molecular compensation that occurs in other populations. Finally, we identify Foxp1-regulated target genes within distinct cell-types and connect these molecular changes to functional and behavioral deficits relevant to phenotypes described in patients with FOXP1 loss-of-function mutations. These data reveal cell-type-specific transcriptional mechanisms underlying distinct features of striatal circuitry and identify Foxp1 as a key regulator of striatal development.


2021 ◽  
Vol 18 (9) ◽  
pp. 1056-1059
Author(s):  
Miao Yu ◽  
Armen Abnousi ◽  
Yanxiao Zhang ◽  
Guoqiang Li ◽  
Lindsay Lee ◽  
...  

Abstract Single-cell Hi-C (scHi-C) analysis has been increasingly used to map chromatin architecture in diverse tissue contexts, but computational tools to define chromatin loops at high resolution from scHi-C data are still lacking. Here, we describe Single-Nucleus Analysis Pipeline for Hi-C (SnapHiC), a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. Using scHi-C data from 742 mouse embryonic stem cells, we benchmark SnapHiC against a number of computational tools developed for mapping chromatin loops and interactions from bulk Hi-C. We further demonstrate its use by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells, which uncovers cell type-specific chromatin loops and predicts putative target genes for noncoding sequence variants associated with neuropsychiatric disorders. Our results indicate that SnapHiC could facilitate the analysis of cell type-specific chromatin architecture and gene regulatory programs in complex tissues.


2021 ◽  
Vol 12 ◽  
Author(s):  
Neetesh Pandey ◽  
Omkar Chandra ◽  
Shreya Mishra ◽  
Vibhor Kumar

Single-cell open-chromatin profiles have the potential to reveal the pattern of chromatin interactions in a cell type. However, currently available cis-regulatory network prediction methods using single-cell open-chromatin profiles focus more on local chromatin interactions, despite the fact that long-range interactions among genomic sites play a significant role in gene regulation. Here, we propose a method that predicts both short- and long-range interactions among genomic sites using single-cell open-chromatin profiles. Our method, termed single-cell epigenome-based chromatin-interaction analysis (scEChIA), exploits signal imputation and refined L1 regularization. For a few single-cell open-chromatin profiles, scEChIA outperformed other tools even in terms of prediction accuracy. Using scEChIA, we predicted almost 0.7 million interactions among genomic sites across seven cell types in the human brain. Further analysis revealed the cell type for connections between genes and expression quantitative trait loci (eQTLs) in the human brain and provided insight into the target genes of human-accelerated elements and disease-associated mutations. Our analysis enabled by scEChIA also hints at the possible action of a few transcription factors (TFs), especially through long-range interactions in brain endothelial cells.


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Suvi Linna Kuosmanen ◽  
Eloi Schmauch ◽  
Kyriakitsa Galani ◽  
Carles Boix ◽  
Yongjin P Park ◽  
...  

Genome-wide association studies have uncovered over 200 genetic loci underlying coronary artery disease (CAD), providing great hope for a deeper understanding of the causal mechanisms leading to this disease. However, in order to understand CAD at the molecular level, it is necessary to uncover cell-type-specific circuits and to use these circuits to dissect driver variants, genes, pathways, and cell types, in normal and diseased tissues. Here, we provide the most detailed single-cell dissection of human heart cell types, using cardiac biopsies collected during open-heart surgery from healthy, CAD, and CAD-related heart failure donors, and profiling both transcriptional (scRNA-seq) and epigenomic (scATAC-seq) changes. Using this approach, we identify 12 major heart cell types, including typical cardiovascular cells (cardiomyocytes, endothelial cells, fibroblasts), rarer cell types (B cells, neurons, Schwann cells), and previously-unrecognized layer-specific epithelial and endothelial cell types. We define markers for each cell type, providing the first extensive reference set for the living human heart. In addition, we define differential gene expression patterns in CAD relative to control samples, revealing substantial differences in cell-type-specific expression of disease-related genes, emphasizing, for example, the importance of the vascular endothelium in the pathogenesis of CAD. Strikingly, further clustering of the cell types based on specific subtypes revealed important differences in their expression patterns of disease-associated genes. These changes enrich in known CAD genetic loci, enabling us to recognize their likely target genes from scRNA-seq expression changes, candidate driver variants based on scATAC-seq localization and differential DNA accessibility, and candidate upstream regulators based on their enriched motif occurrences in scATAC loci. 
Overall, our results highlight the relevance and potential of single-cell transcriptional and epigenomic analyses to gain new biological insights into cardiovascular disease, and to recognize novel therapeutic target genes, pathways, and the cell types where they act.


Author(s):  
Christoph Muus ◽  
Malte D. Luecken ◽  
Gokcen Eraslan ◽  
Avinash Waghray ◽  
Graham Heimberg ◽  
...  

Abstract The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, creates an urgent need for identifying molecular mechanisms that mediate viral entry, propagation, and tissue pathology. Cell membrane bound angiotensin-converting enzyme 2 (ACE2) and associated proteases, transmembrane protease serine 2 (TMPRSS2) and Cathepsin L (CTSL), were previously identified as mediators of SARS-CoV-2 cellular entry. Here, we assess the cell type-specific RNA expression of ACE2, TMPRSS2, and CTSL through an integrated analysis of 107 single-cell and single-nucleus RNA-Seq studies, including 22 lung and airways datasets (16 unpublished), and 85 datasets from other diverse organs. Joint expression of ACE2 and the accessory proteases identifies specific subsets of respiratory epithelial cells as putative targets of viral infection in the nasal passages, airways, and alveoli. Cells that co-express ACE2 and proteases are also identified in other organs, some of which have been associated with COVID-19 transmission or pathology, including gut enterocytes, corneal epithelial cells, cardiomyocytes, heart pericytes, olfactory sustentacular cells, and renal epithelial cells. Performing the first meta-analyses of scRNA-seq studies, we analyzed 1,176,683 cells from 282 nasal, airway, and lung parenchyma samples from 164 donors spanning fetal, childhood, adult, and elderly age groups, and associated increased levels of ACE2, TMPRSS2, and CTSL in specific cell types with increasing age, male gender, and smoking, all of which are epidemiologically linked to COVID-19 susceptibility and outcomes. Notably, there was a particularly low expression of ACE2 in the few young pediatric samples in the analysis. Further analysis reveals a gene expression program shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues, including genes that may mediate viral entry, subtend key immune functions, and mediate epithelial-macrophage cross-talk.
Amongst these are IL6, its receptor and co-receptor, IL1R, TNF response pathways, and complement genes. Cell type specificity in the lung and airways and smoking effects were conserved in mice. Our analyses suggest that differences in the cell type-specific expression of mediators of SARS-CoV-2 viral entry may be responsible for aspects of COVID-19 epidemiology and clinical course, and point to putative molecular pathways involved in disease susceptibility and pathogenesis.
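
The idea of scoring joint expression of ACE2 and an accessory protease per cell type can be illustrated with a small Python sketch. It is not the authors' pipeline: the `double_positive_fraction` helper, the `min_count` cutoff of one transcript, the "AT2" label, and all counts in the toy data are assumptions made for this example.

```python
from collections import defaultdict

# Toy (cell_type, transcript_counts) pairs. All values are invented for illustration.
CELLS = [
    ("AT2", {"ACE2": 1, "TMPRSS2": 4}),
    ("AT2", {"ACE2": 0, "TMPRSS2": 2}),
    ("AT2", {"TMPRSS2": 1}),
    ("AT2", {"ACE2": 2, "TMPRSS2": 1}),
    ("enterocyte", {"ACE2": 3, "TMPRSS2": 2}),
    ("enterocyte", {"ACE2": 5, "TMPRSS2": 0}),
    ("enterocyte", {"ACE2": 1, "TMPRSS2": 1}),
    ("enterocyte", {"ACE2": 2, "TMPRSS2": 6}),
]


def double_positive_fraction(cells, genes=("ACE2", "TMPRSS2"), min_count=1):
    """Return, per cell type, the fraction of cells expressing all `genes`.

    A cell counts as positive for a gene when its count is >= min_count
    (a hypothetical cutoff); genes absent from a record count as zero.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cell_type, counts in cells:
        totals[cell_type] += 1
        if all(counts.get(gene, 0) >= min_count for gene in genes):
            positives[cell_type] += 1
    return {cell_type: positives[cell_type] / totals[cell_type] for cell_type in totals}


fractions = double_positive_fraction(CELLS)
for cell_type, fraction in sorted(fractions.items()):
    print(f"{cell_type}: {fraction:.2f} of cells are ACE2+TMPRSS2+")
```

Passing a different `genes` tuple, for example one that adds CTSL, asks the same question for another combination of entry mediators.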

