IRIS3: integrated cell-type-specific regulon inference server from single-cell RNA-Seq

2020 ◽  
Vol 48 (W1) ◽  
pp. W275-W286 ◽  
Author(s):  
Anjun Ma ◽  
Cankun Wang ◽  
Yuzhou Chang ◽  
Faith H Brennan ◽  
Adam McDermaid ◽  
...  

Abstract A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSR), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable construction of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible from https://bmbl.bmi.osumc.edu/iris3/ with no login requirement.

2021 ◽  
Author(s):  
Kai Kang ◽  
Caizhi David Huang ◽  
Yuanyuan Li ◽  
David M. Umbach ◽  
Leping Li

Abstract Background: Biological tissues consist of heterogeneous populations of cells. Because gene expression patterns from bulk tissue samples reflect the contributions from all cells in the tissue, understanding the contribution of individual cell types to the overall gene expression in the tissue is fundamentally important. We recently developed a computational method, CDSeq, that can simultaneously estimate both sample-specific cell-type proportions and cell-type-specific gene expression profiles using only bulk RNA-Seq counts from multiple samples. Here we present an R implementation of CDSeq (CDSeqR) with significant performance improvement over the original implementation in MATLAB and with a new function to aid interpretation of deconvolution outcomes. The R package would be of interest for the broader R community. Results: We developed a novel strategy to substantially improve computational efficiency in both speed and memory usage. In addition, we designed and implemented a new function for annotating CDSeq-estimated cell types using publicly available single-cell RNA sequencing (scRNA-seq) data (single-cell data from 20 major organs are included in the R package). This function allows users to readily interpret and visualize the CDSeq-estimated cell types. We carried out additional validations of the CDSeqR software with in silico and in vitro mixtures and with real experimental data including RNA-seq data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project. Conclusions: The existing bulk RNA-seq repositories, such as TCGA and GTEx, provide enormous resources for better understanding changes in transcriptomics and human diseases. They are also potentially useful for studying cell-cell interactions in the tissue microenvironment. However, bulk level analyses neglect tissue heterogeneity and hinder investigation in a cell-type-specific fashion. The CDSeqR package can be viewed as providing in silico single-cell dissection of bulk measurements. It enables researchers to gain cell-type-specific information from bulk RNA-seq data.


2021 ◽  
Author(s):  
Su Chun ◽  
Long Gao ◽  
Catherine L May ◽  
James A Pippin ◽  
Keith Boehm ◽  
...  

Three-dimensional (3D) chromatin organization maps help to dissect cell type-specific gene regulatory programs. Furthermore, 3D chromatin maps have contributed to elucidating the pathogenesis of complex genetic diseases by connecting distal regulatory regions and genetic risk variants to their respective target genes. To understand the cell type-specific regulatory architecture of diabetes risk, we generated transcriptomic and 3D epigenomic profiles of human pancreatic acinar, alpha, and beta cells using single-cell RNA-seq, single-cell ATAC-seq, and high-resolution Hi-C of sorted cells. Comparisons of these profiles revealed differential A/B (open/closed) chromatin compartmentalization, chromatin looping, and control of cell type-specific gene regulatory programs. We identified a total of 1,094 putative causal-variant-target-gene pairs at 129 type 2 diabetes GWAS signals using pancreatic 3D chromatin maps. We found that the connections between candidate causal variants and their putative target effector genes are cell-type stratified and emphasize previously underappreciated roles for alpha and acinar cells in diabetes pathogenesis.


2020 ◽  
Author(s):  
Mohit Goyal ◽  
Guillermo Serrano ◽  
Ilan Shomorony ◽  
Mikel Hernaez ◽  
Idoia Ochoa

Abstract Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, thereby avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable and is more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
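
The rejection step mentioned in the JIND abstract (cell-type-specific confidence thresholds) can be pictured with a minimal sketch. It assumes a classifier has already produced one probability per cell type for each cell; the function name `assign_with_rejection`, the `"Unassigned"` label, and all threshold values are hypothetical and are not taken from the JIND code base.

```python
from typing import Dict


def assign_with_rejection(
    probs: Dict[str, float],
    thresholds: Dict[str, float],
    reject_label: str = "Unassigned",
) -> str:
    """Return the most probable cell type, or reject_label when that
    probability is below the threshold of the predicted cell type."""
    best_type = max(probs, key=probs.get)
    if probs[best_type] < thresholds[best_type]:
        return reject_label
    return best_type


# Hypothetical per-cell-type thresholds (illustrative values only).
thresholds = {"T cell": 0.90, "B cell": 0.70, "Monocyte": 0.80}

# Hypothetical classifier outputs for three cells.
cells = [
    {"T cell": 0.95, "B cell": 0.03, "Monocyte": 0.02},
    {"T cell": 0.85, "B cell": 0.10, "Monocyte": 0.05},
    {"T cell": 0.20, "B cell": 0.75, "Monocyte": 0.05},
]

labels = [assign_with_rejection(c, thresholds) for c in cells]
print(labels)
```

In this toy input the second cell is rejected because its top probability (0.85) is below the stricter threshold assumed for that class, which is the point of making thresholds class-specific rather than global.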


2019 ◽  
Author(s):  
Matthew N. Bernstein ◽  
Zhongjie Ma ◽  
Michael Gleicher ◽  
Colin N. Dewey

Summary Cell type annotation is a fundamental task in the analysis of single-cell RNA-sequencing data. In this work, we present CellO, a machine learning-based tool for annotating human RNA-seq data with the Cell Ontology. CellO enables accurate and standardized cell type classification by considering the rich hierarchical structure of known cell types, a source of prior knowledge that is not utilized by existing methods. Furthermore, CellO comes pre-trained on a novel, comprehensive dataset of human, healthy, untreated primary samples in the Sequence Read Archive, which, to the best of our knowledge, is the most diverse curated collection of primary cell data to date. CellO’s comprehensive training set enables it to run out-of-the-box on diverse cell types and to achieve superior or competitive performance when compared to existing state-of-the-art methods. Lastly, CellO’s linear models are easily interpreted, thereby enabling exploration of cell type-specific expression signatures across the ontology. To this end, we also present the CellO Viewer: a web application for exploring CellO’s models across the ontology. Highlights: We present CellO, a tool for hierarchically classifying cell type from single-cell RNA-seq data against the graph-structured Cell Ontology. CellO is pre-trained on a comprehensive dataset comprising nearly all bulk RNA-seq primary cell samples in the Sequence Read Archive. CellO achieves superior or comparable performance with existing methods while featuring a more comprehensive pre-packaged training set. CellO is built with easily interpretable models, which we expose through a novel web application, the CellO Viewer, for exploring cell type-specific signatures across the Cell Ontology.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e19013-e19013
Author(s):  
Marianne T. Santaguida ◽  
Ryosuke Kita ◽  
Steven A. Schaffert ◽  
Erica K. Anderson ◽  
Kamran A Ali ◽  
...  

e19013 Background: Understanding the heterogeneity of AML is necessary for developing targeted drugs and diagnostics. A key measure of heterogeneity is the variance in response to treatments. Previously, we developed an ex vivo flow cytometry drug sensitivity assay (DSA) that predicted response to treatments in myelodysplastic syndrome. Unlike bulk cell viability measures of other drug sensitivity assays, our flow cytometry assay provides single cell resolution. The assay measures a drug’s effect on the viability or functional state of specific cell types. Here we present the development of this technology for AML, with additional measurements of DNA-Seq and RNA-Seq. Using the data from this assay, we aim to characterize the heterogeneity in AML drug sensitivity and the molecular mechanisms that drive it. Methods: As an initial feasibility analysis, we assayed 1 bone marrow and 3 peripheral blood AML patient samples. For the DSA, the samples were cultured with six AML standard of care (SOC) compounds across seven doses, in addition to two combinations. The cells were stained to detect multiple cell types including tumor blasts, and drug response was measured by flow cytometry. For the multi-omics, the cells were magnetically sorted to enrich for blasts and then assayed using a targeted 400 gene DNA-Seq panel and whole bulk transcriptome RNA-Seq. For comparison with BeatAML, Pearson correlations between gene expression and venetoclax sensitivity were investigated. Results: In our drug sensitivity assay, we measured dose response curves for the six SOC compounds, for each different cell type across each sample. The dose responses had cell type specific effects, including differences in drug response between CD11b+ blasts, CD11b- blasts, and other non-blast populations. Integrating with the DNA-Seq and RNA-Seq data, known associations between ex vivo drug response and gene expression were identified with additional cell type specificity. 
For example, BCL2A1 expression was negatively correlated with venetoclax sensitivity in CD11b- blasts but not in CD11b+ blasts. To further corroborate, among the top 1000 genes associated with venetoclax sensitivity in BeatAML, 93.7% had concordant directionality in effect. Conclusions: Here we describe the development of an integrated ex vivo drug sensitivity assay and multi-omics dataset. The data demonstrated that ex vivo responses to compounds differ between cell types, highlighting the importance of measuring drug response in specific cell types. In addition, we demonstrated that integrating these data will provide unique insights on molecular mechanisms that affect cell type specific drug response. As we continue to expand the number of patient samples evaluated with our multi-dimensional platform, this dataset will provide insights for novel drug target discovery, biomarker development, and, in the future, informing treatment decisions.
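
The two statistics referred to in this abstract, a Pearson correlation between expression and drug sensitivity and the fraction of genes whose direction of effect agrees with a reference dataset, can be sketched as follows. This is a minimal illustration under assumed inputs: the gene labels, numeric values, and function names are made up and do not come from the study or from BeatAML.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def sign_concordance(ours: Dict[str, float], reference: Dict[str, float]) -> float:
    """Fraction of shared genes whose correlation has the same sign
    in both datasets."""
    shared = [g for g in ours if g in reference]
    if not shared:
        raise ValueError("no shared genes")
    agree = sum(1 for g in shared if ours[g] * reference[g] > 0)
    return agree / len(shared)


# Made-up values: drug sensitivity and one gene's expression in four samples.
sensitivity = [0.1, 0.4, 0.5, 0.9]
expression = [8.0, 6.0, 5.0, 1.0]
r = pearson(expression, sensitivity)

# Made-up per-gene correlations in two datasets (hypothetical gene labels).
ours = {"GENE_A": -0.8, "GENE_B": 0.5, "GENE_C": 0.2, "GENE_D": -0.1}
reference = {"GENE_A": -0.6, "GENE_B": 0.4, "GENE_C": -0.3, "GENE_D": -0.2}
concordance = sign_concordance(ours, reference)
print(round(r, 3), concordance)
```

Computing the correlation separately within each sorted population (for example, once per blast subset) is what would expose a cell-type-specific association of the kind reported above.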


2019 ◽  
Author(s):  
Pawel F. Przytycki ◽  
Katherine S. Pollard

Single-cell and bulk genomics assays have complementary strengths and weaknesses, and alone neither strategy can fully capture regulatory elements across the diversity of cells in complex tissues. We present CellWalker, a method that integrates single-cell open chromatin (scATAC-seq) data with gene expression (RNA-seq) and other data types using a network model that simultaneously improves cell labeling in noisy scATAC-seq and annotates cell-type specific regulatory elements in bulk data. We demonstrate CellWalker’s robustness to sparse annotations and noise using simulations and combined RNA-seq and ATAC-seq in individual cells. We then apply CellWalker to the developing brain. We identify cells transitioning between transcriptional states, resolve enhancers to specific cell types, and observe that autism and other neurological traits can be mapped to specific cell types through their enhancers.


2020 ◽  
Author(s):  
Miao Yu ◽  
Armen Abnousi ◽  
Yanxiao Zhang ◽  
Guoqiang Li ◽  
Lindsay Lee ◽  
...  

Single cell Hi-C (scHi-C) analysis has been increasingly used to map the chromatin architecture in diverse tissue contexts, but computational tools to define chromatin contacts at high resolution from scHi-C data are still lacking. Here, we describe SnapHiC, a method that can identify chromatin loops at high resolution and accuracy from scHi-C data. We benchmark SnapHiC against HiCCUPS, a common tool for mapping chromatin contacts in bulk Hi-C data, using scHi-C data from 742 mouse embryonic stem cells. We further demonstrate its utility by analyzing single-nucleus methyl-3C-seq data from 2,869 human prefrontal cortical cells. We uncover cell-type-specific chromatin loops and predict putative target genes for non-coding sequence variants associated with neuropsychiatric disorders. Our results suggest that SnapHiC could facilitate the analysis of cell-type-specific chromatin architecture and gene regulatory programs in complex tissues.


2018 ◽  
Author(s):  
Xuran Wang ◽  
Jihwan Park ◽  
Katalin Susztak ◽  
Nancy R. Zhang ◽  
Mingyao Li

Abstract We present MuSiC, a method that utilizes cell-type-specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. When applied to pancreatic islet and whole kidney expression data in human, mouse, and rat, MuSiC outperformed existing methods, especially for tissues with closely related cell types. MuSiC enables characterization of cellular heterogeneity of complex tissues for identification of disease mechanisms.
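
The general idea behind reference-based deconvolution can be sketched as a constrained regression: bulk expression is modeled as a proportion-weighted sum of cell-type-specific profiles, and the proportions are recovered under a non-negativity constraint. The sketch below uses plain projected gradient descent on an ordinary least-squares objective. It is a generic stand-in and does not reproduce MuSiC's gene weighting; the signature matrix, the function name `estimate_proportions`, and the iteration count are assumptions made for illustration.

```python
from typing import List


def estimate_proportions(
    signature: List[List[float]],  # rows: genes, columns: cell types
    bulk: List[float],             # one bulk expression value per gene
    n_iter: int = 2000,
) -> List[float]:
    """Non-negative least squares by projected gradient descent,
    followed by normalization so the proportions sum to one."""
    n_genes = len(signature)
    n_types = len(signature[0])
    # 1 / trace(S^T S) is a conservative step size: the trace is an
    # upper bound on the largest eigenvalue of S^T S.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signature[g][k] * p[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        p = [max(0.0, p[k] - step * grad[k]) for k in range(n_types)]
    total = sum(p)
    if total == 0:
        raise ValueError("all estimated proportions are zero")
    return [v / total for v in p]


# Made-up signature: 4 genes x 2 cell types.
signature = [[10.0, 1.0], [1.0, 10.0], [5.0, 5.0], [8.0, 2.0]]
true_props = [0.3, 0.7]
# Simulate a noiseless bulk sample from the mixing model.
bulk = [sum(s * p for s, p in zip(row, true_props)) for row in signature]

estimated = estimate_proportions(signature, bulk)
print([round(v, 4) for v in estimated])
```

With noiseless simulated input the estimate converges to the proportions used to build the mixture; real data would add noise, cross-subject variability, and platform differences, which is where method-specific weighting schemes matter.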


2021 ◽  
Author(s):  
Jiaxing Chen ◽  
Chinwang Cheong ◽  
Liang Lan ◽  
Xin Zhou ◽  
Jiming Liu ◽  
...  

Abstract Single-cell RNA sequencing is used to capture cell-specific gene expression, thus allowing reconstruction of gene regulatory networks. The existing algorithms struggle to deal with dropouts and cellular heterogeneity, and commonly require pseudotime-ordered cells. Here, we describe DeepDRIM, a supervised deep neural network that represents gene pair joint expression as images and considers the neighborhood context to eliminate the transitive interactions. DeepDRIM yields significantly better performance than the other nine algorithms used on the eight cell lines tested, and can be used to successfully discriminate key functional modules between patients with mild and severe symptoms of coronavirus disease 2019 (COVID-19).
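
One way to picture "gene pair joint expression as images" is a two-dimensional histogram in which each cell contributes one count to the bin defined by its expression of the two genes. The following is a minimal sketch of that idea only. The log transform, the equal-width binning, the default of 8 bins, and the function name are assumptions made for illustration, not DeepDRIM's published settings.

```python
import math
from typing import List, Sequence


def joint_expression_image(
    gene_a: Sequence[float],
    gene_b: Sequence[float],
    n_bins: int = 8,
) -> List[List[int]]:
    """Build an n_bins x n_bins count matrix from the per-cell
    expression of two genes (one value per cell for each gene)."""
    if len(gene_a) != len(gene_b) or not gene_a:
        raise ValueError("need two non-empty vectors of equal length")
    # log(1 + x) compresses the long right tail of count data.
    log_a = [math.log1p(v) for v in gene_a]
    log_b = [math.log1p(v) for v in gene_b]

    def bin_index(value: float, lo: float, hi: float) -> int:
        if hi == lo:
            return 0
        idx = int((value - lo) / (hi - lo) * n_bins)
        return min(idx, n_bins - 1)  # the maximum falls in the last bin

    lo_a, hi_a = min(log_a), max(log_a)
    lo_b, hi_b = min(log_b), max(log_b)
    image = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(log_a, log_b):
        image[bin_index(a, lo_a, hi_a)][bin_index(b, lo_b, hi_b)] += 1
    return image


# Made-up counts for two genes across six cells.
gene_a = [0, 1, 2, 5, 9, 20]
gene_b = [0, 0, 3, 4, 10, 18]
image = joint_expression_image(gene_a, gene_b)
for row in image:
    print(row)
```

A matrix like this has a fixed shape regardless of how many cells were sequenced, which is what makes it usable as input to a convolutional network.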


2021 ◽  
Author(s):  
Ruizhi Wang ◽  
Debomoy K. Lahiri

Abstract Alzheimer’s disease (AD) is marked by neurofibrillary tangles and senile plaques comprising amyloid β (Aβ) peptides. However, specific contributions of different cell types to Aβ deposition remain unknown. Non-coding microRNAs (miRNAs) play important roles in AD by regulating major proteins involved, such as Aβ precursor protein (APP) and β-site APP-cleaving enzyme (BACE1), two key proteins associated with Aβ biogenesis. MiRNAs typically silence protein expression by binding specific sites in the 3’-untranslated region (3’-UTR) of mRNA. MiRNAs regulate protein levels in a cell-type-specific manner; however, the mechanism underlying their variable activity remains unknown. We developed “miRNA-associated native protein expression” (miRnape) assays to determine a natural “UTR limit” for a miRNA’s function in a particular cell type. We report that miR-298 treatment reduced native APP protein levels in an astrocytic but not in a neuronal cell line. From miR-298’s effects on APP 3’-UTR activity and native protein levels, we infer that APP 3’-UTR length could explain miR-298’s differential activity. Such a truncated, but natural, 3’-UTR found in a specific cell type provides an opportunity to regulate native protein levels with a particular miRNA. Thus, tailoring a miRNA’s effect to a specific cell type, while bypassing an undesired cell type that has a truncated 3’-UTR, could potentially advance translational research.

