The landscape of human tissue and cell type specific expression and co-regulation of senescence genes

2022 ◽  
Vol 17 (1) ◽  
Author(s):  
Peng Xu ◽  
Minghui Wang ◽  
Won-min Song ◽  
Qian Wang ◽  
Guo-Cheng Yuan ◽  
...  

Abstract
Background: Cellular senescence is a complex stress response that impacts cellular function and organismal health. Multiple developmental and environmental factors, such as intrinsic cellular cues, radiation, oxidative stress, oncogenes, and protein accumulation, activate genes and pathways that can lead to senescence. Enormous efforts have been made to identify and characterize senescence genes (SnGs) in stress and disease systems. However, the prevalence of senescent cells in healthy human tissues and the global SnG expression signature in different cell types are poorly understood.
Methods: This study performed an integrative gene network analysis of bulk and single-cell RNA-seq data in non-diseased human tissues to investigate SnG co-expression signatures and their cell-type specificity.
Results: Through a comprehensive transcriptomic network analysis of 50 human tissues in the Genotype-Tissue Expression Project (GTEx) cohort, we identified SnG-enriched gene modules, characterized SnG co-expression patterns, and constructed aggregated SnG networks across primary tissues of the human body. Our network approaches identified 51 SnGs highly conserved across the human tissues, including CDKN1A (p21)-centered regulators that control cell cycle progression and the senescence-associated secretory phenotype (SASP). The SnG-enriched modules showed remarkable cell-type specificity, especially in fibroblasts, endothelial cells, and immune cells. Further analyses of single-cell RNA-seq and spatial transcriptomic data independently validated the cell-type specific SnG signatures predicted by the network analysis.
Conclusions: This study systematically revealed the co-regulated organizations and cell type specificity of SnGs in major human tissues, which can serve as a blueprint for future studies to map senescent cells and their cellular interactions in human tissues.
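As a rough illustration of what a co-expression signature is, the sketch below links two genes when their expression is strongly correlated across samples. It is a minimal sketch only: the Pearson measure, the 0.8 cutoff, the function names, and the toy expression values (including the placeholder genes GENE_A and GENE_B) are all assumptions for illustration, and the abstract does not state which network construction method the study used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant gene carries no co-expression information
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expr, threshold=0.8):
    """List gene pairs whose absolute correlation reaches the threshold."""
    edges = []
    for gene_1, gene_2 in combinations(sorted(expr), 2):
        r = pearson(expr[gene_1], expr[gene_2])
        if abs(r) >= threshold:
            edges.append((gene_1, gene_2, r))
    return edges


# Hypothetical expression values across four samples (not real data).
toy_expr = {
    "CDKN1A": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(toy_expr)
```

In this toy input only CDKN1A and GENE_A rise and fall together, so they form the single edge; a real analysis would run on thousands of genes and many samples per tissue.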

2021 ◽  
Vol 4 (6) ◽  
pp. e202001004
Author(s):  
Almut Lütge ◽  
Joanna Zyprych-Walczak ◽  
Urszula Brykczynska Kunzmann ◽  
Helena L Crowell ◽  
Daniela Calini ◽  
...  

A key challenge in single-cell RNA-sequencing (scRNA-seq) data analysis is batch effects, which can obscure the biological signal of interest. Although various tools and methods exist to correct for batch effects, their performance can vary. It is therefore important to understand how batch effects manifest in order to adjust for them. Here, we systematically explore batch effects across various scRNA-seq datasets according to magnitude, cell type specificity, and complexity. We developed a cell-specific mixing score (cms) that quantifies the mixing of cells from multiple batches. By considering distance distributions, the score can detect local batch bias and can differentiate between unbalanced batches and systematic differences between cells of the same cell type. We compare metrics in scRNA-seq data using real and synthetic datasets. Although these metrics target the same question and are used interchangeably, we find differences in scalability, sensitivity, and ability to handle differentially abundant cell types. We find that cell-specific metrics outperform cell type–specific and global metrics and recommend them for both method benchmarks and batch exploration.
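To make the idea of a per-cell mixing metric concrete, the sketch below computes, for each cell, the fraction of its k nearest neighbours that come from a different batch. This is explicitly not the cms described above, which works with distance distributions; it is a simpler stand-in chosen for illustration, and the function name, the value of k, and the toy coordinates are assumptions.

```python
from math import dist


def neighbour_batch_mixing(coords, batches, k=2):
    """For each cell, the share of its k nearest neighbours from another batch.

    Values near 0 suggest the cell sits in a batch-specific region;
    higher values suggest local mixing of batches.
    """
    scores = []
    for i, point in enumerate(coords):
        # Distances from cell i to every other cell, nearest first.
        others = sorted(
            (dist(point, coords[j]), j) for j in range(len(coords)) if j != i
        )
        nearest = [j for _, j in others[:k]]
        scores.append(sum(batches[j] != batches[i] for j in nearest) / k)
    return scores


# Hypothetical 1-D embedding: four interleaved cells, then a batch-B-only group.
coords = [(0.0,), (0.1,), (0.25,), (0.45,), (10.0,), (10.1,), (10.25,)]
batches = ["A", "B", "A", "B", "B", "B", "B"]
scores = neighbour_batch_mixing(coords, batches, k=2)
```

The last three cells score 0 because all of their neighbours share their batch, which is the kind of local batch bias a cell-level metric can expose and a single global summary can hide.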


2020 ◽  
Author(s):  
Mohit Goyal ◽  
Guillermo Serrano ◽  
Ilan Shomorony ◽  
Mikel Hernaez ◽  
Idoia Ochoa

Abstract: Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell-types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell-types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell-types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable and is more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
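The rejection step can be pictured as a per-class cutoff on the classifier's predicted probability. The sketch below is a minimal illustration under that assumption; the function name, the "Unassigned" label, and the threshold values are hypothetical and are not taken from the JIND code base, which learns its thresholds from data.

```python
def classify_with_rejection(probs, thresholds, reject_label="Unassigned"):
    """Return the most probable cell type, or reject_label if that type's
    probability falls below its own confidence threshold."""
    best_type = max(probs, key=probs.get)
    # A type with no threshold on record is never accepted.
    if probs[best_type] >= thresholds.get(best_type, float("inf")):
        return best_type
    return reject_label


# Hypothetical thresholds: T cells must be called with more confidence.
thresholds = {"T cell": 0.90, "B cell": 0.60}
confident = classify_with_rejection({"T cell": 0.95, "B cell": 0.05}, thresholds)
ambiguous = classify_with_rejection({"T cell": 0.70, "B cell": 0.30}, thresholds)
accepted_b = classify_with_rejection({"T cell": 0.35, "B cell": 0.65}, thresholds)
```

Using one threshold per cell type, rather than a single global cutoff, lets easily confused types demand more confidence than well-separated ones.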


2019 ◽  
Author(s):  
Matthew N. Bernstein ◽  
Zhongjie Ma ◽  
Michael Gleicher ◽  
Colin N. Dewey

Summary: Cell type annotation is a fundamental task in the analysis of single-cell RNA-sequencing data. In this work, we present CellO, a machine learning-based tool for annotating human RNA-seq data with the Cell Ontology. CellO enables accurate and standardized cell type classification by considering the rich hierarchical structure of known cell types, a source of prior knowledge that is not utilized by existing methods. Furthermore, CellO comes pre-trained on a novel, comprehensive dataset of human, healthy, untreated primary samples in the Sequence Read Archive, which, to the best of our knowledge, is the most diverse curated collection of primary cell data to date. CellO’s comprehensive training set enables it to run out-of-the-box on diverse cell types and achieve superior or competitive performance when compared to existing state-of-the-art methods. Lastly, CellO’s linear models are easily interpreted, thereby enabling exploration of cell type-specific expression signatures across the ontology. To this end, we also present the CellO Viewer: a web application for exploring CellO’s models across the ontology.

Highlights:
We present CellO, a tool for hierarchically classifying cell type from single-cell RNA-seq data against the graph-structured Cell Ontology.
CellO is pre-trained on a comprehensive dataset comprising nearly all bulk RNA-seq primary cell samples in the Sequence Read Archive.
CellO achieves superior or comparable performance with existing methods while featuring a more comprehensive pre-packaged training set.
CellO is built with easily interpretable models which we expose through a novel web application, the CellO Viewer, for exploring cell type-specific signatures across the Cell Ontology.


2019 ◽  
Vol 86 (8) ◽  
pp. 931-934 ◽  
Author(s):  
Stephany Foster ◽  
Yee Voan Teo ◽  
Nicola Neretti ◽  
Nathalie Oulhen ◽  
Gary M. Wessel

2018 ◽  
Author(s):  
Xuran Wang ◽  
Jihwan Park ◽  
Katalin Susztak ◽  
Nancy R. Zhang ◽  
Mingyao Li

Abstract: We present MuSiC, a method that utilizes cell-type specific gene expression from single-cell RNA sequencing (RNA-seq) data to characterize cell type compositions from bulk RNA-seq data in complex tissues. When applied to pancreatic islet and whole kidney expression data in humans, mice, and rats, MuSiC outperformed existing methods, especially for tissues with closely related cell types. MuSiC enables characterization of cellular heterogeneity of complex tissues for identification of disease mechanisms.
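The general idea behind this kind of deconvolution is that a bulk profile is approximately a non-negative mixture of cell-type signature profiles. The sketch below estimates the mixture by plain non-negative least squares using projected gradient descent. It is not MuSiC's estimator, whose details the abstract does not give; the function name, the solver, and the toy signature matrix are assumptions for illustration.

```python
def deconvolve(signature, bulk, n_iter=2000):
    """Estimate cell type proportions p >= 0 minimising ||signature @ p - bulk||^2.

    signature: rows are genes, columns are cell types (mean expression).
    bulk: one expression value per gene.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # Step size 1 / (sum of squared entries) is no larger than 1 / Lipschitz
    # constant of the gradient, so each update is a safe descent step.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signature[g][k] * p[k] for k in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        gradient = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step, then projection onto the non-negative orthant.
        p = [max(0.0, p[k] - step * gradient[k]) for k in range(n_types)]
    total = sum(p)
    return [v / total for v in p] if total > 0 else p


# Hypothetical signatures for two cell types over three genes.
signature = [[10.0, 0.0], [0.0, 8.0], [5.0, 5.0]]
# Bulk sample constructed as 70% type 0 and 30% type 1.
bulk = [7.0, 2.4, 5.0]
proportions = deconvolve(signature, bulk)
```

With closely related cell types the signature columns become nearly collinear and an unweighted fit like this one becomes unstable, which is the setting where the abstract reports the largest gains.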


2018 ◽  
Author(s):  
Aziz Khan ◽  
Anthony Mathelier ◽  
Xuegong Zhang

Abstract
Background: Super-enhancers and stretch enhancers represent classes of transcriptional enhancers that have been shown to control the expression of cell identity genes and carry disease- and trait-associated variants. Specifically, super-enhancers are clusters of enhancers defined based on the binding occupancy of master transcription factors (TFs), chromatin regulators, or chromatin marks, while stretch enhancers are large chromatin-defined regulatory regions of at least 3,000 base pairs. Several studies have characterized these regulatory regions in numerous cell types and tissues to decipher their functional importance. However, the differences and similarities between these regulatory regions have not been fully assessed.
Results: We integrated genomic, epigenomic, and transcriptomic data from ten human cell types to perform a comparative analysis of super-enhancers and stretch enhancers with respect to their chromatin profiles, cell-type specificity, and ability to control gene expression. We found that stretch enhancers are more abundant, more distal to transcription start sites, cover twice as much of the genome, and are significantly less conserved than super-enhancers. In contrast, super-enhancers are significantly more enriched for active chromatin marks and the cohesin complex, and are more transcriptionally active, than stretch enhancers. Importantly, a vast majority of super-enhancers (85%) overlap with only a small subset of stretch enhancers (13%), which are enriched for cell-type-specific biological functions and control cell identity genes.
Conclusions: These results suggest that super-enhancers are transcriptionally more active and more cell-type-specific than stretch enhancers. Importantly, most of the stretch enhancers that are distinct from super-enhancers do not show an association with cell identity genes, are less active, and are more likely to be poised enhancers.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Dustin J Sokolowski ◽  
Mariela Faykoo-Martinez ◽  
Lauren Erdman ◽  
Huayun Hou ◽  
Cadia Chan ◽  
...  

Abstract: RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples, and the resulting DEGs do not necessarily indicate the cell-types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit their widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by leveraging cell-type expression data generated by scRNA-seq and existing deconvolution methods. After evaluating scMappR with simulated RNA-seq data and benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small population of immune cells. While scMappR can work with user-supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its stand-alone use with bulk RNA-seq data from these species. Overall, scMappR is a user-friendly R package that complements traditional differential gene expression analysis of bulk RNA-seq data.


2020 ◽  
Author(s):  
Songwei Ge ◽  
Haohan Wang ◽  
Amir Alavi ◽  
Eric Xing ◽  
Ziv Bar-Joseph

Abstract: Dimensionality reduction is an important first step in the analysis of single cell RNA-seq (scRNA-seq) data. In addition to enabling the visualization of the profiled cells, such representations are used by many downstream analysis methods, ranging from pseudo-time reconstruction to clustering to alignment of scRNA-seq data from different experiments, platforms, and labs. Both supervised and unsupervised methods have been proposed to reduce the dimension of scRNA-seq data. However, all methods to date are sensitive to batch effects. When batches correlate with cell types, as is often the case, their impact can lead to representations that are batch specific rather than cell type specific. To overcome this, we developed a domain adversarial neural network model for learning a reduced dimension representation of scRNA-seq data. The adversarial model tries to simultaneously optimize two objectives. The first is the accuracy of cell type assignment and the second is the inability to distinguish the batch (domain). We tested the method by using the resulting representation to align several different datasets. As we show, by overcoming batch effects our method was able to correctly separate cell types, improving on several prior methods suggested for this task. Analysis of the top features used by the network indicates that, by taking the batch impact into account, the reduced representation is much better able to focus on key genes for each cell type.
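In domain-adversarial training generally, the two objectives described above are often written as a single saddle-point problem. The formulation below is a sketch in assumed notation, not the authors' exact loss: $\theta_f$ denotes the encoder that produces the reduced representation, $\theta_y$ the cell type classifier, $\theta_d$ the batch discriminator, and $\lambda$ a trade-off weight, all introduced here for illustration.

```latex
\[
(\hat{\theta}_f, \hat{\theta}_y)
  = \arg\min_{\theta_f,\,\theta_y}\;
    \mathcal{L}_{\text{type}}(\theta_f, \theta_y)
    - \lambda\, \mathcal{L}_{\text{batch}}(\theta_f, \hat{\theta}_d),
\qquad
\hat{\theta}_d
  = \arg\min_{\theta_d}\;
    \mathcal{L}_{\text{batch}}(\hat{\theta}_f, \theta_d)
\]
```

The encoder is rewarded for a low cell type loss and a high batch loss, while the discriminator tries to keep the batch loss low, so at equilibrium the representation retains cell type information that the discriminator cannot use to recover the batch.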


2022 ◽  
Author(s):  
Matthew T Buckley ◽  
Eric Sun ◽  
Benson M. George ◽  
Ling Liu ◽  
Nicholas Schaum ◽  
...  

Aging manifests as progressive dysfunction culminating in death. The diversity of cell types is a challenge to the precise quantification of aging and its reversal. Here we develop a suite of 'aging clocks' based on single cell transcriptomic data to characterize cell type-specific aging and rejuvenation strategies. The subventricular zone (SVZ) neurogenic region contains many cell types and provides an excellent system to study cell-level tissue aging and regeneration. We generated 21,458 single-cell transcriptomes from the neurogenic regions of 28 mice, tiling ages from young to old. With these data, we trained a suite of single cell-based regression models (aging clocks) to predict both chronological age (passage of time) and biological age (fitness, in this case the proliferative capacity of the neurogenic region). Both types of clocks perform well on independent cohorts of mice. Genes underlying the single cell-based aging clocks are mostly cell-type specific, but also include a few shared genes in the interferon and lipid metabolism pathways. We used these single cell-based aging clocks to measure transcriptomic rejuvenation, by generating single cell RNA-seq datasets of SVZ neurogenic regions for two interventions - heterochronic parabiosis (young blood) and exercise. Interestingly, the use of aging clocks reveals that both heterochronic parabiosis and exercise reverse transcriptomic aging in the niche, but in different ways across cell types and genes. This study represents the first development of high-resolution aging clocks from single cell transcriptomic data and demonstrates their application to quantify transcriptomic rejuvenation.


2020 ◽  
Author(s):  
Jiaxin Fan ◽  
Xuran Wang ◽  
Rui Xiao ◽  
Mingyao Li

Abstract: Allelic expression imbalance (AEI), quantified by the relative expression of two alleles of a gene in a diploid organism, can help explain phenotypic variations among individuals. Traditional methods detect AEI using bulk RNA sequencing (RNA-seq) data, a data type that averages out cell-to-cell heterogeneity in gene expression across cell types. Since the patterns of AEI may vary across different cell types, it is desirable to study AEI in a cell-type-specific manner. Although this can be achieved by single-cell RNA sequencing (scRNA-seq), it requires full-length transcripts to be sequenced in single cells of a large number of individuals, which is still cost-prohibitive. To overcome this limitation and utilize the vast amount of existing disease-relevant bulk tissue RNA-seq data, we developed BSCET, which enables the characterization of cell-type-specific AEI in bulk RNA-seq data by integrating cell type composition information inferred from a small set of scRNA-seq samples, possibly obtained from an external dataset. By modeling covariate effects, BSCET can also detect genes whose cell-type-specific AEI is associated with clinical factors. Through extensive benchmark evaluations, we show that BSCET correctly detected genes with cell-type-specific AEI and differential AEI between healthy and diseased samples using bulk RNA-seq data. BSCET also uncovered cell-type-specific AEIs that were missed in bulk data analysis when the directions of AEI are opposite in different cell types. We further applied BSCET to two pancreatic islet bulk RNA-seq datasets and detected genes showing cell-type-specific AEI that are related to the progression of type 2 diabetes. Since bulk RNA-seq data are easily accessible, BSCET provides a convenient tool to integrate information from scRNA-seq data to gain insight into AEI at cell type resolution. Results from such analysis will advance our understanding of cell type contributions in human diseases.

Author Summary: Detection of allelic expression imbalance (AEI), a phenomenon where the two alleles of a gene differ in their expression magnitude, is a key step towards the understanding of phenotypic variations among individuals. Existing methods detect AEI using bulk RNA sequencing (RNA-seq) data and ignore AEI variations among different cell types. Although single-cell RNA sequencing (scRNA-seq) has enabled the characterization of cell-to-cell heterogeneity in gene expression, the high costs have limited its application in AEI analysis. To overcome this limitation, we developed BSCET to characterize cell-type-specific AEI using the widely available bulk RNA-seq data by integrating cell-type composition information inferred from scRNA-seq samples. Since the degree of AEI may vary with disease phenotypes, we further extended BSCET to detect genes whose cell-type-specific AEIs are associated with clinical factors. Through extensive benchmark evaluations and analyses of two pancreatic islet bulk RNA-seq datasets, we demonstrated BSCET’s ability to refine bulk-level AEI to cell-type resolution, and to identify genes whose cell-type-specific AEIs are associated with the progression of type 2 diabetes. With the vast amount of easily accessible bulk RNA-seq data, we believe BSCET will be a valuable tool for elucidating cell type contributions in human diseases.
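For orientation, the bulk-level quantity that cell-type-specific methods refine can be sketched as a reference-allele read fraction with an exact binomial test against balanced expression. This is a minimal sketch of that baseline only, not of BSCET's model; the function names, the 0.5 null value, and the read counts are assumptions for illustration.

```python
from math import comb


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of allele-informative reads carrying the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no allele-informative reads")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value for the null of balanced expression.

    Sums the probabilities of all outcomes that are no more likely than
    the observed reference read count.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties with the observed outcome are included.
    return min(1.0, sum(v for v in pmf if v <= observed * (1 + 1e-9)))


# Hypothetical counts at one heterozygous site: 15 reference, 5 alternative.
ratio = allelic_ratio(15, 5)
p_value = binomial_two_sided_p(15, 5)
```

A bulk-level test like this can miss imbalance entirely when two cell types favour opposite alleles, since the counts cancel in the mixture, which is the case the abstract highlights.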

