Recent Progress in Cardiovascular Research Involving Single-Cell Omics Approaches

Cardiovascular diseases are among the leading causes of morbidity and mortality worldwide. Although the heart has long been studied across the spectrum from development to disease, it remains largely enigmatic. The emergence of single-cell omics technologies has provided a powerful toolbox for defining cell heterogeneity, unraveling previously unknown pathways, and revealing intercellular communications, thereby boosting biomedical research and yielding numerous novel findings over the last 7 years. Neither the cell atlases of normal and developing hearts, which provide substantial research resources, nor several important findings regarding cell-type-specific disease gene programs could have been established without single-cell omics technologies. Herein, we briefly describe the latest technological advances in single-cell omics and summarize the major findings achieved by such approaches, with a focus on development and homeostasis of the heart, myocardial infarction, and heart failure.

Single-cell transcriptomics reveal cell type-specific molecular changes and altered intercellular communications in chronic obstructive pulmonary disease

10.1101/2021.02.23.432590 ◽ 2021

Author(s): Qiqing Huang ◽ Jingshen Wang ◽ Shaoran Shen ◽ Yuanyuan Wang ◽ Yan Chen ◽ ...

Keyword(s): Chronic Obstructive Pulmonary Disease ◽ Single Cell ◽ Pulmonary Disease ◽ Alveolar Epithelium ◽ Cell Types ◽ Chronic Obstructive ◽ Cell Type ◽ Obstructive Pulmonary Disease ◽ Cell Type Specific ◽ Intercellular Communications

Abstract

Chronic obstructive pulmonary disease (COPD) is a common and heterogeneous respiratory disease whose molecular complexity remains poorly understood, as do the mechanisms by which aging and smoking facilitate COPD development. Here, using single-cell RNA sequencing of more than 65,000 cells from COPD and age-stratified control lung tissues of donors with different smoking histories, we identified monocytes, club cells, and macrophages as the most disease-, aging-, and smoking-relevant cell types, respectively. Notably, we found that these highly cell-type-specific changes under different conditions converged on cellular dysfunction of the alveolar epithelium. Deeper investigation revealed that the alveolar epithelium damage could be attributed to abnormally activated monocytes in COPD lungs, an effect that could be amplified by the exhaustion of club cell stemness with age. Moreover, the enhanced intercellular communications in COPD lungs, as well as the pro-inflammatory interaction between macrophages and endothelial cells induced by smoking, could facilitate signaling between monocytes and the alveolar epithelium. Our findings complement the existing model of COPD pathogenesis by emphasizing the contributions of previously less appreciated cell types and highlighting their candidacy as potential therapeutic targets for COPD.

Conditional out-of-distribution generation for unpaired data using transfer VAE

Bioinformatics ◽ 10.1093/bioinformatics/btaa800 ◽ 2020 ◽ Vol 36 (Supplement_2) ◽ pp. i610-i617

Author(s): Mohammad Lotfollahi ◽ Mohsen Naghipourfar ◽ Fabian J Theis ◽ F Alexander Wolf

Keyword(s): Single Cell ◽ Generative Models ◽ Response To Treatment ◽ High Dimensional ◽ Compact Representation ◽ Hair Color ◽ Great Success ◽ Cell Type ◽ Style Transfer ◽ Cell Type Specific

Abstract

Motivation: While generative models have shown great success in sampling high-dimensional samples conditional on low-dimensional descriptors (stroke thickness in MNIST, hair color in CelebA, speaker identity in WaveNet), their generation out-of-distribution poses fundamental problems due to the difficulty of learning a compact joint distribution across conditions. The canonical example of the conditional variational autoencoder (CVAE), for instance, does not explicitly relate conditions during training and, hence, has no explicit incentive to learn such a compact representation.

Results: We overcome the limitation of the CVAE by matching distributions across conditions using maximum mean discrepancy in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. As this amounts to solving a style-transfer problem, we refer to the model as transfer VAE (trVAE). Benchmarking trVAE on high-dimensional image and single-cell RNA-seq data, we demonstrate higher robustness and higher accuracy than existing approaches. We also show qualitatively improved predictions by tackling previously problematic minority classes and multiple conditions in the context of cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively. We further demonstrate that trVAE learns cell-type-specific responses after perturbation and improves the prediction of most cell-type-specific genes by 65%.

Availability and implementation: The trVAE implementation is available via github.com/theislab/trvae. The results of this article can be reproduced via github.com/theislab/trvae_reproducibility.
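
The maximum mean discrepancy (MMD) mentioned in the abstract can be illustrated with a small, self-contained sketch. This is not the trVAE implementation: the Gaussian kernel, the `gamma` value, the biased estimator, and the toy samples are all assumptions chosen for illustration. It only shows why MMD is near zero for samples from the same distribution and grows when two conditions differ.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(rbf_kernel(u, v, gamma) for u in a for v in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy 2-D "latent" samples for two hypothetical conditions.
control = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
shifted = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]

same = mmd_squared(control, control)   # identical samples
diff = mmd_squared(control, shifted)   # clearly separated samples
print(f"MMD^2 same={same:.4f} different={diff:.4f}")
```

Used as a penalty during training, a term of this kind would push the representations of different conditions toward each other, which is the regularizing effect the abstract describes.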

Comprehensive analysis of single cell ATAC-seq data with SnapATAC

Nature Communications ◽ 10.1038/s41467-021-21583-9 ◽ 2021 ◽ Vol 12 (1)

Author(s): Rongxin Fang ◽ Sebastian Preissl ◽ Yang Li ◽ Xiaomeng Hou ◽ Jacinta Lucero ◽ ...

Keyword(s): Single Cell ◽ Single Cell Analysis ◽ Expression Patterns ◽ Regulatory Elements ◽ Cellular Heterogeneity ◽ Specific Gene ◽ Open Chromatin ◽ Cell Type ◽ Process Data ◽ Cell Type Specific

Abstract

Identification of the cis-regulatory elements controlling cell-type-specific gene expression patterns is essential for understanding the origin of cellular diversity. Conventional assays that map regulatory elements via open chromatin analysis of primary tissues are hindered by sample heterogeneity. Single-cell analysis of accessible chromatin (scATAC-seq) can overcome this limitation. However, the high level of noise in each single-cell profile and the large volume of data pose unique computational challenges. Here, we introduce SnapATAC, a software package for analyzing scATAC-seq datasets. SnapATAC dissects cellular heterogeneity in an unbiased manner and maps the trajectories of cellular states. Using the Nyström method, SnapATAC can process data from up to a million cells. Furthermore, SnapATAC incorporates existing tools into a comprehensive package for analyzing single-cell ATAC-seq datasets. As a demonstration of its utility, SnapATAC is applied to 55,592 single-nucleus ATAC-seq profiles from the mouse secondary motor cortex. The analysis reveals ~370,000 candidate regulatory elements in 31 distinct cell populations in this brain region and infers candidate cell-type-specific transcriptional regulators.
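
The Nyström method named in the abstract approximates a large pairwise similarity matrix from a small set of landmark points, so that the full n x n matrix never has to be computed directly. The sketch below is a generic illustration, not SnapATAC's code: the Gaussian kernel, the `gamma` value, the toy points, and the chosen landmark indices are assumptions.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian similarity between two vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def nystrom_kernel(points, landmark_idx, gamma=0.5):
    """Approximate the full kernel matrix as C W^-1 C^T using landmark points."""
    landmarks = [points[i] for i in landmark_idx]
    m = len(landmarks)
    c = [[rbf(p, l, gamma) for l in landmarks] for p in points]            # n x m
    w_inv = invert([[rbf(a, b, gamma) for b in landmarks] for a in landmarks])
    cw = [[sum(row[k] * w_inv[k][j] for k in range(m)) for j in range(m)]
          for row in c]                                                     # n x m
    return [[sum(cw_i[k] * c_j[k] for k in range(m)) for c_j in c] for cw_i in cw]

# Toy data: five "cells" in two dimensions, three of them used as landmarks.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]]
landmark_idx = [0, 2, 4]
approx = nystrom_kernel(points, landmark_idx)
exact = [[rbf(p, q) for q in points] for p in points]
```

Rows and columns belonging to landmark points are reproduced exactly; the remaining entries are approximations whose quality depends on how well the landmarks cover the data.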

An analytical method for the identification of cell type-specific disease gene modules

Journal of Translational Medicine ◽ 10.1186/s12967-020-02690-5 ◽ 2021 ◽ Vol 19 (1)

Author(s): Jinting Guan ◽ Yiping Lin ◽ Yang Wang ◽ Junchao Gao ◽ Guoli Ji

Keyword(s): Disease Gene ◽ Gene Interaction ◽ Cell Types ◽ Autism Spectrum ◽ Specific Gene ◽ Cell Type ◽ Specific Disease ◽ Cell Type Specific ◽ Gene Modules ◽ Disease Associated Genes

Abstract

Background: Genome-wide association studies have identified genetic variants associated with the risk of brain-related diseases, such as neurological and psychiatric disorders, while the causal variants and the specific vulnerable cell types often remain to be determined. Many disease-associated genes are expressed in multiple cell types of the human brain, while the pathologic variants primarily affect specific cell types. We hypothesize a model in which the manifestation of a disease in a cell type is determined by the presence of a disease module composed of disease-associated genes, rather than by individual genes. Therefore, it is essential to identify the presence or absence of disease gene modules in cells.

Methods: To characterize the cell type specificity of brain-related diseases, we construct human brain cell type-specific gene interaction networks by integrating human brain nucleus gene expression data with a reference tissue-specific gene interaction network. Then, from the cell type-specific gene interaction networks, we identify significant cell type-specific disease gene modules by performing statistical tests.

Results: The constructed cell type-specific gene networks and their gene functions are distinct between neurons and glial cells. We then identify cell type-specific disease gene modules associated with autism spectrum disorder and find that different gene modules are formed and distinct gene functions may be dysregulated in different cells. We also study the similarity and dissimilarity in cell type-specific disease gene modules among autism spectrum disorder, schizophrenia and bipolar disorder. The functions of neuron-specific disease gene modules are associated with the synapse for all three diseases, while those in glial cells are different. To facilitate the use of our method, we develop an R package, CtsDGM, for the identification of cell type-specific disease gene modules.

Conclusions: The results support our hypothesis that a disease manifests itself in a cell type by forming a statistically significant disease gene module. The identification of cell type-specific disease gene modules can promote the development of more targeted biomarkers and treatments for the disease. Our method can be applied to depict the cell type heterogeneity of a given disease and to study the similarity and dissimilarity between different disorders, providing new insights into the molecular mechanisms underlying the pathogenesis and progression of diseases.
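
One way to picture the statistical test for a disease gene module is a permutation test: count the interactions among the disease genes in a cell type-specific network and ask how often a random gene set of the same size is at least as densely connected. The abstract does not say which test CtsDGM uses, so the statistic (number of internal edges), the toy network, and all names below are assumptions for illustration only.

```python
import random

def internal_edges(edges, genes):
    """Count network edges whose two endpoints both lie in the gene set."""
    gene_set = set(genes)
    return sum(1 for a, b in edges if a in gene_set and b in gene_set)

def module_p_value(edges, all_genes, disease_genes, n_perm=2000, seed=0):
    """Empirical p-value for the connectivity of a disease gene set."""
    rng = random.Random(seed)
    observed = internal_edges(edges, disease_genes)
    size = len(disease_genes)
    hits = sum(
        1 for _ in range(n_perm)
        if internal_edges(edges, rng.sample(all_genes, size)) >= observed
    )
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical network: 20 genes, a fully connected block g0..g3, then a chain.
all_genes = [f"g{i}" for i in range(20)]
edges = [(f"g{i}", f"g{j}") for i in range(4) for j in range(i + 1, 4)]
edges += [(f"g{i}", f"g{i + 1}") for i in range(3, 19)]
disease_genes = ["g0", "g1", "g2", "g3"]

observed, p_value = module_p_value(edges, all_genes, disease_genes)
print(observed, p_value)
```

Repeating such a test on the network of each cell type would show in which cell types the same disease genes form a significant module and in which they do not.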

Single-cell RNA sequencing of the mammalian pineal gland identifies two pinealocyte subtypes and cell type-specific daily patterns of gene expression

PLoS ONE ◽ 10.1371/journal.pone.0205883 ◽ 2018 ◽ Vol 13 (10) ◽ pp. e0205883 ◽ Cited By ~ 9

Author(s): Joseph C. Mays ◽ Michael C. Kelly ◽ Steven L. Coon ◽ Lynne Holtzclaw ◽ Martin F. Rath ◽ ...

Keyword(s): Gene Expression ◽ Pineal Gland ◽ Single Cell ◽ Rna Sequencing ◽ Cell Type ◽ Single Cell Rna Sequencing ◽ Cell Type Specific ◽ Mammalian Pineal Gland ◽ Daily Patterns

In vivo single-cell profiling of lncRNAs during Ebola virus infection

10.1101/2022.01.12.476002 ◽ 2022

Author(s): Luisa Santus ◽ Raquel García-Pérez ◽ Maria Sopena-Rios ◽ Aaron E Lin ◽ Gordon C Adams ◽ ...

Keyword(s): Viral Infection ◽ Single Cell ◽ Ebola Virus ◽ Cell Type ◽ Protein Coding ◽ Expression Variation ◽ Lncrna Expression ◽ Ebov Infection ◽ Cell Type Specific

Long non-coding RNAs (lncRNAs) are pivotal mediators of the systemic immune response to viral infection, yet most studies concerning their expression and functions upon immune stimulation are limited to in vitro bulk cell populations. This strongly constrains our understanding of how lncRNA expression varies at single-cell resolution, and how their cell-type-specific immune regulatory roles may differ from those of protein-coding genes. Here, we perform the first in-depth characterization of lncRNA expression variation at single-cell resolution during Ebola virus (EBOV) infection in vivo. Using bulk RNA-sequencing from 119 samples and 12 tissue types, we significantly expand the current macaque lncRNA annotation. We then profile lncRNA expression variation in single circulating immune cells during EBOV infection and find that expression in fewer cells is a major factor differentiating lncRNAs from their protein-coding gene counterparts. Upon EBOV infection, lncRNAs present dynamic and mostly cell-type-specific changes in their expression profiles, especially in monocytes, the main cell type targeted by EBOV. Such changes are associated with gene regulatory modules related to important innate immune responses such as interferon response and purine metabolism. Within infected cells, the expression of several lncRNAs is positively or negatively correlated with viral load, suggesting that expression of some of these lncRNAs might be directly hijacked by EBOV to attack host cells. This study provides novel insights into the roles that lncRNAs play in the host response to acute viral infection and paves the way for future lncRNA studies at single-cell resolution.

JIND: Joint Integration and Discrimination for Automated Single-Cell Annotation

10.1101/2020.10.06.327601 ◽ 2020

Author(s): Mohit Goyal ◽ Guillermo Serrano ◽ Ilan Shomorony ◽ Mikel Hernaez ◽ Idoia Ochoa

Keyword(s): Single Cell ◽ Cell Types ◽ Marker Genes ◽ Specific Marker ◽ Rna Seq ◽ Batch Effects ◽ Cell Type ◽ Latent Space ◽ Cell Type Specific ◽ Low Dimensional

Abstract

Single-cell RNA-seq is a powerful tool for studying the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, hence avoiding the need to retrain the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable, running more than five or six times faster than Seurat integration.

Availability: https://github.com/mohit1997/JIND.
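
The idea of cell-type-specific confidence thresholds can be sketched as follows. This is not JIND's code: the function name, the probability values, and the threshold values are hypothetical, and the sketch only applies given thresholds rather than learning them as the abstract describes.

```python
def classify_with_rejection(probabilities, thresholds, reject_label="Unassigned"):
    """Assign each cell its most probable type, or reject it if confidence is too low."""
    labels = []
    for probs in probabilities:
        best = max(probs, key=probs.get)  # most probable cell type for this cell
        labels.append(best if probs[best] >= thresholds[best] else reject_label)
    return labels

# Hypothetical classifier outputs for three cells.
cell_probs = [
    {"T cell": 0.95, "B cell": 0.03, "Monocyte": 0.02},
    {"T cell": 0.55, "B cell": 0.40, "Monocyte": 0.05},
    {"T cell": 0.10, "B cell": 0.15, "Monocyte": 0.75},
]
# Hypothetical per-type thresholds; a stricter threshold suits easily confused types.
type_thresholds = {"T cell": 0.90, "B cell": 0.80, "Monocyte": 0.70}

predicted = classify_with_rejection(cell_probs, type_thresholds)
print(predicted)
```

The second cell is rejected because its best score falls below the threshold for that type, while the third is accepted with a lower score because its type has a more lenient threshold.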

CellO: Comprehensive and hierarchical cell type classification of human cells with the Cell Ontology

10.1101/634097 ◽ 2019 ◽ Cited By ~ 1

Author(s): Matthew N. Bernstein ◽ Zhongjie Ma ◽ Michael Gleicher ◽ Colin N. Dewey

Keyword(s): Single Cell ◽ Web Application ◽ Cell Types ◽ Rna Seq ◽ Cell Type ◽ Training Set ◽ Sequence Read Archive ◽ Cell Ontology ◽ Cell Type Specific ◽ Type Classification

Summary

Cell type annotation is a fundamental task in the analysis of single-cell RNA-sequencing data. In this work, we present CellO, a machine learning-based tool for annotating human RNA-seq data with the Cell Ontology. CellO enables accurate and standardized cell type classification by considering the rich hierarchical structure of known cell types, a source of prior knowledge that is not utilized by existing methods. Furthermore, CellO comes pre-trained on a novel, comprehensive dataset of human, healthy, untreated primary samples in the Sequence Read Archive, which, to the best of our knowledge, is the most diverse curated collection of primary cell data to date. CellO’s comprehensive training set enables it to run out-of-the-box on diverse cell types and to achieve superior or competitive performance when compared to existing state-of-the-art methods. Lastly, CellO’s linear models are easily interpreted, thereby enabling exploration of cell type-specific expression signatures across the ontology. To this end, we also present the CellO Viewer: a web application for exploring CellO’s models across the ontology.

Highlights

We present CellO, a tool for hierarchically classifying cell type from single-cell RNA-seq data against the graph-structured Cell Ontology.

CellO is pre-trained on a comprehensive dataset comprising nearly all bulk RNA-seq primary cell samples in the Sequence Read Archive.

CellO achieves superior or comparable performance with existing methods while featuring a more comprehensive pre-packaged training set.

CellO is built with easily interpretable models, which we expose through a novel web application, the CellO Viewer, for exploring cell type-specific signatures across the Cell Ontology.
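
Classifying against a graph-structured ontology implies that a specific label also carries all of its more general ancestors. The sketch below shows only that consistency step on a tiny made-up hierarchy; it is not CellO's classifier, and the term names and parent links are simplified assumptions rather than actual Cell Ontology content.

```python
def ancestors(term, parents):
    """Return every ancestor of `term` in a DAG given as child -> list of parents."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen

def hierarchical_labels(term, parents):
    """Expand one specific label into the full set of labels it implies."""
    return {term} | ancestors(term, parents)

# Hypothetical, heavily simplified ontology fragment (child -> parents).
toy_ontology = {
    "CD4-positive T cell": ["T cell"],
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
    "lymphocyte": ["leukocyte"],
    "monocyte": ["leukocyte"],
    "leukocyte": ["cell"],
}

labels = hierarchical_labels("CD4-positive T cell", toy_ontology)
print(sorted(labels))
```

A hierarchical classifier can fall back to a general term such as "lymphocyte" when the evidence does not support a more specific one, which a flat classifier cannot express.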

ICTD: A semi-supervised cell type identification and deconvolution method for multi-omics data

10.1101/426593 ◽ 2018 ◽ Cited By ~ 2

Author(s): Wennan Chang ◽ Changlin Wan ◽ Xiaoyu Lu ◽ Szu-wei Tu ◽ Yifan Sun ◽ ...

Keyword(s): Single Cell ◽ Cell Types ◽ Training Data ◽ Marker Genes ◽ Cell Detection ◽ Omics Data ◽ Deconvolution Method ◽ Cell Type ◽ Data Set ◽ Cell Type Specific

Abstract

We developed a novel deconvolution method, Inference of Cell Types and Deconvolution (ICTD), that addresses the fundamental issues of identifiability and robustness in current tissue data deconvolution. ICTD provides substantially new capabilities for omics-data-based characterization of a tissue microenvironment, including (1) maximizing the resolution in identifying the resident cell types and subtypes that truly exist in a tissue, (2) identifying the most reliable marker genes for each cell type, which are tissue and data set specific, (3) handling the stability problem with co-linear cell types, (4) co-deconvoluting with available matched multi-omics data, and (5) inferring functional variations specific to one or several cell types. ICTD is empowered by (i) rigorously derived mathematical conditions for identifiable cell types and cell type-specific functions in tissue transcriptomics data, (ii) a semi-supervised approach that maximizes the transfer of knowledge about cell type and functional marker genes identified in single-cell or bulk cell data to the analysis of tissue data, and (iii) a novel unsupervised approach that minimizes the bias introduced by training data. Application of ICTD to real and single-cell simulated tissue data validated that the method performs consistently well on tissue data from different species, tissue microenvironments, and experimental platforms. Beyond the new capabilities, ICTD outperformed other state-of-the-art deconvolution methods in prediction accuracy, the resolution of identifiable cell types, detection of unknown cell subtypes, and assessment of cell type-specific functions. The promise of ICTD also lies in characterizing cell-cell interactions and discovering cell types and prognostic markers that are predictive of clinical outcomes.
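
For readers unfamiliar with deconvolution, the underlying problem can be sketched as estimating non-negative cell type proportions so that a mixture of per-type signature profiles matches the bulk tissue profile. The code below is a generic signature-based baseline solved by projected gradient descent, not the ICTD algorithm; the signature values, step size, and true proportions are invented for illustration.

```python
def estimate_proportions(signatures, bulk, step=0.005, iterations=500):
    """Non-negative least squares via projected gradient descent.

    signatures[g][t] is the expression of gene g in cell type t;
    bulk[g] is the observed expression of gene g in the mixed tissue.
    The step size must be small relative to the scale of the signatures.
    """
    n_genes, n_types = len(signatures), len(signatures[0])
    props = [1.0 / n_types] * n_types
    for _ in range(iterations):
        residual = [
            sum(signatures[g][t] * props[t] for t in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        gradient = [
            sum(signatures[g][t] * residual[g] for g in range(n_genes))
            for t in range(n_types)
        ]
        # Gradient step followed by projection onto non-negative values.
        props = [max(0.0, p - step * d) for p, d in zip(props, gradient)]
    return props

# Hypothetical marker-gene signatures for two cell types (rows are genes).
signatures = [[10.0, 1.0], [8.0, 0.0], [1.0, 9.0], [0.0, 7.0]]
true_props = [0.7, 0.3]
bulk = [sum(s * p for s, p in zip(row, true_props)) for row in signatures]

estimated = estimate_proportions(signatures, bulk)
print([round(p, 3) for p in estimated])
```

The toy signatures are nearly independent, which makes the problem easy. When two cell types have almost identical signatures, the estimates become unstable, which is the co-linearity issue the abstract lists among the problems ICTD aims to handle.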

Capturing cell type-specific chromatin structural patterns by applying topic modeling to single-cell Hi-C data

10.1101/534800 ◽ 2019 ◽ Cited By ~ 2

Author(s): Hyeon-Jin Kim ◽ Galip Gürkan Yardımcı ◽ Giancarlo Bonora ◽ Vijay Ramani ◽ Jie Liu ◽ ...

Keyword(s): Single Cell ◽ Topic Modeling ◽ Biological Information ◽ Chromatin Interaction ◽ Cell Type ◽ 3D Genome ◽ Genome Wide ◽ Significant Barrier ◽ Chromatin Structural ◽ Cell Type Specific

Abstract

Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interactions in individual cells, allowing us to gain insights into 3D genome organization. However, the extremely sparse nature of scHi-C data poses a significant barrier to analysis, limiting our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well suited for discovering latent topics in a collection of discrete data. For our analysis, we generate twelve different single-cell combinatorial indexed Hi-C (sciHi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), comprising over 25,000 cells. We demonstrate that topic modeling successfully captures cell type differences from sciHi-C data in the form of “chromatin topics.” We further show enrichment of particular compartment structures associated with locus pairs in these topics.
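
To apply topic modeling, each cell has to be expressed as a collection of discrete items, in the same way a document is a collection of words. A natural reading of the abstract is that each cell is a "document" and each contacting locus pair is a "word". The sketch below builds such a cell by locus-pair count matrix; the bin size, the token format, and the toy contacts are assumptions, and the topic model itself is not included.

```python
from collections import Counter

def locus_pair_counts(contacts, bin_size=500_000):
    """Turn one cell's contacts (chrom, pos_a, pos_b) into binned locus-pair counts."""
    counts = Counter()
    for chrom, pos_a, pos_b in contacts:
        lo, hi = sorted((pos_a // bin_size, pos_b // bin_size))
        counts[f"{chrom}:{lo}-{hi}"] += 1  # order-independent token for the pair
    return counts

def build_matrix(cells, bin_size=500_000):
    """Return the locus-pair vocabulary and one count vector per cell."""
    per_cell = {name: locus_pair_counts(c, bin_size) for name, c in cells.items()}
    vocabulary = sorted({tok for counts in per_cell.values() for tok in counts})
    matrix = {
        name: [counts.get(tok, 0) for tok in vocabulary]
        for name, counts in per_cell.items()
    }
    return vocabulary, matrix

# Hypothetical intra-chromosomal contacts for two cells.
cells = {
    "cell_1": [("chr1", 100_000, 1_200_000),
               ("chr1", 1_300_000, 400_000),
               ("chr2", 50_000, 60_000)],
    "cell_2": [("chr2", 10_000, 20_000)],
}

vocabulary, matrix = build_matrix(cells)
print(vocabulary)
print(matrix)
```

A matrix of this shape is the standard input for topic models such as latent Dirichlet allocation, with each inferred topic corresponding to a weighted set of locus pairs.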
