Determining cell fate specification and genetic contribution to cardiac disease risk in hiPSC-derived cardiomyocytes at single cell resolution

Abstract The majority of genetic loci underlying common disease risk act by changing genome regulation and are routinely linked to expression quantitative trait loci, where gene expression is measured in bulk populations of mature cells. A crucial missing step is evidence of variation in the expression of these genes as cells progress from a pluripotent to a mature state. This is especially important for cardiovascular disease, as the majority of cardiac cells have limited capacity for renewal after the neonatal period. To investigate the dynamic changes in gene expression across the cardiac lineage, we generated RNA-sequencing data captured from 43,168 single cells progressing through in vitro cardiac-directed differentiation from pluripotency. We developed a novel and generalized unsupervised cell clustering approach and a machine learning method for prediction of cell transition. Using these methods, we were able to reconstruct the cell fate choices as cells transition from a pluripotent state to mature cardiomyocytes, uncovering intermediate cell populations that do not progress to maturity, and distinct cell trajectories that terminate in cardiomyocytes that differ in their contractile forces. We also identify new gene markers that denote lineage specification and demonstrate a substantial increase in their utility for cell identification over current pluripotent and cardiogenic markers. By integrating results from analysis of the single cell lineage RNA-sequence data with population-based GWAS of cardiovascular disease and cardiac tissue eQTLs, we show that the pathogenicity of disease-associated genes is highly dynamic as cells transition across their developmental lineage, and varies between cell fate trajectories.
Through the integration of single cell RNA-sequence data with population-scale genetic data, we have identified genes significantly altered at cell specification events, providing insights into a context-dependent role in cardiovascular disease risk. This study provides a valuable data resource focused on in vitro cardiomyocyte differentiation to understand cardiac disease, coupled with new analytical methods with broad applications to single-cell data.

SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization

PeerJ ◽ 10.7717/peerj.12087 ◽ 2021 ◽ Vol 9 ◽ pp. e12087

Author(s): Mikio Shiga ◽ Shigeto Seno ◽ Makoto Onizuka ◽ Hideo Matsuda

Keyword(s): Gene Expression ◽ Single Cell ◽ Matrix Factorization ◽ Sequence Data ◽ Expression Profiles ◽ Gene Expression Profiles ◽ Multiple Gene ◽ RNA Sequence ◽ Cell Clustering ◽ Non Negative Matrix Factorization

Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline with two main parts: the quantification part, which converts the sequence information into gene-cell matrix data, and the analysis part, which analyzes the matrix data using statistics and/or machine learning techniques. In the analysis part, unsupervised cell clustering plays an important role in identifying cell types and discovering cell diversity and subpopulations. Identified cell clusters are also used for subsequent analysis, such as finding differentially expressed genes and inferring cell trajectories. However, clustering results are greatly affected by the quantification method used upstream: even if the original RNA-sequence data is the same, gene expression profiles processed by different quantification methods will produce different clusters. In this article, we propose a robust and highly accurate clustering method based on joint non-negative matrix factorization (joint-NMF) that uses multiple gene expression profiles quantified by different methods from the same RNA-sequence data. Our joint-NMF extracts common factors among the profiles by factorizing each one with NMF under the constraint that one of the factorized matrices is shared among all the NMFs. By leveraging multiple quantification methods, the joint-NMF yields more robust and accurate cell clustering than conventional clustering methods, which use only a single gene expression profile. Additionally, we show that the features extracted by our method are useful for discovering marker genes.
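
The shared-factor constraint described in this abstract can be illustrated with a minimal sketch. It assumes that the shared matrix is the cell-side factor `H`, that the objective is the plain sum of squared reconstruction errors, and that standard multiplicative updates are used; the abstract does not state the authors' exact objective, regularization, or update rules, and the function name `joint_nmf` and the toy data are hypothetical.

```python
import numpy as np

def joint_nmf(matrices, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorize each X_k (genes x cells) as W_k @ H with one shared H (rank x cells).

    Assumption: minimizes sum_k ||X_k - W_k H||_F^2 with multiplicative updates.
    All matrices must be non-negative and have the same number of columns (cells).
    """
    rng = np.random.default_rng(seed)
    n_cells = matrices[0].shape[1]
    Ws = [rng.random((X.shape[0], rank)) for X in matrices]
    H = rng.random((rank, n_cells))
    losses = []
    for _ in range(n_iter):
        # Update each profile-specific gene factor W_k with H held fixed.
        for k, X in enumerate(matrices):
            Ws[k] *= (X @ H.T) / (Ws[k] @ (H @ H.T) + eps)
        # Update the shared cell factor H using every profile at once.
        numer = sum(W.T @ X for W, X in zip(Ws, matrices))
        denom = sum(W.T @ W for W in Ws) @ H + eps
        H *= numer / denom
        losses.append(sum(np.linalg.norm(X - W @ H) ** 2
                          for W, X in zip(Ws, matrices)))
    return Ws, H, losses

# Toy data: two "quantifications" of the same 60 cells with different gene sets.
rng = np.random.default_rng(1)
H_true = np.repeat(np.eye(3), 20, axis=1)            # 3 groups of 20 cells
X_a = rng.random((50, 3)) @ H_true + 0.01 * rng.random((50, 60))
X_b = rng.random((40, 3)) @ H_true + 0.01 * rng.random((40, 60))
Ws, H, losses = joint_nmf([X_a, X_b], rank=3)
labels = H.argmax(axis=0)                            # one cluster label per cell
```

Taking the arg-max over the columns of `H` is only one simple way to turn the shared factor into cluster labels; running a separate clustering step on `H` would be an equally reasonable choice.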

DNA methylation and gene expression integration in cardiovascular disease

Clinical Epigenetics ◽ 10.1186/s13148-021-01064-y ◽ 2021 ◽ Vol 13 (1)

Author(s): Guillermo Palou-Márquez ◽ Isaac Subirana ◽ Lara Nonell ◽ Alba Fernández-Sanlés ◽ Roberto Elosua

Keyword(s): Gene Expression ◽ Cardiovascular Disease ◽ DNA Methylation ◽ Cardiovascular Diseases ◽ Disease Risk ◽ Cardiovascular Disease Risk ◽ Risk Function ◽ Predictive Biomarkers ◽ Independent Study ◽ Cell Type

Abstract

Background: The integration of different layers of omics information is an opportunity to tackle the complexity of cardiovascular diseases (CVD) and to identify new predictive biomarkers and potential therapeutic targets. Our aim was to integrate DNA methylation and gene expression data in an effort to identify biomarkers related to cardiovascular disease risk in a community-based population. We accessed data from the Framingham Offspring Study, a cohort study with data on DNA methylation (Infinium HumanMethylation450 BeadChip; Illumina) and gene expression (Human Exon 1.0 ST Array; Affymetrix). Using the MOFA2 R package, we integrated these data to identify biomarkers related to the risk of presenting a cardiovascular event.

Results: Four independent latent factors driven by DNA methylation (9, 19, 21 and 27, with factor 21 found only in women) were associated with cardiovascular disease independently of classical risk factors and cell-type counts. In a sensitivity analysis, we also identified factor 21 as associated with CVD in women. Factors 9, 21 and 27 were also associated with coronary heart disease risk. Moreover, in a replication effort in an independent study, three of the genes included in factor 27 (CDC42BPB, MAN2A2 and RPTOR) were also present in a factor identified as associated with myocardial infarction. Factor 9 was related to age and cell-type proportions; factor 19 was related to age and B-cell count; factor 21 pointed to human immunodeficiency virus infection-related pathways and inflammation; and factor 27 was related to lifestyle factors such as alcohol consumption, smoking and body mass index. Inclusion of factor 21 (only in women) improved the discriminative and reclassification capacity of the Framingham classical risk function, and factor 27 improved its discrimination.

Conclusions: Unsupervised multi-omics data integration methods have the potential to provide insights into the pathogenesis of cardiovascular diseases. We identified four independent factors (one only in women) pointing to inflammation, endothelium homeostasis, visceral fat, cardiac remodeling and lifestyles as key players in the determination of cardiovascular risk. Moreover, two of these factors improved the predictive capacity of a classical risk function.

Resolution of Cell Fate Decisions Revealed by Single-Cell Gene Expression Analysis from Zygote to Blastocyst

Developmental Cell ◽ 10.1016/j.devcel.2010.02.012 ◽ 2010 ◽ Vol 18 (4) ◽ pp. 675-685 ◽ Cited By ~ 568

Author(s): Guoji Guo ◽ Mikael Huss ◽ Guo Qing Tong ◽ Chaoyang Wang ◽ Li Li Sun ◽ ...

Keyword(s): Gene Expression ◽ Single Cell ◽ Cell Fate ◽ Expression Analysis ◽ Gene Expression Analysis ◽ Cell Fate Decisions ◽ Cell Gene Expression ◽ Cell Gene

Transcriptome analysis of Schistosoma mansoni larval development using serial analysis of gene expression (SAGE)

Parasitology ◽ 10.1017/s0031182009005733 ◽ 2009 ◽ Vol 136 (5) ◽ pp. 469-485 ◽ Cited By ~ 22

Author(s): A. S. TAFT ◽ J. J. VERMEIRE ◽ J. BERNIER ◽ S. R. BIRKELAND ◽ M. J. CIPRIANO ◽ ...

Keyword(s): Gene Expression ◽ Sequence Data ◽ Subsequent Development ◽ Differentially Expressed ◽ cDNA Libraries ◽ Genome Wide ◽ A Genome ◽ Genome Wide Expression ◽ Cell Conditioned Medium

SUMMARY Infection of the snail, Biomphalaria glabrata, by the free-swimming miracidial stage of the human blood fluke, Schistosoma mansoni, and its subsequent development to the parasitic sporocyst stage is critical to establishment of viable infections and continued human transmission. We performed a genome-wide expression analysis of the S. mansoni miracidia and developing sporocyst using Long Serial Analysis of Gene Expression (LongSAGE). Five cDNA libraries were constructed from miracidia and in vitro cultured 6- and 20-day-old sporocysts maintained in sporocyst medium (SM) or in SM conditioned by previous cultivation with cells of the B. glabrata embryonic (Bge) cell line. We generated 21 440 SAGE tags and mapped 13 381 to the S. mansoni gene predictions (v4.0e) either by estimating theoretical 3′ UTR lengths or using existing 3′ EST sequence data. Overall, 432 transcripts were found to be differentially expressed amongst all 5 libraries. In total, 172 tags were differentially expressed between miracidia and 6-day conditioned sporocysts and 152 were differentially expressed between miracidia and 6-day unconditioned sporocysts. In addition, 53 and 45 tags, respectively, were differentially expressed in 6-day and 20-day cultured sporocysts, due to the effects of exposure to Bge cell-conditioned medium.

Joint profiling of gene expression and chromatin accessibility of amphioxus development at single cell resolution

10.21203/rs.3.rs-504113/v1 ◽ 2021

Author(s): Pengcheng Ma ◽ Xingyan Liu ◽ Huimin Liu ◽ Zaoxu Xu ◽ Xiangning Ding ◽ ...

Keyword(s): Gene Expression ◽ Single Cell ◽ Cell Fate ◽ Regulatory Networks ◽ Functional Divergence ◽ Expression Patterns ◽ Genetic Regulatory Networks ◽ Public Access ◽ Origin And Evolution ◽ Key Species

Abstract Vertebrate evolution was accompanied by two rounds of whole genome duplication followed by functional divergence in terms of regulatory circuits and gene expression patterns. As a basal and slow-evolving chordate species, amphioxus is an ideal model for exploring the origin and evolution of vertebrates. Single cell sequencing has been widely employed to construct the developmental cell atlas of several key vertebrate species (human, mouse, zebrafish and frog) and of a tunicate (sea squirts). Here, we performed single-nucleus RNA sequencing (snRNA-seq) and single-cell assay for transposase accessible chromatin sequencing (scATAC-seq) for different stages of amphioxus (covering embryogenesis and adult tissues). With the datasets generated, we constructed the developmental tree for amphioxus cell fate commitment and lineage specification, and revealed the underlying key regulators and genetic regulatory networks. The generated data were integrated into an online platform, AmphioxusAtlas, for public access at http://120.79.46.200:81/AmphioxusAtlas.

Single cell RNA sequencing of calvarial and long bone endocortical cells

10.1101/849224 ◽ 2019 ◽ Cited By ~ 1

Author(s): Ugur M. Ayturk ◽ Joseph P. Scollan ◽ Alexander Vesprey ◽ Christina M. Jacobsen ◽ Paola Divieti Pajevic ◽ ...

Keyword(s): Gene Expression ◽ Single Cell ◽ Cultured Cells ◽ Neutralizing Antibody ◽ Long Bone ◽ Appendicular Skeleton ◽ RNA Seq ◽ Isolated Cells

ABSTRACT Single cell RNA-seq (scRNA-seq) is emerging as a powerful technology to examine transcriptomes of individual cells. We determined whether scRNA-seq could be used to detect the effect of environmental and pharmacologic perturbations on osteoblasts. We began with a commonly used in vitro system in which freshly isolated neonatal mouse calvarial cells are expanded and induced to produce a mineralized matrix. We used scRNA-seq to compare the relative cell type abundances and the transcriptomes of freshly isolated cells to those that had been cultured for 12 days in vitro. We observed that the percentage of macrophage-like cells increased from 6% in freshly isolated calvarial cells to 34% in cultured cells. We also found that Bglap transcripts were abundant in freshly isolated osteoblasts but nearly undetectable in the cultured calvarial cells. Thus, scRNA-seq revealed significant differences between the heterogeneity of cells in vivo and in vitro. We next performed scRNA-seq on freshly recovered long bone endocortical cells from mice that received either vehicle or Sclerostin-neutralizing antibody for 1 week. Bone anabolism-associated transcripts were not significantly increased in immature and mature osteoblasts recovered from Sclerostin-neutralizing antibody treated mice; this is likely a consequence of being underpowered to detect modest changes in gene expression, since only 7% of the sequenced endocortical cells were osteoblasts and only a limited portion of their transcriptomes was sampled. We conclude that scRNA-seq can detect changes in cell abundance, identity, and gene expression in skeletally derived cells. To detect modest changes in osteoblast gene expression at the single cell level in the appendicular skeleton, larger numbers of osteoblasts from endocortical bone are required.

Generative modeling of single-cell population time series for inferring cell differentiation landscapes

10.1101/2020.08.26.269332 ◽ 2020

Author(s): Grace H.T. Yeo ◽ Sachit D. Saksena ◽ David K. Gifford

Keyword(s): Gene Expression ◽ Time Series ◽ Cell Differentiation ◽ Single Cell ◽ Cell Fate ◽ Lineage Tracing ◽ Model Framework ◽ Modeling Framework ◽ Generative Modeling ◽ Model Complex

Summary Existing computational methods that use single-cell RNA-sequencing for cell fate prediction either summarize observations of cell states and their couplings without modeling the underlying differentiation process, or are limited in their capacity to model complex differentiation landscapes. Thus, contemporary methods cannot predict how cells evolve stochastically and in physical time from an arbitrary starting expression state, nor can they model the cell fate consequences of gene expression perturbations. We introduce PRESCIENT (Potential eneRgy undErlying Single Cell gradIENTs), a generative modeling framework that learns an underlying differentiation landscape from single-cell time-series gene expression data. Our generative model framework provides insight into the process of differentiation and can simulate differentiation trajectories for arbitrary gene expression progenitor states. We validate our method on a recently published experimental lineage tracing dataset that provides observed trajectories. We show that this model is able to predict the fate biases of progenitor cells in neutrophil/macrophage lineages when accounting for cell proliferation, improving upon the best-performing existing method. We also show how a model can predict trajectories for cells not found in the model’s training set, including cells in which genes or sets of genes have been perturbed. PRESCIENT is able to accommodate complex perturbations of multiple genes, at different time points and from different starting cell populations. PRESCIENT models are able to recover the expected effects of known modulators of cell fate in hematopoiesis and pancreatic β cell differentiation.
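
To make the idea of cells evolving stochastically on a potential landscape concrete, the sketch below simulates noisy gradient descent on a fixed one-dimensional double-well potential. This is an illustration only and is not the PRESCIENT model: the abstract does not specify how the landscape is parameterized or fitted, so the potential, the Euler-Maruyama discretization, and the names `grad_potential` and `simulate` are all assumptions made for this example.

```python
import numpy as np

def grad_potential(x):
    """Gradient of the toy double-well potential psi(x) = (x**2 - 1)**2.

    The two minima at x = -1 and x = +1 stand in for two hypothetical cell fates.
    """
    return 4.0 * x * (x ** 2 - 1.0)

def simulate(x0, n_steps=500, dt=0.01, sigma=0.5, seed=0):
    """Euler-Maruyama integration of dx = -grad psi(x) dt + sigma dW."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        # Drift moves each cell downhill; the noise term makes fates stochastic.
        x = x - grad_potential(x) * dt + sigma * np.sqrt(dt) * noise
    return x

# 200 hypothetical progenitor cells starting on the barrier between the wells.
final_states = simulate(np.zeros(200))
fate_bias = (final_states > 0).mean()   # fraction ending in the right-hand well
```

Shifting the starting state or adding a constant to the drift is a crude analogue of the perturbation experiments the abstract describes: the fraction of simulated cells reaching each well changes accordingly.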

Molecular hallmarks of heterochronic parabiosis at single cell resolution

10.1101/2020.11.06.367078 ◽ 2020

Author(s): Róbert Pálovics ◽ Andreas Keller ◽ Nicholas Schaum ◽ Weilun Tan ◽ Tobias Fehlmann ◽ ...

Keyword(s): Gene Expression ◽ Single Cell ◽ Disease Risk ◽ Global Gene Expression ◽ Cell Types ◽ The Body ◽ Hematopoietic Stem ◽ Gene Sets ◽ Genes Encoding ◽ Systemic Understanding

Slowing or reversing biological ageing would have major implications for mitigating disease risk and maintaining vitality. While an increasing number of interventions show promise for rejuvenation, the effectiveness on disparate cell types across the body and the molecular pathways susceptible to rejuvenation remain largely unexplored. We performed single-cell RNA-sequencing on 13 organs to reveal cell type specific responses to young or aged blood in heterochronic parabiosis. Adipose mesenchymal stromal cells, hematopoietic stem cells, hepatocytes, and endothelial cells from multiple tissues appear especially responsive. On the pathway level, young blood invokes novel gene sets in addition to reversing established ageing patterns, with the global rescue of genes encoding electron transport chain subunits pinpointing a prominent role of mitochondrial function in parabiosis-mediated rejuvenation. Intriguingly, we observed an almost universal loss of gene expression with age that is largely mimicked by parabiosis: aged blood reduces global gene expression, and young blood restores it. Altogether, these data lay the groundwork for a systemic understanding of the interplay between blood-borne factors and cellular integrity.

Enhancing droplet-based single-nucleus RNA-seq resolution using the semi-supervised machine learning classifier DIEM

10.1101/786285 ◽ 2019 ◽ Cited By ~ 4

Author(s): Marcus Alvarez ◽ Elior Rahmani ◽ Brandon Jew ◽ Kristina M. Garske ◽ Zong Miao ◽ ...

Keyword(s): Gene Expression ◽ Single Cell ◽ Cell Types ◽ Supervised Machine Learning ◽ Data Sets ◽ RNA Seq ◽ Novel Approach ◽ Single Nucleus ◽ Downstream Analysis

Abstract Single-nucleus RNA sequencing (snRNA-seq) measures gene expression in individual nuclei instead of cells, allowing for unbiased cell type characterization in solid tissues. In contrast to single-cell RNA-seq (scRNA-seq), we observe that snRNA-seq is commonly subject to contamination by high amounts of extranuclear background RNA, which can lead to identification of spurious cell types in downstream clustering analyses if overlooked. We present a novel approach to remove debris-contaminated droplets in snRNA-seq experiments, called Debris Identification using Expectation Maximization (DIEM). Our likelihood-based approach models the gene expression distributions of debris and cell types, which are estimated using EM. We evaluated DIEM using three snRNA-seq data sets: 1) human differentiating preadipocytes in vitro, 2) fresh mouse brain tissue, and 3) human frozen adipose tissue (AT) from six individuals. All three data sets showed various degrees of extranuclear RNA contamination. We observed that existing methods failed to account for contaminated droplets and led to spurious cell types. When compared to filtering using these state-of-the-art methods, DIEM better removed droplets containing high levels of extranuclear RNA and led to higher quality clusters. Although DIEM was designed for snRNA-seq data, we also successfully applied DIEM to single-cell data. To conclude, our novel method DIEM removes debris-contaminated droplets from single-cell-based data quickly and effectively, leading to cleaner downstream analysis. Our code is freely available for use at https://github.com/marcalva/diem.
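
The likelihood-based idea in this abstract can be sketched as EM on a mixture of multinomial expression profiles, with one component standing in for debris and one for nuclei. This is a generic textbook mixture model written for illustration, not the DIEM implementation: the abstract does not give the distribution, the initialization, or the number of components, and the function name `em_multinomial_mixture` and the simulated counts are hypothetical.

```python
import numpy as np

def em_multinomial_mixture(counts, n_components=2, n_iter=50, seed=0, pseudo=1e-3):
    """Fit a mixture of multinomials to a droplets x genes count matrix with EM.

    Returns per-droplet responsibilities, per-component gene profiles, and
    mixture weights. Assumption: random soft initialization of responsibilities.
    """
    rng = np.random.default_rng(seed)
    n_droplets = counts.shape[0]
    resp = rng.dirichlet(np.ones(n_components), size=n_droplets)
    for _ in range(n_iter):
        # M-step: re-estimate mixture weights and gene-frequency profiles.
        weights = resp.mean(axis=0)
        profiles = resp.T @ counts + pseudo
        profiles /= profiles.sum(axis=1, keepdims=True)
        # E-step: posterior membership of each droplet, computed in log space.
        # The multinomial coefficient is the same for every component, so it cancels.
        log_post = counts @ np.log(profiles).T + np.log(weights + 1e-12)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp, profiles, weights

# Simulated droplets: 40 low-count "debris" and 40 higher-count "nuclei".
rng = np.random.default_rng(1)
n_genes = 30
debris_profile = rng.dirichlet(np.ones(n_genes))
nucleus_profile = rng.dirichlet(np.ones(n_genes))
counts = np.vstack([
    rng.multinomial(100, debris_profile, size=40),
    rng.multinomial(1000, nucleus_profile, size=40),
])
resp, profiles, weights = em_multinomial_mixture(counts)
labels = resp.argmax(axis=1)   # hard assignment of each droplet to a component
```

In a filtering workflow along these lines, droplets whose responsibility for the debris-like component exceeds a chosen threshold would be dropped before clustering; which fitted component corresponds to debris has to be decided separately, for example from total counts.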

ZipSeq: Barcoding for Real-time Mapping of Single Cell Transcriptomes

10.1101/2020.02.04.932988 ◽ 2020 ◽ Cited By ~ 2

Author(s): Kenneth H. Hu ◽ John P. Eichorst ◽ Chris S. McGinnis ◽ David M. Patterson ◽ Eric D. Chow ◽ ...

Keyword(s): Gene Expression ◽ Single Cell ◽ Real Time ◽ Dimensional Space ◽ Single Cells ◽ Expression Patterns ◽ Live Cells ◽ Glass Substrates ◽ Complete Mapping

ABSTRACT Spatial transcriptomics seeks to integrate single-cell transcriptomic data within the 3-dimensional space of multicellular biology. Current methods use glass substrates pre-seeded with matrices of barcodes or fluorescence hybridization of a limited number of probes. We developed an alternative approach, called ‘ZipSeq’, that uses patterned illumination and photocaged oligonucleotides to serially print barcodes (Zipcodes) onto live cells within intact tissues, in real time and with on-the-fly selection of patterns. Using ZipSeq, we mapped gene expression in three settings: in vitro wound healing, live lymph node sections and a live tumor microenvironment (TME). In all cases, we discovered new gene expression patterns associated with histological structures. In the TME, this demonstrated a trajectory of myeloid and T cell differentiation, from periphery inward. A variation of ZipSeq efficiently scales to the level of single cells, providing a pathway for complete mapping of live tissues, subsequent to real-time imaging or perturbation.
