scEpath: energy landscape-based inference of transition probabilities and cellular trajectories from single-cell transcriptomic data

2018 ◽  
Vol 34 (12) ◽  
pp. 2077-2086 ◽  
Author(s):  
Suoqin Jin ◽  
Adam L MacLean ◽  
Tao Peng ◽  
Qing Nie

Abstract. Motivation: Single-cell RNA-sequencing (scRNA-seq) offers unprecedented resolution for studying cellular decision-making processes. Robust inference of cell state transition paths and probabilities is an important yet challenging step in the analysis of these data. Results: Here we present scEpath, an algorithm that calculates energy landscapes and probabilistic directed graphs in order to reconstruct developmental trajectories. We quantify the energy landscape using ‘single-cell energy’ and distance-based measures, and find that the combination of these enables robust inference of the transition probabilities and lineage relationships between cell states. We also identify marker genes and gene expression patterns associated with cell state transitions. Our approach produces pseudotemporal orderings that are, in combination, more robust and accurate than current methods, and offers higher-resolution dynamics of the cell state transitions, leading to new insight into key transition events during differentiation and development. Moreover, scEpath is robust to variation in the size of the input gene set, and is broadly unsupervised, requiring few parameters to be set by the user. Applications of scEpath led to the identification of a cell-cell communication network implicated in early human embryo development, and novel transcription factors important for myoblast differentiation. scEpath allows us to identify common and specific temporal dynamics and transcription factor programs along branched lineages, as well as the transition probabilities that control cell fates. Availability and implementation: A MATLAB package of scEpath is available at https://github.com/sqjin/scEpath. Supplementary information: Supplementary data are available at Bioinformatics online.

Author(s):  
Kevin Y. Huang ◽  
Enrico Petretto

Single-cell transcriptomics analyses of the fibrotic lung uncovered two cell states critical to lung injury recovery in the alveolar epithelium: a reparative transitional cell state in the mouse and a disease-specific cell state (KRT5-/KRT17+) in human idiopathic pulmonary fibrosis (IPF). The murine transitional cell state lies along the differentiation path from type 2 (AT2) to type 1 (AT1) pneumocytes, and the human KRT5-/KRT17+ cell state may arise from the dysregulation of this differentiation process. We review major findings of single-cell transcriptomics analyses of the fibrotic lung and re-analyze data from 7 single-cell RNA sequencing studies of human and murine models of IPF, focusing on the alveolar epithelium. Our comparative and cross-species single-cell transcriptomics analyses allowed us to further delineate the differentiation trajectories from AT2 to AT1 and from AT2 to the KRT5-/KRT17+ cell state. We observed that AT1 cells in human IPF retain the transcriptional signature of the murine transitional cell state. Using pseudotime analysis, we recapitulated the differentiation trajectories from AT2 to AT1 and from AT2 to the KRT5-/KRT17+ cell state in multiple human IPF studies. We further delineated transcriptional programs underlying cell state transitions and determined the molecular phenotypes at terminal differentiation. We hypothesize that, in addition to the reactivation of developmental programs (SOX4, SOX9), senescence (TP63, SOX4) and the Notch pathway (HES1) steer intermediate progenitors to the KRT5-/KRT17+ cell state. Our analyses suggest that activation of SMAD3 later in the differentiation process may explain the fibrotic molecular phenotype typical of KRT5-/KRT17+ cells.


Author(s):  
Yixuan Qiu ◽  
Jiebiao Wang ◽  
Jing Lei ◽  
Kathryn Roeder

Abstract. Motivation: Marker genes, defined as genes that are expressed primarily in a single cell type, can be identified from the single cell transcriptome; however, such data are not always available for the many uses of marker genes, such as deconvolution of bulk tissue. Marker genes for a cell type, however, are highly correlated in bulk data, because their expression levels depend primarily on the proportion of that cell type in the samples. Therefore, when many tissue samples are analyzed, it is possible to identify these marker genes from the correlation pattern. Results: To capitalize on this pattern, we develop a new algorithm to detect marker genes by combining published information about likely marker genes with bulk transcriptome data in the form of a semi-supervised algorithm. The algorithm then exploits the correlation structure of the bulk data to refine the published marker genes by adding or removing genes from the list. Availability and implementation: We implement this method as an R package markerpen, hosted on CRAN (https://CRAN.R-project.org/package=markerpen). Supplementary information: Supplementary data are available at Bioinformatics online.
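The correlation argument in this abstract can be illustrated with a small simulation. The sketch below is not the markerpen algorithm; it only shows, on synthetic data, why genes whose expression tracks one cell type's proportion end up strongly correlated across bulk samples. All names, scales, and noise levels are hypothetical choices made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 200

# Hypothetical proportion of one cell type in each bulk sample.
proportion = rng.uniform(0.1, 0.9, size=n_samples)

# Three hypothetical marker genes: bulk expression scales with the proportion,
# plus a small amount of measurement noise.
scales = np.array([1.0, 2.0, 0.5])
markers = proportion[:, None] * scales + rng.normal(0.0, 0.05, size=(n_samples, 3))

# Three unrelated genes whose expression does not depend on the proportion.
others = rng.normal(1.0, 0.2, size=(n_samples, 3))

# Gene-by-gene correlation matrix across samples (columns are genes).
corr = np.corrcoef(np.hstack([markers, others]), rowvar=False)

# Average pairwise correlation among the marker genes.
mean_marker_corr = corr[:3, :3][np.triu_indices(3, k=1)].mean()
# Largest absolute correlation between a marker gene and an unrelated gene.
max_cross_corr = np.abs(corr[:3, 3:]).max()

print(f"mean marker-marker correlation: {mean_marker_corr:.2f}")
print(f"max |marker-other| correlation: {max_cross_corr:.2f}")
```

In this toy setting the marker genes form a tight block in the correlation matrix, which is the kind of structure a semi-supervised refinement of a published marker list could exploit.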


2020 ◽  
Author(s):  
Jianhao Peng ◽  
Ullas V. Chembazhi ◽  
Sushant Bangru ◽  
Ian M. Traniello ◽  
Auinash Kalsotra ◽  
...  

Abstract. Motivation: With the use of single-cell RNA sequencing (scRNA-Seq) technologies, it is now possible to acquire gene expression data for each individual cell in samples containing up to millions of cells. These cells can be further grouped into different states along an inferred cell differentiation path, which are potentially characterized by similar, but distinct enough, gene regulatory networks (GRNs). Hence, it would be desirable for scRNA-Seq GRN inference methods to capture the GRN dynamics across cell states. However, current GRN inference methods produce a unique GRN per input dataset (or independent GRNs per cell state), failing to capture these regulatory dynamics. Results: We propose a novel single-cell GRN inference method, named SimiC, that jointly infers the GRNs corresponding to each state. SimiC models the GRN inference problem as a LASSO optimization problem with an added similarity constraint on the GRNs associated with contiguous cell states, which captures the inter-cell-state homogeneity. We show on mouse hepatocyte single-cell data generated after partial hepatectomy that, contrary to previous GRN methods for scRNA-Seq data, SimiC is able to capture the transcription factor (TF) dynamics across liver regeneration, as well as the cell-level behavior of the regulatory program of each TF across cell states. In addition, on a honey bee scRNA-Seq experiment, SimiC is able to capture the increased heterogeneity of cells in whole-brain tissue with respect to a regional analysis tissue, and the TFs associated specifically with each sequenced tissue. Availability: SimiC is written in Python and includes an R API. It can be downloaded from https://github.com/jianhao2016/[email protected], [email protected] Supplementary information: Supplementary data are available at the code repository.
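One way to write a LASSO problem with a similarity term between contiguous cell states is sketched below. The notation is an assumption made for illustration and is not taken from the abstract: $K$ ordered cell states, $X_k$ the TF expression matrix and $Y_k$ the target-gene expression matrix for the $n_k$ cells in state $k$, $W_k$ the state-specific GRN weights, and $\lambda_1, \lambda_2 \ge 0$ tuning parameters. The exact form used by SimiC may differ.

```latex
\min_{W_1,\dots,W_K}\;
  \sum_{k=1}^{K} \frac{1}{2 n_k}\,\bigl\lVert Y_k - X_k W_k \bigr\rVert_F^2
  \;+\; \lambda_1 \sum_{k=1}^{K} \lVert W_k \rVert_1
  \;+\; \lambda_2 \sum_{k=1}^{K-1} \bigl\lVert W_k - W_{k+1} \bigr\rVert_F^2
```

The first two terms are an ordinary LASSO fit per state; the third couples neighbouring states. With $\lambda_2 = 0$ the problem separates into independent GRNs per state, and as $\lambda_2$ grows the networks are pulled toward a single shared GRN.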


Author(s):  
Srikanth Ravichandran ◽  
András Hartmann ◽  
Antonio del Sol

Abstract. Summary: Single-cell RNA-sequencing is increasingly employed to characterize disease or ageing cell subpopulation phenotypes. Despite the exponential increase in data generation, systematic identification of key regulatory factors for controlling cellular phenotype to enable cell rejuvenation in disease or ageing remains a challenge. Here, we present SigHotSpotter, a computational tool to predict hotspots of signaling pathways responsible for the stable maintenance of cell subpopulation phenotypes, by integrating signaling and transcriptional networks. Targeted perturbation of these signaling hotspots can enable precise control of cell subpopulation phenotypes. SigHotSpotter correctly predicts the signaling hotspots with known experimental validations in different cellular systems. The tool is simple and user-friendly, and is available as a web server or as stand-alone software. We believe SigHotSpotter will serve as a general-purpose tool for the systematic prediction of signaling hotspots based on single-cell RNA-seq data, and potentiate novel cell rejuvenation strategies in the context of disease and ageing. Availability and implementation: SigHotSpotter is available as a web tool at https://SigHotSpotter.lcsb.uni.lu. Source code, example datasets and other information are available at https://gitlab.com/srikanth.ravichandran/sighotspotter. Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Jiajun Zhang ◽  
Qing Nie ◽  
Tianshou Zhou

Abstract. Cell fate decisions play a pivotal role in development, but technologies for dissecting them are limited. We developed a new multifunctional method, Topographer, to construct a ‘quantitative’ Waddington’s landscape from single-cell transcriptomic data. This method is able to identify complex cell-state transition trajectories and to estimate complex cell-type dynamics characterized by fate and transition probabilities. It also infers both marker gene networks and their dynamic changes, as well as dynamic characteristics of transcriptional bursting along the cell-state transition trajectories. Applying this method to single-cell RNA-seq data on the differentiation of primary human myoblasts, we not only identified three known cell types but also estimated both their fate probabilities and the transition probabilities among them. We found that the percentage of genes expressed in a bursty manner is significantly higher at (or near) the branch point (∼97%) than before or after the branch (below 80%), and that both gene-gene and cell-cell correlation degrees are apparently lower near the branch point than away from it. Topographer reveals cell fate mechanisms in a coherent way at three scales: cell lineage (macroscopic), gene network (mesoscopic) and gene expression (microscopic).


2019 ◽  
Author(s):  
Chenling Xu ◽  
Romain Lopez ◽  
Edouard Mehlman ◽  
Jeffrey Regier ◽  
Michael I. Jordan ◽  
...  

Abstract. As single-cell transcriptomics becomes a mainstream technology, the natural next step is to integrate the accumulating data in order to achieve a common ontology of cell types and states. However, owing to various nuisance factors of variation, it is not straightforward to compare gene expression levels across data sets or to automatically assign cell type labels in a new data set based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of cohorts of single-cell RNA-seq data sets, while accounting for uncertainty caused by biological and measurement noise. We also introduce single-cell ANnotation using Variational Inference (scANVI), a semi-supervised variant of scVI designed to leverage any available cell state annotations, for instance when only one data set in a cohort is annotated, or when only a few cells in a single data set can be labeled using marker genes. We demonstrate that scVI and scANVI compare favorably to the existing methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings such as a hierarchical structure of cell state labels. We further show that, unlike existing methods, scVI and scANVI represent the integrated data sets with a single generative model that can be directly used for any probabilistic decision-making task, using differential expression as our case study. scVI and scANVI are available as open source software and can be readily used to facilitate cell state annotation and help ensure consistency and reproducibility across studies.


Author(s):  
Boxun Li ◽  
Gary C. Hon

As we near a complete catalog of mammalian cell types, the capability to engineer specific cell types on demand would transform biomedical research and regenerative medicine. However, the current pace of discovering new cell types far outstrips our ability to engineer them. One attractive strategy for cellular engineering is direct reprogramming, where induction of specific transcription factor (TF) cocktails orchestrates cell state transitions. Here, we review the foundational studies of TF-mediated reprogramming in the context of a general framework for cell fate engineering, which consists of: discovering new reprogramming cocktails, assessing engineered cells, and revealing molecular mechanisms. Traditional bulk reprogramming methods established a strong foundation for TF-mediated reprogramming, but were limited by their small scale and difficulty resolving cellular heterogeneity. Recently, single-cell technologies have overcome these challenges to rapidly accelerate progress in cell fate engineering. In the next decade, we anticipate that these tools will enable unprecedented control of cell state.


2019 ◽  
Author(s):  
Heyrim Cho ◽  
Russell C. Rockne

Abstract. Single-cell sequencing technologies have revolutionized molecular and cellular biology and stimulated the development of computational tools to analyze the data generated from these technology platforms. However, despite the recent explosion of computational analysis tools, relatively few mathematical models have been developed to utilize these data. Here we compare and contrast two approaches for building mathematical models of cell state-transitions with single-cell RNA-sequencing data, with hematopoiesis as a model system: by solving partial differential equations on a graph representing discrete cell state relationships, and by solving the equations on a continuous cell state-space. We demonstrate how to calibrate model parameters from single or multiple time-point single-cell sequencing data, and examine the effects of data processing algorithms on the model calibration and predictions. As an application of our approach, we demonstrate how the calibrated models may be used to mathematically perturb normal hematopoiesis to simulate, predict, and study the emergence of novel cell types during the pathogenesis of acute myeloid leukemia. The mathematical modeling framework we present is general and can be applied to study cell state-transitions in any single-cell genome sequencing dataset.

Author summary: Here we compare and contrast graph- and continuum-based approaches for constructing mathematical models of cell state-transitions using single-cell RNA-sequencing data. Using two publicly available datasets, we demonstrate how to calibrate mathematical models of hematopoiesis and how to use the models to predict dynamics of acute myeloid leukemia pathogenesis by mathematically perturbing the process of cellular proliferation and differentiation. We apply these modeling approaches to study the effects of perturbing individual genes or sets of genes in subsets of cells, or by modeling the dynamics of cell state-transitions directly in a reduced dimensional space. We examine the effects of different graph abstraction and trajectory inference algorithms on calibrating the models and the subsequent model predictions. We conclude that both the graph- and continuum-based modeling approaches can be equally well calibrated to data and discuss situations in which one method may be preferable over the other. This work presents a general mathematical modeling framework, applicable to any single-cell sequencing dataset where cell state-transitions are of interest.


2017 ◽  
Vol 11 ◽  
pp. 117793221771224 ◽  
Author(s):  
Thomas Buder ◽  
Andreas Deutsch ◽  
Michael Seifert ◽  
Anja Voss-Böhme

Many normal and cancerous cell lines exhibit a stable composition of cells in distinct states which can, e.g., be defined on the basis of cell surface markers. There is evidence that such an equilibrium is associated with stochastic transitions between distinct states. Quantifying these transitions has the potential to better understand cell lineage compositions. We introduce CellTrans, an R package to quantify stochastic cell state transitions from cell state proportion data from fluorescence-activated cell sorting and flow cytometry experiments. The R package is based on a mathematical model in which cell state alterations occur due to stochastic transitions between distinct cell states whose rates only depend on the current state of a cell. CellTrans is an automated tool for estimating the underlying transition probabilities from appropriately prepared data. We point out potential analytical challenges in the quantification of these cell transitions and explain how CellTrans handles them. The applicability of CellTrans is demonstrated on publicly available data on the evolution of cell state compositions in cancer cell lines. We show that CellTrans can be used to (1) infer the transition probabilities between different cell states, (2) predict cell line compositions at a certain time, (3) predict equilibrium cell state compositions, and (4) estimate the time needed to reach this equilibrium. We provide an implementation of CellTrans in R, freely available via GitHub ( https://github.com/tbuder/CellTrans ).
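The four uses listed in this abstract follow from treating cell state proportions as a discrete-time Markov chain. The sketch below is a minimal Python illustration of that idea, not the CellTrans R package or its API: the function names, the two-state transition matrix, and the assumption that compositions are measured exactly one time step apart are all hypothetical.

```python
import numpy as np

def estimate_transition_matrix(w0, w1):
    """Solve w0 @ P = w1 for P; rows of w0 and w1 are cell state
    compositions of the same experiments one time step apart."""
    return np.linalg.solve(w0, w1)

def predict_composition(w_init, P, steps):
    """Composition after `steps` time steps, starting from w_init."""
    return w_init @ np.linalg.matrix_power(P, steps)

def equilibrium(P):
    """Stationary composition: left eigenvector of P for the largest
    eigenvalue (1 for a stochastic matrix), normalised to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmax(np.real(vals))])
    return v / v.sum()

# Hypothetical two-state example (rows sum to 1).
P_true = np.array([[0.9, 0.1],
                   [0.3, 0.7]])

# Each experiment starts from a pure (sorted) population.
w0 = np.eye(2)
w1 = w0 @ P_true  # noise-free "measurements" after one step

P_hat = estimate_transition_matrix(w0, w1)
pi = equilibrium(P_hat)
after_10 = predict_composition(np.array([1.0, 0.0]), P_hat, 10)

print("estimated P:\n", P_hat)
print("equilibrium composition:", pi)
print("composition after 10 steps:", after_10)
```

With real data the measured compositions are noisy and may be taken several steps apart, so the estimated matrix is not guaranteed to be a valid stochastic matrix; handling such cases is among the analytical challenges the abstract mentions.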


Development ◽  
2020 ◽  
pp. dev.194027
Author(s):  
Tiffany L. Dill ◽  
Alina Carroll ◽  
Amanda Pinheiro ◽  
Jiachen Gao ◽  
Francisco J. Naya

Formation of skeletal muscle is among the most striking examples of cellular plasticity in animal tissue development, where muscle progenitor cells are reprogrammed by epithelial-mesenchymal transition (EMT) to produce multinucleated myofibers. The regulation of EMT in muscle formation remains poorly understood. Here, we demonstrate that the long noncoding RNA (lncRNA) Meg3 regulates EMT in myoblast differentiation and skeletal muscle regeneration. Chronic inhibition of Meg3 in C2C12 myoblasts induced EMT, and suppressed cell state transitions required for differentiation. Furthermore, adenoviral Meg3 knockdown compromised muscle regeneration, which was accompanied by abnormal mesenchymal gene expression and interstitial cell proliferation. Transcriptomic and pathway analyses of Meg3-depleted C2C12 myoblasts and injured skeletal muscle revealed a significant dysregulation of EMT-related genes, and identified TGFβ as a key upstream regulator. Importantly, inhibition of TGFβR1 and its downstream effectors, and the EMT transcription factor Snai2, restored many aspects of myogenic differentiation in Meg3-depleted myoblasts in vitro. We further demonstrate that reduction of Meg3-dependent Ezh2 activity results in epigenetic alterations associated with TGFβ activation. Thus, Meg3 regulates myoblast identity to maintain proper cell state for progression into differentiation.

