Probabilistic inference of bifurcations in single-cell data using a hierarchical mixture of factor analysers

10.1101/076547 ◽

2016 ◽

Author(s):

Kieran R. Campbell ◽

Christopher Yau

Keyword(s):

Single Cell ◽

Probabilistic Inference ◽

Bifurcation Structure ◽

Bayesian Hierarchical ◽

Mcmc Sampling ◽

Hierarchical Prior ◽

Transcriptomics Data ◽

Cell Data ◽

Bifurcation Process

Abstract: Modelling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analysers. Our model exhibits competitive performance on large datasets despite implementing full MCMC sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process.

STACAS: Sub-Type Anchor Correction for Alignment in Seurat to integrate single-cell RNA-seq data

10.1101/2020.06.15.152306 ◽

2020 ◽

Cited By ~ 1

Author(s):

Massimo Andreatta ◽

Santiago J. Carmona

Keyword(s):

Single Cell ◽

Distance Measure ◽

Cell Types ◽

R Package ◽

Rna Seq ◽

Batch Effects ◽

Link Type ◽

Transcriptomics Data ◽

Public Repositories ◽

Cell Data

Abstract: Computational tools for the integration of single-cell transcriptomics data are designed to correct batch effects between technical replicates or different technologies applied to the same population of cells. However, they have inherent limitations when applied to heterogeneous sets of data with moderate overlap in cell states or sub-types. STACAS is a package for the identification of integration anchors in the Seurat environment, optimized for the integration of datasets that share only a subset of cell types. We demonstrate that by i) correcting batch effects while preserving relevant biological variability across datasets, ii) filtering aberrant integration anchors with a quantitative distance measure, and iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. We anticipate that the algorithm will be a useful tool for the construction of comprehensive single-cell atlases by integration of the growing amount of single-cell data becoming available in public repositories.
Code availability: R package: https://github.com/carmonalab/STACAS Docker image: https://hub.docker.com/repository/docker/mandrea1/stacas_demo

SpatialExperiment: infrastructure for spatially resolved transcriptomics data in R using Bioconductor

10.1101/2021.01.27.428431 ◽

2021 ◽

Author(s):

Dario Righelli ◽

Lukas M. Weber ◽

Helena L. Crowell ◽

Brenda Pardo ◽

Leonardo Collado-Torres ◽

...

Keyword(s):

Single Cell ◽

Spatial Information ◽

Data Infrastructure ◽

R Programming Language ◽

Spatially Resolved ◽

Transcriptomics Data ◽

Visualization Tools ◽

R Programming ◽

Cell Data ◽

Technological Platforms

Abstract
Motivation: Spatially resolved transcriptomics is a new set of technologies to measure gene expression for up to thousands of genes at near-single-cell, single-cell, or sub-cellular resolution, together with the spatial positions of the measurements. Analyzing combined molecular and spatial information has generated new insights about biological processes that manifest in a spatial manner within tissues. However, to efficiently analyze these data, specialized data infrastructure is required, which facilitates storage, retrieval, subsetting, and interfacing with downstream tools.
Results: Here, we describe SpatialExperiment, a new data infrastructure for storing and accessing spatially resolved transcriptomics data, implemented within the Bioconductor framework in the R programming language. SpatialExperiment extends the existing SingleCellExperiment for single-cell data from the Bioconductor framework, which brings with it advantages of modularity, interoperability, standardized operations, and comprehensive documentation. We demonstrate the structure and user interface with examples from the 10x Genomics Visium and seqFISH platforms. SpatialExperiment is extendable to alternative technological platforms measuring expression and to new types of data modalities, such as spatial immunofluorescence or proteomics, in the future. We also provide access to example datasets and visualization tools in the STexampleData, TENxVisiumData, and ggspavis packages.
Availability and Implementation: SpatialExperiment is freely available from Bioconductor at https://bioconductor.org/packages/SpatialExperiment. The STexampleData, TENxVisiumData, and ggspavis packages are available from GitHub and will be submitted to Bioconductor.

DTFLOW: Inference and Visualization of Single-cell Pseudo-temporal Trajectories Using Diffusion Propagation

10.1101/2020.09.10.290973 ◽

2020 ◽

Author(s):

Jiangyong Wei ◽

Tianshou Zhou ◽

Xinan Zhang ◽

Tianhai Tian

Keyword(s):

Single Cell ◽

Reduction Method ◽

Developmental Trajectories ◽

Branching Processes ◽

Single Cells ◽

New Method ◽

Developmental States ◽

Novel Approach ◽

Cell Data

Abstract: One of the major challenges in single-cell data analysis is the determination of cellular developmental trajectories. Although many studies have been conducted in recent years, more effective methods are still needed to infer developmental processes accurately. In this work we devise a new method, named DTFLOW, for determining pseudo-temporal trajectories with multiple branches. The method consists of two major steps: a new dimension reduction method, Bhattacharyya kernel feature decomposition (BKFD), and a novel approach, Reverse Searching on kNN Graph (RSKG), to identify the underlying multi-branching processes of cellular differentiation. In BKFD we first establish a stationary distribution for each cell, based on the random walk with restart algorithm, to represent the transition of cellular developmental states, and then propose a new distance metric for calculating the pseudo-times of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined using four single-cell datasets. We compare the efficiency of the new method with that of two state-of-the-art methods. Simulation results suggest that the proposed method has superior accuracy and strong robustness for constructing pseudo-time trajectories. Availability: DTFLOW is implemented in Python and available at https://github.com/statway/DTFLOW.
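
Illustrative sketch (not the DTFLOW implementation): the two ingredients named in the abstract, a random walk with restart on a kNN graph and a Bhattacharyya kernel between the resulting per-cell distributions, can be written in a few lines of NumPy. The function names, the symmetrised kNN graph, and the parameter defaults (`k`, `restart`) are assumptions made for this example; the authors' repository should be consulted for the actual choices.

```python
import numpy as np

def knn_transition_matrix(X, k=3):
    """Row-stochastic transition matrix of a symmetrised kNN graph (assumed construction)."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # a cell is not its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest cells
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    A = np.maximum(A, A.T)                      # make the graph undirected
    return A / A.sum(axis=1, keepdims=True)

def rwr_distributions(P, restart=0.3):
    """Row i is the stationary distribution of a walk that restarts at cell i.

    Solves S = restart * I + (1 - restart) * S @ P in closed form.
    """
    n = P.shape[0]
    S = restart * np.linalg.inv(np.eye(n) - (1.0 - restart) * P)
    return np.clip(S, 0.0, None)                # guard against tiny negative round-off

def bhattacharyya_kernel(S):
    """K[i, j] = sum_m sqrt(S[i, m] * S[j, m]), the Bhattacharyya coefficient."""
    R = np.sqrt(S)
    return R @ R.T
```

Each row of the restart matrix sums to one, so the kernel has a unit diagonal and takes values between 0 and 1. Cells whose restart distributions overlap strongly, typically graph neighbours, receive values close to 1.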

Comparison of Resources and Methods to infer Cell-Cell Communication from Single-cell RNA Data

10.1101/2021.05.21.445160 ◽

2021 ◽

Author(s):

Daniel Dimitrov ◽

Dénes Türei ◽

Charlotte Boys ◽

James S. Nagai ◽

Ricardo O. Ramirez Flores ◽

...

Keyword(s):

Single Cell ◽

Cell Communication ◽

Rna Seq ◽

Intercellular Interactions ◽

Communication Events ◽

Transcriptomics Data ◽

The Impact ◽

Cell Data ◽

Shed Light ◽

Cell Cell

Abstract: The growing availability of single-cell data has sparked increased interest in the inference of cell-cell communication from these data. Many tools have been developed for this purpose. Each of them consists of a resource of prior knowledge on intercellular interactions and a method to predict potential cell-cell communication events. Yet the impact of the choice of resource and method on the resulting predictions is largely unknown. To shed light on this, we created a framework, available at https://github.com/saezlab/ligrec_decoupler, to facilitate a comparative assessment of methods for inferring cell-cell communication from single-cell transcriptomics data, and then compared 15 resources and 6 methods. We found few unique interactions and a varying degree of overlap among the resources, and observed uneven coverage in terms of pathways and biological categories. We analysed a colorectal cancer single-cell RNA-Seq dataset using all possible combinations of methods and resources. We found major differences among the highest-ranked intercellular interactions inferred by each method, even when using the same resources. The varying predictions lead to fundamentally different biological interpretations, highlighting the need to benchmark resources and methods.
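
Illustrative sketch (not the authors' framework): one simple way to quantify "overlap among the resources" and "unique interactions" is to treat each resource as a set of ligand-receptor pairs and compute pairwise Jaccard indices and per-resource set differences. The Jaccard index is an assumption chosen for this example; the abstract does not state which overlap measure was used. All names here are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard index of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_overlap(resources):
    """Jaccard index for every unordered pair of resources, keyed by sorted names."""
    return {
        (m, n): jaccard(resources[m], resources[n])
        for m, n in combinations(sorted(resources), 2)
    }

def unique_interactions(resources):
    """Interactions that appear in exactly one resource, per resource."""
    unique = {}
    for name, pairs in resources.items():
        others = set().union(*(p for n, p in resources.items() if n != name))
        unique[name] = pairs - others
    return unique
```

A resource dictionary maps a resource name to a set of `(ligand, receptor)` tuples, for example `{"A": {("L1", "R1")}, "B": {("L1", "R1"), ("L2", "R2")}}` with placeholder identifiers.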

Faculty Opinions recommendation of Systems biology. Conditional density-based analysis of T cell signaling in single-cell data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.723891088.793520867 ◽

2016 ◽

Author(s):

Anuj Kumar

Keyword(s):

Systems Biology ◽

T Cell ◽

Cell Signaling ◽

Single Cell ◽

Conditional Density ◽

T Cell Signaling ◽

Cell Data

Faculty Opinions recommendation of Inferential Structure Determination of Chromosomes from Single-Cell Hi-C Data.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.727146656.793528084 ◽

2017 ◽

Author(s):

Iddo Friedberg

Keyword(s):

Single Cell ◽

Structure Determination

Prioritization of cell types responsive to biological perturbations in single-cell data with Augur

Nature Protocols ◽

10.1038/s41596-021-00561-x ◽

2021 ◽

Author(s):

Jordan W. Squair ◽

Michael A. Skinnider ◽

Matthieu Gautier ◽

Leonard J. Foster ◽

Grégoire Courtine

Keyword(s):

Single Cell ◽

Cell Types ◽

Cell Data

Identifying cell types from single-cell data based on similarities and dissimilarities between cells

BMC Bioinformatics ◽

10.1186/s12859-020-03873-z ◽

2021 ◽

Vol 22 (S3) ◽

Author(s):

Yuanyuan Li ◽

Ping Luo ◽

Yi Lu ◽

Fang-Xiang Wu

Keyword(s):

Gene Expression ◽

Single Cell ◽

Spectral Clustering ◽

Incidence Matrix ◽

Expression Patterns ◽

Cell Types ◽

Clustering Method ◽

Different Types ◽

Cell Data ◽

Spectral Clustering Method

Abstract
Background: With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, the clustering of cell types becomes more complex with the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods, such as spectral clustering, perform well in identifying cell types, they consider only the similarities between cells and ignore the influence of dissimilarities on clustering results. This may limit the performance of most conventional clustering algorithms in identifying clusters, so special methods need to be developed for high-dimensional sparse categorical data.
Results: Inspired by the observation that cells of the same type have similar gene expression patterns, whereas cells of different types have dissimilar gene expression patterns, we improve the existing spectral clustering method for single-cell data by basing it on both similarities and dissimilarities between cells. The method first measures the similarity and dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and finally uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.
Conclusions: In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.
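
Illustrative sketch (not the authors' method): the pipeline described in the Results, measuring similarity and dissimilarity, fusing them into one matrix, reducing dimension spectrally, and running K-means, can be outlined with NumPy. The abstract does not specify the measures or the fusion rule, so everything specific here is an assumption: a Gaussian kernel as similarity, a correlation-based dissimilarity, the product rule `similarity * (1 - dissimilarity)`, and all function names. The embedding uses the leading eigenvectors of the normalised matrix, as in standard spectral clustering.

```python
import numpy as np

def fused_affinity(X):
    """Fuse a similarity and a dissimilarity matrix (fusion rule is an assumption).

    X is cells x genes; rows must not be constant, or the correlation is undefined.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    sigma = np.median(dist[np.triu_indices_from(dist, k=1)])   # bandwidth heuristic
    similarity = np.exp(-dist ** 2 / (2.0 * sigma ** 2))       # values in (0, 1]
    corr = np.clip(np.corrcoef(X), -1.0, 1.0)                  # cell-cell correlation
    dissimilarity = (1.0 - corr) / 2.0                         # values in [0, 1]
    W = similarity * (1.0 - dissimilarity)                     # down-weight dissimilar pairs
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embedding(W, k):
    """Row-normalised top-k eigenvectors of D^{-1/2} W D^{-1/2}."""
    inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    M = inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(M)                                # eigenvalues ascending
    E = vecs[:, -k:]
    return E / np.maximum(np.linalg.norm(E, axis=1, keepdims=True), 1e-12)

def kmeans(E, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centres = [E[0]]
    for _ in range(1, k):
        d = np.min([((E - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(E[np.argmax(d)])
    centres = np.array(centres)
    for _ in range(n_iter):
        d = ((E[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([E[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
                        for j in range(k)])
        if np.allclose(new, centres):
            break
        centres = new
    return labels

def fused_spectral_clustering(X, k):
    """Cluster cells using the fused affinity, spectral embedding, then K-means."""
    return kmeans(spectral_embedding(fused_affinity(X), k), k)
```

Calling `fused_spectral_clustering(X, k)` on a cells-by-genes matrix returns one integer label per cell. Replacing the product rule in `fused_affinity` with another fusion is a one-line change, which makes it easy to compare against similarity-only spectral clustering.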
