Probabilistic inference of bifurcations in single-cell data using a hierarchical mixture of factor analysers

2016 ◽  
Author(s):  
Kieran R. Campbell ◽  
Christopher Yau

Abstract: Modelling bifurcations in single-cell transcriptomics data has become an increasingly popular field of research. Several methods have been proposed to infer bifurcation structure from such data, but all rely on heuristic non-probabilistic inference. Here we propose the first generative, fully probabilistic model for such inference based on a Bayesian hierarchical mixture of factor analysers. Our model exhibits competitive performance on large datasets despite implementing full MCMC sampling, and its unique hierarchical prior structure enables automatic determination of genes driving the bifurcation process.

Author(s):  
Massimo Andreatta ◽  
Santiago J. Carmona

Abstract: Computational tools for the integration of single-cell transcriptomics data are designed to correct batch effects between technical replicates or different technologies applied to the same population of cells. However, they have inherent limitations when applied to heterogeneous sets of data with moderate overlap in cell states or sub-types. STACAS is a package for the identification of integration anchors in the Seurat environment, optimized for the integration of datasets that share only a subset of cell types. We demonstrate that by i) correcting batch effects while preserving relevant biological variability across datasets, ii) filtering aberrant integration anchors with a quantitative distance measure, and iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. We anticipate that the algorithm will be a useful tool for the construction of comprehensive single-cell atlases by integration of the growing amount of single-cell data becoming available in public repositories. Code availability: R package: https://github.com/carmonalab/STACAS; Docker image: https://hub.docker.com/repository/docker/mandrea1/stacas_demo


2021 ◽  
Author(s):  
Dario Righelli ◽  
Lukas M. Weber ◽  
Helena L. Crowell ◽  
Brenda Pardo ◽  
Leonardo Collado-Torres ◽  
...  

Abstract: Motivation: Spatially resolved transcriptomics is a new set of technologies to measure gene expression for up to thousands of genes at near-single-cell, single-cell, or sub-cellular resolution, together with the spatial positions of the measurements. Analyzing combined molecular and spatial information has generated new insights about biological processes that manifest in a spatial manner within tissues. However, to efficiently analyze these data, specialized data infrastructure is required, which facilitates storage, retrieval, subsetting, and interfacing with downstream tools. Results: Here, we describe SpatialExperiment, a new data infrastructure for storing and accessing spatially resolved transcriptomics data, implemented within the Bioconductor framework in the R programming language. SpatialExperiment extends the existing SingleCellExperiment for single-cell data from the Bioconductor framework, which brings with it advantages of modularity, interoperability, standardized operations, and comprehensive documentation. We demonstrate the structure and user interface with examples from the 10x Genomics Visium and seqFISH platforms. SpatialExperiment is extendable to alternative technological platforms measuring expression and to new types of data modalities, such as spatial immunofluorescence or proteomics, in the future. We also provide access to example datasets and visualization tools in the STexampleData, TENxVisiumData, and ggspavis packages. Availability and Implementation: SpatialExperiment is freely available from Bioconductor at https://bioconductor.org/packages/SpatialExperiment. The STexampleData, TENxVisiumData, and ggspavis packages are available from GitHub and will be submitted to Bioconductor.


2020 ◽  
Author(s):  
Jiangyong Wei ◽  
Tianshou Zhou ◽  
Xinan Zhang ◽  
Tianhai Tian

Abstract: One of the major challenges in single-cell data analysis is the determination of cellular developmental trajectories using single-cell data. Although many studies have been conducted in recent years, more effective methods are still needed to infer developmental processes accurately. In this work we devise a new method, named DTFLOW, for determining pseudo-temporal trajectories with multiple branches. The method consists of two major steps: a new dimension reduction method, Bhattacharyya kernel feature decomposition (BKFD), and a novel approach, named Reverse Searching on kNN Graph (RSKG), to identify the underlying multi-branching processes of cellular differentiation. In BKFD we first establish a stationary distribution for each cell to represent the transition of cellular developmental states based on the random walk with restart algorithm, and then propose a new distance metric for calculating pseudo-times of single cells by introducing the Bhattacharyya kernel matrix. The effectiveness of DTFLOW is rigorously examined using four single-cell datasets. We compare the efficiency of the new method with two state-of-the-art methods. Simulation results suggest that our proposed method has superior accuracy and strong robustness properties for constructing pseudo-time trajectories. Availability: DTFLOW is implemented in Python and available at https://github.com/statway/DTFLOW.
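The two ingredients named in the BKFD step, a random walk with restart and a Bhattacharyya-based comparison of the resulting distributions, can be illustrated on a toy graph. The following is a minimal sketch, not the DTFLOW implementation: the function names, the restart probability of 0.3, the four-node path graph, and the use of the standard Bhattacharyya distance (negative log of the Bhattacharyya coefficient) are all assumptions made for illustration.

```python
import math


def rwr_stationary(adj, seed, restart=0.3, n_iter=200):
    """Stationary distribution of a random walk with restart from `seed`.

    `adj` is a symmetric adjacency matrix (list of lists). The restart
    probability and iteration count are illustrative assumptions.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    p = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(n_iter):
        nxt = []
        for i in range(n):
            # Probability mass arriving at node i from its neighbours.
            walk = sum(adj[j][i] / deg[j] * p[j] for j in range(n) if deg[j] > 0)
            # With probability `restart`, the walker jumps back to the seed.
            nxt.append((1.0 - restart) * walk + (restart if i == seed else 0.0))
        p = nxt
    return p


def bhattacharyya_distance(p, q):
    """Standard Bhattacharyya distance; assumes p and q share some support."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(bc)


# Hypothetical toy data: four "cells" connected in a line, 0 - 1 - 2 - 3.
PATH = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
dists = [rwr_stationary(PATH, s) for s in range(4)]
print("d(0, 1) =", bhattacharyya_distance(dists[0], dists[1]))
print("d(0, 3) =", bhattacharyya_distance(dists[0], dists[3]))
```

In this toy setting the distance from cell 0 grows with graph separation, which is the property a pseudo-time ordering relies on.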


2021 ◽  
Author(s):  
Daniel Dimitrov ◽  
Dénes Türei ◽  
Charlotte Boys ◽  
James S. Nagai ◽  
Ricardo O. Ramirez Flores ◽  
...  

The growing availability of single-cell data has sparked an increased interest in the inference of cell-cell communication from these data. Many tools have been developed for this purpose. Each of them consists of a resource of prior knowledge on intercellular interactions and a method to predict potential cell-cell communication events. Yet the impact of the choice of resource and method on the resulting predictions is largely unknown. To shed light on this, we created a framework, available at https://github.com/saezlab/ligrec_decoupler, to facilitate a comparative assessment of methods for inferring cell-cell communication from single-cell transcriptomics data, and then compared 15 resources and 6 methods. We found few unique interactions and a varying degree of overlap among the resources, and observed uneven coverage in terms of pathways and biological categories. We analysed a colorectal cancer single-cell RNA-Seq dataset using all possible combinations of methods and resources. We found major differences among the highest-ranked intercellular interactions inferred by each method, even when using the same resources. The varying predictions lead to fundamentally different biological interpretations, highlighting the need to benchmark resources and methods.


2021 ◽  
Author(s):  
Jordan W. Squair ◽  
Michael A. Skinnider ◽  
Matthieu Gautier ◽  
Leonard J. Foster ◽  
Grégoire Courtine

2021 ◽  
Vol 22 (S3) ◽  
Author(s):  
Yuanyuan Li ◽  
Ping Luo ◽  
Yi Lu ◽  
Fang-Xiang Wu

Abstract: Background: With the development of single-cell sequencing technology, revealing homogeneity and heterogeneity between cells has become a new area of computational systems biology research. However, the clustering of cell types becomes more complex with the mutual penetration between different types of cells and the instability of gene expression. One way of overcoming this problem is to group similar, related single cells together by means of various clustering analysis methods. Although some methods, such as spectral clustering, can do well in the identification of cell types, they only consider the similarities between cells and ignore the influence of dissimilarities on clustering results. This may limit the performance of most conventional clustering algorithms for the identification of clusters; special methods need to be developed for high-dimensional sparse categorical data. Results: Inspired by the observation that cells of the same type have similar gene expression patterns, whereas cells of different types show dissimilar gene expression patterns, we improve the existing spectral clustering method for clustering single-cell data so that it is based on both similarities and dissimilarities between cells. The method first measures the similarity/dissimilarity among cells, then constructs the incidence matrix by fusing the similarity matrix with the dissimilarity matrix, and, finally, uses the eigenvalues of the incidence matrix to perform dimensionality reduction and employs the K-means algorithm in the low-dimensional space to achieve clustering. The proposed improved spectral clustering method is compared with the conventional spectral clustering method in recognizing cell types on several real single-cell RNA-seq datasets.
Conclusions: In summary, we show that adding intercellular dissimilarity can effectively improve accuracy and achieve robustness, and that the improved spectral clustering method outperforms the traditional spectral clustering method in grouping cells.
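The three steps described in the Results (measure similarity and dissimilarity, fuse them into one matrix, then embed and run K-means) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not give the fusion rule, so the sketch assumes a Gaussian-kernel similarity, a scaled Euclidean dissimilarity, and the fusion `S - alpha * D`; the function name, `alpha`, and the toy data are hypothetical, and the embedding uses the leading eigenvectors of the fused matrix.

```python
import numpy as np


def fused_spectral_clusters(X, k=2, alpha=0.5, n_iter=20):
    """Cluster rows of X using a fused similarity/dissimilarity matrix.

    Assumed for illustration: Gaussian similarity, scaled Euclidean
    dissimilarity, and fusion W = S - alpha * D.
    """
    # Step 1: pairwise similarity S and dissimilarity D between cells.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    sigma = d.mean()
    S = np.exp(-(d ** 2) / (2.0 * sigma ** 2))
    D = d / d.max()

    # Step 2: fuse both matrices into one symmetric matrix (assumed rule).
    W = S - alpha * D

    # Step 3: embed cells with the k leading eigenvectors of W.
    _, vecs = np.linalg.eigh(W)  # eigenvalues returned in ascending order
    emb = vecs[:, -k:]

    # Step 4: plain K-means with deterministic farthest-point initialisation.
    centers = [emb[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(emb - c, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(nearest)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist_to_centers = np.linalg.norm(emb[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist_to_centers, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels


# Hypothetical toy "expression profiles": two well-separated groups of cells.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
print(fused_spectral_clusters(X, k=2))
```

Subtracting the dissimilarity term pushes entries between distant cells below zero, which is one simple way to let dissimilar pairs actively repel each other in the embedding rather than merely contributing a small positive weight.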


Cell ◽  
2021 ◽  
Author(s):  
Yuhan Hao ◽  
Stephanie Hao ◽  
Erica Andersen-Nissen ◽  
William M. Mauck ◽  
Shiwei Zheng ◽  
...  
