scMARK an 'MNIST' like benchmark to evaluate and optimize models for unifying scRNA data

2021 ◽  
Author(s):  
Swechha Singh ◽  
Dylan Mendonca ◽  
Octavian Focsa ◽  
Juan Javier Diaz-Mejia ◽  
Sam Cooper

Today's single-cell RNA analysis tools provide enormous value in enabling researchers to make sense of large single-cell RNA (scRNA) studies, yet their ability to integrate different studies at scale remains untested. Here we present a novel benchmark dataset (scMARK) that consists of 100,000 cells over 10 studies and can test how well models unify data from different scRNA studies. We also introduce a two-step framework that uses supervised models to evaluate how well unsupervised models integrate scRNA data from the 10 studies. Using this framework, we show that the Variational Autoencoder scVI is the only tool tested that can integrate scRNA studies at scale. Overall, this work paves the way to creating large scRNA atlases and 'off-the-shelf' analysis tools.
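The two-step idea in this abstract (an unsupervised model produces an embedding, then a supervised model scores how well that embedding transfers across studies) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the scMARK paper: PCA stands in for the unsupervised model (it is not scVI), a nearest-centroid classifier stands in for the supervised model, the leave-one-study-out split and accuracy score are one plausible evaluation choice, and the data are synthetic.

```python
import numpy as np


def pca_embed(X, n_components=2):
    """Step 1 (stand-in unsupervised model): project centered data onto top PCs."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def nearest_centroid_fit(Z, y):
    """Step 2a (stand-in supervised model): one centroid per cell-type label."""
    labels = np.unique(y)
    centroids = np.stack([Z[y == label].mean(axis=0) for label in labels])
    return labels, centroids


def nearest_centroid_predict(Z, labels, centroids):
    """Step 2b: assign each cell the label of its closest centroid."""
    dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dist.argmin(axis=1)]


def leave_one_study_out_accuracy(Z, cell_type, study):
    """Train on all studies but one, test on the held-out study, for each study."""
    scores = {}
    for s in np.unique(study):
        train, test = study != s, study == s
        labels, centroids = nearest_centroid_fit(Z[train], cell_type[train])
        pred = nearest_centroid_predict(Z[test], labels, centroids)
        scores[int(s)] = float((pred == cell_type[test]).mean())
    return scores


# Synthetic toy data (hypothetical): 3 studies, 2 cell types, 20 "genes".
rng = np.random.default_rng(0)
n_studies, n_types, n_cells, n_genes = 3, 2, 50, 20
type_means = rng.normal(0.0, 5.0, size=(n_types, n_genes))     # biological signal
study_shift = rng.normal(0.0, 0.5, size=(n_studies, n_genes))  # small batch effect
study = np.repeat(np.arange(n_studies), n_types * n_cells)
cell_type = np.tile(np.repeat(np.arange(n_types), n_cells), n_studies)
X = type_means[cell_type] + study_shift[study] + rng.normal(size=(study.size, n_genes))

Z = pca_embed(X, n_components=2)  # embedding is fit without using any labels
scores = leave_one_study_out_accuracy(Z, cell_type, study)
print(scores)
```

In this sketch, a high accuracy on each held-out study suggests the embedding places the same cell types from different studies close together; a large batch effect relative to the cell-type signal would be expected to lower it.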

2021 ◽  
Vol 23 (7) ◽  
Author(s):  
Sally Yu Shi ◽  
Xin Luo ◽  
Tracy M. Yamawaki ◽  
Chi-Ming Li ◽  
Brandon Ason ◽  
...  

Purpose of Review: Cardiac fibroblast activation contributes to fibrosis, maladaptive remodeling and heart failure progression. This review summarizes the latest findings on cardiac fibroblast activation dynamics derived from single-cell transcriptomic analyses and discusses how this information may aid the development of new multispecific medicines. Recent Findings: Advances in single-cell gene expression technologies have led to the discovery of distinct fibroblast subsets, some of which are more prevalent in diseased tissue and exhibit temporal changes in response to injury. In parallel to the rapid development of single-cell platforms, the advent of multispecific therapeutics is beginning to transform the biopharmaceutical landscape, paving the way for the selective targeting of diseased fibroblast subpopulations. Summary: Insights gained from single-cell technologies reveal critical cardiac fibroblast subsets that play a pathogenic role in the progression of heart failure. Combined with the development of multispecific therapeutic agents that have enabled access to previously “undruggable” targets, we are entering a new era of precision medicine.


2021 ◽  
Author(s):  
Luke Ternes ◽  
Mark Dane ◽  
Marilyne Labrie ◽  
Gordon Mills ◽  
Joe Gray ◽  
...  

Image-based cell phenotyping relies on quantitative measurements as encoded representations of cells; however, defining suitable representations that capture complex imaging features is challenging since there are many obstacles, including segmentation and identifying subcellular compartments for feature extraction. Variational autoencoder (VAE) approaches produce encouraging results by mapping from an image to a representative descriptor, and outperform classical hand-crafted features for morphology, intensity, and texture at differentiating data. Although VAEs show promising results for capturing morphological and organizational features in tissue, single cell image analyses based on VAEs often fail to identify biologically informative features due to the intrinsic amount of uninformative variability. Herein, we propose a multi-encoder VAE (ME-VAE) in single cell image analysis using transformed images as a self-supervised signal to extract transform-invariant biologically meaningful features. We show that the proposed architecture improves analysis by making distinct populations more separable compared to traditional VAEs and intensity measurements by enhancing phenotypic differences between cells and by improving correlations to other modalities.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Minzhe Guo ◽  
Yina Du ◽  
Jason J. Gokey ◽  
Samriddha Ray ◽  
Sheila M. Bell ◽  
...  

2019 ◽  
Vol 374 (1786) ◽  
pp. 20190076 ◽  
Author(s):  
Thomas A. Richards ◽  
Ramon Massana ◽  
Stefano Pagliara ◽  
Neil Hall

Cells are the building blocks of life, from single-celled microbes through to multi-cellular organisms. To understand a multitude of biological processes we need to understand how cells behave, how they interact with each other and how they respond to their environment. The use of new methodologies is changing the way we study cells, allowing us to study them on minute scales and in unprecedented detail. These same methods are allowing researchers to begin to sample the vast diversity of microbes that dominate natural environments. The aim of this special issue is to bring together research and perspectives on the application of new approaches to understand the biological properties of cells, including how they interact with other biological entities. This article is part of a discussion meeting issue ‘Single cell ecology’.


2019 ◽  
Author(s):  
Valentine Svensson ◽  
Lior Pachter

Single cell RNA-seq makes possible the investigation of variability in gene expression among cells, and dependence of variation on cell type. Statistical inference methods for such analyses must be scalable, and ideally interpretable. We present an approach based on a modification of a recently published highly scalable variational autoencoder framework that provides interpretability without sacrificing much accuracy. We demonstrate that our approach enables identification of gene programs in massive datasets. Our strategy, namely the learning of factor models with the auto-encoding variational Bayes framework, is not domain specific and may be of interest for other applications.


2017 ◽  
Author(s):  
Dongfang Wang ◽  
Jin Gu

Single cell RNA sequencing (scRNA-seq) is a powerful technique to analyze transcriptomic heterogeneity at the single cell level. Finding an effective low-dimensional representation and visualization of the original data is an important step in studying cell sub-populations and lineages based on scRNA-seq data. scRNA-seq data are much noisier than traditional bulk RNA-Seq data: at the single cell level, the transcriptional fluctuations are much larger than the average of a cell population, and the low amount of RNA transcripts increases the rate of technical dropout events. In this study, we proposed VASC (deep Variational Autoencoder for scRNA-seq data), a deep multi-layer generative model, for the unsupervised dimension reduction and visualization of scRNA-seq data. It can explicitly model the dropout events and find the nonlinear hierarchical feature representations of the original data. Tested on twenty datasets, VASC shows superior performance in most cases and broader dataset compatibility compared with four state-of-the-art dimension reduction methods. Then, for a case study of pre-implantation embryos, VASC successfully re-establishes the cell dynamics and identifies several candidate marker genes associated with early embryo development.


2019 ◽  
Author(s):  
Trung Ngo Trong ◽  
Roger Kramer ◽  
Juha Mehtonen ◽  
Gerardo González ◽  
Ville Hautamäki ◽  
...  

Single-cell transcriptomics offers a tool to study the diversity of cell phenotypes through snapshots of the abundance of mRNA in individual cells. Often there is additional information available besides the single cell gene expression counts, such as bulk transcriptome data from the same tissue, or quantification of surface protein levels from the same cells. In this study, we propose models based on the Bayesian generative approach, where protein quantification available as CITE-seq counts from the same cells is used to constrain the learning process, thus forming a semi-supervised model. The generative model is based on the deep variational autoencoder (VAE) neural network architecture.

