Curated Single Cell Multimodal Landmark Datasets for R/Bioconductor

Background: The majority of high-throughput single-cell molecular profiling methods quantify RNA expression; however, recent multimodal profiling methods add simultaneous measurement of genomic, proteomic, epigenetic, and/or spatial information on the same cells. The development of new statistical and computational methods in Bioconductor for such data will be facilitated by easy availability of landmark datasets using standard data classes. Results: We collected, processed, and packaged publicly available landmark datasets from important single-cell multimodal protocols, including CITE-Seq, ECCITE-Seq, SCoPE2, scNMT, 10X Multiome, seqFISH, and G&T. We integrate data modalities via the MultiAssayExperiment Bioconductor class, document and re-distribute datasets as the SingleCellMultiModal package in the Bioconductor Cloud-based ExperimentHub. The result is single-command actualization of landmark datasets from seven single-cell multimodal data generation technologies, without need for further data processing or wrangling in order to analyze and develop methods within the Bioconductor ecosystem of hundreds of packages for single-cell and multimodal data. Conclusions: We provide two examples of integrative analyses that are greatly simplified by SingleCellMultiModal. The package will facilitate development of bioinformatic and statistical methods in Bioconductor to meet the challenges of integrating molecular layers and analyzing phenotypic outputs including cell differentiation, activity, and disease.

Automated cell-type classification in intact tissues by single-cell molecular profiling

eLife ◽

10.7554/elife.30510 ◽

2018 ◽

Vol 7 ◽

Cited By ~ 46

Author(s):

Monica Nagendran ◽

Daniel P Riordan ◽

Pehr B Harbury ◽

Tushar J Desai

Keyword(s):

Single Cell ◽

Spatial Information ◽

Housekeeping Genes ◽

Cell Types ◽

Molecular Profiling ◽

Mouse Lung ◽

Intact Mouse ◽

Specificity And Sensitivity

A major challenge in biology is identifying distinct cell classes and mapping their interactions in vivo. Tissue-dissociative technologies enable deep single cell molecular profiling but do not provide spatial information. We developed a proximity ligation in situ hybridization technology (PLISH) with exceptional signal strength, specificity, and sensitivity in tissue. Multiplexed data sets can be acquired using barcoded probes and rapid label-image-erase cycles, with automated calculation of single cell profiles, enabling clustering and anatomical re-mapping of cells. We apply PLISH to expression profile ~2900 cells in intact mouse lung, which identifies and localizes known cell types, including rare ones. Unsupervised classification of the cells indicates differential expression of ‘housekeeping’ genes between cell types, and re-mapping of two sub-classes of Club cells highlights their segregated spatial domains in terminal airways. By enabling single cell profiling of various RNA species in situ, PLISH can impact many areas of basic and medical research.

4 Molecularly guided multiplexed digital spatial analysis reveals differential gene expression profiles in the WNT-β-catenin pathway between melanoma and prostate tumors

Journal for ImmunoTherapy of Cancer ◽

10.1136/jitc-2020-sitc2020.0004 ◽

2020 ◽

Vol 8 (Suppl 3) ◽

pp. A4-A4

Author(s):

Anushka Dikshit ◽

Dan Zollinger ◽

Karen Nguyen ◽

Jill McKay-Fleisch ◽

Kit Fuhrman ◽

...

Keyword(s):

Gene Expression ◽

Single Cell ◽

Single Molecule ◽

Spatial Information ◽

Expression Profiles ◽

Gene Expression Profiles ◽

Fluorescence Assay ◽

Differentially Expressed ◽

Tumor Type ◽

Prostate Tumors

Background: The canonical WNT-β-catenin signaling pathway is vital for development and tissue homeostasis but becomes strongly tumorigenic when dysregulated, altering the transcriptional signature of a cell to promote malignant transformation. However, thorough characterization of these transcriptomic signatures has been challenging because traditional methods lack either spatial information, multiplexing, or sensitivity/specificity. To overcome these challenges, we developed a novel workflow combining the single molecule and single cell visualization capabilities of the RNAscope in situ hybridization (ISH) assay with the highly multiplexed spatial profiling capabilities of the GeoMx™ Digital Spatial Profiler (DSP) RNA assays. Using these methods, we sought to spatially profile and compare gene expression signatures of tumor niches with high and low CTNNB1 expression. Methods: After screening 120 tumor cores from multiple tumors for CTNNB1 expression by the RNAscope assay, we identified melanoma as the tumor type with the highest CTNNB1 expression, while prostate tumors had the lowest expression. Using the RNAscope Multiplex Fluorescence assay, we selected regions of high CTNNB1 expression within 3 melanoma tumors as well as regions with low CTNNB1 expression within 3 prostate tumors. These selected regions of interest (ROIs) were then transcriptionally profiled using the GeoMx DSP RNA assay for a set of 78 genes relevant in immuno-oncology. Target genes that were differentially expressed were further visualized and spatially assessed using the RNAscope Multiplex Fluorescence assay to confirm GeoMx DSP data with single cell resolution. Results: The GeoMx DSP analysis comparing the melanoma and prostate tumors revealed that they had significantly different gene expression profiles, and many of these genes showed concordance with CTNNB1 expression. Furthermore, immunoregulatory targets such as ICOSLG, CTLA4, PDCD1 and ARG1 also demonstrated significant correlation with CTNNB1 expression. On validating selected targets using the RNAscope assay, we could distinctly visualize that they were not only highly expressed in melanoma compared to the prostate tumor, but that their expression levels changed proportionally to that of CTNNB1 within the same tumors, suggesting that these differentially expressed genes may be regulated by the WNT-β-catenin pathway. Conclusions: In summary, by combining the RNAscope ISH assay and the GeoMx DSP RNA assay into one joint workflow, we transcriptionally profiled regions of high and low CTNNB1 expression within melanoma and prostate tumors and identified genes potentially regulated by the WNT-β-catenin pathway. This novel workflow can be fully automated and is well suited for interrogating the tumor and stroma and their interactions. GeoMx Assays are for RESEARCH ONLY, not for diagnostics.

Artificial Neural Network System for Cell Classification using Single Cell RNA Expression

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm49941.2020.9313498 ◽

2020 ◽

Author(s):

Xin Lin ◽

Jiahui Zhong ◽

Minjie Lyu ◽

Sen Lin ◽

Derin B. Keskin ◽

...

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Single Cell ◽

Rna Expression ◽

Network System ◽

Cell Classification ◽

Neural Network System ◽

Artificial Neural

SMILE: Mutual Information Learning for Integration of Single-cell Omics Data

Bioinformatics ◽

10.1093/bioinformatics/btab706 ◽

2021 ◽

Author(s):

Yang Xu ◽

Priyojit Das ◽

Rachel Patton McCord

Keyword(s):

Deep Learning ◽

Mutual Information ◽

Single Cell ◽

Learning Algorithm ◽

Cellular Systems ◽

Supplementary Information ◽

Omics Data ◽

Learning Approaches ◽

Rna Seq ◽

Integrate Data

Motivation: Deep learning approaches have empowered single-cell omics data analysis in many ways and generated new insights from complex cellular systems. As there is an increasing need for single-cell omics data to be integrated across sources, types, and features of data, the challenges of integrating single-cell omics data are rising. Here, we present an unsupervised deep learning algorithm that learns discriminative representations for single-cell data via maximizing mutual information, SMILE (Single-cell Mutual Information Learning). Results: Using a unique cell-pairing design, SMILE successfully integrates multi-source single-cell transcriptome data, removing batch effects and projecting similar cell types, even from different tissues, into the shared space. SMILE can also integrate data from two or more modalities, such as joint profiling technologies using single-cell ATAC-seq, RNA-seq, DNA methylation, Hi-C, and ChIP data. When paired cells are known, SMILE can integrate data with unmatched features, such as genes for RNA-seq and genome-wide peaks for ATAC-seq. Integrated representations learned from joint profiling technologies can then be used as a framework for comparing independent single-source data. Supplementary information: Supplementary data are available at Bioinformatics online. The source code of SMILE, including analyses of key results in the study, can be found at: https://github.com/rpmccordlab/SMILE.
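
To make the idea of learning from paired cells by maximizing mutual information more concrete, the sketch below computes an InfoNCE-style contrastive loss with NumPy. Minimizing this loss raises a lower bound on the mutual information between the two views of each cell. This is an illustration only and is not taken from the SMILE code base: the function name `info_nce_loss`, the `temperature` value, and the random toy embeddings are all assumptions, and the loss actually used by SMILE may differ.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style loss for paired embeddings (illustrative sketch).

    z_a, z_b: arrays of shape (n_cells, dim); row i of z_a and row i of z_b
    are assumed to be two views (e.g. two modalities) of the same cell.
    """
    # L2-normalize so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The true partner of cell i sits on the diagonal.
    return float(-np.mean(np.diag(log_softmax)))

# Toy embeddings (random, hypothetical): the second view is a noisy copy of the first.
rng = np.random.default_rng(0)
rna_embed = rng.normal(size=(64, 16))
atac_embed = rna_embed + 0.05 * rng.normal(size=(64, 16))

loss_paired = info_nce_loss(rna_embed, atac_embed)
# Shifting the rows by one breaks every pairing, so the loss should rise.
loss_mispaired = info_nce_loss(rna_embed, np.roll(atac_embed, 1, axis=0))
```

With correctly paired rows the loss is low; breaking the pairing raises it, which is the signal an encoder would be trained on.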

Quantum dot imaging platform for single-cell molecular profiling

Nature Communications ◽

10.1038/ncomms2635 ◽

2013 ◽

Vol 4 (1) ◽

Cited By ~ 157

Author(s):

Pavel Zrazhevskiy ◽

Xiaohu Gao

Keyword(s):

Quantum Dot ◽

Single Cell ◽

Molecular Profiling

Integrating multimodal data sets into a mathematical framework to describe and predict therapeutic resistance in cancer

10.1101/2020.02.11.943738 ◽

2020 ◽

Cited By ~ 1

Author(s):

Kaitlyn Johnson ◽

Grant R. Howard ◽

Daylin Morgan ◽

Eric A. Brenner ◽

Andrea L. Gardner ◽

...

Keyword(s):

Mathematical Models ◽

Single Cell ◽

Treatment Response ◽

Model Parameters ◽

Data Sets ◽

Response Dynamics ◽

Multimodal Data ◽

Transcriptomic Data ◽

Treatment Regimens ◽

New Treatment

Summary: A significant challenge in the field of biomedicine is the development of methods to integrate the multitude of dispersed data sets into comprehensive frameworks to be used to generate optimal clinical decisions. Recent technological advances in single cell analysis allow for high-dimensional molecular characterization of cells and populations, but to date, few mathematical models have attempted to integrate measurements from the single cell scale with other data types. Here, we present a framework that actionizes static outputs from a machine learning model and leverages these as measurements of state variables in a dynamic mechanistic model of treatment response. We apply this framework to breast cancer cells to integrate single cell transcriptomic data with longitudinal population-size data. We demonstrate that the explicit inclusion of the transcriptomic information in the parameter estimation is critical for identification of the model parameters and enables accurate prediction of new treatment regimens. Inclusion of the transcriptomic data improves predictive accuracy in new treatment response dynamics, with a concordance correlation coefficient (CCC) of 0.89 compared to a CCC of 0.79 without integration of the single cell RNA sequencing (scRNA-seq) data directly into the model calibration. To the best of our knowledge, this is the first work that explicitly integrates single cell clonally-resolved transcriptome datasets with longitudinal treatment response data into a mechanistic mathematical model of drug resistance dynamics. We anticipate this approach to be a first step that demonstrates the feasibility of incorporating multimodal data sets into identifiable mathematical models to develop optimized treatment regimens from data.
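
For readers unfamiliar with the concordance correlation coefficient quoted above, the following sketch shows one common way to compute it (Lin's definition, using population variances). The abstract does not say how the authors computed CCC, so the choice of definition, the function name, and the toy numbers are assumptions made for illustration.

```python
import numpy as np

def concordance_correlation_coefficient(observed, predicted):
    """Lin's CCC: 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    x = np.asarray(observed, dtype=float)
    y = np.asarray(predicted, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    # Population (1/n) covariance and variances, as in Lin's definition.
    covariance = np.mean((x - mean_x) * (y - mean_y))
    return float(2.0 * covariance / (x.var() + y.var() + (mean_x - mean_y) ** 2))

# Toy values (made up): a prediction that tracks the data but is offset by 1.
observed = np.array([1.0, 2.0, 3.0, 4.0])
ccc_identical = concordance_correlation_coefficient(observed, observed)
ccc_offset = concordance_correlation_coefficient(observed, observed + 1.0)
```

Unlike Pearson correlation, which would be 1 for both comparisons, CCC penalizes the constant offset, so it measures agreement with the identity line rather than linear association alone.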

Machine learning for single cell genomics data analysis

10.1101/2021.02.04.429763 ◽

2021 ◽

Author(s):

Félix Raimundo ◽

Laetitia Papaxanthos ◽

Céline Vallot ◽

Jean-Philippe Vert

Keyword(s):

Machine Learning ◽

Single Cell ◽

Network Inference ◽

Method Development ◽

Biological Knowledge ◽

Omics Data ◽

Gene Regulatory Network Inference ◽

Multimodal Data ◽

Low Dimensional ◽

Type Classification

Abstract: Single-cell omics technologies produce large quantities of data describing the genomic, transcriptomic or epigenomic profiles of many individual cells in parallel. In order to infer biological knowledge and develop predictive models from these data, machine learning (ML)-based models are increasingly used due to their flexibility, scalability, and impressive success in other fields. In recent years, we have seen a surge of new ML-based method development for low-dimensional representations of single-cell omics data, batch normalization, cell type classification, trajectory inference, gene regulatory network inference or multimodal data integration. To help readers navigate this fast-moving literature, we survey in this review recent advances in ML approaches developed to analyze single-cell omics data, focusing mainly on peer-reviewed publications published in the last two years (2019-2020).

Rheostat Coordination of Latent Kaposi Sarcoma-Associated Herpesvirus RNA expression in Single Cells

Journal of Virology ◽

10.1128/jvi.00032-21 ◽

2021 ◽

Author(s):

Nicole C. Rondeau ◽

JJ L. Miranda

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Single Cells ◽

Kaposi Sarcoma ◽

Rna Expression ◽

Single Cell Rna Sequencing ◽

Rna Levels

We detected precise coordination of RNA levels between two latent genes of the Kaposi sarcoma-associated herpesvirus (KSHV) using single-cell RNA sequencing. LANA and vIL6 are expressed during latency by different promoters on remote regions of the episome.…

Single-cell RNA Expression of SARS-CoV-2 Cell Entry Factors in Human Endometrium during Preconception

10.1101/2020.09.14.296806 ◽

2020 ◽

Author(s):

Felipe Vilella ◽

Wanxin Wang ◽

Inmaculada Moreno ◽

Stephen R. Quake ◽

Carlos Simon

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Viral Entry ◽

Embryo Implantation ◽

Human Endometrium ◽

Cell Entry ◽

Rna Expression ◽

Endometrial Cells ◽

Single Cell Rna Sequencing ◽

Low Efficiency

Abstract: We investigated potential SARS-CoV-2 tropism in human endometrium by single-cell RNA-sequencing of viral entry-associated genes in healthy women. Percentages of endometrial cells expressing ACE2, TMPRSS2, CTSB, or CTSL were <2%, 12%, 80%, and 80%, respectively, with 0.7% of cells expressing all four genes. Our findings imply low efficiency of SARS-CoV-2 infection in the endometrium before embryo implantation, providing information to assess preconception risk in asymptomatic carriers.
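
Percentages like those above are typically derived from a cells-by-genes count matrix. The sketch below shows one plausible calculation, assuming a cell "expresses" a gene when its count is greater than zero. The abstract does not state the threshold the authors used, and the function name and toy matrix are hypothetical and not data from the study.

```python
import numpy as np

def fraction_expressing(counts, gene_names, genes_of_interest):
    """Per-gene and joint fractions of cells with a nonzero count.

    counts: array of shape (n_cells, n_genes); gene_names: list of column labels.
    """
    counts = np.asarray(counts)
    columns = [gene_names.index(gene) for gene in genes_of_interest]
    # Assumption: any count above zero is treated as "expressed".
    expressed = counts[:, columns] > 0
    per_gene = {
        gene: float(frac)
        for gene, frac in zip(genes_of_interest, expressed.mean(axis=0))
    }
    # Fraction of cells in which every gene of interest is detected.
    all_genes = float(expressed.all(axis=1).mean())
    return per_gene, all_genes

# Toy matrix (made up for illustration): 5 cells x 4 genes.
gene_names = ["ACE2", "TMPRSS2", "CTSB", "CTSL"]
toy_counts = np.array([
    [0, 0, 3, 1],
    [0, 2, 5, 4],
    [1, 1, 2, 2],
    [0, 0, 0, 6],
    [0, 0, 7, 0],
])
per_gene, all_four = fraction_expressing(toy_counts, gene_names, gene_names)
```

The joint fraction can never exceed the smallest per-gene fraction, which is why a rarely detected gene such as ACE2 caps the share of cells expressing all four entry factors.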

scFlow: A Scalable and Reproducible Analysis Pipeline for Single-Cell RNA Sequencing Data

10.1101/2021.08.16.456499 ◽

2021 ◽

Author(s):

Combiz Khozoie ◽

Nurun Fancy ◽

Mahdi Moradi Marjaneh ◽

Alan E. Murphy ◽

Paul M. Matthews ◽

...

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Sparse Matrix ◽

Data Generation ◽

Sequencing Data ◽

Alternative Analysis ◽

Analysis Pipeline ◽

Matrix Quality ◽

Single Cell Rna Sequencing ◽

Public Datasets

Advances in single-cell RNA-sequencing technology over the last decade have enabled exponential increases in throughput: datasets with over a million cells are becoming commonplace. The burgeoning scale of data generation, combined with the proliferation of alternative analysis methods, led us to develop the scFlow toolkit and the nf-core/scflow pipeline for reproducible, efficient, and scalable analyses of single-cell and single-nuclei RNA-sequencing data. The scFlow toolkit provides a higher level of abstraction on top of popular single-cell packages within an R ecosystem, while the nf-core/scflow Nextflow pipeline is built within the nf-core framework to enable compute infrastructure-independent deployment across all institutions and research facilities. Here we present our flexible pipeline, which leverages the advantages of containerization and the potential of Cloud computing for easy orchestration and scaling of the analysis of large case/control datasets, even by non-expert users. We demonstrate the functionality of the analysis pipeline, from sparse-matrix quality control through to insight discovery, with example analyses of four recently published public datasets, and describe the extensibility of scFlow as a modular, open-source tool for single-cell and single-nuclei bioinformatic analyses.
