Deconvolution of bulk blood eQTL effects into immune cell subpopulations

Abstract. Background: Expression quantitative trait loci (eQTL) studies are used to interpret the function of disease-associated genetic risk factors. To date, most eQTL analyses have been conducted in bulk tissues, such as whole blood and tissue biopsies, which are likely to mask the cell type context of the eQTL regulatory effects. Although this context can be investigated by generating transcriptional profiles from purified cell subpopulations, the current methods are labor-intensive and expensive. Here we introduce a new method, Decon2, a framework for estimating cell proportions using expression profiles from bulk blood samples (Decon-cell) followed by deconvolution of cell type eQTLs (Decon-eQTL). Results: The estimated cell proportions from Decon-cell agree with experimental measurements across cohorts (R ≥ 0.77). Using Decon-cell we can predict the proportions of 34 circulating cell types for 3,194 samples from a population-based cohort. Next we identified 16,362 whole blood eQTLs and deconvoluted cell type interaction (CTi) eQTLs using the predicted cell proportions from Decon-cell. CTi eQTLs show excellent allelic directional concordance with those of eQTL (≥96%-100%) and chromatin mark QTL (≥87%-92%) studies that used either purified cell subpopulations or single-cell RNA-seq, outperforming the conventional interaction effect. Conclusions: Decon2 provides a method to detect cell type interaction effects from bulk blood eQTLs, which is useful in pinpointing the most relevant cell type for a certain complex disease. Decon2 is available as an R package and Java application (https://github.com/molgenis/systemsgenetics/tree/master/Decon2), and as a web tool (www.molgenis.org/deconvolution).
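
As a rough illustration of the "conventional interaction effect" that the abstract uses as its comparison point, the sketch below fits expression = b0 + b1·genotype + b2·proportion + b3·genotype×proportion by ordinary least squares. This is not the Decon-eQTL model itself: the function names and the toy data are hypothetical, and a real analysis would use a statistics library and test b3 for significance.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_interaction_model(genotypes, proportions, expression):
    """Return [b0, b1, b2, b3] for y = b0 + b1*g + b2*p + b3*g*p (least squares)."""
    rows = [[1.0, g, p, g * p] for g, p in zip(genotypes, proportions)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, expression)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical toy data: genotype dosage (0/1/2) and one cell type's proportion.
genotypes = [0, 1, 2, 0, 1, 2, 0, 1]
proportions = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.35, 0.45]
expression = [1 + 0.5 * g + 2 * p + 3 * g * p for g, p in zip(genotypes, proportions)]

coefficients = fit_interaction_model(genotypes, proportions, expression)
print(coefficients)  # the last entry is the genotype-by-proportion interaction term
```

A non-zero interaction coefficient suggests that the genotype effect on expression changes with the abundance of that cell type, which is the signal an interaction analysis looks for in bulk data.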

Deconvolution of bulk blood eQTL effects into immune cell subpopulations

10.1101/548669 ◽ 2019 ◽ Cited By ~ 3

Author(s): R. Aguirre-Gamboa ◽ N. de Klein ◽ J. di Tommaso ◽ A. Claringbould ◽ U. Võsa ◽ ...

Keyword(s): Whole Blood ◽ Complex Disease ◽ Immune Cell ◽ Expression Profiles ◽ Cell Types ◽ Population Based ◽ New Method ◽ Cell Type ◽ Link Type ◽ Cell Subpopulations

Abstract: Expression quantitative trait loci (eQTL) studies are used to interpret the function of disease-associated genetic risk factors. To date, most eQTL analyses have been conducted in bulk tissues, such as whole blood and tissue biopsies, which are likely to mask the cell type context of the eQTL regulatory effects. Although this context can be investigated by generating transcriptional profiles from purified cell subpopulations, the current methods are labor-intensive and expensive. Here we introduce a new method, Decon2, a statistical framework for estimating cell proportions using expression profiles from bulk blood samples (Decon-cell) and consecutive deconvolution of cell type eQTLs (Decon-eQTL). The estimated cell proportions from Decon-cell agree with experimental measurements across cohorts (R ≥ 0.77). Using Decon-cell we can predict the proportions of 34 circulating cell types for 3,194 samples from a population-based cohort. Next we identified 16,362 whole blood eQTLs and assigned them to a cell type with Decon-eQTL using the predicted cell proportions from Decon-cell. Deconvoluted eQTLs show excellent allelic directional concordance with those of eQTL (≥96%) and chromatin mark QTL (≥87%) studies that used either purified cell subpopulations or single-cell RNA-seq. Our new method provides a way to assign cell type effects to eQTLs from bulk blood, which is useful in pinpointing the most relevant cell type for a certain complex disease. Decon2 is available as an R package and Java application (https://github.com/molgenis/systemsgenetics/tree/master/Decon2), and as a web tool (www.molgenis.org/deconvolution).

Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data

eLife ◽ 10.7554/elife.26476 ◽ 2017 ◽ Vol 6 ◽ Cited By ~ 107

Author(s): Julien Racle ◽ Kaat de Jonge ◽ Petra Baumgaertner ◽ Daniel E Speiser ◽ David Gfeller

Keyword(s): Gene Expression ◽ Gene Expression Data ◽ Immune Cell ◽ Expression Profiles ◽ Cell Types ◽ Response To Therapy ◽ Expression Data ◽ Cell Type ◽ Tumor Gene Expression ◽ Tumor Gene

Immune cells infiltrating tumors can have an important impact on tumor progression and response to therapy. We present an efficient algorithm to simultaneously estimate the fraction of cancer and immune cell types from bulk tumor gene expression data. Our method integrates novel gene expression profiles from each major non-malignant cell type found in tumors, renormalization based on cell-type-specific mRNA content, and the ability to consider uncharacterized and possibly highly variable cell types. Feasibility is demonstrated by validation with flow cytometry, immunohistochemistry and single-cell RNA-Seq analyses of human melanoma and colorectal tumor specimens. Altogether, our work not only improves accuracy but also broadens the scope of absolute cell fraction predictions from tumor gene expression data, and provides a unique novel experimental benchmark for immunogenomics analyses in cancer research (http://epic.gfellerlab.org).
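
The "renormalization based on cell-type-specific mRNA content" step can be pictured as follows: a deconvolution of expression data yields the share of mRNA contributed by each cell type, and dividing by the mRNA content per cell before rescaling gives cell fractions. The sketch below is a minimal illustration under that assumption, not the published implementation; the function name and all numeric values are hypothetical.

```python
def mrna_to_cell_fractions(mrna_proportions, mrna_per_cell):
    """Convert mRNA proportions to cell fractions.

    mrna_proportions: share of total mRNA attributed to each cell type.
    mrna_per_cell: relative mRNA content of one cell of each type
                   (hypothetical values in the example below).
    """
    scaled = {
        cell_type: mrna_proportions[cell_type] / mrna_per_cell[cell_type]
        for cell_type in mrna_proportions
    }
    total = sum(scaled.values())
    return {cell_type: value / total for cell_type, value in scaled.items()}


# Hypothetical example: cancer cells carry twice as much mRNA as T cells,
# so an equal mRNA share corresponds to fewer cancer cells than T cells.
fractions = mrna_to_cell_fractions(
    {"T cell": 0.5, "cancer": 0.5},
    {"T cell": 1.0, "cancer": 2.0},
)
print(fractions)
```

Without this correction, cell types with high mRNA content per cell would be overestimated relative to cell types with low mRNA content.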

Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data

10.1101/117788 ◽ 2017 ◽ Cited By ~ 2

Author(s): Julien Racle ◽ Kaat de Jonge ◽ Petra Baumgaertner ◽ Daniel E. Speiser ◽ David Gfeller

Keyword(s): Gene Expression ◽ Gene Expression Data ◽ Immune Cell ◽ Expression Profiles ◽ Cell Types ◽ Response To Therapy ◽ Expression Data ◽ Cell Type ◽ Tumor Gene Expression ◽ Tumor Gene

Abstract: Immune cells infiltrating tumors can have an important impact on tumor progression and response to therapy. We present an efficient algorithm to simultaneously estimate the fraction of cancer and immune cell types from bulk tumor gene expression data. Our method integrates novel gene expression profiles from each major non-malignant cell type found in tumors, renormalization based on cell-type-specific mRNA content, and the ability to consider uncharacterized and possibly highly variable cell types. Feasibility is demonstrated by validation with flow cytometry, immunohistochemistry and single-cell RNA-Seq analyses of human melanoma and colorectal tumor specimens. Altogether, our work not only improves accuracy but also broadens the scope of absolute cell fraction predictions from tumor gene expression data, and provides a unique novel experimental benchmark for immunogenomics analyses in cancer research.

Comprehensive evaluation of transcriptome-based cell-type quantification methods for immuno-oncology

Bioinformatics ◽ 10.1093/bioinformatics/btz363 ◽ 2019 ◽ Vol 35 (14) ◽ pp. i436-i445 ◽ Cited By ~ 71

Author(s): Gregor Sturm ◽ Francesca Finotello ◽ Florent Petitprez ◽ Jitao David Zhang ◽ Jan Baumbach ◽ ...

Keyword(s): Single Cell ◽ Computational Methods ◽ Immune Cell ◽ Comprehensive Evaluation ◽ Cell Types ◽ R Package ◽ Supplementary Information ◽ Rna Seq ◽ Cell Type ◽ Real World Datasets

Abstract. Motivation: The composition and density of immune cells in the tumor microenvironment (TME) profoundly influence tumor progression and success of anti-cancer therapies. Flow cytometry, immunohistochemistry staining or single-cell sequencing are often unavailable such that we rely on computational methods to estimate the immune-cell composition from bulk RNA-sequencing (RNA-seq) data. Various methods have been proposed recently, yet their capabilities and limitations have not been evaluated systematically. A general guideline leading the research community through cell type deconvolution is missing. Results: We developed a systematic approach for benchmarking such computational methods and assessed the accuracy of tools at estimating nine different immune and stromal cell types from bulk RNA-seq samples. We used a single-cell RNA-seq dataset of ∼11 000 cells from the TME to simulate bulk samples of known cell type proportions, and validated the results using independent, publicly available gold-standard estimates. This allowed us to analyze and condense the results of more than a hundred thousand predictions to provide an exhaustive evaluation across seven computational methods over nine cell types and ∼1800 samples from five simulated and real-world datasets. We demonstrate that computational deconvolution performs at high accuracy for well-defined cell-type signatures and propose how fuzzy cell-type signatures can be improved. We suggest that future efforts should be dedicated to refining cell population definitions and finding reliable signatures. Availability and implementation: A snakemake pipeline to reproduce the benchmark is available at https://github.com/grst/immune_deconvolution_benchmark. An R package allows the community to perform integrated deconvolution using different methods (https://grst.github.io/immunedeconv). Supplementary information: Supplementary data are available at Bioinformatics online.
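
The benchmarking idea described above (simulate bulk samples with known proportions, then compare estimated and true fractions) can be sketched in a few lines. The example below is only an illustration: it mixes per-cell-type average profiles rather than sampling individual cells, and the function names, profiles, and numbers are hypothetical.

```python
import math


def simulate_bulk(profiles, fractions):
    """Mix cell-type expression profiles into one pseudo-bulk sample.

    profiles: cell type -> list of expression values (same gene order).
    fractions: cell type -> known proportion used for the mixture.
    """
    n_genes = len(next(iter(profiles.values())))
    return [
        sum(fractions[ct] * profiles[ct][gene] for ct in fractions)
        for gene in range(n_genes)
    ]


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical three-gene profiles for two cell types.
profiles = {"B cell": [10.0, 0.0, 5.0], "T cell": [0.0, 8.0, 5.0]}
bulk = simulate_bulk(profiles, {"B cell": 0.25, "T cell": 0.75})
print(bulk)

# Agreement between true and (hypothetical) estimated fractions across samples.
true_fractions = [0.1, 0.2, 0.7]
estimated_fractions = [0.15, 0.25, 0.6]
print(pearson(true_fractions, estimated_fractions))
```

Because the mixing proportions are chosen by the analyst, any deconvolution method run on `bulk` can be scored directly against the known truth.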

clustifyr: An R package for automated single-cell RNA sequencing cluster classification

10.1101/855064 ◽ 2019 ◽ Cited By ~ 1

Author(s): Rui Fu ◽ Austin E. Gillen ◽ Ryan M. Sheridan ◽ Chengzhe Tian ◽ Michelle Daya ◽ ...

Keyword(s): Single Cell ◽ Rna Sequencing ◽ Reference Data ◽ Expression Profiles ◽ Gene List ◽ Cell Types ◽ R Package ◽ Cell Type ◽ Type Assignment ◽ Single Cell Rna Sequencing

Abstract. Background: In single-cell RNA sequencing (scRNA-seq) analysis, assignment of likely cell types remains a time-consuming, error-prone, and biased process. Current packages for identity assignment use limited types of reference data, and often have rigid data structure requirements. As such, a more flexible tool, capable of handling multiple types of reference data and data structures, would be beneficial. Findings: To address difficulties in cluster identity assignment, we developed the clustifyr R package. The package leverages external datasets, including gene expression profiles from scRNA-seq, bulk RNA-seq, microarray expression data, and/or signature gene lists, to assign likely cell types. We benchmark various parameters of a correlation-based approach, and also implement a variety of gene list enrichment methods. By providing tools for exploratory data analysis, we demonstrate the feasibility of a simple and effective data-driven approach for cell type assignment in scRNA-seq cell clusters. Conclusions: clustifyr is a lightweight and effective cell type assignment tool developed for compatibility with various scRNA-seq analysis workflows. clustifyr is publicly available at https://github.com/rnabioco/clustifyr
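
A correlation-based assignment of the kind the abstract mentions can be sketched as: compare each cluster's average expression profile with every reference profile and pick the best-correlated cell type, leaving the cluster unassigned when no correlation clears a cutoff. The Python below is a generic illustration, not the clustifyr API; the function names, the choice of Pearson correlation, the 0.5 cutoff, and the toy profiles are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def assign_cluster(cluster_profile, references, min_correlation=0.5):
    """Return the reference cell type best correlated with the cluster.

    min_correlation is a hypothetical cutoff; clusters whose best
    correlation does not exceed it are labeled "unassigned".
    """
    best_type, best_r = "unassigned", min_correlation
    for cell_type, reference_profile in references.items():
        r = pearson(cluster_profile, reference_profile)
        if r > best_r:
            best_type, best_r = cell_type, r
    return best_type


# Hypothetical four-gene reference profiles and cluster averages.
references = {"B cell": [9.0, 1.0, 1.0, 2.0], "T cell": [1.0, 8.0, 7.0, 2.0]}
print(assign_cluster([8.0, 2.0, 1.0, 3.0], references))
print(assign_cluster([5.0, 5.0, 5.0, 6.0], references))
```

Keeping an explicit "unassigned" outcome avoids forcing a label onto clusters that resemble none of the references.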

484 Bioturing browser: interactively explore public single cell sequencing data

Journal for ImmunoTherapy of Cancer ◽ 10.1136/jitc-2020-sitc2020.0484 ◽ 2020 ◽ Vol 8 (Suppl 3) ◽ pp. A520-A520

Author(s): Son Pham ◽ Tri Le ◽ Tan Phan ◽ Minh Pham ◽ Huy Nguyen ◽ ...

Keyword(s): Single Cell ◽ Immune Cell ◽ Expression Profiles ◽ Meta Analysis ◽ Cell Types ◽ Sequencing Data ◽ Single Cell Sequencing ◽ Data Formats ◽ Cancer Types ◽ Cell Data

Background: Single-cell sequencing technology has opened an unprecedented ability to interrogate cancer. It reveals significant insights into intratumoral heterogeneity, metastasis, and therapeutic resistance, which facilitates target discovery and validation in cancer treatment. With rapid advancements in throughput and strategies, a particular immuno-oncology study can produce multi-omics profiles for several thousands of individual cells. This overflow of single-cell data poses formidable challenges, including standardizing data formats across studies, performing reanalysis for individual datasets and meta-analysis. Methods: N/A. Results: We present BioTuring Browser, an interactive platform for accessing and reanalyzing published single-cell omics data. The platform is currently hosting a curated database of more than 10 million cells from 247 projects, covering more than 120 immune cell types and subtypes, and 15 different cancer types. All data are processed and annotated with standardized labels of cell types, diseases, therapeutic responses, etc. to be instantly accessed and explored in a uniform visualization and analytics interface. Based on this massive curated database, BioTuring Browser supports searching similar expression profiles, querying a target across datasets and automatic cell type annotation. The platform supports single-cell RNA-seq, CITE-seq and TCR-seq data. BioTuring Browser is now available for download at www.bioturing.com. Conclusions: N/A.

Jointly leveraging spatial transcriptomics and deep learning models for pathology image annotation improves cell type identification over either approach alone.

10.1101/2021.11.10.468082 ◽ 2021

Author(s): Asif Zubair ◽ Richard H. Chapple ◽ Sivaraman Natarajan ◽ William C. Wright ◽ Min Pan ◽ ...

Keyword(s): Immune Cell ◽ Image Annotation ◽ Cell Types ◽ Tissue Cell ◽ Cell Type ◽ Spatially Resolved ◽ Transcriptomics Data ◽ Diagnostic Applications ◽ The Many ◽ Level Performance

The disorganization of cell types within tissues underlies many human diseases and has been studied for over a century using the conventional tools of pathology, including tissue-marking dyes such as the H&E stain. Recently, spatial transcriptomics technologies were developed that can measure spatially resolved gene expression directly in pathology-stained tissue sections, revealing cell types and their dysfunction in unprecedented detail. In parallel, artificial intelligence (AI) has approached pathologist-level performance in computationally annotating H&E images of tissue sections. However, spatial transcriptomics technologies are limited in their ability to separate transcriptionally similar cell types, and AI-based pathology has performed less impressively outside its training datasets. Here, we describe a methodology that can computationally integrate AI-annotated pathology images with spatial transcriptomics data to markedly improve inferences of tissue cell type composition made over either class of data alone. We show that this methodology can identify regions of clinically relevant tumor immune cell infiltration, which is predictive of response to immunotherapy and was missed by an initial pathologist's manual annotation. Thus, combining spatial transcriptomics and AI-based image annotation has the potential to exceed pathologist-level performance in clinical diagnostic applications and to improve the many applications of spatial transcriptomics that rely on accurate cell type annotations.

Genomic Architecture of Cells in Tissues (GeACT): Study of Human Mid-gestation Fetus

10.1101/2020.04.12.038000 ◽ 2020

Author(s): Feng Tian ◽ Fan Zhou ◽ Xiang Li ◽ Wenping Ma ◽ Honggui Wu ◽ ...

Keyword(s): Transcription Factors ◽ Single Cell ◽ Human Cell ◽ Expression Profiles ◽ Single Cells ◽ Cell Types ◽ List Type ◽ Cell Type ◽ Genomic Architecture ◽ Gene Modules

Summary: By circumventing cellular heterogeneity, single cell omics have now been widely utilized for cell typing in human tissues, culminating with the undertaking of the human cell atlas aimed at characterizing all human cell types. However, more important is the probing of gene regulatory networks, underlying chromatin architecture and critical transcription factors for each cell type. Here we report the Genomic Architecture of Cells in Tissues (GeACT), a comprehensive genomic database that collectively addresses the above needs with the goal of understanding the functional genome in action. GeACT was made possible by our novel single-cell RNA-seq (MALBAC-DT) and ATAC-seq (METATAC) methods of high detectability and precision. We exemplified GeACT by first studying representative organs in the human mid-gestation fetus. In particular, correlated gene modules (CGMs) are observed and found to be cell-type-dependent. We linked gene expression profiles to the underlying chromatin states, and found the key transcription factors for representative CGMs.

Highlights: Genomic Architecture of Cells in Tissues (GeACT) data for human mid-gestation fetus; determining correlated gene modules (CGMs) in different cell types by MALBAC-DT; measuring chromatin open regions in single cells with high detectability by METATAC; integrating transcriptomics and chromatin accessibility to reveal key TFs for a CGM.

MIXTURE: an improved algorithm for immune tumor microenvironment estimation based on gene expression data

10.1101/726562 ◽ 2019 ◽ Cited By ~ 3

Author(s): Elmer A. Fernández ◽ Yamil D. Mahmoud ◽ Florencia Veigas ◽ Darío Rocha ◽ Mónica Balzarini ◽ ...

Keyword(s): Tumor Microenvironment ◽ Immune Cell ◽ Therapy Response ◽ Cell Types ◽ Gene Signature ◽ Response To Therapy ◽ Support Vector ◽ Data Sets ◽ Cell Type ◽ Before And After

Abstract: RNA sequencing has proved to be an efficient high-throughput technique to robustly characterize the presence and quantity of RNA in tumor biopsies at a given time. Importantly, it can be used to computationally estimate the composition of the tumor immune infiltrate and to infer the immunological phenotypes of those cells. Given the significant impact of anti-cancer immunotherapies and the role of the associated immune tumor microenvironment (ITME) on its prognosis and therapy response, the estimation of the immune cell-type content in the tumor is crucial for designing effective strategies to understand and treat cancer. Current digital estimation of the ITME cell mixture content can be performed using different analytical tools. However, current methods tend to over-estimate the number of cell types present in the sample, thus under-estimating true proportions and biasing the results. We developed MIXTURE, a noise-constrained recursive feature selection for support vector regression that overcomes such limitations. MIXTURE deconvolutes cell-type proportions of bulk tumor samples for both RNA microarray and RNA-Seq platforms from a leukocyte validated gene signature. We evaluated MIXTURE over simulated and benchmark data sets. It outperforms competing methods in terms of accuracy on the true number of present cell types and proportion estimates, with increased robustness to estimation bias. It also shows superior robustness to collinearity problems. Finally, we investigated the human immune microenvironment of breast cancer, head and neck squamous cell carcinoma, and melanoma biopsies before and after anti-PD-1 immunotherapy treatment, revealing associations with response to therapy which have not been seen by previous methods.

Saliva cell type DNA methylation reference panel for epidemiology studies in children

10.1101/2020.09.14.20191361 ◽ 2020

Author(s): Lauren Y M Middleton ◽ John F Dou ◽ Jonah Fisher ◽ Jonathan A Heiss ◽ Vy Nguyen ◽ ...

Keyword(s): Dna Methylation ◽ Epithelial Cells ◽ Immune Cell ◽ R Package ◽ Magnetic Bead ◽ Reference Panel ◽ Size Exclusion ◽ Cell Type ◽ Whole Saliva ◽ Epidemiology Studies

Saliva is a widely used biological sample, especially in pediatric research, containing a heterogeneous mixture of immune and epithelial cells. Associations of exposure or disease with saliva DNA methylation can be influenced by cell-type proportions. Here, we developed a saliva cell-type DNA methylation reference panel to estimate interindividual cell-type heterogeneity in whole saliva studies. Saliva was collected from 22 children (7-16 years) and sorted into immune and epithelial cells, using size exclusion filtration and magnetic bead sorting. DNA methylation was measured using the Illumina MethylationEPIC BeadChip. We assessed cell-type differences in DNA methylation profiles and tested for enriched biological pathways. Immune and epithelial cells differed at 164,793 (20.7%) DNA methylation sites (t-test p < 10^-8). Immune cell hypomethylated sites mapped to genes enriched for immune pathways (p < 3.2 × 10^-5). Epithelial cell hypomethylated sites were enriched for cornification (p = 5.2 × 10^-4), a key process for hard palate formation. Saliva immune and epithelial cells have distinct DNA methylation profiles which can drive whole saliva DNA methylation measures. A primary saliva DNA methylation reference panel, easily implemented with an R package, will allow estimates of cell proportions from whole saliva samples and improve epigenetic epidemiology studies by accounting for measurement heterogeneity by cell-type proportions.
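
To make the idea of estimating cell proportions from a two-cell-type reference panel concrete, the sketch below assumes that whole-saliva methylation at each site is a mixture p·immune + (1 − p)·epithelial and solves for p by least squares. This is a simplified illustration, not the method of the R package mentioned above; the function name and the beta values are hypothetical.

```python
def estimate_immune_fraction(bulk, immune_ref, epithelial_ref):
    """Least-squares estimate of the immune cell fraction p.

    Assumes bulk[i] = p * immune_ref[i] + (1 - p) * epithelial_ref[i],
    so bulk - epithelial = p * (immune - epithelial) at every site.
    The estimate is clipped to the valid range [0, 1].
    """
    numerator = sum(
        (b - e) * (i - e) for b, i, e in zip(bulk, immune_ref, epithelial_ref)
    )
    denominator = sum((i - e) ** 2 for i, e in zip(immune_ref, epithelial_ref))
    if denominator == 0:
        raise ValueError("reference profiles are identical at all sites")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical methylation beta values at three CpG sites.
immune_ref = [0.1, 0.9, 0.5]
epithelial_ref = [0.8, 0.2, 0.5]
true_p = 0.3
bulk = [true_p * i + (1 - true_p) * e for i, e in zip(immune_ref, epithelial_ref)]

print(estimate_immune_fraction(bulk, immune_ref, epithelial_ref))
```

Only sites where the two reference profiles differ contribute to the estimate, which is why reference panels are typically built from sites with large between-cell-type differences.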
