Benchmarking supervised signature-scoring methods for single-cell RNA sequencing data in cancer

2021 ◽  
Author(s):  
Siyuan Zheng ◽  
Noureen Nighat ◽  
Zhenqing Ye ◽  
Yidong Chen ◽  
Xiaojing Wang

Quantifying the activity of gene expression signatures is common in analyses of single-cell RNA sequencing data. Methods originally developed for bulk samples are often used for this purpose without accounting for contextual differences between bulk and single-cell data. More broadly, these methods have not been benchmarked. Here we benchmark four such supervised methods: single sample gene set enrichment analysis (ssGSEA), AUCell, Single Cell Signature Explorer (SCSE), and a new method we developed, Jointly Assessing Signature Mean and Inferring Enrichment (JASMINE). Using cancer as an example, we show that cancer cells consistently express more genes than normal cells. This imbalance biases the performance of the bulk-sample-based ssGSEA in gold standard tests and downsampling experiments. In contrast, single-cell-based methods are less susceptible. Our results suggest that caution should be exercised when using bulk-sample-based methods in single-cell data analyses, and that cellular contexts should be taken into consideration when designing benchmarking strategies.
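To make the idea of per-cell signature scoring concrete, the following is a minimal sketch of a generic rank-based score. It is not an implementation of ssGSEA, AUCell, SCSE, or JASMINE; the function name `mean_rank_score` and the choice to rank only detected (non-zero) genes and divide by the number of detected genes are assumptions made for illustration, chosen because the abstract highlights that cells differ in how many genes they express.

```python
import numpy as np

def mean_rank_score(expr, signature_idx):
    """Toy per-cell signature score (hypothetical helper, not a published method).

    expr: (cells, genes) array of non-negative expression values.
    signature_idx: list of column indices belonging to the signature.
    Returns one score per cell in [0, 1].
    """
    expr = np.asarray(expr, dtype=float)
    n_cells, n_genes = expr.shape
    scores = np.zeros(n_cells)
    for c in range(n_cells):
        row = expr[c]
        detected = row > 0
        n_detected = int(detected.sum())
        if n_detected == 0:
            continue  # no detected genes: score stays 0
        # Rank detected genes only (1 = lowest expressed); undetected genes keep rank 0.
        order = np.argsort(row[detected], kind="stable")
        detected_ranks = np.empty(n_detected)
        detected_ranks[order] = np.arange(1, n_detected + 1)
        ranks = np.zeros(n_genes)
        ranks[detected] = detected_ranks
        # Average signature rank, scaled by the number of detected genes in this cell.
        scores[c] = ranks[signature_idx].mean() / n_detected
    return scores

expr = np.array([[0.0, 1.0, 2.0, 3.0],
                 [0.0, 0.0, 0.0, 0.0]])
scores = mean_rank_score(expr, [2, 3])
```

Scaling by the number of detected genes is one simple way to keep scores comparable between cells with very different detection rates; other normalizations are equally plausible.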

2021 ◽  
pp. 1-10
Author(s):  
Min Wu ◽  
Junhua Xu ◽  
Shanshan Zhu ◽  
Jinzhi Lei ◽  
Jie Gao

Analysis of single-cell RNA sequencing (scRNA-seq) data is often complicated by sparsity and high dimensionality. In this work, we propose a fuzzy c-means (FCM) based linear stable-exponential distribution (LSED) model for analyzing scRNA-seq count data from chronic myeloid leukemia (CML) patients. The analysis pipeline has several stages: noisy and inconsistent sequencing data are removed during preprocessing, the processed data are grouped into clusters of gene features using FCM clustering, and relevant features are extracted in a feature extraction step. The extracted features are then fed into the LSED model to model the difference data of gene expression. Finally, we evaluate the proposed model through parameter estimation, distribution comparison and parameter analysis. The results show that the parameters of the proposed model reflect changes in patient condition more effectively, and that the model fits the difference data of gene expression better than the Cauchy and stable distributions. Additionally, gene-set enrichment analysis indicates that the proposed model can replicate the distinct enhancement of BCR-ABL+ stem cells as well as BCR-ABL- stem cells. Notably, the proposed FCM-based LSED model studies CML from the perspective of statistical models, which offers new insight for CML research.
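The clustering stage mentioned above can be illustrated with the textbook fuzzy c-means update rules. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count, and the random initialization are all illustrative choices, and the LSED model itself is not shown.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Textbook fuzzy c-means (illustrative sketch).

    X: (samples, features) array.
    Returns (U, centers): U[i, j] is the membership of sample i in cluster j.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
U, centers = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)
```

Unlike hard clustering, each sample keeps a graded membership in every cluster; taking the `argmax` is only needed when a single label per sample is required.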


Author(s):  
Zilong Zhang ◽  
Feifei Cui ◽  
Chunyu Wang ◽  
Lingling Zhao ◽  
Quan Zou

Abstract Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at the cellular level. However, due to the extremely low levels of transcripts in a single cell and technical losses during reverse transcription, gene expression at a single-cell resolution is usually noisy and highly dimensional; thus, statistical analyses of single-cell data are a challenge. Although many scRNA-seq data analysis tools are currently available, a gold standard pipeline is not available for all datasets. Therefore, a general understanding of bioinformatics and associated computational issues would facilitate the selection of appropriate tools for a given set of data. In this review, we provide an overview of the goals and most popular computational analysis tools for the quality control, normalization, imputation, feature selection and dimension reduction of scRNA-seq data.
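Three of the steps named in this review (quality control, normalization, and feature selection) can be sketched in a few lines of NumPy. This is a simplified illustration rather than any specific tool's method: the function names, the `min_genes` threshold, the `target_sum` of 10,000, and the variance-based gene ranking are all assumptions chosen for the example.

```python
import numpy as np

def filter_cells(counts, min_genes=200):
    """Quality control: keep cells with at least `min_genes` detected genes."""
    counts = np.asarray(counts)
    keep = (counts > 0).sum(axis=1) >= min_genes
    return counts[keep], keep

def normalize_log1p(counts, target_sum=1e4):
    """Normalization: scale each cell to `target_sum` total counts, then log1p."""
    counts = np.asarray(counts, dtype=float)
    lib = counts.sum(axis=1, keepdims=True)
    lib[lib == 0] = 1.0  # guard against empty cells
    return np.log1p(counts / lib * target_sum)

def top_variable_genes(norm, n_top):
    """Feature selection: indices of the `n_top` genes with highest variance."""
    var = np.asarray(norm, dtype=float).var(axis=0)
    return np.argsort(var)[::-1][:n_top]

counts = np.array([[1, 1, 0],
                   [2, 2, 0],
                   [0, 0, 0]])
kept, keep = filter_cells(counts, min_genes=2)
norm = normalize_log1p(kept)
selected = top_variable_genes(np.array([[0.0, 1.0, 5.0],
                                        [0.0, 2.0, 0.0]]), 2)
```

In the toy data the first two cells differ only in sequencing depth, so they become identical after library-size normalization, which is the purpose of that step.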


2021 ◽  
Vol 23 (Supplement_6) ◽  
pp. vi99-vi99
Author(s):  
Thomas Lai ◽  
Janet Treger ◽  
Jingyou Rao ◽  
Robert Prins ◽  
Richard Everson

Abstract INTRODUCTION The immunotherapeutic targeting of New York-esophageal squamous cell carcinoma (NY-ESO-1) and other cancer/testis antigens (CTA) is an appealing strategy for the treatment of malignant gliomas because CTA are not expressed in most normal adult tissues and their expression can be induced in tumors for targeting by T-cells. Basally, NY-ESO-1 is often poorly expressed in glioblastoma (GBM), presumably through promoter methylation. Our previous data have shown that the hypomethylating agent decitabine (DAC) is a strong sensitizer of GBM to NY-ESO-1-specific adoptive T-cell induced cytotoxicity. However, we hypothesized that DAC alters expression of other immuno-modulatory genes in addition to NY-ESO-1 that may also increase T-cell mediated killing in GBM. The extent to which DAC demethylation regulates other immuno-modulatory genes has not yet been thoroughly investigated. METHODS We performed single-cell RNA sequencing on DAC-treated and non-DAC-treated glioma cells. We confirmed that DAC treatment induced NY-ESO-1 expression with quantitative real-time PCR and demethylation using bisulfite sequencing. We analyzed our single-cell RNA sequencing data with Seurat and our customized R-based pipelines to identify coordinate differential expression of immuno-modulatory genes and their tumor subpopulations. RESULTS Using our single-cell data, we identified tumor subpopulations with coordinate differentially expressed immuno-modulatory genes including cancer/testis antigens, antigen presentation proteins, and apoptosis regulators. For these candidate genes, we validated expression with qPCR and promoter demethylation with bisulfite sequencing and TA bisulfite cloning. CONCLUSION Exposure of glioma cells to DAC results in promoter demethylation of NY-ESO-1, with increased expression of CTA and other immunomodulatory genes. DAC treatment may therefore sensitize GBM to the immunotherapeutic targeting of these antigens and reveal a feasible strategy for clinical translation.


2021 ◽  
Vol 9 (Suppl 1) ◽  
pp. A10.1-A10
Author(s):  
J Lammers ◽  
F Calkoen ◽  
M Kranendonk ◽  
A Federico ◽  
M Kool ◽  
...  

Background Ependymoma is the third most common brain tumor in children. At the moment, surgery and radiotherapy are the only effective treatments that can be offered, and despite this, a significant part of the patients relapse with no therapeutic salvage options. Therefore, new treatment modalities are needed. To develop immunotherapies for these children, knowledge of the tumor microenvironment is crucial. The current study aims to unravel the tumor immune microenvironment (TIME) of pediatric posterior fossa A (PFA) ependymomas. Materials and Methods We used bulk RNA sequencing data of 22 pediatric ependymomas. We defined two groups, hereafter called PFA immune+ (PFAI+) and PFAI-, based on the RNA expression levels of the NanoString panel of Human PanCancer Immune Profiling genes. We performed gene set enrichment analysis and deconvoluted the bulk RNA samples with ependymoma-specific single-cell RNA sequencing datasets. To validate our findings on a protein level, we applied immunohistochemistry with antibodies recognizing tumor-infiltrating lymphocytes, tumor-associated macrophages and microglia. Results Unsupervised hierarchical clustering of RNA expression of immune-related genes revealed two distinct PFA groups. Differential gene expression analysis showed that PFAI+ have a significantly higher expression of genes associated with immune functions, such as CD3E, CCR2, GZMA, CXCL9 and TRBC2. Accordingly, gene set enrichment analysis demonstrated that several immune pathways, including T-cell signalling, interferon-gamma response and TNFα signalling are enriched in PFAI+ ependymomas. RNA expression of immune checkpoints was also higher in PFAI+ tumors, indicating that these tumors might be more responsive to combinational therapies including immune checkpoint inhibitors. While immunohistochemistry showed low amounts of infiltrating CD3+, CD8+ and CD20+ cells, high numbers of CD163+ and HLA-DRA+ cells were detected. These cells were mainly located in regions of tumor necrosis. Increased amounts of CD4+ and CD8+ lymphocytes were present in PFAI+ tumors compared to PFAI- tumors. Deconvolution of the bulk RNA samples based on single-cell RNA sequencing data revealed an enrichment of myeloid cell populations, especially microglia and macrophages. Furthermore, PFAI+ tumors were found to contain significantly higher relative proportions of T-cells compared to PFAI- tumors (median of 3.76% for PFAI+ compared to 0.03% for PFAI-). Conclusions We suggest that pediatric posterior fossa A ependymomas can be divided into two groups based on the expression of immune-related genes, in which PFAI+ ependymomas are characterized by higher RNA expression levels of these genes and greater amounts of tumor-infiltrating immune cells. Several techniques showed an enrichment of T-lymphocytes in the PFAI+ ependymomas relative to the PFAI- ependymomas. Disclosure Information J. Lammers: None. F. Calkoen: None. M. Kranendonk: None. A. Federico: None. M. Kool: None. L. Kester: None. J. van der Lugt: None.


2021 ◽  
Author(s):  
Hui Jin ◽  
Bin Huang ◽  
Zijuan Wu ◽  
Huayuan Zhu ◽  
Hanning Tang ◽  
...  

Abstract Background Ibrutinib, a widely used Bruton’s tyrosine kinase inhibitor, has shown outstanding value in clinical therapy for chronic lymphocytic leukemia (CLL). However, the bottleneck of ibrutinib resistance has caused widespread concerns, necessitating the exploration of novel targets. Methods Single-cell RNA sequencing (scRNA-seq) was used to characterize the heterogeneity of ibrutinib-sensitive (IBS) and -resistant (IBR) CLL patients and single-cell stemness estimation and metabolic pathway enrichment analysis were performed. Lectin galactoside-binding soluble 1 (LGALS1) and lymphocyte-activating gene 3 (LAG3) were screened as key factors by analyzing the RNA-sequencing data at bulk and single cell levels. Subsequently, pseudo-time trajectory analysis and gene set enrichment analysis were conducted. In addition, an IBR CLL cell line (MEC1-IR) was generated and RT-qPCR, western blotting, and immunofluorescence were performed to detect the expression of LGALS1 and LAG3. OTX008, a selective inhibitor of galectin-1 (Gal-1, encoded by LGALS1) was assessed in CLL cells and CCK8 and apoptotic assays were conducted for functional analysis. Results IBR CLL showed significantly different characteristics from IBS in terms of transcriptome expression and energy metabolism. LGALS1 and LAG3 were gradually upregulated in B cells along the evolution trajectory from IBS to IBR. Their expression was verified to be closely related to the prognosis of CLL, as well as sensitivity to ibrutinib. OTX008 could effectively suppress the proliferation and induce apoptosis of CLL cells, especially for those with ibrutinib resistance. Conclusions An LGALS1 and LAG3 gene panel is a promising indicator of ibrutinib resistance and a prognostic marker for CLL. OTX008 displays pronounced performance against CLL cells, especially with IBR, and might represent a novel therapeutic strategy for CLL.


2020 ◽  
Vol 22 (Supplement_2) ◽  
pp. ii110-ii110
Author(s):  
Christina Jackson ◽  
Christopher Cherry ◽  
Sadhana Bom ◽  
Hao Zhang ◽  
John Choi ◽  
...  

Abstract BACKGROUND Glioma associated myeloid cells (GAMs) can be induced to adopt an immunosuppressive phenotype that can lead to inhibition of anti-tumor responses in glioblastoma (GBM). Understanding the composition and phenotypes of GAMs is essential to modulating the myeloid compartment as a therapeutic adjunct to improve anti-tumor immune response. METHODS We performed single-cell RNA-sequencing (sc-RNAseq) of 435,400 myeloid and tumor cells to identify transcriptomic and phenotypic differences in GAMs across glioma grades. We further correlated the heterogeneity of the GAM landscape with tumor cell transcriptomics to investigate interactions between GAMs and tumor cells. RESULTS sc-RNAseq revealed a diverse landscape of myeloid-lineage cells in gliomas with an increasing preponderance of bone marrow derived myeloid cells (BMDMs) with increasing tumor grade. We identified two populations of BMDMs unique to GBMs: Mac-1 and Mac-2. Mac-1 demonstrates upregulation of an immature myeloid gene signature and altered metabolic pathways. Mac-2 is characterized by expression of the scavenger receptor MARCO. Pseudotime and RNA velocity analysis revealed the ability of Mac-1 to transition and differentiate to Mac-2 and other GAM subtypes. We further found that the presence of these two populations of BMDMs is associated with the presence of tumor cells with stem cell and mesenchymal features. Bulk RNA-sequencing data demonstrate that gene signatures of these populations are associated with worse survival in GBM. CONCLUSION We used sc-RNAseq to identify a novel population of immature BMDMs that is associated with higher glioma grades. This population exhibited altered metabolic pathways and stem-like potential to differentiate into other GAM populations including GAMs with upregulation of immunosuppressive pathways. Our results elucidate unique interactions between BMDMs and GBM tumor cells that potentially drive GBM progression and the more aggressive mesenchymal subtype. Our discovery of these novel BMDMs has implications for new therapeutic targets to improve the efficacy of immune-based therapies in GBM.


2021 ◽  
Vol 12 (2) ◽  
pp. 317-334
Author(s):  
Omar Alaqeeli ◽  
Li Xing ◽  
Xuekui Zhang

The classification tree is a widely used machine learning method. It has multiple implementations as R packages: rpart, ctree, evtree, tree and C5.0. The details of these implementations are not the same, and hence their performances differ from one application to another. We are interested in their performance in the classification of cells using single-cell RNA-sequencing data. In this paper, we conducted a benchmark study using 22 single-cell RNA-sequencing data sets. Using cross-validation, we compare the packages’ prediction performances based on their Precision, Recall, F1-score and Area Under the Curve (AUC). We also compared the Complexity and Run-time of these R packages. Our study shows that rpart and evtree have the best Precision; evtree is the best in Recall, F1-score and AUC; C5.0 prefers more complex trees; tree is consistently much faster than others, although its complexity is often higher than others.
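For reference, three of the metrics named above follow directly from true-positive, false-positive, and false-negative counts. The sketch below uses the standard binary definitions in plain Python; the function name `precision_recall_f1` and the toy labels are illustrative, and the study's own R-based evaluation code is not reproduced here.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F1 from paired true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

precision, recall, f1 = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 1, 0, 1])
```

In this toy example there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all equal 2/3.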


Author(s):  
Yinlei Hu ◽  
Bin Li ◽  
Falai Chen ◽  
Kun Qu

Abstract Unsupervised clustering is a fundamental step of single-cell RNA sequencing data analysis. This issue has inspired several clustering methods to classify cells in single-cell RNA sequencing data. However, accurate prediction of the cell clusters remains a substantial challenge. In this study, we propose a new algorithm for single-cell RNA sequencing data clustering based on Sparse Optimization and low-rank matrix factorization (scSO). We applied our scSO algorithm to analyze multiple benchmark datasets and showed that the cluster number predicted by scSO was close to the number of reference cell types and that most cells were correctly classified. Our scSO algorithm is available at https://github.com/QuKunLab/scSO. Overall, this study demonstrates a potent cell clustering approach that can help researchers distinguish cell types in single-cell RNA sequencing data.

