scholarly journals Scirpy: a Scanpy extension for analyzing single-cell T-cell receptor-sequencing data

2020 ◽  
Vol 36 (18) ◽  
pp. 4817-4818 ◽  
Author(s):  
Gregor Sturm ◽  
Tamas Szabo ◽  
Georgios Fotakis ◽  
Marlene Haider ◽  
Dietmar Rieder ◽  
...  

Abstract Summary Advances in single-cell technologies have enabled the investigation of T-cell phenotypes and repertoires at unprecedented resolution and scale. Bioinformatic methods for the efficient analysis of these large-scale datasets are instrumental for advancing our understanding of adaptive immune responses. However, while well-established solutions are accessible for the processing of single-cell transcriptomes, no streamlined pipelines are available for the comprehensive characterization of T-cell receptors. Here, we propose single-cell immune repertoires in Python (Scirpy), a scalable Python toolkit that provides simplified access to the analysis and visualization of immune repertoires from single cells and seamless integration with transcriptomic data. Availability and implementation Scirpy source code and documentation are available at https://github.com/icbi-lab/scirpy. Supplementary information Supplementary data are available at Bioinformatics online.

2020 ◽  
Author(s):  
Gregor Sturm ◽  
Tamas Szabo ◽  
Georgios Fotakis ◽  
Marlene Haider ◽  
Dietmar Rieder ◽  
...  

AbstractSummaryAdvances in single-cell technologies have enabled the investigation of T cell phenotypes and repertoires at unprecedented resolution and scale. Bioinformatic methods for the efficient analysis of these large-scale datasets are instrumental for advancing our understanding of adaptive immune responses in cancer, but also in infectious diseases like COVID-19. However, while well-established solutions are accessible for the processing of single-cell transcriptomes, no streamlined pipelines are available for the comprehensive characterization of T cell receptors. Here we propose Scirpy, a scalable Python toolkit that provides simplified access to the analysis and visualization of immune repertoires from single cells and seamless integration with transcriptomic data.Availability and implementationScirpy source code and documentation are available at https://github.com/icbi-lab/scirpy.


2020 ◽  
Vol 36 (15) ◽  
pp. 4255-4262
Author(s):  
Si-Yi Chen ◽  
Chun-Jie Liu ◽  
Qiong Zhang ◽  
An-Yuan Guo

Abstract Motivation T-cell receptors (TCRs) function to recognize antigens and play vital roles in T-cell immunology. Surveying TCR repertoires by characterizing complementarity-determining region 3 (CDR3) is a key issue. Due to the high diversity of CDR3 and technological limitation, accurate characterization of CDR3 repertoires remains a great challenge. Results We propose a computational method named CATT for ultra-sensitive and precise TCR CDR3 sequences detection. CATT can be applied on TCR sequencing, RNA-Seq and single-cell TCR(RNA)-Seq data to characterize CDR3 repertoires. CATT integrated de Bruijn graph-based micro-assembly algorithm, data-driven error correction model and Bayesian inference algorithm, to self-adaptively and ultra-sensitively characterize CDR3 repertoires with high performance. Benchmark results of datasets from in silico and experimental data demonstrated that CATT showed superior recall and precision compared with existing tools, especially for data with short read length and small size and single-cell sequencing data. Thus, CATT will be a useful tool for TCR analysis in researches of cancer and immunology. Availability and implementation http://bioinfo.life.hust.edu.cn/CATT or https://github.com/GuoBioinfoLab/CATT. Supplementary information Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Hussein A. Abbas ◽  
Dapeng Hao ◽  
Katarzyna Tomczak ◽  
Praveen Barrodia ◽  
Jin Seon Im ◽  
...  

AbstractIn contrast to the curative effect of allogenic stem cell transplantation in acute myeloid leukemia via T cell activity, only modest responses are achieved with checkpoint-blockade therapy, which might be explained by T cell phenotypes and T cell receptor (TCR) repertoires. Here, we show by paired single-cell RNA analysis and TCR repertoire profiling of bone marrow cells in relapsed/refractory acute myeloid leukemia patients pre/post azacytidine+nivolumab treatment that the disease-related T cell subsets are highly heterogeneous, and their abundance changes following PD-1 blockade-based treatment. TCR repertoires expand and primarily emerge from CD8+ cells in patients responding to treatment or having a stable disease, while TCR repertoires contract in therapy-resistant patients. Trajectory analysis reveals a continuum of CD8+ T cell phenotypes, characterized by differential expression of granzyme B and a bone marrow-residing memory CD8+ T cell subset, in which a population with stem-like properties expressing granzyme K is enriched in responders. Chromosome 7/7q loss, on the other hand, is a cancer-intrinsic genomic marker of PD-1 blockade resistance in AML. In summary, our study reveals that adaptive T cell plasticity and genomic alterations determine responses to PD-1 blockade in acute myeloid leukemia.


2019 ◽  
Vol 35 (22) ◽  
pp. 4688-4695 ◽  
Author(s):  
Rui Hou ◽  
Elena Denisenko ◽  
Alistair R R Forrest

Abstract Motivation Single-cell RNA sequencing (scRNA-seq) measures gene expression at the resolution of individual cells. Massively multiplexed single-cell profiling has enabled large-scale transcriptional analyses of thousands of cells in complex tissues. In most cases, the true identity of individual cells is unknown and needs to be inferred from the transcriptomic data. Existing methods typically cluster (group) cells based on similarities of their gene expression profiles and assign the same identity to all cells within each cluster using the averaged expression levels. However, scRNA-seq experiments typically produce low-coverage sequencing data for each cell, which hinders the clustering process. Results We introduce scMatch, which directly annotates single cells by identifying their closest match in large reference datasets. We used this strategy to annotate various single-cell datasets and evaluated the impacts of sequencing depth, similarity metric and reference datasets. We found that scMatch can rapidly and robustly annotate single cells with comparable accuracy to another recent cell annotation tool (SingleR), but that it is quicker and can handle larger reference datasets. We demonstrate how scMatch can handle large customized reference gene expression profiles that combine data from multiple sources, thus empowering researchers to identify cell populations in any complex tissue with the desired precision. Availability and implementation scMatch (Python code) and the FANTOM5 reference dataset are freely available to the research community here https://github.com/forrest-lab/scMatch. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 21 (4) ◽  
pp. 1209-1223 ◽  
Author(s):  
Raphael Petegrosso ◽  
Zhuliu Li ◽  
Rui Kuang

Abstract   Single-cell RNAsequencing (scRNA-seq) technologies have enabled the large-scale whole-transcriptome profiling of each individual single cell in a cell population. A core analysis of the scRNA-seq transcriptome profiles is to cluster the single cells to reveal cell subtypes and infer cell lineages based on the relations among the cells. This article reviews the machine learning and statistical methods for clustering scRNA-seq transcriptomes developed in the past few years. The review focuses on how conventional clustering techniques such as hierarchical clustering, graph-based clustering, mixture models, $k$-means, ensemble learning, neural networks and density-based clustering are modified or customized to tackle the unique challenges in scRNA-seq data analysis, such as the dropout of low-expression genes, low and uneven read coverage of transcripts, highly variable total mRNAs from single cells and ambiguous cell markers in the presence of technical biases and irrelevant confounding biological variations. We review how cell-specific normalization, the imputation of dropouts and dimension reduction methods can be applied with new statistical or optimization strategies to improve the clustering of single cells. We will also introduce those more advanced approaches to cluster scRNA-seq transcriptomes in time series data and multiple cell populations and to detect rare cell types. Several software packages developed to support the cluster analysis of scRNA-seq data are also reviewed and experimentally compared to evaluate their performance and efficiency. Finally, we conclude with useful observations and possible future directions in scRNA-seq data analytics. Availability All the source code and data are available at https://github.com/kuanglab/single-cell-review.


2021 ◽  
Vol 21 ◽  
pp. S74-S75
Author(s):  
Mattia D’Agostino ◽  
Maeva Fincker ◽  
Cristina Panaroni ◽  
Ashish Yeri ◽  
Pingping Mao ◽  
...  

2018 ◽  
Vol 19 (3) ◽  
pp. 291-301 ◽  
Author(s):  
David Zemmour ◽  
Rapolas Zilionis ◽  
Evgeny Kiner ◽  
Allon M. Klein ◽  
Diane Mathis ◽  
...  

Author(s):  
Abha S Bais ◽  
Dennis Kostka

Abstract Motivation Single-cell RNA sequencing (scRNA-seq) technologies enable the study of transcriptional heterogeneity at the resolution of individual cells and have an increasing impact on biomedical research. However, it is known that these methods sometimes wrongly consider two or more cells as single cells, and that a number of so-called doublets is present in the output of such experiments. Treating doublets as single cells in downstream analyses can severely bias a study’s conclusions, and therefore computational strategies for the identification of doublets are needed. Results With scds, we propose two new approaches for in silico doublet identification: Co-expression based doublet scoring (cxds) and binary classification based doublet scoring (bcds). The co-expression based approach, cxds, utilizes binarized (absence/presence) gene expression data and, employing a binomial model for the co-expression of pairs of genes, yields interpretable doublet annotations. bcds, on the other hand, uses a binary classification approach to discriminate artificial doublets from original data. We apply our methods and existing computational doublet identification approaches to four datasets with experimental doublet annotations and find that our methods perform at least as well as the state of the art, at comparably little computational cost. We observe appreciable differences between methods and across datasets and that no approach dominates all others. In summary, scds presents a scalable, competitive approach that allows for doublet annotation of datasets with thousands of cells in a matter of seconds. Availability and implementation scds is implemented as a Bioconductor R package (doi: 10.18129/B9.bioc.scds). Supplementary information Supplementary data are available at Bioinformatics online.


Sign in / Sign up

Export Citation Format

Share Document