BiTSC2: Bayesian inference of Tumor clonal Tree by joint analysis of Single-Cell SNV and CNA data

2020 ◽  
Author(s):  
Ziwei Chen ◽  
Fuzhou Gong ◽  
Lin Wan ◽  
Liang Ma

Abstract The rapid development of single-cell DNA sequencing (scDNA-seq) technology has greatly enhanced the resolution of tumor cell profiling, providing an unprecedented perspective in characterizing intra-tumoral heterogeneity and understanding tumor progression and metastasis. However, prominent algorithms for constructing tumor phylogeny based on scDNA-seq data usually only take single nucleotide variations (SNVs) as markers, failing to consider the effect caused by copy number alterations (CNAs). Here, we propose BiTSC2, Bayesian inference of Tumor clonal Tree by joint analysis of Single-Cell SNV and CNA data. BiTSC2 takes raw reads from scDNA-seq as input, accounts for sequencing errors, models dropout rate and assigns single cells into subclones. By applying Markov Chain Monte Carlo (MCMC) sampling, BiTSC2 can simultaneously estimate the subclonal scCNA and scSNV genotype matrices, subclonal assignments and tumor subclonal evolutionary tree. In comparison with existing methods on synthetic and real tumor data, BiTSC2 shows high accuracy in genotype recovery and subclonal assignment. BiTSC2 also performs robustly in dealing with scDNA-seq data with low sequencing depth and variant dropout rate.

2016 ◽  
Author(s):  
Olivier Poirion ◽  
Xun Zhu ◽  
Travers Ching ◽  
Lana X. Garmire

Abstract Despite its popularity, characterization of subpopulations with transcript abundance is subject to a significant amount of noise. We propose to use effective and expressed nucleotide variations (eeSNVs) from scRNA-seq as alternative features for tumor subpopulation identification. We developed a linear modeling framework, SSrGE, to link eeSNVs associated with gene expression. In all the datasets tested, eeSNVs achieve better accuracies than gene expression for identifying subpopulations. Previously validated cancer-relevant genes are also highly ranked, confirming the significance of the method. Moreover, SSrGE is capable of analyzing coupled DNA-seq and RNA-seq data from the same single cells, demonstrating its value in integrating multi-omics single cell techniques. In summary, SNV features from scRNA-seq data have merits for both subpopulation identification and linkage of genotype-phenotype relationship. The method SSrGE is available at https://github.com/lanagarmire/SSrGE.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Julian Lamanna ◽  
Erica Y. Scott ◽  
Harrison S. Edwards ◽  
M. Dean Chamberlain ◽  
Michael D. M. Dryden ◽  
...  

Abstract We introduce Digital microfluidic Isolation of Single Cells for -Omics (DISCO), a platform that allows users to select particular cells of interest from a limited initial sample size and connects single-cell sequencing data to their immunofluorescence-based phenotypes. Specifically, DISCO combines digital microfluidics, laser cell lysis, and artificial intelligence-driven image processing to collect the contents of single cells from heterogeneous populations, followed by analysis of single-cell genomes and transcriptomes by next-generation sequencing, and proteomes by nanoflow liquid chromatography and tandem mass spectrometry. The results described herein confirm the utility of DISCO for sequencing at levels that are equivalent to or enhanced relative to the state of the art, capable of identifying features at the level of single nucleotide variations. The unique levels of selectivity, context, and accountability of DISCO suggest potential utility for deep analysis of any rare cell population with contextual dependencies.


2019 ◽  
Author(s):  
Ning Wang ◽  
Andrew E. Teschendorff

Abstract Inferring the activity of transcription factors in single cells is a key task to improve our understanding of development and complex genetic diseases. This task is, however, challenging due to the relatively large dropout rate and noisy nature of single-cell RNA-Seq data. Here we present a novel statistical inference framework called SCIRA (Single Cell Inference of Regulatory Activity), which leverages the power of large-scale bulk RNA-Seq datasets to infer high-quality tissue-specific regulatory networks, from which regulatory activity estimates in single cells can be subsequently obtained. We show that SCIRA can correctly infer regulatory activity of transcription factors affected by high technical dropouts. In particular, SCIRA can improve sensitivity by as much as 70% compared to differential expression analysis and current state-of-the-art methods. Importantly, SCIRA can reveal novel regulators of cell-fate in tissue-development, even for cell-types that only make up 5% of the tissue, and can identify key novel tumor suppressor genes in cancer at single cell resolution. In summary, SCIRA will be an invaluable tool for single-cell studies aiming to accurately map activity patterns of key transcription factors during development, and how these are altered in disease.


Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 240 ◽  
Author(s):  
Prashant N. M. ◽  
Hongyu Liu ◽  
Pavlos Bousounis ◽  
Liam Spurr ◽  
Nawaf Alomran ◽  
...  

With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10x Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAF_RNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAF_RNA estimation from current scRNA-seq datasets and shows that the 3′-based library generation protocol of 10x Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
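The variant allele fraction and minimum-read threshold mentioned in this abstract can be illustrated with a short sketch. It is not the authors' pipeline. It assumes that the expressed variant allele fraction is the number of variant reads divided by the total reads at a locus, and that a locus counts as bi-allelic when both alleles are observed. The function names and the toy read counts are hypothetical.

```python
from typing import List, Optional, Tuple


def vaf_rna(variant_reads: int, reference_reads: int, min_reads: int = 3) -> Optional[float]:
    """Return variant reads / total reads, or None when coverage is below min_reads."""
    total = variant_reads + reference_reads
    if total < min_reads:
        return None  # locus not assessed at this threshold
    return variant_reads / total


def is_biallelic(vaf: float) -> bool:
    """Assumed definition: both alleles observed, i.e. 0 < VAF < 1."""
    return 0.0 < vaf < 1.0


def biallelic_fraction(loci: List[Tuple[int, int]], min_reads: int) -> Optional[float]:
    """Fraction of sufficiently covered loci that show bi-allelic expression."""
    vafs = [vaf_rna(v, r, min_reads) for v, r in loci]
    covered = [x for x in vafs if x is not None]
    if not covered:
        return None
    return sum(is_biallelic(x) for x in covered) / len(covered)


# Toy (variant_reads, reference_reads) pairs; illustrative values only.
toy_loci = [(2, 1), (0, 5), (6, 6), (1, 0)]
print(biallelic_fraction(toy_loci, min_reads=3))
print(biallelic_fraction(toy_loci, min_reads=10))
```

Raising `min_reads` drops thinly covered loci, where one allele can be missed by chance. This is one plausible reason the bi-allelic share rises with the threshold.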


2021 ◽  
Author(s):  
Aaron Wing Cheung Kwok ◽  
Chen Qiao ◽  
Rongting Huang ◽  
Mai-Har Sham ◽  
Joshua W. K. Ho ◽  
...  

Abstract Mitochondrial mutations are increasingly recognised as informative endogenous genetic markers that can be used to reconstruct cellular clonal structure using single-cell RNA or DNA sequencing data. However, there is a lack of effective computational methods to identify informative mtDNA variants in noisy and sparse single-cell sequencing data. Here we present an open source computational tool MQuad that accurately calls clonally informative mtDNA variants in a population of single cells, and an analysis suite for complete clonality inference, based on single cell RNA or DNA sequencing data. Through a variety of simulated and experimental single cell sequencing data, we showed that MQuad can identify mitochondrial variants with both high sensitivity and specificity, outperforming existing methods by a large margin. Furthermore, we demonstrated its wide applicability in different single cell sequencing protocols, particularly in complementing single-nucleotide and copy-number variations to extract finer clonal resolution. MQuad is a Python package available via https://github.com/single-cell-genetics/MQuad.


2020 ◽  
Vol 6 (50) ◽  
pp. eabd6454
Author(s):  
Qingyu Ruan ◽  
Weidong Ruan ◽  
Xiaoye Lin ◽  
Yang Wang ◽  
Fenxiang Zou ◽  
...  

Single-cell whole-genome sequencing (WGS) is critical for characterizing dynamic intercellular changes in DNA. Current sample preparation technologies for single-cell WGS are complex and expensive, and suffer from high amplification bias and errors. Here, we describe Digital-WGS, a sample preparation platform that streamlines high-performance single-cell WGS with automatic processing based on digital microfluidics. The method provides high single-cell capture efficiency for any number and type of cells by means of a wetted hydrodynamic structure. The digital control of droplets in a closed hydrophobic interface enables the complete removal of exogenous DNA, sufficient cell lysis, and lossless amplicon recovery, achieving a low coefficient of variation and high coverage at multiple scales. Single-cell genomic variation profiling achieves excellent detection of copy number variants with a smallest bin size of 150 kb and of single-nucleotide variants with an allele dropout rate of 5.2%, holding great promise for broader applications of single-cell genomics.
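The coefficient of variation named in this abstract is a standard measure of coverage uniformity. The sketch below assumes it is computed as the population standard deviation of per-bin read counts divided by their mean. The abstract does not say which estimator or bin size was used for this statistic. The function name and the toy counts are hypothetical.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(bin_counts: Sequence[float]) -> float:
    """Population standard deviation of per-bin read counts divided by their mean."""
    mean = statistics.fmean(bin_counts)
    if mean == 0:
        raise ValueError("mean coverage is zero; coefficient of variation is undefined")
    return statistics.pstdev(bin_counts) / mean


# Toy per-bin read counts; illustrative values only.
uniform_bins = [10, 10, 10, 10]
uneven_bins = [2, 4, 4, 4, 5, 5, 7, 9]
print(coefficient_of_variation(uniform_bins))
print(coefficient_of_variation(uneven_bins))
```

A lower value means reads are spread more evenly across bins, which is what makes copy number calls at small bin sizes feasible.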


2017 ◽  
Author(s):  
Craig L. Bohrson ◽  
Allison R. Barton ◽  
Michael A. Lodato ◽  
Rachel E. Rodin ◽  
Vinay Viswanadham ◽  
...  

Abstract Whole-genome sequencing of DNA from single cells has the potential to reshape our understanding of the mutational heterogeneity in normal and disease tissues. A major difficulty, however, is distinguishing artifactual mutations that arise from DNA isolation and amplification from true mutations. Here, we describe linked-read analysis (LiRA), a method that utilizes phasing of somatic single nucleotide variants with nearby germline variants to identify true mutations, thereby allowing accurate estimation of somatic mutation rates at the single cell level.


2021 ◽  
Author(s):  
Chenxu Zhu ◽  
Yanxiao Zhang ◽  
Yang Eric Li ◽  
Jacinta Lucero ◽  
M. Margarita Behrens ◽  
...  

Abstract We describe here Paired-Tag, a high-throughput multi-omics method for joint profiling of histone modifications and gene expression in single cells. The assay is based on a combinatorial barcoding indexing strategy that does not require special instruments. It can be performed with nuclei extracted from cultured cells or frozen tissues, in standard molecular biology laboratories.


2019 ◽  
Vol 23 (5) ◽  
pp. 508-518
Author(s):  
E. A. Vodiasova ◽  
E. S. Chelebieva ◽  
O. N. Kuleshova

A wealth of genome and transcriptome data obtained for whole organisms using next-generation sequencing (NGS) technologies has left many questions unanswered in oncology, immunology, physiology, neurobiology, zoology and other fields of science and medicine. Since the cell is the basic unit of life in all unicellular and multicellular organisms, biological processes need to be studied at the cellular level. This understanding gave impetus to the development of a new direction: the creation of technologies that allow working with individual cells (single-cell technology). The rapid development of both instruments and advanced protocols for working with single cells is driven by the relevance of these studies in many fields of science and medicine. Single-cell technologies make it possible to study the features of various stages of ontogenesis, identify patterns of cell differentiation and subsequent tissue development, conduct genomic and transcriptome analyses in various areas of medicine (especially in demand in immunology and oncology), and identify cell types and states as well as patterns of biochemical and physiological processes, taking comprehensive research to a new level. The first technologies for RNA sequencing of individual cell transcriptomes (scRNA-seq) captured no more than one hundred cells at a time, which was insufficient given the high cell heterogeneity, the existence of minor cell types (not detectable by morphology) and complex regulatory pathways. Techniques for isolating, capturing and sequencing transcripts of tens of thousands of cells at a time are now evolving. However, the new technologies differ from one another both at the sample preparation stage and during bioinformatics analysis.
In this paper we consider the most effective methods of massively parallel scRNA-seq using the example of 10x Genomics, as well as the specifics of such an experiment, further bioinformatics analysis of the data, and the future outlook and applications of new high-performance technologies.


2019 ◽  
Author(s):  
Yiliang Zhang ◽  
Kexuan Liang ◽  
Molei Liu ◽  
Yue Li ◽  
Hao Ge ◽  
...  

Abstract Single-cell RNA sequencing technologies have been widely used in recent years as a powerful tool for observing gene expression at the resolution of single cells. Two of the major challenges in scRNA-seq data analysis are dropout events and batch effects. The inflation of zeros (dropout rate) varies substantially across single cells. Evidence has shown that technical noise, including batch effects, explains a notable proportion of this cell-to-cell variation. To capture biological variation, it is necessary to quantify and remove technical variation. Here, we introduce SCRIBE (Single-Cell Recovery Imputation with Batch Effects), a principled framework that imputes dropout events and corrects batch effects simultaneously. We demonstrate, through real examples, that SCRIBE outperforms existing scRNA-seq data analysis tools in recovering cell-specific gene expression patterns, removing batch effects and retaining biological variation across cells. Our software is freely available online at https://github.com/YiliangTracyZhang/SCRIBE.
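The cell-to-cell variation in zero inflation described in this abstract can be made concrete with a small sketch. It is not part of SCRIBE. It only computes the observed fraction of zero counts per cell, which mixes true biological zeros with technical dropouts. The function name and the toy count matrix are hypothetical.

```python
from typing import List, Sequence


def zero_fraction_per_cell(counts: Sequence[Sequence[int]]) -> List[float]:
    """For each cell (row), return the fraction of genes (columns) with a zero count."""
    fractions = []
    for cell in counts:
        if len(cell) == 0:
            raise ValueError("each cell needs at least one gene")
        zeros = sum(1 for c in cell if c == 0)
        fractions.append(zeros / len(cell))
    return fractions


# Toy matrix: rows are cells, columns are genes; illustrative values only.
toy_counts = [
    [0, 0, 3, 1],
    [5, 0, 2, 2],
    [0, 0, 0, 7],
]
print(zero_fraction_per_cell(toy_counts))
```

Comparing these fractions across batches is a quick way to see whether zero inflation tracks a technical factor instead of biology.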

