Joint copy number and mutation phylogeny reconstruction from single-cell amplicon sequencing data

2022 ◽  
Author(s):  
Etienne Sollier ◽  
Jack Kuipers ◽  
Niko Beerenwinkel ◽  
Koichi Takahashi ◽  
Katharina Jahn

Reconstructing the history of somatic DNA alterations that occurred in a tumour can help understand its evolution and predict its resistance to treatment. Single-cell DNA sequencing (scDNAseq) can be used to investigate clonal heterogeneity and to inform phylogeny reconstruction. However, existing phylogenetic methods for scDNAseq data are designed either for point mutations or for large copy number variations, but not for both types of events simultaneously. Here, we develop COMPASS, a computational method for inferring the joint phylogeny of mutations and copy number alterations from targeted scDNAseq data. We evaluate COMPASS on simulated data and show that it outperforms existing methods. We apply COMPASS to a large cohort of 123 patients with acute myeloid leukemia (AML) and detect copy number alterations, including subclonal ones, which are in agreement with current knowledge of AML development. We further used bulk SNP array data to orthogonally validate our findings.

Blood ◽  
2021 ◽  
Vol 138 (Supplement 1) ◽  
pp. 1590-1590
Author(s):  
Mehmet K. Samur ◽  
Anil Aktas-Samur ◽  
Romain Lannes ◽  
Jill Corre ◽  
Anjan Thakurta ◽  
...  

Abstract New generation immunotherapies in Multiple Myeloma (MM) targeting BCMA have shown remarkable clinical benefits. However, relapse still occurs due to tumor-intrinsic and extrinsic resistance mechanisms, including antigen loss related to mutation, deletion and splicing pattern changes. Two recent case reports, including ours, highlighted biallelic loss of BCMA as a cause of resistance to anti-BCMA targeting therapy. In both studies the BCMA locus at 16p was deleted, bringing into focus the importance of del16p. Here, we have evaluated 2883 MM patients at diagnosis and relapse to understand the frequency and characteristics of somatic events targeting BCMA. We first evaluated the frequency of deletion involving the BCMA locus (16p13.13) in MM patients from multiple studies using WGS data as well as Affymetrix Cytoscan HD and SNP 6.0 arrays. We observed del16p in 8.58% (7.6% to 14.6% in individual studies) of newly diagnosed patients (n=2458). A similar frequency was observed in relapsed MM patients not previously exposed to BCMA targeting therapy. Next, we evaluated genome-wide copy number alterations (CNAs) in all patients with loss of the BCMA locus and observed a similar frequency of loss in both hyperdiploid MM (HMM) and non-HMM, suggesting its independence from cytogenetic subtypes of MM. Overall copy number loss was significantly higher in patients with BCMA loss compared to the rest of the MM patients. Patients with loss of the BCMA locus have increased mutational load (8202 with 95% HDI 6921 - 9535) compared to those without BCMA locus loss (6975 with 95% HDI 6626 - 7343); the probability of a difference greater than 0 was 96.8% and the difference of the means was 1222 [95% CI -112 - 2589]. We next evaluated co-occurrence of BCMA loss with other high-risk events and observed del1p and del17p as being significantly associated with loss of the BCMA locus [odds ratio 19.37 (13.13-25.80), FDR = 1.57e-65; and 8.8 (6.39-12.15), FDR = 5.57e-39, respectively].
Furthermore, we observed that when both BCMA and TP53 loss are present, they have the same log ratio (sequencing) or smoothed copy numbers (SNP array). Similarly, we used CDKN2C as a proxy for chromosome 1p loss and observed that when both BCMA and CDKN2C loss are present in the same patient they tend to show similar copy number values. These data suggested the possibility of co-occurrence of these events in the same cell. To further investigate this observation, we used single cell DNA sequencing data from patients with subclonal and clonal BCMA locus loss. scDNA sequencing showed that almost all cells with BCMA deletion also had TP53 deletion (95%). Interestingly, not all cells with p53 loss had BCMA loss, suggesting a chronology in which p53 loss occurs first, followed by BCMA loss. We further investigated whether a bi-allelic BCMA loss was observed after anti-BCMA targeted CAR-T cell therapy by imputing the copy number alterations using single cell RNA sequencing data. Our data from this case also indicated that BCMA loss tends to co-occur with TP53 deletions (OR=5.67 [95% CI 4.12-7.84], p value < 0.0001). Moreover, TP53 mutations were also more frequent in patients with both del16p and del17p, compared to patients who only had del16p or del17p. In summary, our data from large-scale copy number profiles at diagnosis and relapse showed that monoallelic BCMA deletions are frequent events and that patients with these events show increased aneuploidy, mostly deletions, potentially making these cells vulnerable to biallelic loss of genes, especially under the pressure of targeted therapy. Our results also highlight that BCMA expression in bulk samples may not detect the presence or absence of cells with target loss, and therefore combining strategies at the bulk and single cell level is necessary to understand the disease status.
These results suggest the need to study del16p in patients being targeted for BCMA-directed therapy and its association with other risk factors in MM. Disclosures Thakurta: Bristol Myers Squibb: Current Employment, Current equity holder in publicly-traded company. Anderson: Celgene: Membership on an entity's Board of Directors or advisory committees; Janssen: Membership on an entity's Board of Directors or advisory committees; Gilead: Membership on an entity's Board of Directors or advisory committees; Sanofi-Aventis: Membership on an entity's Board of Directors or advisory committees; Millenium-Takeda: Membership on an entity's Board of Directors or advisory committees; Bristol Myers Squibb: Membership on an entity's Board of Directors or advisory committees; Pfizer: Membership on an entity's Board of Directors or advisory committees; Scientific Founder of Oncopep and C4 Therapeutics: Current equity holder in publicly-traded company, Current holder of individual stocks in a privately-held company; AstraZeneca: Membership on an entity's Board of Directors or advisory committees; Mana Therapeutics: Membership on an entity's Board of Directors or advisory committees. Munshi: Takeda: Consultancy; Adaptive Biotechnology: Consultancy; Amgen: Consultancy; Karyopharm: Consultancy; Celgene: Consultancy; Abbvie: Consultancy; Oncopep: Consultancy, Current equity holder in publicly-traded company, Other: scientific founder, Patents & Royalties; Novartis: Consultancy; Legend: Consultancy; Pfizer: Consultancy; Janssen: Consultancy; Bristol-Myers Squibb: Consultancy.
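The co-occurrence results in this abstract are reported as odds ratios with interval estimates. The abstract does not say how those intervals were computed, so the following is only a minimal sketch of one common choice, the Wald interval on the log odds ratio for a 2x2 table. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: samples with both events (e.g. BCMA loss and del17p)
    b: samples with the first event only
    c: samples with the second event only
    d: samples with neither event
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (NOT from the abstract).
estimate, lower, upper = odds_ratio_wald_ci(40, 60, 100, 800)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} - {upper:.2f}")
```

With small or zero cell counts an exact method (for example Fisher's exact test) would be the safer choice; the Wald interval is shown here only because it is short enough to read at a glance.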


Author(s):  
Jack Kuipers ◽  
Mustafa Anıl Tuncel ◽  
Pedro Ferreira ◽  
Katharina Jahn ◽  
Niko Beerenwinkel

Copy number alterations are driving forces of tumour development and the emergence of intra-tumour heterogeneity. A comprehensive picture of these genomic aberrations is therefore essential for the development of personalised and precise cancer diagnostics and therapies. Single-cell sequencing offers the highest resolution for copy number profiling down to the level of individual cells. Recent high-throughput protocols allow for the processing of hundreds of cells through shallow whole-genome DNA sequencing. The resulting low read-depth data poses substantial statistical and computational challenges to the identification of copy number alterations. We developed SCICoNE, a statistical model and MCMC algorithm tailored to single-cell copy number profiling from shallow whole-genome DNA sequencing data. SCICoNE reconstructs the history of copy number events in the tumour and uses these evolutionary relationships to identify the copy number profiles of the individual cells. We show the accuracy of this approach in evaluations on simulated data and demonstrate its practicability in applications to a xenograft breast cancer sample.


2014 ◽  
Author(s):  
Tyler Garvin ◽  
Robert Aboukhalil ◽  
Jude Kendall ◽  
Timour Baslan ◽  
Gurinder S. Atwal ◽  
...  

We present an open-source visual-analytics web platform, Ginkgo (http://qb.cshl.edu/ginkgo), for the interactive analysis and quality assessment of single-cell copy-number alterations. Ginkgo automatically constructs copy-number profiles of individual cells from mapped reads, as well as constructing phylogenetic trees of related cells. We validate Ginkgo by reproducing the results of five major studies and examine the data characteristics of three commonly used single-cell amplification techniques to conclude DOP-PCR to be the most consistent for CNV analysis.
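The abstract says that copy-number profiles are built from mapped reads but does not spell out the steps. As a rough illustration of the general read-depth idea only, and not of Ginkgo's actual pipeline, the sketch below counts reads in fixed-width bins and scales each bin by the median count, assuming that the median bin sits at the sample ploidy. All names, the fixed bin width and the toy counts are assumptions.

```python
from statistics import median


def bin_counts(read_starts, chrom_length, bin_size):
    """Count reads per fixed-width bin on one chromosome (0-based positions)."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts


def copy_number_profile(counts, ploidy=2):
    """Scale bin counts so the median bin equals `ploidy`, then round.

    Assumes most of the genome is at the given ploidy; real tools also
    correct for GC content and mappability and segment the signal first.
    """
    med = median(counts)
    if med == 0:
        raise ValueError("median bin count is zero; coverage too low")
    return [round(ploidy * c / med) for c in counts]


# Toy bin counts (hypothetical): a gain in bins 3-4 and a loss in bin 5.
counts = [100, 98, 102, 150, 149, 51, 100]
print(copy_number_profile(counts))  # [2, 2, 2, 3, 3, 1, 2]
```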


2021 ◽  
Vol 12 ◽  
Author(s):  
Kirill Anoshkin ◽  
Ivan Vasilyev ◽  
Kristina Karandasheva ◽  
Mikhail Shugay ◽  
Valeriya Kudryavtseva ◽  
...  

Insulinomatosis is characterized by monohormonality of multiple macro-tumors and micro-tumors that arise synchronously and metachronously in all regions of the pancreas, and often recurring hypoglycemia. One of the main characteristics of insulinomatosis is the presence of insulin-expressing monohormonal endocrine cell clusters that are exclusively composed of proliferating insulin-positive cells, are less than 1 mm in size, and show solid islet-like structure. It is presumed that insulinomatosis affects the entire population of β-cells. With regard to molecular genetics, this phenomenon is not related to mutation in the MEN1 gene and is more similar to sporadic benign insulinomas; however, the molecular genetics of this disease remains poorly investigated. NGS was performed with a panel of 409 cancer-related genes. Sequencing results were analyzed by bioinformatic algorithms for detecting point mutations and copy number variations. DNA copy number variations were detected that harbor a large number of genes in insulinoma and fewer genes in micro-tumors. qPCR was used to confirm copy number variations at the ATRX, FOXL2, IRS2 and CEBPA genes. Copy number alterations involving the FOXL2, IRS2, CEBPA and ATRX genes were observed in insulinoma as well as in micro-tumor samples, suggesting that alterations of these genes may promote malignization in the β-cell population.


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 122-122
Author(s):  
Nicola E Potter ◽  
Luca Ermini ◽  
Elli Papaemmanuil ◽  
Gowri Vijayaraghavan ◽  
Ian Titley ◽  
...  

Abstract 122 Cancer clone development is widely regarded as an evolutionary or Darwinian process of genetic diversification and natural (or therapeutic) selection within tissue ecosystems. Emerging studies are providing strong evidence that dynamic and complex branching sub-clonal genetic architectures are a common feature of cancer (Greaves M and Maley CC Nature 2012). This complexity may underpin the intransigence of advanced cancer to therapeutic control, particularly as the critical 'driver' cells – cancer or leukaemic stem cells, also appear to be genetically diverse within individual patients (Anderson K et al Nature 2011, Notta F et al Nature 2011). Sub-clonal architecture can only be fully determined through the study of large numbers of single cells uniformly sampled from the individual cancer of interest and assessed for composite genotype. Various technologies and approaches from fluorescent in situ hybridisation (FISH) to whole-genome sequencing of single cells have been applied to cancer and leukaemic cells but each approach has limitations. We have developed a novel multiplex microfluidic Q-PCR approach that allows unbiased single cell sampling, high throughput analysis of hundreds of individual cells and simultaneous detection of multiple genetic alterations in a single cell, including fusion genes, DNA copy number alterations (CNAs) and sequence-based mutations. As a proof of principle study we have applied this technique to REH, an acute lymphoblastic leukaemia (ALL) cell line that harbors the ETV6-RUNX1 fusion and a SNP in the EPO receptor gene, which we used as a surrogate mutation. We further determined a detailed sub-clonal genetic architecture for two ETV6-RUNX1 positive ALL patient samples with multiple point mutations and copy number alterations (determined by whole-genome sequencing) by interrogating approximately 400 flow cytometry sorted single cells with validation by FISH and standard sequencing. 
Briefly, single cells were lysed prior to multiplex specific (DNA) target amplification (STA) and Q-PCR using the 96.96 dynamic microfluidic array and the BioMark HD (Fluidigm, UK). Phylogenetic trees were constructed using maximum parsimony with PAUP analysis software. Interrogation of REH revealed that all single cells registered the ETV6-RUNX1 fusion and EPO receptor SNP, but 42% of cells gained either 1 or 2 additional copies of chromosome 21. Patient sample data revealed branching sub-clonal architectures in Case A in which all leukaemic cells harbored the fusion with additional point mutations but only sub-clones showed CNAs. In contrast, the sub-clonal architecture of Case B showed that whilst the ETV6-RUNX1 fusion was the earliest (or universal) genomic event, CNAs were relatively early events preceding the acquisition of point mutations (Figure 1). In both cases, the numerically predominant sub-clone harbored both point mutations and CNAs in addition to the presumptive initiating lesion, ETV6-RUNX1. These detailed and complex sub-clonal architectures would be masked by other genetic techniques. Single cell genetics coupled with deep genome sequencing is now technically feasible and provides an accurate portrait of the dynamic clonal complexity in leukaemia (and other cancers). Variegated genetics and clonal complexity in individual leukaemias has important implications for our understanding of molecular pathogenesis and for therapeutic targeting. Figure 1. This sub-clonal genetic architecture depicts the branching structure found for Case B, illustrating that in this case the ETV6-RUNX1 fusion was the earliest genomic event, followed by CNAs and the acquisition of point mutations. Those populations highlighted grey are within the experimental error rate but potentially true populations.
Disclosures: No relevant conflicts of interest to declare.
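The abstract states that trees were built by maximum parsimony in PAUP. To illustrate what a parsimony criterion measures, here is a minimal sketch of Fitch's small-parsimony count on a fixed rooted binary tree. It is not PAUP's search procedure, and the tree encoding, cell names and genotype values are hypothetical.

```python
def fitch(tree, states):
    """Fitch pass for one character on a rooted binary tree.

    tree:   a leaf name (str) or a (left, right) tuple of subtrees
    states: dict mapping leaf name -> observed state (e.g. 0/1 for a mutation)
    Returns (candidate_state_set, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: one state change is required on this split.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, characters):
    """Total number of changes over all characters (lower is better)."""
    return sum(fitch(tree, states)[1] for states in characters)


# Hypothetical single cells and two binary markers (illustration only).
characters = [
    {"cell1": 0, "cell2": 0, "cell3": 1, "cell4": 1},
    {"cell1": 0, "cell2": 0, "cell3": 0, "cell4": 1},
]
tree_a = (("cell1", "cell2"), ("cell3", "cell4"))
tree_b = (("cell1", "cell3"), ("cell2", "cell4"))
print(parsimony_score(tree_a, characters), parsimony_score(tree_b, characters))
```

A maximum parsimony search would score many candidate topologies this way and keep the lowest-scoring ones; in this toy example `tree_a` needs fewer changes than `tree_b`.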


2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Fei Luo

Abstract Background Copy number alterations (CNAs) are tightly associated with cancers, so detecting them accurately is one of the most important tasks in cancer genomics. A series of CNA detection methods have been proposed and new ones are still being developed. Due to the complexity of CNAs in cancers, no CNA detection method has been accepted as the gold-standard caller. Several evaluation studies have attempted to characterize the performance of typical CNA detection methods. Limited by the scale of their evaluation data, these comparisons do not reach a consensus, and researchers remain unsure how to choose a proper CNA caller for their analysis. A more comprehensive evaluation of typical CNA detection methods is therefore needed. Results In this work, we use a large-scale real dataset from the CAGEKID consortium to evaluate a total of 12 typical CNA detection methods. These methods are among the most widely used in cancer research and are routinely used as benchmarks for newly proposed CNA detection methods. This large-scale dataset comprises SNP array data on 94 samples and whole genome sequencing data on 10 samples. Evaluations are carried out in the current scenarios of CNA detection: detecting CNAs on SNP array data, on sequencing data with matched tumor and normal samples, and on sequencing data with a single tumor sample. First, three SNP-based methods are ranked. Subsequently, the results of the best SNP-based method are used as a benchmark to compare six matched-sample methods and three single-tumor-sample methods in terms of preprocessing, recall rate, Jaccard index and segmentation characteristics. Conclusions Our survey thoroughly reveals the strengths and weaknesses of the 12 typical methods. We explain why methods show specific characteristics from a methodological standpoint. 
Finally, we present guiding principles for choosing a suitable CNA detection method under specific conditions. Some unsolved problems and expectations for upcoming CNA detection methods are also addressed.
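The abstract names recall rate and Jaccard index as comparison metrics but does not define how they were computed. One plausible reading, assumed here, is a base-pair-level comparison between a caller's CNA segments and the benchmark segments on the same chromosome. The sketch below uses half-open `(start, end)` intervals; the function names and coordinates are hypothetical.

```python
def merge(intervals):
    """Merge overlapping or touching half-open intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(intervals):
    """Number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


def overlap_length(a, b):
    """Number of bases covered by both interval sets."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def recall_and_jaccard(calls, benchmark):
    """Base-pair recall against the benchmark and Jaccard index of the two sets."""
    inter = overlap_length(calls, benchmark)
    bench = total_length(benchmark)
    union = total_length(calls) + bench - inter
    recall = inter / bench if bench else 0.0
    jaccard = inter / union if union else 0.0
    return recall, jaccard


# Hypothetical segments on one chromosome (illustration only).
calls = [(0, 100), (200, 300)]
benchmark = [(50, 150), (200, 250)]
print(recall_and_jaccard(calls, benchmark))
```

A segment-level definition (counting a benchmark segment as recovered when it overlaps a call by some fraction) would be an equally reasonable alternative and can give different rankings.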


2021 ◽  
Author(s):  
Salvatore Milite ◽  
Riccardo Bergamin ◽  
Giulio Caravagna

Abstract Cancers are constituted by heterogeneous populations of cells that show complex genotypes and phenotypes which we can read out by sequencing. Many attempts at deciphering the clonal process that drives these populations are focusing on single-cell technologies to resolve genetic and phenotypic intra-tumour heterogeneity. While the ideal technologies for these investigations are multi-omics assays, unfortunately these types of data are still too expensive and have limited scalability. We can resort to single-molecule assays, which are cheaper and scalable, and statistically emulate a joint assay, only if we can integrate measurements collected from independent cells of the same sample. In this work we follow this intuition and construct a new Bayesian method to genotype copy number alterations on single-cell RNA sequencing data, therefore integrating DNA and RNA measurements. Our method is unsupervised, and leverages a segmentation of the input DNA to determine the sample subclonal composition at the copy number level, together with clone-specific phenotypes defined from RNA counts. By design our probabilistic method works without a reference RNA expression profile, and therefore can be applied in cases where this information may not be accessible. We implement the method on a probabilistic backend that allows easy running on both CPUs and GPUs, and test it on both simulated and real data. Our analysis shows its ability to determine copy number associated clones and their RNA phenotypes in tumour data from 10x and Smart-Seq assays, as well as in data from the Human Cell Atlas project.


Blood ◽  
2011 ◽  
Vol 118 (21) ◽  
pp. 1437-1437
Author(s):  
Vera Binder ◽  
Christoph Bartenhagen ◽  
Vera Okpanyi ◽  
Bianca Behrens ◽  
Birte Moehlendick ◽  
...  

Abstract 1437 Introduction: Genetic heterogeneity is common not only in solid tumors, but also in leukemias. The analysis of genetic heterogeneity among single cancer cells is vital for a better understanding of cancer evolution and therapeutic failure of systemic cancer therapy. So far, comprehensive genome-wide single cell studies were limited by many technical difficulties. Here, we present a novel approach, combining adapter-linker PCR based whole genome amplification (WGA) with 2nd generation sequencing, that enables comprehensive and comparative genome-wide analysis of single leukemic cells. Methods: WGA, based on adapter-linker PCR (Klein et al PNAS 1999, Stoecklein et al Cancer Cell 2008), of three individually picked cells of the permanent leukemia cell line REH was performed. WGA products, subsequently fragmented to 100 bp or 250 bp, were used for library preparation. After loading one amplified single cell genome per flowcell, DNA was sequenced with paired end (PE) reads (2× 75 bp or 2× 100 bp respectively) on a Genome Analyzer IIx or a HiSeq 2000 (Illumina). After alignment with Burrows-Wheeler Aligner (BWA), removal of duplicate read pairs, and identification of SNPs by the Genome Analysis Toolkit (GATK), copy number variants (CNV), loss of heterozygosity (LOH) and allele dropout rates were analyzed, based on the human reference genome (hg19/GRCh37). Results were compared to data obtained by hybridizing pooled gDNA of REH cells of the same passage to a SNP 6.0 array (Affymetrix). Interchromosomal translocations were determined in single cells of the same passage of REH cells by spectral karyotyping (SKY) and compared to sequencing data, analyzed by Geometric Analysis of Structural Variants (GASV). Results: With our approach we obtained up to 600 million mappable reads per run, evenly spread over the genome, which led to a sequence coverage of up to 67%, with an even higher coverage of coding sequence (76%) and a sequence depth of 16x. 
Comparison of SNP array data with PE sequencing data showed that they are highly overlapping (99.3%) regarding the detection of normal copy numbers. For copy number alterations, too, consistency between both methods was observed in detecting losses (94.1%) or gains (77.1%) of genomic material (figure 1). Up to 97% of regions of LOH detected by sequencing were also detected by the SNP array when analyzed at a resolution of 500K bp. By analyzing the data at higher resolutions of up to 10K bp, an increasing number of regions of LOH could be detected. However, decreased correlation between SNP array and sequencing data (max. 74.5%) was observed, with high correlation between the sequencing runs (85%). This indicates increased detection of false positive LOH regions by the SNP array and suggests that the sequencing approach is superior at this high resolution. To assess the allele dropout rate as a quality control for the PCR-based WGA method, the heterozygous SNPs detected by PE sequencing were compared to those called by the SNP array. High consistency (95%) indicates an allele dropout rate of only 5%. To analyze the accuracy of our approach in detecting genetic heterogeneity between single cells, we assessed the variability in the SNP profile between the three individual cells. As they are derived from a permanent cell line, they are expected to be highly similar. In fact, the SNPs that were covered in all three sequencing runs showed a variation of less than 0.1% among the single REH cells. As the SNP array is not applicable to assess copy number neutral variations such as translocations, the karyotype of REH cells was assessed by SKY, confirming the previously described translocations t(4;12), t(4;16), t(5;12), t(16;21) and t(12;21). Breakpoint regions comparable to those defined by SKY were identified for all 5 translocations by analysis of discordant read pairs with GASV. 
Additional breakpoints identified exclusively by sequencing are currently under intensified investigation, to confirm potentially newly discovered breakpoints and reliably rule out false positive results. Conclusion: Our approach provides a powerful tool to achieve an unprecedented genome-wide overview of genomic variations of single cells. The robustness of our single cell approach in comparison to the data acquired with pooled gDNA and the homogeneity of our results in the permanent REH cell line clearly show the reliability of our approach to assess single cell heterogeneity in primary leukemic samples. Disclosures: No relevant conflicts of interest to declare.
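The abstract derives an allele dropout rate of about 5% from 95% consistency of heterozygous SNP calls between the bulk SNP array and single-cell sequencing. A minimal sketch of that arithmetic follows, assuming genotypes are stored as two-letter strings and that only array-heterozygous sites covered in the single cell are counted. The data layout, site names and genotypes are hypothetical.

```python
def allele_dropout_rate(array_genotypes, cell_genotypes):
    """Fraction of bulk-heterozygous sites that look homozygous in one cell.

    array_genotypes: dict site -> two-letter genotype from the bulk SNP array
    cell_genotypes:  dict site -> two-letter genotype called in the single cell
                     (sites without coverage are simply absent)
    """
    het_sites = [s for s, g in array_genotypes.items() if g[0] != g[1]]
    covered = [s for s in het_sites if s in cell_genotypes]
    if not covered:
        raise ValueError("no heterozygous array sites are covered in the cell")
    dropped = sum(1 for s in covered if cell_genotypes[s][0] == cell_genotypes[s][1])
    return dropped / len(covered)


# Hypothetical genotypes (illustration only).
array_calls = {"site1": "AG", "site2": "CT", "site3": "GG", "site4": "AC", "site5": "TC"}
cell_calls = {"site1": "AG", "site2": "CC", "site3": "GG", "site4": "AC"}
rate = allele_dropout_rate(array_calls, cell_calls)
print(f"allele dropout rate: {rate:.2f}, consistency: {1 - rate:.2f}")
```

Under this definition consistency and dropout sum to one, which matches the abstract's step from 95% consistency to a 5% dropout rate.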


Medicina ◽  
2021 ◽  
Vol 57 (5) ◽  
pp. 502
Author(s):  
Georgiana Gug ◽  
Caius Solovan

Background and Objectives: Mycosis fungoides (MF) and large plaque parapsoriasis (LPP) evolution provide intriguing data and are the cause of numerous debates. The diagnosis of MF and LPP is associated with confusion and imprecise definition. Copy number alterations (CNAs) may play an essential role in the genesis of cancer through dysregulation of gene expression. Objectives: Due to the heterogeneity of MF and LPP and the scarcity of cases, an exceedingly small number of studies have identified molecular changes in these pathologies. We aim to identify and compare DNA copy number alterations and gene expression changes between MF and LPP to highlight the similarities and the differences between these pathologies. Materials and Methods: The patients were prospectively selected from the University Clinic of Dermatology and Venereology Timișoara, Romania. From fresh frozen skin biopsies, we extracted DNA for single nucleotide polymorphism (SNP) array analysis. The use of SNP arrays for copy number profiling is a promising approach for genome-wide analysis. Results: After reviewing each group, we observed that the histograms generated for chromosomes 1–22 were remarkably similar and had many CNAs in common, but significant differences were also seen. Conclusions: This study took a step forward in identifying the differences and similarities between MF and LPP, for a more specific and therefore more accurate approach to each case. The similarity between these two pathologies in terms of CNAs is striking, emphasizing once again the difficulty of approaching and differentiating them.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Xinping Fan ◽  
Guanghao Luo ◽  
Yu S. Huang

Abstract Background Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. Results We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts many robust statistical techniques such as kernel smoothing of coverage differentiation information to discern signals from noise and combines ideas from time-series analysis and the signal-processing field to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the expectation–maximization algorithm, and sparse Bayesian learning were customized and built into the model. Accucopy is implemented in C++/Rust, packaged in a docker image, and supports non-human samples; more at http://www.yfish.org/software/. Conclusions We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage low-purity tumor sequencing data. Through comparative analyses in both simulated and real-sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.

