Tumor Copy Number Data Deconvolution Integrating Bulk and Single-cell Sequencing Data

Author(s):  
Haoyun Lei ◽  
Bochuan Lyu ◽  
E. Michael Gertz ◽  
Alejandro A. Schäffer ◽  
Russell Schwartz
2020 ◽  
Vol 27 (4) ◽  
pp. 565-598 ◽  
Author(s):  
Haoyun Lei ◽  
Bochuan Lyu ◽  
E. Michael Gertz ◽  
Alejandro A. Schäffer ◽  
Xulian Shi ◽  
...  

2019 ◽  
Author(s):  
Enrique I. Velazquez-Villarreal ◽  
Shamoni Maheshwari ◽  
Jon Sorenson ◽  
Ian T. Fiddes ◽  
Vijay Kumar ◽  
...  

Abstract: We performed shallow single-cell sequencing of genomic DNA across 1,475 cells from a well-studied cell line, COLO829, to resolve overall tumor complexity and clonality. This melanoma tumor line has been previously characterized by multiple technologies and provides a benchmark for evaluating somatic alterations, though it has exhibited conflicting and indeterminate copy number states. We identified at least four major sub-clones by discriminant analysis of principal components (DAPC) of single-cell copy number data. Breakpoint and loss of heterozygosity (LOH) analysis of aggregated data from sub-clones revealed a complex rearrangement of chromosomes 1, 10, and 18 that was maintained in all but two sub-clones. Likewise, two of the sub-clones were distinguished by loss of one copy of chromosome 8. Re-analysis of previous spectral karyotyping data and bulk sequencing data recapitulated these sub-clone hallmark features and explained why the original bulk sequencing experiments generated conflicting copy number results. Overall, our results demonstrate how shallow copy number profiling together with clustering analysis of single-cell sequencing can uncover significant hidden insights even in well-studied cell lines.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Patrick P. T. Leong ◽  
Aleksandar Mihajlović ◽  
Nadežda Bogdanović ◽  
Luka Breberina ◽  
Larry Xi

Abstract: Single-cell sequencing provides a new level of granularity in studying the heterogeneous nature of cancer cells. For some cancers, this heterogeneity is the result of copy number changes of genes within the cellular genomes. The ability to accurately determine such copy number changes is critical in tracing and understanding tumorigenesis. Current single-cell genome sequencing methodologies infer copy numbers using statistical approaches and then round the resulting decimal values to integers. Such methodologies are sample-dependent, have calling sensitivities that vary heavily with the sample's ploidy, and are sensitive to noise in the sequencing data. In this paper we demonstrate the concept of integer counting, using a novel bioinformatic algorithm built on our library construction chemistry to detect the discrete nature of the genome.
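
To make the ploidy dependence concrete, here is a minimal sketch of the conventional "normalize, scale, round" style of calling that the abstract contrasts itself with. It is not the authors' integer-counting algorithm; the function name `call_copy_numbers`, the mean-based normalization, and the toy read counts are all assumptions made for illustration.

```python
def call_copy_numbers(bin_counts, assumed_ploidy):
    """Toy rounding-based caller (illustrative only, not the paper's method).

    Each bin's read count is divided by the mean count, scaled by an
    assumed average ploidy, and rounded to the nearest integer.
    """
    mean_count = sum(bin_counts) / len(bin_counts)
    return [round(assumed_ploidy * c / mean_count) for c in bin_counts]


# Hypothetical read counts for four genomic bins of one cell.
counts = [100, 100, 150, 50]

# The same reads give different integer calls under different ploidy assumptions.
diploid_calls = call_copy_numbers(counts, assumed_ploidy=2)     # [2, 2, 3, 1]
tetraploid_calls = call_copy_numbers(counts, assumed_ploidy=4)  # [4, 4, 6, 2]
```

Because the read counts fix only the ratios between bins, the integer calls depend entirely on the ploidy supplied by the user or estimated from the sample, which is the sample dependence the abstract describes.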


2019 ◽  
Author(s):  
Haoyun Lei ◽  
Bochuan Lyu ◽  
E. Michael Gertz ◽  
Alejandro A. Schäffer ◽  
Xulian Shi ◽  
...  

Abstract: Characterizing intratumor heterogeneity (ITH) is crucial to understanding cancer development, but it is hampered by the limits of available data sources. Bulk DNA sequencing is the most common technology for assessing ITH, but it mixes many genetically distinct cells in each sample, which must then be computationally deconvolved. Single-cell sequencing (SCS) is a promising alternative, but its limitations (e.g., high noise, difficulty scaling to large populations, technical artifacts, and large data sets) have so far made it impractical for studying cohorts of sufficient size to identify statistically robust features of tumor evolution. We have developed strategies for deconvolution and tumor phylogenetics that combine limited amounts of bulk and single-cell data to gain some advantages of single-cell resolution at much lower cost, with a specific focus on deconvolving genomic copy number data. We developed a mixed membership model for clonal deconvolution via non-negative matrix factorization (NMF) that balances deconvolution quality against similarity to single-cell samples, solved by an associated efficient coordinate descent algorithm. We then improve on that algorithm by integrating deconvolution with clonal phylogeny inference, using a mixed integer linear programming (MILP) model to incorporate a minimum evolution phylogenetic tree cost in the problem objective. We demonstrate the effectiveness of these methods on semi-simulated data with known ground truth, showing improved deconvolution accuracy relative to bulk data alone.
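
The trade-off described in the abstract, fitting the bulk data while staying close to single-cell profiles, can be sketched as a penalized NMF. The sketch below is an assumption-laden illustration, not the authors' method: it minimizes ||B - CF||^2 + alpha * ||C - S||^2 with Lee-Seung style multiplicative updates rather than the coordinate descent or MILP formulations named above, and the symbols B (bulk copy numbers, loci x samples), C (clone copy numbers, loci x clones), F (clone fractions, clones x samples), S (single-cell reference profiles), and alpha are names chosen here.

```python
def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def objective(B, C, F, S, alpha):
    """Bulk fit ||B - C F||^2 plus alpha times the distance ||C - S||^2."""
    R = matmul(C, F)
    fit = sum((b - r) ** 2 for rb, rr in zip(B, R) for b, r in zip(rb, rr))
    sim = sum((c - s) ** 2 for rc, rs in zip(C, S) for c, s in zip(rc, rs))
    return fit + alpha * sim


def deconvolve(B, S, alpha=0.5, iters=200, eps=1e-9):
    """Penalized NMF by multiplicative updates (a sketch, not the paper's solver)."""
    k, n = len(S[0]), len(B[0])
    C = [row[:] for row in S]                   # start clones at the single-cell reference
    F = [[1.0 / k] * n for _ in range(k)]       # start with equal clone fractions
    for _ in range(iters):
        # Update fractions: F <- F * (C^T B) / (C^T C F)
        Ct = transpose(C)
        num, den = matmul(Ct, B), matmul(matmul(Ct, C), F)
        F = [[f * a / (d + eps) for f, a, d in zip(rf, ra, rd)]
             for rf, ra, rd in zip(F, num, den)]
        # Update clones: C <- C * (B F^T + alpha S) / (C F F^T + alpha C)
        Ft = transpose(F)
        num, den = matmul(B, Ft), matmul(C, matmul(F, Ft))
        C = [[c * (a + alpha * s) / (d + alpha * c + eps)
              for c, a, d, s in zip(rc, ra, rd, rs)]
             for rc, ra, rd, rs in zip(C, num, den, S)]
    return C, F


# Hypothetical toy data: 4 loci, 2 clones, 3 bulk samples.
C_true = [[2.0, 2.0], [3.0, 1.0], [1.0, 2.0], [2.0, 4.0]]
F_true = [[0.7, 0.5, 0.2], [0.3, 0.5, 0.8]]
B = matmul(C_true, F_true)            # noise-free bulk mixtures
S = [row[:] for row in C_true]        # idealized single-cell reference
C_hat, F_hat = deconvolve(B, S)
```

Larger values of `alpha` pull the inferred clones toward the single-cell profiles, while smaller values favor the bulk fit. The sketch does not constrain each column of `F_hat` to sum to one or force `C_hat` to be integer-valued; a realistic formulation would need both.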


