Ultra-Sensitive Mutation Detection and Genome-Wide DNA Copy Number Reconstruction by Error-Corrected Circulating Tumor DNA Sequencing

2018 ◽  
Vol 64 (11) ◽  
pp. 1626-1635 ◽  
Author(s):  
Sonia Mansukhani ◽  
Louise J Barber ◽  
Dimitrios Kleftogiannis ◽  
Sing Yu Moorcraft ◽  
Michael Davidson ◽  
...  

Abstract BACKGROUND Circulating free DNA sequencing (cfDNA-Seq) can portray cancer genome landscapes, but highly sensitive and specific technologies are necessary to accurately detect mutations with often low variant frequencies. METHODS We developed a customizable hybrid-capture cfDNA-Seq technology using off-the-shelf molecular barcodes and a novel duplex DNA molecule identification tool for enhanced error correction. RESULTS Modeling based on cfDNA yields from 58 patients showed that this technology, requiring 25 ng of cfDNA, could be applied to >95% of patients with metastatic colorectal cancer (mCRC). cfDNA-Seq of a 32-gene, 163.3-kbp target region detected 100% of single-nucleotide variants with 0.15% variant frequency in spike-in experiments. Molecular barcode error correction reduced false-positive mutation calls by 97.5%. In 28 consecutively analyzed patients with mCRC, 80 out of 91 mutations previously detected by tumor tissue sequencing were called in the cfDNA. Call rates were similar for point mutations and indels. cfDNA-Seq identified typical mCRC driver mutations in patients in whom biopsy sequencing had failed or did not include key mCRC driver genes. Mutations only called in cfDNA but undetectable in matched biopsies included a subclonal resistance driver mutation to anti-EGFR antibodies in KRAS, parallel evolution of multiple PIK3CA mutations in 2 cases, and TP53 mutations originating from clonal hematopoiesis. Furthermore, cfDNA-Seq off-target read analysis allowed simultaneous genome-wide copy number profile reconstruction in 20 of 28 cases. Copy number profiles were validated by low-coverage whole-genome sequencing. CONCLUSIONS This error-corrected, ultradeep cfDNA-Seq technology with a customizable target region and publicly available bioinformatics tools enables broad insights into cancer genomes and evolution. ClinicalTrials.gov Identifier NCT02112357
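
The idea behind molecular barcode error correction can be illustrated with a minimal Python sketch. This is not the authors' tool: it assumes reads have already been reduced to hypothetical `(barcode, position, base)` tuples, groups them into barcode families, and keeps a base only when the family is large enough and its members agree. The parameters `min_family_size` and `min_agreement` are illustrative assumptions, not values from the abstract.

```python
from collections import Counter, defaultdict


def consensus_calls(reads, min_family_size=3, min_agreement=0.75):
    """Collapse reads sharing a barcode and position into one consensus base.

    reads: iterable of (barcode, position, base) tuples (hypothetical format).
    Returns a dict mapping (barcode, position) to the consensus base.
    """
    # Group observed bases by the molecule (barcode) and genomic position.
    families = defaultdict(list)
    for barcode, position, base in reads:
        families[(barcode, position)].append(base)

    calls = {}
    for key, bases in families.items():
        # Small families give too little evidence to correct errors.
        if len(bases) < min_family_size:
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Keep the majority base only if enough family members support it.
        if count / len(bases) >= min_agreement:
            calls[key] = base
    return calls
```

A sequencing or PCR error that appears in only one read of a family is outvoted by the other reads from the same original molecule, which is why this kind of collapsing can remove many false-positive calls.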

2017 ◽  
Author(s):  
Sonia Mansukhani ◽  
Louise J. Barber ◽  
Sing Yu Moorcraft ◽  
Michael Davidson ◽  
Andrew Woolston ◽  
...  

Abstract Minimally invasive circulating free DNA (cfDNA) analysis can portray cancer genome landscapes, but highly sensitive and specific genetic approaches are necessary to accurately detect mutations with often low variant frequencies. We developed a targeted cfDNA sequencing technology using novel off-the-shelf molecular barcodes for error correction, in combination with custom solution hybrid capture enrichment. Modelling based on cfDNA yields from 58 patients shows that our assay, which requires 25 ng of cfDNA input, should be applicable to >95% of patients with metastatic colorectal cancer. Sequencing of a 163.3 kb target region including 32 genes detected 100% of single nucleotide variants with 0.15% variant frequency in cfDNA spike-in experiments. Molecular barcode error correction reduced false positive mutation calls by 98.6%. In a series of 28 patients with metastatic colorectal cancers, 80 out of 91 (88%) mutations previously detected by tumour tissue sequencing were called in the cfDNA. Call rates were similar for single nucleotide variants and small insertions/deletions. Mutations only called in cfDNA but not detectable in matched tumour tissue included, among others, a subclonal resistance driver mutation to anti-EGFR antibodies in the KRAS gene, multiple activating PIK3CA mutations in each of two patients (indicative of parallel evolution), and TP53 mutations originating from clonal haematopoiesis. Furthermore, we demonstrate that cfDNA off-target read analysis allows the reconstruction of genome-wide copy number aberration profiles from 71% of these 28 cases. This error-corrected ultra-deep cfDNA sequencing assay with a target region that can be readily customized enables broad insights into cancer genomes and evolution.
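
As a rough plausibility check on the 25 ng input and the 0.15% variant frequency quoted above, the following Python sketch converts a cfDNA mass into haploid genome equivalents. The figure of about 3.3 pg per haploid human genome is a commonly used approximation and an assumption here, not a value from the abstract; the function names are illustrative.

```python
# Assumption: one haploid human genome weighs roughly 3.3 picograms.
PG_PER_HAPLOID_GENOME = 3.3


def haploid_genome_equivalents(input_ng):
    """Approximate number of haploid genome copies in input_ng of DNA."""
    return input_ng * 1000.0 / PG_PER_HAPLOID_GENOME


def expected_mutant_molecules(input_ng, variant_frequency):
    """Expected mutant copies at a locus, before any loss during processing."""
    return haploid_genome_equivalents(input_ng) * variant_frequency


if __name__ == "__main__":
    copies = haploid_genome_equivalents(25)
    mutants = expected_mutant_molecules(25, 0.0015)
    print(f"{copies:.0f} haploid copies, about {mutants:.1f} mutant molecules")
```

Under that assumption, 25 ng corresponds to roughly 7,600 haploid genome copies, so a variant at 0.15% would be represented by about 11 mutant molecules at most. This is an upper bound, since some molecules are lost during library preparation and capture.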


2017 ◽  
Author(s):  
Moritz Gerstung ◽  
Clemency Jolly ◽  
Ignaty Leshchiner ◽  
Stefan C. Dentro ◽  
Santiago Gonzalez ◽  
...  

Summary Cancer develops through a process of somatic evolution. Here, we use whole-genome sequencing of 2,778 tumour samples from 2,658 donors to reconstruct the life history, evolution of mutational processes, and driver mutation sequences of 39 cancer types. The early phases of oncogenesis are driven by point mutations in a small set of driver genes, often including biallelic inactivation of tumour suppressors. Early oncogenesis is also characterised by specific copy number gains, such as trisomy 7 in glioblastoma or isochromosome 17q in medulloblastoma. By contrast, increased genomic instability, a nearly four-fold diversification of driver genes, and an acceleration of point mutation processes are features of later stages. Copy-number alterations often occur in mitotic crises leading to simultaneous gains of multiple chromosomal segments. Timing analysis suggests that driver mutations often precede diagnosis by many years, and in some cases decades, providing a window of opportunity for early cancer detection.


2021 ◽  
Vol 11 (1) ◽  
pp. 33
Author(s):  
Nayoung Han ◽  
Jung Mi Oh ◽  
In-Wha Kim

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotyping and copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean genome and epidemiology study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.


2021 ◽  
pp. 1-10
Author(s):  
Yang Ma ◽  
Jingxia Zhao ◽  
Yun Du ◽  
Rui Wang ◽  
Xiaokun Ji ◽  
...  

Objective: The aim of the study was to investigate the mutation status of multiple driver genes by RT-qPCR and their significance in advanced lung adenocarcinoma using cytological specimens. Materials and Methods: A total of 155 cytological specimens that had been diagnosed with lung adenocarcinoma in the Fourth Hospital of Hebei Medical University were selected from April to November 2019. The cytological specimens included serous cavity effusion and fine-needle aspiration biopsies. Among the cytological specimens, 108 cases were processed by using the cell block method (CBM), and 47 cases were processed by the disposable membrane cell collector method (MCM) before DNA/RNA extraction. Ten driver genes (EGFR, ALK, ROS1, BRAF, KRAS, NRAS, HER2, RET, PIK3CA, and MET) were detected together in one step by the amplification refractory mutation system and ABI 7500 RT-qPCR. Results: The purity of both RNA (p = 0.005) and DNA (p = 0.001) extracted by using the MCM was significantly higher than that extracted by using the CBM. All 47 fresh cell specimens processed by the MCM succeeded in multigene detection, while 6 of the 108 specimens processed by the CBM failed. Among 149 specimens, the single-gene mutation rates of EGFR, ALK, ROS1, RET, HER2, MET, KRAS, NRAS, BRAF, and PIK3CA were 57.71%, 6.04%, 3.36%, 2.68%, 2.01%, 2.01%, 1.34%, 0.67%, 0%, and 0%, respectively, and 6 cases carried 2 coexisting mutations. We found that mutation status was correlated with gender (p = 0.047), but not with age (p = 0.141) or smoking status (p = 0.083). We found that the EGFR mutation status was correlated with gender (p = 0.003), age (p = 0.015), and smoking habits (p = 0.007), and ALK mutation status was correlated with age (p = 0.002).
Conclusion: Compared with the CBM, the MCM can improve the efficiency of DNA/RNA extraction and PCR amplification by removing impurities and enriching tumor cells. We also speculate that the successful detection rate of fresh cytological specimens was higher than that of paraffin-embedded specimens. EGFR, ALK, and ROS1 mutations were the main driver mutations in patients with advanced lung adenocarcinoma. We speculate that EGFR and ALK are each more prone to concomitant mutations. Targeted therapies for patients with coexisting mutations need further study.


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii389-iii389
Author(s):  
Rahul Kumar ◽  
Maximilian Deng ◽  
Kyle Smith ◽  
Anthony Liu ◽  
Girish Dhall ◽  
...  

Abstract INTRODUCTION The next generation of clinical trials for relapsed medulloblastoma demands a thorough understanding of the clinical behavior of relapsed tumors as well as the molecular relationship to their diagnostic counterparts. METHODS A multi-institutional molecular cohort of patient-matched (n=126 patients) diagnostic MBs and relapses/subsequent malignancies was profiled by DNA methylation array. Entity, subgroup classification, and genome-wide copy-number aberrations were assigned while parallel next-generation (whole-exome or targeted panel) sequencing on the majority of the cohort facilitated inference of somatic driver mutations. RESULTS The cohort comprised WNT (2%), SHH (41%), Group 3 (18%), and Group 4 (39%) tumors; primary tumors retained subgroup affiliation at relapse, with the notable exception of 10% of cases. The majority (8/13) of discrepant classifications were determined to be secondary glioblastomas. Additionally, rare (n=3) subgroup-switching events of Group 4 primary tumors to Group 3 relapses were identified coincident with MYC/MYCN pathway alterations. Amongst truly relapsing MBs, copy-number analyses suggest somatic clonal divergence between primary MBs and their respective relapses, with Group 3 (55% of alterations shared) and Group 4 tumors (63% of alterations shared) sharing a larger proportion of cytogenetic alterations compared to SHH tumors (42% of alterations shared; Chi-square p-value < 0.001). Subgroup- and gene-specific patterns of conservation and divergence amongst putative driver genes were also observed. CONCLUSION Integrated molecular analysis of relapsed MB discloses potential mechanisms underlying treatment failure and disease recurrence while motivating rational implementation of relapse-specific therapies. The degree of genetic divergence between primary and relapsed MBs varied by subgroup but suggested considerably higher conservation than prior estimates.


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii351-iii351
Author(s):  
Frank Dubois ◽  
Ofer Shapira ◽  
Noah Greenwald ◽  
Travis Zack ◽  
Jessica W Tsai ◽  
...  

Abstract BACKGROUND Driver single nucleotide variants (SNVs) and somatic copy number aberrations (SCNAs) of pediatric high-grade gliomas (pHGGs), including diffuse midline gliomas (DMGs), have been characterized. However, structural variants (SVs) in pHGGs and the mechanisms through which they contribute to glioma formation have not been systematically analyzed genome-wide. METHODS Using SvABA for SVs as well as the latest pipelines for SCNAs and SNVs, we analyzed whole-genome sequencing from 174 patients. This includes 60 previously unpublished samples, 43 of which are DMGs. Signature analysis allowed us to define pHGG groups with shared SV characteristics. Significantly recurring SV breakpoints and juxtapositions were identified with algorithms we recently developed, and the findings were correlated with RNAseq and H3K27ac ChIPseq. RESULTS The SV characteristics in pHGG showed three groups defined by either complex, intermediate or simple signature activities. These associated with distinct combinations of known driver oncogenes. Our statistical analysis revealed recurring SVs in the topologically associating domains of MYCN, MYC, EGFR, PDGFRA & MET. These correlated with increased mRNA expression and amplification of H3K27ac peaks. Complex recurring amplifications showed characteristics of extrachromosomal amplicons and were enriched in coding SVs splitting protein regulatory from effector domains. Integrative analysis of all SCNAs, SNVs & SVs revealed patterns of characteristic combinations between potential drivers and signatures. This included two distinct groups of H3K27M DMGs with either complex or simple signatures and different combinations of associated variants. CONCLUSION Recurrent SVs associate with signatures shaped by an underlying process, which can lead to distinct mechanisms to activate the same oncogene.


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Hye Kyung Lee ◽  
Harold E. Smith ◽  
Chengyu Liu ◽  
Michaela Willi ◽  
Lothar Hennighausen

Abstract Deaminase base editing has emerged as a tool to install or correct point mutations in the genomes of living cells in a wide range of organisms. However, the genome-wide off-target effects introduced by base editors in the mammalian genome have been examined in only one study. Here, we have investigated the fidelity of cytosine base editor 4 (BE4) and adenine base editors (ABE) in mouse embryos using unbiased whole-genome sequencing of a family-based trio cohort. The same sgRNA was used for BE4 and ABE. We demonstrate that BE4-edited mice carry an excess of single-nucleotide variants and deletions compared to ABE-edited mice and controls. Therefore, an optimization of cytosine base editors is required to improve their fidelity. While the remarkable fidelity of ABE has implications for a wide range of applications, the occurrence of rare aberrant C-to-T conversions at specific target sites needs to be addressed.


2016 ◽  
Vol 15 ◽  
pp. CIN.S36612 ◽  
Author(s):  
Lun-Ching Chang ◽  
Biswajit Das ◽  
Chih-Jian Lih ◽  
Han Si ◽  
Corinne E. Camalier ◽  
...  

With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads that mapped to a genomic region correlated with the DNA copy number variants (CNVs). We propose a method, RefCNV, that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV prediction correlated significantly (r = 0.86–0.96) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman's coefficient = 0.82) between RefCNV estimation and publicly available CNV data in Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis.
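
The general idea of comparing an exon's observed coverage with a reference distribution can be sketched in Python as follows. This is not the RefCNV implementation: it assumes coverages are already normalized and positive, models the reference set on a log2 scale with a simple mean and standard deviation, and uses a hypothetical `z_threshold` in place of the false-positive-controlled thresholds the abstract mentions.

```python
import math
from statistics import mean, stdev


def exon_log2_ratio_and_z(observed, reference_coverages):
    """Return (log2 ratio, z-score) of one exon against a reference set.

    observed: normalized coverage of the exon in the test sample (> 0).
    reference_coverages: normalized coverages of the same exon in at least
    two reference (normal) samples, all > 0 and not all identical.
    """
    ref_logs = [math.log2(c) for c in reference_coverages]
    mu = mean(ref_logs)
    sigma = stdev(ref_logs)
    log_ratio = math.log2(observed) - mu
    return log_ratio, log_ratio / sigma


def call_cnv(observed, reference_coverages, z_threshold=3.0):
    """Label an exon as 'gain', 'loss' or 'neutral' (illustrative rule)."""
    _, z = exon_log2_ratio_and_z(observed, reference_coverages)
    if z >= z_threshold:
        return "gain"
    if z <= -z_threshold:
        return "loss"
    return "neutral"
```

Raising `z_threshold` trades sensitivity for fewer false-positive calls, which mirrors the role of the thresholds described in the abstract.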


Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 228-228
Author(s):  
Joachim Kunz ◽  
Tobias Rausch ◽  
Obul R Bandapalli ◽  
Martina U. Muckenthaler ◽  
Adrian M Stuetz ◽  
...  

Abstract Acute precursor T-lymphoblastic leukemia (T-ALL) remains a serious challenge in pediatric oncology, because relapses carry a particularly poor prognosis with high rates of induction failure and death despite generally excellent treatment responses of the initial disease. It is critical, therefore, to understand the molecular evolution of pediatric T-ALL and to elucidate the mechanisms leading to T-ALL relapse and to understand the differences in treatment response between the two phases of the disease. We have thus subjected DNA from bone marrow samples obtained at the time of initial diagnosis, remission and relapse of 14 patients to whole exome sequencing (WES). Eleven patients suffered from early relapse (duration of remission 6-19 months) and 3 patients from late relapse (duration of remission 29-46 months). The Agilent SureSelect Target Enrichment Kit was used to capture human exons for deep sequencing. The captured fragments were sequenced as 100 bp paired reads using an Illumina HiSeq2000 sequencing instrument. All sequenced DNA reads were preprocessed using Trimmomatic (Lohse et al., Nucl. Acids Res., 2012) to clip adapter contaminations and to trim reads for low quality bases. The remaining reads greater than 36 bp were mapped to build hg19 of the human reference genome with Stampy (Lunter & Goodson, Genome Res. 2011), using default parameters. Following such preprocessing, the number of mapped reads was >95% for all samples. Single-nucleotide variants (SNVs) were called using SAMtools mpileup (Li et al., Bioinformatics, 2009). The number of exonic SNVs varied between 23,741 and 31,418 per sample. To facilitate a fast classification and identification of candidate driver mutations, all identified coding SNVs were comprehensively annotated using the ANNOVAR framework (Wang et al., Nat. Rev. Genet., 2010).
To identify possible somatic driver mutations, candidate SNVs were filtered for non-synonymous, stopgain or stoploss SNVs, requiring an SNV quality greater than or equal to 50, and requiring absence of segmental duplications. Leukemia-specific mutations were identified by filtering against the corresponding remission sample and validated by Sanger sequencing of the genomic DNA following PCR amplification. We identified on average 9.3 somatic single nucleotide variants (SNVs) and 0.6 insertions and deletions (indels) per patient sample at the time of initial diagnosis and 21.7 SNVs and 0.3 indels in relapse. On average, 6.3 SNVs were detected both at the time of initial diagnosis and in relapse. These SNVs were thus defined as leukemia specific. In addition to SNVs, we have also estimated the frequency of copy number variations (CNVs) at low resolution. Apart from the deletions resulting from T-cell receptor rearrangement, we identified on average for each patient 0.7 copy number gains and 2.2 copy number losses at the time of initial diagnosis and 0.5 copy number gains and 2.4 copy number losses in relapse. We detected 24/27 copy number alterations both in initial diagnosis and in relapse. The most common CNV detected was the CDKN2A/B deletion on chromosome 9p. Nine genes were recurrently mutated in 2 or more patients, thus indicating the functional leukemogenic potential of these SNVs in T-ALL. These recurrent mutations included known oncogenes (NOTCH1), tumor suppressor genes (FBXW7, PHF6, WT1) and genes conferring drug resistance (NT5C2). In several patients one gene (such as NOTCH1, PHF6, WT1) carried different mutations at the time of initial diagnosis and in relapse, indicating that the major leukemic clone had been eradicated by primary treatment, but that a minor clone had persisted and expanded during relapse.
The types of mutations did not differ significantly between mutations that were either already present at diagnosis or those that were newly acquired in relapse, indicating that the treatment did not cause specific genomic damage. We will further characterize the clonal evolution of T-ALL into relapse by targeted re-sequencing at high depth of genes with either relapse specific or initial-disease specific mutations. In conclusion, T-ALL relapse differs from primary disease by a higher number of leukemogenic SNVs without gross genomic instability resulting in large CNVs. Disclosures: No relevant conflicts of interest to declare.
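
The filtering steps described in this abstract (functional class, quality of at least 50, no segmental duplication, absence from the matched remission sample) can be sketched in Python. This is an illustrative sketch, not the authors' pipeline: the `Snv` record, its field names, and the effect labels are hypothetical stand-ins for annotated variant-caller output.

```python
from dataclasses import dataclass

# Functional classes retained as candidate drivers, as listed in the abstract.
# The exact label strings are an assumption for this sketch.
KEPT_EFFECTS = {"nonsynonymous", "stopgain", "stoploss"}


@dataclass(frozen=True)
class Snv:
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str
    quality: float
    in_segmental_duplication: bool

    def key(self):
        """Identity of the variant, independent of per-sample annotations."""
        return (self.chrom, self.pos, self.ref, self.alt)


def candidate_somatic_snvs(sample_snvs, remission_snvs, min_quality=50):
    """Keep SNVs passing all filters and absent from the remission sample."""
    remission_keys = {v.key() for v in remission_snvs}
    return [
        v
        for v in sample_snvs
        if v.effect in KEPT_EFFECTS
        and v.quality >= min_quality
        and not v.in_segmental_duplication
        and v.key() not in remission_keys
    ]
```

Running the same function on the diagnosis and relapse samples and intersecting the resulting keys would give the variants shared between both time points.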


2018 ◽  
Author(s):  
Paul Ashford ◽  
Camilla S.M. Pang ◽  
Aurelio A. Moya-García ◽  
Tolulope Adeyelu ◽  
Christine A. Orengo

Tumour sequencing identifies highly recurrent point mutations in cancer driver genes, but rare functional mutations are hard to distinguish from large numbers of passengers. We developed a novel computational platform applying a multi-modal approach to filter out passengers and more robustly identify putative driver genes. The primary filter identifies enrichment of cancer mutations in CATH functional families (CATH-FunFams) – structurally and functionally coherent sets of evolutionarily related domains. Using structural representatives from CATH-FunFams, we subsequently seek enrichment of mutations in 3D and show that these mutation clusters have a very significant tendency to lie close to known functional sites or conserved sites predicted using CATH-FunFams. Our third filter identifies enrichment of putative driver genes in functionally coherent protein network modules confirmed by literature analysis to be cancer associated. Our approach is complementary to other domain enrichment approaches exploiting Pfam families, but benefits from more functionally coherent groupings of domains. Using a set of mutations from 22 cancers, we detect 151 putative cancer drivers, of which 79 are not listed in cancer resources and include recently validated cancer genes EPHA7, DCC netrin-1 receptor and zinc-finger protein ZNF479.

