Ultra-sensitive mutation detection and genome-wide DNA copy number reconstruction by error corrected circulating tumour DNA sequencing

2017 ◽  
Author(s):  
Sonia Mansukhani ◽  
Louise J. Barber ◽  
Sing Yu Moorcraft ◽  
Michael Davidson ◽  
Andrew Woolston ◽  
...  

Abstract Minimally invasive circulating free DNA (cfDNA) analysis can portray cancer genome landscapes but highly sensitive and specific genetic approaches are necessary to accurately detect mutations with often low variant frequencies. We developed a targeted cfDNA sequencing technology using novel off-the-shelf molecular barcodes for error correction, in combination with custom solution hybrid capture enrichment. Modelling based on cfDNA yields from 58 patients shows that our assay, which requires 25 ng of cfDNA input, should be applicable to >95% of patients with metastatic colorectal cancer. Sequencing of a 163.3 kb target region including 32 genes detected 100% of single nucleotide variants with 0.15% variant frequency in cfDNA spike-in experiments. Molecular barcode error correction reduced false positive mutation calls by 98.6%. In a series of 28 patients with metastatic colorectal cancers, 80 out of 91 (88%) mutations previously detected by tumour tissue sequencing were called in the cfDNA. Call rates were similar for single nucleotide variants and small insertions/deletions. Mutations only called in cfDNA but not detectable in matched tumour tissue included, among others, a subclonal resistance driver mutation to anti-EGFR antibodies in the KRAS gene, multiple activating PIK3CA mutations in each of two patients (indicative of parallel evolution), and TP53 mutations originating from clonal haematopoiesis. Furthermore, we demonstrate that cfDNA off-target read analysis allows the reconstruction of genome-wide copy number aberration profiles from 71% of these 28 cases. This error-corrected ultra-deep cfDNA sequencing assay with a target region that can be readily customized enables broad insights into cancer genomes and evolution.
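The figures quoted in this abstract can be sanity-checked with a little arithmetic. The sketch below is only a worked example: the helper name `percent` is hypothetical, the case count is inferred from the stated 71% of 28, and the conversion of about 3.3 pg of DNA per haploid human genome is a commonly quoted approximation that does not come from the abstract.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Tissue-detected mutations that were also called in cfDNA (80 of 91).
concordance = percent(80, 91)

# Number of cases with a reconstructed copy number profile (71% of 28).
cna_cases = round(0.71 * 28)

# ASSUMPTION: ~3.3 pg of DNA per haploid human genome (approximation,
# not stated in the abstract).
PG_PER_HAPLOID_GENOME = 3.3
input_pg = 25 * 1000  # 25 ng of cfDNA expressed in picograms
genome_equivalents = input_pg / PG_PER_HAPLOID_GENOME

# Expected number of mutant molecules at 0.15% variant frequency.
expected_mutant_copies = genome_equivalents * 0.0015

print(f"Concordance: {concordance:.1f}%")
print(f"Cases with copy number profile: {cna_cases}")
print(f"Haploid genome equivalents in 25 ng: {genome_equivalents:.0f}")
print(f"Expected mutant copies at 0.15%: {expected_mutant_copies:.1f}")
```

Under that assumption, a 0.15% variant in 25 ng of input corresponds to only about a dozen mutant molecules, which illustrates why input amount and error correction both matter at this sensitivity.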

2018 ◽  
Vol 64 (11) ◽  
pp. 1626-1635 ◽  
Author(s):  
Sonia Mansukhani ◽  
Louise J Barber ◽  
Dimitrios Kleftogiannis ◽  
Sing Yu Moorcraft ◽  
Michael Davidson ◽  
...  

Abstract BACKGROUND Circulating free DNA sequencing (cfDNA-Seq) can portray cancer genome landscapes, but highly sensitive and specific technologies are necessary to accurately detect mutations with often low variant frequencies. METHODS We developed a customizable hybrid-capture cfDNA-Seq technology using off-the-shelf molecular barcodes and a novel duplex DNA molecule identification tool for enhanced error correction. RESULTS Modeling based on cfDNA yields from 58 patients showed that this technology, requiring 25 ng of cfDNA, could be applied to >95% of patients with metastatic colorectal cancer (mCRC). cfDNA-Seq of a 32-gene, 163.3-kbp target region detected 100% of single-nucleotide variants, with 0.15% variant frequency in spike-in experiments. Molecular barcode error correction reduced false-positive mutation calls by 97.5%. In 28 consecutively analyzed patients with mCRC, 80 out of 91 mutations previously detected by tumor tissue sequencing were called in the cfDNA. Call rates were similar for point mutations and indels. cfDNA-Seq identified typical mCRC driver mutations in patients in whom biopsy sequencing had failed or did not include key mCRC driver genes. Mutations only called in cfDNA but undetectable in matched biopsies included a subclonal resistance driver mutation to anti-EGFR antibodies in KRAS, parallel evolution of multiple PIK3CA mutations in 2 cases, and TP53 mutations originating from clonal hematopoiesis. Furthermore, cfDNA-Seq off-target read analysis allowed simultaneous genome-wide copy number profile reconstruction in 20 of 28 cases. Copy number profiles were validated by low-coverage whole-genome sequencing. CONCLUSIONS This error-corrected, ultradeep cfDNA-Seq technology with a customizable target region and publicly available bioinformatics tools enables broad insights into cancer genomes and evolution. ClinicalTrials.gov Identifier NCT02112357
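The general idea behind molecular barcode error correction can be sketched in a few lines. This is a generic illustration, not the authors' duplex identification tool: the function name `consensus_by_barcode`, the thresholds, and the simplified read representation (barcode, position, base) are all assumptions made for the example. Reads sharing a barcode are treated as copies of one original molecule, so an isolated sequencing or PCR error is outvoted by the rest of its family.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=3, min_agreement=0.7):
    """Collapse reads into one consensus base per (barcode, position).

    reads: iterable of (barcode, position, base) tuples. This is a
    simplified, hypothetical representation; real pipelines also group
    by alignment coordinates and handle base qualities.
    """
    families = defaultdict(list)
    for barcode, position, base in reads:
        families[(barcode, position)].append(base)

    consensus = {}
    for key, bases in families.items():
        # Families with too few reads cannot be error corrected reliably.
        if len(bases) < min_family_size:
            continue
        base, count = Counter(bases).most_common(1)[0]
        # Keep the majority base only if enough reads agree on it.
        if count / len(bases) >= min_agreement:
            consensus[key] = base
    return consensus


reads = [
    # Family AACGT: four concordant reads.
    ("AACGT", 100, "A"), ("AACGT", 100, "A"),
    ("AACGT", 100, "A"), ("AACGT", 100, "A"),
    # Family GGTCA: one discordant read (G) is outvoted by three T reads.
    ("GGTCA", 100, "T"), ("GGTCA", 100, "T"),
    ("GGTCA", 100, "T"), ("GGTCA", 100, "G"),
    # Family CCATG: a single read, too small to form a consensus.
    ("CCATG", 100, "G"),
]

result = consensus_by_barcode(reads)
print(result)
```

In this toy input, the lone `G` in the `GGTCA` family is discarded as a likely error, and the singleton `CCATG` read is dropped because it cannot be corroborated.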


2021 ◽  
Vol 11 (1) ◽  
pp. 33
Author(s):  
Nayoung Han ◽  
Jung Mi Oh ◽  
In-Wha Kim

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotyping with copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean Genome and Epidemiology Study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii351-iii351
Author(s):  
Frank Dubois ◽  
Ofer Shapira ◽  
Noah Greenwald ◽  
Travis Zack ◽  
Jessica W Tsai ◽  
...  

Abstract BACKGROUND Driver single nucleotide variants (SNVs) and somatic copy number aberrations (SCNAs) of pediatric high-grade gliomas (pHGGs), including Diffuse Midline Gliomas (DMGs), are characterized. However, structural variants (SVs) in pHGGs and the mechanisms through which they contribute to glioma formation have not been systematically analyzed genome-wide. METHODS Using SvABA for SVs as well as the latest pipelines for SCNAs and SNVs, we analyzed whole-genome sequencing from 174 patients. This includes 60 previously unpublished samples, 43 of which are DMGs. Signature analysis allowed us to define pHGG groups with shared SV characteristics. Significantly recurring SV breakpoints and juxtapositions were identified with algorithms we recently developed, and the findings were correlated with RNAseq and H3K27ac ChIPseq. RESULTS The SV characteristics in pHGG showed three groups defined by either complex, intermediate or simple signature activities. These associated with distinct combinations of known driver oncogenes. Our statistical analysis revealed recurring SVs in the topologically associating domains of MYCN, MYC, EGFR, PDGFRA & MET. These correlated with increased mRNA expression and amplification of H3K27ac peaks. Complex recurring amplifications showed characteristics of extrachromosomal amplicons and were enriched in coding SVs splitting protein regulatory from effector domains. Integrative analysis of all SCNAs, SNVs & SVs revealed patterns of characteristic combinations between potential drivers and signatures. This included two distinct groups of H3K27M DMGs with either complex or simple signatures and different combinations of associated variants. CONCLUSION Recurrent SVs associate with signatures shaped by an underlying process, which can lead to distinct mechanisms to activate the same oncogene.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Sebastian Carrasco Pro ◽  
Katia Bulekova ◽  
Brian Gregor ◽  
Adam Labadorf ◽  
Juan Ignacio Fuxman Bass

Abstract Single nucleotide variants (SNVs) located in transcriptional regulatory regions can result in gene expression changes that lead to adaptive or detrimental phenotypic outcomes. Here, we predict gain or loss of binding sites for 741 transcription factors (TFs) across the human genome. We calculated ‘gainability’ and ‘disruptability’ scores for each TF that represent the likelihood of binding sites being created or disrupted, respectively. We found that functional cis-eQTL SNVs are more likely to alter TF binding sites than rare SNVs in the human population. In addition, we show that cancer somatic mutations have different effects on TF binding sites from different TF families on a cancer-type basis. Finally, we discuss the relationship between these results and cancer mutational signatures. Altogether, we provide a blueprint to study the impact of SNVs derived from genetic variation or disease association on TF binding to gene regulatory regions.


2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Leandro de Araújo Lima ◽  
Ana Cecília Feio-dos-Santos ◽  
Sintia Iole Belangero ◽  
Ary Gadelha ◽  
Rodrigo Affonseca Bressan ◽  
...  

Abstract Many studies have attempted to investigate the genetic susceptibility of Attention-Deficit/Hyperactivity Disorder (ADHD), but without much success. The present study aimed to analyze both single-nucleotide and copy-number variants contributing to the genetic architecture of ADHD. We generated exome data from 30 Brazilian trios with sporadic ADHD. We also analyzed a Brazilian sample of 503 child/adolescent controls from a High Risk Cohort Study for the Development of Childhood Psychiatric Disorders, as well as previously published results of five CNV studies and one GWAS meta-analysis of ADHD involving children/adolescents. The results from the Brazilian trios showed that cases with de novo SNVs tend not to have de novo CNVs and vice-versa. Although the sample size is small, we could also see that various comorbidities are more frequent in cases with only inherited variants. Moreover, using only genes expressed in brain, we constructed two “in silico” protein-protein interaction networks, one with genes from any analysis, and the other with genes with hits in two analyses. Topological and functional analyses of genes in this network uncovered genes related to synapse, cell adhesion, glutamatergic and serotoninergic pathways, both confirming findings of previous studies and capturing new genes and genetic variants in these pathways.


2020 ◽  
Author(s):  
Celine Charon ◽  
Rodrigue Allodji ◽  
Vincent Meyer ◽  
Jean-François Deleuze

Abstract Quality control methods for genome-wide association studies and fine mapping are commonly used for imputation; however, they result in loss of many single nucleotide polymorphisms (SNPs). To investigate the consequences of filtration on imputation, we studied the direct effects on the number of markers, their allele frequencies, imputation quality scores and post-filtration events. We pre-phased 1,031 genotyped individuals from diverse ethnicities and compared the imputed variants to 1,089 NCBI recorded individuals for additional validation. Without variant pre-filtration based on quality control (QC), we observed no impairment in the imputation of SNPs that failed QC, whereas with pre-filtration there was an overall loss of information. Significant differences between frequencies with and without pre-filtration were found only in the range of very rare (5E-04-1E-03) and rare variants (1E-03-5E-03) (p < 1E-04). Increasing the post-filtration imputation quality score from 0.3 to 0.8 reduced the number of single nucleotide variants (SNVs) <0.001 by 2.5-fold with or without QC pre-filtration and halved the number of very rare variants (5E-04). As a result, to maintain confidence and enough SNVs, we propose here a 2-step post-filtration approach to increase the number of very rare and rare variants compared to conservative post-filtration methods.


ESC CardioMed ◽  
2018 ◽  
pp. 669-671
Author(s):  
Eric Schulze-Bahr

The human genome consists of approximately 3 billion (3 × 10^9) base pairs of DNA (around 20,000 genes), organized as 23 chromosomes (diploid parental set), and a small mitochondrial genome (37 genes, including 13 proteins; 16,569 base pairs) of maternal origin. Most human genetic variation is natural, that is, common or rare (minor allele frequency >0.1%), and does not cause disease. Apart from any true disease-causing (bona fide) mutation, each individual genome harbours more than 3.5 million single nucleotide variants (including >10,000 non-synonymous changes causing amino acid substitutions) and 200–300 large structural or copy number variants (insertions/deletions, up to several thousands of base pairs) that are non-disease-causing variations scattered throughout coding and non-coding genomic regions.


2020 ◽  
Vol 6 (22) ◽  
pp. eaaz7835 ◽  
Author(s):  
Sungwon Jeon ◽  
Youngjune Bhak ◽  
Yeonsong Choi ◽  
Yeonsu Jeon ◽  
Seunghoon Kim ◽  
...  

We present the initial phase of the Korean Genome Project (Korea1K), including 1094 whole genomes (sequenced at an average depth of 31×), along with data of 79 quantitative clinical traits. We identified 39 million single-nucleotide variants and indels of which half were singleton or doubleton and detected Korean-specific patterns based on several types of genomic variations. A genome-wide association study illustrated the power of whole-genome sequences for analyzing clinical traits, identifying nine more significant candidate alleles than previously reported from the same linkage disequilibrium blocks. Also, Korea1K, as a reference, showed better imputation accuracy for Koreans than the 1KGP panel. As proof of utility, germline variants in cancer samples could be filtered out more effectively when the Korea1K variome was used as a panel of normals compared to non-Korean variome sets. Overall, this study shows that Korea1K can be a useful genotypic and phenotypic resource for clinical and ethnogenetic studies.


2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Hye Kyung Lee ◽  
Harold E. Smith ◽  
Chengyu Liu ◽  
Michaela Willi ◽  
Lothar Hennighausen

Abstract Deaminase base editing has emerged as a tool to install or correct point mutations in the genomes of living cells in a wide range of organisms. However, the genome-wide off-target effects introduced by base editors in the mammalian genome have been examined in only one study. Here, we have investigated the fidelity of cytosine base editor 4 (BE4) and adenine base editors (ABE) in mouse embryos using unbiased whole-genome sequencing of a family-based trio cohort. The same sgRNA was used for BE4 and ABE. We demonstrate that BE4-edited mice carry an excess of single-nucleotide variants and deletions compared to ABE-edited mice and controls. Therefore, an optimization of cytosine base editors is required to improve their fidelity. While the remarkable fidelity of ABE has implications for a wide range of applications, the occurrence of rare aberrant C-to-T conversions at specific target sites needs to be addressed.

