Targeted Detection of Copy Number Variants Using a Myeloid Malignancy Next Generation Sequencing Mutation Panel Allows Comprehensive Genetic Analysis Using a Single Testing Method

Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 2887-2887 ◽  
Author(s):  
Wei Shen ◽  
Philippe Szankasi ◽  
Maria Sederberg ◽  
Jonathan Schumacher ◽  
Kimberly Frizzell ◽  
...  

Abstract Introduction Myeloid malignancies are clonal disorders of hematopoietic stem and progenitor cells that include myelodysplastic syndromes (MDS), myeloproliferative neoplasms (MPN), myelodysplastic/myeloproliferative (MDS/MPN) overlap neoplasms, and acute myeloid leukemia (AML). Next generation sequencing (NGS) studies have identified a number of recurrently mutated genes that have diagnostic and/or prognostic significance in these disorders. Chromosomal copy number variations (CNVs), including deletions at 5q, 7q, 12p and 17p as well as trisomy 8, are another major type of recurrent genetic alteration with clinical significance in myeloid malignancies. Detection of CNVs has traditionally required specialized testing methods such as cytogenetics/FISH and/or array-based platforms. Thus, comprehensive genetic profiling of myeloid malignancies requires multiple testing strategies at high cost. In an effort to provide more efficient genetic profiling of these disorders, we designed and tested an algorithm to evaluate for CNVs using sequence coverage data derived from an NGS-based 53-gene myeloid mutation panel, with the goal of obtaining information on both mutations and CNVs from a single test. Methods The sample cohort included 73 MDS patients, 36 patients with MDS/MPN neoplasms, 70 MPN patients, and 91 AML patients (n=270 total cases). Genomic DNA was extracted from bone marrow or peripheral blood, enriched for regions of interest by solution capture (SureSelect, Agilent), and then sequenced on the Illumina MiSeq, HiSeq 2000 or NextSeq NGS platforms. Gene variants were identified using the software programs FreeBayes, for single nucleotide variants and small insertions/deletions, and Pindel, for larger insertions/deletions. To detect CNVs in the targeted regions, the read coverage data were converted to a Log2 ratio by comparing the normalized sample coverage to that obtained from a pool of normal controls. 
CNVs were detected using a circular binary segmentation algorithm. In a subset of cases (n=43), CNVs detected using NGS data were validated by comparison with the results of SNP microarray testing (CytoScan HD Array, Affymetrix), the current gold standard, analyzed with CHAS 2.0 (Affymetrix) and Nexus 7.5 (Biodiscovery). KMT2A (MLL) partial tandem duplications detected by NGS analysis were confirmed by quantitative PCR. Comparisons of proportions were performed by Fisher's exact test. Results In the entire cohort of 270 cases, we detected pathogenic mutations in 208 cases (77%). ASXL1 (n=64), SRSF2 (n=40), TET2 (n=39) and DNMT3A (n=37) were among the most frequently mutated genes, as has previously been shown. For targeted CNV analysis, seven cases were excluded due to inadequate normalization of the read coverages. In the validation set of 43 cases, all of the targeted CNVs detected by NGS were confirmed by SNP microarray analysis (Figure 1A). Overall, we detected targeted CNVs in 68 cases (25.8%; AML n=32, MDS n=16, MDS/MPN n=9, MPN n=11). The most frequent CNVs were 7q deletion of a region including the genes LUC7L1 and EZH2 (n=21), TP53 deletion (n=9), ETV6 deletion (n=8), gain of RAD21 (possible trisomy 8) (n=8), and 5q deletion of a region including the genes NSD1 and NPM1 (n=4). In addition, we were able to detect exon-level duplications, the so-called KMT2A partial tandem duplication (also known as MLL-PTD), in 9 cases (Figure 1B). In the 63 cases that were negative by mutation analysis (MDS n=26, AML n=17, MDS/MPN n=5, MPN n=15), targeted CNVs including 7q deletion were observed in 4 cases (6%) (MDS n=3, AML n=1). In addition, targeted CNV analysis detected TP53 deletion in 3 TP53-non-mutated cases and in 6 TP53-mutated cases, and TET2 deletion in 2 TET2-non-mutated cases and in 2 TET2-mutated cases. 
When we investigated the association between gene mutations and targeted CNVs, we found that ETV6 deletion was strongly associated with TP53 alterations (both mutation and gene deletion; p<0.001) and that 7q deletion was associated with mutations in TP53, KRAS and IDH1 (p=0.000073, 0.009 and 0.026, respectively). Conclusion Our results demonstrated the feasibility of using the same NGS data to detect both somatic mutations and targeted CNVs with enhanced efficiency and potentially lower costs compared to classical methods. Figure 1. Examples of targeted CNVs detected by NGS and comparison to SNP microarray analysis. Disclosures South: Affymetrix: Consultancy, Honoraria; ARUP Laboratories: Employment; Lineagen Corporation: Consultancy; Illumina: Consultancy, Honoraria.
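
The coverage normalization described in the Methods above can be illustrated with a minimal sketch. It assumes per-target read depths are already available as plain lists. The function names, the median-based depth normalization and the pseudocount are illustrative assumptions and are not details given in the abstract. The circular binary segmentation step is not shown.

```python
import math
from statistics import median

def normalize(depths):
    """Scale per-target depths by the sample's median depth (assumed normalization)."""
    m = median(depths)
    return [d / m for d in depths]

def log2_ratios(sample_depths, control_depths_list, pseudo=1e-6):
    """Log2 ratio of normalized sample coverage to the mean normalized
    coverage of a pool of normal controls, one value per target."""
    sample = normalize(sample_depths)
    controls = [normalize(c) for c in control_depths_list]
    ratios = []
    for i, s in enumerate(sample):
        pool_mean = sum(c[i] for c in controls) / len(controls)
        # The pseudocount (an assumption) avoids log2(0) on uncovered targets.
        ratios.append(math.log2((s + pseudo) / (pool_mean + pseudo)))
    return ratios

# Toy data: five targets; the fourth target has half the expected depth.
controls = [[100, 200, 150, 120, 180], [110, 220, 165, 132, 198]]
sample = [100, 200, 150, 60, 180]
ratios = log2_ratios(sample, controls)
```

In this toy input the fourth target comes out close to a Log2 ratio of -1, the value expected for a heterozygous deletion in a pure sample, while the other targets stay near 0.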

2017 ◽  
Vol 71 (4) ◽  
pp. 372-378 ◽  
Author(s):  
Wei Shen ◽  
Christian N Paxton ◽  
Philippe Szankasi ◽  
Maria Longhurst ◽  
Jonathan A Schumacher ◽  
...  

Aims: Genetic abnormalities, including copy number variants (CNV), copy number neutral loss of heterozygosity (CN-LOH) and gene mutations, underlie the pathogenesis of myeloid malignancies and serve as important diagnostic, prognostic and/or therapeutic markers. Currently, multiple testing strategies are required for comprehensive genetic testing in myeloid malignancies. The aim of this proof-of-principle study was to investigate the feasibility of combining detection of genome-wide large CNVs, CN-LOH and targeted gene mutations into a single assay using next-generation sequencing (NGS). Methods: For genome-wide CNV detection, we designed a single nucleotide polymorphism (SNP) sequencing backbone with 22 762 SNP regions evenly distributed across the entire genome. For targeted mutation detection, 62 frequently mutated genes in myeloid malignancies were targeted. We combined this SNP sequencing backbone with a targeted mutation panel, and sequenced 9 healthy individuals and 16 patients with myeloid malignancies using NGS. Results: We detected 52 somatic CNVs, 11 instances of CN-LOH and 39 oncogenic mutations in the 16 patients with myeloid malignancies, and none in the 9 healthy individuals. All CNVs and CN-LOH were confirmed by SNP microarray analysis. Conclusions: We describe a genome-wide SNP sequencing backbone which allows for sensitive detection of genome-wide CNVs and CN-LOH using NGS. This proof-of-principle study has demonstrated that this strategy can provide more comprehensive genetic profiling for patients with myeloid malignancies using a single assay.


2020 ◽  
Author(s):  
Yoo-Jin Kim ◽  
SeungHyun Jung ◽  
Eun-Hye Hur ◽  
Eun-Ji Choi ◽  
Kyoo-Hyung Lee ◽  
...  

Abstract Background: Recent advancements in next-generation sequencing (NGS) technologies allow the simultaneous identification of targeted copy number alterations (CNAs) as well as somatic mutations using the same panel-based NGS data. We investigated whether CNAs detected from the targeted NGS data provided clinical information beyond that from somatic mutations in myelodysplastic syndrome (MDS). Methods: Targeted deep sequencing of 28 well-known MDS-related genes was performed for 266 patients with MDS. Results: Overall, 215 (80.8%) patients were found to have at least one somatic mutation; 67 (25.2%) had at least one CNA; 227 (85.3%) had either a somatic mutation or CNA; 160 had somatic mutations without CNA; and 12 had CNA without somatic mutations. Considering the clinical variables and somatic mutations alone, multivariate analysis demonstrated that sex, revised International Prognostic Scoring System (IPSS-R) and NRAS and TP53 mutations were independent prognostic factors for overall survival. For AML-free survival, these factors were sex, IPSS-R, and mutations in NRAS, DNMT3A, and complex karyotype/TP53 mutations. When we considered clinical variables along with somatic mutations and CNAs, genetic alterations in TET2, LAMB4, U2AF1, and CBL showed an additional significant impact on survival. Conclusions: Our study suggests that the concurrent detection of somatic mutations and targeted CNAs may provide clinically useful information for the prognosis of MDS patients.
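
The patient counts reported in the Results above are mutually consistent, which a short check makes explicit. The variable names are illustrative; the numbers are taken directly from the abstract.

```python
total = 266          # patients with MDS
with_mutation = 215  # at least one somatic mutation
with_cna = 67        # at least one CNA
mutation_only = 160  # somatic mutations without CNA
cna_only = 12        # CNA without somatic mutations

both = with_mutation - mutation_only       # patients with a mutation and a CNA
either = mutation_only + cna_only + both   # patients with at least one alteration

# The overlap derived from the CNA side must agree with the mutation side.
assert both == with_cna - cna_only

pct = lambda n: round(100 * n / total, 1)  # percentage of the cohort
```

Here `both` is the number of patients carrying a mutation and a CNA together, a figure the abstract does not state but which follows from the reported counts.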


2021 ◽  
Vol 11 ◽  
Author(s):  
Kun Xie ◽  
Ye Tian ◽  
Xiguo Yuan

Copy number variation (CNV) is a common type of structural variation in the human genome and is implicated in complex human diseases. Detection of CNVs is an important step for a systematic analysis of CNVs in medical research of complex diseases. The recent development of next-generation sequencing (NGS) platforms provides unprecedented opportunities for the detection of CNVs at a base-level resolution. However, due to the intrinsic characteristics of NGS data, accurate detection of CNVs is still a challenging task. In this article, we propose a new density peak-based method, called dpCNV, for the detection of CNVs from NGS data. The dpCNV algorithm is based on the density peak clustering algorithm. It extracts two features, i.e., local density and minimum distance, from the sequencing read depth (RD) profile and generates a two-dimensional dataset. Based on the generated data, a two-dimensional null distribution is constructed to test the significance of each genome bin, and the significant genome bins are then declared as CNVs. We test the performance of the dpCNV method on a number of simulated datasets and compare it with several existing methods. The experimental results demonstrate that our proposed method outperforms the others in terms of sensitivity and F1-score. We further apply it to a set of real sequencing samples, and the results demonstrate the validity of dpCNV. Therefore, we expect that dpCNV can be used as a supplement to existing methods and may become a routine tool in the field of genome mutation analysis.
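
The two features named above, local density and minimum distance, come from density peak clustering and can be sketched on a toy read depth profile. This is a generic illustration, not the dpCNV implementation: the function name, the cutoff value and the use of absolute RD difference as the distance are assumptions, and the two-dimensional null distribution test is not reproduced.

```python
def density_peak_features(rd, cutoff):
    """Return (local_density, min_distance) for each bin's read depth.

    local_density[i]: number of other bins whose RD lies within `cutoff` of rd[i].
    min_distance[i]:  distance to the nearest bin with a higher local density;
                      for the densest bins, the largest distance to any bin.
    """
    n = len(rd)
    rho = [sum(1 for j in range(n) if j != i and abs(rd[i] - rd[j]) < cutoff)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [abs(rd[i] - rd[j]) for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            delta.append(max(abs(rd[i] - rd[j]) for j in range(n)))
    return rho, delta

# Toy RD profile: most bins near 100, one bin at 200 (a candidate gain).
rd = [98, 100, 101, 99, 102, 200, 100, 97]
rho, delta = density_peak_features(rd, cutoff=5)
```

In this toy profile the bin at 200 has the lowest local density together with a large minimum distance, the combination that sets it apart from the bulk of the bins.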


2021 ◽  
Vol 12 ◽  
Author(s):  
Yang Guo ◽  
Shuzhen Wang ◽  
Xiguo Yuan

Copy number variation (CNV) is a genomic mutation that plays an important role in tumor evolution and tumorigenesis. Accurate detection of CNVs from next-generation sequencing (NGS) data is still a challenging task due to artifacts such as unevenly mapped reads and unbalanced amplitudes of gains and losses. This study proposes a new approach called HBOS-CNV to detect CNVs from NGS data. The central point of HBOS-CNV is that it uses a new statistic, the histogram-based outlier score (HBOS), to evaluate the fluctuation of genome bins and determine those with changed copy numbers. In comparison with existing statistics for the evaluation of CNVs, HBOS is a non-linearly transformed value of the observed read depth (RD) of each genome bin, which has the potential to relieve the effects of the above artifacts. In the calculation of HBOS values, a dynamic-width histogram is used to depict the density of bins on the genome being analyzed, which can reduce the effects of noise partially contributed by mapping and sequencing errors. Evaluating genome bins with this statistic can give CNVs of less extreme significance a high probability of detection. We evaluated this method using a large number of simulation datasets and compared it with four existing methods (CNVnator, CNV-IFTV, CNV-LOF, and iCopyDav). The results demonstrated that our proposed method outperforms the others in terms of sensitivity, precision, and F1-measure. Furthermore, we applied the proposed method to a set of real sequencing samples from the 1000 Genomes Project and determined a number of CNVs with biological meanings. Thus, the proposed method can be regarded as a routine approach in the field of genome mutation analysis for cancer samples.
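
A histogram-based outlier score with a dynamic-width histogram can be sketched for a single feature as follows. This is a generic HBOS illustration, not the HBOS-CNV implementation: the function name, the number of bins and the floor on bin width are assumptions introduced here.

```python
import math

def hbos_dynamic(values, n_bins=4, min_width=1.0):
    """Histogram-based outlier score for 1-D values using equal-count bins.

    Each bin holds roughly the same number of sorted values, so its width
    adapts to the data. Bin height is count / width. `min_width` is an
    assumed floor that keeps ties from producing an infinite height.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    per_bin = math.ceil(len(values) / n_bins)
    density = [0.0] * len(values)
    for start in range(0, len(order), per_bin):
        members = order[start:start + per_bin]
        lo, hi = values[members[0]], values[members[-1]]
        width = max(hi - lo, min_width)
        height = len(members) / width
        for i in members:
            density[i] = height
    top = max(density)
    # Values in the tallest bin score 0; values in sparse bins score higher.
    return [math.log(top / d) for d in density]

# Toy RD values: most near 100, one at 150.
rd = [100, 101, 99, 100, 102, 98, 150, 101]
scores = hbos_dynamic(rd, n_bins=4)
```

Because equal-count bins group neighbouring values, the value at 150 shares its wide, sparse bin with the value at 102, and both receive the highest score; a real pipeline would need more bins or further filtering to separate them.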


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Ashish Kumar Singh ◽  
Maren Fridtjofsen Olsen ◽  
Liss Anne Solberg Lavik ◽  
Trine Vold ◽  
Finn Drabløs ◽  
...  

Abstract Background Detection of copy number variation (CNV) in genes associated with disease is important in genetic diagnostics, and next generation sequencing (NGS) technology provides data that can be used for CNV detection. However, CNV detection based on NGS data is not widely used in diagnostic labs because the data analysis is challenging, especially with data from targeted gene panels. Wet lab methods like MLPA (MRC Holland) are widely used, but are expensive, time consuming and have gene-specific limitations. Our aim has been to develop a bioinformatic tool for CNV detection from NGS data in medical genetic diagnostic samples. Results Our computational pipeline for detection of CNVs in NGS data from targeted gene panels utilizes the coverage depth of the captured regions and calculates a copy number ratio score for each region. This is computed by comparing the mean coverage of the sample with the mean coverage of the same region in other samples, defined as a pool. The pipeline selects pools for comparison dynamically from previously sequenced samples, using the pool with an average coverage depth that is nearest to that of the sample. A sliding window-based approach is used to analyze each region, where the length of the sliding window and the sliding distance can be chosen dynamically to increase or decrease the resolution. This helps in detecting CNVs in small or partial exons. With this pipeline we have correctly identified the CNVs in 36 positive control samples, with sensitivity of 100% and specificity of 91%. We have detected whole gene level deletion/duplication, single/multi exonic level deletion/duplication, partial exonic deletion and mosaic deletion. Since its implementation in mid-2018, it has proven its diagnostic value with more than 45 CNV findings in routine tests. Conclusions With this pipeline as part of our diagnostic practices it is now possible to detect partial, single or multi-exonic, and intragenic CNVs in all genes in our target panel. 
This has helped our diagnostic lab to expand the portfolio of genes where we offer CNV detection, which previously was limited by the availability of MLPA kits.
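
The pool selection and sliding-window ratio described in the Results above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary layout of the pools, the per-base coverage lists and the window and step values are all hypothetical.

```python
from statistics import mean

def pick_pool(sample_avg_depth, pools):
    """Choose the pool whose average coverage depth is nearest to the sample's."""
    return min(pools, key=lambda name: abs(pools[name]["avg_depth"] - sample_avg_depth))

def window_ratios(sample_cov, pool_cov, window, step):
    """Copy number ratio per sliding window: mean sample coverage / mean pool coverage."""
    ratios = []
    for start in range(0, len(sample_cov) - window + 1, step):
        s = mean(sample_cov[start:start + window])
        p = mean(pool_cov[start:start + window])
        ratios.append((start, s / p))
    return ratios

# Toy region of 12 positions; positions 6 to 9 have half the expected coverage.
pools = {
    "low": {"avg_depth": 100, "cov": [100] * 12},
    "high": {"avg_depth": 400, "cov": [400] * 12},
}
sample_cov = [400] * 6 + [200] * 4 + [400] * 2

pool_name = pick_pool(mean(sample_cov), pools)
ratios = window_ratios(sample_cov, pools[pool_name]["cov"], window=4, step=2)
```

A shorter window or a smaller step raises the resolution, which is what allows a partial-exon event such as the one at positions 6 to 9 to show up as a window with a ratio near 0.5.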


Author(s):  
Anne Krogh Nøhr ◽  
Kristian Hanghøj ◽  
Genis Garcia Erill ◽  
Zilong Li ◽  
Ida Moltke ◽  
...  

Abstract Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if this is present. However, the methods that can account for admixture are all based on genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods PLINK, KING, relateAdmix, and ngsRelate, all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.

