CNVkit: Copy number detection and visualization for targeted sequencing using off-target reads

10.1101/010876 ◽

2014 ◽

Cited By ~ 6

Author(s):

Eric Talevich ◽

A. Hunter Shain ◽

Thomas Botton ◽

Boris C. Bastian

Keyword(s):

Copy Number ◽

Repetitive Sequences ◽

Copy Number Variants ◽

Gc Content ◽

Read Depth ◽

Comparative Genomic ◽

Sequencing Data ◽

Intergenic Regions ◽

Copy Number Detection ◽

Copy Number Changes

Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this approach has limitations in the case of targeted re-sequencing, which leaves gaps in coverage between the regions chosen for enrichment and introduces biases related to the efficiency of target capture and library preparation. We present a method for copy number detection, implemented in the software package CNVkit, that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome. This combination achieves both exon-level resolution in targeted regions and sufficient resolution in the larger intronic and intergenic regions to identify copy number changes. In particular, we successfully inferred copy number genome-wide at a resolution equivalent to 100 kilobases from a platform targeting as few as 293 genes. After normalizing read counts to a pooled reference, we evaluated and corrected for three sources of bias that explain most of the extraneous variability in the sequencing read depth: GC content, target footprint size and spacing, and repetitive sequences. We compared the performance of CNVkit to copy number changes identified by array comparative genomic hybridization. We packaged the components of CNVkit so that it is straightforward to use and provides visualizations, detailed reporting of significant features, and export options for compatibility with other software. Availability: http://github.com/etal/cnvkit
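
As a rough illustration of the normalization idea in the abstract above (read depth compared with a pooled reference, then corrected for GC content), the following Python sketch may help. It is not CNVkit's implementation: the function names, the pseudocount, and the rolling-median correction are assumptions chosen for illustration.

```python
import math
from statistics import median


def log2_ratios(sample_depths, reference_depths, pseudocount=1.0):
    """Per-bin log2 ratio of median-normalized sample depth to reference depth.

    The pseudocount (an assumption of this sketch) avoids log2(0) in empty bins.
    """
    s_med = median(sample_depths)
    r_med = median(reference_depths)
    return [
        math.log2(
            ((s + pseudocount) / (s_med + pseudocount))
            / ((r + pseudocount) / (r_med + pseudocount))
        )
        for s, r in zip(sample_depths, reference_depths)
    ]


def correct_gc_bias(ratios, gc_fractions, window=3):
    """Subtract a rolling-median trend computed over bins ordered by GC content."""
    order = sorted(range(len(ratios)), key=lambda i: gc_fractions[i])
    half = window // 2
    corrected = [0.0] * len(ratios)
    for rank, idx in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        # Median of the neighbouring bins with similar GC content.
        trend = median(ratios[order[j]] for j in range(lo, hi))
        corrected[idx] = ratios[idx] - trend
    return corrected
```

In this sketch a bin whose depth is twice the sample median, against a flat reference, gets a log2 ratio near 1, and a ratio profile that depends only on GC content is flattened toward 0 by the correction.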

A machine-learning approach for accurate detection of copy-number variants from exome sequencing

10.1101/460931 ◽

2018 ◽

Author(s):

Vijay Kumar Pounraja ◽

Gopal Jayakar ◽

Matthew Jensen ◽

Neil Kelkar ◽

Santhosh Girirajan

Keyword(s):

Machine Learning ◽

Exome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Gc Content ◽

Clinical Diagnostics ◽

Venn Diagram ◽

Exome Capture ◽

Small Subset ◽

Sequencing Data

Copy-number variants (CNVs) are a major cause of several genetic disorders, making their detection an essential component of genetic analysis pipelines. Current methods for detecting CNVs from exome sequencing data are limited by high false positive rates and low concordance due to the inherent biases of individual algorithms. To overcome these issues, calls generated by two or more algorithms are often intersected using Venn-diagram approaches to identify “high-confidence” CNVs. However, this approach is inadequate, as it misses potentially true calls that do not have consensus from multiple callers. Here, we present CN-Learn, a machine-learning framework (https://github.com/girirajanlab/CN_Learn) that integrates calls from multiple CNV detection algorithms and learns to accurately identify true CNVs using caller-specific and genomic features from a small subset of validated CNVs. Using CNVs predicted by four exome-based CNV callers (CANOES, CODEX, XHMM and CLAMMS) from 503 samples, we demonstrate that CN-Learn identifies true CNVs at higher precision (~90%) and recall (~85%) rates while maintaining robust performance even when trained with minimal data (~30 samples). CN-Learn recovers twice as many CNVs as individual callers or Venn diagram-based approaches, with features such as exome capture probe count, caller concordance and GC content providing the most discriminatory power. In fact, about 58% of all true CNVs recovered by CN-Learn were either singletons or calls that lacked support from at least one caller. Our study underscores the limitations of current approaches for CNV identification and provides an effective method that yields high-quality CNVs for application in clinical diagnostics.
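
The limitation of Venn-diagram intersection described in the abstract above can be seen in a small Python sketch. This is not CN-Learn itself: the helper names and the toy calls are hypothetical, and the sketch only shows how requiring consensus trades recall for precision.

```python
def consensus_filter(caller_support, min_callers):
    """Keep a call only if at least `min_callers` callers reported it."""
    return [len(callers) >= min_callers for callers in caller_support]


def precision_recall(predicted, truth):
    """Precision and recall of boolean predictions against validated labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical candidate CNVs: which callers support each, and whether it validated.
support = [{"XHMM", "CODEX"}, {"CANOES"}, {"CLAMMS", "XHMM", "CODEX"}, {"CODEX"}]
validated = [True, True, True, False]

strict = precision_recall(consensus_filter(support, 2), validated)
lenient = precision_recall(consensus_filter(support, 1), validated)
print("consensus of 2+ callers:", strict)
print("any single caller:", lenient)
```

In the toy data the true singleton call is lost under the two-caller rule, which is the kind of call a learned classifier over caller-specific and genomic features aims to recover.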

T-CNV: a robust tool for detecting and visualizing copy number variants in targeted sequencing data.

10.21203/rs.3.rs-27672/v1 ◽

2020 ◽

Author(s):

Liu Ye ◽

Wu Yangming ◽

Zheng Zexin ◽

Zhou Tianliangwen

Keyword(s):

Copy Number ◽

Copy Number Variants ◽

Control Sample ◽

Read Depth ◽

Gaussian Mixture ◽

Sequencing Data ◽

Positive Predict Value ◽

Sliding Method ◽

Targeted Ngs ◽

Cnv Detection

Background: Copy number variants (CNVs) are widespread among human genes, causing Mendelian or sporadic traits or associating with complex diseases. Several tools have been developed for CNV assessment from next-generation sequencing (NGS) data using a read-depth (RD) strategy. However, maintaining a high level of sensitivity and specificity remains challenging. Here, we present T-CNV, a novel, powerful, user-friendly and open-access tool for CNV detection and visualization in targeted NGS panels. Results: T-CNV consists of a primary CNV detection step and a CNV candidate confirmation step. After computing log2 values of the normalized read depth ratio between the tumor and the normal/control sample, T-CNV confirms each CNV candidate by a bins method, a Gaussian Mixture Model (GMM) clustering approach and a window-sliding method. We benchmarked its performance on an MLPA-validated dataset. Compared with three other advanced tools, T-CNV performs well, with 95.42% sensitivity, 99.93% specificity and 93.63% positive predictive value on the MLPA-validated dataset, while achieving satisfactory performance in a simulation study (sensitivity 65.95%, positive predictive value 88.71% at 100X coverage). Conclusions: T-CNV is a novel and robust tool for CNV detection and visualization in targeted NGS panels, consisting of determination of possible CNV candidates and further confirmation by three different methods. It is publicly available at https://github.com/Top-Gene/T-CNV.
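
The abstract above does not detail how T-CNV's window-sliding confirmation works, so the following Python sketch shows only a generic version of the idea: average log2 ratios over a sliding window and label each window. The function name, the window size, and the gain/loss thresholds are assumptions, not values taken from T-CNV.

```python
from statistics import mean


def sliding_window_calls(log2_ratios, window=3, gain=0.4, loss=-0.6):
    """Label each full-length window by the mean of its log2 ratios.

    Returns (start_index, end_index_exclusive, label) tuples. The thresholds
    are hypothetical; for a pure diploid sample a single-copy gain is expected
    near log2(3/2) and a single-copy loss near log2(1/2).
    """
    calls = []
    for start in range(len(log2_ratios) - window + 1):
        avg = mean(log2_ratios[start:start + window])
        if avg >= gain:
            label = "gain"
        elif avg <= loss:
            label = "loss"
        else:
            label = "neutral"
        calls.append((start, start + window, label))
    return calls
```

Averaging over neighbouring bins damps single-bin noise, at the cost of blurring the boundaries of short events.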

CNVkit-RNA: Copy number inference from RNA-Sequencing data

10.1101/408534 ◽

2018 ◽

Cited By ~ 9

Author(s):

Eric Talevich ◽

A. Hunter Shain

Keyword(s):

Gene Expression ◽

Rna Sequencing ◽

Copy Number ◽

Point Mutations ◽

Gc Content ◽

The Cancer Genome Atlas ◽

Comparative Genomic ◽

Sequencing Data ◽

Number Variation ◽

User Friendly

RNA-sequencing is most commonly used to measure gene expression, but it is possible to extract genotypic information from RNA-sequencing data, too. Point mutations and translocations can be detected when they occur in expressed genes; however, there are few software solutions to infer copy number information from RNA-sequencing data. This is because a gene’s expression is dictated by a number of variables, including, but not limited to, copy number variation. Here, we report new functionalities within the software package CNVkit that enable copy number inference from RNA-sequencing data. First, CNVkit removes technical variation in gene expression associated with GC-content and transcript length. Next, CNVkit assigns a weight, dictated by several variables, to each transcript with the net effect of preferentially inferring copy number from highly and stably expressed genes. We benchmarked our approach on 105 melanomas from The Cancer Genome Atlas project and observed a high degree of concordance (R = 0.739) between our estimates and those from array comparative genomic hybridization (aCGH) on the same samples. After initial configuration, the software requires few inputs, is able to process a batch of up to 100 samples in less than ten minutes, and can be used in conjunction with pre-existing features of CNVkit, including visualization tools. Overall, we present a rapid, user-friendly software solution to infer copy number information from gene expression data.

Distinguishing melanocytic nevi from melanoma by DNA copy number changes: comparative genomic hybridization as a research and diagnostic tool

Dermatologic Therapy ◽

10.1111/j.1529-8019.2005.00055.x ◽

2006 ◽

Vol 19 (1) ◽

pp. 40-49 ◽

Cited By ~ 147

Author(s):

Jurgen Bauer ◽

Boris C. Bastian

Keyword(s):

Comparative Genomic Hybridization ◽

Diagnostic Tool ◽

Copy Number ◽

Comparative Genomic ◽

Genomic Hybridization ◽

Dna Copy Number ◽

Melanocytic Nevi ◽

Copy Number Changes

Metaphase and array comparative genomic hybridization: unique copy number changes and gene amplification of medulloblastomas in South America

Cancer Genetics and Cytogenetics ◽

10.1016/j.cancergencyto.2006.05.009 ◽

2006 ◽

Vol 170 (1) ◽

pp. 40-47 ◽

Cited By ~ 9

Author(s):

Maisa Yoshimoto ◽

Jane Bayani ◽

Paulo A.S. Nuin ◽

Nasjla S. Silva ◽

Sergio Cavalheiro ◽

...

Keyword(s):

South America ◽

Comparative Genomic Hybridization ◽

Gene Amplification ◽

Copy Number ◽

Array Comparative Genomic Hybridization ◽

Comparative Genomic ◽

Genomic Hybridization ◽

Copy Number Changes

PocaCNV: A Tool to Detect Copy Number Variants from Population-Scale Genome Sequencing Data

10.1109/bibm52615.2021.9669405 ◽

2021 ◽

Author(s):

Zhendong Zhang ◽

Yongzhuang Liu ◽

Gaoyang Li ◽

Yadong Wang

Keyword(s):

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Sequencing Data ◽

Population Scale

Amplification at 9p in Cervical Carcinoma by Comparative Genomic Hybridization

Analytical Cellular Pathology ◽

10.1155/2001/174645 ◽

2001 ◽

Vol 22 (3) ◽

pp. 159-163 ◽

Cited By ~ 8

Author(s):

Kowan J. Jee ◽

Young Tak Kim ◽

Kyu Rae Kim ◽

Yan Aalto ◽

Sakari Knuutila

Keyword(s):

Squamous Cell Carcinoma ◽

Cell Carcinoma ◽

Comparative Genomic Hybridization ◽

Copy Number ◽

Comparative Genomic ◽

Malignant Cells ◽

Genomic Hybridization ◽

Carcinoma Of Cervix ◽

Chromosome Arm ◽

Copy Number Changes

DNA copy number changes were studied by comparative genomic hybridization on 10 tumor specimens of squamous cell carcinoma of the cervix obtained from Korean patients. DNA was extracted from paraffin‐embedded sections after removal of non‐malignant cells by microdissection. Copy number changes were found in 8/10 tumors. The most frequent changes were chromosome 19 gains (n=6) and losses on chromosomes 4 (n=4), 5 (n=3), and 3p (n=3). A novel finding was amplification in chromosome arm 9p21‐pter in 2 cases. Gains in 1, 3q, 5p, 6p, 8q, 16p, 17, and 20q and losses at 2q, 6q, 8p, 9q, 10p, 11, 13, 16q, and 18q were observed in at least one of the cases.

Peer Review #1 of "GROM-RD: resolving genomic biases to improve read depth detection of copy number variants (v0.1)"

10.7287/peerj.836v0.1/reviews/1 ◽

2015 ◽

Keyword(s):

Peer Review ◽

Copy Number ◽

Copy Number Variants ◽

Read Depth ◽

Depth Detection

GROM-RD: Resolving genomic biases to improve read depth detection of copy number variants

10.7287/peerj.preprints.663 ◽

2014 ◽

Author(s):

Sean D Smith ◽

Joseph K Kawash ◽

Andrey Grigoriev

Keyword(s):

Copy Number ◽

Copy Number Variants ◽

Read Depth ◽

Read Coverage ◽

Novel Approach ◽

Depth Analysis ◽

Gc Bias ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Cnv Detection

Amplifications or deletions of genome segments, known as copy number variants (CNVs), have been associated with many diseases. Read depth analysis of next-generation sequencing (NGS) is an essential method of detecting CNVs. However, genome read coverage is frequently distorted by various biases of NGS platforms, which reduce predictive capabilities of existing approaches. Additionally, the use of read depth tools has been somewhat hindered by imprecise breakpoint identification. We developed GROM-RD, an algorithm that analyzes multiple biases in read coverage to detect CNVs in NGS data. We found non-uniform variance across distinct GC regions after using existing GC bias correction methods and developed a novel approach to normalize such variance. Although complex and repetitive genome segments complicate CNV detection, GROM-RD adjusts for repeat bias and uses a two-pipeline masking approach to detect CNVs in complex and repetitive segments while improving sensitivity in less complicated regions. To overcome a typical weakness of RD methods, GROM-RD employs a CNV search using size-varying overlapping windows to improve breakpoint resolution. We compared our method to two widely used programs based on read depth methods, CNVnator and RDXplorer, and observed improved CNV detection and breakpoint accuracy for GROM-RD. GROM-RD is available at http://grigoriev.rutgers.edu/software/
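
The abstract above mentions a search over size-varying overlapping windows. As a minimal sketch of what such a window scheme could look like (not GROM-RD's actual algorithm), the Python generator below enumerates windows of several sizes with a fixed fractional overlap. The function name, the default sizes, and the 50% overlap are assumptions.

```python
def overlapping_windows(start, end, sizes=(100, 200), overlap=0.5):
    """Yield (window_start, window_end, size) for each window size in turn.

    Consecutive windows of one size are offset by size * (1 - overlap),
    so smaller windows sample candidate breakpoints more densely.
    """
    for size in sizes:
        step = max(1, int(size * (1 - overlap)))
        pos = start
        while pos + size <= end:
            yield pos, pos + size, size
            pos += step
```

For example, `list(overlapping_windows(0, 400, sizes=(200,)))` steps by 100 bases and yields three windows; read depth in each window could then be compared with its neighbours to refine breakpoint positions.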

Clinically significant exome-based copy number variants detected by re-evaluation of exome sequencing data

Dokuz Eylül Üniversitesi Tıp Fakültesi Dergisi ◽

10.5505/deutfd.2021.29053 ◽

2021 ◽

Vol 35 (1) ◽

pp. 1-11

Author(s):

Fatma Kurt Çolak

Keyword(s):

Exome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Sequencing Data ◽

Exome Sequencing Data ◽

Clinically Significant