Detection and characterization of copy number variants based on whole-genome sequencing by DNBSEQ platforms

Abstract. Background: Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice in a wide range of biological research. WGS data are commonly used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels) and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of CNV detection based on WGS by DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ platforms and Illumina platforms on NA12878 with five commonly used tools. Results: DNBSEQ™ platforms showed a stable ability to detect slightly more CNVs genome-wide (on average 1.24-fold as many as Illumina platforms). CNVs detected on DNBSEQ™ and Illumina platforms were then evaluated against two public benchmarks of NA12878. DNBSEQ™ and Illumina platforms showed similar sensitivity and precision on both benchmarks. Further, the differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection results, such as count, distribution, sensitivity and precision. Conclusion: The major contribution of this paper is to provide, for the first time, a comprehensive guide for CNV detection based on WGS by DNBSEQ™ platforms.
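The sensitivity and precision figures mentioned in the abstract depend on how a detected CNV is matched to a benchmark CNV. The abstract does not state the matching rule, so the following is only a minimal sketch that assumes a 50% reciprocal-overlap criterion; the interval representation, the function names and the toy coordinates are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def precision_sensitivity(
    calls: List[Interval],
    benchmark: List[Interval],
    min_ro: float = 0.5,  # assumed threshold, not from the paper
) -> Tuple[float, float]:
    """Precision = matched calls / all calls; sensitivity = matched truth / all truth."""
    matched_calls = sum(
        any(reciprocal_overlap(c, b) >= min_ro for b in benchmark) for c in calls
    )
    matched_truth = sum(
        any(reciprocal_overlap(b, c) >= min_ro for c in calls) for b in benchmark
    )
    precision = matched_calls / len(calls) if calls else 0.0
    sensitivity = matched_truth / len(benchmark) if benchmark else 0.0
    return precision, sensitivity


# Toy data for illustration only.
calls = [("chr1", 1000, 2000), ("chr1", 5000, 6000), ("chr2", 100, 400)]
benchmark = [("chr1", 1100, 2100), ("chr2", 1000, 2000)]
precision, sensitivity = precision_sensitivity(calls, benchmark)
print(f"precision={precision:.3f} sensitivity={sensitivity:.3f}")
```

With a stricter or looser overlap threshold the same call set can score quite differently, which is one reason results from different tools and benchmarks are hard to compare directly.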

Genome‐wide detection of copy number variants in European autochthonous and commercial pig breeds by whole‐genome sequencing of DNA pools identified breed‐characterising copy number states

Animal Genetics ◽

10.1111/age.12954 ◽

2020 ◽

Vol 51 (4) ◽

pp. 541-556

Author(s):

S. Bovo ◽

A. Ribani ◽

M. Muñoz ◽

E. Alves ◽

J. P. Araujo ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Whole Genome ◽

Pig Breeds ◽

Genome Wide ◽

Dna Pools

Performance of copy number variants detection based on whole-genome sequencing by DNBSEQ platforms

BMC Bioinformatics ◽

10.1186/s12859-020-03859-x ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Junhua Rao ◽

Lihua Peng ◽

Xinming Liang ◽

Hui Jiang ◽

Chunyu Geng ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Technology Use ◽

Copy Number ◽

Massively Parallel Sequencing ◽

Copy Number Variants ◽

Whole Genome ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Cnv Detection

Abstract. Background: DNBSEQ™ platforms are new massively parallel sequencing (MPS) platforms that use DNA nanoball technology. Use of data generated from DNBSEQ™ platforms to detect single nucleotide variants (SNVs) and small insertions and deletions (indels) has proven to be quite effective, while the feasibility of copy number variants (CNVs) detection is unclear. Results: Here, we first benchmarked different CNV detection tools based on Illumina whole-genome sequencing (WGS) data of NA12878 and then assessed these tools in CNV detection based on DNBSEQ™ sequencing data from the same sample. When the same tool was used, the CNVs detected based on DNBSEQ™ and Illumina data were similar in quantity, length and distribution, while great differences existed within results from different tools and even based on data from a single platform. We further estimated the CNV detection power based on available CNV benchmarks of NA12878 and found similar precision and sensitivity between the DNBSEQ™ and Illumina platforms. We also found higher precision of CNVs shorter than 1 kbp based on DNBSEQ™ platforms than those based on Illumina platforms by using Pindel, DELLY and LUMPY. We carefully compared these two available benchmarks and found a large proportion of specific CNVs between them. Thus, we constructed a more complete CNV benchmark of NA12878 containing 3512 CNV regions. Conclusions: We assessed and benchmarked CNV detections based on WGS with DNBSEQ™ platforms and provide guidelines for future studies.

Global characterization of copy number variants in epilepsy patients from whole genome sequencing

10.1101/199224 ◽

2017 ◽

Author(s):

Jean Monlong ◽

Simon L. Girard ◽

Caroline Meloche ◽

Maxime Cadieux-Dion ◽

Danielle M. Andrade ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Genetic Variations ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Loss Of Function ◽

Genome Wide

Abstract. Epilepsy will affect nearly 3% of people at some point during their lifetime. Previous copy number variants (CNVs) studies of epilepsy have used array-based technology and were restricted to the detection of large or exonic events. In contrast, whole-genome sequencing (WGS) has the potential to more comprehensively profile CNVs but existing analytic methods suffer from limited accuracy. We show that this is in part due to the non-uniformity of read coverage, even after intra-sample normalization. To improve on this, we developed PopSV, an algorithm that uses multiple samples to control for technical variation and enables the robust detection of CNVs. Using WGS and PopSV, we performed a comprehensive characterization of CNVs in 198 individuals affected with epilepsy and 301 controls. For both large and small variants, we found an enrichment of rare exonic events in epilepsy patients, especially in genes with predicted loss-of-function intolerance. Notably, this genome-wide survey also revealed an enrichment of rare non-coding CNVs near previously known epilepsy genes. This enrichment was strongest for non-coding CNVs located within 100 Kbp of an epilepsy gene and in regions associated with changes in the gene expression, such as expression QTLs or DNase I hypersensitive sites. Finally, we report on 21 potentially damaging events that could be associated with known or new candidate epilepsy genes. Our results suggest that comprehensive sequence-based profiling of CNVs could help explain a larger fraction of epilepsy cases.

Author summary. Epilepsy is a common neurological disorder affecting around 3% of the population. In some cases, epilepsy is caused by brain trauma or other brain anomalies but there are often no clear causes. Genetic factors have been associated with epilepsy in the past such as rare genetic variations found by linkage studies as well as common genetic variations found by genome-wide association studies and large copy-number variants.
We sequenced the genome of ∼200 epilepsy patients and ∼300 healthy controls and compared the distribution of deletion (loss of a copy) and duplication (additional copy) of genomic regions. Thanks to the sequencing technology and a new method that takes advantage of the large sample size, we could compare the distribution of small copy-number variants between epilepsy patients and controls. Overall, we found that small variants are also associated with epilepsy. Indeed, the genome of epilepsy patients had more exonic copy-number variants, especially when rare or affecting genes with predicted loss-of-function intolerance. Focusing on regions around genes that have been previously associated with epilepsy, we also found more non-coding variants in epilepsy patients, especially deletions or variants in regulatory regions. Finally, we provide a list of 21 regions in which we found likely pathogenic variants.
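The idea of using multiple samples to control for technical variation in read coverage can be illustrated with a per-bin Z-score against a set of reference samples. This is only a simplified sketch of that general idea and not PopSV's actual implementation; the function name, the coverage values and the |Z| > 3 cutoff are assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import List


def bin_z_scores(sample_cov: List[float], reference_cov: List[List[float]]) -> List[float]:
    """Z-score of each bin's coverage in one sample against reference samples.

    sample_cov: normalized coverage per genomic bin for the tested sample.
    reference_cov: one coverage list per reference sample, same bin order.
    """
    scores = []
    for i, value in enumerate(sample_cov):
        ref = [cov[i] for cov in reference_cov]
        mu, sigma = mean(ref), stdev(ref)
        # A bin with no variation across references carries no usable signal here.
        scores.append((value - mu) / sigma if sigma > 0 else 0.0)
    return scores


# Toy data: bin 2 always has low coverage (a technical artefact shared by all
# samples), while bin 1 is low only in the tested sample.
reference = [
    [100, 98, 50],
    [102, 101, 52],
    [98, 100, 48],
    [100, 101, 50],
]
tested = [101, 50, 51]

scores = bin_z_scores(tested, reference)
flagged = [i for i, z in enumerate(scores) if abs(z) > 3]  # assumed cutoff
print(flagged)
```

In this toy example the bin that is systematically low in every sample is not flagged, whereas the bin that deviates only in the tested sample is, which is the behaviour a population-based normalization is meant to achieve.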

ClinSV: Clinical grade structural and copy number variant detection from whole genome sequencing data

10.1101/2020.06.30.20143453 ◽

2020 ◽

Author(s):

Andre E Minoche ◽

Ben Lundie ◽

Greg B Peters ◽

Thomas Ohnesorg ◽

Mark Pinese ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

False Positive ◽

Copy Number ◽

False Positive Rate ◽

Copy Number Variants ◽

Copy Number Variant ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data

Abstract. Whole genome sequencing (WGS) has the potential to outperform clinical microarrays for the detection of structural variants (SV) including copy number variants (CNVs), but has been challenged by high false positive rates. Here we present ClinSV, a WGS-based SV integration, annotation, prioritisation and visualisation method, which identified 99.8% of pathogenic ClinVar CNVs >10 kb and 11/11 pathogenic variants from matched microarrays. The false positive rate was low (1.5–4.5%) and reproducibility high (95–99%). In clinical practice, ClinSV identified reportable variants in 22 of 485 patients (4.7%) of which 35–63% were not detectable by current clinical microarray designs.

ClinSV: clinical grade structural and copy number variant detection from whole genome sequencing data

Genome Medicine ◽

10.1186/s13073-021-00841-x ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Andre E. Minoche ◽

Ben Lundie ◽

Greg B. Peters ◽

Thomas Ohnesorg ◽

Mark Pinese ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

False Positive ◽

Copy Number ◽

False Positive Rate ◽

Copy Number Variants ◽

Copy Number Variant ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data

Abstract. Whole genome sequencing (WGS) has the potential to outperform clinical microarrays for the detection of structural variants (SV) including copy number variants (CNVs), but has been challenged by high false positive rates. Here we present ClinSV, a WGS-based SV integration, annotation, prioritization, and visualization framework, which identified 99.8% of simulated pathogenic ClinVar CNVs >10 kb and 11/11 pathogenic variants from matched microarrays. The false positive rate was low (1.5–4.5%) and reproducibility high (95–99%). In clinical practice, ClinSV identified reportable variants in 22 of 485 patients (4.7%) of which 35–63% were not detectable by current clinical microarray designs. ClinSV is available at https://github.com/KCCG/ClinSV.

0306 Exploring the feasibility of using copy number variants as genetic markers through large-scale whole genome sequencing experiments

Journal of Animal Science ◽

10.2527/jam2016-0306 ◽

2016 ◽

Vol 94 (suppl_5) ◽

pp. 146-146

Author(s):

D. M. Bickhart ◽

L. Xu ◽

J. L. Hutchison ◽

J. B. Cole ◽

D. J. Null ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genetic Markers ◽

Genome Sequencing ◽

Copy Number ◽

Large Scale ◽

Copy Number Variants ◽

Whole Genome

Copy‐Number Variants Detection by Low‐Pass Whole‐Genome Sequencing

Current Protocols in Human Genetics ◽

10.1002/cphg.43 ◽

2017 ◽

Vol 94 (1) ◽

Cited By ~ 4

Author(s):

Zirui Dong ◽

Weiwei Xie ◽

Haixiao Chen ◽

Jinjin Xu ◽

Huilin Wang ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Whole Genome ◽

Low Pass

SECNVs: A Simulator of Copy Number Variants and Whole-Exome Sequences from Reference Genomes

10.1101/824128 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yue Xing ◽

Alan R. Dabney ◽

Xiao Li ◽

Guosong Wang ◽

Clare A. Gill ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Whole Genome ◽

Sequencing Data ◽

Software Applications ◽

Exome Sequencing Data ◽

Whole Exome ◽

Whole Exome Sequencing Data

Abstract. Copy number variants are insertions and deletions of 1 kb or larger in a genome that play an important role in phenotypic changes and human disease. Many software applications have been developed to detect copy number variants using either whole-genome sequencing or whole-exome sequencing data. However, there is poor agreement in the results from these applications. Simulated datasets containing copy number variants allow comprehensive comparisons of the operating characteristics of existing and novel copy number variant detection methods. Several software applications have been developed to simulate copy number variants and other structural variants in whole-genome sequencing data. However, none of the applications reliably simulate copy number variants in whole-exome sequencing data. We have developed and tested SECNVs (Simulator of Exome Copy Number Variants), a fast, robust and customizable software application for simulating copy number variants and whole-exome sequences from a reference genome. SECNVs is easy to install, implements a wide range of commands to customize simulations, can output multiple samples at once, and incorporates a pipeline to output rearranged genomes, short reads and BAM files in a single command. Variants generated by SECNVs are detected with high sensitivity and precision by tools commonly used to detect copy number variants. SECNVs is publicly available at https://github.com/YJulyXing/SECNVs.

JAX-CNV: A whole genome sequencing-based algorithm for copy number detection at clinical grade level

10.1101/2021.03.16.21252173 ◽

2021 ◽

Author(s):

Wan-Ping Lee ◽

Qihui Zhu ◽

Xiaofei Yang ◽

Silvia Liu ◽

Eliza Cerveira ◽

...

Keyword(s):

False Discovery Rate ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variant ◽

Fold Increase ◽

Chromosomal Microarray ◽

Whole Genome ◽

False Discovery ◽

Calling Algorithm

We aimed to develop a whole genome sequencing (WGS)-based copy number variant (CNV) calling algorithm with the potential of replacing chromosomal microarray assay (CMA) for clinical diagnosis. JAX-CNV was thus developed for CNV detection from WGS. The performance of this CNV calling algorithm was evaluated in a blinded manner on 31 samples and compared to the results of clinically validated CMAs. JAX-CNV recalled all 112 CNVs reported by the clinically validated CMAs of the 31 samples (100% recall). In addition, JAX-CNV identified an average of 30 CNVs per individual, an approximately seven-fold increase compared to the calls of clinically validated CMAs. Experimental validation of 24 randomly selected CNVs showed one false positive (i.e., a false discovery rate of 4.17%). A robustness test on lower-coverage data revealed 100% sensitivity for CNVs greater than 300 kb (the current threshold for College of American Pathologists) down to 10x coverage. For CNVs greater than 50 kb, sensitivities were 100% for coverages deeper than 20x, 97% for 15x, and 95% for 10x. We developed a WGS-based CNV pipeline, including the newly developed CNV caller JAX-CNV, and found it capable of detecting CMA-reported CNVs at 100% sensitivity with a false discovery rate of about 4%. We propose that JAX-CNV could be further examined in a multi-institutional study to justify the transition of first-tier genetic testing from CMAs to WGS. JAX-CNV is available at https://github.com/TheJacksonLaboratory/JAX-CNV.
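The false discovery rate quoted in this abstract follows directly from the validation counts: one false positive among 24 experimentally validated calls. A short check, with a hypothetical helper name used only for illustration:

```python
def false_discovery_rate(false_positives: int, validated_calls: int) -> float:
    """Fraction of validated calls that turned out to be false positives."""
    return false_positives / validated_calls


# Counts taken from the abstract: 1 false positive in 24 validated CNVs.
fdr = false_discovery_rate(1, 24)
print(f"{fdr:.2%}")  # prints 4.17%
```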

Combining callers improves the detection of copy number variants from whole-genome sequencing

European Journal of Human Genetics ◽

10.1038/s41431-021-00983-x ◽

2021 ◽

Author(s):

Marie Coutelier ◽

Manuel Holtgrewe ◽

Marten Jäger ◽

Ricarda Flöttman ◽

Martin A. Mensah ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Computation Time ◽

Comparative Genomic ◽

Whole Genome ◽

Base Pairs ◽

Whole Exome ◽

Human Pathology

Abstract. Copy Number Variants (CNVs) are deletions, duplications or insertions larger than 50 base pairs. They account for a large percentage of the normal genome variation and play major roles in human pathology. While array-based approaches have long been used to detect them in clinical practice, whole-genome sequencing (WGS) bears the promise to allow concomitant exploration of CNVs and smaller variants. However, accurately calling CNVs from WGS remains a difficult computational task, for which a consensus is still lacking. In this paper, we explore practical calling options to reach the best compromise between sensitivity and sensibility. We show that callers based on different signals (paired-end reads, split reads, coverage depth) yield complementary results. We suggest approaches combining four selected callers (Manta, Delly, ERDS, CNVnator) and a regenotyping tool (SV2), and show that this is applicable in everyday practice in terms of computation time and further interpretation. We demonstrate the superiority of these approaches over array-based Comparative Genomic Hybridization (aCGH), specifically regarding the lack of resolution in breakpoint definition and the detection of potentially relevant CNVs. Finally, we confirm our results on the NA12878 benchmark genome, as well as one clinically validated sample. In conclusion, we suggest that WGS constitutes a timely and economically valid alternative to the combination of aCGH and whole-exome sequencing.
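One simple way to combine several callers is to keep a CNV only when at least two callers report an overlapping event of the same type. The abstract does not describe the combination rule used in the paper, so the sketch below is an assumption-laden illustration: the 50% reciprocal-overlap rule, the two-caller minimum, the function names and the example coordinates are all hypothetical, and only the caller names come from the abstract.

```python
from typing import Dict, List, Tuple

# Hypothetical call representation: (chromosome, start, end, type).
Call = Tuple[str, int, int, str]


def overlaps(a: Call, b: Call, min_ro: float = 0.5) -> bool:
    """True if both calls have the same type and meet the reciprocal overlap."""
    if a[0] != b[0] or a[3] != b[3]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return shared / (a[2] - a[1]) >= min_ro and shared / (b[2] - b[1]) >= min_ro


def consensus_calls(
    calls_by_caller: Dict[str, List[Call]],
    min_callers: int = 2,  # assumed support threshold
) -> List[Tuple[Call, List[str]]]:
    """Keep one representative call per event supported by enough callers."""
    kept: List[Tuple[Call, List[str]]] = []
    for caller, calls in calls_by_caller.items():
        for call in calls:
            supporters = {caller}
            for other, other_calls in calls_by_caller.items():
                if other != caller and any(overlaps(call, o) for o in other_calls):
                    supporters.add(other)
            # Skip events already represented by an earlier caller's call.
            already_kept = any(overlaps(call, k) for k, _ in kept)
            if len(supporters) >= min_callers and not already_kept:
                kept.append((call, sorted(supporters)))
    return kept


# Toy data for illustration only.
calls_by_caller = {
    "manta": [("chr1", 1000, 5000, "DEL"), ("chr3", 100, 900, "DUP")],
    "delly": [("chr1", 1100, 5100, "DEL")],
    "cnvnator": [("chr1", 900, 4800, "DEL"), ("chr2", 2000, 9000, "DUP")],
}
result = consensus_calls(calls_by_caller)
for call, supporters in result:
    print(call, supporters)
```

The representative coordinates here are simply those of the first caller to report the event; a real pipeline would need a deliberate breakpoint-merging policy, since read-pair, split-read and depth-based callers differ in breakpoint resolution.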
