SECNVs: A Simulator of Copy Number Variants and Whole-Exome Sequences from Reference Genomes

2019 ◽  
Author(s):  
Yue Xing ◽  
Alan R. Dabney ◽  
Xiao Li ◽  
Guosong Wang ◽  
Clare A. Gill ◽  
...  

Abstract: Copy number variants are insertions and deletions of 1 kb or larger in a genome that play an important role in phenotypic changes and human disease. Many software applications have been developed to detect copy number variants using either whole-genome sequencing or whole-exome sequencing data. However, there is poor agreement in the results from these applications. Simulated datasets containing copy number variants allow comprehensive comparisons of the operating characteristics of existing and novel copy number variant detection methods. Several software applications have been developed to simulate copy number variants and other structural variants in whole-genome sequencing data. However, none of the applications reliably simulate copy number variants in whole-exome sequencing data. We have developed and tested SECNVs (Simulator of Exome Copy Number Variants), a fast, robust and customizable software application for simulating copy number variants and whole-exome sequences from a reference genome. SECNVs is easy to install, implements a wide range of commands to customize simulations, can output multiple samples at once, and incorporates a pipeline to output rearranged genomes, short reads and BAM files in a single command. Variants generated by SECNVs are detected with high sensitivity and precision by tools commonly used to detect copy number variants. SECNVs is publicly available at https://github.com/YJulyXing/SECNVs.

Author(s):  
Marie Coutelier ◽  
Manuel Holtgrewe ◽  
Marten Jäger ◽  
Ricarda Flöttman ◽  
Martin A. Mensah ◽  
...  

Abstract: Copy Number Variants (CNVs) are deletions, duplications or insertions larger than 50 base pairs. They account for a large percentage of the normal genome variation and play major roles in human pathology. While array-based approaches have long been used to detect them in clinical practice, whole-genome sequencing (WGS) bears the promise to allow concomitant exploration of CNVs and smaller variants. However, accurately calling CNVs from WGS remains a difficult computational task, for which a consensus is still lacking. In this paper, we explore practical calling options to reach the best compromise between sensitivity and sensibility. We show that callers based on different signals (paired-end reads, split reads, coverage depth) yield complementary results. We suggest approaches combining four selected callers (Manta, Delly, ERDS, CNVnator) and a regenotyping tool (SV2), and show that this is applicable in everyday practice in terms of computation time and further interpretation. We demonstrate the superiority of these approaches over array-based Comparative Genomic Hybridization (aCGH), specifically regarding the lack of resolution in breakpoint definition and the detection of potentially relevant CNVs. Finally, we confirm our results on the NA12878 benchmark genome, as well as one clinically validated sample. In conclusion, we suggest that WGS constitutes a timely and economically valid alternative to the combination of aCGH and whole-exome sequencing.


2014 ◽  
Vol 42 (12) ◽  
pp. e97-e97 ◽  
Author(s):  
Daniel Backenroth ◽  
Jason Homsy ◽  
Laura R. Murillo ◽  
Joe Glessner ◽  
Edwin Lin ◽  
...  

2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Junhua Rao ◽  
Lihua Peng ◽  
Xinming Liang ◽  
Hui Jiang ◽  
Chunyu Geng ◽  
...  

Abstract Background: DNBSEQ™ platforms are new massively parallel sequencing (MPS) platforms that use DNA nanoball technology. Use of data generated from DNBSEQ™ platforms to detect single nucleotide variants (SNVs) and small insertions and deletions (indels) has proven to be quite effective, while the feasibility of copy number variant (CNV) detection is unclear. Results: Here, we first benchmarked different CNV detection tools based on Illumina whole-genome sequencing (WGS) data of NA12878 and then assessed these tools in CNV detection based on DNBSEQ™ sequencing data from the same sample. When the same tool was used, the CNVs detected based on DNBSEQ™ and Illumina data were similar in quantity, length and distribution, while great differences existed within results from different tools and even based on data from a single platform. We further estimated the CNV detection power based on available CNV benchmarks of NA12878 and found similar precision and sensitivity between the DNBSEQ™ and Illumina platforms. We also found higher precision of CNVs shorter than 1 kbp based on DNBSEQ™ platforms than those based on Illumina platforms by using Pindel, DELLY and LUMPY. We carefully compared these two available benchmarks and found a large proportion of specific CNVs between them. Thus, we constructed a more complete CNV benchmark of NA12878 containing 3512 CNV regions. Conclusions: We assessed and benchmarked CNV detections based on WGS with DNBSEQ™ platforms and provide guidelines for future studies.


2013 ◽  
Vol 14 (10) ◽  
pp. R120 ◽  
Author(s):  
Alberto Magi ◽  
Lorenzo Tattini ◽  
Ingrid Cifola ◽  
Romina D’Aurizio ◽  
Matteo Benelli ◽  
...  

2016 ◽  
pp. gkw695 ◽  
Author(s):  
Romina D'Aurizio ◽  
Tommaso Pippucci ◽  
Lorenzo Tattini ◽  
Betti Giusti ◽  
Marco Pellegrini ◽  
...  

2020 ◽  
Author(s):  
Andre E Minoche ◽  
Ben Lundie ◽  
Greg B Peters ◽  
Thomas Ohnesorg ◽  
Mark Pinese ◽  
...  

Abstract: Whole genome sequencing (WGS) has the potential to outperform clinical microarrays for the detection of structural variants (SV) including copy number variants (CNVs), but has been challenged by high false positive rates. Here we present ClinSV, a WGS based SV integration, annotation, prioritisation and visualisation method, which identified 99.8% of pathogenic ClinVar CNVs >10kb and 11/11 pathogenic variants from matched microarrays. The false positive rate was low (1.5–4.5%) and reproducibility high (95–99%). In clinical practice, ClinSV identified reportable variants in 22 of 485 patients (4.7%) of which 35–63% were not detectable by current clinical microarray designs.


2021 ◽  
Author(s):  
Migle Gabrielaite ◽  
Mathias Husted Torp ◽  
Sergio Andreu-Sánchez ◽  
Filipe Garrett Vieira ◽  
Christina Bligaard Pedersen ◽  
...  

Background: Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. The clinically relevant CNVs are hard to detect because CNVs are common structural variations that define large parts of the normal human genome. CNV calling from short-read sequencing data has the potential to leverage available cohort studies and allow full genomic profiling in the clinic without the need for additional data modalities. Questions regarding performance of CNV calling tools for clinical use and suitable sequencing protocols remain poorly addressed, mainly because of the lack of good reference data sets. Methods: We reviewed 50 popular CNV calling tools and included 11 tools for benchmarking in a unique reference cohort encompassing 39 whole genome sequencing (WGS) samples paired with analysis by the current clinical standard, SNP-array based CNV calling. Additionally, for nine of these samples we performed whole exome sequencing (WES), in order to address the effect of sequencing protocol on CNV calling. Furthermore, we included Gold Standard reference sample NA12878, and tested 12 samples with CNVs confirmed by multiplex ligation-dependent probe amplification (MLPA). Results: Tool performance varied greatly in the number of called CNVs and bias for CNV lengths. Some tools had near-perfect recall of CNVs from arrays for some samples, but poor precision. Filtering output by CNV ranks from tools did not salvage precision. Several tools had better performance patterns for NA12878, and we hypothesize that this is the result of overfitting during the tool development. Conclusions: We suggest combining tools with the best recall: GATK gCNV, Lumpy, DELLY, and cn.MOPS. These tools also capture different CNVs. Further improvements in precision require additional development of tools, reference data sets, and annotation of CNVs, potentially assisted by the use of background panels for filtering of frequently called variants.
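
The precision and recall figures discussed in the abstract above depend on how a called CNV is matched to a reference CNV. The abstract does not state the matching rule, so the following is only a minimal sketch under an assumed rule: a call counts as a match when it shares at least 50% reciprocal overlap with a reference interval on the same chromosome. The function names, the half-open interval convention, and the 0.5 threshold are all illustrative assumptions, not the method of any study listed here.

```python
from typing import List, Tuple

# Hypothetical interval type: (chromosome, start, end), half-open coordinates.
Interval = Tuple[str, int, int]


def reciprocal_overlap(a: Interval, b: Interval) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a[0] != b[0]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def precision_recall(
    calls: List[Interval], truth: List[Interval], min_ro: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall of calls against truth.

    Assumed matching rule: reciprocal overlap >= min_ro.
    """
    matched_calls = sum(
        any(reciprocal_overlap(c, t) >= min_ro for t in truth) for c in calls
    )
    matched_truth = sum(
        any(reciprocal_overlap(c, t) >= min_ro for c in calls) for t in truth
    )
    precision = matched_calls / len(calls) if calls else 0.0
    recall = matched_truth / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    truth = [("chr1", 1000, 2000), ("chr1", 5000, 9000)]
    calls = [("chr1", 1100, 2100), ("chr2", 1000, 2000), ("chr1", 8000, 12000)]
    print(precision_recall(calls, truth))
```

In the toy data only the first call passes the assumed threshold: the second is on another chromosome and the third overlaps its reference interval by only 25%. A stricter or looser threshold would change both numbers, which is one reason results from different benchmarks are hard to compare directly.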


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Andre E. Minoche ◽  
Ben Lundie ◽  
Greg B. Peters ◽  
Thomas Ohnesorg ◽  
Mark Pinese ◽  
...  

Abstract: Whole genome sequencing (WGS) has the potential to outperform clinical microarrays for the detection of structural variants (SV) including copy number variants (CNVs), but has been challenged by high false positive rates. Here we present ClinSV, a WGS based SV integration, annotation, prioritization, and visualization framework, which identified 99.8% of simulated pathogenic ClinVar CNVs > 10 kb and 11/11 pathogenic variants from matched microarrays. The false positive rate was low (1.5–4.5%) and reproducibility high (95–99%). In clinical practice, ClinSV identified reportable variants in 22 of 485 patients (4.7%) of which 35–63% were not detectable by current clinical microarray designs. ClinSV is available at https://github.com/KCCG/ClinSV.

