Hidden Markov Model-Based CNV Detection Algorithms for Illumina Genotyping Microarrays

Somatic alterations in DNA copy number have been well studied in numerous malignancies, yet the role of germline DNA copy number variation in cancer is still emerging. Genotyping microarrays generate allele-specific signal intensities to determine genotype, but may also be used to infer DNA copy number using additional computational approaches. Numerous tools have been developed to analyze Illumina genotype microarray data for copy number variant (CNV) discovery, although commonly utilized algorithms freely available to the public employ approaches based upon the use of hidden Markov models (HMMs). QuantiSNP, PennCNV, and GenoCN utilize HMMs with six copy number states but vary in how transition and emission probabilities are calculated. Performance of these CNV detection algorithms has been shown to be variable between both genotyping platforms and data sets, although HMM approaches generally outperform other current methods. Low sensitivity is prevalent with HMM-based algorithms, suggesting the need for continued improvement in CNV detection methodologies.
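To make the HMM idea above concrete, here is a minimal sketch of Viterbi decoding over six copy-number states. It is not the implementation used by QuantiSNP, PennCNV, or GenoCN. The state layout, the per-state mean log R ratio (LRR) values, the shared standard deviation, and the transition probability are all illustrative assumptions. The sketch uses LRR only, so the copy-neutral LOH state cannot be told apart from the normal state; tools such as PennCNV also use B allele frequency for that.

```python
import math

# Hypothetical six-state layout: copy numbers 0, 1, 2, 2 (copy-neutral LOH), 3, 4.
COPY_NUMBER = [0, 1, 2, 2, 3, 4]
# Illustrative mean log R ratio (LRR) per state; assumed values, not taken from any tool.
LRR_MEAN = [-3.5, -0.65, 0.0, 0.0, 0.4, 0.7]
LRR_SD = 0.2       # assumed common emission standard deviation
STAY_PROB = 0.99   # assumed probability of keeping the same state at the next probe


def log_gauss(x, mean, sd):
    """Log density of a normal distribution, used as the emission model."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def viterbi_copy_number(lrr):
    """Return the most likely copy number at each probe for a list of LRR values."""
    if not lrr:
        return []
    n = len(COPY_NUMBER)
    log_stay = math.log(STAY_PROB)
    log_switch = math.log((1 - STAY_PROB) / (n - 1))

    def trans(prev, cur):
        # Log transition probability: high for staying, low and uniform for switching.
        return log_stay if prev == cur else log_switch

    # Uniform prior over states for the first probe.
    score = [-math.log(n) + log_gauss(lrr[0], LRR_MEAN[s], LRR_SD) for s in range(n)]
    backpointers = []
    for x in lrr[1:]:
        new_score, pointers = [], []
        for s in range(n):
            best = max(range(n), key=lambda p: score[p] + trans(p, s))
            pointers.append(best)
            new_score.append(score[best] + trans(best, s) + log_gauss(x, LRR_MEAN[s], LRR_SD))
        backpointers.append(pointers)
        score = new_score

    # Trace the best path back from the highest-scoring final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [COPY_NUMBER[s] for s in path]
```

With these assumed parameters, a run of ten probes with LRR near -0.65 is called as a one-copy segment, while a single outlying probe is absorbed into the surrounding normal segment because two state switches cost more than one poorly fitting emission. This smoothing also hints at why short CNVs can be missed, in line with the low sensitivity noted in the abstract.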


CNV-BAC: Copy number Variation Detection in Bacterial Circular Genome

Bioinformatics ◽

10.1093/bioinformatics/btaa208 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3890-3891

Author(s):

Linjie Wu ◽

Han Wang ◽

Yuchao Xia ◽

Ruibin Xi

Keyword(s):

Copy Number Variation ◽

Copy Number ◽

Genome Structure ◽

Real Data ◽

Read Depth ◽

Supplementary Information ◽

Circular Genome ◽

Number Variation ◽

Copy Number Variation Detection ◽

Cnv Detection

Abstract. Motivation: Whole-genome sequencing (WGS) is widely used for copy number variation (CNV) detection. However, for most bacteria, their circular genome structure and high replication rate make reads more enriched near the replication origin. CNV detection based on read depth could be seriously influenced by such replication bias. Results: We show that the replication bias is widespread using ∼200 bacterial WGS data. We develop CNV-BAC (CNV-Bacteria) that can properly normalize the replication bias and other known biases in bacterial WGS data and can accurately detect CNVs. Simulation and real data analysis show that CNV-BAC achieves the best performance in CNV detection compared with available algorithms. Availability and implementation: CNV-BAC is available at https://github.com/XiDsLab/CNV-BAC. Supplementary information: Supplementary data are available at Bioinformatics online.
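The replication bias described above can be illustrated with a small sketch. This is not the CNV-BAC algorithm, whose details are not given in the abstract. It assumes that log read depth falls roughly linearly with circular distance from the replication origin, fits that trend by ordinary least squares, and divides it out. The function names, the known origin position, and the log-linear model are all assumptions made for illustration.

```python
import math


def circular_distance(pos, origin, genome_length):
    """Shortest distance between two positions on a circular genome."""
    d = abs(pos - origin) % genome_length
    return min(d, genome_length - d)


def normalize_replication_bias(positions, depths, origin, genome_length):
    """Return depth ratios after removing an assumed log-linear origin-distance trend.

    Requires at least two bins at different distances and strictly positive depths.
    """
    xs = [circular_distance(p, origin, genome_length) for p in positions]
    ys = [math.log(d) for d in depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    # A ratio near 1 means the bin matches the fitted trend; near 2 suggests a duplication.
    return [d / math.exp(intercept + slope * x) for d, x in zip(depths, xs)]
```

After normalization, bins that follow the replication gradient sit near a ratio of 1, so a duplicated bin stands out regardless of how close it is to the origin. A plain least-squares fit is itself pulled slightly by large CNVs, so a robust fit would be a sensible refinement.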


DNA copy number variation screening in familial hypercholesterolemia-related genes

Atherosclerosis ◽

10.1016/j.atherosclerosis.2018.06.218 ◽

2018 ◽

Vol 275 ◽

pp. e79

Author(s):

M. Iacocca ◽

J. Wang ◽

J. Dron ◽

H. Cao ◽

J. Robinson ◽

...

Keyword(s):

Copy Number Variation ◽

Familial Hypercholesterolemia ◽

Copy Number ◽

Dna Copy Number ◽

Dna Copy Number Variation ◽

Number Variation


Detection and characterization of copy number variants based on whole-genome sequencing by DNBSEQ platforms

10.1101/786962 ◽

2019 ◽

Author(s):

Junhua Rao ◽

Lihua Peng ◽

Fang Chen ◽

Hui Jiang ◽

Chunyu Geng ◽

...

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Copy Number ◽

Copy Number Variants ◽

Copy Number Variant ◽

Whole Genome ◽

Genome Wide ◽

Wide Range ◽

Distribution Sensitivity ◽

Cnv Detection

Abstract. Background: Next-generation sequencing (NGS) has developed rapidly in recent years, making whole-genome sequencing (WGS) a more cost- and time-efficient choice for a wide range of biological research. WGS data are commonly used to detect variants such as single nucleotide polymorphisms (SNPs), insertions and deletions (indels), and copy number variants (CNVs), which play an important role in many human diseases. However, the feasibility of WGS-based CNV detection on DNBSEQ™ platforms was unclear. We systematically analysed the genome-wide CNV detection power of DNBSEQ™ and Illumina platforms on NA12878 with five commonly used tools. Results: DNBSEQ™ platforms consistently detected slightly more CNVs genome-wide (on average 1.24-fold as many as Illumina platforms). CNVs called on DNBSEQ™ and Illumina platforms were then evaluated against two public NA12878 benchmarks. The two platforms showed similar sensitivity and precision on both benchmarks. Further, differences between CNV detection tools were analysed, indicating that the choice of tool could affect CNV detection performance, including count, distribution, sensitivity and precision. Conclusion: The major contribution of this paper is to provide, for the first time, a comprehensive guide to WGS-based CNV detection on DNBSEQ™ platforms.


HandyCNV: Standardized Summary, Annotation, Comparison, and Visualization of Copy Number Variant, Copy Number Variation Region, and Runs of Homozygosity

Frontiers in Genetics ◽

10.3389/fgene.2021.731355 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jinghang Zhou ◽

Liyuan Liu ◽

Thomas J. Lopdell ◽

Dorian J. Garrick ◽

Yuangang Shi

Keyword(s):

Copy Number ◽

Copy Number Variants ◽

Copy Number Variant ◽

R Package ◽

Copy Number Variation Region ◽

Runs Of Homozygosity ◽

Number Variation ◽

Genomic Studies ◽

Computing Platforms ◽

Analysis System

Detection of CNVs (copy number variants) and ROH (runs of homozygosity) from SNP (single nucleotide polymorphism) genotyping data is often required in genomic studies. The post-analysis of CNV and ROH generally involves many steps, potentially across multiple computing platforms, which requires the researchers to be familiar with many different tools. In order to get around this problem and improve research efficiency, we present an R package that integrates the summarization, annotation, map conversion, comparison and visualization functions involved in studies of CNV and ROH. This one-stop post-analysis system is standardized, comprehensive, reproducible, timesaving, and user-friendly for researchers in humans and most diploid livestock species.


Copy number variant detection with low-coverage whole-genome sequencing is a viable replacement for the traditional array-CGH

10.1101/2020.09.07.20183665 ◽

2020 ◽

Author(s):

Marcel Kucharik ◽

Jaroslav Budis ◽

Michaela Hyblova ◽

Gabriel Minarik ◽

Tomas Szemes

Keyword(s):

In Silico ◽

Copy Number ◽

Normal Population ◽

Genetic Disorders ◽

Prenatal Testing ◽

In Silico Analysis ◽

Copy Number Variant ◽

Detection Algorithm ◽

Copy Number Variations ◽

Cnv Detection

Copy number variations (CNVs) are a type of structural variant involving alterations in the number of copies of specific regions of DNA, which can be either deleted or duplicated. CNVs contribute substantially to normal population variability; however, abnormal CNVs cause numerous genetic disorders. Several methods for CNV detection are currently in use, from conventional cytogenetic analysis through microarray-based methods (aCGH) to next-generation sequencing (NGS). We present GenomeScreen, an NGS-based CNV detection method built on a previously described CNV detection algorithm used for non-invasive prenatal testing (NIPT). We determined the theoretical limits of its accuracy and confirmed them with an extensive in-silico study and already genotyped samples. Theoretically, at least 6M uniquely mapped reads are required to detect a CNV with a length of 100 kilobases (kb) or more with high confidence (Z-score > 7). In practice, the in-silico analysis showed that at least 8M reads are required to obtain >99% accuracy (for 100 kb deviations). We compared GenomeScreen with one of the aCGH methods currently used in diagnostic laboratories, which has a 200 kb mean resolution. GenomeScreen and aCGH both detected 59 deviations; GenomeScreen additionally detected 134 other (usually smaller) variations. Furthermore, the overall cost per sample is about 2-3x lower in the case of GenomeScreen.
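The relationship between read count, CNV length, and Z-score can be explored with a back-of-the-envelope sketch. It is not the authors' derivation. It assumes reads are spread uniformly over a 3.0 Gb genome, that counts in a region are Poisson distributed (so the standard deviation is the square root of the expected count), and that the event is heterozygous (copy ratio 0.5 for a deletion, 1.5 for a duplication). The function name and default values are hypothetical.

```python
import math


def expected_z_score(total_reads, region_length, genome_length=3.0e9, copy_ratio=0.5):
    """Expected Z-score of a region's read count under a simple Poisson model.

    copy_ratio is the assumed ratio of observed to normal copy number
    (0.5 for a heterozygous deletion, 1.5 for a heterozygous duplication).
    """
    expected = total_reads * region_length / genome_length  # reads in a normal region
    observed = expected * copy_ratio                        # reads in the altered region
    return abs(observed - expected) / math.sqrt(expected)


if __name__ == "__main__":
    for reads in (2e6, 6e6, 8e6):
        print(f"{reads:.0e} reads, 100 kb: Z = {expected_z_score(reads, 100_000):.2f}")
```

Under these simplifying assumptions, 6M reads and a 100 kb heterozygous event give a Z-score close to 7, which is in the same range as the threshold quoted in the abstract, though the published limits may rest on a different model. Because the Z-score grows with the square root of the expected count, halving the CNV length requires roughly twice as many reads for the same confidence.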
