reconCNV: interactive visualization of copy number data from high-throughput sequencing

Author(s):  
Raghu Chandramohan ◽  
Nipun Kakkar ◽  
Angshumoy Roy ◽  
D Williams Parsons

Abstract. Summary: Copy number variation (CNV) is an important category of unbalanced structural rearrangement. While methods for detecting CNV in high-throughput targeted sequencing have become increasingly sophisticated, dedicated tools for interactive and dynamic visualization of CNV from these data are still lacking. We describe reconCNV, a tool that produces an interactive and annotated web-based dashboard for viewing and summarizing CNVs detected in next-generation sequencing (NGS) data. reconCNV is designed to work with delimited result files from most NGS CNV callers with minor adjustments to the configuration file. The reconCNV output is an HTML file that is viewable on any modern web browser, requires no backend server, and can be readily appended to existing analysis pipelines. In addition to a standard CNV track for visualizing relative fold change and absolute copy number, the tool includes an auxiliary variant allele fraction track for visualizing underlying allelic imbalance and loss of heterozygosity. A feature to mask assay-specific technical artifacts and a direct HTML link out to the UCSC Genome Browser are also included to augment the reviewer experience. By providing a light-weight plugin for interactive visualization to existing NGS CNV pipelines, reconCNV can facilitate efficient NGS CNV visualization and interpretation in both research and clinical settings. Availability and implementation: The source code and documentation including a tutorial can be accessed at https://github.com/rghu/reconCNV as well as a Docker image at https://hub.docker.com/repository/docker/raghuc1990/reconcnv. Supplementary information: Supplementary data are available at Bioinformatics online.
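
The relationship between the relative fold change and the absolute copy number shown on a CNV track can be sketched as follows. This is a minimal illustration, not reconCNV's own code: the function names, the diploid baseline and the purity adjustment are assumptions based on the commonly used mixture model in which the observed ratio is a purity-weighted average of the tumor and normal copy numbers.

```python
def fold_change(log2_ratio: float) -> float:
    """Convert a log2 copy ratio into a linear fold change."""
    return 2.0 ** log2_ratio


def absolute_copy_number(log2_ratio: float, purity: float = 1.0,
                         normal_ploidy: int = 2) -> float:
    """Estimate absolute copy number from a log2 ratio (hypothetical helper).

    Assumes the observed ratio r satisfies
        r * normal_ploidy = purity * cn + (1 - purity) * normal_ploidy
    and solves for cn. With purity = 1.0 this reduces to normal_ploidy * r.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    ratio = fold_change(log2_ratio)
    return (normal_ploidy * ratio - normal_ploidy * (1.0 - purity)) / purity


if __name__ == "__main__":
    for lr in (-1.0, 0.0, 1.0):
        print(lr, fold_change(lr), absolute_copy_number(lr))
```

Under this model the same log2 ratio implies a more extreme copy number at lower purity, which is one reason a fold change track and an absolute copy number track are worth viewing side by side.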

2020 ◽  
Vol 160 (11-12) ◽  
pp. 634-642
Author(s):  
Shiqiang Luo ◽  
Xingyuan Chen ◽  
Tizhen Yan ◽  
Jiaolian Ya ◽  
Zehui Xu ◽  
...  

Copy number variation sequencing (CNV-seq), which is based on high-throughput sequencing, is commonly used to detect chromosomal abnormalities. This study identifies chromosomal abnormalities in aborted embryos/fetuses in early and middle pregnancy and explores the value of CNV-seq in determining the causes of pregnancy termination. High-throughput sequencing was used to detect chromosomal copy number variations (CNVs) in 116 aborted embryos in early and middle pregnancy. The results were compared with the Database of Genomic Variants (DGV), the Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources (DECIPHER), and the Online Mendelian Inheritance in Man (OMIM) database to determine the CNV type and clinical significance. Sequencing results were successfully obtained for 109 of the 116 specimens, a detection success rate of 93.97%. There were 64 cases with abnormal chromosome numbers and 23 cases with CNVs, of which 10 were pathogenic and 13 were variants of uncertain significance. An abnormal chromosome number is the most important cause of embryo termination in early and middle pregnancy, followed by pathogenic CNVs. CNV-seq can quickly and accurately detect chromosomal abnormalities and identify microdeletion and microduplication CNVs that cannot be detected by conventional chromosome analysis, making it convenient and efficient for diagnosing the genetic etiology of miscarriage.


2020 ◽  
Vol 36 (12) ◽  
pp. 3890-3891
Author(s):  
Linjie Wu ◽  
Han Wang ◽  
Yuchao Xia ◽  
Ruibin Xi

Abstract. Motivation: Whole-genome sequencing (WGS) is widely used for copy number variation (CNV) detection. However, for most bacteria, the circular genome structure and high replication rate cause reads to be enriched near the replication origin. CNV detection based on read depth can be seriously influenced by such replication bias. Results: We show that the replication bias is widespread using ∼200 bacterial WGS datasets. We develop CNV-BAC (CNV-Bacteria), which can properly normalize the replication bias and other known biases in bacterial WGS data and can accurately detect CNVs. Simulation and real data analysis show that CNV-BAC achieves the best performance in CNV detection compared with available algorithms. Availability and implementation: CNV-BAC is available at https://github.com/XiDsLab/CNV-BAC. Supplementary information: Supplementary data are available at Bioinformatics online.
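
To make the idea of replication bias concrete, the sketch below removes a depth trend that depends on the circular distance from the replication origin. It is an illustration only and not the CNV-BAC normalization: the function names are hypothetical, and the assumption that log2 read depth falls off linearly with distance from the origin is a simplification fitted here by ordinary least squares.

```python
import math


def circular_distance(pos: int, origin: int, genome_length: int) -> int:
    """Shortest distance between pos and origin on a circular genome."""
    d = abs(pos - origin) % genome_length
    return min(d, genome_length - d)


def normalize_replication_bias(positions, depths, origin, genome_length):
    """Return log2 depth residuals after removing a linear origin-distance trend.

    Assumes every depth is positive. The trend is fitted by ordinary least
    squares on (distance from origin, log2 depth); a real pipeline would
    likely need a fit that is robust to the CNVs themselves.
    """
    xs = [circular_distance(p, origin, genome_length) for p in positions]
    ys = [math.log2(d) for d in depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return [y - mean_y - slope * (x - mean_x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    length = 1000
    positions = list(range(0, length, 100))
    # Synthetic depths: highest at the origin (position 0), lowest at the terminus.
    depths = [100 * 2 ** (-circular_distance(p, 0, length) / length) for p in positions]
    depths[3] *= 2  # simulate a duplicated window
    print(normalize_replication_bias(positions, depths, 0, length))
```

After the trend is removed, windows near the origin and near the terminus are directly comparable, so a duplicated window stands out as a positive residual rather than being confused with proximity to the origin.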


Author(s):  
Hai Yang ◽  
Daming Zhu

Copy number variation (CNV) is a prevalent kind of genetic structural variation that leads to an abnormal number of copies of large genomic regions, such as gain or loss of DNA segments larger than 1 kb. CNV exists not only in the human genome but also in plant genomes. Current research has shown that CNV is associated with many complex diseases. In this paper, guanine-cytosine (GC) bias, mappability and their effects on read depth signals in sequencing data are discussed first. Subsequently, a new correction method for GC bias and an improved combinatorial CNV detection algorithm for high-throughput sequencing reads based on a hidden Markov model (CNV-HMM) are proposed. The corrected read depth signals have lower correlation with GC content, read mappability and the width of the analysis window. Then we create a hidden Markov model which maps the reads onto the reference genome and records the unmapped reads. The unmapped reads are counted and normalized. CNV-HMM detects abnormal read count signals and obtains candidate CNVs using the expectation maximization (EM) algorithm. Finally, we filter the candidate CNVs using split reads to improve the performance of our algorithm. The experimental results indicate that the CNV-HMM algorithm has higher accuracy and sensitivity for CNV detection than most current detection algorithms.
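
For readers unfamiliar with GC bias correction, the sketch below shows the widely used median-ratio scheme, in which each window's read depth is rescaled by the ratio of the overall median depth to the median depth of windows with the same GC percentage. It is not the new correction method proposed in the paper above, whose details are not given in the abstract; the function name and the binning by rounded GC percentage are assumptions.

```python
from collections import defaultdict
from statistics import median


def gc_correct(read_depths, gc_percents):
    """Median-ratio GC correction of per-window read depths.

    corrected_i = depth_i * (overall median depth) / (median depth of
    windows whose rounded GC percentage equals that of window i).
    Windows whose GC bin has a median of zero are set to 0.0.
    """
    overall = median(read_depths)

    # Group depths by rounded GC percentage.
    by_gc = defaultdict(list)
    for depth, gc in zip(read_depths, gc_percents):
        by_gc[round(gc)].append(depth)
    gc_median = {gc: median(values) for gc, values in by_gc.items()}

    corrected = []
    for depth, gc in zip(read_depths, gc_percents):
        m = gc_median[round(gc)]
        corrected.append(depth * overall / m if m > 0 else 0.0)
    return corrected


if __name__ == "__main__":
    depths = [50, 50, 100, 100, 200, 200]
    gc = [30, 30, 50, 50, 70, 70]
    print(gc_correct(depths, gc))
```

In this toy input the depth depends only on GC content, so the corrected signal is flat. In real data the residual deviations that remain after correction are the candidate copy number signals.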


2020 ◽  
Vol 36 (12) ◽  
pp. 3632-3636 ◽  
Author(s):  
Weibo Zheng ◽  
Jing Chen ◽  
Thomas G Doak ◽  
Weibo Song ◽  
Ying Yan

Abstract. Motivation: Programmed DNA elimination (PDE) plays a crucial role in the transitions between germline and somatic genomes in diverse organisms ranging from unicellular ciliates to multicellular nematodes. However, software designed specifically for the detection of DNA splicing events is scarce. In this paper, we describe Accurate Deletion Finder (ADFinder), an efficient detector of PDEs using high-throughput sequencing data. ADFinder can predict PDEs with relatively low sequencing coverage, detect multiple alternative splicing forms in the same genomic location and calculate the frequency of each splicing event. This software will facilitate research on PDEs and all downstream analyses. Results: By analyzing genome-wide DNA splicing events in the micronuclear genomes of Oxytricha trifallax and Tetrahymena thermophila, we show that ADFinder is effective in predicting large-scale PDEs. Availability and implementation: The source code and manual of ADFinder are available on GitHub: https://github.com/weibozheng/ADFinder. Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Yuansheng Liu ◽  
Xiaocai Zhang ◽  
Quan Zou ◽  
Xiangxiang Zeng

Abstract. Summary: Removing duplicate and near-duplicate reads generated by high-throughput sequencing technologies can reduce the computational resources required by downstream applications. Here we develop minirmd, a de novo tool that removes duplicate reads via multiple rounds of clustering using minimizers of different lengths. Experiments demonstrate that minirmd removes more near-duplicate reads than existing clustering approaches and is faster than existing multi-core tools. To the best of our knowledge, minirmd is the first tool to remove near-duplicates on the reverse-complementary strand. Availability and implementation: https://github.com/yuansliu/minirmd. Supplementary information: Supplementary data are available at Bioinformatics online.
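
The general idea of minimizer-based clustering can be sketched in a few lines. The code below is a single-round toy illustration and not minirmd's implementation: the function names, the use of the lexicographically smallest k-mer as the minimizer, and the Hamming distance threshold are all assumptions. Each read is bucketed by the minimizer of whichever strand sorts first, so a read and its reverse complement land in the same bucket, and a read is dropped if it is within a few mismatches of a read already kept in that bucket.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def minimizer(seq: str, k: int) -> str:
    """Lexicographically smallest k-mer of seq (assumes len(seq) >= k)."""
    return min(seq[i:i + k] for i in range(len(seq) - k + 1))


def canonical(seq: str, k: int):
    """Return (minimizer, oriented read), identical for a read and its reverse complement."""
    rc = reverse_complement(seq)
    return min((minimizer(seq, k), seq), (minimizer(rc, k), rc))


def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))


def remove_near_duplicates(reads, k=4, max_mismatches=1):
    """Keep the first read of each near-duplicate group (one clustering round)."""
    buckets = defaultdict(list)
    kept = []
    for read in reads:
        key, oriented = canonical(read, k)
        is_dup = any(
            len(other) == len(oriented) and hamming(other, oriented) <= max_mismatches
            for other in buckets[key]
        )
        if not is_dup:
            buckets[key].append(oriented)
            kept.append(read)
    return kept


if __name__ == "__main__":
    reads = ["ACGTTGCAAC", "GTTGCAACGT", "ACGTTGCAAG", "TTTTTTTTTT"]
    print(remove_near_duplicates(reads))
```

A mismatch that falls inside the minimizer itself sends a near-duplicate to a different bucket, so a single round misses some pairs. Repeating the pass with minimizers of different lengths, as the abstract describes, is one way to recover them.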


2006 ◽  
Vol 16 (12) ◽  
pp. 1566-1574 ◽  
Author(s):  
H. Fiegler ◽  
R. Redon ◽  
D. Andrews ◽  
C. Scott ◽  
R. Andrews ◽  
...  
