An accurate and powerful method for copy number variation detection

2019 ◽  
Vol 35 (17) ◽  
pp. 2891-2898
Author(s):  
Feifei Xiao ◽  
Xizhi Luo ◽  
Ning Hao ◽  
Yue S Niu ◽  
Xiangjun Xiao ◽  
...  

Abstract Motivation Integration of multiple genetic sources for copy number variation (CNV) detection is a powerful approach to improve the identification of variants associated with complex traits. Although it has been shown that the widely used change-point-based methods can increase statistical power to identify variants, it remains challenging to effectively detect CNVs with weak signals due to the noisy nature of genotyping intensity data. We previously developed modSaRa, a normal mean-based model on a screening and ranking algorithm for copy number variation identification, which presented desirable sensitivity with high computational efficiency. To boost statistical power for the identification of variants, here we present a novel improvement that integrates the relative allelic intensity with external information from empirical statistics with modeling, which we called modSaRa2. Results Simulation studies illustrated that modSaRa2 markedly improved both sensitivity and specificity over existing methods for analyzing array-based data. The improvement in weak CNV signal detection is the most substantial, and the method also improves stability when CNV size varies. The application of the new method to a whole genome melanoma dataset identified novel candidate melanoma risk-associated deletions on chromosome band 1p22.2 and duplications on the 6p22, 6q25 and 19p13 regions, which may facilitate the understanding of the possible roles of germline copy number variants in the etiology of melanoma. Availability and implementation http://c2s2.yale.edu/software/modSaRa2 or https://github.com/FeifeiXiaoUSC/modSaRa2. Supplementary information Supplementary data are available at Bioinformatics online.
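
The abstract above refers to a screening and ranking approach to change-point detection without spelling out the mechanics. As a rough illustration only, the sketch below shows the general idea of such a scan on a one-dimensional intensity signal: compute a local diagnostic (the absolute difference between the means of the windows to the right and left of each position), then keep positions that are local maxima of that diagnostic and exceed a threshold. The function names, the window width `h` and the `threshold` value are assumptions made for this example; this is not the modSaRa or modSaRa2 implementation, and it ignores allelic intensity and the external empirical information described in the abstract.

```python
from statistics import mean


def local_diagnostic(y, h):
    """Return D, where D[i] = |mean(y[i:i+h]) - mean(y[i-h:i])|.

    Positions closer than h to either end keep the value 0.0.
    """
    n = len(y)
    d = [0.0] * n
    for i in range(h, n - h + 1):
        d[i] = abs(mean(y[i:i + h]) - mean(y[i - h:i]))
    return d


def candidate_change_points(y, h, threshold):
    """Screen for positions that are local maxima of D within +/- h
    and whose diagnostic value is at least `threshold` (hypothetical cutoff)."""
    d = local_diagnostic(y, h)
    n = len(y)
    points = []
    for i in range(h, n - h + 1):
        window = d[max(0, i - h):min(n, i + h + 1)]
        if d[i] >= threshold and d[i] == max(window):
            points.append(i)
    return points


# Toy signal: baseline 0 with a raised segment covering indices 20..29.
signal = [0] * 20 + [1] * 10 + [0] * 20
print(candidate_change_points(signal, h=5, threshold=0.5))
```

On noisy data, ties and spurious local maxima are possible, so a practical method would follow the screening step with a ranking or model-selection step; that part is omitted here.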

2020 ◽  
Vol 36 (12) ◽  
pp. 3890-3891
Author(s):  
Linjie Wu ◽  
Han Wang ◽  
Yuchao Xia ◽  
Ruibin Xi

Abstract Motivation Whole-genome sequencing (WGS) is widely used for copy number variation (CNV) detection. However, for most bacteria, their circular genome structure and high replication rate make reads more enriched near the replication origin. CNV detection based on read depth could be seriously influenced by such replication bias. Results We show that the replication bias is widespread using ∼200 bacterial WGS data. We develop CNV-BAC (CNV-Bacteria) that can properly normalize the replication bias and other known biases in bacterial WGS data and can accurately detect CNVs. Simulation and real data analysis show that CNV-BAC achieves the best performance in CNV detection compared with available algorithms. Availability and implementation CNV-BAC is available at https://github.com/XiDsLab/CNV-BAC. Supplementary information Supplementary data are available at Bioinformatics online.
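
To make the replication bias described above concrete, here is a minimal sketch, under simplifying assumptions, of one way such a trend could be removed from binned read depth. It assumes that log2 depth falls off linearly with circular distance from the replication origin, fits that line by ordinary least squares, and returns the residuals. The function names, the binning, and the linear model are assumptions for illustration; this is not the normalization used by CNV-BAC, which the abstract says also handles other known biases.

```python
import math


def circular_distance(pos, origin, genome_length):
    """Shortest distance between pos and origin on a circular genome."""
    d = abs(pos - origin) % genome_length
    return min(d, genome_length - d)


def remove_replication_trend(positions, depths, origin, genome_length):
    """Fit log2(depth) ~ a + b * distance_to_origin by least squares and
    return the residuals (log2 ratios relative to the fitted trend).

    Assumes every depth is strictly positive and that the bins do not all
    lie at the same distance from the origin.
    """
    xs = [circular_distance(p, origin, genome_length) for p in positions]
    ys = [math.log2(d) for d in depths]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return [y - y_bar - slope * (x - x_bar) for x, y in zip(xs, ys)]


# Synthetic example: depth halves over one genome length of distance from the origin.
genome_length = 1000
bins = list(range(0, genome_length, 100))
depth = [100 * 2 ** (-circular_distance(p, 0, genome_length) / genome_length) for p in bins]
residuals = remove_replication_trend(bins, depth, origin=0, genome_length=genome_length)
```

After this step, a bin with a residual well above or below zero would be a candidate for a gain or loss, whereas in the raw depths the same bin could be masked by its position relative to the origin.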


2018 ◽  
Author(s):  
Grace Png ◽  
Daniel Suveges ◽  
Young-Chan Park ◽  
Klaudia Walter ◽  
Kousik Kundu ◽  
...  

Motivation Copy number variants (CNVs) are large deletions or duplications at least 50 to 200 base pairs long. They play an important role in multiple disorders, but accurate calling of CNVs remains challenging. Most current approaches to CNV detection use raw read alignments, which are computationally intensive to process. Results We use a regression tree-based approach to call CNVs from whole-genome sequencing (WGS, > 18x) variant call-sets in 6,898 samples across four European cohorts, and describe a rich large variation landscape comprising 1,320 CNVs. 61.8% of detected events have been previously reported in the Database of Genomic Variants. 23% of high-quality deletions affect entire genes, and we recapitulate known events such as the GSTM1 and RHD gene deletions. We test for association between the detected deletions and 275 protein levels in 1,457 individuals to assess the potential clinical impact of the detected CNVs. We describe the LD structure and copy number variation underlying the association between levels of the CCL3 protein and a complex structural variant (MAF = 0.15, p = 3.6×10^-12) affecting CCL3L3, a paralog of the CCL3 gene. We also identify a cis-association between a low-frequency NOMO1 deletion and the protein product of this gene (MAF = 0.02, p = 2.2×10^-7), for which no cis- or trans-single nucleotide variant-driven protein quantitative trait locus (pQTL) has been documented to date. This work demonstrates that existing population-wide WGS call-sets can be mined for CNVs with minimal computational overhead, delivering insight into a less well-studied, yet potentially impactful class of genetic variant. Availability The regression tree-based approach, UN-CNVc, is available as an R and bash executable on GitHub at https://github.com/agilly/[email protected];[email protected] Supplementary Information Supplementary information is appended.
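
The abstract above mentions a regression tree-based approach without giving details. As a hedged illustration of the general technique, the sketch below fits a one-dimensional regression tree to an ordered series of relative depth values: it repeatedly picks the split that most reduces the sum of squared errors and stops when the gain drops below a cutoff, which yields piecewise-constant segments. The function names and the `min_size` and `min_gain` parameters are assumptions for this example; this is not the UN-CNVc code, which is an R and bash tool.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(values, min_size):
    """Return (gain, k) for the split index k that most reduces SSE,
    or None when no split leaves at least min_size values on each side."""
    best = None
    total = sse(values)
    for k in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:k]) - sse(values[k:])
        if best is None or gain > best[0]:
            best = (gain, k)
    return best


def segment(values, min_size=3, min_gain=0.1, offset=0):
    """Recursively split `values`; return a list of (start, end, mean) segments
    with half-open index ranges relative to the original series."""
    split = best_split(values, min_size)
    if split is None or split[0] < min_gain:
        return [(offset, offset + len(values), sum(values) / len(values))]
    _, k = split
    left = segment(values[:k], min_size, min_gain, offset)
    right = segment(values[k:], min_size, min_gain, offset + k)
    return left + right


# Toy relative-depth series with a five-marker drop to half the baseline.
depth = [1.0] * 10 + [0.5] * 5 + [1.0] * 10
for start, end, level in segment(depth):
    print(start, end, level)
```

Segments whose mean sits clearly below or above the baseline would then be the candidate deletions or duplications; choosing `min_gain` is the main tuning decision in a sketch like this.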


2014 ◽  
Vol 31 (9) ◽  
pp. 1341-1348 ◽  
Author(s):  
Feifei Xiao ◽  
Xiaoyi Min ◽  
Heping Zhang

2018 ◽  
Vol 5 (3) ◽  
pp. 307-314 ◽  
Author(s):  
Lydia Sagath ◽  
Vilma-Lotta Lehtokari ◽  
Salla Välipakka ◽  
Bjarne Udd ◽  
Carina Wallgren-Pettersson ◽  
...  

2015 ◽  
Vol 143 (suppl_1) ◽  
pp. A013-A013
Author(s):  
Linda B. Baughn ◽  
Getiria Onsongo ◽  
Matthew Bower ◽  
Christine Henzler ◽  
Kevin A.T. Silverstein ◽  
...  

2011 ◽  
Vol 12 (8) ◽  
pp. R80 ◽  
Author(s):  
Jiqiu Cheng ◽  
Evelyne Vanneste ◽  
Peter Konings ◽  
Thierry Voet ◽  
Joris R Vermeesch ◽  
...  
