Computational analysis of copy number variation in plant genomes

2021 ◽  
Author(s):  
Raúl Y. Wijfjes
BMC Genomics ◽  
2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Raúl Y. Wijfjes ◽  
Sandra Smit ◽  
Dick de Ridder

Abstract. Background: Copy number variation (CNV) is thought to actively contribute to adaptive evolution of plant species. While many computational algorithms are available to detect copy number variation from whole genome sequencing datasets, the typical complexity of plant data likely introduces false positive calls. Results: To enable reliable and comprehensive detection of CNV in plant genomes, we developed Hecaton, a novel computational workflow tailored to plants, that integrates calls from multiple state-of-the-art algorithms through a machine-learning approach. In this paper, we demonstrate that Hecaton outperforms current methods when applied to short read sequencing data of Arabidopsis thaliana, rice, maize, and tomato. Moreover, it correctly detects dispersed duplications, a type of CNV commonly found in plant species, in contrast to several state-of-the-art tools that erroneously represent this type of CNV as overlapping deletions and tandem duplications. Finally, Hecaton scales well in terms of memory usage and running time when applied to short read datasets of domesticated and wild tomato accessions. Conclusions: Hecaton provides a robust method to detect CNV in plants. We expect it to be of immediate interest to both applied and fundamental research on the relationship between genotype and phenotype in plants.


2019 ◽  
Author(s):  
Raúl Wijfjes ◽  
Sandra Smit ◽  
Dick de Ridder

Abstract. Copy number variation (CNV) is thought to actively contribute to adaptive evolution of plant species. While many computational algorithms are available to detect copy number variation from whole genome sequencing datasets, the typical complexity of plant data likely introduces false positive calls. To enable reliable and comprehensive detection of CNV in plant genomes, we developed Hecaton, a novel computational workflow tailored to plants, that integrates calls from multiple state-of-the-art algorithms through a machine-learning approach. In this paper, we demonstrate that Hecaton outperforms current methods when applied to short read sequencing data of A. thaliana, rice, maize, and tomato. Moreover, it correctly detects dispersed duplications, a type of CNV commonly found in plant species, in contrast to several state-of-the-art tools that erroneously represent this type of CNV as overlapping deletions and tandem duplications. Finally, Hecaton scales well in terms of memory usage and running time when applied to short read datasets of domesticated and wild tomato accessions. Hecaton provides a robust method to detect CNV in plants. We expect it to be of immediate interest to both applied and fundamental research on the relationship between genotype and phenotype in plants.


2015 ◽  
Vol 76 (S 01) ◽  
Author(s):  
Georgios Zenonos ◽  
Peter Howard ◽  
Maureen Lyons-Weiler ◽  
Wang Eric ◽  
William LaFambroise ◽  
...  

BIOCELL ◽  
2018 ◽  
Vol 42 (3) ◽  
pp. 87-91 ◽  
Author(s):  
Sergio LAURITO ◽  
Juan A. CUETO ◽  
Jimena PEREZ ◽  
María ROQUÉ
