BamToCov: an efficient toolkit for sequence coverage calculations

2021 ◽  
Author(s):  
Giovanni Birolo ◽  
Andrea Telatin

Many genomics applications require calculating the nucleotide coverage of a reference or counting how many reads map to a reference region. Here we present BamToCov, a suite of tools for rapid and flexible coverage calculations that relies on a memory-efficient algorithm and is designed for easy integration into bespoke pipelines. The tools process sorted BAM or CRAM files and extract coverage information using different filtering approaches. Unlike existing tools, BamToCov tools have been developed to require a minimal amount of memory, to be easily integrated into workflows, and to allow strand-specific coverage analyses. The unique coverage calculation algorithm makes the suite the ideal choice for the analysis of long-read alignments. The programs and their documentation are freely available at https://github.com/telatin/bamtocov.
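
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to compute coverage from position-sorted alignments while keeping memory proportional to the depth rather than to read length. The function name `coverage_runs` and the representation of alignments as half-open `(start, end)` tuples are assumptions made for illustration; this is not the BamToCov code or its interface.

```python
import heapq
from typing import Iterable, Iterator, List, Optional, Tuple


def coverage_runs(
    alignments: Iterable[Tuple[int, int]]
) -> Iterator[Tuple[int, int, int]]:
    """Yield (start, end, depth) runs with depth > 0.

    `alignments` are half-open (start, end) intervals sorted by start
    (hypothetical input format, standing in for records of a sorted BAM).
    Adjacent runs that happen to share the same depth are not merged.
    """
    open_ends: List[int] = []   # min-heap of ends of alignments still open
    pos: Optional[int] = None   # left edge of the run being built

    for start, end in alignments:
        # Close every alignment that ends at or before the new start.
        while open_ends and open_ends[0] <= start:
            e = heapq.heappop(open_ends)
            if e > pos:
                # Depth before this pop was len(open_ends) + 1.
                yield (pos, e, len(open_ends) + 1)
                pos = e
        # Emit the stretch covered only by the alignments still open.
        if open_ends and start > pos:
            yield (pos, start, len(open_ends))
        pos = start
        heapq.heappush(open_ends, end)

    # Flush the alignments that remain open after the last record.
    while open_ends:
        e = heapq.heappop(open_ends)
        if e > pos:
            yield (pos, e, len(open_ends) + 1)
            pos = e
```

In this sketch the heap never holds more entries than the number of alignments overlapping the current position, which is why the approach suits long reads: a 100 kb alignment costs one heap entry, not 100,000 per-base counters.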

Author(s):  
Quang Tran ◽  
Alexej Abyzov

Summary: Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AGE requires a vast amount of memory when aligning a pair of long sequences. To address this, we developed a memory-efficient implementation, LongAGE, based on the classical Hirschberg algorithm. We demonstrate an application of LongAGE for resolving breakpoints of SVs embedded into segmental duplications on Pacific Biosciences (PacBio) reads that can be longer than 10 kb. Furthermore, we observed different breakpoints for a deletion and a duplication in the same locus, providing direct evidence that such multi-allelic copy number variants (mCNVs) arise from two or more independent ancestral mutations. Availability and implementation: LongAGE is implemented in C++ and available on GitHub at https://github.com/Coaxecva/LongAGE. Supplementary information: Supplementary data are available at Bioinformatics online.
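
To illustrate the memory-saving idea the abstract refers to, here is a minimal sketch of the classical Hirschberg algorithm for plain global alignment with a linear gap penalty. It does not implement AGE's gap excision and is not the LongAGE code; the function names and the scoring values (`match=1`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration.

```python
def nw_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the Needleman-Wunsch score matrix, keeping two rows only."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev


def hirschberg(a, b, match=1, mismatch=-1, gap=-1):
    """Return an optimal global alignment (x, y) of a and b in linear space."""
    if not a:
        return "-" * len(b), b
    if not b:
        return a, "-" * len(a)
    if len(a) == 1:
        # Base case: pair the single character with its best partner in b,
        # unless gapping everything scores higher.
        best_j = max(range(len(b)),
                     key=lambda j: match if b[j] == a else mismatch)
        s = match if b[best_j] == a else mismatch
        if s + (len(b) - 1) * gap >= (len(b) + 1) * gap:
            return "-" * best_j + a + "-" * (len(b) - best_j - 1), b
        return a + "-" * len(b), "-" + b

    mid = len(a) // 2
    # Scores of aligning the first half of a with every prefix of b ...
    left = nw_last_row(a[:mid], b, match, mismatch, gap)
    # ... and the second half of a with every suffix of b (via reversal).
    right = nw_last_row(a[mid:][::-1], b[::-1], match, mismatch, gap)
    split = max(range(len(b) + 1),
                key=lambda j: left[j] + right[len(b) - j])

    x1, y1 = hirschberg(a[:mid], b[:split], match, mismatch, gap)
    x2, y2 = hirschberg(a[mid:], b[split:], match, mismatch, gap)
    return x1 + x2, y1 + y2


def alignment_score(x, y, match=1, mismatch=-1, gap=-1):
    """Score two gapped strings of equal length column by column."""
    total = 0
    for p, q in zip(x, y):
        if p == "-" or q == "-":
            total += gap
        else:
            total += match if p == q else mismatch
    return total
```

The divide-and-conquer step only ever stores score rows of length `len(b) + 1`, so working memory grows linearly with sequence length instead of with the product of the two lengths, at the cost of roughly doubling the computation.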


2016 ◽  
Vol 51 (2) ◽  
pp. 595-625 ◽  
Author(s):  
Souleymane Zida ◽  
Philippe Fournier-Viger ◽  
Jerry Chun-Wei Lin ◽  
Cheng-Wei Wu ◽  
Vincent S. Tseng
