base call
Recently Published Documents


2021 ◽  
Author(s):  
Guillermo Dufort y Alvarez ◽  
Gadiel Seroussi ◽  
Pablo Smircich ◽  
Jose Roberto Sotelo ◽  
Idoia Ochoa ◽  
...  

Nanopore sequencing technologies are rapidly gaining popularity, in part because of the massive amounts of genomic data they produce in short periods of time (up to 8.5 TB of data in less than 72 hours). To reduce the costs of transmission and storage, efficient compression methods for this type of data are needed. Unlike short-read technologies, nanopore sequencing generates long, noisy reads of variable length. In this note we introduce RENANO, a reference-based lossless FASTQ data compressor specifically tailored to FASTQ files generated with nanopore sequencing technologies. RENANO builds on the recent compressor ENANO, which is currently the state of the art. It focuses on improving the compression of the base call sequence portion of the FASTQ file, leaving the other parts of ENANO intact. Two novel reference-based compression algorithms are introduced, covering two scenarios: in the first, a reference genome is available at no cost to both the compressor and the decompressor; in the second, the reference genome is available only on the compressor side, and a compacted version of the reference is transmitted to the decompressor as part of the compressed file. To evaluate the proposed algorithms, we compare RENANO against ENANO on several publicly available nanopore datasets. In the first scenario, RENANO improves on ENANO's compression of the base call sequences by 40.8% on average over all the datasets. For total compression (including the other parts of the FASTQ file), the average improvement is 13.1%. In the second scenario, the base call compression improvements of RENANO over ENANO range from 15.2% to 49.0%, depending on the coverage of the compressed dataset, while in terms of total size the improvements range from 5.1% to 16.5%.
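The general idea behind reference-based compression of base calls can be illustrated with a toy sketch. This is not RENANO's algorithm, which the abstract does not detail; it assumes the read's position in the reference is already known and that the read differs from the reference only by substitutions. The names `encode_read`, `decode_read`, and `relative_improvement` are hypothetical, and the improvement formula (size reduction relative to the baseline) is an assumption about how percentages such as those above might be computed.

```python
from typing import List, Tuple

# Toy encoding (an assumption, not RENANO's format):
# (start position in reference, read length, list of (offset, substituted base))
Encoded = Tuple[int, int, List[Tuple[int, str]]]


def encode_read(reference: str, read: str, pos: int) -> Encoded:
    """Describe `read` as its differences from reference[pos:pos + len(read)]."""
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read does not fit in the reference at this position")
    # Store only the offsets where the read disagrees with the reference.
    mismatches = [
        (i, read_base)
        for i, (ref_base, read_base) in enumerate(zip(window, read))
        if ref_base != read_base
    ]
    return pos, len(read), mismatches


def decode_read(reference: str, encoded: Encoded) -> str:
    """Rebuild the read exactly from the reference and the stored differences."""
    pos, length, mismatches = encoded
    bases = list(reference[pos:pos + length])
    for offset, base in mismatches:
        bases[offset] = base
    return "".join(bases)


def relative_improvement(baseline_size: float, new_size: float) -> float:
    """Percentage size reduction of `new_size` relative to `baseline_size`."""
    return 100.0 * (baseline_size - new_size) / baseline_size


if __name__ == "__main__":
    reference = "ACGTACGTACGT"
    read = "GTAAGT"  # matches reference[2:8] except for one substitution
    encoded = encode_read(reference, read, pos=2)
    assert decode_read(reference, encoded) == read  # lossless round trip
    print(encoded)
```

Because the decoder needs the same reference to rebuild the read, a scheme like this only works when the reference (or a compacted version of it) reaches the decompressor, which is the distinction between the two scenarios described above.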


2008 ◽  
Vol 36 (10) ◽  
pp. 3194-3201 ◽  
Author(s):  
A. P. Malanoski ◽  
B. Lin ◽  
D. A. Stenger

2007 ◽  
Vol 35 (21) ◽  
pp. e148-e148 ◽  
Author(s):  
Gagan A. Pandya ◽  
Michael H. Holmes ◽  
Sirisha Sunkara ◽  
Andrew Sparks ◽  
Yun Bai ◽  
...  
