Fast nanopore sequencing data analysis with SLOW5

Author(s):  
Hasindu Gamaarachchi ◽  
Hiruna Samarakoon ◽  
Sasha P. Jenner ◽  
James M. Ferguson ◽  
Timothy G. Amos ◽  
...  

Abstract Nanopore sequencing depends on the FAST5 file format, which does not allow efficient parallel analysis. Here we introduce SLOW5, an alternative format engineered for efficient parallelization and acceleration of nanopore data analysis. Using the example of DNA methylation profiling of a human genome, analysis runtime is reduced from more than two weeks to approximately 10.5 h on a typical high-performance computer. SLOW5 is approximately 25% smaller than FAST5 and delivers consistent improvements on different computer architectures.

2021 ◽  
Author(s):  
Hasindu Gamaarachchi ◽  
Hiruna Samarakoon ◽  
Sasha P. Jenner ◽  
James M. Ferguson ◽  
Timothy G. Amos ◽  
...  

Nanopore sequencing is an emerging genomic technology with great potential. However, the storage and analysis of nanopore sequencing data have become major bottlenecks preventing more widespread adoption in research and clinical genomics. Here, we elucidate an inherent limitation in the file format used to store raw nanopore data, known as FAST5, that prevents efficient analysis on high-performance computing (HPC) systems. To overcome this we have developed SLOW5, an alternative file format that permits efficient parallelisation and, thereby, acceleration of nanopore data analysis. For example, we show that using SLOW5 format, instead of FAST5, reduces the time and cost of genome-wide DNA methylation profiling by an order of magnitude on common HPC systems, and delivers consistent improvements on a wide range of different architectures. With a simple, accessible file structure and a ~25% reduction in size compared to FAST5, SLOW5 format will deliver substantial benefits to all areas of the nanopore community.




Author(s):  
Alberto Magi ◽  
Roberto Semeraro ◽  
Alessandra Mingrino ◽  
Betti Giusti ◽  
Romina D’Aurizio

PLoS ONE ◽  
2014 ◽  
Vol 9 (6) ◽  
pp. e99033 ◽  
Author(s):  
Luis Santana-Quintero ◽  
Hayley Dingerdissen ◽  
Jean Thierry-Mieg ◽  
Raja Mazumder ◽  
Vahan Simonyan

Genomics ◽  
2017 ◽  
Vol 109 (2) ◽  
pp. 83-90 ◽  
Author(s):  
Yan Guo ◽  
Yulin Dai ◽  
Hui Yu ◽  
Shilin Zhao ◽  
David C. Samuels ◽  
...  
