QuAdTrim: Overcoming computational bottlenecks in sequence quality control
Abstract
With the recent torrent of high-throughput sequencing (HTS) data, highly efficient algorithms for common tasks are essential. One task that underpins all further analysis of HTS data is initial quality control, that is, the removal or trimming of poor-quality reads from the dataset. Here we present QuAdTrim, a quality control and adapter trimming algorithm for HTS data that is up to 57 times faster and uses less than 0.06% of the memory of other commonly used HTS quality control programs. QuAdTrim reduces the time and memory required for quality control of HTS data and, in doing so, reduces the computational demands of a fundamental step in HTS data analysis. Additionally, QuAdTrim implements the removal of homopolymer Gs from the 3’ end of sequence reads, a common error generated on the NovaSeq, NextSeq and iSeq100 platforms.

Availability and Implementation
The source code is freely available on Bitbucket under a BSD licence (see the COPYING file for details): https://bitbucket.org/arobinson/quadtrim

Contact
Andrew Robinson, andrewjrobinson at gmail dot com
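
Illustrative sketch
The removal of homopolymer Gs from the 3’ end of a read, mentioned in the abstract, can be illustrated with a minimal Python sketch. This is not QuAdTrim's implementation: the function name `trim_poly_g` and the `min_run` threshold of 10 bases are assumptions chosen only for illustration, and the sketch ignores sequencing errors inside the G run.

```python
from typing import Tuple


def trim_poly_g(seq: str, qual: str, min_run: int = 10) -> Tuple[str, str]:
    """Remove a trailing run of G bases from a read.

    seq and qual are the base and quality strings of one FASTQ record.
    min_run is a hypothetical threshold: shorter trailing G runs are kept,
    since a few terminal Gs can be genuine sequence.
    """
    # Length of the uninterrupted run of G at the 3' end.
    run = len(seq) - len(seq.rstrip("G"))
    if run >= min_run:
        keep = len(seq) - run
        # Trim bases and qualities together so they stay the same length.
        return seq[:keep], qual[:keep]
    return seq, qual


if __name__ == "__main__":
    read = "ACGT" + "G" * 12
    print(trim_poly_g(read, "I" * len(read)))
```

A production trimmer would typically also tolerate occasional non-G bases within the run and apply a minimum read length after trimming; both are left out here to keep the idea visible.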