1) Environmental bulk samples often contain many taxa whose biomass differs by several orders of magnitude. This can be problematic in DNA metabarcoding and metagenomic high-throughput sequencing approaches, because large specimens contribute a disproportionately large share of the DNA template. A few specimens of high biomass can therefore dominate the dataset, potentially leaving smaller specimens undetected. Sorting samples by specimen size and balancing the amount of tissue used per size fraction should improve detection rates, but this has not been systematically tested.
2) Here we tested the effects of size sorting on taxa detection using freshwater macroinvertebrates. Kick sampling was performed at two locations in a low-mountain stream in western Germany. Specimens were morphologically identified and sorted into small, medium and large size classes (< 2.5 x 5 mm, 5 x 10 mm and up to 10 x 20 mm). Tissue from the three size classes was extracted separately, and the extracts were then pooled to simulate two kinds of bulk sample: samples that were not sorted, and samples that were sorted and then pooled proportionately by specimen size. DNA from all five extractions of both samples was amplified using four different freshwater primer sets for the COI gene and sequenced on an Illumina HiSeq sequencer.
3) Sorting taxa by size and pooling them proportionately according to their abundance led to more even amplification than processing complete samples without sorting. The sorted samples recovered 30% more taxa than the unsorted samples at the same sequencing depth. Our results imply that sequencing depth can be decreased approximately five-fold when samples are sorted into three size classes.
4) Our results demonstrate that even coarse size sorting can substantially improve detection rates. Although high-throughput sequencing will become cheaper and more accessible in the coming years, sorting bulk samples by specimen biomass is a simple yet efficient method to reduce current sequencing costs.
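
The two pooling schemes described in point 2 can be illustrated with a minimal Python sketch. It is not the authors' protocol: the tissue weights and specimen counts are hypothetical placeholder values, and the sketch assumes that an unsorted bulk sample can be approximated by mixing the three extracts in proportion to tissue weight, while the sorted sample mixes them in proportion to specimen count. The function name `pooling_fractions` is likewise introduced here for illustration only.

```python
def pooling_fractions(weights):
    """Normalise non-negative weights so that the fractions sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {name: value / total for name, value in weights.items()}


# Hypothetical example values, NOT data from the study.
tissue_mg = {"small": 20.0, "medium": 80.0, "large": 900.0}   # tissue weight per size class
specimen_count = {"small": 600, "medium": 300, "large": 100}  # specimens per size class

# Assumed stand-in for an unsorted bulk sample: extracts mixed by biomass,
# so the large size class dominates the DNA template.
unsorted_mix = pooling_fractions(tissue_mg)

# Assumed stand-in for the sorted sample: extracts mixed by specimen count,
# so each size class contributes according to its abundance.
sorted_mix = pooling_fractions(specimen_count)

for size_class in tissue_mg:
    print(
        f"{size_class:>6}: unsorted {unsorted_mix[size_class]:.2f}, "
        f"sorted {sorted_mix[size_class]:.2f}"
    )
```

With these placeholder numbers the small size class makes up only a small fraction of the biomass-weighted mix but the largest fraction of the abundance-weighted mix, which is the imbalance that size sorting is meant to correct.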