Abstract
Background
Advances in the analysis of amplicon sequence datasets have introduced a methodological shift in how research teams investigate microbial biodiversity, away from the classification and downstream analysis of traditional operational taxonomic units (OTUs) and toward the use of amplicon sequence variants (ASVs). While ASVs have several inherent properties that make them desirable compared to OTUs, questions remain as to the influence that these pipelines have on the ecological patterns being assessed, especially when compared to other methodological choices made when processing data (e.g. rarefaction) and computing diversity indices.

Results
We compared the respective influences of ASV- vs. OTU-based pipelines, rarefaction of the community table, and OTU similarity threshold (97% vs. 99%) on the ecological signals detected in freshwater invertebrate and environmental (sediment, seston) 16S rRNA datasets, determining the effects on alpha diversity, beta diversity and taxonomic composition. While the choice of OTU vs. ASV pipeline significantly influenced unweighted alpha and beta diversities and changed the ecological signal detected, weighted indices such as the Shannon index, Bray-Curtis dissimilarity, and weighted UniFrac scores were not impacted by the pipeline followed. By comparison, OTU threshold and rarefaction had a minimal effect on all measurements, although rarefaction improved overall signals, especially in OTU-based datasets. The identification of major classes and genera revealed significant discrepancies across methodologies.

Conclusion
We provide a list of recommendations for the analysis of 16S rRNA amplicon data. We notably recommend the use of ASVs when analyzing alpha-diversity patterns, especially in species-rich or environmental samples. Abundance-weighted alpha- and beta-diversity indices should also be preferred over ones based on the presence-absence of biological units.
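To illustrate the distinction between presence-absence and abundance-weighted indices, the following minimal Python sketch computes richness, the Shannon index, Jaccard dissimilarity, and Bray-Curtis dissimilarity on two toy count vectors. The vectors, function names, and the scenario (two extra singleton units, such as a pipeline might split off) are hypothetical and chosen only for illustration. They are not data or code from this study, and UniFrac is omitted because it requires a phylogeny.

```python
import math


def richness(counts):
    """Unweighted alpha diversity: number of units observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Abundance-weighted alpha diversity: Shannon index (natural log)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def jaccard_dissimilarity(a, b):
    """Unweighted beta diversity based on presence-absence only."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis(a, b):
    """Abundance-weighted beta diversity: Bray-Curtis dissimilarity."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


# Hypothetical toy community: three dominant units.
sample_a = [500, 300, 200, 0, 0]
# Same community with two additional singleton units (assumed for illustration).
sample_b = [500, 300, 200, 1, 1]

print("richness:", richness(sample_a), richness(sample_b))
print("shannon:", round(shannon(sample_a), 3), round(shannon(sample_b), 3))
print("jaccard:", round(jaccard_dissimilarity(sample_a, sample_b), 3))
print("bray-curtis:", round(bray_curtis(sample_a, sample_b), 4))
```

In this toy case the two singletons change richness and Jaccard dissimilarity substantially, while the Shannon index and Bray-Curtis dissimilarity barely move. This is one plausible reason why rare units that differ between pipelines would weigh more heavily on unweighted indices.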