Rarity of microbial species: In search of reliable associations
Abstract
The role of microbial interactions in shaping the properties of microbiota is a topic of key interest in microbial ecology. Microbiota contain hundreds to thousands of operational taxonomic units (OTUs), most of which are rare. This feature of community structure can lead to methodological difficulties: simulations have shown that methods for detecting pairwise associations between OTUs (which presumably reflect interactions) yield problematic results. In particular, the performance of association detection tools is impaired when the OTU table contains a high proportion of zeros. Here, we explored the statistical testability of such associations given occurrence and read abundance data. The goal was to understand the impact of OTU rarity on the testability of correlation coefficients. We found that a large proportion of pairwise associations, especially negative associations, cannot be reliably tested. This constraint could hamper the identification of candidate biological agents that could be used to control rare pathogens. Consequently, identifying testable associations could serve as an objective method for trimming datasets (in lieu of current empirical approaches). This trimming strategy could significantly reduce computation time and improve the inference of association networks. When OTU prevalence is low, association measures for occurrence and read abundance data are correlated, raising questions about the information actually being captured.
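The notion of testability for occurrence data can be illustrated with a minimal sketch. It assumes a one-sided Fisher-type exact test with fixed prevalences, which is an illustrative choice and not necessarily the procedure used in this study; the function names and the significance level `alpha` are hypothetical. The idea is that, for two OTUs present in `a` and `b` of `n` samples, the number of co-occurrences can only take a limited set of values, so the smallest p-value the test can ever return is bounded below. If that bound exceeds `alpha`, the association cannot be declared significant whatever the data show.

```python
from math import comb


def cooccurrence_pmf(k, n, a, b):
    """Probability of exactly k co-occurrences for two OTUs present in
    a and b of n samples, assuming independence with fixed prevalences
    (hypergeometric distribution)."""
    return comb(a, k) * comb(n - a, b - k) / comb(n, b)


def min_p_values(n, a, b):
    """Smallest attainable one-sided p-values (negative, positive).

    The most extreme negative outcome is the fewest possible
    co-occurrences; the most extreme positive outcome is the most
    possible co-occurrences. Each one-sided tail then reduces to a
    single hypergeometric term.
    """
    k_min = max(0, a + b - n)  # fewest co-occurrences the margins allow
    k_max = min(a, b)          # most co-occurrences the margins allow
    p_neg = cooccurrence_pmf(k_min, n, a, b)
    p_pos = cooccurrence_pmf(k_max, n, a, b)
    return p_neg, p_pos


def is_testable(n, a, b, alpha=0.05):
    """Return (negative_testable, positive_testable) at level alpha
    (alpha is an illustrative threshold, not taken from the study)."""
    p_neg, p_pos = min_p_values(n, a, b)
    return p_neg <= alpha, p_pos <= alpha


if __name__ == "__main__":
    # Two rare OTUs, each present in 5 of 100 samples.
    print(min_p_values(100, 5, 5))
    print(is_testable(100, 5, 5))
```

Under these assumptions, two OTUs that each occur in 5 of 100 samples never co-occur with a probability of roughly 0.77 under independence, so complete mutual exclusion is unremarkable and a negative association cannot reach significance, whereas a positive association still can. A check of this kind could be used to discard untestable pairs before any network inference is run.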