High viral abundance and low diversity are associated with increased CRISPR-Cas prevalence across microbial ecosystems
CRISPR-Cas systems are adaptive immune systems that protect their hosts against viruses and other parasitic mobile genetic elements. Consequently, selection from viruses and other genetic parasites is often assumed to drive the acquisition and maintenance of these immune systems in nature, but this assumption remains untested. Here, we analyse the abundance of CRISPR arrays in natural environments using metagenomic datasets from 332 terrestrial, aquatic and host-associated ecosystems. For each metagenome, we quantified viral abundance and viral community diversity to test whether these variables can explain variation in CRISPR-Cas abundance across ecosystems. We found a strong positive correlation between CRISPR-Cas abundance and viral abundance. In addition, when controlling for differences in viral abundance, we found that CRISPR-Cas systems were more abundant when viral diversity was low. We also found differences in relative CRISPR-Cas abundance among environments, with environmental classification explaining ~24% of the variation in CRISPR-Cas abundance. However, the correlations with viral abundance and diversity were broadly consistent across diverse natural environments. These results indicate that viral abundance and diversity are major ecological factors that drive the selection and maintenance of CRISPR-Cas in microbial ecosystems.
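To make "controlling for differences in viral abundance" concrete, the following is a minimal sketch of a first-order partial correlation on synthetic data. It is not the analysis used in this study: the variable names, effect sizes, sample size and random seed are all illustrative assumptions, and partial correlation is used here only as a simple stand-in for a statistical model with viral abundance as a covariate.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic, standardised values for hypothetical metagenomes (assumption:
# these numbers are invented for illustration and are not study data).
rng = random.Random(0)
n_metagenomes = 500
viral_abundance = [rng.gauss(0, 1) for _ in range(n_metagenomes)]
# Diversity is made to covary with abundance, which is why a control is needed.
viral_diversity = [0.5 * v + rng.gauss(0, 1) for v in viral_abundance]
# CRISPR abundance rises with viral abundance and falls with viral diversity.
crispr_abundance = [
    v - 0.5 * d + rng.gauss(0, 0.5)
    for v, d in zip(viral_abundance, viral_diversity)
]

r_abundance = pearson(crispr_abundance, viral_abundance)
r_diversity_given_abundance = partial_corr(
    crispr_abundance, viral_diversity, viral_abundance
)
print(f"CRISPR vs viral abundance: r = {r_abundance:.2f}")
print(f"CRISPR vs viral diversity | abundance: r = {r_diversity_given_abundance:.2f}")
```

In this toy setting the raw correlation with viral abundance is positive, while the correlation with viral diversity is negative once abundance is held fixed, which mirrors the direction of the two associations reported above.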