The status of the microbial census: an update
A census is typically carried out for people at a national level; however, microbial ecologists have implemented a molecular census of bacteria and archaea by sequencing their 16S rRNA genes. We assessed how well the microbial census of full-length 16S rRNA gene sequences is proceeding in the context of recent advances in high-throughput sequencing technologies. Among the 1,411,234 bacterial and 53,546 archaeal full-length sequences, 94.5% and 95.1%, respectively, belonged to operational taxonomic units (OTUs) that have been observed more than once. Although these metrics suggest that the census is approaching completion, only 29.2% of the bacterial and 38.5% of the archaeal OTUs have been observed more than once. Thus, there is still considerable microbial diversity to be explored. Unfortunately, the rate at which new full-length sequences are deposited has been declining, and new sequences are primarily being deposited by a small number of studies. Furthermore, sequences from soil and aquatic environments, which are known to be rich in bacterial diversity, represent only 7.8% and 16.5% of the census, respectively, while sequences associated with zoonotic environments represent 55.0% of the census. Continued use of traditional approaches, together with new technologies such as single-cell genomics and short-read assembly, is likely to improve our ability to sample rare OTUs if this sampling bias can be overcome. The success of ongoing efforts to use short-read sequencing to characterize microbial communities requires that researchers strive to expand the depth and breadth of the microbial census.
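
The two coverage figures quoted above answer different questions: one is the fraction of sequences that fall in OTUs observed more than once, the other is the fraction of OTUs observed more than once. The following minimal Python sketch illustrates how the two can diverge. The function name `census_coverage` and the toy abundance vector are hypothetical, chosen for illustration only; they are not the authors' pipeline or data.

```python
from typing import Sequence, Tuple


def census_coverage(otu_counts: Sequence[int]) -> Tuple[float, float]:
    """Return (sequence-level, OTU-level) coverage for a list of OTU sizes.

    otu_counts[i] is the number of sequences assigned to OTU i.
    Sequence-level coverage: fraction of sequences in OTUs seen more than once.
    OTU-level coverage: fraction of OTUs seen more than once.
    """
    total_seqs = sum(otu_counts)
    total_otus = len(otu_counts)
    if total_seqs == 0 or total_otus == 0:
        raise ValueError("need at least one OTU with at least one sequence")

    # OTUs represented by more than one sequence (i.e. not singletons)
    repeated = [c for c in otu_counts if c > 1]

    seq_coverage = sum(repeated) / total_seqs
    otu_coverage = len(repeated) / total_otus
    return seq_coverage, otu_coverage


# Hypothetical toy data: three abundant OTUs and seven singletons.
toy_counts = [50, 30, 10, 1, 1, 1, 1, 1, 1, 1]
seq_cov, otu_cov = census_coverage(toy_counts)
print(f"sequence-level coverage: {seq_cov:.1%}")
print(f"OTU-level coverage:      {otu_cov:.1%}")
```

In this toy case most sequences belong to a few abundant OTUs, so sequence-level coverage is high even though most OTUs are singletons, which is the same qualitative pattern the census figures show.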