Contamination detection and microbiome exploration with GRIMER
Exploring microbiome data is a time-consuming task that can only be partially automated because each project has its own requirements and goals. Visualization and analysis platforms are crucial to guide this step. Best practices in the field are constantly evolving, and many pitfalls can lead to biased outcomes. The compositionality of the data and sample contamination are two important points that should be carefully considered in the early stages of microbiome studies. Detecting contamination can be challenging, especially in low-biomass samples or in studies whose design lacks proper controls. However, external evidence and commonly identified contaminant taxa can be used to discover and mitigate contamination. We propose GRIMER, a tool that automates analysis, generates plots and runs external tools to create a portable dashboard integrating annotation, taxonomy and metadata. It brings together several sources of evidence to support contamination detection. GRIMER is independent of quantification methods and directly analyses contingency tables to create an interactive, offline report. GRIMER reports can be created in seconds and are accessible to non-specialists, providing an intuitive set of charts to explore how data are distributed among observations and samples and how they connect to external sources. Further, we compiled an extensive list of common contaminants and possible external contaminant taxa reported in the literature and use it to annotate the data. GRIMER is open-source and available at: https://gitlab.com/dacs-hpi/grimer
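To make the idea of annotating a contingency table against a list of commonly reported contaminants more concrete, the following is a minimal sketch in plain Python. It is not GRIMER's implementation or API: the function names (`relative_abundance`, `flag_contaminants`), the table layout (sample to taxon to count) and the example taxa and counts are all assumptions made up for illustration. Counts are converted to per-sample proportions first, since the data are compositional and raw counts are not comparable across samples.

```python
# Illustrative sketch only; names and data are hypothetical, not GRIMER's API.

def relative_abundance(table):
    """Convert each sample's raw counts to proportions that sum to 1."""
    rel = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        rel[sample] = {
            taxon: (count / total if total else 0.0)
            for taxon, count in counts.items()
        }
    return rel


def flag_contaminants(table, known_contaminants):
    """Summarize each taxon and mark those found in a contaminant list.

    table: dict mapping sample name -> dict mapping taxon -> count.
    known_contaminants: set of taxon names treated as common contaminants.
    """
    n_samples = len(table)
    if n_samples == 0:
        return {}
    rel = relative_abundance(table)
    taxa = sorted({taxon for counts in table.values() for taxon in counts})
    report = {}
    for taxon in taxa:
        # Prevalence: fraction of samples in which the taxon was observed.
        present = sum(1 for counts in table.values() if counts.get(taxon, 0) > 0)
        # Mean relative abundance across all samples (absent counts as 0).
        mean_rel = sum(rel[s].get(taxon, 0.0) for s in table) / n_samples
        report[taxon] = {
            "prevalence": present / n_samples,
            "mean_rel_abundance": mean_rel,
            "known_contaminant": taxon in known_contaminants,
        }
    return report


# Made-up example data: two samples, three taxa.
table = {
    "sample_1": {"Escherichia": 80, "Ralstonia": 20},
    "sample_2": {"Escherichia": 50, "Bacteroides": 50},
}
known = {"Ralstonia"}  # hypothetical one-entry contaminant list
report = flag_contaminants(table, known)
```

A flag produced this way is only one line of evidence; in practice it would be weighed together with controls, metadata and other sources, as the abstract describes.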