Presence-only and Presence-absence Data for Comparing Species Distribution Modeling Methods
Species distribution models (SDMs) are widely used to predict and study distributions of species. Many different modeling methods and associated algorithms are used and continue to emerge. It is important to understand how different approaches perform, particularly when applied to species occurrence records that were not gathered in structured surveys (e.g. opportunistic records). This need motivated a large-scale, collaborative effort, published in 2006, that aimed to create objective comparisons of algorithm performance. As a benchmark, and to facilitate future comparisons of approaches, here we publish that dataset: point location records for 226 anonymized species from six regions of the world, with accompanying predictor variables in raster (grid) and point formats. A particularly interesting characteristic of this dataset is that independent presence-absence survey data are available for evaluation alongside the presence-only species occurrence data intended for modeling. The dataset is available on Open Science Framework and as an R package and can be used as a benchmark for modeling approaches and for testing new ways to evaluate the accuracy of SDMs.