Evaluating species distribution models with discrimination accuracy is uninformative for many applications

Abstract

Aim: Species distribution models are used across evolution, ecology, conservation, and epidemiology to make critical decisions and study biological phenomena, often in cases where experimental approaches are intractable. Choices regarding optimal models, methods, and data are typically made based on discrimination accuracy: a model’s ability to predict subsets of species occurrence data that were withheld during model construction. However, empirical applications of these models often involve making biological inferences based on continuous estimates of relative habitat suitability as a function of environmental predictor variables. We term the reliability of these biological inferences “functional accuracy.” We explore the link between discrimination accuracy and functional accuracy.

Methods: Using a simulation approach, we investigate whether models that make good predictions of species distributions correctly infer the underlying relationship between environmental predictors and the suitability of habitat.

Results: We demonstrate that discrimination accuracy is only informative when models are simple and similar in structure to the true niche, or when data partitioning is geographically structured. However, the utility of discrimination accuracy for selecting models with high functional accuracy was low in all cases.

Main conclusions: These results suggest that many empirical studies and decisions are based on criteria that are unrelated to models’ usefulness for their intended purpose. We argue that empirical modeling studies need to place significantly more emphasis on biological insight into the plausibility of models, and that the current approach of maximizing discrimination accuracy at the expense of other considerations is detrimental to both the empirical and methodological literature in this active field. Finally, we argue that future development of the field must include an increased emphasis on simulation; methodological studies based on ability to predict withheld occurrence data may be largely uninformative about best practices for applications where interpretation of models relies on estimating ecological processes, and will unduly penalize more biologically informative modeling approaches.
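
The distinction between discrimination accuracy and functional accuracy can be illustrated with a small sketch. This is not the authors' simulation; it assumes a hypothetical Gaussian niche along one temperature gradient, a hypothetical fitted model that increases monotonically with temperature, AUC as the discrimination measure, and Pearson correlation with the true response as a stand-in for functional accuracy. All names and values are invented for illustration.

```python
import math

def auc(presence_scores, background_scores):
    """Discrimination accuracy: chance a withheld presence outscores a background point."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half a win
    return wins / (len(presence_scores) * len(background_scores))

def pearson(x, y):
    """Pearson correlation, used here as an assumed proxy for functional accuracy."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def true_suitability(temp):
    """Hypothetical simulated niche: suitability peaks at 20 degrees."""
    return math.exp(-((temp - 20.0) ** 2) / (2 * 5.0 ** 2))

def fitted_suitability(temp):
    """Hypothetical fitted model: suitability only ever rises with temperature."""
    return 1.0 / (1.0 + math.exp(-(temp - 12.0) / 2.0))

# Invented withheld presences, background points, and an evaluation gradient.
withheld_presences = [16.0, 18.0, 20.0, 22.0, 24.0]
background = [0.0, 4.0, 8.0, 12.0, 30.0, 36.0]
gradient = [float(t) for t in range(0, 41, 2)]

discrimination = auc([fitted_suitability(t) for t in withheld_presences],
                     [fitted_suitability(t) for t in background])
functional = pearson([fitted_suitability(t) for t in gradient],
                     [true_suitability(t) for t in gradient])
print(f"AUC on withheld data: {discrimination:.2f}")
print(f"Correlation with true response: {functional:.2f}")
```

Because the true response is known only in a simulation, the second quantity cannot be computed for real occurrence data, which is one reason the abstract argues for more simulation-based evaluation.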

Application of species distribution models in stream ecosystems: the challenges of spatial and temporal scale, environmental predictors and species occurrence data

Fundamental and Applied Limnology / Archiv für Hydrobiologie ◽ 10.1127/fal/2015/0627 ◽ 2015 ◽ Vol 186 (1) ◽ pp. 45-61 ◽ Cited By ~ 40

Author(s): Sami Domisch ◽ Sonja C. Jähnig ◽ John P. Simaika ◽ Mathias Kuemmerlen ◽ Stefan Stoll

Keyword(s): Species Distribution ◽ Species Distribution Models ◽ Temporal Scale ◽ Stream Ecosystems ◽ Environmental Predictors ◽ Species Occurrence ◽ Distribution Models ◽ Occurrence Data

Species distribution models for invasive Eurasian watermilfoil highlight the importance of data quality and limitations of discrimination accuracy metrics

Ecology and Evolution ◽ 10.1002/ece3.8002 ◽ 2021

Author(s): Shyam M. Thomas ◽ Michael R. Verhoeven ◽ Jake R. Walsh ◽ Daniel J. Larkin ◽ Gretchen J. A. Hansen

Keyword(s): Data Quality ◽ Species Distribution ◽ Species Distribution Models ◽ Discrimination Accuracy ◽ Distribution Models ◽ Eurasian Watermilfoil

Comparing species distribution models constructed with different subsets of environmental predictors

Diversity and Distributions ◽ 10.1111/ddi.12247 ◽ 2014 ◽ Vol 21 (1) ◽ pp. 23-35 ◽ Cited By ~ 73

Author(s): David N. Bucklin ◽ Mathieu Basille ◽ Allison M. Benscoter ◽ Laura A. Brandt ◽ Frank J. Mazzotti ◽ ...

Keyword(s): Species Distribution ◽ Species Distribution Models ◽ Environmental Predictors ◽ Distribution Models

Use of Anecdotal Occurrence Data in Species Distribution Models: An Example Based on the White-Nosed Coati (Nasua narica) in the American Southwest

Animals ◽ 10.3390/ani3020327 ◽ 2013 ◽ Vol 3 (2) ◽ pp. 327-348 ◽ Cited By ~ 6

Author(s): Jennifer Frey ◽ Jeremy Lewis ◽ Rachel Guy ◽ James Stuart

Keyword(s): Species Distribution ◽ Species Distribution Models ◽ American Southwest ◽ Distribution Models ◽ Nasua Narica ◽ Occurrence Data

Finer grain size increases effects of error and changes influence of environmental predictors on species distribution models

Ecological Informatics ◽ 10.1016/j.ecoinf.2013.02.003 ◽ 2013 ◽ Vol 15 ◽ pp. 8-13 ◽ Cited By ~ 11

Author(s): Brice B. Hanberry

Keyword(s): Grain Size ◽ Species Distribution ◽ Species Distribution Models ◽ Environmental Predictors ◽ Distribution Models

Do joint species distribution models reliably detect interspecific interactions from co-occurrence data in homogenous environments?

Ecography ◽ 10.1111/ecog.03315 ◽ 2018 ◽ Vol 41 (11) ◽ pp. 1812-1819 ◽ Cited By ~ 30

Author(s): Damaris Zurell ◽ Laura J. Pollock ◽ Wilfried Thuiller

Keyword(s): Species Distribution ◽ Species Distribution Models ◽ Interspecific Interactions ◽ Distribution Models ◽ Occurrence Data

Exploring snake occurrence records: Spatial biases and marginal gains from accessible social media

PeerJ ◽ 10.7717/peerj.8059 ◽ 2019 ◽ Vol 7 ◽ pp. e8059 ◽ Cited By ~ 2

Author(s): Benjamin M. Marshall ◽ Colin T. Strine

Keyword(s): Social Media ◽ North America ◽ South America ◽ Species Distribution ◽ Species Distribution Models ◽ Conservation Status ◽ Model Performance ◽ Distribution Models ◽ Occurrence Data ◽ Occurrence Records

A species’ distribution provides fundamental information on climatic niche, biogeography, and conservation status. Species distribution models often use occurrence records from biodiversity databases, which are subject to spatial and taxonomic biases. Deficiencies in occurrence data can lead to incomplete species distribution estimates. We can incorporate other data sources to supplement occurrence datasets. By photographing wildlife with GPS-enabled cameras, the general public is creating incidental occurrence records that may present an opportunity to improve species distribution models. We investigated (1) occurrence data of a cryptic group of animals, non-marine snakes, in a biodiversity database (Global Biodiversity Information Facility (GBIF)) and determined (2) whether incidental occurrence records extracted from geo-tagged social media images (Flickr) could improve distribution models for 18 tropical snake species. We provide R code to search for and extract data from images using Flickr’s API. We show the biodiversity database’s 302,386 records disproportionately originate from North America, Europe and Oceania (250,063, 82.7%), with substantial gaps in tropical areas that host the highest snake diversity. North America, Europe and Oceania averaged several hundred records per species, whereas Asia, Africa and South America averaged fewer than 35 per species. Occurrence density showed similar patterns; Asia, Africa and South America have roughly ten-fold fewer records per 100 km² than other regions. Social media provided 44,687 potential records. However, including them in distribution models only marginally impacted niche estimations; niche overlap indices were consistently over 0.9. Similarly, we show negligible differences in Maxent model performance between models trained using GBIF-only and Flickr-supplemented datasets. Model performance appeared dependent on species, rather than number of occurrences or training dataset. We suggest that for tropical snakes, accessible social media currently fails to deliver appreciable benefits for estimating species distributions; but due to the variation between species and the rapid growth in social media data, it may still be worth considering in future contexts.
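
The abstract does not say which niche overlap index was used. As a rough illustration only, the sketch below assumes Schoener's D, a widely used index that ranges from 0 (no overlap) to 1 (identical surfaces), and compares two invented suitability surfaces defined over the same grid cells. The function name and the values are hypothetical.

```python
def schoener_d(suitability_a, suitability_b):
    """Schoener's D between two suitability surfaces over the same grid cells."""
    total_a, total_b = sum(suitability_a), sum(suitability_b)
    # Rescale each surface so its cells sum to 1 before comparing.
    pa = [v / total_a for v in suitability_a]
    pb = [v / total_b for v in suitability_b]
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(pa, pb))

# Invented suitability values for five grid cells from two models.
gbif_only = [0.10, 0.40, 0.80, 0.50, 0.20]
flickr_supplemented = [0.12, 0.38, 0.82, 0.47, 0.21]
print(round(schoener_d(gbif_only, flickr_supplemented), 3))
```

Under this index, values near 1 mean that adding the supplementary records barely changed the predicted surface, which matches the "consistently over 0.9" finding reported above.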

Bounding species distribution models

Current Zoology ◽ 10.1093/czoolo/57.5.642 ◽ 2011 ◽ Vol 57 (5) ◽ pp. 642-647 ◽ Cited By ~ 21

Author(s): Thomas J. Stohlgren ◽ Catherine S. Jarnevich ◽ Wayne E. Esaias ◽ Jeffrey T. Morisette

Keyword(s): Species Distribution ◽ Honey Bees ◽ Best Practice ◽ Species Distribution Models ◽ Regression Tree ◽ Classification And Regression Tree ◽ Suitable Habitat ◽ Environmental Predictors ◽ Distribution Models ◽ Africanized Honey Bees

Abstract Species distribution models are increasing in popularity for mapping suitable habitat for species of management concern. Many investigators now recognize that extrapolations of these models with geographic information systems (GIS) might be sensitive to the environmental bounds of the data used in their development, yet there is no recommended best practice for “clamping” model extrapolations. We relied on two commonly used modeling approaches: classification and regression tree (CART) and maximum entropy (Maxent) models, and we tested a simple alteration of the model extrapolations, bounding extrapolations to the maximum and minimum values of primary environmental predictors, to provide a more realistic map of suitable habitat of hybridized Africanized honey bees in the southwestern United States. Findings suggest that multiple models of bounding, and the most conservative bounding of species distribution models, like those presented here, should probably replace the unbounded or loosely bounded techniques currently used.
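
One way to read "bounding extrapolations to the maximum and minimum values of primary environmental predictors" is sketched below. This is an interpretation, not the authors' procedure: it assumes sites are described by dictionaries of predictor values, and it shows both a strict variant (no suitability predicted outside the training range) and the looser "clamping" variant (out-of-range values pulled back to the nearest bound). All function names, predictor names, and values are hypothetical.

```python
def training_bounds(training_sites):
    """Minimum and maximum of each predictor across the training sites."""
    names = training_sites[0].keys()
    return {n: (min(s[n] for s in training_sites),
                max(s[n] for s in training_sites)) for n in names}

def within_bounds(site, bounds):
    """True when every predictor at the site lies inside the training range."""
    return all(lo <= site[n] <= hi for n, (lo, hi) in bounds.items())

def clamp(site, bounds):
    """Looser variant: pull each predictor back to the nearest training bound."""
    return {n: min(max(site[n], lo), hi) for n, (lo, hi) in bounds.items()}

def bounded_prediction(model, site, bounds):
    """Strict variant: the model's suitability inside the bounds, 0.0 outside."""
    return model(site) if within_bounds(site, bounds) else 0.0

# Invented training data and a placeholder model for illustration.
training = [{"temp": 10.0, "precip": 200.0}, {"temp": 25.0, "precip": 800.0}]
bounds = training_bounds(training)
toy_model = lambda site: site["temp"] / 25.0  # stands in for a fitted CART or Maxent model

novel_site = {"temp": 30.0, "precip": 100.0}
print(clamp(novel_site, bounds))
print(bounded_prediction(toy_model, novel_site, bounds))
```

The strict variant corresponds to the more conservative end of the bounding options the abstract recommends, since it refuses to map suitable habitat anywhere the predictors fall outside what the model has seen.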

SPATIAL SCALE EFFECTS OF SAMPLING ON THE INTERPOLATION OF SPECIES DISTRIBUTION MODELS IN THE SOUTHWESTERN AMAZON

Revista Árvore ◽ 10.1590/0100-67622016000400005 ◽ 2016 ◽ Vol 40 (4) ◽ pp. 617-625 ◽ Cited By ~ 1

Author(s): Symone Maria de Melo Figueiredo ◽ Eduardo Martins Venticinque ◽ Evandro Orfanó Figueiredo

Keyword(s): Forest Management ◽ Spatial Scale ◽ Species Distribution ◽ Tree Species ◽ Species Distribution Models ◽ Scale Effects ◽ Projection Area ◽ Distribution Models ◽ Forest Inventories ◽ Occurrence Data

ABSTRACT Knowledge of the geographical distribution of timber tree species in the Amazon is still scarce. This is especially true at the local level, thereby limiting natural resource management actions. Forest inventories are key sources of information on the occurrence of such species. However, areas with approved forest management plans are mostly located near access roads and the main industrial centers. The present study aimed to assess the spatial scale effects of forest inventories used as sources of occurrence data in the interpolation of potential species distribution models. The occurrence data of a group of six forest tree species were divided into four geographical areas during the modeling process. Several sampling schemes were then tested applying the maximum entropy algorithm, using the following predictor variables: elevation, slope, exposure, normalized difference vegetation index (NDVI) and height above the nearest drainage (HAND). The results revealed that using occurrence data from only one geographical area with unique environmental characteristics increased both model overfitting to input data and omission error rates. The use of a diagonal systematic sampling scheme and lower threshold values led to improved model performance. Forest inventories may be used to predict areas with a high probability of species occurrence, provided they are located in forest management plan regions representative of the environmental range of the model projection area.

Simple is sometimes better: a test of the transferability of species distribution models

ICES Journal of Marine Science ◽ 10.1093/icesjms/fsaa024 ◽ 2020 ◽ Vol 77 (5) ◽ pp. 1752-1761

Author(s): Danielle E Haulsee ◽ Matthew W Breece ◽ Dewayne A Fox ◽ Matthew J Oliver

Keyword(s): Species Distribution ◽ Mixed Model ◽ Species Distribution Models ◽ Autonomous Underwater Vehicle ◽ Spatial Scales ◽ Test Model ◽ Atlantic Sturgeon ◽ Environmental Predictors ◽ Distribution Models ◽ New Locations

Abstract Species distribution models (SDMs) are often empirically developed on spatially and temporally biased samples and then applied over much larger spatial scales to test ecological hypotheses or to inform management. Underlying this approach is the assumption that the statistical relationships between species observations and environmental predictors are applicable to other locations and times. However, testing and quantifying the transferability of these models to new locations and times can be a challenge for resource managers because of the technical difficulty in obtaining species observations in new locations in a dynamic environment. Here, we apply two SDMs developed in the Mid-Atlantic Bight for Atlantic sturgeon (Acipenser oxyrhynchus oxyrhynchus) to the South Atlantic Bight and use an autonomous underwater vehicle to test model predictions. We compare Atlantic sturgeon occurrence to two SDMs: one associating sturgeon occurrence with simple seascapes and one developed through coupling occurrences with environmental predictors in a generalized additive mixed model (GAMM). Our analysis showed that the seascape model was transferable across these disparate regions; however, the complex GAMM was not. The association of the imperilled Atlantic sturgeon with simple seascapes allows managers to easily integrate this remotely sensed dynamic oceanographic product into future ecosystem-based management strategies.
