Data quantity is more important than its spatial bias for predictive species distribution modelling

2020 ◽  
Author(s):  
Willson Gaul ◽  
Dinara Sadykova ◽  
Hannah J. White ◽  
Lupe León-Sánchez ◽  
Paul Caplat ◽  
...  

Abstract. Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of 1) spatial bias in training data, 2) sample size (the average number of observations per species), and 3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but only when the bias was relatively strong. Sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10411
Author(s):  
Willson Gaul ◽  
Dinara Sadykova ◽  
Hannah J. White ◽  
Lupe Leon-Sanchez ◽  
Paul Caplat ◽  
...  

Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of (1) spatial bias in training data, (2) sample size (the average number of observations per species), and (3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.
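The simulation design described in this abstract (virtual species, a recording process, varying spatial bias and sample size) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the grid, the logistic response to a single west-east gradient, the distance-decay recorder effort centred on cell (0, 0), and all function names are hypothetical. The study itself applied real-world sampling biases, which this sketch does not reproduce.

```python
import math
import random


def make_virtual_species(width, height, rng):
    """Simulate presence (1) or absence (0) per grid cell.

    Assumption: occupancy follows a logistic response to one
    hypothetical environmental gradient running west to east.
    """
    presence = {}
    for x in range(width):
        for y in range(height):
            env = x / (width - 1)  # gradient value in [0, 1]; needs width > 1
            p_occupied = 1 / (1 + math.exp(-10 * (env - 0.5)))
            presence[(x, y)] = 1 if rng.random() < p_occupied else 0
    return presence


def sampling_weights(width, height, bias_strength):
    """Relative recording effort per cell.

    Assumption: effort decays exponentially with distance from a
    hypothetical recorder hotspot at (0, 0). bias_strength = 0 gives
    spatially uniform effort; larger values give stronger bias.
    """
    return {
        (x, y): math.exp(-bias_strength * math.hypot(x, y))
        for x in range(width)
        for y in range(height)
    }


def sample_records(presence, weights, n_records, rng):
    """Draw n_records visits (with replacement) and record what is there."""
    cells = list(presence)
    visited = rng.choices(cells, weights=[weights[c] for c in cells], k=n_records)
    return [(cell, presence[cell]) for cell in visited]


if __name__ == "__main__":
    rng = random.Random(42)
    species = make_virtual_species(20, 20, rng)
    for bias in (0.0, 0.1, 0.5):        # no, moderate, strong bias
        for n in (50, 500):             # small and large sample size
            records = sample_records(species, sampling_weights(20, 20, bias), n, rng)
            n_cells = len({cell for cell, _ in records})
            print(f"bias={bias} n={n} distinct cells visited={n_cells}")
```

Crossing bias strength with sample size in this way yields training sets that could each be passed to a modelling method and scored against the known virtual distribution, which is the kind of comparison the abstract reports.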


2019 ◽  
Author(s):  
Colin J. Carlson

embarcadero is an R package of convenience tools for species distribution modelling with Bayesian additive regression trees (BART), a powerful machine learning approach that has been rarely applied to ecological problems. Like other classification and regression tree methods, BART estimates the probability of a binary outcome based on a set of decision trees. Unlike other methods, BART iteratively generates sets of trees based on a set of priors about tree structure and nodes, and builds a posterior distribution of estimated classification probabilities. So far, BARTs have yet to be applied to species distribution modelling. embarcadero is a workflow wrapper for BART species distribution models, and includes functionality for easy spatial prediction, an automated variable selection procedure, several types of partial dependence visualization, and other tools for ecological application. The embarcadero package is available open source on GitHub and intended for eventual CRAN release. To show how embarcadero can be used by ecologists, I illustrate a BART workflow for a virtual species distribution model. The supplement includes a more advanced vignette showing how BART can be used for mapping disease transmission risk, using the example of Crimean-Congo haemorrhagic fever in Africa.


2020 ◽  
Vol 77 (5) ◽  
pp. 1841-1853
Author(s):  
Chongliang Zhang ◽  
Yong Chen ◽  
Binduo Xu ◽  
Ying Xue ◽  
Yiping Ren

Abstract. Varying catchability is a common feature in fisheries and has great impacts on fisheries assessments and species distribution models. However, spatial variations in catchability have rarely been evaluated, especially in the multispecies context. We argue that the need for multispecies models presents both challenges and opportunities for handling spatial catchability. This study evaluated the influence of spatially varying catchability on the performance of a novel joint species distribution model (JSDM), namely Hierarchical Modelling of Species Communities (HMSC). We implemented the model under nine simulation scenarios to account for diverse spatial patterns of catchability and conducted empirical tests using survey data from the Yellow Sea, China. Our results showed that ignoring variability in catchability could lead to substantial errors in the inferences of species response to environment. Meanwhile, the models’ predictive power was less impacted, yielding proper predictions of relative abundance. Incorporating a spatially autocorrelated structure substantially improved the predictability of HMSC in both simulation and empirical tests. Nevertheless, combined sources of spatial catchabilities could largely diminish the advantage of HMSC in inference and prediction. We highlight situations where catchability needs to be explicitly accounted for in modelling fish distributions, and suggest directions for future applications and development of JSDMs.


2012 ◽  
Vol 367 (1586) ◽  
pp. 247-258 ◽  
Author(s):  
Colin M. Beale ◽  
Jack J. Lennon

Motivated by the need to solve ecological problems (climate change, habitat fragmentation and biological invasions), there has been increasing interest in species distribution models (SDMs). Predictions from these models inform conservation policy, invasive species management and disease-control measures. However, predictions are subject to uncertainty, the degree and source of which is often unrecognized. Here, we review the SDM literature in the context of uncertainty, focusing on three main classes of SDM: niche-based models, demographic models and process-based models. We identify sources of uncertainty for each class and discuss how uncertainty can be minimized or included in the modelling process to give realistic measures of confidence around predictions. Because this has typically not been performed, we conclude that uncertainty in SDMs has often been underestimated and a false precision assigned to predictions of geographical distribution. We identify areas where development of new statistical tools will improve predictions from distribution models, notably the development of hierarchical models that link different types of distribution model and their attendant uncertainties across spatial scales. Finally, we discuss the need to develop more defensible methods for assessing predictive performance, quantifying model goodness-of-fit and for assessing the significance of model covariates.


Author(s):  
M. Z. G. Untalan ◽  
D. F. M. Burgos ◽  
K. P. Martinez

Abstract. Maxent is a machine learning model used for species distribution modelling (SDM) that is rising in popularity. As with any species distribution model, it needs to be validated for certain species before being used to generate insights and trusted predictions. Using Maxent, SDMs of two endemic species in the Philippines, Varanus palawanensis (Palawan monitor lizard) and Caprimulgus manillensis (Philippine nightjar), were created using presence-only data, with 14 V. palawanensis and 771 C. manillensis occurrences, and 19 bioclimatic variables from BIOCLIM. This study shows that Maxent results are consistent with historical facts for two endemic species of the Philippines of varying nature. The applicability of Maxent to two very different species suggests that Maxent is likely to give good results for other species. Showing that Maxent is applicable to the species of the Philippines gives ecologists and national administrators additional tools to lead the development of the Philippines in a direction that conserves its biodiversity and increases productivity and quality of life.


2021 ◽  
Vol 13 (8) ◽  
pp. 1495
Author(s):  
Jehyeok Rew ◽  
Yongjang Cho ◽  
Eenjun Hwang

Species distribution models have been used for various purposes, such as conserving species, discovering potential habitats, and obtaining evolutionary insights by predicting species occurrence. Many statistical and machine-learning-based approaches have been proposed to construct effective species distribution models, but with limited success due to spatial biases in presences and imbalanced presence-absences. We propose a novel species distribution model to address these problems based on bootstrap aggregating (bagging) ensembles of deep neural networks (DNNs). We first generate bootstraps considering presence-absence data on spatial balance to alleviate the bias problem. Then we construct DNNs using environmental data from presence and absence locations, and finally combine these into an ensemble model using three voting methods to improve prediction accuracy. Extensive experiments verified the proposed model’s effectiveness for species in South Korea using crowdsourced observations that have spatial biases. The proposed model achieved more accurate and robust prediction results than the current best practice models.


2018 ◽  
Vol 373 (1761) ◽  
pp. 20170446 ◽  
Author(s):  
Scott Jarvie ◽  
Jens-Christian Svenning

Trophic rewilding, the (re)introduction of species to promote self-regulating biodiverse ecosystems, is a future-oriented approach to ecological restoration. In the twenty-first century and beyond, human-mediated climate change looms as a major threat to global biodiversity and ecosystem function. A critical aspect in planning trophic rewilding projects is the selection of suitable sites that match the needs of the focal species under both current and future climates. Species distribution models (SDMs) are currently the main tools to derive spatially explicit predictions of environmental suitability for species, but the extent of their adoption for trophic rewilding projects has been limited. Here, we provide an overview of applications of SDMs to trophic rewilding projects, outline methodological choices and issues, and provide a synthesis and outlook. We then predict the potential distribution of 17 large-bodied taxa proposed as trophic rewilding candidates and which represent different continents and habitats. We identified widespread climatic suitability for these species in the discussed (re)introduction regions under current climates. Climatic conditions generally remain suitable in the future, although some species will experience reduced suitability in parts of these regions. We conclude that climate change is not a major barrier to trophic rewilding as currently discussed in the literature. This article is part of the theme issue ‘Trophic rewilding: consequences for ecosystems under global change’.


2021 ◽  
Author(s):  
Gabriel Dansereau ◽  
Pierre Legendre ◽  
Timothée Poisot

Aim: Local contributions to beta diversity (LCBD) can be used to identify sites with high ecological uniqueness and exceptional species composition within a region of interest. Yet, these indices are typically used on local or regional scales with relatively few sites, as they require information on complete community compositions difficult to acquire on larger scales. Here, we investigate how LCBD indices can be used to predict ecological uniqueness over broad spatial extents using species distribution modelling and citizen science data. Location: North America. Time period: 2000s. Major taxa studied: Parulidae. Methods: We used Bayesian additive regression trees (BARTs) to predict warbler species distributions in North America based on observations recorded in the eBird database. We then calculated LCBD indices for observed and predicted data and examined the site-wise difference using direct comparison, a spatial autocorrelation test, and generalized linear regression. We also investigated the relationship between LCBD values and species richness in different regions and at various spatial extents and the effect of the proportion of rare species on the relationship. Results: Our results showed that the relationship between richness and LCBD values varies according to the region and the spatial extent at which it is applied. It is also affected by the proportion of rare species in the community. Species distribution models provided highly correlated estimates with observed data, although spatially autocorrelated. Main conclusions: Sites identified as unique over broad spatial extents may vary according to the regional richness, total extent size, and the proportion of rare species. Species distribution modelling can be used to predict ecological uniqueness over broad spatial extents, which could help identify beta diversity hotspots and important targets for conservation purposes in unsampled locations.
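For readers unfamiliar with LCBD, the index can be sketched in a few lines of Python. The sketch follows the standard variance-partitioning definition: each site's LCBD is its share of the total sum of squared deviations from the species (column) means of a site-by-species matrix. The Hellinger transformation applied beforehand is a common choice but is an assumption here, since the abstract does not state which transformation was used. The function names are hypothetical.

```python
import math


def hellinger(matrix):
    """Hellinger-transform a site-by-species matrix (rows are sites).

    Each value becomes sqrt(y_ij / row_total). Sites with no
    occurrences are returned as rows of zeros.
    """
    transformed = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            transformed.append([0.0] * len(row))
        else:
            transformed.append([math.sqrt(value / total) for value in row])
    return transformed


def lcbd(matrix):
    """Local contributions to beta diversity, one value per site.

    LCBD_i = SS_i / SS_total, where SS_i is the sum over species of the
    squared deviation of site i from the species mean, and SS_total is
    the sum of SS_i over all sites. The values sum to 1.
    """
    n_sites = len(matrix)
    n_species = len(matrix[0])
    col_means = [sum(row[j] for row in matrix) / n_sites for j in range(n_species)]
    ss_site = [
        sum((row[j] - col_means[j]) ** 2 for j in range(n_species))
        for row in matrix
    ]
    ss_total = sum(ss_site)
    if ss_total == 0:
        raise ValueError("all sites are identical; LCBD is undefined")
    return [ss / ss_total for ss in ss_site]


if __name__ == "__main__":
    # Hypothetical presence-absence data: two identical sites and one unique site.
    community = [[1, 0], [1, 0], [0, 1]]
    print(lcbd(hellinger(community)))
```

In the toy matrix the third site holds the only occurrence of the second species, so it receives the largest LCBD value. The same function could be applied to a matrix of observed occurrences and to a matrix of SDM predictions to compare the two site by site.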


2009 ◽  
Vol 21 (1) ◽  
pp. 39-49
Author(s):  
Karla Donato Fook ◽  
Silvana Amaral ◽  
Antônio Miguel Vieira Monteiro ◽  
Gilberto Câmara ◽  
Arimatéa de Carvalho Ximenes ◽  
...  

Currently, biodiversity conservation is one of the most urgent and important themes. Biodiversity researchers use species distribution models to make inferences about species occurrences and locations. These models are fundamental for fauna and flora preservation, as well as for decision making processes for urban and regional planning and development. Species distribution modelling tools use large biodiversity datasets which are globally distributed, can be in different computational platforms, and are hard to access and manipulate. The scientific community needs infrastructures in which biodiversity researchers can collaborate and share knowledge. In this context, we present a computational environment that supports the collaboration in species distribution modelling network on the Web. This environment is based on a modelling experiment catalogue and on a set of geoweb services, the Web Biodiversity Collaborative Modelling Services - WBCMS.


2018 ◽  
Author(s):  
Roozbeh Valavi ◽  
Jane Elith ◽  
José J. Lahoz-Monfort ◽  
Gurutzeta Guillera-Arroita

Summary. When applied to structured data, conventional random cross-validation techniques can lead to underestimation of prediction error, and may result in inappropriate model selection. We present the R package blockCV, a new toolbox for cross-validation of species distribution modelling. The package can generate spatially or environmentally separated folds. It includes tools to measure spatial autocorrelation ranges in candidate covariates, providing the user with insights into the spatial structure in these data. It also offers interactive graphical capabilities for creating spatial blocks and exploring data folds. Package blockCV enables modellers to more easily implement a range of evaluation approaches. It will help the modelling community learn more about the impacts of evaluation approaches on our understanding of predictive performance of species distribution models.
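The idea of spatially separated folds can be illustrated with a small Python sketch. This is not the blockCV API (blockCV is an R package); it is a simplified stand-in that assumes square blocks on planar coordinates and random assignment of whole blocks to folds. The function name and parameters are hypothetical.

```python
import random


def spatial_block_folds(points, block_size, k, seed=0):
    """Assign each (x, y) point to one of k folds by spatial block.

    Points are binned into square blocks of side block_size. Whole
    blocks are then shuffled and dealt out to folds in turn, so that
    points sharing a block always share a fold.
    """
    block_of_point = [
        (int(x // block_size), int(y // block_size)) for x, y in points
    ]
    blocks = sorted(set(block_of_point))  # sorted for reproducibility
    rng = random.Random(seed)
    rng.shuffle(blocks)
    fold_of_block = {block: i % k for i, block in enumerate(blocks)}
    return [fold_of_block[block] for block in block_of_point]


if __name__ == "__main__":
    # Hypothetical occurrence coordinates in arbitrary planar units.
    occurrences = [(0.5, 0.5), (0.9, 0.1), (5.2, 0.3),
                   (5.8, 0.9), (0.1, 5.5), (5.5, 5.5)]
    folds = spatial_block_folds(occurrences, block_size=1.0, k=2)
    for point, fold in zip(occurrences, folds):
        print(point, "-> fold", fold)
```

Holding out one fold at a time then tests the model on locations that are spatially separated from the training data. Choosing the block size is the key design decision; the abstract notes that blockCV provides tools to measure spatial autocorrelation ranges in candidate covariates, which can inform that choice.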

