Modelling the distribution of rare invertebrates by correcting class imbalance and spatial bias

2022 ◽  
Author(s):  
Willson B Gaul ◽  
Dinara Sadykova ◽  
Hannah J White ◽  
Lupe León-Sánchez ◽  
Paul Caplat ◽  
...  

Aim: Soil arthropods are important decomposers and nutrient cyclers, but are poorly represented on national and international conservation Red Lists. Opportunistic biological records for soil invertebrates are often sparse, and contain few observations of rare species but a relatively large number of non-detection observations (a problem known as class imbalance). Robinson et al. (2018) proposed a method for sub-sampling non-detection data using a spatial grid to improve class balance and reduce spatial bias in bird data. For taxa that are less intensively sampled, datasets are smaller, which poses a challenge because under-sampling data removes information. We tested whether spatial under-sampling improved prediction performance of species distribution models for millipedes, for which large datasets are not available. We also tested whether using environmental predictor variables provided additional information beyond what is captured by spatial position for predicting species distributions. Location: Island of Ireland. Methods: We tested the spatial under-sampling method of Robinson et al. (2018) by using biological records to train species distribution models of rare millipedes. Results: Using spatially under-sampled training data improved species distribution model sensitivity (true positive rate) but decreased model specificity (true negative rate). The decrease in specificity was minimal for rarer species and was accompanied by substantial increases in sensitivity. For common species, specificity decreased more, and sensitivity increased less, making spatial under-sampling most useful for rare species. Geographic coordinates were as good as or better than environmental variables for predicting distributions of two out of six species. Main Conclusions: Spatial under-sampling improved prediction performance of species distribution models for rare soil arthropod species. Spatial under-sampling was most effective for rarer species. The good prediction performance of models using geographic coordinates is promising for modelling distributions of poorly studied species for which little is known about ecological or physiological determinants of occurrence.
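
The grid-based under-sampling idea and the two performance measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code or the Robinson et al. (2018) implementation: the function names, the `(x, y, detected)` record layout, the `cell_size` and `per_cell` parameters, and the choice to keep every detection are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def spatial_undersample(records, cell_size, per_cell=1, seed=0):
    """Thin non-detections to at most `per_cell` per grid cell.

    records: list of (x, y, detected) tuples, x and y in the same units
    as cell_size. Assumption: all detections are kept unchanged.
    """
    rng = random.Random(seed)
    detections = [r for r in records if r[2]]

    # Group non-detections by the grid cell they fall in.
    cells = defaultdict(list)
    for x, y, detected in records:
        if not detected:
            cell = (int(x // cell_size), int(y // cell_size))
            cells[cell].append((x, y, detected))

    # Randomly keep up to `per_cell` non-detections from each cell.
    kept = []
    for cell in sorted(cells):
        group = cells[cell]
        kept.extend(rng.sample(group, min(per_cell, len(group))))
    return detections + kept


def sensitivity_specificity(truth, predicted):
    """Return (true positive rate, true negative rate).

    Assumes both classes occur at least once in `truth`.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)
```

In this sketch, heavily visited cells lose most of their non-detections while sparsely visited cells lose few or none, which is how the thinning addresses class imbalance and spatial bias at the same time.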

Phytotaxa ◽  
2018 ◽  
Vol 348 (4) ◽  
pp. 254 ◽  
Author(s):  
J.-ANTONIO VÁZQUEZ-GARCÍA ◽  
DAVID A. NEILL ◽  
VIACHESLAV SHALISKO ◽  
FRANK ARROYO ◽  
R. EFRÉN MERINO-SANTI

Magnolia mercedesiarum, a new species from the eastern slopes of the Andes in northern Ecuador, is described and illustrated, and a key to Ecuadorian Magnolia (subsect. Talauma) is provided. This species differs from M. vargasiana in having broadly elliptic leaves that have an obtuse base vs. suborbicular and subcordate to cordate, glabrous stipular scars, more numerous lateral veins per side and fewer stamens. It also differs from M. llanganatensis in having leaf blades broadly elliptic vs. elliptic, longer petioles, less numerous lateral leaf veins per side, larger fruits and more numerous petals and carpels. Using MaxEnt species distribution models and IUCN threat criteria, M. mercedesiarum has a potential distribution area of less than 3307 km² and is assessed as Endangered (EN): B1 ab (i, ii, iii). The relevance of systematic vegetation sampling in the discovery of rare species is highlighted.


2020 ◽  
Author(s):  
Willson Gaul ◽  
Dinara Sadykova ◽  
Hannah J. White ◽  
Lupe León-Sánchez ◽  
Paul Caplat ◽  
...  

Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of 1) spatial bias in training data, 2) sample size (the average number of observations per species), and 3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but only when the bias was relatively strong. Sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.


2020 ◽  
Vol 27 (1) ◽  
pp. 95-108
Author(s):  
Nolan A. Helmstetter ◽  
Courtney J. Conway ◽  
Bryan S. Stevens ◽  
Amanda R. Goldberg

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10411
Author(s):  
Willson Gaul ◽  
Dinara Sadykova ◽  
Hannah J. White ◽  
Lupe Leon-Sanchez ◽  
Paul Caplat ◽  
...  

Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of (1) spatial bias in training data, (2) sample size (the average number of observations per species), and (3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.


2021 ◽  
Vol 13 (8) ◽  
pp. 1495
Author(s):  
Jehyeok Rew ◽  
Yongjang Cho ◽  
Eenjun Hwang

Species distribution models have been used for various purposes, such as conserving species, discovering potential habitats, and obtaining evolutionary insights by predicting species occurrence. Many statistical and machine-learning-based approaches have been proposed to construct effective species distribution models, but with limited success due to spatial biases in presences and imbalanced presence-absences. We propose a novel species distribution model to address these problems based on bootstrap aggregating (bagging) ensembles of deep neural networks (DNNs). We first generate bootstraps considering presence-absence data on spatial balance to alleviate the bias problem. Then we construct DNNs using environmental data from presence and absence locations, and finally combine these into an ensemble model using three voting methods to improve prediction accuracy. Extensive experiments verified the proposed model’s effectiveness for species in South Korea using crowdsourced observations that have spatial biases. The proposed model achieved more accurate and robust prediction results than the current best practice models.
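
The general shape of a spatially balanced bootstrap followed by ensemble voting can be sketched in Python as below. This is a hedged illustration only, not the model proposed in the abstract: the abstract does not say how its bootstraps are drawn or which three voting methods it uses, so the grid-cell sampling scheme, the hard and soft voting rules, and all names and parameters here are assumptions. The ensemble members are represented only by the presence probabilities they output.

```python
import random
from collections import defaultdict


def balanced_bootstrap(records, cell_size, n_per_class, rng):
    """Draw one bootstrap with equal numbers of presences and absences.

    records: list of (x, y, present) tuples. A grid cell is drawn first and
    then a record inside it, so densely sampled cells are down-weighted.
    Assumes each class has at least one record.
    """
    by_class = {True: defaultdict(list), False: defaultdict(list)}
    for x, y, present in records:
        cell = (int(x // cell_size), int(y // cell_size))
        by_class[bool(present)][cell].append((x, y, present))

    sample = []
    for present in (True, False):
        cells = sorted(by_class[present])
        for _ in range(n_per_class):
            cell = rng.choice(cells)  # sample with replacement over cells
            sample.append(rng.choice(by_class[present][cell]))
    return sample


def hard_vote(probabilities, threshold=0.5):
    """Majority vote: each member votes presence if its probability >= threshold."""
    votes = sum(1 for p in probabilities if p >= threshold)
    return votes * 2 > len(probabilities)


def soft_vote(probabilities, threshold=0.5):
    """Average the member probabilities, then apply the threshold once."""
    return sum(probabilities) / len(probabilities) >= threshold
```

The two voting rules can disagree: for member probabilities of 0.9, 0.4 and 0.4, the majority vote predicts absence, while the averaged probability still exceeds 0.5.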

