Reliability in Distribution Modeling—A Synthesis and Step-by-Step Guidelines for Improved Practice

2021 ◽  
Vol 9 ◽  
Author(s):  
Anders Bryn ◽  
Trine Bekkby ◽  
Eli Rinde ◽  
Hege Gundersen ◽  
Rune Halvorsen

Information about the distribution of a study object (e.g., species or habitat) is essential in the face of increasing pressure from land or sea use and from climate change. Distribution models are instrumental for acquiring such information, but they are also encumbered by uncertainties caused by different sources of error, bias and inaccuracy that need to be dealt with. In this paper we identify the most common sources of uncertainty and link them to different phases in the modeling process. Our aim is to outline the implications of these uncertainties for the reliability of distribution models and to summarize the precautions that need to be taken. We performed a step-by-step assessment of errors, biases and inaccuracies related to the five main steps in a standard distribution modeling process: (1) ecological understanding, assumptions and problem formulation; (2) data collection and preparation; (3) choice of modeling method, model tuning and parameterization; (4) evaluation of models; and, finally, (5) implementation and use. Our synthesis highlights the need to consider the entire distribution modeling process when the reliability and applicability of the models are assessed. A key recommendation is to evaluate the model properly by use of a dataset that is collected independently of the training data. We support initiatives to establish international protocols and open geodatabases for distribution models.

2011 ◽  
Vol 4 (4) ◽  
pp. 390-401 ◽  
Author(s):  
Gary N. Ervin ◽  
D. Christopher Holly

Species distribution modeling is a tool that is gaining widespread use in the projection of future distributions of invasive species and has important potential as a tool for monitoring invasive species spread. However, the transferability of models from one area to another has been inadequately investigated. This study aimed to determine the degree to which species distribution models (SDMs) for cogongrass, developed with distribution data from Mississippi (USA), could be applied to a similar area in neighboring Alabama. Cogongrass distribution data collected in Mississippi were used to train an SDM that was then tested for accuracy and transferability with cogongrass distribution data collected by a forest management company in Alabama. Analyses indicated the SDM had a relatively high predictive ability within the region of the training data but had poor transferability to the Alabama data. Analysis of the Alabama data, via independent SDM development, indicated that predicted cogongrass distribution in Alabama was more strongly correlated with soil variables than was the case in Mississippi, where the SDM was most strongly correlated with tree canopy cover. Results suggest that model transferability is influenced strongly by (1) data collection methods, (2) landscape context of the survey data, and (3) variations in qualitative aspects of environmental data used in model development.
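One way to make the idea of transferability concrete is to compare a discrimination score on the training region with the same score on an independent region. The sketch below is only an illustration: it assumes AUC as the metric (the abstract does not say which accuracy measure was used), and all scores and presence/absence labels are made-up values, not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen presence (label 1)
    receives a higher score than a randomly chosen absence (label 0),
    with ties counted as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted suitability and observed presence/absence
# in the region used for training.
train_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
train_labels = [1, 1, 1, 0, 0, 0]

# Hypothetical predictions of the same model in an independent region.
transfer_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.5]
transfer_labels = [1, 1, 0, 0, 0, 1]

auc_train = auc(train_scores, train_labels)
auc_transfer = auc(transfer_scores, transfer_labels)
print(f"AUC in training region:    {auc_train:.3f}")
print(f"AUC in independent region: {auc_transfer:.3f}")
```

A large drop from the first value to the second would be one symptom of poor transferability; other metrics could be substituted for `auc` without changing the comparison.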


Symmetry ◽  
2018 ◽  
Vol 10 (9) ◽  
pp. 369 ◽  
Author(s):  
Huawei Zhai ◽  
Licheng Cui ◽  
Yu Nie ◽  
Xiaowei Xu ◽  
Weishi Zhang

To meet real-time public travel demand, bus operators need to adjust timetables promptly, which requires predicting short-term variations in passenger flow. With the help of advanced public transportation systems, a large amount of real-time passenger flow data is collected from automatic passenger counters, automatic fare collection systems, and similar sources. Using these data, different kinds of methods have been proposed to predict future variations in short-term bus passenger flow. Based on their properties and background knowledge, these methods are classified into three categories: linear, nonlinear and combined methods. Their performance is evaluated in detail with respect to prediction accuracy, the complexity of the training data structure, and the modeling process. For comparison, some long-term prediction methods are also briefly analyzed. Finally, the paper points out that, as automatic technology collects ever larger amounts of passenger flow data, using big data technology to speed up data preprocessing and modeling may be a direction worthy of future study.


2019 ◽  
Vol 11 (4) ◽  
pp. 1138
Author(s):  
Gabriel Cardoza-Martínez ◽  
Jorge Becerra-López ◽  
Citlalli Esparza-Estrada ◽  
José Estrada-Rodríguez ◽  
Alexander Czaja ◽  
...  

It has frequently been reported that species with strong niche conservatism will not be able to adapt to new climatic conditions, so they must migrate or go extinct. We have evaluated the shifts in climatic niche occupation of the species Astrophytum coahuilense and its potential distribution in Mexico. We understand niche occupation as the geographic zones with available habitats and with the presence of the species. To assess shifts in climatic niche occupation, we used niche overlap analysis, while potential distribution modeling was performed based on the principle of maximum entropy. The results indicate that this species presents a limited amplitude in its climate niche. This restriction of the climatic niche of A. coahuilense limits its ability to colonize new geographical areas with different climatic environments. On the other hand, the potential distribution models obtained from the present study allow us to identify potential zones based on the climatic requirements of the species. This information is important to identify high priority areas for the conservation of A. coahuilense.
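Niche overlap between two conditions is often summarized with a single index. The sketch below uses Schoener's D as an example; this is an assumption for illustration, since the abstract does not name the overlap metric used, and the occupancy values are hypothetical.

```python
def schoener_d(occ_a, occ_b):
    """Schoener's D for two occupancy vectors over the same grid cells.

    Each vector is normalized to sum to 1, then
    D = 1 - 0.5 * sum(|a_i - b_i|).
    D ranges from 0 (no overlap) to 1 (identical distributions).
    """
    if len(occ_a) != len(occ_b):
        raise ValueError("vectors must cover the same cells")
    total_a, total_b = sum(occ_a), sum(occ_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each vector needs a positive total")
    norm_a = [v / total_a for v in occ_a]
    norm_b = [v / total_b for v in occ_b]
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(norm_a, norm_b))


# Hypothetical occupancy densities over four cells of environmental space.
period_1 = [0.1, 0.4, 0.4, 0.1]
period_2 = [0.0, 0.2, 0.5, 0.3]
print(f"Schoener's D: {schoener_d(period_1, period_2):.2f}")
```

Values near 1 would suggest little change in niche occupation between the two conditions, while values near 0 would suggest a strong shift.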


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e4095 ◽  
Author(s):  
Jason L. Brown ◽  
Joseph R. Bennett ◽  
Connor M. French

SDMtoolbox 2.0 is a software package for spatial studies of ecology, evolution, and genetics. The release of SDMtoolbox 2.0 allows researchers to use the most current ArcGIS software and MaxEnt software, and reduces the amount of time that would be spent developing common solutions. The central aim of this software is to automate complicated and repetitive spatial analyses in an intuitive graphical user interface. One core tenet is to facilitate careful parameterization of species distribution models (SDMs) to maximize each model’s discriminatory ability and minimize overfitting. This includes careful processing of occurrence data and environmental data, as well as model parameterization. The program directly interfaces with MaxEnt, one of the most powerful and widely used species distribution modeling software programs, although SDMtoolbox 2.0 is not limited to species distribution modeling or restricted to modeling in MaxEnt. Many of the SDM pre- and post-processing tools have ‘universal’ analogs for use with any modeling software. The current version contains a total of 79 scripts that harness the power of ArcGIS for macroecology, landscape genetics, and evolutionary studies. For example, these tools allow for biodiversity quantification (such as species richness or corrected weighted endemism), generation of least-cost paths and corridors among shared haplotypes, assessment of the significance of spatial randomizations, and enforcement of dispersal limitations of SDMs projected into future climates, to name only a few of the functions contained in SDMtoolbox 2.0. Lastly, dozens of generalized tools exist for batch processing and conversion of GIS data types or formats, which are broadly useful to any ArcMap user.


2021 ◽  
Vol 118 (23) ◽  
pp. e2022169118
Author(s):  
Jamie Hudson ◽  
Juan Carlos Castilla ◽  
Peter R. Teske ◽  
Luciano B. Beheregaray ◽  
Ivan D. Haigh ◽  
...  

Explaining why some species are widespread, while others are not, is fundamental to biogeography, ecology, and evolutionary biology. A unique way to study evolutionary and ecological mechanisms that either limit species’ spread or facilitate range expansions is to conduct research on species that have restricted distributions. Nonindigenous species, particularly those that are highly invasive but have not yet spread beyond the introduced site, represent ideal systems to study range size changes. Here, we used species distribution modeling and genomic data to study the restricted range of a highly invasive Australian marine species, the ascidian Pyura praeputialis. This species is an aggressive space occupier in its introduced range (Chile), where it has fundamentally altered the coastal community. We found high genomic diversity in Chile, indicating high adaptive potential. In addition, genomic data clearly showed that a single region from Australia was the only donor of genotypes to the introduced range. We identified over 3,500 km of suitable habitat adjacent to its current introduced range that has so far not been occupied, and importantly species distribution models were only accurate when genomic data were considered. Our results suggest that a slight change in currents, or a change in shipping routes, may lead to an expansion of the species’ introduced range that will encompass a vast portion of the South American coast. Our study shows how the use of population genomics and species distribution modeling in combination can unravel mechanisms shaping range sizes and forecast future range shifts of invasive species.


2016 ◽  
Vol 283 (1826) ◽  
pp. 20152817 ◽  
Author(s):  
Kaitlin C. Maguire ◽  
Diego Nieto-Lugilde ◽  
Jessica L. Blois ◽  
Matthew C. Fitzpatrick ◽  
John W. Williams ◽  
...  

Species distribution models (SDMs) assume species exist in isolation and do not influence one another's distributions, thus potentially limiting their ability to predict biodiversity patterns. Community-level models (CLMs) capitalize on species co-occurrences to fit shared environmental responses of species and communities, and therefore may result in more robust and transferable models. Here, we conduct a controlled comparison of five paired SDMs and CLMs across changing climates, using palaeoclimatic simulations and fossil-pollen records of eastern North America for the past 21 000 years. Both SDMs and CLMs performed poorly when projected to time periods that are temporally distant and climatically dissimilar from those in which they were fit; however, CLMs generally outperformed SDMs in these instances, especially when models were fit with sparse calibration datasets. Additionally, CLMs did not over-fit training data, unlike SDMs. The expected emergence of novel climates presents a major forecasting challenge for all models, but CLMs may better rise to this challenge by borrowing information from co-occurring taxa.


2009 ◽  
Vol 21 (6) ◽  
pp. 1776-1795 ◽  
Author(s):  
Dalei Wu

α-integration and α-GMM have been recently proposed for integrated stochastic modeling. However, there has not been an approach to date for estimating the model parameters of α-GMM in a statistical way, based on a set of training data. In this letter, parameter updating formulas are mathematically derived based on the maximum likelihood criterion using an adapted expectation-maximization algorithm. With this method, model parameters for α-GMM are reestimated in an iterative way. The updating formulas were found to be simple and systematically compatible with the GMM equations. This advantage renders the α-GMM a superset of the GMM but with similar computational complexity. This method has been effectively applied to realistic speaker recognition applications.


2020 ◽  
Author(s):  
Willson Gaul ◽  
Dinara Sadykova ◽  
Hannah J. White ◽  
Lupe León-Sánchez ◽  
Paul Caplat ◽  
...  

Biological records are often the data of choice for training predictive species distribution models (SDMs), but spatial sampling bias is pervasive in biological records data at multiple spatial scales and is thought to impair the performance of SDMs. We simulated presences and absences of virtual species as well as the process of recording these species to evaluate the effect on species distribution model prediction performance of 1) spatial bias in training data, 2) sample size (the average number of observations per species), and 3) the choice of species distribution modelling method. Our approach is novel in quantifying and applying real-world spatial sampling biases to simulated data. Spatial bias in training data decreased species distribution model prediction performance, but only when the bias was relatively strong. Sample size and the choice of modelling method were more important than spatial bias in determining the prediction performance of species distribution models.


2016 ◽  
Author(s):  
Pascal O Title ◽  
Jordan B Bemmels

Species distribution modeling is a valuable tool with many applications across ecology and evolutionary biology. The selection of biologically meaningful environmental variables that determine relative habitat suitability is a crucial aspect of the modeling pipeline. The 19 bioclimatic variables from WorldClim are frequently employed, primarily because they are easily accessible and available globally for past, present and future climate scenarios. Yet, the availability of relatively few other comparable environmental datasets potentially limits our ability to select appropriate variables that will most successfully characterize a species’ distribution. We identified a set of 16 climatic and two topographic variables in the literature, which we call the ENVIREM dataset, many of which are likely to have direct relevance to ecological or physiological processes determining species distributions. We generated this set of variables at the same resolutions as WorldClim, for the present, mid-Holocene, and Last Glacial Maximum (LGM). For 20 North American vertebrate species, we then assessed whether including the ENVIREM variables led to improved species distribution models compared to models using only the existing WorldClim variables. We found that including the ENVIREM dataset in the pool of variables to select from led to substantial improvements in niche modeling performance in 17 out of 20 species. We also show that, when comparing models constructed with different environmental variables, differences in projected distributions were often greater in the LGM than in the present. These variables are worth consideration in species distribution modeling applications, especially as many of the variables have direct links to processes important for species ecology. We provide these variables for download at multiple resolutions and for several time periods at envirem.github.io. Furthermore, we have written the ‘envirem’ R package to facilitate the generation of these variables from other input datasets.

