Improving species distribution model predictive accuracy using species abundance: Application with boosted regression trees

embarcadero: Species distribution modelling with Bayesian additive regression trees in R

10.1101/774604 ◽

2019 ◽

Cited By ~ 1

Author(s):

Colin J. Carlson

Keyword(s):

Species Distribution ◽

Species Distribution Model ◽

Regression Trees ◽

Species Distribution Modelling ◽

Classification And Regression Tree ◽

Transmission Risk ◽

Distribution Model ◽

Distribution Modelling ◽

Additive Regression ◽

Bayesian Additive Regression Trees

embarcadero is an R package of convenience tools for species distribution modelling with Bayesian additive regression trees (BART), a powerful machine learning approach that has been rarely applied to ecological problems. Like other classification and regression tree methods, BART estimates the probability of a binary outcome based on a set of decision trees. Unlike other methods, BART iteratively generates sets of trees based on a set of priors about tree structure and nodes, and builds a posterior distribution of estimated classification probabilities. So far, BARTs have yet to be applied to species distribution modelling. embarcadero is a workflow wrapper for BART species distribution models, and includes functionality for easy spatial prediction, an automated variable selection procedure, several types of partial dependence visualization, and other tools for ecological application. The embarcadero package is available open source on GitHub and intended for eventual CRAN release. To show how embarcadero can be used by ecologists, I illustrate a BART workflow for a virtual species distribution model. The supplement includes a more advanced vignette showing how BART can be used for mapping disease transmission risk, using the example of Crimean-Congo haemorrhagic fever in Africa.

How to assess species distribution model accuracy: using internal-aspatial or external-spatial methods?

10.7287/peerj.preprints.27257 ◽

2018 ◽

Author(s):

Chunrong Mi ◽

Falk Huettmann ◽

Yumin Guo

Keyword(s):

Species Distribution ◽

Species Distribution Model ◽

Predictive Accuracy ◽

Landscape Planning ◽

Spatial Prediction ◽

Distribution Model ◽

Model Accuracy ◽

Climate Change Research ◽

Spatial Methods ◽

Occurrence Data

Species distribution models (SDMs) have become an increasingly important tool in ecology, biogeography, evolution and, more recently, in conservation management, landscape planning and climate change research. The assessment of their predictive accuracy is a fundamental issue in the development and application of SDMs. Accuracy assessments should be closely connected to the intended use of the model. However, we found that the common evaluation method (which we term internal-aspatial) usually ignores what the spatial prediction map actually looks like and how well it serves the real-world species distribution and the application. Therefore, in this research we propose a spatial method that evaluates model performance by assessing what the prediction maps look like (which we term external-spatial). We took the Hooded Crane (Grus monacha) as a case study to compare the performance of these two methods (internal-aspatial and external-spatial). Both methods were expressed with three commonly used SDM evaluation criteria (AUC, Kappa and TSS). In addition, model accuracy was assessed by evaluating the prediction maps with knowledge of the study species and with the help of alternative occurrence data. We used two popular data mining algorithms (Random Forest and TreeNet) and ran 8 experiments using 1, 3, 5, 8, 11, 21, 29 and 78 predictors, yielding 16 models overall for this assessment. Results indicated that AUC had a significant linear relationship with Kappa and TSS. Both the internal-aspatial and external-spatial methods could achieve high AUC values, and these values were close. This indicates that internal-aspatial model assessments can serve as powerful aspatial assessment metrics even without secondary data. However, internal-aspatial assessment, external-spatial assessment, prediction map evaluation and alternative occurrence data could not distinguish well between models with different sets of predictors. This is the first time the concept of spatial assessment criteria has been expressed and assessed. Overall, we hope to see more studies on meaningful spatial criteria, and more and better methods proposed to evaluate SDMs and distribution maps in the future.
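
The three criteria named in this abstract (AUC, Kappa and TSS) can be illustrated with a minimal Python sketch. It uses the standard textbook definitions, not code from the paper; the function names, the 0.5 threshold and the toy data in the usage note are assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_prob: Sequence[float],
                     threshold: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) after thresholding predicted probabilities.

    The 0.5 default threshold is an assumption, not a value from the paper.
    """
    tp = fp = fn = tn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        elif truth == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def tss(tp: int, fp: int, fn: int, tn: int) -> float:
    """True skill statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random presence outranks a random absence.

    Ties between a presence and an absence count as half a win.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

For example, with hypothetical labels `[1, 1, 1, 0, 0, 0]` and predicted probabilities `[0.9, 0.8, 0.4, 0.6, 0.2, 0.1]`, the thresholded counts are 2 true positives, 1 false positive, 1 false negative and 2 true negatives. Kappa and TSS depend on the chosen threshold, whereas AUC is computed from the ranking of the probabilities alone.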

How to assess species distribution model accuracy: using internal-aspatial or external-spatial methods?

10.7287/peerj.preprints.27257v1 ◽

2018 ◽

Author(s):

Chunrong Mi ◽

Falk Huettmann ◽

Yumin Guo

Keyword(s):

Species Distribution ◽

Species Distribution Model ◽

Predictive Accuracy ◽

Landscape Planning ◽

Spatial Prediction ◽

Distribution Model ◽

Model Accuracy ◽

Climate Change Research ◽

Spatial Methods ◽

Occurrence Data

Species distribution models (SDMs) have become an increasingly important tool in ecology, biogeography, evolution and, more recently, in conservation management, landscape planning and climate change research. The assessment of their predictive accuracy is a fundamental issue in the development and application of SDMs. Accuracy assessments should be closely connected to the intended use of the model. However, we found that the common evaluation method (which we term internal-aspatial) usually ignores what the spatial prediction map actually looks like and how well it serves the real-world species distribution and the application. Therefore, in this research we propose a spatial method that evaluates model performance by assessing what the prediction maps look like (which we term external-spatial). We took the Hooded Crane (Grus monacha) as a case study to compare the performance of these two methods (internal-aspatial and external-spatial). Both methods were expressed with three commonly used SDM evaluation criteria (AUC, Kappa and TSS). In addition, model accuracy was assessed by evaluating the prediction maps with knowledge of the study species and with the help of alternative occurrence data. We used two popular data mining algorithms (Random Forest and TreeNet) and ran 8 experiments using 1, 3, 5, 8, 11, 21, 29 and 78 predictors, yielding 16 models overall for this assessment. Results indicated that AUC had a significant linear relationship with Kappa and TSS. Both the internal-aspatial and external-spatial methods could achieve high AUC values, and these values were close. This indicates that internal-aspatial model assessments can serve as powerful aspatial assessment metrics even without secondary data. However, internal-aspatial assessment, external-spatial assessment, prediction map evaluation and alternative occurrence data could not distinguish well between models with different sets of predictors. This is the first time the concept of spatial assessment criteria has been expressed and assessed. Overall, we hope to see more studies on meaningful spatial criteria, and more and better methods proposed to evaluate SDMs and distribution maps in the future.

Species Distribution Model Predictions of the Critically Endangered Grey Nurse Shark in Australia

10.21203/rs.3.rs-677819/v1 ◽

2021 ◽

Author(s):

Guanfang Su

Keyword(s):

Species Distribution ◽

Species Distribution Model ◽

Boosted Regression Trees ◽

Critically Endangered ◽

Distribution Model ◽

Distribution Models ◽

Nurse Shark ◽

Distributional Range ◽

Specificity And Sensitivity ◽

Leave One Out

Species distribution models (SDMs) are commonly used to forecast how threatened species are influenced by climate change. The grey nurse shark (Carcharias taurus) is a critically endangered species inhabiting both the east and west coasts of Australia, with negligible genetic interchange between the two populations. I used Generalized Linear Models (GLM), Maximum Entropy (MaxEnt) models and Boosted Regression Trees (BRT) to predict the distribution of the grey nurse shark. The data were a sample of presence-only data, derived from known grey nurse shark sighting locations on the east coast of Australia, with pseudo-absences generated and bootstrapped from a restricted background. I verified these models using leave-one-out cross-validation and model metrics including AICc, BIC, percentage of deviance explained, leave-one-out cross-validated R2, AUC, maximum Cohen’s Kappa, specificity and sensitivity. Cross-validated R2 was used as an overall comparison method across model types. I performed out-of-source validation by comparing the model projection with the distributional range of the ragged tooth shark (Carcharias taurus) in South Africa. The prediction of the selected model was consistent with the current distributional range of the ragged tooth shark.
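
The leave-one-out cross-validated R2 mentioned in this abstract can be sketched in Python as follows. The abstract does not give the formula, so this sketch assumes one common definition, 1 - PRESS / SS_tot, and uses an ordinary least-squares line on a single predictor as a stand-in for the GLM, MaxEnt and BRT models; all function names are hypothetical.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_r2(xs: List[float], ys: List[float]) -> float:
    """Leave-one-out cross-validated R2, assumed here to be 1 - PRESS / SS_tot.

    Each observation is held out in turn, the model is refit on the rest,
    and the squared error on the held-out point is accumulated (PRESS).
    """
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot
```

Because every prediction is made for a point the model has not seen, this statistic can fall well below the in-sample R2 and can even be negative for a poorly generalising model, which is what makes it usable as a comparison across model types.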

A Robust Prediction Model for Species Distribution Using Bagging Ensembles with Deep Neural Networks

Remote Sensing ◽

10.3390/rs13081495 ◽

2021 ◽

Vol 13 (8) ◽

pp. 1495

Author(s):

Jehyeok Rew ◽

Yongjang Cho ◽

Eenjun Hwang

Keyword(s):

Neural Networks ◽

Species Distribution ◽

Best Practice ◽

Deep Neural Networks ◽

Species Distribution Models ◽

Species Distribution Model ◽

Environmental Data ◽

Distribution Model ◽

Distribution Models ◽

Robust Prediction

Species distribution models have been used for various purposes, such as conserving species, discovering potential habitats, and obtaining evolutionary insights by predicting species occurrence. Many statistical and machine-learning-based approaches have been proposed to construct effective species distribution models, but with limited success due to spatial biases in presences and imbalanced presence-absences. We propose a novel species distribution model to address these problems based on bootstrap aggregating (bagging) ensembles of deep neural networks (DNNs). We first generate bootstraps considering presence-absence data on spatial balance to alleviate the bias problem. Then we construct DNNs using environmental data from presence and absence locations, and finally combine these into an ensemble model using three voting methods to improve prediction accuracy. Extensive experiments verified the proposed model’s effectiveness for species in South Korea using crowdsourced observations that have spatial biases. The proposed model achieved more accurate and robust prediction results than the current best practice models.
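
The bagging pipeline described in this abstract (balanced bootstraps, one base learner per bootstrap, then voting) can be sketched in Python under heavy simplifications. A one-dimensional threshold rule stands in for each deep neural network, class balance stands in for the paper's spatial balancing, and plain majority voting is shown because the abstract does not name its three voting methods; every function name here is hypothetical.

```python
import random
from typing import List, Tuple

Stump = Tuple[float, int]  # (threshold, direction)


def balanced_bootstrap(presences: List[float], absences: List[float],
                       rng: random.Random) -> List[Tuple[float, int]]:
    """Draw equal numbers of presences and absences, with replacement."""
    k = min(len(presences), len(absences))
    sample = [(x, 1) for x in rng.choices(presences, k=k)]
    sample += [(x, 0) for x in rng.choices(absences, k=k)]
    return sample


def fit_stump(sample: List[Tuple[float, int]]) -> Stump:
    """Stand-in base learner (NOT a DNN): threshold midway between class means."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    direction = 1 if mean_pos >= mean_neg else -1
    return (mean_pos + mean_neg) / 2.0, direction


def predict_stump(model: Stump, x: float) -> int:
    """Predict presence (1) or absence (0) for one environmental value."""
    threshold, direction = model
    return 1 if direction * (x - threshold) >= 0 else 0


def fit_bagging(presences: List[float], absences: List[float],
                n_models: int = 25, seed: int = 0) -> List[Stump]:
    """Fit one base learner per balanced bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(balanced_bootstrap(presences, absences, rng))
            for _ in range(n_models)]


def majority_vote(models: List[Stump], x: float) -> int:
    """Hard voting: predict presence when at least half the models do."""
    votes = sum(predict_stump(m, x) for m in models)
    return 1 if 2 * votes >= len(models) else 0
```

Drawing the same number of presences and absences in every bootstrap keeps each base learner from being dominated by the majority class, and averaging over many resamples reduces the variance of any single learner. In the paper's setting the base learners would be DNNs trained on multivariate environmental data.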
