The impact of data quality filtering of opportunistic citizen science data on species distribution model performance

2021 ◽  
Vol 444 ◽  
pp. 109453
Author(s):  
Camille Van Eupen ◽  
Dirk Maes ◽  
Marc Herremans ◽  
Kristijn R.R. Swinnen ◽  
Ben Somers ◽  
...  
2019 ◽  
Author(s):  
A Johnston ◽  
WM Hochachka ◽  
ME Strimas-Mackey ◽  
V Ruiz Gutierrez ◽  
OJ Robinson ◽  
...  

Abstract: Citizen science data are valuable for addressing a wide range of ecological research questions, and there has been a rapid increase in the scope and volume of data available. However, data from large-scale citizen science projects typically present a number of challenges that can inhibit robust ecological inferences. These challenges include species bias, spatial bias, and variation in effort. To demonstrate how to address key challenges in analysing citizen science data, we use the example of estimating species distributions with data from eBird, a large semi-structured citizen science project. We estimate two widely applied metrics of species distributions: encounter rate and occupancy probability. For each metric, we assess the impact of data processing steps that either degrade or refine the data used in the analyses. We also test whether differences in model performance are maintained at different sample sizes. Model performance improved when data processing and analytical methods addressed the challenges arising from citizen science data. The largest gains in model performance were achieved with: 1) the use of complete checklists (where observers report all the species they detect and identify); and 2) the use of covariates describing variation in effort and detectability for each checklist. Occupancy models were more robust to a lack of complete checklists and effort variables. Improvements in model performance with data refinement were more evident with larger sample sizes. Here, we describe processes to refine semi-structured citizen science data to estimate species distributions. We demonstrate the value of complete checklists, which can inform the design and adaptation of citizen science projects. We also demonstrate the value of information on effort. The methods we have outlined are also likely to improve other forms of inference, and will enable researchers to conduct robust analyses and harness the vast ecological knowledge that exists within citizen science data.
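The two refinements named in the abstract above, complete checklists and effort information, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Checklist` fields, the `refine` and `encounter_rate` helpers, the effort limits of 300 minutes and 5 km, and the sample records are hypothetical and are not taken from eBird or from the paper. The abstract describes effort as model covariates; the sketch only drops checklists with extreme effort and keeps the effort fields so that a model could use them as covariates later.

```python
from dataclasses import dataclass


@dataclass
class Checklist:
    complete: bool        # observer reported every species detected and identified
    duration_min: float   # effort: time spent observing, in minutes
    distance_km: float    # effort: distance travelled, in kilometres
    detected: bool        # whether the focal species was reported


def refine(checklists, max_duration_min=300.0, max_distance_km=5.0):
    """Keep complete checklists whose effort is within the assumed limits."""
    return [
        c for c in checklists
        if c.complete
        and c.duration_min <= max_duration_min
        and c.distance_km <= max_distance_km
    ]


def encounter_rate(checklists):
    """Naive encounter rate: share of checklists reporting the focal species."""
    if not checklists:
        raise ValueError("no checklists left after filtering")
    return sum(c.detected for c in checklists) / len(checklists)


# Illustrative records only, not real eBird data.
sample = [
    Checklist(True, 60, 1.0, True),
    Checklist(True, 45, 0.5, False),
    Checklist(False, 30, 0.2, True),   # incomplete: absence cannot be inferred
    Checklist(True, 600, 40.0, True),  # effort far outside the assumed limits
]
kept = refine(sample)
rate = encounter_rate(kept)
```

Only complete checklists allow a missing species to be read as a non-detection, which is why the incomplete record is dropped before the rate is computed. A real analysis would model detection as a function of the retained effort fields rather than rely on a raw proportion.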


2018 ◽  
Author(s):  
Daniel Zamorano ◽  
Fabio Labra ◽  
Marcelo Villarroel ◽  
Luca Mao ◽  
Shaw Lucy ◽  
...  

Despite the theoretical relationship between body size and model performance, the effect of body size on the performance of species distribution models (SDMs) has been assessed in only a few studies of terrestrial taxa. We aim to assess the effect of body size on SDM performance in river fish. We study seven Chilean freshwater fish species, using models trained with three different sets of predictor variables: ecological (Eco), anthropogenic (Antr) and both (Eco+Antr). Our results indicate that the performance of the Eco+Antr models improves with fish size. These results highlight the importance of two novel predictive layers: the source of river flow and the overproduction of biotopes by anthropogenic activities. We compare our work with previous studies that modeled river fish, and observe a similar relationship in most cases. We discuss the current challenges of modeling riverine species, and how our work helps suggest possible solutions.


2021 ◽  
Vol 13 (10) ◽  
pp. 1904
Author(s):  
Walter De Simone ◽  
Marina Allegrezza ◽  
Anna Rita Frattaroli ◽  
Silvia Montecchiari ◽  
Giulio Tesei ◽  
...  

Remote sensing (RS) has been widely adopted as a tool to investigate several biotic and abiotic factors that are directly or indirectly related to biodiversity conservation. European grasslands are among the most biodiverse habitats in Europe. Most of these habitats are subject to priority conservation measures, and several human-induced processes threaten them. The broad expansion of a few dominant species is usually reported as a driver of biodiversity loss. In this context, using Sentinel-2 (S2) images, we investigate the distribution of one of the most rapidly spreading species in the Central Apennines: Brachypodium genuense. We performed a binary Random Forest (RF) classification of B. genuense using RS images and field-sampled presence/absence data. Then, we integrated the occurrences obtained from the RS classification into species distribution models to identify the topographic drivers of B. genuense distribution in the study area. Lastly, the impact of B. genuense distribution on the Natura 2000 (N2k) habitats (Annex I of the European Habitats Directive) was assessed by overlay analysis. The RF classification process detected cover of B. genuense with an overall accuracy of 94.79%. The topographic species distribution model shows that the topographic variables that most influence the distribution of B. genuense are, in order of importance, slope, elevation, solar radiation, and topographic wetness index (TWI). The overlay analysis shows that 74.04% of the B. genuense identified in the study area falls on semi-natural dry grasslands. The study highlights the importance of RS classification and the topographic species distribution model as an integrated workflow for mapping a broadly expanding species such as B. genuense. The coupled techniques presented in this work should be applicable to other plant communities with remotely recognizable characteristics, supporting more effective management of N2k habitats.
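The overall accuracy reported for a binary presence/absence classification is conventionally the share of validation samples that were classified correctly. A minimal Python sketch of that calculation follows. The `overall_accuracy` helper and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the validation data behind the 94.79% figure in the abstract above.

```python
def overall_accuracy(tp, tn, fp, fn):
    """Overall accuracy of a binary classifier from confusion-matrix counts.

    tp: presences classified as presence
    tn: absences classified as absence
    fp: absences classified as presence
    fn: presences classified as absence
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical validation counts, not taken from the study.
acc = overall_accuracy(tp=40, tn=50, fp=6, fn=4)
print(f"overall accuracy: {acc:.2%}")
```

Overall accuracy weights presence and absence errors equally, so it is usually read alongside per-class measures when the two classes are unbalanced.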

