Occurrence cubes: a new paradigm for aggregating species occurrence data

2020 ◽  
Author(s):  
Damiano Oldoni ◽  
Quentin Groom ◽  
Tim Adriaens ◽  
Amy J.S. Davis ◽  
Lien Reyserhove ◽  
...  

In this paper we describe a method of aggregating species occurrence data into what we call “occurrence cubes”. The aggregated data can be perceived as a cube with three dimensions - taxonomic, temporal and geographic - and take into account the spatial uncertainty of each occurrence. The aggregation level of each of the three dimensions can be adapted to the scope. Built on Open Science principles, the method is easily automated and reproducible, and can be used for species trend indicators, maps and distribution models. We are using the method to aggregate species occurrence data for Europe per taxon, year and 1 km² cell of the European reference grid, to feed indicators and risk mapping/modelling for the Tracking Invasive Alien Species (TrIAS) project.

Author(s):  
Damiano Oldoni ◽  
Quentin Groom ◽  
Peter Desmet

The digital era has brought about an impressive increase in the volume of published species occurrence data. Research infrastructures such as the Global Biodiversity Information Facility (GBIF), the digitization of legacy data, and the use of mobile applications have all played a role in this transition. More data unavoidably implies more heterogeneity at multiple levels, as a result of the different methods and standards used to collect data. Data standardization and aggregation help to reduce this heterogeneity. Furthermore, intermediate data products that can be used for activities such as mapping, modeling and monitoring improve the repeatability and reproducibility of biodiversity research (Kissling et al. 2017). Occurrences can be defined as events in a three-dimensional space where the dimensions are taxonomic (what), temporal (when) and spatial (where). They are then aggregated into what we call an occurrence cube (Fig. 1). The taxonomic dimension is categorical. Research infrastructures like GBIF use a taxonomic backbone, thus making data aggregation at species level or higher rank relatively easy. The temporal dimension is a continuum, and the temporal uncertainty is usually smaller than the aggregation span, which is typically a year. Regarding the spatial dimension, occurrences are typically filtered to remove those with too large an uncertainty to fit the grid scheme being used, which means that the spatial uncertainty is largely unused. We developed a method to take this spatial uncertainty into account while aggregating data. In particular, we state that an occurrence is spatially representable as a closed plane figure such as a circle, hexagon or square, never as its geometric centre (centroid). For GBIF occurrence data, the coordinateUncertaintyInMeters is defined as the radius of the smallest circle containing the whole of the location (see Darwin Core standard). 
So, spatially speaking, we refer to occurrences as circles, even if the method described below is general. After harvesting the occurrence data and providing a data quality assessment (e.g. removing occurrences without coordinates or with suspicious coordinates) we can assign occurrences to a reference grid such as the European reference grid of the European Environment Agency (EEA) at 1 km scale. In this spatial aggregation we randomly choose a point within the occurrence circle and assign it to the grid cell in which it is contained. We can aggregate further by time (e.g. by year) and taxonomy (e.g. by species), where aggregating means counting how many occurrences are in each specific taxonomic-spatial-temporal unit. The analogy with geometry goes further: the occurrence cube can, as any cube, be projected on an orthogonal plane by aggregating along one of the three dimensions. In particular, projecting the cube on the taxonomic and temporal dimensions can be done by adding up the number of occurrences, or counting the number of occupied cells, thus estimating the area of occupancy. The occurrence cube paradigm has been developed within the Tracking Invasive Alien Species (TrIAS) project (Vanderhoeven et al. 2017) following Open Science and FAIR principles. We created and published occurrence cubes at the species level for Belgium and Italy (Oldoni et al. 2020b) and the occurrence cubes for non-native taxa in Belgium and Europe (Oldoni et al. 2020a).
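The aggregation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes coordinates are already in a projected reference system measured in metres, it indexes grid cells by integer column and row, and the record fields (`taxon`, `year`, `x`, `y`, `uncertainty_m`) and function names are hypothetical.

```python
import math
import random
from collections import Counter


def random_point_in_circle(x, y, radius, rng):
    """Draw a point uniformly at random from the disc centred on (x, y)."""
    # The square root keeps the density uniform over the disc's area.
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return x + r * math.cos(theta), y + r * math.sin(theta)


def grid_cell(x, y, cell_size=1000.0):
    """Return the (column, row) index of the square cell containing (x, y)."""
    return math.floor(x / cell_size), math.floor(y / cell_size)


def build_cube(occurrences, cell_size=1000.0, seed=42):
    """Count occurrences per (taxon, year, cell) unit.

    Each occurrence is a dict with hypothetical keys: taxon, year,
    x, y (metres) and uncertainty_m (radius of the occurrence circle).
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    cube = Counter()
    for occ in occurrences:
        px, py = random_point_in_circle(
            occ["x"], occ["y"], occ["uncertainty_m"], rng
        )
        cube[(occ["taxon"], occ["year"], grid_cell(px, py, cell_size))] += 1
    return cube


def project_taxon_year(cube):
    """Project the cube onto the taxonomic and temporal dimensions.

    Returns two dicts keyed by (taxon, year): the total number of
    occurrences and the number of occupied cells.
    """
    counts = Counter()
    cells = {}
    for (taxon, year, cell), n in cube.items():
        counts[(taxon, year)] += n
        cells.setdefault((taxon, year), set()).add(cell)
    occupied = {key: len(value) for key, value in cells.items()}
    return counts, occupied
```

With 1 km cells, the number of occupied cells per taxon and year is also an estimate of the area of occupancy in square kilometres. Fixing the random seed is one way to keep the random assignment reproducible; whether the published cubes do this is not stated in the abstract.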


2020 ◽  
Vol 15 (2) ◽  
pp. 69-80
Author(s):  
Jane Elith ◽  
Catherine Graham ◽  
Roozbeh Valavi ◽  
Meinrad Abegg ◽  
Caroline Bruce ◽  
...  

Species distribution models (SDMs) are widely used to predict and study distributions of species. Many different modeling methods and associated algorithms are used and continue to emerge. It is important to understand how different approaches perform, particularly when applied to species occurrence records that were not gathered in structured surveys (e.g. opportunistic records). This need motivated a large-scale, collaborative effort, published in 2006, that aimed to create objective comparisons of algorithm performance. As a benchmark, and to facilitate future comparisons of approaches, here we publish that dataset: point location records for 226 anonymized species from six regions of the world, with accompanying predictor variables in raster (grid) and point formats. A particularly interesting characteristic of this dataset is that independent presence-absence survey data are available for evaluation alongside the presence-only species occurrence data intended for modeling. The dataset is available on Open Science Framework and as an R package and can be used as a benchmark for modeling approaches and for testing new ways to evaluate the accuracy of SDMs.


2018 ◽  
Vol 2 ◽  
pp. e25864
Author(s):  
Rabetrano Tsiky

Recognizing the abundance and accumulation of information and data on biodiversity that are still poorly exploited and even unfunded, the REBIOMA project (Madagascar Biodiversity Networking), in collaboration with partners, has developed an online data portal to provide easy access to information and critical data, and to support conservation planning and the expansion of scientific and professional activities relating to Madagascar's biodiversity. The mission of the REBIOMA data portal is to serve quality-labeled, up-to-date species occurrence data and environmental niche models for Madagascar’s flora and fauna, both marine and terrestrial. REBIOMA is a project of the Wildlife Conservation Society Madagascar and the University of California, Berkeley. Following upload, data are automatically validated against a geographic mask and a taxonomic authority. Data providers can decide whether their data will be public, private, or shared only with selected collaborators. Data reviewers can add quality labels to individual records, allowing selection of data for modeling and conservation assessments according to quality. Portal users can query data in numerous ways. One of the key features of the REBIOMA web portal is its support for species distribution models, created from taxonomically valid and quality-reviewed occurrence data. Species distribution models are produced for species for which there are at least eight reliably reviewed, non-duplicate (per grid cell) records. Maximum Entropy Modeling (MaxEnt for short) is used to produce continuous distribution models from these occurrence records and environmental data for different eras: past (1950), current (2000), and future (2080). The result is generally interpreted as a prediction of habitat suitability. Results for each model are available on the portal and ready for download as ASCII and HTML files. 
The REBIOMA Data Portal address is http://data.rebioma.net, or visit http://www.rebioma.net for more general information about the entire REBIOMA project.
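The eligibility rule for modeling stated above (at least eight reliably reviewed records, counted once per grid cell) could be expressed roughly as follows. This is only a sketch of the rule as worded in the abstract, not the portal's code; the record keys `species`, `cell` and `reliable`, and the function name, are hypothetical.

```python
MIN_RECORDS = 8  # threshold stated in the abstract


def eligible_species(records, min_records=MIN_RECORDS):
    """Return the species that have enough reliable, non-duplicate records.

    Each record is a dict with hypothetical keys: species, cell (any
    hashable grid-cell identifier) and reliable (bool set by a reviewer).
    Records sharing a grid cell count once per species.
    """
    cells_by_species = {}
    for rec in records:
        if not rec["reliable"]:
            continue  # skip records not labelled reliable
        cells_by_species.setdefault(rec["species"], set()).add(rec["cell"])
    return sorted(
        species
        for species, cells in cells_by_species.items()
        if len(cells) >= min_records
    )
```

Storing cells in a set per species removes per-cell duplicates without a separate pass over the data.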


2007 ◽  
Vol 45 (1) ◽  
pp. 239-247 ◽  
Author(s):  
Catherine H Graham ◽  
Jane Elith ◽  
Robert J Hijmans ◽  
Antoine Guisan ◽  
A Townsend Peterson ◽  
...  

2018 ◽  
Author(s):  
Jorge Velásquez-Tibatá ◽  
María H. Olaya-Rodríguez ◽  
Daniel López-Lozano ◽  
César Gutiérrez ◽  
Iván González ◽  
...  

Information on species distribution is recognized as a crucial input for biodiversity conservation and management. To that end, considerable resources have been dedicated towards increasing the quantity and availability of species occurrence data, boosting their use in species distribution modeling and online platforms for their dissemination. Currently, those platforms face the challenge of bringing biology into modeling by making informed decisions that result in meaningful models. Here we describe BioModelos, a modeling approach supported by an online system and a core team, whereby a network of experts contributes to the development of species distribution models by assessing the quality of occurrence data, identifying potentially limiting environmental variables, establishing species’ accessible areas and qualitatively validating model predictions. Models developed through BioModelos become publicly available once validated by experts, furthering their use in conservation applications. This approach has been implemented in Colombia since 2013 and currently consists of a network of nearly 500 experts who collaboratively contribute to enhancing knowledge of the distribution of a growing number of species. It has aided the development of several decision support products such as national risk assessments and biodiversity compensation manuals. BioModelos is an example of operationalization of an essential biodiversity variable at a national level through the implementation of a research infrastructure that enhances the value of open access species data.


2021 ◽  
pp. 308-324
Author(s):  
Rebekah D. Wallace ◽  
Charles T. Bargeron ◽  
Joseph H. LaForest ◽  
Rachel L. Carroll

Author(s):  
Michael K. Young ◽  
Daniel J. Isaak ◽  
Kevin S. McKelvey ◽  
Michael K. Schwartz ◽  
Kellie J. Carim ◽  
...  

Geosciences ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 48
Author(s):  
Margaret F.J. Dolan ◽  
Rebecca E. Ross ◽  
Jon Albretsen ◽  
Jofrid Skarðhamar ◽  
Genoveva Gonzalez-Mirelis ◽  
...  

The use of habitat distribution models (HDMs) has become common in benthic habitat mapping for combining limited seabed observations with full-coverage environmental data to produce classified maps showing predicted habitat distribution for an entire study area. However, relatively few HDMs include oceanographic predictors, or present spatial validity or uncertainty analyses to support the classified predictions. Without reference studies it can be challenging to assess which type of oceanographic model data should be used, or developed, for this purpose. In this study, we compare biotope maps built using predictor variable suites from three different oceanographic models with differing levels of detail on near-bottom conditions. These results are compared with a baseline model without oceanographic predictors. We use associated spatial validity and uncertainty analyses to assess which oceanographic data may be best suited to biotope mapping. Our results show how spatial validity and uncertainty metrics capture differences between HDM outputs which are otherwise not apparent from standard non-spatial accuracy assessments or the classified maps themselves. We conclude that biotope HDMs incorporating high-resolution, preferably bottom-optimised, oceanography data can best minimise spatial uncertainty and maximise spatial validity. Furthermore, our results suggest that incorporating coarser oceanographic data may lead to more uncertainty than omitting such data.


2015 ◽  
Vol 46 (4) ◽  
pp. 159-166 ◽  
Author(s):  
J. Pěknicová ◽  
D. Petrus ◽  
K. Berchová-Bímová

The distribution of invasive plants depends on several environmental factors, e.g. the distance from the vector of spreading, invaded community composition, land use, etc. Species distribution models, a research tool for predicting the spread of invasive plants, combine environmental factors, occurrence data, and a statistical approach. For the construction of the presented distribution model, the occurrence data on invasive plants (Solidago sp., Fallopia sp., Robinia pseudoacacia, and Heracleum mantegazzianum) and Natura 2000 habitat types from the Protected Landscape Area Kokořínsko were intersected in ArcGIS and statistically analyzed. The data analysis focused on (1) verification of the accuracy of the Natura 2000 habitat map layer and its agreement with the habitats occupied by invasive species, and (2) identification of a suitable scale of intersection between the habitat and species distribution. Data suitability was evaluated for the construction of the model on a local scale. Based on the data, the invaded habitat types were described and the optimal grid scale was evaluated. The results show that Natura 2000 habitat types are suitable for modelling; however, more input data (e.g. on soil types, elevation) are needed.

