Documenting Natural History Collections in GBIF

Author(s):  
Tim Robertson ◽  
Marcos Gonzalez ◽  
Morten Høfft ◽  
Marie Grosjean

The Global Biodiversity Information Facility (GBIF) was established by governments in 2001, largely through the initiative and leadership of the natural history collections community, following the 1999 recommendation by a working group under the Megascience Forum (predecessor of the Global Science Forum) of the Organization for Economic Cooperation and Development (OECD). Over 20 years, GBIF has helped develop standards and convened a global community of data-publishing institutions, aggregating over one billion species occurrence records that are freely and openly available for use in research and policy making. Of these, more than 150 million records originate from specimens preserved by the collections community. The recent adoption of the Global Registry of Scientific Collections by GBIF (https://www.gbif.org/news/5kyAslpqTVxYqZTwYn1cub) is GBIF's first step towards a clearer picture of the natural history collections of the world, along with the associated science that they have enabled and continue to enable.
Recognising that other collection metadata initiatives exist, GBIF aims to discuss and progress the following topics with the community: 1) synchronising with existing metadata catalogues to ensure accurate, up-to-date information is available without unnecessary burden for authors; 2) defining, testing and formalizing the Collection Descriptions standard (https://github.com/tdwg/cd); 3) providing clear guidelines of citation practice for collections, potentially building on the success of the Digital Object Identifier (DOI) approach used for datasets mediated through GBIF.org; 4) tracking citations of use through both data downloads and references in literature, such as materials examined in a taxonomic publication; 5) improving the linkages and discoverability of specimen records derived from the same collecting event but preserved in multiple institutions; 6) improving the linkages between the people involved in collecting, preserving, and identifying specimen records through the use of Open Researcher and Contributor IDs (ORCID); and 7) lowering the technical threshold to deploy tools such as “data dashboards” and specimen search/download on collection-related websites. The progress made to date will be summarised and a roadmap for the future will be introduced.

Author(s):  
Niels Raes ◽  
Emily van Egmond ◽  
Ana Casino ◽  
Matt Woodburn ◽  
Deborah L Paul

With the digitisation of natural history collections over the past decades, their traditional roles in taxonomic studies and public education have greatly expanded into biodiversity assessments, climate change impact studies, trait analyses, sequencing, 3D object analyses and more (Nelson and Ellis 2019; Watanabe 2019). Initial estimates of the size of the global natural history collection range between 1.2 and 2.1 billion specimens (Ariño 2010), of which 169 million (8-14%, as of April 2019) are available at some level of digitisation through the Global Biodiversity Information Facility (GBIF). With iDigBio (Integrated Digitized Biocollections) established in the United States and the European DiSSCo (Distributed System of Scientific Collections) accepted on the ESFRI roadmap, digitising natural history collections at an industrial scale has become a priority. Both iDigBio and DiSSCo aim at mobilising, unifying and delivering bio- and geo-diversity information at the scale, form and precision required by scientific communities, thereby transforming a fragmented landscape into a coherent and responsive research infrastructure. To prioritise digitisation according to scientific demand, and to do so efficiently using industrial digitisation pipelines, a uniform and unambiguously accepted collection description standard is needed, one that allows natural history collections to be compared, grouped and analysed at diverse levels. Several initiatives attempt to unambiguously describe natural history collections using taxonomic and storage classification schemes. These initiatives include One World Collection, Global Registry of Scientific Collections (GRSciColl), TDWG (Taxonomic Databases Working Group) Natural Collection Descriptions (NCD) and CETAF (Consortium of European Taxonomic Facilities) passports, among others.
In a collaborative effort of DiSSCo, ICEDIG (Innovation and consolidation for large scale digitisation of natural heritage), iDigBio, TDWG and the Task Group Collection Digitisation Dashboards, the various schemes were compared in a cross-walk analysis to propose a preliminary natural collection description standard that is supported by the wider community. In the process, two main user groups of collection description standards were identified: scientists and collection managers. The classification produced is intended to meet the requirements of both, resulting in three classification schemes that exist in parallel to each other (van Egmond et al. 2019). For scientific purposes a ‘Taxonomic’ and a ‘Stratigraphic’ classification were defined, and for management purposes a ‘Storage’ classification. The latter is derived from specimen preservation types (e.g. dried, liquid preserved), which define storage requirements and the physical location of specimens in collection holding facilities. The three parallel collection classifications can be cross-sectioned with a ‘Geographic’ classification to assign sub-collections to major terrestrial and marine regions, which allows scientists to identify particular taxonomic or stratigraphic (sub-)collections from major terrestrial or marine regions of interest. Finally, to measure the level of digitisation of institutional collections and the progress of digitisation through time, the number of digitised specimens for each geographically cross-sectioned (sub-)collection can be derived from institutional collection management systems (CMS). As digitisation has different levels of completeness, a ‘Digitisation’ scheme has been adopted from Saarenmaa et al. (2019) to quantify the level of digitisation of a collection, ranging from ‘not digitised’ to ‘extensively digitised’ and recorded on a progressive scale of MIDS (Minimal Information for Digital Specimen).
The applicability of this preliminary classification will be discussed and visualized in a Collection Digitisation Dashboard (CDD) to demonstrate how the implementation of a collection description standard allows the identification of existing gaps in the taxonomic and geographic coverage and in the levels of digitisation of natural history collections. This set of common classification schemes and dashboard design (van Egmond et al. 2019) will be contributed to the TDWG Collection Description interest group to ultimately arrive at the common goal of a 'World Collection Catalogue'.
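The cross-sectioning described in this abstract can be sketched as a small data model. This is only an illustration under stated assumptions, not the scheme of van Egmond et al. 2019: the class name `SubCollection`, its fields, the example categories and regions, the 50% threshold and all counts are hypothetical, and MIDS levels are left out for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SubCollection:
    """One cell of a classification scheme cross-sectioned by region (hypothetical model)."""
    scheme: str      # 'Taxonomic', 'Stratigraphic' or 'Storage'
    category: str    # illustrative category within the scheme
    region: str      # illustrative geographic region
    specimens: int   # estimated number of specimens held
    digitised: int   # number of specimens digitised at any level


def digitisation_by_region(cells, scheme):
    """Return {region: fraction digitised} for one classification scheme."""
    held = defaultdict(int)
    done = defaultdict(int)
    for cell in cells:
        if cell.scheme == scheme:
            held[cell.region] += cell.specimens
            done[cell.region] += cell.digitised
    return {region: done[region] / held[region] for region in held if held[region]}


# Made-up numbers, for illustration only.
cells = [
    SubCollection("Taxonomic", "Insects", "Neotropics", 1000, 100),
    SubCollection("Taxonomic", "Vascular plants", "Neotropics", 1000, 400),
    SubCollection("Taxonomic", "Insects", "Europe", 500, 400),
    SubCollection("Storage", "Dried", "Europe", 800, 200),
]

coverage = digitisation_by_region(cells, "Taxonomic")
# Regions where less than half of the taxonomic holdings are digitised (arbitrary threshold).
gaps = sorted(region for region, share in coverage.items() if share < 0.5)
```

A dashboard could render `coverage` as a map or bar chart; the point of the sketch is that gaps only become comparable across institutions once every institution reports the same cells.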


Author(s):  
Erica Krimmel ◽  
Austin Mast ◽  
Deborah Paul ◽  
Robert Bruhn ◽  
Nelson Rios ◽  
...  

Genomic evidence suggests that the causative virus of COVID-19 (SARS-CoV-2) was introduced to humans from horseshoe bats (family Rhinolophidae) (Andersen et al. 2020) and that species in this family as well as in the closely related Hipposideridae and Rhinonycteridae families are reservoirs of several SARS-like coronaviruses (Gouilh et al. 2011). Specimens collected over the past 400 years and curated by natural history collections around the world provide an essential reference as we work to understand the distributions, life histories, and evolutionary relationships of these bats and their viruses. While the importance of biodiversity specimens to emerging infectious disease research is clear, empowering disease researchers with specimen data is a relatively new goal for the collections community (DiEuliis et al. 2016). Recognizing this, a team from Florida State University is collaborating with partners at GEOLocate, Bionomia, University of Florida, the American Museum of Natural History, and Arizona State University to produce a deduplicated, georeferenced, vetted, and versioned data product of the world's specimens of horseshoe bats and relatives for researchers studying COVID-19. The project will serve as a model for future rapid data product deployments about biodiversity specimens. The project underscores the value of biodiversity data aggregators iDigBio and the Global Biodiversity Information Facility (GBIF), which are sources for 58,617 and 79,862 records, respectively, as of July 2020, of horseshoe bat and relative specimens held by over one hundred natural history collections. Although much of the specimen-based biodiversity data served by iDigBio and GBIF is high quality, it can be considered raw data and therefore often requires additional wrangling, standardizing, and enhancement to be fit for specific applications. 
The project will create efficiencies for the coronavirus research community by producing an enhanced, research-ready data product, which will be versioned and published through Zenodo, an open-access repository (see doi.org/10.5281/zenodo.3974999). In this talk, we highlight lessons learned from the initial phases of the project, including deduplicating specimen records, standardizing country information, and enhancing taxonomic information. We also report on our progress to date in enhancing information about agents (e.g., collectors or determiners) associated with these specimens and in georeferencing specimen localities. We also seek to explore how far the added agent information (i.e., ORCID iDs and Wikidata Q identifiers) can inform our georeferencing efforts and support crediting those who collect and identify specimens. The project will georeference approximately one third of our specimen records, namely those that lack geospatial coordinates but contain textual locality descriptions. We furthermore provide an overview of our holistic approach to enhancing specimen records, which we hope will maximize the value of the bat specimens at the center of what has recently been termed the "extended specimen network" (Lendemer et al. 2020). The centrality of the physical specimen in the network reinforces the importance of archived materials for reproducible research. Recognizing this, we view the collections providing data to iDigBio and GBIF as essential partners, as we expect that they will be responsible for the long-term management of enhanced data associated with the physical specimens they curate. We hope that this project can provide a model for better facilitating the reintegration of enhanced data back into local specimen data management systems.
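The abstract does not say how records were deduplicated or how country values were standardized, so the following is only a minimal sketch of one plausible approach. It assumes that a physical specimen can be identified by the Darwin Core terms `institutionCode`, `collectionCode` and `catalogNumber`; the `source` field, the lookup table and the sample records are hypothetical.

```python
# Tiny illustrative lookup; a real one would cover many more country-name variants.
COUNTRY_SYNONYMS = {
    "viet nam": "Vietnam",
    "vietnam": "Vietnam",
    "p.r. china": "China",
    "china": "China",
}


def standardise_country(raw):
    """Map a free-text country value to one spelling; leave unknown values as given."""
    cleaned = " ".join(raw.split())
    return COUNTRY_SYNONYMS.get(cleaned.lower(), cleaned)


def record_key(record):
    """Assumed identity of a physical specimen: institution, collection, catalogue number."""
    return tuple(
        record.get(term, "").strip().lower()
        for term in ("institutionCode", "collectionCode", "catalogNumber")
    )


def deduplicate(records):
    """Keep the first record seen for each key, noting every aggregator that served it."""
    merged = {}
    for record in records:
        key = record_key(record)
        if key not in merged:
            country = standardise_country(record.get("country", ""))
            merged[key] = dict(record, country=country, sources=[])
        if record["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(record["source"])
    return list(merged.values())


# Invented sample records.
records = [
    {"source": "GBIF", "institutionCode": "XYZ", "collectionCode": "Mamm",
     "catalogNumber": "123", "country": "Viet Nam"},
    {"source": "iDigBio", "institutionCode": "xyz", "collectionCode": "mamm",
     "catalogNumber": "123 ", "country": "vietnam"},
    {"source": "GBIF", "institutionCode": "XYZ", "collectionCode": "Mamm",
     "catalogNumber": "124", "country": "P.R.  China"},
]
unique = deduplicate(records)
```

Exact-key matching like this would miss duplicates whose catalogue numbers are formatted differently across aggregators, which is one reason the vetting step mentioned in the abstract matters.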


Author(s):  
David Shorthouse ◽  
Roderic Page

Through the Bloodhound proof-of-concept (https://bloodhound-tracker.net), an international audience of collectors and determiners of natural history specimens is engaged in the emotive act of claiming their specimens and attributing other specimens to living and deceased mentors and colleagues. Behind the scenes, these claims build links between Open Researcher and Contributor Identifiers (ORCID, https://orcid.org) or Wikidata identifiers for people and Global Biodiversity Information Facility (GBIF) specimen identifiers, predicated by the Darwin Core terms recordedBy (collected) and identifiedBy (determined). Here we additionally describe the socio-technical challenge of unequivocally resolving people's names in legacy specimen data and propose lightweight and reusable solutions. The unique identifiers for the affiliations of active researchers are obtained from ORCID, whereas the unique identifiers for institutions where specimens are actively curated are resolved through Wikidata. By constructing closed loops of links between person, specimen, and institution, an interesting suite of potential metrics emerges, all due to the activities of employees and their network of professional relationships. This approach balances the desire of individuals to receive formal recognition for their efforts in natural history collections with an institutional-level need to adjust budgets in response to easily obtained numeric trends in national and international reach. If handled in a coordinated fashion, this reporting technique may be a significant new driver for specimen digitization efforts, on par with Altmetric (https://www.altmetric.com), an important new tool that tracks the impact of publications and delights administrators and authors alike.
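The person, specimen and institution links described above can be pictured as simple triples. The sketch below is not Bloodhound's implementation: every identifier is invented, and `external_reach` is a hypothetical example of the kind of metric such closed loops might support.

```python
from collections import defaultdict

# (person identifier, Darwin Core term, GBIF specimen identifier); all values are invented.
claims = [
    ("orcid:0000-0000-0000-0001", "recordedBy", "gbif:1"),
    ("orcid:0000-0000-0000-0001", "identifiedBy", "gbif:1"),
    ("orcid:0000-0000-0000-0001", "identifiedBy", "gbif:2"),
    ("wikidata:Q0", "recordedBy", "gbif:2"),
]
# Where each person works and where each specimen is curated (invented identifiers).
affiliation = {"orcid:0000-0000-0000-0001": "institution:A"}
curated_at = {"gbif:1": "institution:A", "gbif:2": "institution:B"}


def external_reach(claims, affiliation, curated_at):
    """Count, per employing institution, the specimens its staff collected or
    identified that are curated at a different institution."""
    reach = defaultdict(set)
    for person, _term, specimen in claims:
        home = affiliation.get(person)
        holder = curated_at.get(specimen)
        if home and holder and home != holder:
            reach[home].add(specimen)
    return {institution: len(specimens) for institution, specimens in reach.items()}


reach = external_reach(claims, affiliation, curated_at)
```

People without a known affiliation (here the Wikidata-identified collector) simply drop out of this particular metric, which mirrors the abstract's point that affiliations come from ORCID records of active researchers.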


2018 ◽  
Vol 2 ◽  
pp. e26473
Author(s):  
Molly Phillips ◽  
Anne Basham ◽  
Marc Cubeta ◽  
Kari Harris ◽  
Jonathan Hendricks ◽  
...  

Natural history collections around the world are currently being digitized with the resulting data and associated media now shared online in aggregators such as the Global Biodiversity Information Facility and Integrated Digitized Biocollections (iDigBio). These collections and their resources are accessible and discoverable through online portals to not only researchers and collections professionals, but to educators, students, and other potential downstream users. Primary and secondary education (K-12) in the United States is going through its own revolution with many states adopting Next Generation Science Standards (NGSS https://www.nextgenscience.org/). The new standards emphasize science practices for analyzing and interpreting data and connect to cross-cutting concepts such as cause and effect and patterns. NGSS and natural history collections data portals seem to complement each other. Nevertheless, many educators and students are unaware of the digital resources available or are overwhelmed with working in aggregated databases created by scientists. To better address this challenge, participants within the National Science Foundation Advancing Digitization for Biodiversity Collections program (ADBC) have been working to increase awareness of, and scaffold learning for, digitized collections with K-12 educators and learners. They are accomplishing this through individual programs at institutions across the country as part of the Thematic Collections Networks and collaboratively through the iDigBio Education and Outreach Working Group. ADBC partners have focused on incorporating digital data and resources into K-12 classrooms through training workshops and webinars for both educators and collections professionals, as well as through creating educational resources, websites, and applications that use digital collections data. 
This presentation includes lessons learned from engaging K-12 audiences with digital data, summarizes available resources for both educators and collections professionals, shares how to become involved, and provides ways to facilitate transfer of educational resources to the K-12 community.


Author(s):  
Marcus De Almeida ◽  
Ângelo Pinto ◽  
Alcimar Carvalho

Natural history collections (NHC) are guardians of biodiversity (Lane 1996) and essential to understanding the natural world and its evolutionary processes. They hold samples of the morphological and genetic heritage of living and extinct biotas, helping to reconstruct the timeline of life over the centuries (Gardner 2014). Primary data from specimens in NHC are crucial elements for research in many areas of the biological sciences; they are considered the “bricks” of systematics and therefore one of the pillars of evolutionary studies (Troudet 2018). For this reason, studies carried out in NHC are essential for the development of scientific knowledge and are pivotal for the scientific and technological progress of a nation (Camargo 2015). The digitization and availability of primary biodiversity data from NHC represent an inexpensive, practical and secure means of exchanging information, allowing collaboration between institutions and researchers. In this sense, initiatives such as the Sistema de Informação sobre a Biodiversidade Brasileira (SiBBr), a country-level branch of the Global Biodiversity Information Facility (GBIF) platform, aim to encourage and establish ways for the informatization of biological collections and their type specimens. Known for housing one of the largest and oldest collections of insects in the world focused on Neotropical fauna, the Entomological Collection of the Museu Nacional of the Federal University of Rio de Janeiro (MNRJ) had more than 3,000 primary types and approximately 12,005,000 specimens, of which about 96% were lost in the tragic fire that occurred at the institution on September 2, 2018. The SiBBr project was active in that collection from 2016 to 2019 and enabled the digitization and preservation of data from the type material of many insect orders, including the charismatic dragonflies (order Odonata).
Due to the end of the agreement between SiBBr and the Museu Nacional, most of the primary data obtained are pending full curation and are therefore not yet available to the public and researchers. The MNRJ housed the biggest and most important collection of dragonflies among all Central and South American institutions. It assembled most of the physical records of the Neotropical dragonfly fauna gathered over the last 80 years, many of which are of undescribed taxa. Unfortunately, almost all of the material was permanently lost. This study aims to gather, analyze and publicize primary data on the type material of dragonflies housed in the MNRJ, ensuring the preservation of its history, as well as providing data on the taxonomy and diversity of this marvelous group of insects. A total of 11 families, 50 genera and 131 species were recorded, belonging to the suborders Anisoptera and Zygoptera, with distributional records widespread in South America. The MNRJ housed 105 holotypes of dragonfly nomina, representing 11.7% of the richness of the Odonata fauna of Brazil (901 spp.), a country that holds the highest number of species in the biosphere. The impact of the loss of this collection on studies of these insects is unprecedented, since some enigmatic and monotypic genera such as Brasiliogomphus, Fluminagrion and Roppaneura lost 100% of their type series, while other, more diverse genera such as Lauromacromia, Oxyagrion and Neocordulia lost 50%, 35% and 31% of their holotypes, respectively. By registering and preserving primary biodiversity data, this work reiterates the importance of curating and digitizing biological scientific collections. Furthermore, it shows the extreme relevance of permanently preserving information on existing biodiversity and of providing support for future research. Digitizing and interconnecting digital extended specimen data proves to be one of the main and most effective ways to protect NHC heritage and its primary data against catastrophic events.
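The 11.7% figure follows from the two counts given in the abstract. As a quick check, using only those numbers (the variable names are ours):

```python
holotypes_in_mnrj = 105        # dragonfly holotypes housed in the MNRJ (from the abstract)
brazilian_odonata_spp = 901    # species richness of the Brazilian Odonata fauna (from the abstract)

# Share of Brazilian Odonata species whose holotype was housed in the MNRJ, in percent.
share_percent = round(100 * holotypes_in_mnrj / brazilian_odonata_spp, 1)
```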


2020 ◽  
Author(s):  
Kyle Copas

GBIF (the Global Biodiversity Information Facility) and its network of more than 1,500 institutions maintain the world's largest index of biodiversity data (https://www.gbif.org), containing nearly 1.4 billion species occurrence records. This infrastructure offers a model of best practices, both technological and cultural, that other domains may wish to adapt or emulate to ensure that their users have free, FAIR and open access to data.
The availability of community-supported data and metadata standards in the biodiversity informatics community, combined with the adoption (in 2014) of open Creative Commons licensing for data shared with GBIF, established the necessary preconditions for the network's recent growth.
GBIF's development of a data citation system based on DOIs (Digital Object Identifiers) has, moreover, established an approach for using unique identifiers to create direct links between scientific research and the underlying data on which it depends. The resulting state-of-the-art system tracks uses and reuses of data in research and credits data citations back to individual datasets and publishers, helping to ensure the transparency of biodiversity-related scientific analyses.
In 2015, GBIF began issuing a unique Digital Object Identifier (DOI) for every data download. This system resolves each download to a landing page containing 1) the taxonomic, geographic, temporal and other search parameters used to generate the download; 2) a quantitative map of the underlying datasets that contributed to the download; and 3) a simple citation to be included in works that rely on the data.
When authors cite these download DOIs, they in effect assert direct links between scientific papers and underlying data. Crossref registers these links through Event Data, enabling GBIF to track citation counts automatically for each download, dataset and publisher. These counts expand to display a bibliography of all research reuses of the data. This system improves the incentives for institutions to share open data by providing quantifiable measures that demonstrate the value and impact of sharing data for others' research.
GBIF is a mature infrastructure that supports a wide pool of researchers, who publish two peer-reviewed journal articles that rely on this data every day. That said, the citation-tracking and -crediting system has room for improvement. At present, 21% of papers using GBIF-mediated data provide DOI citations, which represents a 30% increase over 2018. Through outreach to authors and collaboration with journals, GBIF aims to continue this trend.
In addition, members of the GBIF network are seeking to extend citation credits to individuals through tools like Bloodhound Tracker (https://www.bloodhound-tracker.net), using persistent identifiers from ORCID and Wikidata. This approach provides a compelling model for the scientific and scholarly benefits of treating individual data records from specimens as micro- or nanopublications: first-class research objects that advance both FAIR data and open science.
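The way a citation of a download DOI flows back to datasets and publishers can be sketched as follows. This is not GBIF's implementation and does not call any GBIF or Crossref service: the identifiers, the data structures and the `credit` function are hypothetical, and the only assumption taken from the abstract is that each download records which datasets contributed to it.

```python
# Hypothetical downloads: each download DOI lists the datasets that contributed records.
downloads = {
    "doi:download-1": ["dataset-a", "dataset-b"],
    "doi:download-2": ["dataset-b"],
}
publisher_of = {"dataset-a": "publisher-x", "dataset-b": "publisher-y"}

# Hypothetical papers and the download DOIs they cite.
citations = [
    ("paper-1", "doi:download-1"),
    ("paper-2", "doi:download-1"),
    ("paper-2", "doi:download-2"),
    ("paper-3", "doi:download-2"),
]


def credit(citations, downloads, publisher_of):
    """Count distinct citing papers per dataset and per publisher."""
    papers_by_dataset = {}
    papers_by_publisher = {}
    for paper, doi in citations:
        for dataset in downloads.get(doi, []):
            papers_by_dataset.setdefault(dataset, set()).add(paper)
            papers_by_publisher.setdefault(publisher_of[dataset], set()).add(paper)
    dataset_counts = {d: len(papers) for d, papers in papers_by_dataset.items()}
    publisher_counts = {p: len(papers) for p, papers in papers_by_publisher.items()}
    return dataset_counts, publisher_counts


dataset_counts, publisher_counts = credit(citations, downloads, publisher_of)
```

Counting distinct papers with sets means that a paper citing two downloads drawn from the same dataset credits that dataset once, which seems the natural reading of a per-dataset bibliography.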


Author(s):  
Donald Hobern ◽  
Deborah L Paul ◽  
Tim Robertson ◽  
Quentin Groom ◽  
Barbara Thiers ◽  
...  

Information about natural history collections helps to map the complex landscape of research resources and assists researchers in locating and contacting the holders of specimens. Collection records contribute to the development of a fully interlinked biodiversity knowledge graph (Page 2016), showcasing the existence and importance of museums and herbaria and supplying context to available data on specimens. These records also potentially open new avenues for fresh use of these collections and for accelerating their full availability online. A number of international (e.g., Index Herbariorum, GRSciColl), regional (e.g., DiSSCo and CETAF), national (e.g., ALA and the Living Atlases, iDigBio US Collections Catalog) and institutional (e.g., The Field Museum) networks separately document subsets of the world's collections, and the Biodiversity Information Standards (TDWG) Collection Descriptions Interest Group is actively developing standards to support information sharing on collections. However, these efforts do not yet combine to deliver a comprehensive and connected view of all collections globally. The Global Biodiversity Information Facility (GBIF) received funding as part of the European Commission-funded SYNTHESYS+ 7 project to explore development of a roadmap towards delivering such a view, in part as a contribution towards the establishment of DiSSCo services within a global ecosystem of collection catalogues. Between 17 and 29 April 2020, a coordination team comprising international representatives from multiple networks ran Advancing the Catalogue of the World’s Natural History Collections, a fully online consultation that used the GBIF Discourse forum platform to guide discussion around 26 consultation topics identified in an initial Ideas Paper (Hobern et al. 2020). Discussions included support for contributions in Spanish, Chinese and French and were summarised daily throughout the consultation.
The consultation confirmed broad agreement around the needs and goals for a comprehensive catalogue of the world’s natural history collections, along with possible strategies to overcome the challenges. This presentation will summarise the results and recommendations.


2018 ◽  
Vol 2 ◽  
pp. e25857
Author(s):  
Rafael Borges ◽  
Wilian França Costa ◽  
Antonio Saraiva ◽  
Vera Imperatriz-Fonseca ◽  
Tereza Giannini

Natural history collections are of extreme importance as they safeguard data from both spatial and temporal sources. Biological collections store the biodiversity information of the majority of the world's ecosystems, including data from extinct and threatened species. Worldwide, interactions between species perform important functions that contribute to the maintenance of the environment. The use of biodiversity by human society generates the so-called Ecosystem Services (Nature’s Contributions to People), which may act at a local or even a global scale, as is the case with crop pollination services. Bees are the most important pollinator group and are responsible for the pollination of approximately 80% of Angiosperms and 75% of the crops worldwide. Bee pollinator decline, with the loss and degradation of habitat being one of its causes, has raised concern globally because of its detrimental impacts on food production and biodiversity. In this context, we suggest incorporating spatially explicit plant-pollinator interaction data into natural history collections databases. Plant-pollinator interaction traits (morphological, biochemical, physiological, structural, phenological or behavioural characteristics of organisms that influence performance or fitness) can firstly be identified through pollination syndromes by using floral traits such as size, shape, color, odor and reward. Bee body size (usually estimated by intertegular distance) and tongue length are important traits that can be used to evaluate bee-flower compatibility and also to estimate an average flight range for each bee species through the landscape. Since interaction is context dependent, data on functional traits could be associated with spatial references, such as the geographic coordinates, altitude and land use where species were collected. Such information is usually available in data repositories delivered by collections.
Thus, the association of species identification, functional traits and occurrences can act as an important tool for understanding local ecosystem processes, for forecasting impacts based on land use and climate change, and for assisting decision-making processes for nature conservation. Online databases must also be linked to a Digital Object Identifier (DOI), as is the case for data publications, so that the work of providing the data can be properly acknowledged and cited in the literature.


2013 ◽  
Vol 47 (1) ◽  
pp. 15-41 ◽  
Author(s):  
ELISE S. LIPKOWITZ

Abstract: In order to recast scholarly understanding of scientific cosmopolitanism during the French Revolution, this essay examines the stories of the natural-history collections of the Dutch Stadholder and the French naturalist Labillardière that were seized as war booty. The essay contextualizes French and British savants' responses to the seized collections within their respective understandings of the relationship between science and state and of the property rights associated with scientific collections, and definitions of war booty that antedated modern transnational legal conventions. The essay argues that the French and British savants' responses to seized natural-history collections demonstrate no universal approach to their treatment. Nonetheless, it contends that the French and British approaches to these collections reveal the emergence in the 1790s of new forms of scientific nationalism that purported to be cosmopolitan: French scientific universalism and British liberal scientific improvement.


Author(s):  
Elspeth Haston ◽  
Lorna Mitchell

The specimens held in natural history collections around the world are the direct result of the effort of thousands of people over hundreds of years. However, the way that the names of these people have been recorded within the collections has never been fully standardised, and this makes the process of correctly assigning an event relating to a specimen to an individual difficult at best, and impossible at worst. The events in which people are related to specimens include collecting, identifying, naming, loaning and owning. Whilst there are resources in the botanical community that hold information on many collectors and authors of plant names, the residual number of unknown people and the effort required to disambiguate them is daunting. Moreover, the work carried out within a collection to disambiguate the names relating to its specimens is often not recorded and made available, generally due to the lack of a system to do so. This situation makes it extremely difficult to search for collections within the main aggregators, such as GBIF (the Global Biodiversity Information Facility), and severely hampers our ability to link collections both within and between institutes and disciplines. When we look at the benefits of linking collections and people, the need to agree on and implement a system for managing people's names becomes increasingly urgent.

