Closing Gaps But Increasing Bias In North American Butterfly Inventory Completeness

Aggregate biodiversity data from museum specimens and community observations have promise for macroscale ecological analyses. Despite this, many groups are under-sampled, and sampling is not homogeneous across space. Here we used butterflies, the best documented group of insects, to examine inventory completeness across North America. We separated digitally accessible butterfly records into those from natural history collections and burgeoning community science observations to determine if these data sources have differential spatio-taxonomic biases. When we combined all data, we found startling under-sampling in regions with the most dramatic trajectories of climate change and across biomes. We also found support for the hypothesis that community science observations are filling more gaps in sampling but are more biased towards areas with the highest human footprint. Finally, we found that both types of occurrences have familial-level taxonomic completeness biases, in contrast to the hypothesis of less taxonomic bias in natural history collections data. These results suggest that higher inventory completeness, driven by rapid growth of community science observations, is partially offset by higher spatio-taxonomic biases. We use the findings here to provide recommendations on how to alleviate some of these gaps in the context of prioritizing global change research.
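The abstract does not say how inventory completeness was computed. As a rough illustration only, one common way to express it is the ratio of observed species richness to an estimated total richness. The sketch below assumes the bias-corrected Chao1 estimator and a hypothetical grid cell with made-up record counts; neither is taken from the study.

```python
def chao1(abundances):
    """Bias-corrected Chao1 richness estimate from per-species record counts."""
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                     # species actually recorded
    f1 = sum(1 for n in counts if n == 1)   # species recorded exactly once
    f2 = sum(1 for n in counts if n == 2)   # species recorded exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def completeness(abundances):
    """Observed richness divided by estimated richness (0 to 1)."""
    s_obs = sum(1 for n in abundances if n > 0)
    estimated = chao1(abundances)
    return s_obs / estimated if estimated else 0.0


# Hypothetical record counts for one grid cell (illustrative, not real data).
cell = {
    "Danaus plexippus": 12,
    "Vanessa cardui": 5,
    "Pieris rapae": 2,
    "Colias eurytheme": 1,
    "Papilio glaucus": 1,
    "Limenitis archippus": 1,
}

print(chao1(cell.values()), completeness(cell.values()))
```

A cell with many singleton species scores lower, which is the intuition behind calling it under-sampled.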

Applying computer vision to digitised natural history collections for climate change research: temperature-size responses in British butterflies

10.1101/2021.12.21.473511 ◽

2021 ◽

Author(s):

Rebecca J Wilson ◽

Alexandre F de Siqueira ◽

Stephen J Brooks ◽

Benjamin W Price ◽

Lea M Simon ◽

...

Keyword(s):

Climate Change ◽

Computer Vision ◽

Natural History ◽

Global Change ◽

Adult Size ◽

Phenotypic Data ◽

Climate Change Research ◽

Biotic Response ◽

Natural History Collections ◽

Global Change Research

Natural history collections (NHCs) are invaluable resources for understanding biotic response to global change. Museums around the world are currently imaging specimens, capturing specimen data, and making them freely available online. In parallel to the digitisation effort, there have been great advancements in computer vision (CV): the computer-trained, automated recognition, detection, and measurement of features in digital images. Applying CV to digitised NHCs has the potential to greatly accelerate the use of NHCs for biotic response to global change research. In this paper, we apply CV to a very large, digitised collection to test hypotheses in an established area of biotic response to climate change research: temperature-size responses. We develop a CV pipeline (Mothra) and apply it to the NHM iCollections of British butterflies (>180,000 specimens). Mothra automatically detects the specimen in the image, sets the scale, measures wing features (e.g., forewing length), determines the orientation of the specimen (pinned ventrally or dorsally), and identifies the sex. We pair these measurements and meta-data with temperature records to test how adult size varies with temperature during the immature stages of species and to assess patterns of sexual-size dimorphism across species and families. Compared to manual baseline measurements, Mothra accurately determines the sex and forewing lengths of butterfly specimens. Females are the larger sex in most species, and an increase in adult body size with warm monthly temperatures during the late larval stages is the most common temperature-size response. These results confirm suspected patterns and support hypotheses based on recent studies using a smaller dataset of manually measured specimens. We show that CV can be a powerful tool to efficiently and accurately extract phenotypic data from a very large digitised NHC. In the future, CV will become widely applied to digital NHCs to advance ecological and evolutionary research and to accelerate the use of NHCs for biotic response to global change research.

Using insect natural history collections to study global change impacts: challenges and opportunities

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2017.0405 ◽

2018 ◽

Vol 374 (1763) ◽

pp. 20170405 ◽

Cited By ~ 18

Author(s):

Heather M. Kharouba ◽

Jayme M. M. Lewthwaite ◽

Rob Guralnick ◽

Jeremy T. Kerr ◽

Mark Vellend

Keyword(s):

Natural History ◽

Global Change ◽

Global Changes ◽

Genomic Information ◽

Theme Issue ◽

The Past ◽

Natural History Collections ◽

Global Change Research ◽

Challenges And Opportunities

Over the past two decades, natural history collections (NHCs) have played an increasingly prominent role in global change research, but they have still greater potential, especially for the most diverse group of animals on Earth: insects. Here, we review the role of NHCs in advancing our understanding of the ecological and evolutionary responses of insects to recent global changes. Insect NHCs have helped document changes in insects' geographical distributions, phenology, phenotypic and genotypic traits over time periods up to a century. Recent work demonstrates the enormous potential of NHCs data for examining insect responses at multiple temporal, spatial and phylogenetic scales. Moving forward, insect NHCs offer unique opportunities to examine the morphological, chemical and genomic information in each specimen, thus advancing our understanding of the processes underlying species’ ecological and evolutionary responses to rapid, widespread global changes. This article is part of the theme issue ‘Biological collections for understanding biodiversity in the anthropocene’.

Rapid Creation of a Data Product for the World's Specimens of Horseshoe Bats and Relatives, a Known Reservoir for Coronaviruses

Biodiversity Information Science and Standards ◽

10.3897/biss.4.59067 ◽

2020 ◽

Vol 4 ◽

Author(s):

Erica Krimmel ◽

Austin Mast ◽

Deborah Paul ◽

Robert Bruhn ◽

Nelson Rios ◽

...

Keyword(s):

Natural History ◽

Life Histories ◽

Lessons Learned ◽

Reproducible Research ◽

State University ◽

Biodiversity Data ◽

Global Biodiversity Information Facility ◽

Data Product ◽

Natural History Collections ◽

Horseshoe Bats

Genomic evidence suggests that the causative virus of COVID-19 (SARS-CoV-2) was introduced to humans from horseshoe bats (family Rhinolophidae) (Andersen et al. 2020) and that species in this family as well as in the closely related Hipposideridae and Rhinonycteridae families are reservoirs of several SARS-like coronaviruses (Gouilh et al. 2011). Specimens collected over the past 400 years and curated by natural history collections around the world provide an essential reference as we work to understand the distributions, life histories, and evolutionary relationships of these bats and their viruses. While the importance of biodiversity specimens to emerging infectious disease research is clear, empowering disease researchers with specimen data is a relatively new goal for the collections community (DiEuliis et al. 2016). Recognizing this, a team from Florida State University is collaborating with partners at GEOLocate, Bionomia, University of Florida, the American Museum of Natural History, and Arizona State University to produce a deduplicated, georeferenced, vetted, and versioned data product of the world's specimens of horseshoe bats and relatives for researchers studying COVID-19. The project will serve as a model for future rapid data product deployments about biodiversity specimens. The project underscores the value of biodiversity data aggregators iDigBio and the Global Biodiversity Information Facility (GBIF), which are sources for 58,617 and 79,862 records, respectively, as of July 2020, of horseshoe bat and relative specimens held by over one hundred natural history collections. Although much of the specimen-based biodiversity data served by iDigBio and GBIF is high quality, it can be considered raw data and therefore often requires additional wrangling, standardizing, and enhancement to be fit for specific applications. 
The project will create efficiencies for the coronavirus research community by producing an enhanced, research-ready data product, which will be versioned and published through Zenodo, an open-access repository (see doi.org/10.5281/zenodo.3974999). In this talk, we highlight lessons learned from the initial phases of the project, including deduplicating specimen records, standardizing country information, and enhancing taxonomic information. We also report on our progress to date, related to enhancing information about agents (e.g., collectors or determiners) associated with these specimens, and to georeferencing specimen localities. We seek also to explore how much we can use the added agent information (i.e., ORCID iDs and Wikidata Q identifiers) to inform our georeferencing efforts and to support crediting those collecting and doing identifications. The project will georeference approximately one third of our specimen records, based on those lacking geospatial coordinates but containing textual locality descriptions. We furthermore provide an overview of our holistic approach to enhancing specimen records, which we hope will maximize the value of the bat specimens at the center of what has been recently termed the "extended specimen network" (Lendemer et al. 2020). The centrality of the physical specimen in the network reinforces the importance of archived materials for reproducible research. Recognizing this, we view the collections providing data to iDigBio and GBIF as essential partners, as we expect that they will be responsible for the long-term management of enhanced data associated with the physical specimens they curate. We hope that this project can provide a model for better facilitating the reintegration of enhanced data back into local specimen data management systems.
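The abstract names deduplicating specimen records and standardizing country information as early steps but does not describe how they were done. The following is a minimal sketch under stated assumptions: records from the two aggregators are matched on the Darwin Core terms `institutionCode` and `catalogNumber`, and countries are mapped through a small hand-written alias table. The sample records, the alias table, and the function names are hypothetical.

```python
# Hypothetical alias table; a real one would be far larger.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.a.": "United States",
    "united states of america": "United States",
}


def normalize(value):
    """Collapse whitespace and ignore case so near-identical keys match."""
    return " ".join(str(value or "").split()).casefold()


def standardize_country(raw):
    """Map known spelling variants to one label; leave unknown values as-is."""
    cleaned = " ".join(str(raw or "").split())
    return COUNTRY_ALIASES.get(normalize(raw), cleaned)


def deduplicate(records):
    """Keep the first record seen for each (institutionCode, catalogNumber)."""
    seen = {}
    for rec in records:
        key = (normalize(rec.get("institutionCode")),
               normalize(rec.get("catalogNumber")))
        if key not in seen:
            seen[key] = dict(rec, country=standardize_country(rec.get("country")))
    return list(seen.values())


# Illustrative records only, not real specimen data.
records = [
    {"source": "iDigBio", "institutionCode": "XYZ", "catalogNumber": "M-1001", "country": "USA"},
    {"source": "GBIF", "institutionCode": "xyz", "catalogNumber": "m-1001 ", "country": "United States of America"},
    {"source": "GBIF", "institutionCode": "XYZ", "catalogNumber": "M-1002", "country": "Kenya"},
]

unique = deduplicate(records)
print(len(unique), [r["country"] for r in unique])
```

Matching on two identifiers alone is a simplification; real aggregator data often need fuzzier matching and manual vetting, which is part of why the project describes its output as a vetted product.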

Collectively, we need to accelerate Arctic specimen sampling

Arctic Science ◽

10.1139/as-2016-0037 ◽

2017 ◽

Vol 3 (3) ◽

pp. 515-524

Author(s):

Kevin Winker ◽

Jack Withrow

Keyword(s):

Climate Change ◽

Natural History ◽

Biological Systems ◽

The Arctic ◽

Future Research ◽

Individual Species ◽

Time And Space ◽

Natural History Collections ◽

Specimen Resource ◽

Specimen Sampling

Natural history collections are not often thought of as observatories, but they are increasingly being used as such to observe biological systems and changes within them. Objects and the data associated with them are archived for present and future research. These specimen collections provide many diverse scientific benefits, helping us understand not only individual species or populations but also the environments in which they live(d). Despite these benefits, the specimen resource is inadequate to the tasks being asked of it — there are many gaps, taxonomically and in time and space. We examine and highlight some of these gaps using bird collections as an example. Given the speed of climate change in the Arctic, we need to collectively work to fill these gaps so we can develop and wield the science that will make us better stewards of Arctic environments.

Natural History Collections as Dynamic Research Archives

Stepping in the Same River Twice ◽

10.12987/yale/9780300209549.003.0004 ◽

2017 ◽

Cited By ~ 2

Author(s):

Tamar Dayan ◽

Bella Galil

Keyword(s):

Ecosystem Services ◽

Natural History ◽

Eastern Mediterranean ◽

Marine Biota ◽

Museum Specimens ◽

Eastern Mediterranean Sea ◽

Natural History Collections ◽

Individual Specimen ◽

International Awareness

This chapter discusses the importance of museum specimens and samples. Natural history collections are archives of biodiversity, snapshots that provide a way to physically retrieve an individual specimen and through it track changes in populations and species across repeatable surveys in time and space. Growing international awareness of the potential effects on humanity due to the loss of biodiversity and the ensuing erosion of ecosystem services has reinforced the value of natural history collections, museums, and herbaria worldwide. The chapter summarizes the strengths and weaknesses of natural history collections for repeated surveys and other historical studies that require replication. Through a case study of the historical surveys and resurveys of the taxonomic exploration of the marine biota of the eastern Mediterranean Sea, it highlights the relevance of collections for ecology and conservation. Finally, it discusses prospects for future uses of natural history collections in the context of replicated research.

THE FIRST DOCUMENTED PREY ITEMS FOR Bothrops medusa (STERNFELD, 1920)

Revista Latinoamericana de Herpetología ◽

10.22201/fc.25942158e.2019.1.38 ◽

2019 ◽

Vol 2 (1) ◽

pp. 48

Author(s):

Tristan David Schramer

Keyword(s):

Endangered Species ◽

Natural History ◽

Stomach Contents ◽

Museum Specimens ◽

University Of Illinois ◽

Natural History Collections ◽

The University ◽

Prey Items

The Venezuelan forest pitviper (Bothrops medusa) is an endangered viperid endemic to the central range of the Cordillera de la Costa in Venezuela. Little is known regarding its natural history. We examined the stomach contents of museum specimens housed in the University of Illinois Museum of Natural History Herpetology Collection and report the first prey items for the species. The arboreal habits of both prey items support the notion that B. medusa may be semi-arboreal. This exposes the need for further studies on this rare viperid and showcases the value of natural history collections for studying endangered species.

Unleash the Potential of your Website! 180,000 webpages from the French Natural History Museum marked up with Bioschemas/Schema.org biodiversity types

Biodiversity Information Science and Standards ◽

10.3897/biss.4.59046 ◽

2020 ◽

Vol 4 ◽

Author(s):

Franck Michel ◽

Olivier Gargominy ◽

Benjamin Ledentec ◽

The Bioschemas Community

Keyword(s):

Natural History ◽

Search Engines ◽

Critical Mass ◽

Scientific Data ◽

Third Party ◽

Data Sources ◽

Integrative Approach ◽

Biodiversity Data ◽

Major Step ◽

Global Biodiversity Information Facility

The challenge of finding, retrieving and making sense of biodiversity data is being tackled by many different approaches. Projects like the Global Biodiversity Information Facility (GBIF) or Encyclopedia of Life (EoL) adopt an integrative approach where they republish, in a uniform manner, records aggregated from multiple data sources. With this centralized, siloed approach, such projects stand as powerful one-stop shops, but tend to reduce the visibility of other data sources that are not (yet) aggregated. At the other end of the spectrum, the Web of Data promotes the building of a global, distributed knowledge graph consisting of datasets published by independent institutions according to the Linked Open Data principles (Heath and Bizer 2011), such as Wikidata or DBpedia. Beyond these "sophisticated" infrastructures, websites remain the most common way of publishing and sharing scientific data at low cost. Thanks to web search engines, everyone can discover webpages. Yet, the summaries provided in results lists are often insufficiently informative to decide whether a web page is relevant with respect to some research interests, such that integrating data published by a wealth of websites is hardly possible. A strategy around this issue lies in annotating websites with structured, semantic metadata such as the Schema.org vocabulary (Guha et al. 2015). Webpages typically embed Schema.org annotations in the form of markup data (written in the RDFa or JSON-LD formats), which search engines harvest and exploit to improve ranking and provide more informative summarization. Bioschemas is a community effort working to extend Schema.org to support markup for Life Sciences websites (Michel and The Bioschemas Community 2018, Garcia et al. 2017). Bioschemas primarily re-uses existing terms from Schema.org, occasionally re-uses terms from third-party vocabularies, and when necessary proposes new terms to be endorsed by Schema.org. 
As of today, Bioschemas's biodiversity group has proposed the Taxon type to support the annotation of any webpage denoting taxa, TaxonName to support more specifically the annotation of taxonomic names registries, and guidelines describing how to leverage existing vocabularies such as Darwin Core terms. To proceed further, the biodiversity community must now demonstrate its interest in having these terms endorsed by Schema.org: (1) through a critical mass of live markup deployments, and (2) by the development of applications capable of exploiting this markup data. Therefore, as a first step, the French National Museum of Natural History has marked up its natural heritage inventory website: over 180,000 webpages describing the species inventoried in French territories have been annotated with the Taxon and TaxonName types in the form of JSON-LD scripts (see example scripts). As an example, one can check the source of the Delphinus delphis page. In this presentation, by demonstrating that marking up existing webpages can be very inexpensive, we wish to encourage the biodiversity community to adopt this practice, engage in the discussion about biodiversity-related markup, and possibly propose new terms related e.g. to traits or collections. We believe that generalizing the use of such markup by the many websites reporting checklists, museum collections, occurrences, life traits etc. shall be a major step towards the generalized adoption of FAIR principles (Wilkinson 2016), shall dramatically improve information discovery using search engines, and shall be a key accelerator for the development of novel, web-scale, biodiversity data integration scenarios.
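To make the idea of JSON-LD markup concrete, here is a minimal sketch of how a page about one species might embed a Schema.org `Taxon` annotation. It is an illustration, not the museum's actual markup: the helper names are hypothetical, only the `name`, `taxonRank`, and `parentTaxon` properties are used, and the `TaxonName` type mentioned in the abstract is left out.

```python
import json


def taxon_jsonld(name, rank, parent_name, parent_rank):
    """Build a minimal Schema.org Taxon description as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Taxon",
        "name": name,
        "taxonRank": rank,
        "parentTaxon": {
            "@type": "Taxon",
            "name": parent_name,
            "taxonRank": parent_rank,
        },
    }


def as_script_tag(doc):
    """Wrap the JSON-LD document in the script element a webpage would embed."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


markup = taxon_jsonld("Delphinus delphis", "species", "Delphinus", "genus")
tag = as_script_tag(markup)
print(tag)
```

Because the annotation is just a script element added to an existing page, no change to the visible content is needed, which fits the abstract's point that marking up pages can be inexpensive.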

Assessment of North American arthropod collections: prospects and challenges for addressing biodiversity research

PeerJ ◽

10.7717/peerj.8086 ◽

2019 ◽

Vol 7 ◽

pp. e8086 ◽

Cited By ~ 10

Author(s):

Neil S. Cobb ◽

Lawrence F. Gall ◽

Jennifer M. Zaspel ◽

Nicolas J. Dowdy ◽

Lindsie M. McCabe ◽

...

Keyword(s):

United States ◽

Natural History ◽

North American ◽

The United States ◽

Biodiversity Data ◽

Biodiversity Crisis ◽

Biodiversity Research ◽

Natural History Collections ◽

Rate Of Increase ◽

Crucial Information

Over 300 million arthropod specimens are housed in North American natural history collections. These collections represent a “vast hidden treasure trove” of biodiversity: 95% of the specimen label data have yet to be transcribed for research, and less than 2% of the specimens have been imaged. Specimen labels contain crucial information to determine species distributions over time and are essential for understanding patterns of ecology and evolution, which will help assess the growing biodiversity crisis driven by global change impacts. Specimen images offer indispensable insight and data for analyses of traits, and ecological and phylogenetic patterns of biodiversity. Here, we review North American arthropod collections using two key metrics, specimen holdings and digitization efforts, to assess the potential for collections to provide needed biodiversity data. We include data from 223 arthropod collections in North America, with an emphasis on the United States. Our specific findings are as follows: (1) The majority of North American natural history collections (88%) and specimens (89%) are located in the United States. Canada has comparable holdings to the United States relative to its estimated biodiversity. Mexico has made the furthest progress in terms of digitization, but its specimen holdings should be increased to reflect the estimated higher Mexican arthropod diversity. The proportion of North American collections that has been digitized, and the number of digital records available per species, are both much lower for arthropods when compared to chordates and plants. (2) The National Science Foundation’s decade-long ADBC program (Advancing Digitization of Biological Collections) has been transformational in promoting arthropod digitization. However, even if this program became permanent, at current rates, by the year 2050 only 38% of the existing arthropod specimens would be digitized, and less than 1% would have associated digital images. 
(3) The number of specimens in collections has increased by approximately 1% per year over the past 30 years. We propose that this rate of increase is insufficient to provide enough data to address biodiversity research needs, and that arthropod collections should aim to triple their rate of new specimen acquisition. (4) The collections we surveyed in the United States vary broadly in a number of indicators. Collectively, there is depth and breadth, with smaller collections providing regional depth and larger collections providing greater global coverage. (5) Increased coordination across museums is needed for digitization efforts to target taxa for research and conservation goals and address long-term data needs. Two key recommendations emerge: collections should significantly increase both their specimen holdings and their digitization efforts to empower continental and global biodiversity data pipelines, and stimulate downstream research.
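To put finding (3) in perspective, the short calculation below compares 30 years of growth at roughly 1% per year with 30 years at a tripled rate. It is a back-of-the-envelope sketch that assumes annual compounding and reads "triple their rate" as 3% per year; the abstract does not specify either, so the numbers are illustrative.

```python
def growth_factor(annual_rate, years):
    """Multiplicative growth after compounding annual_rate for the given years."""
    return (1 + annual_rate) ** years


current = growth_factor(0.01, 30)   # about 1% per year, as reported
tripled = growth_factor(0.03, 30)   # assumed reading of a tripled rate

print(f"30 years at 1%/yr: x{current:.2f}")
print(f"30 years at 3%/yr: x{tripled:.2f}")
```

Under these assumptions, holdings grow by about a third over 30 years at the reported rate and more than double at the tripled rate.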

Workforce Capacity Development and the Digital Extended Specimen

Biodiversity Information Science and Standards ◽

10.3897/biss.5.73927 ◽

2021 ◽

Vol 5 ◽

Author(s):

Anna Monfils ◽

Elizabeth R. Ellwood

Keyword(s):

Natural History ◽

Best Practices ◽

Capacity Development ◽

Workforce Training ◽

Educational Materials ◽

Biodiversity Data ◽

Natural History Collections ◽

Workforce Capacity ◽

Future Work ◽

And Training

As we look to the future of natural history collections and a global integration of biodiversity data, we are reliant on a diverse workforce with the skills necessary to build, grow, and support the data, tools, and resources of the Digital Extended Specimen (DES; Webster 2019, Lendemer et al. 2020, Hardisty 2020). Future “DES Data Curators” – those who will be charged with maintaining resources created through the DES – will require skills and resources beyond what is currently available to most natural history collections staff. In training the workforce to support the DES we have an opportunity to broaden our community and ensure that, through the expansion of biodiversity data, the workforce landscape itself is diverse, equitable, inclusive, and accessible. A fully-implemented DES will provide training that encapsulates capacity building, skills development, unifying protocols and best practices guidance, and cutting-edge technology that also creates inclusive, equitable, and accessible systems, workflows, and communities. As members of the biodiversity community and the current workforce, we can leverage our knowledge and skills to develop innovative training models that: include a range of educational settings and modalities; address the needs of new communities not currently engaged with digital data; and, from their onset, provide attribution for past and future work and do not perpetuate the legacy of colonial practices and historic inequalities found in many physical natural history collections. Recent reports from the Biodiversity Collections Network (BCoN 2019) and the National Academies of Sciences, Engineering, and Medicine (National Academies of Sciences, Engineering, and Medicine 2020) specifically address workforce needs in support of the DES. 
To address workforce training and inclusivity within the context of global data integration, the Alliance for Biodiversity Knowledge included a topic on Workforce capacity development and inclusivity in Phase 2 of the consultation on Converging Digital Specimens and Extended Specimens - Towards a global specification for data integration. Across these efforts, several common themes have emerged relative to workforce training and the DES. A call for a community needs assessment: As a community, we have several unknowns related to the current collections workforce and training needs. We would benefit from a baseline assessment of collections professionals to define current job responsibilities, demographics, education and training, incentives, compensation, and benefits. This includes an evaluation of current employment prospects and opportunities. Defined skills and training for the 21st century collections professional: We need to be proactive and define the 21st century workforce skills necessary to support the development and implementation of the DES. When we define the skills and content needs we can create appropriate training opportunities that include scalable materials for capacity building, educational materials that develop relevant skills, unifying protocols across the DES network, and best practices guidance for professionals. Training for data end-users: We need to train data end-users in biodiversity and data science at all levels of formal and informal education from primary and secondary education through the existing workforce. This includes developing training and educational materials, creating data portals, and building analyses that are inclusive, accessible, and engage the appropriate community of science educators, data scientists, and biodiversity researchers. 
Foster a diverse, equitable, inclusive, accessible, and professional workforce: As the DES develops and new tools and resources emerge, we need to be intentional in our commitment to building tools that are accessible and in assuring that access is equitable. This includes establishing best practices to ensure the community providing and accessing data is inclusive and representative of the diverse global community of potential data providers and users. Upfront, we must acknowledge and address issues of historic inequalities and colonial practices and provide appropriate attribution for past and future work while ensuring legal and regulatory compliance. Efforts must include creating transparent linkages among data and the humans that create the data that drives the DES. In this presentation, we will highlight recommendations for building workforce capacity within the DES that is diverse, inclusive, equitable, and accessible, that takes into account the requirements of the biodiversity science community, and that is flexible enough to meet the needs of an evolving field.
