ALICE: Angled Label Image Capture and Extraction for high throughput insect specimen digitisation

Author(s):  
Benjamin Wills Price ◽  
Steen Dupont ◽  
Elizabeth Louise Allan ◽  
Vladimir Blagoderov ◽  
Alice Jenny Butcher ◽  
...  

The world’s natural history collections contain at least 2 billion specimens, representing a unique data source for answering fundamental scientific questions about ecological, evolutionary, and geological processes. Unlocking this treasure trove of data, stored in thousands of museum drawers and cabinets, is crucial to help map a sustainable future for ourselves and the natural systems on which we depend. The rate-limiting steps in the digitisation of natural history collections often involve specimen handling, owing to the fragile nature of the specimens. Insects comprise the single largest collection type in the Natural History Museum, London (NHM), reflecting their global diversity. The NHM pinned insect collection, estimated at 25 million specimens, will take over 700 person years to digitise at current rates. In order to ramp up digitisation, we have developed ALICE, for Angled Label Image Capture and Extraction. This multi-camera setup and associated software processing pipeline enable primary data capture from angled images, without removal of the labels from the specimen pin. As a result, ALICE enables a single user to sustainably image over 1,000 specimens per day, allowing us to digitally unlock the insect collections at an unprecedented rate.

Author(s):  
Steen Dupont ◽  
Benjamin Price

The world’s natural history collections contain at least 2 billion specimens (Ariño 2010), representing a unique data source for answering fundamental scientific questions about ecological, evolutionary, and geological processes. Unlocking this treasure trove of data, stored in thousands of museum drawers and cabinets, is crucial to help map a sustainable future for ourselves and the natural systems on which we depend. The rate-limiting steps in the digitisation of natural history collections often involve specimen handling, owing to the fragile nature of the specimens. Insects comprise the single largest collection type in the Natural History Museum, London (NHM) and in many other collections, reflecting their global diversity and multiplicity. The NHM pinned insect collection, estimated at 25 million specimens, will take over 700 person years to digitise at current rates (Price et al. 2018: estimated from Blagoderov et al. 2017). In order to ramp up digitisation, we have developed ALICE, for Angled Label Image Capture and Extraction from pinned insects. This multi-camera setup (Fig. 1) and associated software processing pipeline enable primary data capture from angled images, without removal of the labels from the specimen pin. As a result, ALICE enables a single user to sustainably digitise (add a catalogue label, image and prepare images for database import) over 800 specimens per day (Price et al. 2018), allowing us to digitally unlock large parts of the insect collection (e.g., Hymenoptera, Diptera, Coleoptera) at up to seven times the previous rate. We are continuing to refine hardware approaches to reduce specimen handling and extract data, for both human and machine interpretation, from labels without removing them from the object. More recently, we have also been trialling multiple mirrors in our Mirror Angled Label Image Capture Equipment (MALICE) (Fig. 2) and a rotating stage in our Vial Image Label Extraction (VILE) (Fig. 3), the latter aimed at spirit-preserved specimens housed in vials.
In this talk, we will outline the current approaches in use at the Natural History Museum, next generation prototypes, and challenges that need to be addressed before these techniques can be fully optimized.


2019 ◽  
Vol 5 ◽  
Author(s):  
Vincent Smith ◽  
Kristina Gorman ◽  
Wouter Addink ◽  
Christos Arvanitidis ◽  
Ana Casino ◽  
...  

European natural history collections are a critical infrastructure for meeting the most important challenge humans face over the next 30 years – creating a sustainable future for ourselves and the natural systems on which we depend – and for answering fundamental scientific questions about ecological, evolutionary, and geological processes. Since 2004 SYNTHESYS has been an essential instrument supporting this community, underpinning new ways to access and exploit collections, harmonising policy and providing significant new insights for thousands of researchers, while fostering the development of new approaches to face urgent societal challenges. SYNTHESYS+ is a fourth iteration of this programme, and represents a step change in the evolution of this community. For the first time SYNTHESYS+ brings together the European branches of the global natural science organisations (GBIF https://www.gbif.org/, TDWG https://www.tdwg.org/, GGBN http://www.ggbn.org/ggbn_portal/ and CETAF https://cetaf.org/) with an unprecedented number of collections, to integrate, innovate and internationalise our efforts within the global scientific collections community. Major new developments addressed by SYNTHESYS+ include the delivery of a new virtual access programme, providing digitisation on demand services to a significantly expanded user community; the construction of a European Loans and Visits System (ELViS) providing, for the first time, a unified gateway to accessing digital, physical and molecular collections; and a new data processing platform (the Specimen Data Refinery), applying cutting edge artificial intelligence to dramatically speed up the digital mobilisation of natural history collections. The activities of SYNTHESYS+ form a critical dependency for DiSSCo - the Distributed System of Scientific Collections (https://dissco.eu/), which is the European Research Infrastructure for natural science collections, under the ESFRI umbrella. 
DiSSCo will undertake the maintenance and sustainability of SYNTHESYS+ products at the end of the programme.


2015 ◽  
Vol 29 (1-2) ◽  
pp. 1-21 ◽  
Author(s):  
Bethany L. Abrahamson

Abstract. Natural history collections (NHCs) are used in many fields of study, but general knowledge regarding their uses is poor. Because of this, funding and support for NHCs frequently fluctuate. One way in which collections professionals can illustrate a collection’s contribution to a variety of fields is based on the collection’s history of use. Tracking NHC utilization through time can increase NHC value to others outside of the collection, allow for the analysis of changes in specimen-based research trends, and assist in effective collection management. This case study focuses on NHC usage records held by the Museum of Southwestern Biology (MSB), a currently growing university collection used in many research fields, and presents methods for quantifying collections utilization through time. Through an exploration of these data, this paper illustrates MSB’s growth and changes in research produced over time and offers explanations for the changes observed. Last, this study provides suggestions for how collections professionals can benefit most from considering NHC records as a data source. Understanding NHC usage from “the collection’s perspective” provides a new way for NHC professionals to understand NHCs’ value in the context of the research they support and demonstrates the importance of this key infrastructure to a broader audience.


2019 ◽  
Vol 38 (2) ◽  
pp. 173-203 ◽  
Author(s):  
ANNARITA FRANZA ◽  
ROSANNA FABOZZI ◽  
LETIZIA VEZZOSI ◽  
LUCIANA FANTONI ◽  
GIOVANNI PRATESI

ABSTRACT. The Collectio Mineralium (1765), currently preserved at the Historical Archive of the Natural History Museum of the University of Firenze, is the unpublished catalog of the mineralogical collection that belonged to Emperor Leopold II (1747–1792). The catalog is a 110-page register, with the golden emblem of the House of Habsburg at the center of the binding, containing information about 242 mineralogical samples. Each specimen is carefully described (i.e., habit, metal content, product value) and its locality given. The interpretation of the text has also returned information on most of the mining deposits in the Austro-Hungarian territories in the eighteenth century. Therefore, the interpretation of this catalog, which on the basis of the literature appears to be the first catalog of a collection that belonged to a Habsburg emperor, represents an important step toward enhancing our understanding of Habsburg natural history collections and reflects the transition from wonder-rooms to commodity collecting. Leopold's private collection was no longer an ‘instrument of wonder’ but became representative of scientific collecting characterized by the establishment of systematic mineralogy, and by a careful economic evaluation of the mineralogical samples collected as a symbol of the power of the Austro-Hungarian Empire.


2019 ◽  
Vol 2 ◽  
pp. 77-81 ◽  
Author(s):  
Mathias Küster

Abstract. The Müritzeum is a nature discovery centre and a museum in the heart of the Mecklenburg Lake District. It is the first natural history museum in Mecklenburg-Vorpommern, with natural history collections that are over 150 years old, and are still growing today. The collections contain about 290 000 specimens from the fields of botany, zoology and geology. An extensive library and an archive are also part of the museum. Collecting, preserving and researching natural history are our main spheres of activity. The exhibition in the Müritzeum offers the visitor a comprehensive insight into the development of the nature and landscape of northeastern Germany and of Mecklenburg-Vorpommern and the Lake Müritz region in particular. The largest aquarium for indigenous freshwater species in Germany enables visitors to imagine themselves in the underwater world of the Mecklenburg Lake District.


2018 ◽  
Vol 2 ◽  
pp. e26122
Author(s):  
Jason Best

In recent years, the natural history collections community has made great progress in accelerating the pace of collection digitization and global data-sharing. However, a common workflow bottleneck often occurs in that period immediately following image capture but preceding image submission to portals, a critical phase involving quality control, file management, image processing, metadata capture, data backup, and monitoring performance and progress. While larger institutions have likely developed reliable, automated workflows over time, small and medium institutions may not have the expertise or resources to design and implement workflows that take full advantage of automation opportunities. Without automation, these institutions must invest many hours of manual effort to meet quality and performance goals. To address its own needs, BRIT developed a number of workflow automation components, which coalesced over time into a suite of tools that operate on both an image capture station as a client application and on a server that provides file storage and image processing features. Together, these tools were created to meet the following goals: (1) simplify file management and data preservation through automation; (2) quickly identify quality issues; (3) quickly capture skeletal metadata to facilitate later databasing; (4) significantly reduce time between image capture and online availability; (5) provide performance and quality monitoring and reporting; and (6) allow easy configuration and maintenance of client and server. The client and server components together can be considered a “digitization appliance”: software integrated with the specific goal of providing a comprehensive suite of digitization tools that can be quickly and easily deployed on simple consumer hardware. We have made this software available to the natural history collections community under an open-source license at https://github.com/BRITorg/digitization_appliance.


2010 ◽  
Vol 37 (2) ◽  
pp. 333-345
Author(s):  
Marcus B. Simpson ◽  
Sallie W. Simpson ◽  
David W. Johnston

As part of his plan for a “Compleat History” of the region, John Lawson, Surveyor-General of North Carolina, collected plants and animals in 1710 and 1711 from Virginia and North Carolina and shipped them to James Petiver in London. After Petiver's death in 1718, his collection was acquired by Hans Sloane and subsequently incorporated into the natural history collections in the British Museum. The Sloane herbarium, now at the Natural History Museum, London, contains more than 300 previously reported botanical specimens attributed to Lawson, but details of his zoological collecting have not heretofore been documented. Two of Sloane's manuscript catalogues of “Fossils” include at least 34 specimens that appear to have been among those sent by Lawson to Petiver. These Lawson specimens were probably discarded or destroyed by British Museum staff in the 1700s or early 1800s. The Sloane catalogues nevertheless provide evidence that Lawson had begun work on his ambitious plan for a natural history of Carolina. Lawson's untimely death in September 1711 brought an abrupt end to the project, and Petiver apparently never used the zoological material he received from Lawson.


Author(s):  
Marcus De Almeida ◽  
Ângelo Pinto ◽  
Alcimar Carvalho

Natural history collections (NHC) are guardians of biodiversity (Lane 1996) and essential to understand the natural world and its evolutionary processes. They hold samples of morphological and genetic heritages of living and extinct biotas, helping to reconstruct the timeline of life over the centuries (Gardner 2014). Primary data from specimens in NHC are crucial elements for research in many areas of biological sciences, considered the “bricks” of systematics and therefore one of the pillars for evolutionary studies (Troudet 2018). For this reason, studies carried out in NHC are essential for the development of scientific knowledge and are pivotal for the scientific-technological progress of a nation (Camargo 2015). The digitization and availability of primary data on biodiversity from NHC represent an inexpensive, practical and secure means of exchanging information, allowing collaboration between institutions and researchers. In this sense, initiatives such as the Sistema de Informação sobre a Biodiversidade Brasileira (SiBBr), a country-level branch of the Global Biodiversity Information Facility (GBIF) platform, aim to encourage and establish ways for the informatization of biological collections and their type specimens. Known for housing one of the largest and oldest collections of insects in the world focused on Neotropical fauna, the Entomological Collection of the Museu Nacional of Federal University of Rio de Janeiro (MNRJ) had more than 3,000 primary types and approximately 12,005,000 specimens, of which about 96% were lost in the tragic fire that occurred at the institution on September 2, 2018. The SiBBr project was active in that collection from 2016 to 2019 and enabled the digitization and preservation of data from the type material of many insect orders, including the charismatic dragonflies (order Odonata).
Due to the end of the agreement between SiBBr and the Museu Nacional, most of the obtained primary data are pending full curation and, therefore, are not yet available to the public and researchers. The MNRJ housed the biggest and most important collection of dragonflies among all Central and South American institutions. It assembled most of the physical records of neotropical dragonfly fauna gathered over the last 80 years, many of which are of undescribed taxa. Unfortunately, almost all material was permanently lost. This study aims to gather, analyze and publicize primary data of the type material of dragonflies housed in the MNRJ, ensuring the preservation of its history, as well as providing data on the taxonomy and diversity of this marvelous group of insects. A total of 11 families, 50 genera and 131 species were recorded, belonging to the suborders Anisoptera and Zygoptera, with distributional records widespread in South America. The MNRJ housed 105 holotypes of dragonflies' nomina, representing 11.7% of the richness of the Brazilian Odonata fauna (901 spp.), Brazil being the country with the highest number of species in the world. The impact of the loss of this collection on studies of these insects is unprecedented, since some enigmatic and monotypic genera such as Brasiliogomphus, Fluminagrion and Roppaneura lost 100% of their type series, while other, more diverse genera such as Lauromacromia, Oxyagrion and Neocordulia lost 50%, 35% and 31% of their holotypes, respectively. By registering and preserving primary biodiversity data, this work reiterates the importance of curating and digitizing biological scientific collections. Furthermore, it shows how relevant such work is for permanently preserving information on existing biodiversity and for supporting future research. Digitizing and interconnecting digital extended specimen data is proving to be one of the main and most effective ways to protect NHC heritage and their primary data against catastrophic events.


Author(s):  
Elizabeth Louise Allan ◽  
Steen Dupont ◽  
Helen Hardy ◽  
Laurence Livermore ◽  
Benjamin Price ◽  
...  

The Natural History Museum, London (NHM) has embarked on an ambitious Digital Collections Programme to digitise its collections. One aim of the programme has been to improve the workflows and infrastructure needed to support high-throughput digitisation and create comprehensive digital inventories of large scientific collections. Pilot projects have been carried out for a variety of collection types, from which high-throughput imaging workflows have been developed and refined. These workflows have focused on pinned insect specimens (Blagoderov et al. 2012, Paterson et al. 2016, Blagoderov et al. 2017, Price et al. 2018), microscope slides (whole slide and specimen imaging; Allan et al. 2018, Allan et al. 2019) and herbarium sheets. The rate and time taken to digitise specimens is influenced by a number of factors that include, among others, the level of preparation and post-processing required, imaging approach, the type of specimens as well as the complexity and condition of the collection. As part of this presentation we will include information on the rate, cost and time to digitise various NHM collections, illustrating how our processes have improved digitisation efficiency and allowed us to maintain quality. The programme has run a variety of digitisation projects, gathering data about rates of digitisation (preparation, imaging, transcription etc.) and developing improvements. Collection types such as microscope slides and herbarium sheets lend themselves to higher imaging rates, while other collections such as pinned insects, which require greater amounts of specimen handling to remove labels, tend to have lower imaging rates (Fig. 1). In order to increase efficiency, we have developed approaches that minimise specimen handling. 
For example, workflows for pinned insects such as the Angled Label Image Capture and Extraction (ALICE) do not require the removal of specimen labels from the pin, as the system can capture angled images of the labels, thus increasing the imaging rate three-fold (Fig. 1). Another approach taken is to semi-automate mass digitisation using a combination of temporary and permanent Data Matrix barcode labels (Allan et al. 2019). By using multiple barcodes at the imaging stage to encode information associated with each specimen (e.g. unique identifier, location in the collection, taxonomic name, type status; Fig. 2), we can run a series of automated processes, including file renaming, image processing and bulk import into the Museum’s collection management system. Through adaptation of our workflows with this new approach we have increased the efficiency of digitisation processes, illustrating how simple activities, like automated file renaming, reduce image post-processing time, minimise human error and can be applied across multiple collection types.
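
The automated file renaming step described above can be illustrated with a minimal sketch. It assumes the barcodes in each image have already been decoded into text fields by some other tool; the field names (`unique_id`, `taxon`), the naming scheme and the function names are hypothetical and are not taken from the NHM workflow.

```python
import re
from pathlib import Path


def build_filename(unique_id: str, taxon: str, suffix: str) -> str:
    """Compose a file name from decoded barcode values (hypothetical scheme)."""
    # Replace anything that is not a letter or digit so the name is file-system safe.
    safe_taxon = re.sub(r"[^A-Za-z0-9]+", "_", taxon).strip("_")
    return f"{unique_id}_{safe_taxon}{suffix.lower()}"


def rename_images(image_dir: Path, decoded: dict[str, dict[str, str]]) -> dict[str, str]:
    """Rename each image listed in `decoded` and return {old name: new name}.

    `decoded` maps an image file name to the values read from its barcodes
    (assumed to have been extracted beforehand).
    """
    renamed = {}
    for old_name, fields in sorted(decoded.items()):
        source = image_dir / old_name
        if not source.is_file():
            continue  # no matching image on disk, nothing to rename
        new_name = build_filename(fields["unique_id"], fields["taxon"], source.suffix)
        target = image_dir / new_name
        if target.exists():
            # Refuse to overwrite: two images resolving to one name needs a human check.
            raise FileExistsError(f"{new_name} already exists")
        source.rename(target)
        renamed[old_name] = new_name
    return renamed
```

Returning the old-to-new mapping gives a simple audit trail, and refusing to overwrite an existing file is one way such a script could help minimise human error rather than hide it.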

