Linked Open Biodiversity Data (LOBD): A semantic application for integrating biodiversity information

Author(s):  
Marcos Zárate ◽  
Paula Zermoglio ◽  
John Wieczorek ◽  
Anabela Plos ◽  
Renato Mazzanti

Scientists frequently collect biological and environmental information over years and store it in database systems to answer their own research questions, without exposing it in repositories that make it easy to find and retrieve. In recent years the biodiversity informatics community has made significant strides by creating common shared vocabularies such as the Darwin Core (DwC, Wieczorek et al. 2012) and publishing mechanisms such as the Integrated Publishing Toolkit (IPT, Robertson et al. 2014), but integration is still largely limited to the aggregation of datasets, and full interoperability has not been achieved. In this context, the Semantic Web (SW) aims to represent information so that, beyond human-centered display, it can be used autonomously by machines for integration and reuse across applications. From the biodiversity informatics point of view, interoperability and links among data sources would allow integration of information that is otherwise disconnected, enabling scientists to answer broader questions. These considerations provide strong motivation to build a web application designed for semantic interoperability, one that may help answer questions such as the following: (Q1) Is it possible to complement taxonomic, bibliographic and environmental information of a particular species without relying on specific Application Programming Interfaces (APIs)? (Q2) How can occurrences of species be related to environmental variables within a specific region? (Q3) What are the bibliographic references associated with a given species?
With questions such as these in mind, we present the design of a proof-of-concept application: Linked Open Biodiversity Data (LOBD). LOBD uses Linked Data (LD) (Heath and Bizer 2011) to complement species occurrence information, previously extracted from GBIF and converted to Resource Description Framework (RDF) (Zárate et al. 2020), with information about the taxa in question from different RDF datasets, such as Wikidata, NCBI Taxonomy, Springer Nature SciGraph and OpenCitation corpus. A simplified view of the architecture is shown in Fig. 1. To achieve semantic interoperability, we use the SPARQL query language, which frees us from depending on specific APIs to retrieve information. The application consists of three modules: (1) General information, where the Wikidata endpoint is used to retrieve additional information about the selected species, including links to other databases and information about the species extracted from National Center for Biotechnology Information (NCBI) Taxonomy. (2) Bibliography, where all publications related to the species are retrieved and extracted from OpenCitation. (3) Environment, where users can plot species on a map and add layers related to marine regions as well as environmental layers (e.g., temperature, salinity).
The application is developed with the Shiny framework for R; access to SPARQL endpoints is done through the SPARQL package, marine regions are obtained from marineregion.org, and the environmental layers are extracted from Bio-ORACLE. The data used for this article were collected by the Center for the Study of Marine Systems at the National Patagonian Sci-Tech Centre (CCT CENPAT-CONICET), and are published and available through the GBIF network. Linked Data is a powerful tool for scientists, as it enables new approaches to biodiversity informatics that can help address data integration challenges. Users would benefit from complementing the currently prevalent use of vocabularies that are not ontologically defined (like DwC) for sharing biodiversity data. Although this application is a proof of concept, it shows that, with little effort, it is possible to achieve greater interoperability between datasets that were not initially represented as LD.
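As a rough illustration of how a SPARQL endpoint can stand in for a provider-specific API (Q1), the sketch below builds a query against the public Wikidata endpoint and flattens the standard SPARQL JSON result format. It is not the authors' implementation, which uses R and the SPARQL package; the function names, the choice of Python, and the selected properties (Wikidata P225 taxon name, P685 NCBI taxonomy ID, P846 GBIF taxon ID) are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata SPARQL endpoint (assumed here; LOBD may be configured differently).
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_taxon_query(scientific_name: str) -> str:
    """Build a SPARQL query that looks up a taxon by its scientific name."""
    # Escape backslashes and double quotes so the name is a valid string literal.
    safe_name = scientific_name.replace("\\", "\\\\").replace('"', '\\"')
    return f"""
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?taxon ?ncbiId ?gbifId WHERE {{
  ?taxon wdt:P225 "{safe_name}" .
  OPTIONAL {{ ?taxon wdt:P685 ?ncbiId . }}
  OPTIONAL {{ ?taxon wdt:P846 ?gbifId . }}
}}
LIMIT 10
""".strip()


def parse_bindings(results: dict) -> list[dict]:
    """Flatten a SPARQL JSON result into a list of {variable: value} rows."""
    rows = []
    for binding in results["results"]["bindings"]:
        # Unbound OPTIONAL variables are simply absent from the binding.
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows


def fetch_taxon(scientific_name: str) -> list[dict]:
    """Send the query over HTTP GET and return the flattened rows."""
    params = urllib.parse.urlencode(
        {"query": build_taxon_query(scientific_name), "format": "json"}
    )
    request = urllib.request.Request(
        f"{WIKIDATA_ENDPOINT}?{params}",
        headers={
            "Accept": "application/sparql-results+json",
            "User-Agent": "lobd-sketch/0.1 (illustrative example)",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_taxon` with a species name would return one row per matching Wikidata item, each carrying whichever external identifiers are present. Because the same request pattern works against any SPARQL endpoint, other RDF datasets could be queried by changing only the endpoint URL and the query text.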

Symmetry ◽  
2021 ◽  
Vol 13 (2) ◽  
pp. 317
Author(s):  
Chithambaramani Ramalingam ◽  
Prakash Mohan

Growing adoption of cloud computing has driven strong business demand for cloud services, which offer platform, software, and infrastructure for the day-to-day use of cloud consumers. Numerous new cloud service providers have entered the market with unique features that help service developers collaborate and migrate services among multiple cloud service providers to address the varying requirements of cloud consumers. Many interfaces and proprietary application programming interfaces (APIs) are available for migration and collaboration services among cloud providers, but standardization efforts are lacking. The aim of this research was to summarize the issues involved in semantic cloud portability and interoperability in the multi-cloud environment and to define the standardization effort urgently needed for migrating and collaborating services in that environment.


2021 ◽  
Author(s):  
Nikesh Lalchandani ◽  
Frank Jiang ◽  
Jongkil Jay Jeong ◽  
Yevhen Zolotavkin ◽  
Robin Doss

Author(s):  
Sharif Islam

The European Loans and Visits System (ELViS) is an e-service in development designed to improve access to natural history collections across Europe. Bringing together heterogeneous datasets about institutions, people, collections and specimens, ELViS will provide an e-service (with application programming interfaces (APIs) and portal) that handles various stages of collections-based research. One of the main functionalities of ELViS is to facilitate loan and visit requests related to collections. To facilitate activities such as searching for collections, requesting loans, generating reports on collection usage, and ensuring interoperability with existing and new systems and services, ELViS must use a standard way of describing collections. In this talk, I show how ELViS can use the Collection Descriptions (CD) standard currently being developed by the CD Task Group at TDWG. I will provide a brief introduction to ELViS, summarise the current development efforts, and show how the Collection Description standard can support specific user requirements (gathered via an extensive set of user stories). I will also provide insight into the data elements within ELViS (see Fig. 1) and how they relate to the Collection Description data model.


Author(s):  
Mary Barkworth ◽  
Mushtaq Ahmad ◽  
Mudassir Asrar ◽  
Raza Bhatti ◽  
Neil Cobb ◽  
...  

In 2017, funding from the Biodiversity Information Fund for Asia accelerated data mobilization and georeferencing by Pakistani herbaria. The funding directly benefited only two herbaria but, by the end of the project, 9 herbaria were involved in sharing data, 2 through GBIF (ISL 2019, SINDH 2019; codes according to Index Herbariorum) and 6 others (BANNU 2019, BGH 2019, PUP 2019, QUETTA 2019, RAW 2019, SWAT 2019) through OpenHerbarium, a Symbiota-based network. Eventually, all collections in OpenHerbarium are expected to become GBIF data providers. Additional Pakistani herbaria are being introduced to data mobilization, and several individuals have expressed interest in learning to use OpenHerbarium to generate documented checklists for teaching and research, and others in learning to link information in OpenHerbarium to other resources. These are the first steps to developing a “large group of individuals … to train, mentor, and champion [biodiversity] data use” in Pakistan, but it is important to remember that good biodiversity data starts in the field. We need to provide today’s collectors and educators with easy access to a) information about what constitutes a high-quality herbarium specimen; b) tools that make it easier to record and provide high-quality specimen data; c) simple mechanisms for sharing data in ways that provide immediately useful resources; and d) ways to learn to make use of the data becoming available. OpenHerbarium addresses the third and fourth needs and also makes it simple for collections to become GBIF data providers. This year, the focus will be on the first two needs identified. Introduction of the new resources will be used to introduce collectors and educators to the ideas underlying provision of biodiversity data that is fit for use and reuse. When Symbiota2 is functional, OpenHerbarium will be moved to that system. This will encourage development of additional tools for using biodiversity data.
All these activities are essential to helping spread understanding of the concepts integral to biodiversity informatics. It is, of course, possible “to train, build, and champion data use” using data for other parts of the world, or provided by institutions from other parts of the world, but embedding good biodiversity data practices into the fabric of a country’s biodiversity education and research activities better benefits the country if a substantial portion of the data is generated from within the country. It also helps to spread knowledge of the country’s biodiversity among its students. Consequently, our focus in developing Pakistan’s capacity in biodiversity informatics is on engaging collections and collectors in sharing biodiversity data, then helping them discover, use, and create methods for developing the insights needed to encourage wise use of the country’s biological resources, and encouraging interaction. This will lead to a “community of practice” within Pakistan that can both benefit from and contribute to an international “community of practice”.


2020 ◽  
Vol 5 (1) ◽  
pp. 3-17
Author(s):  
Jian Qin

Abstract. Purpose: This paper compares the paradigmatic differences between knowledge organization (KO) in library and information science and knowledge representation (KR) in AI to show the convergence in KO and KR methods and applications. Methodology: A literature review and comparative analysis of KO and KR paradigms is the primary method used in this paper. Findings: A key difference between KO and KR lies in purpose: KO aims to organize knowledge into a certain structure for standardizing and/or normalizing the vocabulary of concepts and relations, while KR is problem-solving oriented. Differences between KO and KR are discussed based on their goals, methods, and functions. Research limitations: This is only preliminary research with a case study as proof of concept. Practical implications: The paper articulates the opportunities in applying KR and other AI methods and techniques to enhance the functions of KO. Originality/value: Ontologies and linked data, as evidence of the convergence of KO and KR paradigms, provide theoretical and methodological support to innovate KO in the AI era.

