Darwin Core for Agricultural Biodiversity: A metadata extension proposal

Author(s):  
Filipi Soares ◽  
Benildes Maculan ◽  
Debora Drucker

Agricultural Biodiversity has been defined by the Convention on Biological Diversity as the set of elements of biodiversity that are relevant to agriculture and food production. These elements are arranged into an agro-ecosystem that encompasses "the variability among living organisms from all sources including terrestrial, marine and other aquatic ecosystems and the ecological complexes of which they are part: this includes diversity within species, between species and of ecosystems" (UNEP 1992). As with any other field in Biology, Agricultural Biodiversity work produces data. To publish data so that it can be efficiently retrieved on the web, one must describe it with proper metadata. A metadata element set is a group of statements made about something. Each statement has three parts: the subject (the thing represented), the predicate (the property to be filled with data) and the object (the data itself). This representation is called a triple. For example, the title is a metadata element: a book is the subject, title is the predicate, and The Chronicles of Narnia is the object. Several metadata standards have been developed to describe biodiversity data, such as ABCD Data Schema, Darwin Core (DwC) and Ecological Metadata Language (EML). DwC is said to be the most widely used metadata standard for publishing species occurrence data worldwide (Global Biodiversity Information Facility 2019). "Darwin Core is a standard maintained by the Darwin Core maintenance group. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing identifiers, labels, and definitions. Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, samples, and related information" (Biodiversity Information Standards (TDWG) 2014).
Within this thematic context, a master's research project is in progress at the Federal University of Minas Gerais in partnership with the Brazilian Agricultural Research Corporation (EMBRAPA). It aims to apply DwC to Brazil’s Agricultural Biodiversity data. A pragmatic analysis of DwC and DwC Extensions demonstrated that important concepts and relations from Agricultural Biodiversity are not represented in DwC elements. For example, DwC lacks adequate metadata to describe biological interactions, which convey important information about relations between organisms from an ecological perspective. Pollination is one of the biological interactions relevant to Agricultural Biodiversity for which we need enhanced metadata. Given these problems, the principles of metadata construction used by DwC will be followed in order to develop a metadata extension able to represent data about Agricultural Biodiversity. These principles come from the Dublin Core Abstract Model, which presents propositions for creating the triples (subject-predicate-object). The standard format of DwC Extensions (see Darwin Core Archive Validator) will be followed to shape the metadata extension. At the end of the research, we expect to present a model of a DwC metadata record for publishing data about Agricultural Biodiversity in Brazil, including metadata already existing in Simple DwC and the new metadata of Brazil’s Agricultural Biodiversity Metadata Extension. The resulting extension will be useful to represent Agricultural Biodiversity worldwide.
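The subject-predicate-object statements described in this abstract can be illustrated with a minimal Python sketch. The `Triple` type, the `describe` helper and the second statement about the book's language are assumptions introduced here for illustration; they are not part of the Dublin Core Abstract Model or of Darwin Core.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One metadata statement: subject, predicate, object."""
    subject: str    # the thing being described
    predicate: str  # the property (metadata element)
    obj: str        # the value recorded for that property


# The book example from the abstract, expressed as a triple.
book_title = Triple(subject="a book", predicate="title", obj="The Chronicles of Narnia")

# A metadata record is a group of statements about the same subject.
# The second statement is invented for illustration only.
record = [
    book_title,
    Triple(subject="a book", predicate="language", obj="English"),
]


def describe(triples):
    """Group predicate/object pairs by subject into a nested dict."""
    out = {}
    for t in triples:
        out.setdefault(t.subject, {})[t.predicate] = t.obj
    return out


description = describe(record)
```

Grouping the statements by subject gives the familiar "record" view of metadata, where each predicate behaves like a field name.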

2017 ◽  
Vol 3 ◽  
pp. 49-59
Author(s):  
Bal Krishna Joshi

Agricultural biodiversity is the basis of human life and food security. Nepal, with 577 cultivated species, possesses huge diversity at the varietal and landrace levels. Rapid genetic erosion, driven by several factors, is common in most agricultural crops. Considering the importance of agricultural biodiversity for sustainable food production, as declared by the Convention on Biological Diversity, the National Agriculture Genetic Resources Center (NAGRC) has been established for the conservation and sustainable utilization of agricultural biodiversity. This paper delineates the biotechnological tools adopted by NAGRC for the effective and efficient conservation and use of agricultural plant genetic resources (APGRs). Among the adopted technologies, a tissue bank using shoot tip culture of vegetatively propagated and recalcitrant crops (e.g., potato, sugarcane, banana and sweet potato) is in operation. Under molecular marker technology, random amplified polymorphic DNA (RAPD) and simple sequence repeat (SSR) markers are currently used for developing DNA profiles, identifying duplicates in the collections, assessing genetic diversity and screening accessions for economic traits. A DNA bank has also been created for storing DNA of indigenous crops, and this DNA can be accessed for research and study. A genotypic database has been developed for chayote, finger millet, wheat and maize for identification and selection of accessions.
Journal of Nepal Agricultural Research Council Vol.3 2017: 49-59


2020 ◽  
Vol 15 (4) ◽  
pp. 411-437 ◽  
Author(s):  
Marcos Zárate ◽  
Germán Braun ◽  
Pablo Fillottrani ◽  
Claudio Delrieux ◽  
Mirtha Lewis

Great progress has been made recently in digitizing the world’s available Biodiversity and Biogeography data, but managing data from many different providers and research domains remains a challenge. A review of the current landscape of metadata standards and ontologies in Biodiversity sciences suggests that existing standards, such as the Darwin Core terminology, are inadequate for describing Biodiversity data in a semantically meaningful and computationally useful way. As a contribution to filling this gap, we present an ontology-based system, called BiGe-Onto, designed to jointly manage data from Biodiversity and Biogeography. As data sources, we use two internationally recognized repositories: the Global Biodiversity Information Facility (GBIF) and the Ocean Biogeographic Information System (OBIS). The BiGe-Onto system is composed of (i) the BiGe-Onto Architecture, (ii) a conceptual model called BiGe-Onto specified in OntoUML, (iii) an operational version of BiGe-Onto encoded in OWL 2, and (iv) an integrated dataset for its exploitation through a SPARQL endpoint. We show use cases that allow researchers to answer questions requiring information from both domains.


Author(s):  
Filipi Soares ◽  
Antonio Saraiva ◽  
Debora Drucker

Agrobiodiversity, or biodiversity for food and agriculture, plays a major role in the sustainability of food production. As stated by FAO 2019, agrobiodiversity can provide food production systems and society with a variety of services, such as ecosystem services, crop resilience to threats, sustainable intensification, livelihoods, food security and nutrition. The official definition of the concept has been given by CBD 2000 as "a broad term that includes all components of biological diversity of relevance to food and agriculture, and all components of biological diversity that constitute the agroecosystem: the variety and variability of animals, plants and micro-organisms, at the genetic, species and ecosystem levels, which are necessary to sustain key functions of the agro-ecosystem, its structure and processes". Thus, agrobiodiversity is primarily based on species and their function in agroecosystems. Many projects for sharing agrobiodiversity data in a structured way have emerged over the years. The Bioversity International (2018) crop descriptors list shows that the earliest groups of descriptors for crops and some associated data emerged back in the 1970s. The same list includes four multi-crop descriptors and derived standards, which are broad standards for crop-related data, namely: Core descriptors for in situ conservation of crop wild relatives v.1 (Thormann et al. 2013); FAO/Bioversity multi-crop passport descriptors V.2.1 (Alercia et al. 2015); Descriptors for farmers' knowledge of plants (Aknazarov et al. 2009); Descriptors for Genetic Marker Technologies (Vicente et al. 2004).
These standards share some core elements, such as taxon, location and collecting period, and were intended to be used in the context of data on the occurrence of species in nature. Darwin Core, a TDWG standard commonly used for sharing data on taxon occurrence in nature (Wieczorek et al. 2012), is a globally used metadata standard, representing "a large majority of the 1.4 billion of species occurrence records shared by the Global Biodiversity Information Facility (GBIF), published by more than 1561 organizations in 59 countries in January 2020" (Body et al. 2020). Darwin Core is a standardized language that assigns a unique Internationalized Resource Identifier (IRI), a label and a definition to each metadata element. This improves the interoperability between databases in the context of the Semantic Web (Duerst and Suignard 2005). We believe it is possible to use Darwin Core to represent agrobiodiversity data if a metadata extension is developed to incorporate the agrobiodiversity concepts missing from Darwin Core. Thus, a research project held at the University of São Paulo in partnership with the Brazilian Agricultural Research Corporation has started to map concepts and descriptors from the literature for agrobiodiversity data representation. This project continues the research initiated by Soares et al. 2019. The crop descriptors published by Bioversity International (2018) may be integrated into the metadata extension, as may other standards like the Global Genome Biodiversity Network (GGBN) Data Standard v1 (Droege et al. 2016) and the Darwin Core germplasm extension (DwC-germplasm). At the moment, we are designing a mind map to organize the agrobiodiversity concepts. We expect the metadata extension will be useful for the scientific community to share agrobiodiversity data as linked data, applying, for example, the Resource Description Framework (RDF) as a resource relationship model.
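As a rough illustration of sharing a Darwin Core record as linked data, the sketch below serializes a flat record into N-Triples, one RDF statement per term. The namespace `http://rs.tdwg.org/dwc/terms/` and the terms `scientificName` and `country` are taken from Darwin Core; the subject IRI under `example.org`, the record values and the `to_ntriples` helper are assumptions made for this example, and no agrobiodiversity extension terms are shown because the abstract does not define any.

```python
# Darwin Core term namespace; each term IRI is this prefix plus the term name.
DWC = "http://rs.tdwg.org/dwc/terms/"


def to_ntriples(subject_iri, record):
    """Serialize a flat {term: value} record as N-Triples with plain literals."""
    lines = []
    for term, value in record.items():
        # Escape backslashes and double quotes inside the literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject_iri}> <{DWC}{term}> "{escaped}" .')
    return "\n".join(lines)


# Invented occurrence record, used only to show the shape of the output.
record = {
    "scientificName": "Manihot esculenta Crantz",
    "country": "Brazil",
}

# Hypothetical identifier for the occurrence being described.
ntriples = to_ntriples("https://example.org/occurrence/1", record)
print(ntriples)
```

Each output line is a subject-predicate-object statement in which the predicate is the IRI of a Darwin Core term, which is what lets independent databases agree on the meaning of a column.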


Author(s):  
Lauren Weatherdon

Ensuring that we have the data and information necessary to make informed decisions is a core requirement in an era of increasing complexity and anthropogenic impact. With cumulative challenges such as the decline in biodiversity and accelerating climate change, the need for spatially-explicit and methodologically-consistent data that can be compiled to produce useful and reliable indicators of biological change and ecosystem health is growing. Technological advances—including satellite imagery—are beginning to make this a reality, yet uptake of biodiversity information standards and scaling of data to ensure its applicability at multiple levels of decision-making are still in progress. The complementary Essential Biodiversity Variables (EBVs) and Essential Ocean Variables (EOVs), combined with Darwin Core and other data and metadata standards, provide the underpinnings necessary to produce data that can inform indicators. However, perhaps the largest challenge in developing global, biological change indicators is achieving consistent and holistic coverage over time, with recognition of biodiversity data as global assets that are critical to tracking progress toward the UN Sustainable Development Goals and Targets set by the international community (see Jensen and Campbell (2019) for discussion). Through this talk, I will describe some of the efforts towards producing and collating effective biodiversity indicators, such as those based on authoritative datasets like the World Database on Protected Areas (https://www.protectedplanet.net/), and work achieved through the Biodiversity Indicators Partnership (https://www.bipindicators.net/). I will also highlight some of the characteristics of effective indicators, and global biodiversity reporting and communication needs as we approach 2020 and beyond.


2020 ◽  
Vol 12 (24) ◽  
pp. 10690
Author(s):  
Ishwari Singh Bisht ◽  
Jai Chand Rana ◽  
Rashmi Yadav ◽  
Sudhir Pal Ahlawat

Mainstreaming biodiversity in production landscapes ensures conservation and sustainable use of agricultural biodiversity, the key objectives of the Convention on Biological Diversity (CBD) and the projects supported by the United Nations Environment Program (UNEP) Global Environment Facility (GEF). Mainstreaming integrates biodiversity in existing or new programs and policies, both cross-sectoral and sector-specific. The conventional model of agricultural production, with limited diversity in production systems and use of high chemical input, has taught us a valuable lesson as it is adversely impacting the environment, essential ecosystem services, soil health and the long-term sustainability of our food systems. Using a qualitative participant observation approach, our study investigated four distinct traditional Indian production landscapes to gauge (i) the farming communities’ response to institutional policies, programs and agricultural biodiversity-related activities in traditional Indian production landscapes and (ii) opportunities and challenges for sustainable development in smallholder traditional Indian farming systems. Results indicate that the top-down decision-making regime is the least effective towards achieving sustainable development in traditional Indian farming landscapes and that farmers’ experiential knowledge of participatory biodiversity management, maintenance and use for sustainable development is of critical importance to India’s agriculture and economy. Reclaiming agriculture’s spiritual roots through organic farming and locally grown food emerged as key, including the need for designing and implementing a more sovereign food system. Revisiting traditional smallholder farming under the COVID-19 pandemic and lessons learned for repurposing India’s agricultural policy are also highlighted.


Author(s):  
José Augusto Salim ◽  
Antonio Saraiva

For those biologists and biodiversity data managers who are unfamiliar with information science practices of data standardization, the use of complex software to assist in the creation of standardized datasets can be a barrier to sharing data. Since the ratification of the Darwin Core Standard (DwC) (Darwin Core Task Group 2009) by the Biodiversity Information Standards (TDWG) in 2009, many datasets have been published and shared through a variety of data portals. In the early stages of biodiversity data sharing, the protocol Distributed Generic Information Retrieval (DiGIR), progenitor of DwC, and later the protocols BioCASe and TDWG Access Protocol for Information Retrieval (TAPIR) (De Giovanni et al. 2010) were introduced for discovery, search and retrieval of distributed data, simplifying data exchange between information systems. Although these protocols are still in use, they are known to be inefficient for transferring large amounts of data (GBIF 2017). Because of that, in 2011 the Global Biodiversity Information Facility (GBIF) introduced the Darwin Core Archive (DwC-A), which allows more efficient data transfer and has become the preferred format for publishing data in the GBIF network. DwC-A is a structured collection of text files, which makes use of the DwC terms to produce a single, self-contained dataset. Many tools for assisting data sharing using DwC-A have been introduced, such as the Integrated Publishing Toolkit (IPT) (Robertson et al. 2014), the Darwin Core Archive Assistant (GBIF 2010) and the Darwin Core Archive Validator. Although these tools promote and facilitate data sharing, many users have difficulties using them, mainly because of the lack of training in information science in the biodiversity curriculum (Convention on Biological Diversity 2012, Enke et al. 2012).
Most users are very familiar with using spreadsheets to store and organize their data, whereas adopting the available solutions requires data transformation and training in information science and, more specifically, biodiversity informatics. For an example of how spreadsheets can simplify data sharing see Stoev et al. (2016). In order to provide a more "familiar" approach to data sharing using DwC-A, we introduce a new tool as a Google Sheet Add-on. The Add-on, called Darwin Core Archive Assistant Add-on, can be installed in the user's Google Account from the G Suite MarketPlace and used in conjunction with the Google Sheets application. The Add-on assists the mapping of spreadsheet columns/fields to DwC terms (Fig. 1), similar to IPT, but with the advantage that it does not require the user to export the spreadsheet and import it into another application. Additionally, the Add-on facilitates the creation of a star schema in accordance with DwC-A, through the definition of a "CORE_ID" (e.g. occurrenceID, eventID, taxonID) field between sheets of a document (Fig. 2). The Add-on also provides an Ecological Metadata Language (EML) (Jones et al. 2019) editor (Fig. 3) with minimal fields to be filled in (i.e., mandatory fields required by IPT), and helps users to generate and share DwC-Archives stored in the user's Google Drive, which can be downloaded as a DwC-A or automatically uploaded to another public storage resource like a user's Zenodo Account (Fig. 4). We expect that the Google Sheet Add-on introduced here, in conjunction with IPT, will promote biodiversity data sharing in a standardized format, as it requires minimal training and simplifies the process of data sharing from the user's perspective, mainly for those users who are not familiar with IPT but have historically worked with spreadsheets.
Although the DwC-A generated by the Add-on still needs to be published using IPT, it does provide a simpler interface (i.e., a spreadsheet) for mapping data sets to DwC than IPT. Even though the IPT includes many more features than the Darwin Core Archive Assistant Add-on, we expect that the Add-on can be a "starting point" for users unfamiliar with biodiversity informatics before they move on to more advanced data publishing tools. On the other hand, Zenodo integration allows users to share and cite their standardized data sets without publishing them via IPT, which can be useful for users without access to an IPT installation. Additionally, we are working on new features, and future releases will include the automatic generation of Globally Unique Identifiers for shared records, the possibility of adding additional data standards and DwC extensions, and integration with the GBIF REST API and the IPT REST API.
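The two steps this abstract describes, mapping spreadsheet columns to DwC terms and linking sheets through a core ID in a star schema, can be sketched in plain Python. This is not the Add-on's implementation: the column names, the `COLUMN_TO_TERM` mapping, the `coreid` link column and the sample rows are all assumptions made for illustration. Only the DwC term names (`occurrenceID`, `scientificName`, `eventDate`, `measurementType`, `measurementValue`) come from the standard.

```python
# Hypothetical mapping from a user's spreadsheet columns to DwC terms.
COLUMN_TO_TERM = {"id": "occurrenceID", "species": "scientificName", "date": "eventDate"}


def map_columns(rows, mapping):
    """Rename mapped columns to DwC terms and drop unmapped columns."""
    return [{mapping[col]: val for col, val in row.items() if col in mapping} for row in rows]


def orphan_rows(core_rows, extension_rows, core_id_term="occurrenceID", link_column="coreid"):
    """Return extension rows whose link value matches no row in the core sheet."""
    core_ids = {row[core_id_term] for row in core_rows}
    return [row for row in extension_rows if row[link_column] not in core_ids]


# Invented core sheet (one row per occurrence); "notes" is left unmapped.
occurrence_sheet = [
    {"id": "occ-1", "species": "Apis mellifera", "date": "2019-03-02", "notes": "field sheet 7"},
    {"id": "occ-2", "species": "Bombus terrestris", "date": "2019-03-05", "notes": ""},
]

# Invented extension sheet; each row points back to a core row through "coreid".
measurement_sheet = [
    {"coreid": "occ-1", "measurementType": "body length", "measurementValue": "12"},
    {"coreid": "occ-9", "measurementType": "body length", "measurementValue": "15"},
]

core = map_columns(occurrence_sheet, COLUMN_TO_TERM)
orphans = orphan_rows(core, measurement_sheet)
```

In a star schema every extension row must point at exactly one core row, so a check like `orphan_rows` catches broken links (here the row referring to `occ-9`) before an archive is generated.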


Author(s):  
Azra Velagić-Hajrudinović

Featuring a large variety of ecosystems, abundant freshwater and forest resources, unique extensive karstic systems, and a high level of biodiversity and endemism, Southeast Europe (SEE) plays a crucial role in the conservation of biodiversity in Europe and beyond. In order to conserve and sustainably use these biodiversity assets and valuable natural resources, a regional concerted approach in the field of biodiversity information management and reporting (BIMR) has been strengthened. This has enabled improvement in access, transparency and exchange of biodiversity data and reporting processes among the participating economies. The significant and visible progress among SEE economies and stakeholders is due to the knowledge gained about regional and national BIMR baselines, the agreed and elaborated minimum Convention on Biological Diversity (CBD) and European Union (EU) requirements on BIMR among stakeholders, and the implemented BIMR tools (e.g., a regionally unified fundamental database for the Information System for Nature Conservation (ISNC), for instance in Montenegro (http://zasticenapodrucja-cg.tk//en), Bosnia and Herzegovina/entity of Republika Srpska (http://e-priroda.rs.ba/en/) and entity of Federation of Bosnia and Herzegovina and North Macedonia (Standard Data Form - SDF application for NATURA 2000) and a compiled dataset on five taxonomic groups of endemic taxa using the Darwin Core standard). As a result, BIMR activities and priorities from the region have become more evident and better supported, with ownership of BIMR tools acquired by the partner institutions and recognized at the global level through the Global Biodiversity Information Facility (GBIF).


Author(s):  
Raïssa Meyer ◽  
Pier Buttigieg ◽  
John Wieczorek ◽  
Thomas Jeppesen ◽  
William Duncan ◽  
...  

Biodiversity is increasingly being assessed using omic technologies (e.g. metagenomics or metatranscriptomics); however, the metadata generated by omic investigations is not fully harmonised with that of the broader biodiversity community. There are two major communities developing metadata standards specifications relevant to omic biodiversity data: TDWG, through its Darwin Core (DwC) standard, and the Genomic Standards Consortium (GSC), through its Minimum Information about any (x) Sequence (MIxS) checklists. To prevent these specifications leading to silos between the communities using them (e.g. INSDC: an internationally mandated database collaboration for nucleotide sequencing data [from health, biodiversity, microbiology, etc.] using the MIxS checklists; OBIS and GBIF: global biodiversity data networks using the DwC standard), there is a need to harmonise them at the level of the standards organisations themselves. To this end, we have brought together representatives from these standardisation bodies, along with representatives from established biodiversity data infrastructures, domain experts, data generators, and publishers to develop sustainable interoperability between the two specifications. Together, we have: generated a semantic mapping between the terminology used in each specification, and syntactic mapping of their associated values following the Simple Standard for Sharing Ontology Mappings (SSSOM), and created an example MIxS-DwC extension showing the incorporation of unmapped MIxS terms into a DwC-Archive. To sustain these mechanisms of interoperability, we have proposed a Memorandum of Understanding between the GSC and TDWG.
During our work, we also noted a number of key challenges that currently preclude interoperation between these two specifications. In this talk, we will outline the major steps we took to get here, as well as the future activities we recommend based on our outputs.
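To give a sense of what a semantic mapping in SSSOM form looks like, the sketch below writes a tiny mapping table as tab-separated text and lists the terms that remain unmapped, which would be the candidates for an extension. The column names follow our reading of the SSSOM specification. The two mapping rows, the CURIE prefixes and the chosen predicates are illustrative assumptions and are not taken from the authors' MIxS-DwC mapping.

```python
import csv
import io

# Core SSSOM columns (as we understand the specification).
SSSOM_COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]

# Illustrative rows only; the real mapping is the authors' published output.
MAPPINGS = [
    ("mixs:collection_date", "skos:closeMatch", "dwc:eventDate", "semapv:ManualMappingCuration"),
    ("mixs:geo_loc_name", "skos:relatedMatch", "dwc:country", "semapv:ManualMappingCuration"),
]


def to_sssom_tsv(mappings):
    """Render mapping tuples as a tab-separated table with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(SSSOM_COLUMNS)
    writer.writerows(mappings)
    return buf.getvalue()


def unmapped_terms(all_subject_terms, mappings):
    """Return subject terms that have no mapping row, preserving input order."""
    mapped = {m[0] for m in mappings}
    return [t for t in all_subject_terms if t not in mapped]


# A small, hypothetical selection of MIxS terms to check against the mapping.
MIXS_TERMS = ["mixs:collection_date", "mixs:geo_loc_name", "mixs:seq_meth"]

tsv = to_sssom_tsv(MAPPINGS)
leftover = unmapped_terms(MIXS_TERMS, MAPPINGS)
```

Terms returned by `unmapped_terms` have no DwC counterpart in the table, so they are the ones an extension would need to carry into a DwC-Archive.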


2018 ◽  
Vol 2 ◽  
pp. e27087
Author(s):  
Donald Hobern ◽  
Andrea Hahn ◽  
Tim Robertson

For more than a decade, the biodiversity informatics community has recognised the importance of stable resolvable identifiers to enable unambiguous references to data objects and the associated concepts and entities, including museum/herbarium specimens and, more broadly, all records serving as evidence of species occurrence in time and space. Early efforts built on the Darwin Core institutionCode, collectionCode and catalogNumber terms, treated as a triple and expected to uniquely identify a specimen. Following a review of current technologies for globally unique identifiers, TDWG adopted Life Science Identifiers (LSIDs) (Pereira et al. 2009). Unfortunately, the key stakeholders in the LSID consortium soon withdrew support for the technology, leaving TDWG committed to a moribund technology. Subsequently, publishers of biodiversity data have adopted a range of technologies to provide unique identifiers, including (among others) HTTP Uniform Resource Identifiers (URIs), Universally Unique Identifiers (UUIDs), Archival Resource Keys (ARKs), and Handles. Each of these technologies has merit, but they do not provide consistent guarantees of persistence or resolvability. More importantly, the heterogeneity of these solutions hampers delivery of services that can treat all of these data objects as part of a consistent linked-open-data domain. The geoscience community has established the System for Earth Sample Registration (SESAR), which enables collections to publish standard metadata records for their samples and for each of these to be associated with an International Geo Sample Number (IGSN http://www.geosamples.org/igsnabout). IGSNs follow a standard format, distribute responsibility for uniqueness between SESAR and the publishing collections, and support resolution via HTTP URI or Handles. Each IGSN resolves to a standard metadata page, roughly equivalent in detail to a Darwin Core specimen record.
The standardisation of identifiers has allowed the community to secure support from some journal publishers for promotion and use of IGSNs within articles. The biodiversity informatics community encompasses a much larger number of publishers and greater pre-existing variation in identifier formats. Nevertheless, it would be possible to deliver a shared global identifier scheme with the same features as IGSNs by building off the aggregation services offered by the Global Biodiversity Information Facility (GBIF). The GBIF data index includes normalised Darwin Core metadata for all data records from registered data sources and could serve as a platform for resolution of HTTP URIs and/or Handles for all specimens and for all occurrence records. The most significant trade-off requiring consideration would be between autonomy for collections and other publishers in how they format identifiers within their own data and the benefits that may arise from greater consistency and predictability in the form of resolvable identifiers.
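The difference between the early triple-based identifiers and globally unique ones can be shown with a short sketch. The codes, the `example.org` resolver base and the helper names are assumptions made for illustration. The sketch does not represent any GBIF or SESAR service.

```python
import uuid


def triplet_id(institution_code, collection_code, catalog_number):
    """Join the three Darwin Core codes into one colon-separated identifier."""
    return f"{institution_code}:{collection_code}:{catalog_number}"


def resolvable_uri(identifier, base="https://example.org/specimen/"):
    """Build an HTTP URI from an identifier; the base URL is hypothetical."""
    return f"{base}{identifier}"


# Two different institutions that happen to use the same codes produce the
# same triple, so the triple alone cannot tell their specimens apart.
first = triplet_id("MUS", "Herb", "12345")
second = triplet_id("MUS", "Herb", "12345")
collision = first == second

# Random UUIDs are unique without any coordination between publishers,
# but on their own they say nothing about where the record can be resolved.
specimen_a = uuid.uuid4()
specimen_b = uuid.uuid4()
uri_a = resolvable_uri(specimen_a)
```

Uniqueness and resolvability are separate properties: the UUID supplies the first, while the second depends on some service continuing to answer at the chosen base address, which is the trade-off the abstract discusses.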


HortScience ◽  
2007 ◽  
Vol 42 (2) ◽  
pp. 200-202 ◽  
Author(s):  
Philip L. Forsline ◽  
Kim E. Hummer

The National Plant Germplasm System (NPGS) of the U.S. Department of Agriculture (USDA), Agricultural Research Service (ARS), has greatly expanded since 1980. Foremost in this expansion was the addition of seven repositories for clonally propagated fruit and specialty crops. Many collections at state agricultural experiment station sites were in jeopardy as breeders retired. These collections can now be preserved by the NPGS. The NPGS has provided funding for plant exploration and exchange. From 1980 to 2004, 37 exploration/exchange proposals for fruit crops were funded, and over 3000 accessions were introduced as a result. Crop Germplasm Committees (CGCs), established for each commodity, have prepared genetic vulnerability statements and prioritized collection activities. The USDA ARS, National Germplasm Resources Laboratory (NGRL), facilitates international relationships, and the USDA Animal and Plant Health Inspection Service (APHIS), National Plant Germplasm Quarantine Center (NPGQC), tests and makes pathogen-tested germplasm available. As a result of the Convention on Biological Diversity (1993) and the International Treaty on Plant Genetic Resources for Food and Agriculture (2004), the USDA now pursues germplasm collection through the establishment of bilateral agreements of mutual benefit.

