Improving Impact Metrics of Open and Free Biodiversity Data through Linked Metadata and Academic Outreach

Author(s):  
Daniel Noesgaard

The work required to collect, clean and publish biodiversity datasets is significant, and those who do it deserve recognition for their efforts. Researchers publish studies using open biodiversity data available from GBIF—the Global Biodiversity Information Facility—at a rate of about two papers a day. These studies cover areas such as macroecology, evolution, climate change, and invasive alien species, relying on data sharing by hundreds of publishing institutions and the curatorial work of thousands of individual contributors. With more than 90 per cent of these datasets licensed under Creative Commons Attribution licenses (CC BY and CC BY-NC), data users are required to credit the dataset providers. For GBIF, it is crucial to link these scientific uses to the underlying data as one means of demonstrating the value and impact of open science, while seeking to ensure attribution of individual, organizational and national contributions to the global pool of open data about biodiversity. Every authenticated download of occurrence records from GBIF.org is issued a unique Digital Object Identifier (DOI). Each DOI resolves to a landing page that contains 1) the search parameters used to generate the download; 2) a quantitative map of the underlying datasets that contributed to the download; and 3) a simple citation to be included in works that rely on the data. When used properly by authors and deposited correctly by journals in the article metadata, the DOI citation establishes a direct link between a scientific paper and the underlying data. Crossref—the main DOI Registration Agency for academic literature—exposes such links in Event Data, which can be consumed programmatically to report direct use of individual datasets.
GBIF also records these links, permanently preserving the download archives while exposing a citation count on download landing pages that is also summarized on the landing pages of each contributing dataset and publisher. The citation counts can be expanded to produce lists of all papers unambiguously linked to use of specific datasets. In 2018, just 15 per cent of papers based on GBIF-mediated data used DOIs to cite or acknowledge the datasets used in the studies. To promote crediting of data publishers and digital recognition of data sharing, the GBIF Secretariat has been reaching out systematically to authors and publishers since April 2018 whenever a paper fails to include a proper data citation. While publishing lags may hinder immediate effects, preliminary findings suggest that uptake is improving: the number of papers with DOI data citations during the first part of 2019 is up more than 60 per cent compared to 2018. Focusing on the value of linking scientific publications and data, this presentation will explore the potential for establishing automatic linkage through DOI metadata while demonstrating efforts to improve metrics of data use and attribution of data providers through outreach campaigns to authors and journal publishers.
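Consuming such links programmatically is a small exercise against Crossref's public Event Data API. The sketch below builds a query for events whose object is a download DOI and extracts citing papers from a response; the endpoint is real, but the payload and DOIs shown are illustrative placeholders, not actual citation records.

```python
import json
from urllib.parse import urlencode

EVENT_DATA_API = "https://api.eventdata.crossref.org/v1/events"

def event_data_url(download_doi, mailto="you@example.org", rows=100):
    """Build a Crossref Event Data query for events whose object is a
    given DOI. The mailto parameter identifies the caller, as Crossref
    requests for polite API use."""
    params = {"obj-id": download_doi, "mailto": mailto, "rows": rows}
    return f"{EVENT_DATA_API}?{urlencode(params)}"

def citing_papers(payload):
    """Extract (citing DOI, relation type) pairs from an Event Data response."""
    events = payload.get("message", {}).get("events", [])
    return [(e.get("subj_id"), e.get("relation_type_id")) for e in events]

# Illustrative payload in the shape Event Data returns; the DOIs are
# placeholders, not real citation records.
sample = json.loads("""{
  "status": "ok",
  "message": {
    "total-results": 1,
    "events": [
      {"subj_id": "https://doi.org/10.1234/example.paper",
       "obj_id": "https://doi.org/10.15468/dl.example",
       "relation_type_id": "references"}
    ]
  }
}""")

print(event_data_url("10.15468/dl.example"))
print(citing_papers(sample))
```

A reporting pipeline would fetch the URL, page through results, and aggregate citing DOIs per dataset.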

2020 ◽  
Author(s):  
Kyle Copas

<p>GBIF—the Global Biodiversity Information Facility—and its network of more than 1,500 institutions maintain the world's largest index of biodiversity data (https://www.gbif.org), containing nearly 1.4 billion species occurrence records. This infrastructure offers a model of best practices, both technological and cultural, that other domains may wish to adapt or emulate to ensure that their users have free, FAIR and open access to data.</p><p>The availability of community-supported data and metadata standards in the biodiversity informatics community, combined with the adoption (in 2014) of open Creative Commons licensing for data shared with GBIF, established the necessary preconditions for the network's recent growth.</p><p>But GBIF's development of a data citation system based on the use of DOIs—Digital Object Identifiers—has established an approach for using unique identifiers to establish direct links between scientific research and the underlying data on which it depends. The resulting state-of-the-art system tracks uses and reuses of data in research and credits data citations back to individual datasets and publishers, helping to ensure the transparency of biodiversity-related scientific analyses.</p><p>In 2015, GBIF began issuing a unique Digital Object Identifier (DOI) for every data download. This system resolves each download to a landing page containing 1) the taxonomic, geographic, temporal and other search parameters used to generate the download; 2) a quantitative map of the underlying datasets that contributed to the download; and 3) a simple citation to be included in works that rely on the data.</p><p>When authors cite these download DOIs, they in effect assert direct links between scientific papers and underlying data. Crossref registers these links through Event Data, enabling GBIF to track citation counts automatically for each download, dataset and publisher. These counts expand to display a bibliography of all research reuses of the data. This system improves the incentives for institutions to share open data by providing quantifiable measures demonstrating the value and impact of sharing data for others' research.</p><p>GBIF is a mature infrastructure that supports a wide pool of researchers, who publish two peer-reviewed journal articles that rely on this data every day. That said, the citation-tracking and -crediting system has room for improvement. At present, 21% of papers using GBIF-mediated data provide DOI citations—which represents a 30% increase over 2018. Through outreach to authors and collaboration with journals, GBIF aims to continue this trend.</p><p>In addition, members of the GBIF network are seeking to extend citation credits to individuals through tools like Bloodhound Tracker (https://www.bloodhound-tracker.net) using persistent identifiers from ORCID and Wikidata IDs. This approach provides a compelling model for the scientific and scholarly benefits of treating individual data records from specimens as micro- or nanopublications—first-class research objects that advance both FAIR data and open science.</p>
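The per-dataset bibliographies described above can also be retrieved programmatically. The sketch below assumes GBIF's public literature search endpoint and its gbifDatasetKey parameter; the dataset key, sample response, and paper titles are invented for illustration.

```python
from urllib.parse import urlencode

LITERATURE_API = "https://api.gbif.org/v1/literature/search"

def literature_url(dataset_key, limit=20):
    """Build a query against GBIF's literature index for papers linked
    to a dataset. dataset_key is the dataset's UUID from GBIF.org."""
    params = {"gbifDatasetKey": dataset_key, "limit": limit}
    return f"{LITERATURE_API}?{urlencode(params)}"

def summarize(payload):
    """Reduce a literature search response to a citation count plus
    (title, year) pairs, mirroring the expandable bibliography shown
    on dataset landing pages."""
    papers = [(r.get("title"), r.get("year")) for r in payload.get("results", [])]
    return payload.get("count", 0), papers

# Response in the general shape of a GBIF search result; the titles
# below are placeholders, not real papers.
sample = {
    "count": 2,
    "results": [
        {"title": "Range shifts under climate change", "year": 2019},
        {"title": "Invasion risk of alien species", "year": 2020},
    ],
}

count, papers = summarize(sample)
print(literature_url("00000000-0000-0000-0000-000000000000"))
print(count, papers)
```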


2019 ◽  
Vol 39 (06) ◽  
pp. 300-307
Author(s):  
Deep Jyoti Francis ◽  
Anup Kumar Das

With the wave of digitalisation, institutions across countries are pushing for the creation of open data and for its governance. The FAIR Data Principles have encouraged the publishing of open research data, enabling key stakeholders and practitioners in low- and middle-income countries to meet their developmental goals through practical, problem-solving usage. Open Data, part of the broader Open Science movement, has transformed the transnational regime structure for the governance of critical issues surrounding water and energy. This paper provides a baseline survey of the various open data initiatives in the areas of water and clean energy across countries in general and in India in particular. Given the multifaceted challenges around the water-energy nexus in India, it is critical to identify open data initiatives and to study their governance at the country level. Since governance requires the participation of various institutions and multiple stakeholders, the research aims to highlight initiatives such as the participation of institutions and the application of Creative Commons (CC) licensing terms in open data governance for the clean energy and water sectors in India.


2019 ◽  
Vol 19 (01) ◽  
pp. e05
Author(s):  
Marcos Daniel Zarate ◽  
Carlos Buckle ◽  
Renato Mazzanti ◽  
Gustavo Samec

Scientific publication services are changing drastically: researchers demand intelligent search services to discover and relate scientific publications, and publishers need to incorporate semantic information to better organize their digital assets and make publications more discoverable. In this paper, we present the ongoing work to publish a subset of the scientific publications of CONICET Digital as Linked Open Data. The objective of this work is to improve the recovery and reuse of data through Semantic Web technologies and Linked Data in the domain of scientific publications. To achieve these goals, Semantic Web standards and reference RDF schemas have been taken into account (Dublin Core, FOAF, VoID, etc.). The conversion and publication process is guided by the methodological guidelines for publishing government linked data. We also outline how these data can be linked to other datasets on the web of data, such as DBLP, Wikidata and DBpedia. Finally, we show some examples of queries that answer questions that CONICET Digital initially could not.
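As a minimal sketch of the kind of mapping such a conversion involves, the snippet below serializes one hypothetical publication record to Turtle using Dublin Core and FOAF, two of the vocabularies named above. The record URI and field values are invented, and a production pipeline would use an RDF library such as rdflib rather than string templates.

```python
def record_to_turtle(uri, title, creator, year):
    """Serialize a single publication record as Turtle, mapping its
    fields to Dublin Core terms and FOAF classes."""
    return "\n".join([
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "",
        f"<{uri}> a foaf:Document ;",
        f'    dcterms:title "{title}" ;',
        f'    dcterms:creator [ a foaf:Person ; foaf:name "{creator}" ] ;',
        f'    dcterms:issued "{year}" .',
    ])

# Hypothetical record URI, for illustration only.
print(record_to_turtle(
    "http://example.org/conicet/publication/123",
    "Linked Open Data at CONICET Digital",
    "J. Researcher",
    2019,
))
```

Once loaded into a triple store, records in this shape can be queried with SPARQL and interlinked with external datasets via owl:sameAs-style assertions.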


Author(s):  
Tetiana Yaroshenko

Open Access to scientific information and transparency of research processes and data are among the most important conditions for the progress of science and scientific communication, and the basis of international collaboration among researchers globally. The COVID-19 global pandemic has once again highlighted the need for open, efficient and equal access to scientific information for researchers, regardless of geographical, gender or any other constraints, promoting the exchange of scientific knowledge and data, scientific cooperation and scientific decision-making. The Internet has radically changed scientific communication, particularly the model of peer-reviewed scientific journals and the way readers find and access scientific information. Digital access is now the norm, thanks to the Open Access model. Although 20 years have passed since the announcement of the Budapest Open Access Initiative, and despite many achievements and advantages, there are still obstacles to the implementation of this model: there is some resistance from commercial publishers and other providers, and discussions continue in the academic world. The Open Access model is already supported by various strategies, policies, platforms and applications, but is not yet fully established. Various business models for scientific journals are still being tested, a culture of preprints is being formed, and discussions are underway on the ethics of scientific publications, intellectual property, the need to finance the dissemination of research results, and so on. Various platforms and applications are being developed to help researchers "discover" research results. Nevertheless, this is not enough: it is important to "discover" not only the results but also the research data, allowing them to be used for further research in the global world. Thus, the concepts and practices of Open Science, Open Data, the development of research infrastructures, etc., are developing quite rapidly.
The article considers the main stages of this 20-year path and outlines the main components and trends of the current stage. Emphasis is placed on the need to form a culture of Open Science and create incentives for its implementation, promoting innovative methods of Open Science at different stages of the scientific process, the need for European integration of Ukrainian e-infrastructure development, and the need for socio-cultural and technological change. The main international and domestic practices and projects in Open Access and Open Science, particularly the National Repository of Academic Texts and the draft National Plan of Open Science, are considered. The role of libraries and librarians in implementing the principles of Open Access and Open Science is emphasized.


Publications ◽  
2019 ◽  
Vol 7 (2) ◽  
pp. 38 ◽  
Author(s):  
Lyubomir Penev ◽  
Mariya Dimitrova ◽  
Viktor Senderov ◽  
Georgi Zhelezov ◽  
Teodor Georgiev ◽  
...  

Hundreds of years of biodiversity research have resulted in the accumulation of a substantial pool of communal knowledge; however, most of it is stored in silos isolated from each other, such as published articles or monographs. The need for a system to store and manage collective biodiversity knowledge in a community-agreed and interoperable open format has evolved into the concept of the Open Biodiversity Knowledge Management System (OBKMS). This paper presents OpenBiodiv: An OBKMS that utilizes semantic publishing workflows, text and data mining, common standards, ontology modelling and graph database technologies to establish a robust infrastructure for managing biodiversity knowledge. It is presented as a Linked Open Dataset generated from scientific literature. OpenBiodiv encompasses data extracted from more than 5000 scholarly articles published by Pensoft and many more taxonomic treatments extracted by Plazi from journals of other publishers. The data from both sources are converted to Resource Description Framework (RDF) and integrated in a graph database using the OpenBiodiv-O ontology and an RDF version of the Global Biodiversity Information Facility (GBIF) taxonomic backbone. Through the application of semantic technologies, the project showcases the value of open publishing of Findable, Accessible, Interoperable, Reusable (FAIR) data towards the establishment of open science practices in the biodiversity domain.
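To give a flavour of the graph queries an OBKMS enables, here is a toy in-memory triple store with pattern matching analogous to a SPARQL basic graph pattern. The property and resource names are invented for illustration and are not actual OpenBiodiv-O terms.

```python
# A toy triple store: each triple is (subject, predicate, object).
triples = {
    (":treatment1", ":mentionsTaxon", ":Aus_bus"),
    (":treatment1", ":publishedIn", ":article42"),
    (":treatment2", ":mentionsTaxon", ":Aus_bus"),
}

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard,
    like a variable in a SPARQL basic graph pattern."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# All treatments mentioning the taxon :Aus_bus, across source articles.
print(match(p=":mentionsTaxon", o=":Aus_bus"))
```

In the real system, an equivalent SPARQL query over the graph database would join treatments extracted by Pensoft and Plazi with the GBIF taxonomic backbone.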


2018 ◽  
Vol 374 (1763) ◽  
pp. 20170391 ◽  
Author(s):  
Gil Nelson ◽  
Shari Ellis

The first two decades of the twenty-first century have seen a rapid rise in the mobilization of digital biodiversity data. This has thrust natural history museums into the forefront of biodiversity research, underscoring their central role in the modern scientific enterprise. The advent of mobilization initiatives such as the United States National Science Foundation's Advancing Digitization of Biodiversity Collections (ADBC), Australia's Atlas of Living Australia (ALA), Mexico's National Commission for the Knowledge and Use of Biodiversity (CONABIO), Brazil's Centro de Referência em Informação (CRIA) and China's National Specimen Information Infrastructure (NSII) has led to a rapid rise in data aggregators and an exponential increase in digital data for scientific research, arguably providing the best evidence of where species live. The international Global Biodiversity Information Facility (GBIF) now serves about 131 million museum specimen records, and Integrated Digitized Biocollections (iDigBio) in the USA has amassed more than 115 million. These resources expose collections to a wider audience of researchers, provide the best biodiversity data in the modern era outside of nature itself and ensure the primacy of specimen-based research. Here, we provide a brief history of worldwide data mobilization, their impact on biodiversity research, challenges for ensuring data quality, their contribution to scientific publications and evidence of the rising profiles of natural history collections. This article is part of the theme issue ‘Biological collections for understanding biodiversity in the Anthropocene’.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 253
Author(s):  
Daniel Nüst ◽  
Stephen J. Eglen

The traditional scientific paper falls short of effectively communicating computational research.  To help improve this situation, we propose a system by which the computational workflows underlying research articles are checked. The CODECHECK system uses open infrastructure and tools and can be integrated into review and publication processes in multiple ways. We describe these integrations along multiple dimensions (importance, who, openness, when). In collaboration with academic publishers and conferences, we demonstrate CODECHECK with 25 reproductions of diverse scientific publications. These CODECHECKs show that asking for reproducible workflows during a collaborative review can effectively improve executability. While CODECHECK has clear limitations, it may represent a building block in Open Science and publishing ecosystems for improving the reproducibility, appreciation, and, potentially, the quality of non-textual research artefacts. The CODECHECK website can be accessed here: https://codecheck.org.uk/.
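At its core, a check of a computational workflow reduces to re-running the declared commands and recording evidence of their outputs. The sketch below is a minimal illustration of that idea, not the CODECHECK project's actual tooling: it runs a command and records a checksum for each expected output file, so the check itself leaves a verifiable trace.

```python
import hashlib
import subprocess
import sys
import tempfile
from pathlib import Path

def check_workflow(workdir, command, expected_outputs):
    """Run a declared workflow command in workdir, then record a SHA-256
    checksum for each expected output file (None if it was not produced).
    A failing command raises, so a non-executable workflow is caught."""
    subprocess.run(command, cwd=workdir, check=True)
    report = {}
    for name in expected_outputs:
        path = Path(workdir, name)
        report[name] = (hashlib.sha256(path.read_bytes()).hexdigest()
                        if path.exists() else None)
    return report

# Demo: a one-line "analysis script" that writes its result file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "analysis.py").write_text("open('result.txt', 'w').write('42\\n')\n")
    report = check_workflow(tmp, [sys.executable, "analysis.py"], ["result.txt"])
    print(report)
```

A real check would additionally pin the software environment (e.g. a container image) and compare outputs against those reported in the paper.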


2018 ◽  
Vol 60 (3) ◽  
pp. 192-198
Author(s):  
Dorota Grygoruk

Abstract The development of information technology makes it possible to collect and analyse ever larger data resources. The results of research, regardless of the discipline, constitute one of the main sources of data. Currently, research results are increasingly being published in the Open Access model. The Open Access concept has been accepted and recommended worldwide by many institutions financing and implementing research. Initially, the idea of openness concerned only the results of research and scientific publications; at present, more attention is paid to the problem of sharing scientific data, including raw data. The transition towards open data is intricate, as the specificity of data requires the development of an appropriate legal, technical and organizational model, followed by the implementation of data management policies at both the institutional and national levels. The aim of this publication was to present the development of the open data concept in the context of the open access idea and the problems related to defining data in the process of data sharing and data management.


2019 ◽  
Author(s):  
Andrea Abele-Brehm ◽  
Mario Gollwitzer ◽  
Ulf Steinberg ◽  
Felix D. Schönbrodt

Central values of science are, among others, transparency, verifiability, replicability and openness. The currently very prominent Open Science (OS) movement supports these values. Among its most important principles are open methodology (comprehensive and useful documentation of methods and materials used), open access to published research output, and open data (making collected data available for re-analyses). We here present a survey conducted among members of the German Psychological Society (N = 337), in which we applied a mixed-methods approach (quantitative and qualitative data) to assess attitudes towards OS in general and towards data sharing more specifically. Attitudes towards OS were distinguished into positive expectations (“hopes”) and negative expectations (“fears”). These were uncorrelated. There were generally more hopes associated with OS and data sharing than fears. Both hopes and fears were highest among early career researchers and lowest among professors. The analysis of the open answers revealed that generally positive attitudes towards data sharing (especially sharing of data related to a published article) are somewhat diminished by cost/benefit considerations. The results are discussed with respect to individual researchers’ behavior and with respect to structural changes in the research system.


Author(s):  
C. Willmes ◽  
D. Becker ◽  
J. Verheul ◽  
Y. Yener ◽  
M. Zickel ◽  
...  

Paleoenvironmental studies and the corresponding information (data) are abundantly published and available in the scientific record. However, GIS-based paleoenvironmental information and datasets are comparatively rare. Here, we present an Open Science approach for creating GIS-based data and maps of paleoenvironments and publishing them Open Access in a web-based Spatial Data Infrastructure (SDI), for access by the archaeology and paleoenvironment communities. We introduce an approach to gather and create GIS datasets from published non-GIS-based facts and information (data), such as analogue maps, textual information or figures in scientific publications. These collected and created geo-datasets and maps are then published, including a Digital Object Identifier (DOI) to facilitate scholarly reuse and citation of the data, in a web-based Open Access research data management infrastructure. The geo-datasets are additionally published in an Open Geospatial Consortium (OGC) standards-compliant SDI and are available for GIS integration via OGC Open Web Services (OWS).
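Access via OGC Open Web Services means a standard request is enough to retrieve a published map layer in any GIS client. The sketch below assembles a WMS 1.3.0 GetMap request; the endpoint and layer name are placeholders, and real values would come from the SDI's WMS capabilities document.

```python
from urllib.parse import urlencode

def wms_getmap_url(endpoint, layer, bbox, width=800, height=600,
                   crs="EPSG:4326", fmt="image/png"):
    """Build an OGC WMS 1.3.0 GetMap request URL. Note that with
    EPSG:4326 in WMS 1.3.0, the BBOX axis order is lat,lon (i.e.
    min_lat, min_lon, max_lat, max_lon)."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{endpoint}?{urlencode(params)}"

# Hypothetical endpoint and layer name, for illustration only.
print(wms_getmap_url("https://example.org/geoserver/wms",
                     "sdi:lgm_vegetation", (35.0, -10.0, 60.0, 30.0)))
```

The same SDI would typically also expose the raw vector or raster data through a companion WFS or WCS service for full GIS integration.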

