The Magnetics Information Consortium (MagIC) Data Repository: Successes and Continuing Challenges

Author(s):  
Nicholas Jarboe ◽  
Rupert Minnett ◽  
Catherine Constable ◽  
Anthony Koppers ◽  
Lisa Tauxe

<p>MagIC (earthref.org/MagIC) is an organization dedicated to improving research capacity in the Earth and Ocean sciences by maintaining an open community digital data archive for rock and paleomagnetic data, with portals that allow users to archive, search, visualize, download, and combine these versioned datasets. We are a signatory of the Enabling FAIR Data Commitment Statement of the Coalition for Publishing Data in the Earth and Space Sciences (COPDESS) and an approved repository for the Nature set of journals. We have been collaborating with EarthCube's GeoCodes data search portal, adding schema.org/JSON-LD headers to our data set landing pages and suggesting extensions to schema.org when needed. Collaboration with the Thematic Core Service Multi-scale Laboratories (TCS MSL) of the European Plate Observing System (EPOS) is ongoing, with MagIC sending its contributions' metadata to TCS MSL via DataCite records.</p><p>Improving and updating our data repository to meet the demands of the quickly changing landscape of data archival, retrieval, and interoperability is a challenging proposition. Most journals now require data to be archived in a "FAIR" repository, but the exact specifications of FAIR are still solidifying. Some journals vet repositories and maintain their own list of accepted ones, while others rely on other organizations to investigate and certify repositories. As part of the COPDESS group at Earth Science Information Partners (ESIP), we have been and will continue to be part of the discussion on the needed and desired features for acceptable data repositories.</p><p>We are actively developing our software and systems to meet the needs of our scientific community. 
Current issues we are confronting include developing workflows with journals for publishing a journal article and its data in MagIC simultaneously, sustaining data repository funding in light of the greater demands that data policy changes at journals place on repositories, and determining how best to share and expose metadata about our data holdings to organizations such as EPOS, EarthCube, and Google.</p>
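
As a rough illustration of what a schema.org/JSON-LD header on a data set landing page can look like, the following minimal sketch builds a schema.org Dataset record and wraps it in the script element used to embed JSON-LD in HTML. The function names, field selection, and all values (name, URL, DOI, keywords) are hypothetical placeholders chosen for illustration; they are not MagIC's actual markup or identifiers.

```python
import json


def dataset_jsonld(name, description, url, doi, keywords):
    """Build a minimal schema.org Dataset record as a JSON-LD dictionary.

    The choice of properties is an assumption made for this sketch.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": "https://doi.org/" + doi,
        "keywords": list(keywords),
    }


def as_script_tag(record):
    """Wrap a JSON-LD record in the script element used to embed it in HTML."""
    body = json.dumps(record, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values only; none of these identify a real contribution.
record = dataset_jsonld(
    name="Example paleomagnetic contribution",
    description="Hypothetical data set description.",
    url="https://example.org/dataset/123",
    doi="10.0000/example.123",
    keywords=["paleomagnetism", "rock magnetism"],
)
print(as_script_tag(record))
```

A harvester or search portal that reads landing pages can parse the contents of the script element as ordinary JSON, which is why the record is kept as a plain dictionary until the last step.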

2021 ◽  
Author(s):  
Shelley Stall ◽  
Helen Glaves ◽  
Brooks Hanson ◽  
Kerstin Lehnert ◽  
Erin Robinson ◽  
...  

<p>The Earth, space, and environmental sciences have made significant progress in awareness and implementation of policy and practice around the sharing of data, software, and samples. In particular, the Coalition for Publishing Data in the Earth and Space Sciences (https://copdess.org/) brings together data repositories and journals to discuss and address common challenges in support of more transparent and discoverable research and supporting data. Since the inception of COPDESS in 2014 and the completion of the Enabling FAIR Data Project in 2019, work has continued on improving availability statements for data and software as well as the corresponding citations.</p><p>As the broad research community continues to make progress around data and software management and sharing, COPDESS is focused on several key efforts. These include 1) supporting authors in identifying the most appropriate data repository for preservation, 2) validating that all manuscripts have data and software availability statements, 3) ensuring data and software citations are properly included and linked to the publication to support credit, and 4) encouraging adoption of best practices.</p><p>We will review the status of these current efforts around data and software sharing, the important role that repositories and researchers play in ensuring that automated credit and attribution elements are in place, and the recent publications on software citation guidance from the FORCE11 Software Implementation Working Group.</p>


2021 ◽  
Author(s):  
Ronald R. Gutierrez ◽  
Frank E. Escusa ◽  
Alice Lefebvre ◽  
Carlo Gualtieri ◽  
Francisco Nunez-Gonzalez ◽  
...  

<p>Open and data-driven paradigms have made it possible to answer fundamental scientific questions in disciplines such as astronomy, ecology and fluid mechanics, among others. Recently, the need to collaboratively build a large, engineered and freely accessible bed form database has been highlighted as a necessary step toward adopting these paradigms in bed form dynamics research.</p><p>Most large database architectures have followed the principles of relational database management systems (RDBMS). Recently, non-relational (NoSQL) architectures (e.g., key-value stores, graph databases, document-oriented databases) have been proposed to improve on the capabilities and flexibility of RDBMS. Both RDBMS and NoSQL architectures require an engineered metadata structure that defines the data taxonomy and structure, which are subsequently used to develop a metadata language for data querying. Past research suggests that developing a metadata language requires a collaborative and iterative approach.</p><p>Defining the data taxonomy and structure for bed form data may be challenging because: [1] there is no standardized protocol for conducting field and laboratory measurements; [2] existing bed form data are expected to have a wide spectrum of characteristics (e.g., length, format, resolution, structured or unstructured); and [3] bed forms are studied by scientists and engineers from different disciplines (e.g., geologists, ecologists, civil and water engineers).</p><p>In recent years, several data repositories have been built to manage large datasets related to the Earth System. One of these repositories is the Earth Science Information Partners, which has proposed standards to promote and improve the preservation, availability and overall quality of Earth System related data. 
These standards map the roles of participants (e.g., creators, intermediaries and end users) and deliver protocols to ensure proper data distribution and quality control.</p><p>This contribution presents the first iteration of a metadata language for subaqueous bed form data, named BedformsML0, which adopts the standards of the Earth Science Information Partners. BedformsML0 may serve as a prototype to describe bed form observations from field and laboratory measurements, model outputs, technical reports, scientific papers, post-processed data, etc. Biogeoenvironmental observations associated with bed form dynamics (e.g., hydrodynamics, turbulence, river and coastal morphology, biota density, habitat metrics, sediment transport, sediment properties, land use dynamics) may also be represented in BedformsML0. It could be improved in future iterations, through the collaboration of professionals from different Earth science fields, to also describe subaerial and extraterrestrial bed form data. Likewise, BedformsML0 can be used to select machine search queries for massive data processing and visualization of bed form observations.</p>


Author(s):  
Thomas Hedberg ◽  
Allison Barnard Feeney ◽  
Moneer Helu ◽  
Jaime A. Camelio

Industry has been chasing the dream of integrating and linking data across the product lifecycle and enterprises for decades. However, industry has been challenged by the fact that the context in which data are used varies based on the function/role in the product lifecycle that is interacting with the data. Holistically, the data across the product lifecycle must be considered an unstructured data set because multiple data repositories and domain-specific schema exist in each phase of the lifecycle. This paper explores a concept called the lifecycle information framework and technology (LIFT). LIFT is a conceptual framework for lifecycle information management and the integration of emerging and existing technologies, which together form the basis of a research agenda for dynamic information modeling in support of digital-data curation and reuse in manufacturing. This paper provides a discussion of the existing technologies and activities that the LIFT concept leverages. Also, the paper describes the motivation for applying such work to the domain of manufacturing. Then, the LIFT concept is discussed in detail, while underlying technologies are further examined and a use case is detailed. Lastly, potential impacts are explored.


2014 ◽  
Vol 6 (1) ◽  
pp. 123-145 ◽  
Author(s):  
S. Torres Valdés ◽  
S. C. Painter ◽  
A. P. Martin ◽  
R. Sanders ◽  
J. Felden

Abstract. We provide a data set assemblage of directly observed and derived fluxes of sedimenting material (total mass, POC, PON, bSiO2, CaCO3, PIC and lithogenic/terrigenous fluxes) obtained using sediment traps. This data assemblage contains over 5900 data points distributed across the Atlantic, from the Arctic Ocean to the Southern Ocean. Data from the Mediterranean Sea are also included. Data were compiled from a variety of sources: data repositories (e.g. BCO-DMO, PANGAEA®), time-series sites (e.g. BATS, CARIACO), published scientific papers and data provided by the originating principal investigators (PIs). All sources are specified within the combined data set. Data from the World Ocean Atlas 2009 were extracted to coincide with flux data to provide additional environmental information where available. Specifically, contemporaneous data were extracted for temperature, salinity, oxygen (concentration, AOU and percentage saturation), nitrate, phosphate and silicate. Data show a broad range of flux estimates, with marked differences between ocean domains. Data also reveal important differences in the contribution that a given variable makes to the total mass flux, which is relevant to understanding the factors that control the strength of the biological carbon pump. This data set has been submitted to the data repository PANGAEA® (http://www.pangaea.de), which has made it available under doi:10.1594/PANGAEA.807946.
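
A data set identifier written in the `doi:` form used above can be turned into a resolvable link by prefixing it with the public DOI resolver at https://doi.org/. The helper below is a minimal sketch of that conversion; the function name `doi_to_url` is hypothetical and the validation is deliberately simple (it only checks for the `10.` directory prefix that DOIs begin with).

```python
def doi_to_url(doi: str) -> str:
    """Convert 'doi:10.x/y' or a bare '10.x/y' string into a resolver URL."""
    cleaned = doi.strip()
    # Drop an optional, case-insensitive "doi:" label.
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # DOIs start with the "10." directory indicator.
    if not cleaned.startswith("10."):
        raise ValueError("not a DOI: " + repr(doi))
    return "https://doi.org/" + cleaned


print(doi_to_url("doi:10.1594/PANGAEA.807946"))
# → https://doi.org/10.1594/PANGAEA.807946
```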


KWALON ◽  
2016 ◽  
Vol 21 (1) ◽  
Author(s):  
René van Horik

Summary. Nowadays, research without a role for digital data and data analysis tools is barely possible. As a result, we see an increasing interest in research data management, as this enables the replication of research outcomes and the reuse of research data for new research activities. Data management planning outlines how to handle data, both during research and after the research is completed. Trusted data repositories are places where research data are archived and made available for the long term. This article covers the state of the art concerning data management and data repository demands, with a focus on qualitative data sets.


2013 ◽  
Vol 6 (2) ◽  
pp. 541-595 ◽  
Author(s):  
S. Torres-Valdés ◽  
S. C. Painter ◽  
A. P. Martin ◽  
R. Sanders ◽  
J. Felden

Abstract. We provide a data set assemblage of directly observed and derived fluxes of sedimenting material (total mass, POC, PON, BSiO2, CaCO3, PIC and lithogenic/terrigenous fluxes) obtained using sediment traps. This data assemblage contains over 5900 data points distributed across the Atlantic, from the Arctic Ocean to the Southern Ocean. Data from the Mediterranean Sea are also included. Data were compiled from a variety of sources: data repositories (e.g., BCO-DMO, PANGAEA), time series sites (e.g., BATS, CARIACO), published scientific papers and data provided by originating PIs. All sources are specified within the combined data set. Data from the World Ocean Atlas 2009 were extracted to coincide with flux data to provide additional environmental information where available. Specifically, contemporaneous data were extracted for temperature, salinity, oxygen (concentration, AOU and percentage saturation), nitrate, phosphate and silicate. Data show a broad range of flux estimates, with marked differences between ocean domains. Data also reveal important differences in the contribution that a given variable makes to the total mass flux, which is relevant to understanding the factors that control the strength of the biological carbon pump. The dataset is archived on the data repository PANGAEA® (http://www.pangaea.de) under doi:10.1594/PANGAEA.807946.


2014 ◽  
Vol 9 (1) ◽  
pp. 164-175 ◽  
Author(s):  
Sarah Callaghan ◽  
Jonathan Tedds ◽  
Rebecca Lawrence ◽  
Fiona Murphy ◽  
Timothy Roberts ◽  
...  

This article provides a selection of examples of the many ways that a link can be made between a journal article (whether in a data journal or otherwise) and a dataset held in a data repository. In some cases the method of linking is well established, while in others it has yet to be rolled out uniformly across the journal landscape. We explore ways in which these examples might be implemented in a data journal, such as Geoscience Data Journal, as explored by the PREPARDE project.


2021 ◽  
Vol 8 (3A) ◽  
Author(s):  
Fernando Barcellos Razuck

The scientific community both produces research data and uses research data to validate its work. Research data thus cease to be merely research products and become informational resources. In this context, digital data repositories play an extremely important role in the scientific research process, since they can be used to share, access, reuse and validate data. The recognition of research data as information has, in recent years, transformed the view that characterized them as simple by-products of research processes, to the point that researchers, academic institutions and research development agencies have begun to understand that these data serve as a source of informational resources for scientific research and science teaching. The objective of this work is therefore to make a preliminary survey of the types of research data generated at the Institute of Radiation Protection and Dosimetry (IRD). To this end, the papers published in the last year (2019) by the permanent professors of the Post-graduation Program in Radiation Protection and Dosimetry were analyzed. A table was then generated relating the Concentration Areas to technical information about the generation of research data. This analysis of the IRD data is an initial stage in supporting the creation of the Institute's digital data repository, which aims to make the research data available for use in other research.

