Data Management Experiences and Best Practices from the Perspective of a Plant Research Institute

Author(s):  
Daniel Arend ◽  
Christian Colmsee ◽  
Helmut Knüpffer ◽  
Markus Oppermann ◽  
Uwe Scholz ◽  
...

Neuroforum ◽
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Michael Denker ◽  
Sonja Grün ◽  
Thomas Wachtler ◽  
Hansjörg Scherberger

Abstract Preparing a neurophysiological dataset for sharing and publication is hard. Many of the available tools and services for providing a smooth data-publication workflow are still maturing and not well integrated. Best practices and concrete examples of how to create a rigorous and complete package for an electrophysiology experiment are also still lacking. Given the heterogeneity of the field, such unifying guidelines and processes can only be formulated as a community effort. One of the goals of the NFDI-Neuro consortium initiative is to build such a community for systems and behavioral neuroscience. NFDI-Neuro aims to address the community's needs, to make data management easier, and to tackle these challenges in collaboration with various international initiatives (e.g., INCF, EBRAINS). This will give scientists the opportunity to spend more time analyzing the wealth of electrophysiological data they collect, rather than dealing with data formats and data integrity.


2018 ◽  
Vol 72 (3) ◽  
pp. 332-337
Author(s):  
Deb Autor ◽  
Zena Kaufman ◽  
Ron Tetzlaff ◽  
Maryann Gribbin ◽  
Madlene Dole ◽  
...  

2021 ◽  
Vol 50 (2) ◽  
pp. 30-32
Author(s):  
Patrick Valduriez

I have been working on data management research for the last 40 years. I like my job and my research institution (Inria, the French national research institute for computer science), which have offered me great opportunities to learn a lot, do good work, get to know smart and nice people, and overall feel useful. However, since the early days of my career, the research environment, in both academia and industry, has certainly become more complex, making the move from junior (or pre-tenure) researcher to senior researcher quite challenging. Based on my experience, I review some of the main questions and challenges and give some hints on how to deal with them, sometimes using stories and anecdotes to illustrate the point.


2020 ◽  
Author(s):  
Paolo Oliveri ◽  
Simona Simoncelli ◽ 
Pierluigi Di Pietro ◽ 
Sara Durante

One of the main challenges for present and future ocean observations is to establish best practices for data management: infrastructures such as Copernicus and SeaDataCloud already take responsibility for assembling, archiving, updating, and publishing data. Here we present the strengths and weaknesses of the SeaDataCloud temperature and salinity time-series data collections, and in particular a tool able to recognize the different devices and platforms and to merge them with processed Copernicus platforms.

While Copernicus's main target is to acquire and publish data quickly, SeaDataNet aims to publish data at the best available quality. These two data repositories should be considered together, since an originator can ingest data into both infrastructures, into only one, or partially into both. As a result, data are sometimes only partially available in Copernicus or SeaDataCloud, with a great impact on researchers who want to access as much data as possible. This reprocessing should not be loaded onto researchers' shoulders, since only users skilled in every part of the data management plan know how to merge the data.

The SeaDataCloud time-series data collection is a soon-to-be-published Global Ocean dataset that will become a reference for ocean researchers, released in the binary, user-friendly Ocean Data View format. The database management plan was originally designed for profiles but has been adapted for time series, resolving several issues such as the uniqueness of the identifiers (IDs).

Here we present an extension of the SOURCE (Sea Observations Utility for Reprocessing, Calibration and Evaluation) Python package, able to enhance data quality with redundant, sophisticated methods and to simplify their usage.

SOURCE improves quality-control (Q/C) performance on observations using statistical quality-check procedures that follow the ocean best-practices guidelines, carrying out the following tasks (steps 1 and 4 are sketched in the first code example below):

1. Find and aggregate all broken time series using the similarity of their ID strings;
2. Find all the different metadata variables and organize them in a dictionary;
3. Convert time-series timestamps to simpler measurement units;
4. Filter out devices that lie outside a selected horizontal rectangle;
5. Report the original Q/C scheme applied by the SeaDataCloud infrastructure;
6. Produce information tables on platforms and on the merged duplicate ID strings, together with an error log file (missing time, depth, or data; wrong Q/C variables; etc.).

In particular, the duplicates table and the log file may help the SeaDataCloud partners update the data collection and make it finally available to users.

The reconstructed SeaDataCloud time series, divided by parameter and stored in a more flexible dataset, can then be ingested into the main part of the software, which compares them with the Copernicus time series, finds the same platform using horizontal and vertical neighbourhoods (without relying on the IDs; see the second sketch below), finds and cleans up duplicated data, and merges the two databases to extend the data coverage.

This allows researchers to obtain the widest and best-quality data possible for the final user release, and to use these data to calibrate and validate models, in order to build a picture of sea conditions over a whole area.
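As a concrete illustration of steps 1 and 4 above, the following minimal Python sketch groups broken time series by ID-string similarity and filters devices against a horizontal rectangle. This is hypothetical code under our own assumptions (the function names, the difflib-based likeness measure, and the 0.9 threshold are illustrative), not the actual SOURCE API.

```python
from difflib import SequenceMatcher

def id_likeness(a: str, b: str) -> float:
    """Similarity ratio (0..1) between two platform ID strings."""
    return SequenceMatcher(None, a, b).ratio()

def group_broken_series(ids, threshold=0.9):
    """Greedily cluster IDs whose strings are nearly identical, so that
    fragments of the same broken time series end up in one group."""
    groups = []
    for pid in sorted(ids):
        for group in groups:
            if id_likeness(pid, group[0]) >= threshold:
                group.append(pid)
                break
        else:
            groups.append([pid])
    return groups

def inside_rectangle(lon, lat, lon_min, lon_max, lat_min, lat_max):
    """True if a device position falls inside the selected rectangle."""
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max

# Toy usage: two fragments of one mooring plus an unrelated glider.
ids = ["MO_Lion-Buoy_TEMP_001", "MO_Lion-Buoy_TEMP_002", "GL_Sicily-Glider_PSAL"]
print(group_broken_series(ids))
# [['GL_Sicily-Glider_PSAL'], ['MO_Lion-Buoy_TEMP_001', 'MO_Lion-Buoy_TEMP_002']]
print(inside_rectangle(5.2, 43.1, -6.0, 36.0, 30.0, 46.0))  # True (Mediterranean box)
```

A greedy pass like this is only a sketch; in the real collections the metadata dictionary of step 2 would also be needed to confirm that near-identical IDs truly describe the same platform.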
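The later cross-repository step, matching the same platform in SeaDataCloud and Copernicus by spatial proximity rather than by ID, can be pictured in the same hedged spirit. The record layout, the 5 km / 10 m neighbourhood thresholds, and the rule that the first repository wins on duplicated timestamps are all our own assumptions for illustration, not the package's documented behaviour.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two positions."""
    rlon1, rlat1, rlon2, rlat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((rlat2 - rlat1) / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin((rlon2 - rlon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def match_platforms(seadatacloud, copernicus, max_km=5.0, max_depth_m=10.0):
    """Pair platforms from the two catalogues that sit within a small
    horizontal and vertical neighbourhood, ignoring their IDs."""
    pairs = []
    for sdc in seadatacloud:
        for cop in copernicus:
            close_h = haversine_km(sdc["lon"], sdc["lat"],
                                   cop["lon"], cop["lat"]) <= max_km
            close_v = abs(sdc["depth"] - cop["depth"]) <= max_depth_m
            if close_h and close_v:
                pairs.append((sdc["id"], cop["id"]))
    return pairs

def merge_series(a, b):
    """Union of two {timestamp: value} records; on duplicated timestamps
    keep the first repository's value and drop the other copy."""
    merged = dict(b)
    merged.update(a)  # a (e.g. SeaDataCloud, best quality) wins on conflicts
    return dict(sorted(merged.items()))

# Toy usage: one mooring present in both catalogues under different IDs.
sdc = [{"id": "SDC_0001", "lon": 5.20, "lat": 43.10, "depth": 3.0}]
cop = [{"id": "COP_MO_42", "lon": 5.22, "lat": 43.11, "depth": 5.0}]
print(match_platforms(sdc, cop))  # [('SDC_0001', 'COP_MO_42')]
```

For collections with thousands of platforms, a spatial index (e.g., a KD-tree over the positions) would replace the quadratic loop, but the matching criterion stays the same.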

