Data Management for Early Career Scientists – How to Tame the Elephant

Author(s):  
Laia Comas-Bru ◽  
Marcus Schmidt

Data management can be overwhelming, especially for Early Career Scientists. To give them a kick-start, the World Data System (WDS) organised a 3-day EGU-sponsored workshop on current achievements and future challenges in November 2019 in Paris. The purpose of the workshop was to provide Early Career Scientists with practical skills in data curation and management through a combination of practical sessions, group discussions and lectures. Participants were introduced to what research data are and to the common vocabulary to be used during the workshop. Later, a World Café session provided an opportunity to discuss individual data management challenges and expectations of the workshop in small groups of peers. Lectures and discussions revolved around Open Science, Data Management Plans (DMP), data exchange, copyright and plagiarism, and the use of Big Data, ontologies and cloud platforms in science. Finally, the roles and responsibilities of the WDS as well as its WDS Early Career Researcher Network were discussed. To wrap up the workshop, attendees were walked through what a data repository is and how repositories obtain their certifications. This PICO presentation, given by two attendees of the workshop, will showcase the main topics of discussion on data management and curation, provide key examples with special emphasis on the importance of creating a DMP at an early stage of a research project, and share practical tools and advice on how to make data management more accessible.

2019 ◽  
Author(s):  
Trond Kvamme ◽  
Philipp Conzett

Norway has been selected as a new national node in RDA (Research Data Alliance). Until the end of the project in May 2020, the node will be engaging with research communities, supporting national agendas, and contributing to the EU Open Science Strategy to ensure capillary uptake of RDA principles and outputs. Moreover, it will work to increase participation in RDA nationally. The Norwegian RDA node (NO-RDA) will be run by a consortium of seven partners, each with a specific role in the activities around the node, and led by NSD - Norwegian Centre for Research Data. In addition to NSD, the node consists of NTNU, UiB, UiO, UiT, Unit and Uninett/Sigma2. NO-RDA will focus on supporting the implementation of RDA outputs and recommendations and on areas of strategic importance for the Nordic region, such as Data Management Plans, FAIR Data Stewardship and management of sensitive data in research within the framework of current international and statutory regulations. The Research Data Alliance (RDA) was launched as a community-driven initiative in 2013 by the European Commission, the United States Government's National Science Foundation and National Institute of Standards and Technology, and the Australian Government’s Department of Innovation with the goal of building the social and technical infrastructure to enable open sharing and re-use of data. RDA has a grass-roots, inclusive approach covering all data lifecycle stages, engaging data producers, users and stewards, and addressing data exchange, processing, and storage. It has succeeded in creating a neutral social platform where international research data experts meet to exchange views and to agree on topics including social hurdles to data sharing, education and training challenges, data management plans and certification of data repositories, disciplinary and interdisciplinary interoperability, as well as technological aspects.


2021 ◽  
Author(s):  
Renato Alves ◽  
Dimitrios Bampalikis ◽  
Leyla Jael Castro ◽  
José María Fernández ◽  
Jennifer Harrow ◽  
...  

Data Management Plans are now considered a key element of Open Science. They describe the data management life cycle for the data to be collected, processed and/or generated within the lifetime of a particular project or activity. A Software Management Plan (SMP) plays the same role but for software. Beyond its management perspective, the main advantage of an SMP is that it both provides clear context to the software that is being developed and raises awareness. Although a few SMPs are already available, most of them require significant technical knowledge to be used effectively. ELIXIR has developed a low-barrier SMP, specifically tailored for life science researchers and aligned with the FAIR Research Software principles. Starting from the Four Recommendations for Open Source Software, the ELIXIR SMP was iteratively refined by surveying the practices of the community and incorporating the feedback received. The ELIXIR SMP is currently available as a survey; future plans include a human- and machine-readable version that can be automatically queried and connected to relevant tools and metrics within the ELIXIR Tools ecosystem and beyond.


2019 ◽  
Vol 15 (2) ◽  
Author(s):  
Viviane Santos de Oliveira Veiga ◽  
Patricia Henning ◽  
Simone Dib ◽  
Erick Penedo ◽  
Jefferson Da Costa Lima ◽  
...  

ABSTRACT This article discusses the role of data management plans as a tool to facilitate data management throughout the research life cycle. Opening research data is a priority on scientific agendas, as it increases both the visibility and transparency of investigations and the capacity to reproduce and reuse data in new research. Within this context, the FAIR principles, an acronym for Findable, Accessible, Interoperable and Reusable, are paramount, as they establish basic guidelines for research data management, curation and preservation aimed at sharing and reuse. This work aims to present a proposal for a Data Management Plan template, aligned with the FAIR principles, for the Fundação Oswaldo Cruz. The methodology is bibliographic research and documentary analysis of several European data management plans. We conclude that adopting a management plan in the scientific practices of universities and research institutions is fundamental. However, to take full advantage of this activity, all actors involved in the process must participate, and the plan must be machine-actionable. Keywords: Data Management Plan; Research Data; FAIR Principles; Machine-Actionable DMP; Open Science.


2017 ◽  
Vol 12 (1) ◽  
pp. 22-35 ◽  
Author(s):  
Tomasz Miksa ◽  
Andreas Rauber ◽  
Roman Ganguly ◽  
Paolo Budroni

Data management plans are free-form text documents describing the data used and produced in scientific experiments. The complexity of data-driven experiments requires precise descriptions of tools and datasets used in computations to enable their reproducibility and reuse. Data management plans fall short of these requirements. In this paper, we propose machine-actionable data management plans that cover the same themes as standard data management plans, but particular sections are filled with information obtained from existing tools. We present mapping of tools from the domains of digital preservation, reproducible research, open science, and data repositories to data management plan sections. Thus, we identify the requirements for a good solution and identify its limitations. We also propose a machine-actionable data model that enables information integration. The model uses ontologies and is based on existing standards.
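
As a rough illustration of the idea described in this abstract, the sketch below shows a plan section being filled from tool output and then serialized and queried by a program. It is not the authors' data model, which the abstract says is based on ontologies and existing standards. All names here (`Dataset`, `MachineActionableDMP`, `add_from_tool_output`, and the record keys `title`, `format`, `size`, `license`) are hypothetical and chosen only for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Dataset:
    """One dataset entry in the plan (hypothetical, simplified fields)."""
    title: str
    file_format: str
    size_bytes: int
    license: str = "unspecified"


@dataclass
class MachineActionableDMP:
    """A toy plan whose dataset section is filled from tool reports."""
    project: str
    datasets: List[Dataset] = field(default_factory=list)

    def add_from_tool_output(self, record: dict) -> None:
        # 'record' stands in for a report from an existing tool, such as a
        # file-format identifier; the key names are assumptions.
        self.datasets.append(
            Dataset(
                title=record["title"],
                file_format=record["format"],
                size_bytes=int(record["size"]),
                license=record.get("license", "unspecified"),
            )
        )

    def missing_licenses(self) -> List[str]:
        # A machine-actionable plan can be checked automatically,
        # here for datasets that do not yet state a license.
        return [d.title for d in self.datasets if d.license == "unspecified"]

    def to_json(self) -> str:
        # Serialize the whole plan so that other systems can read it.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    dmp = MachineActionableDMP(project="demo-project")
    dmp.add_from_tool_output({"title": "raw-traces", "format": "CSV", "size": "1024"})
    dmp.add_from_tool_output(
        {"title": "processed", "format": "HDF5", "size": 2048, "license": "CC-BY-4.0"}
    )
    print(dmp.to_json())
    print("Datasets without a license:", dmp.missing_licenses())
```

Unlike a free-form text plan, a structure of this kind can be validated and exchanged between tools without a person re-reading it.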


2020 ◽  
Author(s):  
Irene DeFelipe ◽  
Juan Alcalde ◽  
Monika Ivandic ◽  
David Martí ◽  
Mario Ruiz ◽  
...  

Abstract. Seismic reflection data (normal incidence and wide-angle) are unique assets for Solid Earth Science as they provide critical information about the physical properties and structure of the lithosphere, as well as about the shallow subsurface for exploration purposes. The resolution of these seismic data is highly appreciated; however, they are logistically complex and expensive to acquire, and their geographical coverage is limited. Therefore, it is essential to make the most of the data that have already been acquired. The collation and dissemination of open access seismic data are then key to promoting accurate and innovative research and to enhancing new interpretations of legacy data. This work presents the Seismic DAta REpository (SeisDARE), which is, to our knowledge, one of the first comprehensive open access online databases that stores seismic data registered with a permanent identifier (DOI). The datasets included here are openly accessible online and guarantee the FAIR (Findable, Accessible, Interoperable, Reusable) principles of data management, granting the inclusion of each dataset in a statistics referencing database so its impact can be measured. SeisDARE includes seismic data acquired in the last four decades in the Iberian Peninsula and Morocco. These areas have attracted the attention of international researchers in the fields of geology and geophysics due to the exceptional outcrops of the Variscan and Alpine orogens and wide foreland basins; the crustal structure of the offshore margins that resulted from a complex plate kinematic evolution; and the vast quantities of natural resources contained within. This database has been built thanks to a network of national and international institutions, promoting multidisciplinary research, and is open for international data exchange and collaborations.
As part of this international collaboration, and as a model for inclusion of other global seismic datasets, SeisDARE also hosts seismic data acquired in Hardeman County, Texas (USA), within the COCORP project (Consortium for Continental Reflection Profiling). SeisDARE aims to make old and recently acquired seismic data easily accessible and to establish a framework for future seismic data management plans. SeisDARE is freely available at https://digital.csic.es/handle/10261/101879, bringing endless research and teaching opportunities to the scientific, industrial and educational communities.


2021 ◽  
Vol 13 (3) ◽  
pp. 1053-1071 ◽  
Author(s):  
Irene DeFelipe ◽  
Juan Alcalde ◽  
Monika Ivandic ◽  
David Martí ◽  
Mario Ruiz ◽  
...  

Abstract. Seismic reflection data (normal incidence and wide angle) are unique assets for solid Earth sciences as they provide critical information about the physical properties and structure of the lithosphere as well as about the shallow subsurface for exploration purposes. The resolution of these seismic data is highly appreciated; however, they are logistically complex and expensive to acquire, and their geographical coverage is limited. Therefore, it is essential to make the most of the data that have already been acquired. The collation and dissemination of seismic open-access data are then key to promoting accurate and innovative research and to enhancing new interpretations of legacy data. This work presents the Seismic DAta REpository (SeisDARE), which is, to our knowledge, one of the first comprehensive open-access online databases that stores seismic data registered with a permanent identifier (DOI). The datasets included here are openly accessible online and guarantee the FAIR (findable, accessible, interoperable, reusable) principles of data management, granting the inclusion of each dataset in a statistics referencing database so its impact can be measured. SeisDARE includes seismic data acquired in the last 4 decades in the Iberian Peninsula and Morocco. These areas have attracted the attention of international researchers in the fields of geology and geophysics due to the exceptional outcrops of the Variscan and Alpine orogens and wide foreland basins, the crustal structure of the offshore margins that resulted from a complex plate kinematic evolution, and the vast quantities of natural resources contained within. This database has been built thanks to a network of national and international institutions, promoting multidisciplinary research, and is open for international data exchange and collaborations.
As part of this international collaboration, and as a model for inclusion of other global seismic datasets, SeisDARE also hosts seismic data acquired in Hardeman County, Texas (USA), within the COCORP project (Consortium for Continental Reflection Profiling). SeisDARE aims to make easily accessible old and recently acquired seismic data and to establish a framework for future seismic data management plans. SeisDARE is freely available at https://digital.csic.es/handle/10261/101879 (a detailed list of the datasets can be found in Table 1), bringing endless research and teaching opportunities to the scientific, industrial, and educational communities.


2021 ◽  
Vol 3 (1) ◽  
pp. 189-204
Author(s):  
Hua Nie ◽  
Pengcheng Luo ◽  
Ping Fu

Research Data Management (RDM) has become increasingly important for a growing number of academic institutions. Using the Peking University Open Research Data Repository (PKU-ORDR) project as an example, this paper reviews a library-based, university-wide open research data repository project and the implementation process of related RDM services, including project kickoff, needs assessment, establishment of partnerships, software investigation and selection, software customization, and data curation services and training. Issues revealed during the stages of the implementation process are also discussed and addressed in the paper, such as awareness of research data, demands from data providers and users, data policies and requirements from the home institution, requirements from funding agencies and publishers, the collaboration between administrative units and libraries, and concerns from data providers and users. The significance of the study is that it offers an example of creating an Open Data repository and RDM services for other Chinese academic libraries planning to implement RDM services for their home institutions. The authors have also observed that, since the PKU-ORDR and RDM services were implemented in 2015, the Peking University Library (PKUL) has supported numerous researchers throughout the entire research life cycle, enhanced Open Science (OS) practices on campus, and influenced the national OS movement in China through various national events and activities hosted by the PKUL.


2020 ◽  
Author(s):  
Magdalena Szuflita-Żurawska ◽  
Anna Wałek

The Open Science Competence Center at the Gdańsk University of Technology Library was established under the Bridge of Data project at the end of 2018. Our main goals include supporting the academic community on a broad range of issues associated with Open Science, especially Open Research Data. Our team of professionals helps researchers with many topics, such as "what kinds of data you need to share", "how to make your data openly available to others", or "how to create a Data Management Plan", which has recently been the most popular and demanding service. One of the main challenges in supporting academic staff with Data Management Plans is dealing with the legal impediments to providing open access to, and reuse of, research data from publicly funded scientific projects. A lack of understanding of the legal issues involved in opening research is a significant barrier to Open Science. Much publicly funded research requires the preparation of a Data Management Plan that, among other items, provides information about ownership and user rights. One of the most common tasks for scholars is choosing which license (if any) they should use when disseminating their scientific output. However, in many cases, choosing the right license for research data is not enough. Academic staff face many tensions arising from a lack of clarity around legal requirements and obstacles. Researchers' increasing need to understand and describe conflicting issues (e.g. patenting) leads them to look for professional and knowledgeable support at the university. We examine the most frequent legal issues arising in DMPs from three scientific disciplines: chemistry (e.g. ethical papers), economics (e.g. data value cycle), and civil engineering (e.g. complexity of construction data).
In our presentation, we would like to introduce the main problems identified and show how mapping and benchmarking the problems occurring across those disciplines helps us to establish more efficient legal support for researchers.


Terminology ◽  
2019 ◽  
Vol 25 (2) ◽  
pp. 146-174
Author(s):  
Paula Zorrilla-Agut ◽  
Thierry Fontenelle

Abstract The redevelopment of the European Union’s interinstitutional terminology database IATE (InterActive Terminology for Europe) has been an opportunity to rethink the technologies, architecture and data structure of the system in order to prepare it for future challenges, including interoperability, modularity, scalability and data exchange, among other things. This article describes which strategies are being put in place to allow IATE data – one of the largest multilingual terminology databases in the world – to be consumed by third-party tools, particularly computer-assisted translation environments (CATEs). The modernisation of the application, aligning it with the latest software and systems engineering standards and technologies for the benefit of all users and for improved data management by EU linguists, is also described.

