Are Scientific Data Repositories Coping with Research Data Publishing?

2016 ◽  
Vol 15 ◽  
2017 ◽  
Vol 12 (1) ◽  
pp. 88-105 ◽  
Author(s):  
Sünje Dallmeier-Tiessen ◽  
Varsha Khodiyar ◽  
Fiona Murphy ◽  
Amy Nurnberger ◽  
Lisa Raymond ◽  
...  

The data curation community has long encouraged researchers to document collected research data during active stages of the research workflow, to provide robust metadata earlier, and to support research data publication and preservation. Data documentation with robust metadata is one of a number of steps in effective data publication. Data publication is the process of making digital research objects ‘FAIR’, i.e. findable, accessible, interoperable, and reusable; attributes increasingly expected by research communities, funders and society. Research data publishing workflows are the means to that end. Currently, however, much published research data remains inconsistently and inadequately documented by researchers. Documentation of data closer in time to data collection would help mitigate the high cost that repositories associate with the ingest process. More effective data publication and sharing should in principle result from early interactions between researchers and their selected data repository. This paper describes a short study undertaken by members of the Research Data Alliance (RDA) and World Data System (WDS) working group on Publishing Data Workflows. We present a collection of recent examples of data publication workflows that connect data repositories and publishing platforms with research activity ‘upstream’ of the ingest process. We re-articulate previous recommendations of the working group, to account for the varied upstream service components and platforms that support the flow of contextual and provenance information downstream. These workflows should be open and loosely coupled to support interoperability, including with preservation and publication environments. Our recommendations aim to stimulate further work on researchers’ views of data publishing and the extent to which available services and infrastructure facilitate the publication of FAIR data. We also aim to stimulate further dialogue about, and definition of, the roles and responsibilities of research data services and platform providers for the ‘FAIRness’ of research data publication workflows themselves.


2020 ◽  
Author(s):  
Graham Smith ◽  
Andrew Hufton

Researchers are increasingly expected by funders and journals to make their data available for reuse as a condition of publication. At Springer Nature, we feel that publishers must support researchers in meeting these additional requirements, and must recognise the distinct opportunities data holds as a research output. Here, we outline some of the varied ways that Springer Nature supports research data sharing and report on key outcomes.
Our staff and journals are closely involved with community-led efforts, like the Enabling FAIR Data initiative and the COPDESS 2014 Statement of Commitment [1-4]. The Enabling FAIR Data initiative, which was endorsed in January 2019 by Nature and Scientific Data, and by Nature Geoscience in January 2020, establishes a clear expectation that Earth and environmental sciences data should be deposited in FAIR [5] Data-aligned community repositories, when available (and in general purpose repositories otherwise). In support of this endorsement, Nature and Nature Geoscience require authors to share and deposit their Earth and environmental science data, and Scientific Data has committed to progressively updating its list of recommended data repositories to help authors comply with this mandate.
In addition, we offer a range of research data services, with various levels of support available to researchers in terms of data curation, expert guidance on repositories and linking research data and publications.
We appreciate that researchers face potentially challenging requirements in terms of the ‘what’, ‘where’ and ‘how’ of sharing research data. This can be particularly difficult for researchers to negotiate given the huge diversity of policies across different journals. We have therefore developed a series of standardised data policies, which have now been adopted by more than 1,600 Springer Nature journals.
We believe that these initiatives make important strides in challenging the current replication crisis and addressing the economic [6] and societal consequences of data unavailability. They also offer an opportunity to drive change in how academic credit is measured, through the recognition of a wider range of research outputs than articles and their citations alone. As signatories of the San Francisco Declaration on Research Assessment [7], Nature Research is committed to improving the methods of evaluating scholarly research. Research data in this context offers new mechanisms to measure the impact of all research outputs. To this end, Springer Nature supports the publication of peer-reviewed data papers through journals like Scientific Data. Analysis of citation patterns demonstrates that data papers can be well-cited, and offer a viable way for researchers to receive credit for data sharing through traditional citation metrics. Springer Nature is also working hard to improve support for direct data citation. In 2018 a data citation roadmap developed by the Publishers Early Adopters Expert Group was published in Scientific Data [8], outlining practical steps for publishers to work with data citations and associated benefits in transparency and credit for researchers. Using examples from this roadmap, its implementation and supporting services, we outline how a FAIR-led data approach from publishers can help researchers in the Earth and environmental sciences to capitalise on new expectations around data sharing.
1. https://doi.org/10.1038/d41586-019-00075-3
2. https://doi.org/10.1038/s41561-019-0506-4
3. https://copdess.org/enabling-fair-data-project/commitment-statement-in-the-earth-space-and-environmental-sciences/
4. https://copdess.org/statement-of-commitment/
5. https://www.force11.org/group/fairgroup/fairprinciples
6. https://op.europa.eu/en/publication-detail/-/publication/d375368c-1a0a-11e9-8d04-01aa75ed71a1
7. https://sfdora.org/read/
8. https://doi.org/10.1038/sdata.2018.259


2022 ◽  
Vol 29 (1) ◽  
pp. 91-101
Author(s):  
Gustavo Caetano Borges ◽  
Julio Cesar Dos Reis ◽  
Claudia Bauzer Medeiros

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, data meaning and their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in the agriculture domain.


2021 ◽  
Author(s):  
Gustavo Caetano Borges ◽  
Julio César dos Reis ◽  
Claudia Bauzer Medeiros

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, data meaning and their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in agriculture.


2020 ◽  
Author(s):  
Mario Gollwitzer ◽  
Andrea Abele-Brehm ◽  
Christian Fiebach ◽  
Roland Ramthun ◽  
Anne M. Scheel ◽  
...  

Providing access to research data collected as part of scientific publications and publicly funded research projects is now regarded as a central aspect of an open and transparent scientific practice and is increasingly being called for by funding institutions and scientific journals. To this end, researchers should strive to comply with the so-called FAIR principles (of scientific data management), that is, research data should be findable, accessible, interoperable, and reusable. Systematic data management supports these goals and, at the same time, makes it possible to achieve them efficiently. With these revised recommendations on data management and data sharing, which also draw on feedback from a 2018 survey of its members, the German Psychological Society (Deutsche Gesellschaft für Psychologie; DGPs) specifies important basic principles of data management in psychology. Initially, based on discipline-specific definitions of raw data, primary data, secondary data, and metadata, we provide recommendations on the degree of data processing necessary when publishing data. We then discuss data protection as well as aspects of copyright and data usage before defining the qualitative requirements for trustworthy research data repositories. This is followed by a detailed discussion of pragmatic aspects of data sharing, such as the differences between Type 1 and Type 2 data publications, restrictions on use (embargo period), the definition of "scientific use" by secondary users of shared data, and recommendations on how to resolve potential disputes. Particularly noteworthy is the new recommendation of distinct "access categories" for data, each with different requirements in terms of data protection or research ethics. These range from completely open data without usage restrictions ("access category 0") to data shared under a set of standardized conditions (e.g., reuse restricted to scientific purposes; "access category 1"), individualized usage agreements ("access category 2"), and secure data access under strictly controlled conditions (e.g., in a research data center; "access category 3"). The practical implementation of this important innovation, however, will require data repositories to provide the necessary technical functionalities. In summary, the revised recommendations aim to present pragmatic guidelines for researchers to handle psychological research data in an open and transparent manner, while addressing structural challenges to data sharing solutions that are beneficial for all involved parties.


2021 ◽  
pp. 016555152199863
Author(s):  
Ismael Vázquez ◽  
María Novo-Lourés ◽  
Reyes Pavón ◽  
Rosalía Laza ◽  
José Ramón Méndez ◽  
...  

Current research has evolved in such a way that scientists must not only adequately describe the algorithms they introduce and the results of their application, but also ensure the possibility of reproducing the results and comparing them with those obtained through other approaches. In this context, public data sets (sometimes shared through repositories) are one of the most important elements for the development of experimental protocols and test benches. This study has analysed a significant number of CS/ML (Computer Science/Machine Learning) research data repositories and data sets and detected some limitations that hamper their utility. Particularly, we identify and discuss the following demanding functionalities for repositories: (1) building customised data sets for specific research tasks, (2) facilitating the comparison of different techniques using dissimilar pre-processing methods, (3) ensuring the availability of software applications to reproduce the pre-processing steps without using the repository functionalities and (4) providing protection mechanisms for licencing issues and user rights. To show the introduced functionality, we created the STRep (Spam Text Repository) web application, which implements our recommendations adapted to the field of spam text repositories. In addition, we launched an instance of STRep at the URL https://rdata.4spam.group to facilitate understanding of this study.

