Show Me The Data: The Pilot UK Research Data Registry

2014 ◽  
Vol 9 (1) ◽  
pp. 132-141
Author(s):  
Alexander Ball ◽  
Kevin Ashley ◽  
Patrick McCann ◽  
Laura Molloy ◽  
Veerle Van den Eynden

The UK Research Data (Metadata) Registry (UKRDR) pilot project is implementing a prototype registry for the UK’s research data assets, enabling the holdings of subject-based data centres and institutional data repositories alike to be searched from a single location. The purpose of the prototype is to prove the concept of the registry, and uncover challenges that will need to be addressed if and when the registry is developed into a sustainable service. The prototype is being tested using metadata records harvested from nine UK data centres and the data repositories of nine UK universities.

Author(s):  
Johannes Hubert Stigler ◽  
Elisabeth Steiner

Research data repositories and data centres are becoming increasingly important as infrastructures in academic research. The article introduces GAMS, a research data repository for the Humanities, covering its system architecture, preservation policy, and content policy. Challenges of data centres and repositories, together with general and domain-specific approaches and solutions, are outlined. Special emphasis lies on the sustainability and long-term perspective of such infrastructures, not only at the technical but above all at the organisational and financial level.


2020 ◽  
Vol 33 ◽  
pp. 01003
Author(s):  
Wouter Haak ◽  
Alberto Zigoni ◽  
Helen Kardinaal-de Mooij ◽  
Elena Zudilova-Seinstra

Institutions, funding bodies, and national research organizations are pushing for more data sharing and FAIR data. Institutions typically implement data policies, frequently supported by an institutional data repository, while funders typically mandate data sharing. So where does this leave the researcher? How can researchers benefit from doing the additional work to share their data? To ensure that researchers and institutions get credit for sharing their data, the data first needs to be tracked and attributed. In this paper we investigated where the research data of 11 research institutions ended up, and how this data is currently tracked and attributed. We also analysed the gap between the research data currently held in institutional repositories and where researchers truly share their data. We found that 10 of the 11 institutions have most of their public research data hosted outside their own institution; combined, they have 12% of their institutional research data published in institutional data repositories. According to our data, the typical institution had 5% of its research data (median) published in the institutional repository, though for 4 universities the share was 10% or higher. By combining existing data-to-article graphs with existing article-to-researcher and article-to-institution graphs, it becomes possible to increase the tracking of public research data, and therefore the visibility of researchers sharing their data, typically by a factor of 17. The tracking algorithm used to perform the analysis and report on potential improvements has subsequently been implemented as a standard method in the Mendeley Data Monitor product. The improvement is most likely an underestimate: while the recall for datasets in institutional repositories is 100%, that is not the case for datasets published outside the institutions, so there are even more datasets still to be discovered.
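The graph-combination step described in this abstract can be illustrated as a simple join over two mappings: datasets linked to the articles that cite them, and articles linked to institutions. The identifiers below are invented for illustration, and this sketch is not the actual Mendeley Data Monitor algorithm:

```python
# Sketch: attribute externally hosted datasets to institutions by joining
# a data-to-article graph with an article-to-institution graph.
# All identifiers below are hypothetical.

data_to_article = {
    "dataset:10.5281/zzz.1": "article:10.1000/aaa.1",
    "dataset:10.5281/zzz.2": "article:10.1000/aaa.2",
    "dataset:10.5281/zzz.3": "article:10.1000/aaa.1",
}

article_to_institution = {
    "article:10.1000/aaa.1": "University A",
    "article:10.1000/aaa.2": "University B",
}

def attribute_datasets(d2a, a2i):
    """Map each dataset to an institution via the article that cites it."""
    attribution = {}
    for dataset, article in d2a.items():
        institution = a2i.get(article)
        if institution is not None:
            attribution[dataset] = institution
    return attribution

print(attribute_datasets(data_to_article, article_to_institution))
```

Chaining in an article-to-researcher mapping in the same way would attribute each dataset to individual researchers as well, which is how sharing activity outside the institutional repository becomes visible.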


2011 ◽  
Vol 6 (2) ◽  
pp. 274-287 ◽  
Author(s):  
James A. J. Wilson ◽  
Luis Martinez-Uribe ◽  
Michael A. Fraser ◽  
Paul Jeffreys

This article outlines the work that the University of Oxford is undertaking to implement a coordinated data management infrastructure. The rationale for the approach being taken by Oxford is presented, with particular attention paid to the role of each service division. This is followed by a consideration of the relative advantages and disadvantages of institutional data repositories, as opposed to national or international data centres. The article then focuses on two ongoing JISC-funded projects, ‘Embedding Institutional Data Curation Services in Research’ (Eidcsr) and ‘Supporting Data Management Infrastructure for the Humanities’ (Sudamih). Both projects are intra-institutional collaborations and involve working with researchers to develop particular aspects of infrastructure, including: University policy, systems for the preservation and documentation of research data, training and support, software tools for the visualisation of large images, and creating and sharing databases via the Web (Database as a Service).


2014 ◽  
Vol 9 (1) ◽  
pp. 152-163 ◽  
Author(s):  
Sarah Callaghan ◽  
Jonathan Tedds ◽  
John Kunze ◽  
Varsha Khodiyar ◽  
Rebecca Lawrence ◽  
...  

This document summarises guidelines produced by the UK Jisc-funded PREPARDE data publication project on the key issues of repository accreditation. It aims to lay out the principles and requirements for data repositories that intend to provide datasets as part of the research record and as part of a research publication. The data publication requirements that repository accreditation may support are changing rapidly, so this paper is intended as a provocation for further discussion and development in the future.


2009 ◽  
Vol 31 (3) ◽  
pp. 21
Author(s):  
Robin Rice

DISC-UK DataShare Project: Building Exemplars for Institutional Data Repositories in the UK


2021 ◽  
pp. 016555152199863
Author(s):  
Ismael Vázquez ◽  
María Novo-Lourés ◽  
Reyes Pavón ◽  
Rosalía Laza ◽  
José Ramón Méndez ◽  
...  

Current research has evolved in such a way that scientists must not only adequately describe the algorithms they introduce and the results of their application, but also ensure that those results can be reproduced and compared with those obtained through other approaches. In this context, public data sets (sometimes shared through repositories) are one of the most important elements for the development of experimental protocols and test benches. This study analysed a significant number of CS/ML (Computer Science/Machine Learning) research data repositories and data sets and detected some limitations that hamper their utility. In particular, we identify and discuss the following demanding functionalities for repositories: (1) building customised data sets for specific research tasks, (2) facilitating the comparison of different techniques using dissimilar pre-processing methods, (3) ensuring the availability of software applications to reproduce the pre-processing steps without using the repository functionalities and (4) providing protection mechanisms for licensing issues and user rights. To demonstrate the proposed functionality, we created the STRep (Spam Text Repository) web application, which implements our recommendations adapted to the field of spam text repositories. In addition, we launched an instance of STRep at the URL https://rdata.4spam.group to facilitate understanding of this study.
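Functionalities (1) and (3) from this abstract can be sketched together: a customised data set is selected from a labelled corpus, and the pre-processing pipeline is recorded as an ordered list of named steps so it can be replayed without the repository's own tooling. The corpus entries and step names below are invented, and this is not the actual STRep implementation:

```python
# Sketch: build a customised spam/ham data set and keep the pre-processing
# pipeline as explicit, named steps for reproducibility.
# The corpus entries are hypothetical.

corpus = [
    {"text": "WIN a FREE prize NOW!!!", "label": "spam", "lang": "en"},
    {"text": "Meeting moved to 3pm",    "label": "ham",  "lang": "en"},
    {"text": "Gana un premio GRATIS",   "label": "spam", "lang": "es"},
]

def lowercase(text):
    return text.lower()

def strip_punctuation(text):
    return "".join(ch for ch in text if ch.isalnum() or ch.isspace())

# An ordered, named pipeline can be shipped alongside the data set,
# so the same steps can be re-applied outside the repository.
pipeline = [("lowercase", lowercase), ("strip_punctuation", strip_punctuation)]

def build_dataset(corpus, lang, pipeline):
    """Select entries by language and apply each pre-processing step in order."""
    out = []
    for entry in corpus:
        if entry["lang"] != lang:
            continue
        text = entry["text"]
        for _, step in pipeline:
            text = step(text)
        out.append({"text": text, "label": entry["label"]})
    return out

english = build_dataset(corpus, "en", pipeline)
```

Publishing the step names alongside the data set lets another group compare techniques under identical pre-processing, which is the comparison problem functionality (2) targets.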


2017 ◽  
Vol 12 (1) ◽  
pp. 88-105 ◽  
Author(s):  
Sünje Dallmeier-Tiessen ◽  
Varsha Khodiyar ◽  
Fiona Murphy ◽  
Amy Nurnberger ◽  
Lisa Raymond ◽  
...  

The data curation community has long encouraged researchers to document collected research data during active stages of the research workflow, to provide robust metadata earlier, and to support research data publication and preservation. Data documentation with robust metadata is one of a number of steps in effective data publication. Data publication is the process of making digital research objects 'FAIR', i.e. findable, accessible, interoperable, and reusable: attributes increasingly expected by research communities, funders and society. Research data publishing workflows are the means to that end. Currently, however, much published research data remains inconsistently and inadequately documented by researchers. Documentation of data closer in time to data collection would help mitigate the high cost that repositories associate with the ingest process. More effective data publication and sharing should in principle result from early interactions between researchers and their selected data repository. This paper describes a short study undertaken by members of the Research Data Alliance (RDA) and World Data System (WDS) working group on Publishing Data Workflows. We present a collection of recent examples of data publication workflows that connect data repositories and publishing platforms with research activity 'upstream' of the ingest process. We re-articulate previous recommendations of the working group to account for the varied upstream service components and platforms that support the flow of contextual and provenance information downstream. These workflows should be open and loosely coupled to support interoperability, including with preservation and publication environments. Our recommendations aim to stimulate further work on researchers' views of data publishing and the extent to which available services and infrastructure facilitate the publication of FAIR data. We also aim to stimulate further dialogue about, and definition of, the roles and responsibilities of research data services and platform providers for the 'FAIRness' of research data publication workflows themselves.

