An RCT study of DataCite metadata completeness to quantify the benefits of metadata quality on dataset uptake and sharing
Properly describing and documenting data via rich metadata allows users to understand and track important details of the work. Richer metadata is thought to fuel discovery and innovation by increasing the discoverability and reusability of research and eliminating duplication of effort. Researchers are told to expend effort on improving their metadata, but the benefits of doing so have not been quantified. With that in mind, we carried out a randomised controlled trial (RCT) of the effect of DataCite metadata completeness on dataset uptake and sharing. In total, 1093 datasets were randomised to receive either rich or minimal metadata, and the effect on re-use and sharing of this data was followed over a 1-year period. Analysis after the trial period, which was based on a very low rate of page-views, detected no significant difference between the two RCT groups. Despite the negative findings, we would like to share this proof of concept and the lessons learned about how randomised controlled trials should be carried out to quantify the benefits of FAIR data sharing.