Delivering on open data: Current practices in data sharing, and challenges ahead!

Septentrio Conference Series ◽

10.7557/5.4510 ◽

2018 ◽

Author(s):

Timon Oefelein

Keyword(s):

Data Sharing ◽

Open Data ◽

Research Data ◽

Data Reuse ◽

Valuable Insight ◽

Research Libraries ◽

Critical Areas ◽

Clear Vision ◽

Working Together ◽

Current Practices

The case for sharing research data has been strongly made in many parts of the world, and most noticeably in Europe. Open access to research data delivers more value for every funding Euro by enabling data reuse and reducing unnecessary duplication of research. Further, open data can help speed the pace of discovery and allows for reproducibility studies. The European Commission has set out a clear vision for open data in its Horizon Europe proposal. Yet in 2018 only about half of research data are shared, according to surveys of researchers, and a much smaller proportion are shared openly or in ways that maximise discoverability and reuse. Whilst policy implementation remains critical to the uptake of data sharing, it must be joined by greater support and education for researchers, and by faster, easier routes to sharing data optimally. We also need to make it worth a researcher’s time to share their data. Starting with the case for better data practice, this talk showcases the findings of one of the largest author surveys of its kind on current practices, attitudes and perceptions in data sharing at the point of scholarly publication. The survey, carried out by Springer Nature in 2018, is based on over 7,700 responses from academic researchers at various career levels in Europe, Asia, America, and Australasia. Responses come from across all subject areas. The resulting data provide valuable insight into how, where, and why data are currently shared and what the main obstacles to sharing are. The talk identifies the most “critical areas”, as borne out by the survey findings, that need to be tackled with top priority if we are to accelerate the speed and scope of data sharing. In closing, we therefore ask: how can we better work together across research libraries, institutions, funders, governments, and publishers to address and act on these “critical areas”?
Indeed, it is only by working together that we can unlock the huge potential of research data, namely to improve our knowledge, to address the grand societal challenges, and to help solve some of the most pressing problems in science today.


Práticas e percepções dos pesquisadores brasileiros sobre serviços de acesso aberto a dados de pesquisa | Practices and perceptions of Brazilian researchers on open access to research data

Liinc em Revista ◽

10.18617/liinc.v15i2.4771 ◽

2019 ◽

Vol 15 (2) ◽

Author(s):

Sonia Elisa Caregnato ◽

Samile Andrea de Souza Vanz ◽

Caterina Groposo Pavão ◽

Paula Caroline Jardim Schifino Passos ◽

Eduardo Borges ◽

...

Keyword(s):

Open Access ◽

Data Sharing ◽

Research Data ◽

Exploratory Analysis ◽

Data Reuse ◽

Institutional Repositories ◽

Open Research

This article presents an exploratory analysis of practices and perceptions regarding open access to research data, based on information collected through a survey of Brazilian researchers. The 4,676 responses show that, despite great interest in the topic, evidenced by the prevalence of variables related to data sharing and use and to institutional repositories, respondents lack clarity on the main related topics. We conclude that, although most researchers state that they share research data, making these data available in an open and unrestricted way is not yet widely accepted. Keywords: Open Research Data; Data Sharing; Data Reuse.


PsychData – Experiences from 12 Years of Research Data Archiving

Septentrio Conference Series ◽

10.7557/5.3666 ◽

2015 ◽

Author(s):

Peter Weiland ◽

Ina Dehnhard

Keyword(s):

Data Sharing ◽

Large Scale ◽

Research Data ◽

Data Reuse ◽

German Research Foundation ◽

Cross Sectional ◽

Data Archiving ◽

Wide Range ◽

Domain Specific Knowledge ◽

Meta Analyses

The benefits of making research data permanently accessible through data archives are widely recognized: costs can be reduced by reusing existing data, research results can be compared and validated with results from archived studies, fraud can be more easily detected, and meta-analyses can be conducted. Apart from that, authors may gain recognition and reputation for producing the datasets. Since 2003, the accredited research data center PsychData (part of the Leibniz Institute for Psychology Information in Trier, Germany) has documented and archived research data from all areas of psychology and related fields. In the beginning, the main focus was on datasets with a high potential for reuse, e.g. longitudinal studies, large-scale cross-sectional studies, or studies that were conducted under historically unique conditions. Presently, more and more journal publishers and project funding agencies require researchers to archive their data and make them accessible to the scientific community. Therefore, PsychData also has to serve this need. In this presentation we report on our experiences in operating a discipline-specific research data archive in a domain where data sharing is met with considerable resistance. We will focus on the challenges for data sharing and data reuse in psychology, e.g.: the large amount of domain-specific knowledge necessary for data curation; the high costs of documenting the data because of a wide range of non-standardized measures; small teams and little established infrastructure compared with the "big data" disciplines; studies in psychology not being designed for reuse (in contrast to the social sciences); data protection; and resistance to sharing data. At the end of the presentation, we will provide a brief outlook on DataWiz, a new project funded by the German Research Foundation (DFG). In this project, tools will be developed to support researchers in documenting their data during the research phase.


Publishing descriptions of non-public clinical datasets: guidance for researchers, repositories, editors and funding organisations

10.1101/021667 ◽

2015 ◽

Cited By ~ 3

Author(s):

Iain Hrynaszkiewicz ◽

Varsha Khodiyar ◽

Andrew L Hufton ◽

Susanna-Assunta Sansone

Keyword(s):

Clinical Research ◽

Data Sharing ◽

Peer Review Process ◽

Open Data ◽

Data Access ◽

Research Data ◽

Future Research ◽

Patient Privacy ◽

Journal Articles ◽

Data Repositories

Sharing of experimental clinical research data usually happens between individuals or research groups rather than via public repositories, in part due to the need to protect research participant privacy. This approach to data sharing makes it difficult to connect journal articles with their underlying datasets and is often insufficient for ensuring access to data in the long term. Voluntary data sharing services such as the Yale Open Data Access (YODA) and Clinical Study Data Request (CSDR) projects have increased accessibility to clinical datasets for secondary uses while protecting patient privacy and the legitimacy of secondary analyses, but these resources are generally disconnected from journal articles, where researchers typically search for reliable information to inform future research. New scholarly journal and article types dedicated to increasing accessibility of research data have emerged in recent years and, in general, journals are developing stronger links with data repositories. There is a need for increased collaboration between journals, data repositories, researchers, funders, and voluntary data sharing services to increase the visibility and reliability of clinical research. We propose changes to the format and peer-review process for journal articles to more robustly link them to data that are only available on request. We also propose additional features for data repositories to better accommodate non-public clinical datasets, including Data Use Agreements (DUAs).


A survey of researchers' needs and priorities for data sharing

10.31219/osf.io/njr5u ◽

2021 ◽

Author(s):

Iain Hrynaszkiewicz ◽

James Harney ◽

Lauren Cadwallader

Keyword(s):

Data Sharing ◽

Research Impact ◽

Open Science ◽

Research Data ◽

Data Reuse ◽

Data Availability ◽

Data Repositories ◽

Use Of Data ◽

Share Data ◽

Do So

PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy established in 2014. Despite this policy, and more data sharing policies being introduced by other organizations, best practices for data sharing are adopted by a minority of researchers in their publications. Problems with effective research data sharing persist and these problems have been quantified by previous research as a lack of time, resources, incentives, and/or skills to share data. In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data. In May-June 2020 we surveyed researchers from Europe and North America, asking them to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 completed and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts. Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important and respondents were reasonably well satisfied with their ability to accomplish them. Notably, this included tasks associated with best data sharing practice, such as use of data repositories.
However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered to be best practice. We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied with their ability to accomplish, even if many do not attempt this task. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by working to seamlessly integrate existing solutions that reduce the effort or behaviour change involved in some tasks, and by focusing on advocacy and education around the benefits of sharing data. There may however be opportunities (unmet researcher needs) in relation to better supporting data reuse, which could be met in part by strengthening the data sharing policies of journals and publishers, and by improving the discoverability of data associated with published articles.


A descriptive analysis of the data availability statements accompanying medRxiv preprints and a comparison with their published counterparts

PLoS ONE ◽

10.1371/journal.pone.0250887 ◽

2021 ◽

Vol 16 (5) ◽

pp. e0250887

Author(s):

Luke A. McGuinness ◽

Athena L. Sheppard

Keyword(s):

Data Sharing ◽

Descriptive Analysis ◽

Open Data ◽

System Change ◽

Research Data ◽

Data Availability ◽

Published Data ◽

Editorial Policies ◽

Journal Editors ◽

Closed Data

Objective To determine whether medRxiv data availability statements describe open or closed data—that is, whether the data used in the study is openly available without restriction—and to examine if this changes on publication based on journal data-sharing policy. Additionally, to examine whether data availability statements are sufficient to capture code availability declarations. Design Observational study, following a pre-registered protocol, of preprints posted on the medRxiv repository between 25th June 2019 and 1st May 2020 and their published counterparts. Main outcome measures Distribution of preprinted data availability statements across nine categories, determined by a prespecified classification system. Change in the percentage of data availability statements describing open data between the preprinted and published versions of the same record, stratified by journal sharing policy. Number of code availability declarations reported in the full-text preprint which were not captured in the corresponding data availability statement. Results 3938 medRxiv preprints with an applicable data availability statement were included in our sample, of which 911 (23.1%) were categorized as describing open data. 379 (9.6%) preprints were subsequently published, and of these published articles, only 155 contained an applicable data availability statement. Similar to the preprint stage, a minority (59 (38.1%)) of these published data availability statements described open data. Of the 151 records eligible for the comparison between preprinted and published stages, 57 (37.7%) were published in journals which mandated open data sharing. Data availability statements more frequently described open data on publication when the journal mandated data sharing (open at preprint: 33.3%, open at publication: 61.4%) compared to when the journal did not mandate data sharing (open at preprint: 20.2%, open at publication: 22.3%). 
Conclusion Requiring that authors submit a data availability statement is a good first step, but is insufficient to ensure data availability. Strict editorial policies that mandate data sharing (where appropriate) as a condition of publication appear to be effective in making research data available. We would strongly encourage all journal editors to examine whether their data availability policies are sufficiently stringent and consistently enforced.
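The percentages in the Results above follow directly from the counts quoted there. As a minimal sketch of that arithmetic (variable names are illustrative and not taken from the study; only counts and percentages stated in the abstract are used):

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts quoted in the abstract (variable names are illustrative).
preprints_total = 3938        # preprints with an applicable statement
open_at_preprint = 911        # statements describing open data
published = 379               # preprints subsequently published
published_with_das = 155      # published articles with an applicable statement
published_open = 59           # published statements describing open data
comparison_eligible = 151     # records eligible for preprint vs. published comparison
in_mandating_journals = 57    # of those, published in journals mandating open data

shares = {
    "open at preprint": pct(open_at_preprint, preprints_total),
    "subsequently published": pct(published, preprints_total),
    "open at publication": pct(published_open, published_with_das),
    "in mandating journals": pct(in_mandating_journals, comparison_eligible),
}

# Percentage-point change in open-data statements, using the rates
# reported in the abstract for each journal policy group.
change_mandated = round(61.4 - 33.3, 1)
change_not_mandated = round(22.3 - 20.2, 1)

for label, value in shares.items():
    print(f"{label}: {value}%")
print(f"change, mandated: +{change_mandated} points")
print(f"change, not mandated: +{change_not_mandated} points")
```

Read this way, the gap between the two policy groups is the difference between a rise of roughly 28 percentage points and a rise of roughly 2.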


Abstract 3356: Working together to put kids first: Outreach strategies driving collaborative research, data sharing and cross-disease analysis to accelerate discoveries in pediatric cancer and structural birth defects

10.1158/1538-7445.am2019-3356 ◽

2019 ◽

Author(s):

Tatiana S. Patton ◽

Robert Moulder ◽

Erin Alexander ◽

Donna Vito ◽

Jonathan Waller ◽

...

Keyword(s):

Data Sharing ◽

Birth Defects ◽

Pediatric Cancer ◽

Collaborative Research ◽

Research Data ◽

Working Together ◽

Kids First ◽

Disease Analysis


Researchers at Arab Universities Hold Positive Views on Research Data Management and Data Sharing

Evidence Based Library and Information Practice ◽

10.18438/eblip29746 ◽

2020 ◽

Vol 15 (2) ◽

pp. 168-170

Author(s):

Jennifer Kaari

Keyword(s):

Data Management ◽

Data Sharing ◽

Data Privacy ◽

Management Practices ◽

Open Data ◽

Scientific Progress ◽

Management Plan ◽

Research Data ◽

Data Repository ◽

Research Data Management

A Review of: Elsayed, A. M., & Saleh, E. I. (2018). Research data management and sharing among researchers in Arab universities: An exploratory study. IFLA Journal, 44(4), 281–299. https://doi.org/10.1177/0340035218785196 Abstract Objective – To investigate researchers’ practices and attitudes regarding research data management and data sharing. Design – Email survey. Setting – Universities in Egypt, Jordan, and Saudi Arabia. Subjects – Surveys were sent to 4,086 academic faculty researchers. Methods – The survey was emailed to faculty at three Arab universities, targeting faculty in the life sciences and engineering. The survey was created using Google Docs and remained open for five months. Participants were asked basic demographic questions, questions regarding their research data and metadata practices, and questions regarding their data sharing practices. Main Results – The authors received 337 responses, for a response rate of 8%. The results showed that 48.4% of respondents had a data management plan and that 97% were responsible for preserving their own data. Most respondents stored their research data on their personal storage devices. The authors found that 64.4% of respondents reported sharing their research data. Respondents most frequently shared their data by publishing in a data research journal, sharing through academic social networks such as ResearchGate, and providing data upon request to peers. Only 5.1% of respondents shared data through an open data repository. Of those who did not share data, data privacy and confidentiality were the most common reasons cited. Of the respondents who did share their data, contributing to scientific progress and increased citation and visibility were the primary reasons for doing so. A total of 59.6% of respondents stated that they needed more training in research data management from their universities. 
Conclusion – The authors conclude that researchers at Arab universities are still primarily responsible for their own data and that data management planning is still a new concept to most researchers. For the most part, the researchers had a positive attitude toward data sharing, although depositing data in open repositories is still not a widespread practice. The authors conclude that in order to encourage strong data management practices and open data sharing among Arab university researchers, more training and institutional support is needed.


Initiating FAIR geothermal data in Indonesia

10.5194/egusphere-egu21-14438 ◽

2021 ◽

Author(s):

Dasapta Erwin Irawan

Keyword(s):

Data Sharing ◽

Data Exchange ◽

Open Data ◽

Data Reuse ◽

Data Availability ◽

Scientific Development ◽

For Profit ◽

Deep Well ◽

Data Schema ◽

Corporate Social

One of the main keys to scientific development is data availability. Data must not only be easy to discover and download; they must also be easy to reuse. Geothermal researchers, research institutions and industries are the three main stakeholders in fostering data sharing and data reuse. Very expensive deep well datasets, as well as advanced logging datasets, are important not only for exploitation purposes but also for the communities involved, e.g. for regional planning or common environmental analyses. In data sharing, we have the four principles of FAIR data. Principle 1, Findable: data are uploaded to an open repository with proper data documentation and a data schema. Principle 2, Accessible: access restrictions such as user IDs and passwords are removed for easy downloads; for data from commercial entities, embargoed data are permitted with a clear embargo duration and data request procedure. Principle 3, Interoperable: all data must be prepared in a manner that allows straightforward data exchange between platforms. Principle 4, Reusable: all data must be submitted in a common, conventional file format, preferably a text-based one (e.g. `csv` or `txt`), so that they can be analyzed using various software and hardware. The fact that geothermal industries are for-profit and capital intensive gives them even more reason to embrace data sharing. It would be a good way for them to play their role in supporting society. Contributions from multiple stakeholders are the most essential part of scientific development. In the context of commercial industry, data sharing is a form of corporate social responsibility (CSR), which shouldn't be defined only as giving out funding to support local communities. Keywords: open data, FAIR data, data sharing
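The preference for text-based formats under Principle 4 can be illustrated with a short round trip through CSV. This is only a sketch under assumptions: the field names (`well_id`, `depth_m`, `temperature_c`) and the values are hypothetical and do not come from the abstract or from any real dataset; only Python's standard `csv` and `io` modules are used.

```python
import csv
import io

# Hypothetical well-log records; field names and values are illustrative only.
FIELDS = ["well_id", "depth_m", "temperature_c"]
records = [
    {"well_id": "W-01", "depth_m": 500, "temperature_c": 145.2},
    {"well_id": "W-01", "depth_m": 1000, "temperature_c": 212.7},
]

def to_csv(rows):
    """Serialize rows to CSV text with a header line naming every field."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def from_csv(text):
    """Parse CSV text back into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))

csv_text = to_csv(records)
round_trip = from_csv(csv_text)
print(csv_text)
```

Because the result is plain text with a header row, it can be opened by any spreadsheet, statistics package or script. Note that values come back as strings, which is one reason a documented data schema (Principle 1) matters alongside the file format.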


Data Producers Courting Data Reusers: Two Cases from Modeling Communities

International Journal of Digital Curation ◽

10.2218/ijdc.v9i1.304 ◽

2014 ◽

Vol 9 (1) ◽

pp. 98-109 ◽

Cited By ~ 5

Author(s):

Jillian Wallis

Keyword(s):

Data Sharing ◽

Climate Modeling ◽

Data Reuse ◽

Viable Option ◽

Attractive Option ◽

Working Together

Data sharing is a difficult process for both the data producer and the data reuser. Both parties are faced with more disincentives than incentives. Data producers need to sink time and resources into adding metadata for data to be findable and usable, and there is no promise of receiving credit for this effort. Making data available also leaves data producers vulnerable to being scooped or data misuse. Data reusers also need to sink time and resources into evaluating data and trying to understand them, making collecting their own data a more attractive option. In spite of these difficulties, some data producers are looking for new ways to make data sharing and reuse a more viable option. This paper presents two cases from the surface and climate modeling communities, where researchers who produce data are reaching out to other researchers who would be interested in reusing the data. These cases are evaluated as a strategy to identify ways to overcome the challenges typically experienced by both data producers and data reusers. By working together with reusers, data producers are able to mitigate the disincentives and create incentives for sharing data. By working with data producers, data reusers are able to circumvent the hurdles that make data reuse so challenging.


Responding to Reality: Evolving Curation Practices and Infrastructure at the University of Illinois at Urbana-Champaign

Journal of eScience Librarianship ◽

10.7191/jeslib.2021.1202 ◽

2021 ◽

Vol 10 (3) ◽

Author(s):

Hoa Q. Luong ◽

Colleen Fallaw ◽

Genevieve Schmitt ◽

Susan M. Braxton ◽

Heidi Imker

Keyword(s):

Data Sharing ◽

Positive Impact ◽

Data Bank ◽

Research Data ◽

Data Reuse ◽

Data Repository ◽

Data Service ◽

University Of Illinois ◽

Service Offering ◽

The University

Objective: The Illinois Data Bank provides Illinois researchers with the infrastructure to publish research data publicly. During a five-year review of the Research Data Service at the University of Illinois at Urbana-Champaign, it was recognized as the most useful service offering in the unit. Internal metrics are captured and used to monitor growth, document curation workflows, and surface technical challenges faced as we assist our researchers. Here we present examples of these curation challenges and the solutions chosen to address them. Methods: Some Illinois Data Bank metrics are collected internally within the system, but most of the curation metrics reported here are tracked separately in a Google spreadsheet. The curator logs the required information after curation is complete for each dataset. While the data are sometimes ambiguous (e.g., depending on researcher uptake of suggested actions), our curation data provide a general understanding of our data repository and have been useful in assessing our workflows and services. These metrics also help prioritize development needs for the Illinois Data Bank. Results and Conclusions: The curatorial services polish and improve the datasets, which contributes to the spirit of data reuse. Although we continue to see challenges in our processes, curation makes a positive impact on datasets. Continued development and adaptation of the technical infrastructure allows for an ever-better experience for the curators and users. These improvements have helped our repository more effectively support the data sharing process by successfully fostering depositor engagement with curators to improve datasets and by facilitating easy transfer of very large files.
