A descriptive analysis of the data availability statements accompanying medRxiv preprints and a comparison with their published counterparts

PLoS ONE ◽  
2021 ◽  
Vol 16 (5) ◽  
pp. e0250887
Author(s):  
Luke A. McGuinness ◽  
Athena L. Sheppard

Objective To determine whether medRxiv data availability statements describe open or closed data—that is, whether the data used in the study is openly available without restriction—and to examine if this changes on publication based on journal data-sharing policy. Additionally, to examine whether data availability statements are sufficient to capture code availability declarations. Design Observational study, following a pre-registered protocol, of preprints posted on the medRxiv repository between 25th June 2019 and 1st May 2020 and their published counterparts. Main outcome measures Distribution of preprinted data availability statements across nine categories, determined by a prespecified classification system. Change in the percentage of data availability statements describing open data between the preprinted and published versions of the same record, stratified by journal sharing policy. Number of code availability declarations reported in the full-text preprint which were not captured in the corresponding data availability statement. Results 3938 medRxiv preprints with an applicable data availability statement were included in our sample, of which 911 (23.1%) were categorized as describing open data. 379 (9.6%) preprints were subsequently published, and of these published articles, only 155 contained an applicable data availability statement. Similar to the preprint stage, a minority (59 (38.1%)) of these published data availability statements described open data. Of the 151 records eligible for the comparison between preprinted and published stages, 57 (37.7%) were published in journals which mandated open data sharing. Data availability statements more frequently described open data on publication when the journal mandated data sharing (open at preprint: 33.3%, open at publication: 61.4%) compared to when the journal did not mandate data sharing (open at preprint: 20.2%, open at publication: 22.3%). 
Conclusion Requiring that authors submit a data availability statement is a good first step, but is insufficient to ensure data availability. Strict editorial policies that mandate data sharing (where appropriate) as a condition of publication appear to be effective in making research data available. We would strongly encourage all journal editors to examine whether their data availability policies are sufficiently stringent and consistently enforced.
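
The percentages quoted in the Results above can be recomputed from the reported counts. The following is a minimal sketch, not the authors' analysis code: the helper name `pct` and the variable names are illustrative assumptions, and the only inputs are the counts and percentages stated in the abstract.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts as reported in the abstract (names are hypothetical labels).
open_at_preprint = pct(911, 3938)      # preprints describing open data
later_published = pct(379, 3938)       # preprints subsequently published
open_at_publication = pct(59, 155)     # published statements describing open data
mandating_journals = pct(57, 151)      # eligible records in journals mandating sharing

# Change in the share of open statements between preprint and publication,
# in percentage points, using the percentages reported in the abstract.
change_with_mandate = round(61.4 - 33.3, 1)
change_without_mandate = round(22.3 - 20.2, 1)

print(open_at_preprint, later_published, open_at_publication, mandating_journals)
print(change_with_mandate, change_without_mandate)
```

Under these assumptions, the increase in open statements is about 28 percentage points where the journal mandated data sharing and about 2 percentage points where it did not.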

2020 ◽  
Author(s):  
Luke A McGuinness ◽  
Athena Louise Sheppard

Objective To determine whether medRxiv data availability statements describe open or closed data - that is, whether the data used in the study is openly available without restriction - and to examine if this changes on publication based on journal data sharing policy. Additionally, to examine whether data availability statements are sufficient to capture code availability declarations. Design Observational study, following a pre-registered protocol, of preprints posted on the medRxiv repository between 25th June 2019 and 1st May 2020 and their published counterparts. Main outcome measures Distribution of preprinted data availability statements across eight categories, determined by a prespecified classification system. Change in the percentage of data availability statements describing open data between the preprinted and published versions of the same record, stratified by journal sharing policy. Number of code availability declarations reported in the full-text preprint which were not captured in the corresponding data availability statement. Results 4101 medRxiv preprints were included in our sample, of which 911 (22.2%) were categorized as describing open data, 3027 (73.8%) as describing closed data, 163 (4.0%) as not applicable (e.g. editorial, protocol). 379 (9.2%) preprints were subsequently published, and of these published articles, only 159 (42.0%) contained a data availability statement. Similar to the preprint stage, most published data availability statements described closed data (59 (37.1%) open, 96 (60.4%) closed, 4 (2.5%) not applicable). Of the 151 records eligible for the comparison between preprinted and published stages, 57 (37.7%) were published in journals which mandated open data sharing.
Data availability statements more frequently described open data on publication when the journal mandated data sharing (open at preprint: 33.3%, open at publication: 61.4%) compared to when the journal did not mandate data sharing (open at preprint: 20.2%, open at publication: 22.3%). Conclusion Requiring that authors submit a data availability statement is a good first step, but is insufficient to ensure data availability. Strict editorial policies that require data sharing (where appropriate) as a condition of publication appear to be effective in making research data available. We would strongly encourage all journal editors to examine whether their data availability policies are sufficiently stringent and consistently enforced.


2018 ◽  
Vol 6 (2) ◽  
pp. 125-143 ◽  
Author(s):  
Ben Marwick ◽  
Suzanne E. Pilaar Birch

ABSTRACT How do archaeologists share their research data, if at all? We review what data are, according to current influential definitions, and previous work on the benefits, costs, and norms of data sharing in the sciences broadly. To understand data sharing in archaeology, we present the results of three pilot studies: requests for data by e-mail, review of data availability in published articles, and analysis of archaeological datasets deposited in repositories. We find that archaeologists are often willing to share but that discipline-wide sharing is patchy and ad hoc. Legislation and mandates are effective at increasing data sharing, but editorial policies at journals lack adequate enforcement. Although most of the data available at repositories are licensed to enable flexible reuse, only a small proportion of the data are stored in structured formats for easy reuse. We present some suggestions for improving the state of data sharing in archaeology; among these is a standard for citing datasets to ensure that researchers making their data publicly available receive appropriate credit.


2017 ◽  
Author(s):  
Ben Marwick ◽  
Suzanne E Pilaar Birch

How do archaeologists share their research data, if at all? We review what data are, according to current influential definitions, and previous work on the benefits, costs and norms of data sharing in the sciences broadly. To understand data sharing in archaeology, we present the results of three pilot studies: requests for data by email, review of data availability in published articles, and analysis of archaeological datasets deposited in repositories. We find that archaeologists are often willing to share, but discipline-wide sharing is patchy and ad hoc. Legislation and mandates are effective at increasing data sharing, but editorial policies at journals lack adequate enforcement. Although most of the data available at repositories are licensed to enable flexible reuse, only a small proportion of the data are stored in structured formats for easy reuse. We present some suggestions for improving the state of data sharing in archaeology; among these is a standard for citing data sets to ensure that researchers making their data publicly available receive appropriate credit.


2021 ◽  
pp. 002203452110202
Author(s):  
F. Schwendicke ◽  
J. Krois

Data are a key resource for modern societies and expected to improve quality, accessibility, affordability, safety, and equity of health care. Dental care and research are currently transforming into what we term data dentistry, with 3 main applications: 1) medical data analysis uses deep learning, allowing one to master unprecedented amounts of data (language, speech, imagery) and put them to productive use. 2) Data-enriched clinical care integrates data from individual (e.g., demographic, social, clinical and omics data, consumer data), setting (e.g., geospatial, environmental, provider-related data), and systems level (payer or regulatory data to characterize input, throughput, output, and outcomes of health care) to provide a comprehensive and continuous real-time assessment of biologic perturbations, individual behaviors, and context. Such care may contribute to a deeper understanding of health and disease and a more precise, personalized, predictive, and preventive care. 3) Data for research include open research data and data sharing, allowing one to appraise, benchmark, pool, replicate, and reuse data. Concerns and confidence into data-driven applications, stakeholders’ and system’s capabilities, and lack of data standardization and harmonization currently limit the development and implementation of data dentistry. Aspects of bias and data-user interaction require attention. Action items for the dental community circle around increasing data availability, refinement, and usage; demonstrating safety, value, and usefulness of applications; educating the dental workforce and consumers; providing performant and standardized infrastructure and processes; and incentivizing and adopting open data and data sharing.


2019 ◽  
Vol 10 (20) ◽  
pp. 17 ◽  
Author(s):  
Mattia Previtali ◽  
Riccardo Valente

The open data paradigm is changing the research approach in many fields such as remote sensing and the social sciences. This is supported by governmental decisions and policies that are boosting the open data wave, and in this context archaeology is also affected by this new trend. In many countries, archaeological data are still protected or only limited access is allowed. However, the strong political and economic support for the publication of government data as open data will change the accessibility and disciplinary expertise in the archaeological field too. In order to maximize the impact of data, their technical openness is of primary importance. Indeed, since a spreadsheet is more usable than a PDF of a table, the availability of digital archaeological data, which is structured using standardised approaches, is of primary importance for the real usability of published data. In this context, the main aim of this paper is to present a workflow for archaeological data sharing as open data with a high level of technical usability and interoperability. Primary data is mainly acquired through the use of digital techniques (e.g. digital cameras and terrestrial laser scanning). The processing of this raw data is performed with commercial software for scan registration and image processing, allowing for a simple and semi-automated workflow. Outputs obtained from this step are then processed in modelling and drawing environments to generate digital models, both 2D and 3D. These crude geometrical data are then enriched with further information to generate a Geographic Information System (GIS) which is finally published as open data using Open Geospatial Consortium (OGC) standards to maximise interoperability.

Highlights:
- Open data will change the accessibility and disciplinary expertise in the archaeological field.
- The main aim of this paper is to present a workflow for archaeological data sharing as open data with a high level of interoperability.
- Digital acquisition techniques are used to document archaeological excavations and a Geographic Information System (GIS) is generated that is published as open data.


2020 ◽  
Vol 2 (4) ◽  
pp. 554-568
Author(s):  
Chris Graf ◽  
Dave Flanagan ◽  
Lisa Wylie ◽  
Deirdre Silver

Data availability statements can provide useful information about how researchers actually share research data. We used unsupervised machine learning to analyze 124,000 data availability statements submitted by research authors to 176 Wiley journals between 2013 and 2019. We categorized the data availability statements, and looked at trends over time. We found expected increases in the number of data availability statements submitted over time, and marked increases that correlate with policy changes made by journals. Our open data challenge becomes to use what we have learned to present researchers with relevant and easy options that help them to share and make an impact with new research data.


2021 ◽  
Author(s):  
Kevin B Read ◽  
Heather Ganshorn ◽  
Sarah Rutley ◽  
David R. Scott

Background: As Canada increases requirements for research data management (RDM) and sharing, there is value in identifying how research data are shared, and what has been done to make them findable and reusable. This study aims to understand Canada’s data sharing landscape by reviewing how Canadian Institutes of Health Research (CIHR) funded data are shared, and comparing researchers’ data sharing practices to RDM and sharing best practices. Methods: We performed a descriptive analysis of CIHR-funded publications from PubMed and PubMed Central that were published between 1946 and Dec 31, 2019 and that indicated the research data underlying the results of the publication were shared. Each publication was analyzed to identify how and where data were shared, who shared data, and what documentation was included to support data reuse. Results: Of 4,144 CIHR-funded publications, 45.2% (n=1,876) included accessible data, 21.9% (n=909) stated data were available by request, 7.3% (n=304) stated data sharing was not applicable/possible, and we found no evidence of data sharing in 37.6% (n=1,558) of publications. Frequent data sharing methods included via a repository (n=1,549, 37.3%), within supplementary files (n=1,048, 25.2%), and by request (n=919, 22.1%). 13.1% (n=554) of publications included documentation that would facilitate data reuse. Interpretation: Our findings reveal that CIHR-funded publications largely lack the metadata, access instructions, and documentation to facilitate data discovery and reuse. Without measures to address these concerns, and enhanced support for researchers seeking to implement RDM and sharing best practices, most CIHR-funded research data will remain hidden, inaccessible, and unusable.


Publications ◽  
2021 ◽  
Vol 9 (2) ◽  
pp. 25
Author(s):  
Brian Jackson

Journal publishers play an important role in the open research data ecosystem. Through open data policies that include public data archiving mandates and data availability statements, journal publishers help promote transparency in research and wider access to a growing scholarly record. The library and information science (LIS) discipline has a unique relationship with both open data initiatives and academic publishing and may be well-positioned to adopt rigorous open data policies. This study examines the information provided on public-facing websites of LIS journals in order to describe the extent, and nature, of open data guidance provided to prospective authors. Open access journals in the discipline have disproportionately adopted detailed, strict open data policies. Commercial publishers, which account for the largest share of publishing in the discipline, have largely adopted weaker policies. Rigorous policies, adopted by a minority of journals, describe the rationale, application, and expectations for open research data, while most journals that provide guidance on the matter use hesitant and vague language. Recommendations are provided for strengthening journal open data policies.


2015 ◽  
Author(s):  
Iain Hrynaszkiewicz ◽  
Varsha Khodiyar ◽  
Andrew L Hufton ◽  
Susanna-Assunta Sansone

Abstract Sharing of experimental clinical research data usually happens between individuals or research groups rather than via public repositories, in part due to the need to protect research participant privacy. This approach to data sharing makes it difficult to connect journal articles with their underlying datasets and is often insufficient for ensuring access to data in the long term. Voluntary data sharing services such as the Yale Open Data Access (YODA) and Clinical Study Data Request (CSDR) projects have increased accessibility to clinical datasets for secondary uses while protecting patient privacy and the legitimacy of secondary analyses but these resources are generally disconnected from journal articles – where researchers typically search for reliable information to inform future research. New scholarly journal and article types dedicated to increasing accessibility of research data have emerged in recent years and, in general, journals are developing stronger links with data repositories. There is a need for increased collaboration between journals, data repositories, researchers, funders, and voluntary data sharing services to increase the visibility and reliability of clinical research. We propose changes to the format and peer-review process for journal articles to more robustly link them to data that are only available on request. We also propose additional features for data repositories to better accommodate non-public clinical datasets, including Data Use Agreements (DUAs).


2021 ◽  
Author(s):  
Iain Hrynaszkiewicz ◽  
James Harney ◽  
Lauren Cadwallader

PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy established in 2014. Despite this policy, and more data sharing policies being introduced by other organizations, best practices for data sharing are adopted by a minority of researchers in their publications. Problems with effective research data sharing persist and these problems have been quantified by previous research as a lack of time, resources, incentives, and/or skills to share data. In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data. In May-June 2020 we surveyed researchers from Europe and North America to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 completed and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts. Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important and respondents were reasonably well satisfied in their ability to accomplish them. Notably, this included tasks associated with best data sharing practice, such as use of data repositories.
However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered to be best practice. We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied in their ability to accomplish, even if many do not attempt this task. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by working to seamlessly integrate existing solutions that reduce the effort or behaviour change involved in some tasks, and focusing on advocacy and education around the benefits of sharing data. There may however be opportunities - unmet researcher needs - in relation to better supporting data reuse, which could be met in part by strengthening data sharing policies of journals and publishers, and improving the discoverability of data associated with published articles.

