The open data challenge: An analysis of 124,000 data availability statements, and an ironic lesson about data management plans

Authorea ◽  
2019 ◽  
Author(s):  
Chris Graf ◽  
Dave Flanagan ◽  
Lisa Wylie ◽  
Deirdre Silver
2020 ◽  
Vol 2 (4) ◽  
pp. 554-568
Author(s):  
Chris Graf ◽  
Dave Flanagan ◽  
Lisa Wylie ◽  
Deirdre Silver

Data availability statements can provide useful information about how researchers actually share research data. We used unsupervised machine learning to analyze 124,000 data availability statements submitted by research authors to 176 Wiley journals between 2013 and 2019. We categorized the data availability statements, and looked at trends over time. We found expected increases in the number of data availability statements submitted over time, and marked increases that correlate with policy changes made by journals. Our open data challenge becomes to use what we have learned to present researchers with relevant and easy options that help them to share and make an impact with new research data.


2020 ◽  
Author(s):  
Neha Makhija ◽  
Mansi Jain ◽  
Nikolaos Tziavelis ◽  
Laura Di Rocco ◽  
Sara Di Bartolomeo ◽  
...  

Data lakes are an emerging storage paradigm that promotes data availability over integration. Prime examples are repositories of Open Data, which show great promise for transparent data science. Due to the lack of proper integration, data lakes may not have a common consistent schema, and traditional data management techniques fall short with these repositories. Much recent research has tried to address the new challenges associated with these data lakes. Researchers in this area are mainly interested in the structural properties of the data for developing new algorithms, yet typical Open Data portals offer limited functionality in that respect and instead focus on data semantics. We propose Loch Prospector, a visualization to assist data management researchers in exploring and understanding the most crucial structural aspects of Open Data (in particular, metadata attributes) and the associated task abstraction for their work. Our visualization enables researchers to navigate the contents of data lakes effectively and easily accomplish what were previously laborious tasks. A copy of this paper with all supplemental material is available at osf.io/zkxv9


2020 ◽  
Vol 3 ◽  
pp. 13-38
Author(s):  
Solange Aranha ◽  
Ciara R. Wigham

Although there is a move toward open data, with research funding bodies more frequently requiring data management plans and dissemination strategies, the data management challenges inherently linked to virtual exchange research are understudied. Data collection is often reported upon in papers addressing interaction analysis or language development, but little attention has been paid to offering critical discussion of data collection and structuration methods or practical advice to encourage data/corpora dissemination. This paper reports on two phases of the Multimodal Teletandem Corpus project (Aranha & Lopes, 2019) that structured 581 hours of video data from Portuguese-English teletandem sessions, 351 chat logs, 956 written productions exchanged between the partners (original, revised, and corrected versions), 91 initial and 41 final questionnaires, and 666 learning diaries. We describe the data management problems faced that included the organization of data collected, ethical consent, management of a large quantity of data, inclusion of sociolinguistic information, expansion of learning theories, and the solutions found. We then outline data management planning steps that, consequently, are being introduced for future telecollaboration instantiations.


2021 ◽  
Vol 1 ◽  
pp. 42
Author(s):  
Daniel Spichtinger

Background: Data Management Plans (DMPs) are at the heart of many research funder requirements for data management and open data, including the EU’s Framework Programme for Research and Innovation, Horizon 2020. This article provides a summary of the findings of the DMP Use Case study, conducted as part of OpenAIRE Advance. Methods: As part of the study we created a vetted collection of over 800 Horizon 2020 DMPs. Primarily, however, we report the results of qualitative interviews and a quantitative survey on the experience of Horizon 2020 projects with DMPs. Results & Conclusions: We find that a significant number of projects had to develop a DMP for the first time in the context of Horizon 2020, which points to the importance of funder requirements in spreading good data management practices. In total, 82% of survey respondents found DMPs useful or partially useful, beyond them being “just” a European Commission (EC) requirement. DMPs are most prominently developed within a project’s Management Work Package. Templates were considered important, with 40% of respondents using the EC/European Research Council template. However, some argue for a more tailor-made approach. The most frequent source of support with DMPs was other project partners, but many beneficiaries did not receive any support at all. A number of survey respondents and interviewees therefore ask for a dedicated contact point at the EC, which could take the form of an EC Data Management Helpdesk, akin to the IP helpdesk. If DMPs are published, they are most often made available on the project website, which, however, is often taken offline after the project ends. There is therefore a need to further raise awareness of the importance of using repositories to ensure preservation and curation of DMPs. The study identifies IP and licensing arrangements for DMPs as promising areas for further research.


2021 ◽  
pp. 002203452110202
Author(s):  
F. Schwendicke ◽  
J. Krois

Data are a key resource for modern societies and expected to improve quality, accessibility, affordability, safety, and equity of health care. Dental care and research are currently transforming into what we term data dentistry, with 3 main applications: 1) Medical data analysis uses deep learning, allowing one to master unprecedented amounts of data (language, speech, imagery) and put them to productive use. 2) Data-enriched clinical care integrates data from individual (e.g., demographic, social, clinical and omics data, consumer data), setting (e.g., geospatial, environmental, provider-related data), and systems level (payer or regulatory data to characterize input, throughput, output, and outcomes of health care) to provide a comprehensive and continuous real-time assessment of biologic perturbations, individual behaviors, and context. Such care may contribute to a deeper understanding of health and disease and a more precise, personalized, predictive, and preventive care. 3) Data for research include open research data and data sharing, allowing one to appraise, benchmark, pool, replicate, and reuse data. Concerns about and confidence in data-driven applications, stakeholders’ and system’s capabilities, and lack of data standardization and harmonization currently limit the development and implementation of data dentistry. Aspects of bias and data-user interaction require attention. Action items for the dental community circle around increasing data availability, refinement, and usage; demonstrating safety, value, and usefulness of applications; educating the dental workforce and consumers; providing performant and standardized infrastructure and processes; and incentivizing and adopting open data and data sharing.


2017 ◽  
Vol 36 ◽  
pp. 58-63 ◽  
Author(s):  
Kenneth Haug ◽  
Reza M Salek ◽  
Christoph Steinbeck

2019 ◽  
Vol 2 ◽  
pp. 1-8
Author(s):  
Chih-Wei Chen ◽  
Ching-Yi Lin ◽  
Chine-Hung Tung ◽  
Hsiung-Ming Liao ◽  
Jr-Jie Jang ◽  
...  

Since the UN announced the 17 SDGs in 2015, many countries around the world have been endeavouring to promote the SDGs towards building a sustainable future. Given the disparity in regional development, national governments are advised to establish a localised sustainable vision. Drawing on the UN SDGs with their targets and corresponding indicators, while considering local circumstances and the sustainable vision, governments further seek to establish localised SDGs with related targets and indicators. Meanwhile, in the digital era, digital technologies have been extensively employed as smart tools in many fields, and Geographic Information Systems (GIS) have been developed as a platform to visualise SDG progress at the UN and in many countries. On this basis, this paper demonstrates Taiwan’s efforts to establish localised SDGs and to develop a National Geographic Information System (NGIS) to implement sustainable development in Taiwan, monitor SDG progress, and provide feedback to policymakers for making strategic policies in a top-down approach, while also developing a Community Geographic Information System (CGIS) to encourage stakeholders and citizens to harness the concept of CGIS to proactively create and tell their own stories and promote the Regional Revitalisation policy in a bottom-up approach. Moreover, GIS cannot function well without appropriate data management, including massive data and an open data policy, well-built digital infrastructure, selected “right data”, and cyber security. Hence, with appropriate data management, GIS as a smart tool could facilitate the promotion and implementation of the SDGs in an intuitive manner towards shaping a smart and sustainable future.


Author(s):  
Susanne Blumesberger ◽  
Nikos Gänsdorfer ◽  
Raman Ganguly ◽  
Eva Gergely ◽  
Alexander Gruber ◽  
...  

This article gives an overview of the FAIR Data Austria project objectives and current results. In collaboration with our project partners, we work on the development and establishment of tools for managing the lifecycle of research data, including machine-actionable Data Management Plans (maDMPs), repositories for long-term archiving of research results, RDM training and support services, models, and profiles for Data Stewards and FAIR Office Austria.

