Data, Data Management, and Reproducible Research in Linguistics: On the Need for The Open Handbook of Linguistic Data Management

JAMIA Open ◽  
2019 ◽  
Vol 2 (4) ◽  
pp. 516-520
Author(s):  
Katelyn A McKenzie ◽  
Suzanne L Hunt ◽  
Genevieve Hulshof ◽  
Dinesh Pal Mudaranthakam ◽  
Kayla Meyer ◽  
...  

Abstract

Objective: Managing registries with continual data collection poses challenges, such as following reproducible research protocols and guaranteeing data accessibility. The University of Kansas (KU) Alzheimer’s Disease Center (ADC) maintains one such registry: Curated Clinical Cohort Phenotypes and Observations (C3PO). We created an automated and reproducible process by which investigators have access to C3PO data.

Materials and Methods: Data was entered into Research Electronic Data Capture. Monthly, the data belonging to the Uniform Data Set (UDS), that is, data also collected at other ADCs, was uploaded to the National Alzheimer’s Coordinating Center (NACC). Quarterly, NACC cleaned, curated, and returned the UDS to the KU Data Management and Statistics (DMS) Core, where it was stored in C3PO with other quarterly curated site-specific data. Investigators seeking to use C3PO submitted a research proposal and requested variables via the publicly accessible and searchable data dictionary. The DMS Core used this variable list and an automated SAS program to create a subset of C3PO.

Results: C3PO contained 1913 variables stored in 15 datasets. From 2017 to 2018, 38 data requests were completed for several KU departments and other research institutions. Completing data requests became more efficient; C3PO subsets were produced in under 10 seconds.

Discussion: The data management strategy outlined above facilitated reproducible research practices, which are fundamental to the future of research because they allow replication and verification to occur.

Conclusion: We created a transparent, automated, and efficient process of extracting subsets of data from a registry where data was changing daily.
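The variable-list subsetting step described in this abstract can be illustrated with a minimal Python sketch. The authors used an automated SAS program, which is not reproduced here; the function name `subset_registry`, the in-memory dataset layout, and the `id` key column are all assumptions made purely for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def subset_registry(
    datasets: Dict[str, List[Row]],
    requested: List[str],
    key: str = "id",  # hypothetical participant identifier column
) -> Dict[str, List[Row]]:
    """Keep only the requested variables (plus the key) from each dataset."""
    wanted = set(requested) | {key}
    subset: Dict[str, List[Row]] = {}
    for name, rows in datasets.items():
        if not rows:
            continue
        # Column order follows the first row of each dataset.
        columns = [c for c in rows[0] if c in wanted]
        # Skip datasets that would contribute nothing beyond the key.
        if not columns or columns == [key]:
            continue
        subset[name] = [{c: row.get(c) for c in columns} for row in rows]
    return subset


# Hypothetical example data, not taken from C3PO.
example = {
    "demographics": [{"id": 1, "age": 70, "sex": "F"}],
    "labs": [{"id": 1, "glucose": 5.1}],
}
print(subset_registry(example, ["age"]))
```

Because the subset is driven entirely by the requested variable list, rerunning the same request against a refreshed registry applies the same selection rule, which is the reproducibility property the abstract emphasizes.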


2017 ◽  
Vol 12 (1) ◽  
pp. 22-35 ◽  
Author(s):  
Tomasz Miksa ◽  
Andreas Rauber ◽  
Roman Ganguly ◽  
Paolo Budroni

Data management plans are free-form text documents describing the data used and produced in scientific experiments. The complexity of data-driven experiments requires precise descriptions of tools and datasets used in computations to enable their reproducibility and reuse. Data management plans fall short of these requirements. In this paper, we propose machine-actionable data management plans that cover the same themes as standard data management plans, but in which particular sections are filled with information obtained from existing tools. We present a mapping of tools from the domains of digital preservation, reproducible research, open science, and data repositories to data management plan sections. In doing so, we identify the requirements for a good solution as well as its limitations. We also propose a machine-actionable data model that enables information integration. The model uses ontologies and is based on existing standards.


Author(s):  
Nicholas Thieberger ◽  
Andrea L. Berez

2022 ◽  
Vol 12 ◽  
Author(s):  
Alessandra Durazzo ◽  
Barbara C. Sorkin ◽  
Massimo Lucarini ◽  
Pavel A. Gusev ◽  
Adam J. Kuszak ◽  
...  

The increased utilization of metrology resources and expanded application of its approaches in the development of internationally agreed-upon measurements can lay the basis for regulatory harmonization, support reproducible research, and advance scientific understanding, especially of dietary supplements and herbal medicines. Yet, metrology is often underappreciated and underutilized in dealing with the many challenges presented by these chemically complex preparations. This article discusses the utility of applying rigorous analytical techniques and adopting metrological principles more widely in studying dietary supplement products and ingredients, particularly medicinal plants and other botanicals. An assessment of current and emerging dietary supplement characterization methods is provided, including targeted and non-targeted techniques, as well as data analysis and evaluation approaches, with a focus on chemometrics, toxicity, dosage form performance, and data management. Quality assessment, statistical methods, and optimized methods for data management are also discussed. Case studies provide examples of applying metrological principles in thorough analytical characterization of supplement composition to clarify their health effects. A new frontier for metrology in dietary supplement science is described, including opportunities to improve methods for analysis and data management, development of relevant standards and good practices, and communication of these developments to researchers and analysts, as well as to regulatory and policy decision makers in the public and private sectors. The promotion of closer interactions between analytical, clinical, and pharmaceutical scientists who are involved in research and product development with metrologists who develop standards and methodological guidelines is critical to advance research on dietary supplement characterization and health effects.


2021 ◽  
Author(s):  
Marko Petek ◽  
Maja Zagorscak ◽  
Andrej Blejec ◽  
Ziva Ramsak ◽  
Anna Coll ◽  
...  

We have developed pISA-tree, a straightforward and flexible data management solution for the organisation of life science project-associated research data and metadata. It enables on-the-fly creation of an enriched directory tree structure (project/Investigation/Study/Assay) via a series of sequential batch files in a standardised manner based on the ISA metadata framework. The system supports reproducible research and is in accordance with the Open Science initiative and FAIR principles. Compared with similar frameworks, it does not require any systems administration and maintenance, as it can be run on a personal computer or network drive. It is complemented by two R packages, pisar and seekr, where the former facilitates integration of the pISA-tree datasets into bioinformatic pipelines and the latter enables synchronisation with the FAIRDOMHub public repository using the SEEK API. Source code and detailed documentation of pISA-tree and its supporting R packages are available from https://github.com/NIB-SI/pISA-tree.
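The nested project/Investigation/Study/Assay layout described in this abstract can be sketched in a few lines of Python. This is not pISA-tree itself, which uses batch files; the function `create_tree`, the directory names, and the per-level metadata file names below are hypothetical and do not reflect the tool's actual naming conventions.

```python
import tempfile
from pathlib import Path

# The four nesting levels named in the abstract.
LEVELS = ("project", "investigation", "study", "assay")


def create_tree(root: Path, names: dict) -> Path:
    """Create nested level directories, each with a small metadata file."""
    current = root
    for level in LEVELS:
        current = current / names[level]
        current.mkdir(parents=True, exist_ok=True)
        # Hypothetical metadata file: one tab-separated key/value per line.
        meta = current / f"{level}_metadata.txt"
        if not meta.exists():
            meta.write_text(
                f"level\t{level}\nname\t{names[level]}\n", encoding="utf-8"
            )
    return current


# Usage with a throwaway directory and made-up names.
with tempfile.TemporaryDirectory() as tmp:
    leaf = create_tree(
        Path(tmp),
        {"project": "p1", "investigation": "i1", "study": "s1", "assay": "a1"},
    )
    print(leaf.relative_to(tmp))
```

Storing a metadata file at every level keeps the description next to the data it describes, so the tree remains interpretable when copied to a personal computer or network drive without any server component.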


2019 ◽  
Vol 8 (1) ◽  
pp. 40-52 ◽  
Author(s):  
Sarah W. Kansa ◽  
Levent Atici ◽  
Eric C. Kansa ◽  
Richard H. Meadow

With the advent of the Web, increased emphasis on “research data management,” and innovations in reproducible research practices, scholars have more incentives and opportunities to document and disseminate their primary data. This article seeks to guide archaeologists in data sharing by highlighting recurring challenges in reusing archived data gleaned from observations on workflows and reanalysis efforts involving datasets published over the past 15 years by Open Context. Based on our findings, we propose specific guidelines to improve data management, documentation, and publishing practices so that primary data can be more efficiently discovered, understood, aggregated, and synthesized by wider research communities.


2017 ◽  
Author(s):  
Vicky Steeves

This is a self-archived version of an article published in Collaborative Librarianship. The content of this article is not different from what is in the journal (found here: http://digitalcommons.du.edu/collaborativelibrarianship/vol9/iss2/4).

Recommended Citation: Steeves, Vicky (2017) "Reproducibility Librarianship," Collaborative Librarianship: Vol. 9: Iss. 2, Article 4. Available at: https://digitalcommons.du.edu/collaborativelibrarianship/vol9/iss2/4

Over the past few years, research reproducibility has been increasingly highlighted as a multifaceted challenge across many disciplines. There are socio-cultural obstacles as well as a constantly changing technical landscape that make replicating and reproducing research extremely difficult. Researchers face challenges in reproducing research across different operating systems and different versions of software, to name just a few of the many technical barriers. The prioritization of citation counts and journal prestige has undermined incentives to make research reproducible.

While libraries have been building support around research data management and digital scholarship, reproducibility is an emerging area that has yet to be systematically addressed. To respond to this, New York University (NYU) created the position of Librarian for Research Data Management and Reproducibility (RDM & R), a dual appointment between the Center for Data Science (CDS) and the Division of Libraries. This report will outline the role of the RDM & R librarian, paying close attention to the collaboration between the CDS and Libraries to bring reproducible research practices into the norm.


2019 ◽  
Author(s):  
Ian Sullivan ◽  
Alexander Carl DeHaven ◽  
David Thomas Mellor

By implementing more transparent research practices, authors have the opportunity to stand out and showcase work that is more reproducible, easier to build upon, and more credible. The scientist gains by making work easier to share and maintain within their own lab, and the scientific community gains by making underlying data or research materials more available for confirmation or for making new discoveries. The following protocol gives the author step-by-step instructions for using the free and open source OSF to create a data management plan, preregister their study, use version control, share data and other research materials, or post a preprint for quick and easy dissemination.

