The fortedata R package: open-science datasets from a manipulative experiment testing forest resilience

Author(s):  
Jeff W. Atkins ◽  
Elizabeth Agee ◽  
Alexandra Barry ◽  
Kyla M. Dahlin ◽  
Kalyn Dorheim ◽  
...  

Abstract. The fortedata R package is an open data notebook from the Forest Resilience Threshold Experiment (FoRTE) – a modeling and manipulative field experiment that tests the effects of disturbance severity and disturbance type on carbon cycling dynamics in a temperate forest. Package data consist of measurements of carbon pools and fluxes and ancillary measurements to help users analyse and interpret carbon cycling over time. Currently the package includes data and metadata from the first two years of FoRTE, serves as a central, updatable resource for the FoRTE project team, and is intended as a resource for external users over the course of the experiment and in perpetuity. Further, it supports all associated FoRTE publications, analyses, and modeling efforts. This increases efficiency, consistency, compatibility, and productivity while minimizing the duplicated effort and error propagation that can arise in a large, distributed, and collaborative effort. More broadly, fortedata represents an innovative, collaborative way of approaching science that unites and expedites the delivery of complementary datasets in near real time to the broader scientific community, increasing the transparency and reproducibility of taxpayer-funded science. fortedata is available via GitHub: https://github.com/FoRTExperiment/fortedata, and detailed documentation on the access, use, and applications of fortedata is available at https://fortexperiment.github.io/fortedata/. The first public release, version 1.0.1, is also archived at https://doi.org/10.5281/zenodo.3936146 (Atkins et al., 2020b). All level one data products are also available outside of the package as .csv files: https://doi.org/10.6084/m9.figshare.12292490.v3 (Atkins et al., 2020c).

2021 ◽  
Vol 13 (3) ◽  
pp. 943-952
Author(s):  
Jeff W. Atkins ◽  
Elizabeth Agee ◽  
Alexandra Barry ◽  
Kyla M. Dahlin ◽  
Kalyn Dorheim ◽  
...  

Abstract. The fortedata R package is an open data notebook from the Forest Resilience Threshold Experiment (FoRTE) – a modeling and manipulative field experiment that tests the effects of disturbance severity and disturbance type on carbon cycling dynamics in a temperate forest. Package data consist of measurements of carbon pools and fluxes and ancillary measurements to help analyze and interpret carbon cycling over time. Currently the package includes data and metadata from the first three FoRTE field seasons, serves as a central, updatable resource for the FoRTE project team, and is intended as a resource for external users over the course of the experiment and in perpetuity. Further, it supports all associated FoRTE publications, analyses, and modeling efforts. This increases efficiency, consistency, compatibility, and productivity while minimizing the duplicated effort and error propagation that can arise in a large, distributed, and collaborative effort. More broadly, fortedata represents an innovative, collaborative way of approaching science that unites and expedites the delivery of complementary datasets to the broader scientific community, increasing the transparency and reproducibility of taxpayer-funded science. The fortedata package is available via GitHub: https://github.com/FoRTExperiment/fortedata (last access: 19 February 2021), and detailed documentation on the access, use, and applications of fortedata is available at https://fortexperiment.github.io/fortedata/ (last access: 19 February 2021). The first public release, version 1.0.1, is also archived at https://doi.org/10.5281/zenodo.4399601 (Atkins et al., 2020b). All data products are also available outside of the package as .csv files: https://doi.org/10.6084/m9.figshare.13499148.v1 (Atkins et al., 2020c).


Publications ◽  
2021 ◽  
Vol 9 (3) ◽  
pp. 31
Author(s):  
Manh-Toan Ho ◽  
Manh-Tung Ho ◽  
Quan-Hoang Vuong

This paper seeks to introduce a strategy of science communication: Total SciComm, or all-out science communication. We propose that, to maximize outreach and impact, scientists should use different media to communicate different aspects of science, from core ideas to methods. The paper uses the example of a debate surrounding a now-retracted article in the journal Nature, in which open data, preprints, social media, and blogs were used for a meaningful scientific conversation. The case embodies the central idea of Total SciComm: the scientific community employs every medium to communicate scientific ideas and engages all scientists in the process.


2018 ◽  
Author(s):  
Ruben C. Arslan

Data documentation in psychology lags behind not only many other disciplines, but also basic standards of usefulness. Psychological scientists often divert the time and effort necessary to document existing data well into other duties, such as writing and collecting more data. Codebooks therefore tend to be unstandardised and stored in proprietary formats, and are rarely properly indexed in search engines. This means that rich datasets are sometimes used only once, by their creators, and left to disappear into oblivion; even when researchers can find such a dataset, they are unlikely to publish analyses based on it if they cannot be confident they understand it well enough. My codebook package makes it easier to generate rich metadata in human- and machine-readable codebooks. By using metadata from existing sources and by automating some tedious tasks, such as documenting psychological scales and reliabilities, summarising descriptives, and identifying missingness patterns, I aim to encourage researchers to use the package for their own or their team's benefit. The codebook R package and web app make it possible to generate rich codebooks in a few minutes and just three clicks. Over time, this could lead to psychological data becoming findable, accessible, interoperable, and reusable, and to reduced research waste, thereby benefiting the scientific community as a whole.
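The tedious tasks the abstract mentions (summarising descriptives and identifying missingness patterns) can be illustrated with a rough analogue. The sketch below is in Python with pandas, not the codebook R package itself, and the survey columns are hypothetical:

```python
# Rough analogue of automated codebook generation: one row of
# metadata per variable (type, descriptives, missingness).
# Not the codebook R package; column names are hypothetical.
import pandas as pd

def summarise(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column metadata: dtype, n, missingness, and mean."""
    rows = []
    for col in df.columns:
        s = df[col]
        rows.append({
            "variable": col,
            "dtype": str(s.dtype),
            "n": int(s.notna().sum()),
            "n_missing": int(s.isna().sum()),
            "mean": s.mean() if pd.api.types.is_numeric_dtype(s) else None,
        })
    return pd.DataFrame(rows)

survey = pd.DataFrame({
    "extraversion_1": [4, 5, None, 3],  # hypothetical scale item
    "age": [23, 31, 28, None],
})
print(summarise(survey))
```

A real codebook would add value labels, scale aggregation, and reliabilities on top of such a summary table.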


Author(s):  
Jan Homolak ◽  
Ivan Kodvanj ◽  
Davor Virag

Introduction: The pandemic of COVID-19, an infectious disease caused by SARS-CoV-2, motivated the scientific community to work together to gather, organize, process, and distribute data on the novel biomedical hazard. Here, we analyzed how the scientific community responded to this challenge by quantifying distribution and availability patterns of academic information related to COVID-19. The aim of our study was to assess the quality of the information flow and scientific collaboration, two factors we believe to be critical for finding new solutions for the ongoing pandemic. Materials and Methods: The RISmed R package and a custom Python script were used to fetch metadata on articles indexed in PubMed and published on the rXiv preprint server. Scopus was searched manually and the metadata were exported as a BibTeX file. Publication rate and publication status, affiliation and author count per article, and submission-to-publication time were analysed in R. The Biblioshiny application was used to create a world collaboration map. Results: Our preliminary data suggest that the COVID-19 pandemic resulted in the generation of a large amount of scientific data and demonstrate potential problems regarding information velocity, availability, and scientific collaboration in the early stages of the pandemic. More specifically, our results indicate a precarious overload of the standard publication systems, delayed adoption of preprint publishing, significant problems with data availability, and apparently deficient collaboration. Conclusion: We believe the scientific community could have used the data more efficiently to create proper foundations for finding new solutions for the COVID-19 pandemic. Moreover, we believe we can learn from this on the go and adopt open science principles and a more mindful approach to COVID-19-related data to accelerate the discovery of more efficient solutions.
We take this opportunity to invite our colleagues to contribute to this global scientific collaboration by publishing their findings with maximal transparency.
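The submission-to-publication times analysed above reduce to differences between paired dates in the article metadata. A minimal sketch in Python (the records and field names are hypothetical illustrations, not the authors' pipeline):

```python
# Minimal sketch: submission-to-publication lag in days, computed
# from per-article date pairs. Records and field names are
# hypothetical illustrations, not the study's actual data.
from datetime import date

articles = [
    {"received": date(2020, 3, 1), "published": date(2020, 3, 15)},
    {"received": date(2020, 2, 10), "published": date(2020, 4, 10)},
]

def lag_days(record: dict) -> int:
    """Days elapsed between submission (received) and publication."""
    return (record["published"] - record["received"]).days

lags = [lag_days(a) for a in articles]
mean_lag = sum(lags) / len(lags)
print(lags, mean_lag)
```

Aggregating such lags over time is one way to quantify the publication-system overload the abstract describes.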


Author(s):  
Jan Homolak ◽  
Ivan Kodvanj ◽  
Davor Virag

Introduction: The pandemic of COVID-19, an infectious disease caused by SARS-CoV-2, motivated the scientific community to work together to gather, organize, process, and distribute data on the novel biomedical hazard. Here, we analyzed how the scientific community responded to this challenge by quantifying distribution and availability patterns of academic information related to COVID-19. The aim of our study was to assess the quality of the information flow and scientific collaboration, two factors we believe to be critical for finding new solutions for the ongoing pandemic. Materials and methods: The RISmed R package and a custom Python script were used to fetch metadata on articles indexed in PubMed and published on the rXiv preprint server. Scopus was searched manually and the metadata were exported as a BibTeX file. Publication rate and publication status, affiliation and author count per article, and submission-to-publication time were analysed in R. The Biblioshiny application was used to create a world collaboration map. Results: Our preliminary data suggest that the COVID-19 pandemic resulted in the generation of a large amount of scientific data and demonstrate potential problems regarding information velocity, availability, and scientific collaboration in the early stages of the pandemic. More specifically, our results indicate a precarious overload of the standard publication systems, significant problems with data availability, and apparently deficient collaboration. Conclusion: We believe the scientific community could have used the data more efficiently to create proper foundations for finding new solutions for the COVID-19 pandemic. Moreover, we believe we can learn from this on the go and adopt open science principles and a more mindful approach to COVID-19-related data to accelerate the discovery of more efficient solutions.
We take this opportunity to invite our colleagues to contribute to this global scientific collaboration by publishing their findings with maximal transparency.


2020 ◽  
Author(s):  
Matti Vuorre ◽  
Matthew John Charles Crump

A consensus on the importance of open data and reproducible code is emerging. How should data and code be shared to maximize the key desiderata of reproducibility, permanence, and accessibility? Research assets should be stored persistently in formats that are not software restrictive, and documented so that others can reproduce and extend the required computations. The sharing method should be easy to adopt by already busy researchers. We suggest the R package standard as a solution for creating, curating, and communicating research assets. The R package standard, with extensions discussed herein, provides a format for assets and metadata that satisfies the above desiderata, facilitates reproducibility, open access, and sharing of materials through online platforms like GitHub and Open Science Framework. We discuss a stack of R resources that help users create reproducible collections of research assets, from experiments to manuscripts, in the RStudio interface. We created an R package, vertical, to help researchers incorporate these tools into their workflows, and discuss its functionality at length in an online supplement. Together, these tools may increase the reproducibility and openness of psychological science.
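One reason the R package standard satisfies the "not software restrictive" desideratum is that its metadata lives in a plain-text DESCRIPTION file (Debian control format), readable outside of R. A minimal sketch in Python, with hypothetical field values:

```python
# Minimal sketch: parsing R package DESCRIPTION metadata (Debian
# control format) outside of R, illustrating that the format is
# not software restrictive. The example content is hypothetical.
def parse_dcf(text: str) -> dict:
    """Parse 'Field: value' lines; indented lines continue the previous field."""
    fields, key = {}, None
    for line in text.splitlines():
        if line[:1].isspace() and key:           # continuation line
            fields[key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields

description = """\
Package: myproject
Title: A Reproducible Research Compendium
Version: 0.1.0
Description: Data, code, and manuscript for a
    hypothetical study, shared as an R package.
"""
meta = parse_dcf(description)
print(meta["Package"], meta["Version"])
```

Because the format is line-oriented plain text, any language or indexing service can recover the metadata without an R installation.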


2020 ◽  
Vol 2020 (11) ◽  
pp. 267-1-267-8
Author(s):  
Mitchell J.P. van Zuijlen ◽  
Sylvia C. Pont ◽  
Maarten W.A. Wijntjes

The human face is a popular motif in art, and depictions of faces can be found throughout history in nearly every culture. Artists have mastered the depiction of faces through careful experimentation using the relatively limited means of paints and oils. Many of the results of these experimentations are now available to the scientific domain thanks to the digitization of large art collections. In this paper we study the depiction of the face throughout history. We used an automated facial detection network to detect a set of 11,659 faces in 15,534 predominantly Western artworks from six international, digitized art galleries. We analyzed the pose and color of these faces and related these to changes over time and to gender differences. We find a number of previously known conventions, such as the convention of depicting the left cheek for females and vice versa for males, as well as previously unknown conventions, such as females being depicted looking slightly down. Our set of faces will be released to the scientific community for further study.
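Relating pose to gender, as described above, amounts to grouped aggregation over detected-face records. A minimal sketch (the records, fields, and angle values are hypothetical illustrations, not the study's data):

```python
# Minimal sketch: mean vertical gaze angle (pitch, in degrees;
# negative = looking down) grouped by depicted gender. All records
# and values are hypothetical, not the study's actual measurements.
from collections import defaultdict

faces = [
    {"gender": "female", "pitch": -4.0},
    {"gender": "female", "pitch": -2.0},
    {"gender": "male", "pitch": 1.0},
    {"gender": "male", "pitch": 0.0},
]

sums = defaultdict(lambda: [0.0, 0])   # gender -> [pitch total, count]
for f in faces:
    s = sums[f["gender"]]
    s[0] += f["pitch"]
    s[1] += 1

mean_pitch = {g: total / n for g, (total, n) in sums.items()}
print(mean_pitch)
```

The same grouped-aggregation pattern extends to color statistics or to binning by artwork date for the over-time comparisons.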


2021 ◽  
Author(s):  
Samir Das ◽  
Rida Abou-Haidar ◽  
Henri Rabalais ◽  
Sonia Denise Lai Wing Sun ◽  
Zaliqa Rosli ◽  
...  

Abstract. In January 2016, the Montreal Neurological Institute-Hospital (The Neuro) declared itself an Open Science organization. This vision extends beyond efforts by individual scientists seeking to release individual datasets or software tools, or to build platforms that provide for the free dissemination of such information. It involves multiple stakeholders and an infrastructure that considers governance, ethics, computational resourcing, physical design, workflows, training, education, and intra-institutional reporting structures. The C-BIG repository was built in response as The Neuro’s institutional biospecimen and clinical data repository; it collects biospecimens as well as clinical, imaging, and genetic data from patients with neurological disease and from healthy controls. It is aimed at helping scientific investigators, in both academia and industry, advance our understanding of neurological diseases and accelerate the development of treatments. As many neurological diseases are quite rare, their small patient populations present several challenges to researchers. Overcoming these challenges required the aggregation of datasets from various projects and locations. The C-BIG repository achieves this goal and stands as a scalable working model for institutions to collect, track, curate, archive, and disseminate multimodal data from patients. In November 2020, a Registered Access layer was made available to the wider research community at https://cbigr-open.loris.ca, and in May 2021 fully open data will be released to complement the Registered Access data. This article outlines many aspects of The Neuro’s transition to Open Science by describing the data to be released, C-BIG’s full capabilities, and the design aspects that were implemented for effective data sharing.


Data Science ◽  
2021 ◽  
pp. 1-21
Author(s):  
Caspar J. Van Lissa ◽  
Andreas M. Brandmaier ◽  
Loek Brinkman ◽  
Anna-Lena Lamprecht ◽  
Aaron Peikert ◽  
...  

Adopting open science principles can be challenging, requiring conceptual education and training in the use of new tools. This paper introduces the Workflow for Open Reproducible Code in Science (WORCS): a step-by-step procedure that researchers can follow to make a research project open and reproducible. This workflow intends to lower the threshold for adoption of open science principles. It is based on established best practices and can be used either in parallel to, or in the absence of, top-down requirements by journals, institutions, and funding bodies. To facilitate widespread adoption, the WORCS principles have been implemented in the R package worcs, which offers an RStudio project template and utility functions for specific workflow steps. This paper introduces the conceptual workflow, discusses how it meets different standards for open science, and describes the functionality provided by the R implementation, worcs. The paper is primarily targeted towards scholars conducting research projects in R that involve academic prose, analysis code, and tabular data. However, the workflow is flexible enough to accommodate other scenarios and offers a starting point for customized solutions. The source code for the R package and manuscript, and a list of examples of WORCS projects, are available at https://github.com/cjvanlissa/worcs.
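One common building block of such reproducible workflows is verifying that shared research assets have not silently changed between the author's run and a reader's re-run. The sketch below is a generic illustration of that idea using content checksums (in Python; it is not the worcs package's own implementation):

```python
# Generic illustration: verify a research asset by content checksum,
# a common building block of reproducible-workflow tooling.
# Not the worcs package's actual implementation.
import hashlib

def checksum(data: bytes) -> str:
    """SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()

original = b"participant,score\n1,42\n"
recorded = checksum(original)   # stored alongside the project when shared

# Later, a reader re-downloads the data and verifies it is unchanged:
assert checksum(b"participant,score\n1,42\n") == recorded
# A single changed byte is detected:
assert checksum(b"participant,score\n1,41\n") != recorded
print("data verified:", recorded[:12])
```

Recording such digests with the project lets any later reader confirm they are reproducing results from the same inputs.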

