Using R in hydrology: a review of recent developments and future directions

Abstract. The open-source programming language R has gained a central place in the hydrological sciences over the last decade, driven by the availability of diverse hydro-meteorological data archives and the development of open-source computational tools. The growth of R's usage in hydrology is reflected in the number of newly published hydrological packages, the strengthening of online user communities, and the popularity of training courses and events. In this paper, we explore the benefits and advantages of R's usage in hydrology, such as the democratization of data science and numerical literacy, the enhancement of reproducible research and open science, the access to statistical tools, the ease of connecting R to and from other languages, and the support provided by a growing community. This paper provides an overview of a typical hydrological workflow based on reproducible principles and packages for retrieval of hydro-meteorological data, spatial analysis, hydrological modelling, statistics, and the design of static and dynamic visualizations and documents. We discuss some of the challenges that arise when using R in hydrology and useful tools to overcome them, including the use of hydrological libraries, documentation, and vignettes (long-form guides that illustrate how to use packages); the role of integrated development environments (IDEs); and the challenges of big data and parallel computing in hydrology. Lastly, this paper provides a roadmap for R's future within hydrology, with R packages as a driver of progress in the hydrological sciences, application programming interfaces (APIs) providing new avenues for data acquisition and provision, enhanced teaching of hydrology in R, and the continued growth of the community via short courses and events.

Reproducibility in plant pathology: where do we stand and a way forward.

10.31220/agrirxiv.2021.00082 ◽

2021 ◽

Author(s):

Adam H. Sparks ◽

Emerson del Ponte ◽

Kaique S. Alves ◽

Zachary S. L. Foster ◽

Niklaus J. Grünwald

Keyword(s):

Plant Pathology ◽

Open Source ◽

Open Science ◽

Scientific Study ◽

Reproducible Research ◽

Research Practices ◽

Open Research ◽

Computational Code ◽

Share Data ◽

Scientific Results

Abstract. Open research practices have been highlighted extensively during the last ten years in many fields of scientific study as essential standards needed to promote transparency and reproducibility of scientific results. Scientific claims can only be evaluated based on how protocols, materials, equipment and methods were described; data were collected and prepared; and analyses were conducted. Openly sharing protocols, data and computational code is central for current scholarly dissemination and communication, but in many fields, including plant pathology, adoption of these practices has been slow. We randomly selected 300 articles published from 2012 to 2018 across 21 journals representative of the plant pathology discipline and assigned them scores reflecting their openness and reproducibility. We found that most of the articles were not following protocols for open science, and were failing to share data or code in a reproducible way. We also propose that use of open-source tools facilitates reproducible work and analyses, benefitting not just readers but the authors as well. Finally, we provide ideas and tools to promote open, reproducible research practices among plant pathologists.

The Bioinformatics Open Source Conference (BOSC) 2013

10.7287/peerj.preprints.83 ◽

2013 ◽

Author(s):

Nomi L Harris ◽

Peter J A Cock ◽

Brad A Chapman ◽

Jeremy Goecks ◽

Hans-Rudolf Hotz ◽

...

Keyword(s):

Computational Biology ◽

Open Source ◽

Intelligent Systems ◽

Scientific Discovery ◽

Public Library ◽

Open Science ◽

Reproducible Research ◽

Data Sets ◽

Translational Bioinformatics ◽

Cultural Issues

The 14th annual Bioinformatics Open Source Conference (BOSC) was held in Berlin in July 2013, bringing together over 100 bioinformatics researchers, developers and users of open source software. Since its inception in 2000, BOSC has been organised as a Special Interest Group (SIG) satellite meeting preceding the large International Conference on Intelligent Systems for Molecular Biology (ISMB), which is the annual meeting of the International Society for Computational Biology (ISCB). BOSC provides bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community, and a focused environment for developers and users to interact and share ideas about standards, software development practices, and practical techniques for solving bioinformatics problems. As in previous years, BOSC 2013 was preceded by a Codefest, a two day hackathon that brings together bioinformatics open source project developers and members of the community and allows them to work collaboratively and achieve greater interoperability between tools developed by different groups. The session topics at BOSC 2013 included several that have been popular in previous years, including Cloud and Parallel Computing, Visualization, Software Interoperability, Genome-scale Data Management, and a session for updates on ongoing open source projects, as well as two new sessions: Translational Bioinformatics, recognizing the growing use of computational biology in medical applications, and Open Science and Reproducible Research. Open Science, a movement dedicated to making all aspects of scientific knowledge production freely available for reuse and extension, not only validates published results by allowing others to reproduce them, but also accelerates the pace of scientific discovery by enabling researchers to more efficiently build on previous work, rather than having to reinvent tools and reassemble data sets. 
BOSC typically features two keynote talks by researchers who are influential in some aspect of open source bioinformatics. Our first keynote talk this year was by Cameron Neylon, the Advocacy Director for the Public Library of Science (PLOS), who is a prominent advocate for open science. He discussed the cultural issues that are hindering open science, and how openness in scientific collaborations can generate impact. Our second keynote speaker, Sean Eddy, who is perhaps best known as the author of the HMMER software suite, began his keynote talk with an inspiring history of how he got involved in bioinformatics and proceeded to argue that dedicating effort to thorough engineering in tool development, which is often shunned as incremental, can become the key to creating a lasting impact. With the increasing reliance of more and more fields of biology on computational tools to manage and analyze their data, BOSC is well positioned to stay relevant to life science, and thus life scientists, for many years to come.

pyOpenSci: Open and reproducible research, powered by Python

Biodiversity Information Science and Standards ◽

10.3897/biss.5.75688 ◽

2021 ◽

Vol 5 ◽

Author(s):

Michael Trizna ◽

Leah Wasser ◽

David Nicholson

Keyword(s):

Open Source ◽

Peer Review ◽

Open Source Software ◽

Best Practice ◽

Review Process ◽

Peer Review Process ◽

Application Programming Interface ◽

Open Science ◽

Lessons Learned ◽

Reproducible Research

pyOpenSci (short for Python Open Science), funded by the Alfred P. Sloan Foundation, is building a diverse community that supports well-documented, open source Python software that enables open reproducible science. pyOpenSci will work with the community to openly develop best practice guidelines and open standards for scientific Python software, which will be reinforced through a community-led peer review process and training. Packages that complete the peer review process become a part of the pyOpenSci ecosystem, where maintenance can be shared to ensure longevity and stability in code. pyOpenSci packages are also eligible for a “fast tracked” acceptance to JOSS (Journal of Open Source Software). In addition, we provide review for open science tools that would be of interest to TDWG members but are not within scope for JOSS, such as API (Application Programming Interface) wrappers. pyOpenSci is built on top of the successful model of rOpenSci, founded in 2011, which has fostered the development of several useful biodiversity informatics R packages. The pyOpenSci team looks to follow the lessons learned by rOpenSci to create a similarly successful community. We invite TDWG members developing open source software tools in Python to become part of the pyOpenSci community.

Learning Open Science by doing Open Science. A reflection of a qualitative research project-based seminar

Education for Information ◽

10.3233/efi-190308 ◽

2020 ◽

Vol 36 (3) ◽

pp. 263-279

Author(s):

Isabel Steinhardt

Keyword(s):

Qualitative Research ◽

Open Source ◽

Knowledge Society ◽

Open Data ◽

Open Science ◽

Research Project ◽

Science Practices ◽

Bachelor's Degrees ◽

Participatory Technologies ◽

Digital Knowledge

Openness in science and education is increasing in importance within the digital knowledge society. So far, little attention has been paid to teaching Open Science in bachelor’s degrees or in qualitative methods. Therefore, the aim of this article is to use a seminar example to explore what Open Science practices can be taught in qualitative research and how digital tools can be involved. The seminar focused on the following practices: Open Data practices; the practice of using the free and open source tool “Collaborative online Interpretation”; and the practice of participating, cooperating, collaborating and contributing through participatory technologies and in social (based) networks. To learn Open Science practices, the students were involved in a qualitative research project about “Use of digital technologies for the study and habitus of students”. The study shows that the practices of Open Data are easy to teach, whereas the use of free and open source tools and participatory technologies for collaboration, participation, cooperation and contribution is more difficult. In addition, a cultural shift would have to take place within German universities to promote Open Science practices in general.

Roundtable: Challenges in repeatable experiments and reproducible research in data science

Proceedings of Moscow Institute of Physics and Technology ◽

10.53815/20726759_2021_13_2_100 ◽

2021 ◽

Vol 13 (2) ◽

pp. 100-108

Author(s):

K.V. Vorontsov ◽

V.I. Iglovikov ◽

V.V. Strijov ◽

A.E. Ustuzhanin ◽

A.S. Khritankov

Keyword(s):

Data Science ◽

Reproducible Research

An Open Source-Based BCI Application for Virtual World Tour and Its Usability Evaluation

Frontiers in Human Neuroscience ◽

10.3389/fnhum.2021.647839 ◽

2021 ◽

Vol 15 ◽

Author(s):

Sanghum Woo ◽

Jongmin Lee ◽

Hyunji Kim ◽

Sungwoo Chun ◽

Daehyung Lee ◽

...

Keyword(s):

Open Source ◽

Virtual World ◽

Communication Channel ◽

Brain Computer Interface ◽

Open Science ◽

Computer Interface ◽

The Public ◽

Control Functions ◽

Computer Interfaces ◽

And Control

Brain–computer interfaces can provide a new communication channel and control functions to people with restricted movements. Recent studies have indicated the effectiveness of brain–computer interface (BCI) applications. Various types of applications have been introduced so far in this field, but the number of those available to the public is still insufficient. Thus, there is a need to expand the usability and accessibility of BCI applications. In this study, we introduce a BCI application for users to experience a virtual world tour. This software was built on three open-source environments and is publicly available through the GitHub repository. For a usability test, 10 healthy subjects participated in an electroencephalography (EEG) experiment and evaluated the system through a questionnaire. As a result, all the participants successfully played the BCI application with 96.6% accuracy with 20 blinks from two sessions and gave opinions on its usability (e.g., controllability, completeness, comfort, and enjoyment) through the questionnaire. We believe that this open-source BCI world tour system can be used in both research and entertainment settings and hopefully contribute to open science in the BCI field.

Open Code is not enough: Towards a replicable future for geographic data science

10.31235/osf.io/3hbnt ◽

2019 ◽

Author(s):

Levi John Wolf ◽

Sergio J. Rey ◽

Taylor M. Oshan

Keyword(s):

Spatial Data ◽

Data Science ◽

Current Model ◽

Open Science ◽

Social Changes ◽

Working Definition ◽

Geospatial Cyberinfrastructure ◽

Geographic Data ◽

Definition Of ◽

Healthy Part

Open science practices are a large and healthy part of computational geography and the burgeoning field of spatial data science. In many forms, open geospatial cyberinfrastructure adheres to a varying and informal set of practices and codes that empower levels of collaboration that are impossible otherwise. Pathbreaking work in geographical sciences has explicitly brought these concepts into focus for our current model of open science in geography. In practice, however, these blend together into a somewhat ill-advised but easy-to-use working definition of open science: you know open science when you see it (on GitHub). However, open science lags far behind the needs revealed by this level of collaboration. In this paper, we describe the concerns of open geographic data science, in terms of replicability and open science. We discuss the practical techniques that engender community-building in open science communities, and discuss the impacts that these kinds of social changes have on the technological architecture of scientific infrastructure.

Correction to: Gendered behavior as a disadvantage in open source software development

EPJ Data Science ◽

10.1140/epjds/s13688-019-0209-5 ◽

2019 ◽

Vol 8 (1) ◽

Author(s):

Balazs Vedres ◽

Orsolya Vasarhelyi

Keyword(s):

United Kingdom ◽

Software Development ◽

Open Source ◽

Open Source Software ◽

Data Science ◽

Open Source Software Development ◽

Central European ◽

University Of Oxford ◽

European University

Following publication of the original article [1], we have been notified that one more affiliation of the corresponding author is missing. Currently, Balazs Vedres's affiliation is: 1 Oxford Internet Institute, University of Oxford, Oxford, United Kingdom. It should be: 1 Oxford Internet Institute, University of Oxford, Oxford, United Kingdom; 2 Department of Network and Data Science, Central European University, Budapest, Hungary.
