Creating and sharing reproducible research code the workflowr way

Making scientific analyses reproducible, well documented, and easily shareable is crucial to maximizing their impact and ensuring that others can build on them. However, accomplishing these goals is not easy: it requires careful attention to organization and workflow, as well as familiarity with tools that are not a regular part of every scientist's toolbox. We have developed an R package, workflowr, to help all scientists, regardless of background, overcome these challenges. Workflowr aims to instill a particular "workflow" (a sequence of steps to be repeated and integrated into research practice) that helps make projects more reproducible and accessible. This workflow integrates four key elements: (1) version control (via Git); (2) literate programming (via R Markdown); (3) automatic checks and safeguards that improve code reproducibility; and (4) sharing code and results via a browsable website. These features exploit powerful existing tools, whose mastery would take considerable study. However, the workflowr interface is simple enough that novice users can quickly enjoy its many benefits. By simply following the workflowr "workflow", R users can create projects whose results, figures, and development history are easily accessible on a static website, and thus conveniently shareable with collaborators by sending them a URL, and accompanied by source code and reproducibility safeguards. The workflowr R package is open source and available on CRAN, with full documentation and source code available at https://github.com/jdblischak/workflowr.

fullsibQTL: an R package for QTL mapping in biparental populations of outcrossing species

10.1101/2020.12.04.412262 ◽

2020 ◽

Author(s):

Rodrigo Gazaffi ◽

Rodrigo R. Amadeu ◽

Marcelo Mollinari ◽

João R. B. F. Rosa ◽

Cristiane H. Taniguti ◽

...

Keyword(s):

Qtl Mapping ◽

Open Source ◽

Qtl Analysis ◽

Source Code ◽

R Package ◽

Genetic Maps ◽

Linkage Phase ◽

Position Effects ◽

Genetic Features ◽

Outcrossing Species

Accurate QTL mapping in outcrossing species requires software that considers the genetic features of these populations, such as markers with different segregation patterns and different levels of information. Although the mapping procedures available to date allow inferring QTL position and effects, they are mostly not based on multilocus genetic maps. Basing a QTL analysis on such maps is crucial, since they allow informative markers to propagate their information to less informative intervals of the map. We developed fullsibQTL, a novel and freely available R package that performs composite interval QTL mapping for outcrossing populations and markers with different segregation patterns. It estimates QTL position, effects, segregation patterns, and linkage phase with flanking markers. Additionally, several statistical and graphical tools are implemented for straightforward analysis and interpretation. fullsibQTL is an open source R package with C and R source code (GPLv3). It is multiplatform and can be installed from https://github.com/augusto-garcia/fullsibQTL.

A Prologue of Git and SVN

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a9451.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 988-990

Keyword(s):

Control System ◽

Open Source ◽

Control Systems ◽

Source Code ◽

Version Control ◽

Control Software ◽

The World ◽

Revision Control ◽

Version Control System ◽

General Public License

Version control software, also called revision control software, is among the most important tools in software development. In this paper, we describe two version control tools: Git and Apache Subversion. Git is a free and open source code management and version control system distributed under the GNU General Public License. Apache Subversion, abbreviated as SVN, is a software versioning and revision control system distributed as open source under the Apache License. The design and functionality of Git and the usage of Git and SVN are discussed in this paper. The goal of this research paper is to highlight the Git and SVN tools and to evaluate and compare five version control tools to ascertain their usage and efficacy.

ggVennDiagram: An Intuitive, Easy-to-Use, and Highly Customizable R Package to Generate Venn Diagram

Frontiers in Genetics ◽

10.3389/fgene.2021.706907 ◽

2021 ◽

Vol 12 ◽

Author(s):

Chun-Hui Gao ◽

Guangchuang Yu ◽

Peng Cai

Keyword(s):

Open Source ◽

Open Source Software ◽

Source Code ◽

R Package ◽

Venn Diagram ◽

Free Access ◽

High Quality ◽

Venn Diagrams ◽

Label Edge ◽

The Cost

Venn diagrams are widely used to show set relationships in biomedical studies. In this study, we developed ggVennDiagram, an R package that automatically generates high-quality Venn diagrams with two to seven sets. ggVennDiagram is built on ggplot2, and it integrates the advantages of existing packages, such as venn, RVenn, VennDiagram, and sf. Satisfactory results can be obtained with minimal configuration. Furthermore, we designed comprehensive objects to store the entire data of the Venn diagram, which allow free access to both intersection values and Venn plot sub-elements, such as set label/edge and region label/filling. Therefore, every Venn plot sub-element can be highly customized without increasing the cost of learning when the user is familiar with ggplot2 methods. To date, ggVennDiagram has been cited in more than 10 publications, and its source code repository has been starred by more than 140 GitHub users, suggesting great potential in applications. The package is open-source software released under the GPL-3 license, and it is freely available through CRAN (https://cran.r-project.org/package=ggVennDiagram).
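To make the notion of "intersection values" concrete, the following is a minimal Python sketch of how the exclusive regions of a Venn diagram can be computed from named sets. It is not the ggVennDiagram API or its internal data structure; the function name `venn_regions` and the example gene lists are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def venn_regions(sets):
    """Map each combination of set names to the elements found in exactly those sets.

    `sets` is a dict of name -> set. Each returned region holds the elements
    that belong to every set in the combination and to no other set.
    """
    names = list(sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Elements shared by all sets in this combination.
            inside = set.intersection(*(sets[n] for n in combo))
            # Elements that also appear in any set outside the combination.
            outside = set().union(*(sets[n] for n in names if n not in combo))
            regions[combo] = inside - outside
    return regions


# Hypothetical example data: three small gene lists.
gene_lists = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
regions = venn_regions(gene_lists)
for combo, members in regions.items():
    print("&".join(combo), sorted(members))
```

Because every element falls in exactly one region, the region sizes sum to the size of the union of all sets, which is a convenient sanity check before plotting.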

modglm: An R package for testing, interpreting, and displaying interactions in generalized linear models of discrete data

10.31234/osf.io/6vmsa ◽

2021 ◽

Author(s):

Connor McCabe ◽

Max Andrew Halvorson ◽

Kevin Michael King ◽

Xiaolin Cao ◽

Dale Sim Kim

Keyword(s):

Open Source ◽

Generalized Linear Models ◽

Open Source Software ◽

Interaction Effect ◽

Software Package ◽

Linear Models ◽

R Package ◽

Interaction Effects ◽

Research Practice ◽

Nonlinear Scale

Many researchers hope to examine interaction effects using generalized linear models (GLMs) to predict outcomes on nonlinear scales. For instance, logistic and Poisson GLMs are used to estimate associations between predictors and outcomes in nonlinear probability and count scales, respectively. However, we (McCabe et al., 2021; Halvorson et al., in press) and others (Ai & Norton, 2003; Mize, 2019; Norton, Wang, & Ai, 2004) have shown that testing and interpreting interaction effects on these scales is not straightforward. GLMs require the application of partial derivatives and/or discrete differences to compute and probe interaction effects appropriately when models are interpreted on their nonlinear scale. Currently available open-source software does not provide methods of computing these interaction effects on probability and count scales, reflecting a central limitation in applying these methods in research practice. Here, we introduce `modglm`, an R-based software package that accompanies our manuscript providing recommendations for computing interaction effects in nonlinear probability and counts (McCabe et al., 2021). This software produces the interaction effect between two variables in generalized linear models of probabilities and counts and provides additional statistics and plotting utilities for evaluating and describing this effect.
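The point about partial derivatives and discrete differences can be illustrated with a short Python sketch for a logistic model with two predictors. This is not the `modglm` interface; the function names and coefficient values below are hypothetical. For binary predictors the interaction on the probability scale is taken as a double discrete difference, and for continuous predictors as the cross-partial derivative of the predicted probability.

```python
import math


def prob(x1, x2, b0, b1, b2, b12):
    """Predicted probability from a logistic model with a product term."""
    eta = b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
    return 1.0 / (1.0 + math.exp(-eta))


def double_difference(b0, b1, b2, b12):
    """Interaction for two binary predictors on the probability scale:
    the effect of x1 when x2 = 1 minus the effect of x1 when x2 = 0."""
    effect_at_x2_1 = prob(1, 1, b0, b1, b2, b12) - prob(0, 1, b0, b1, b2, b12)
    effect_at_x2_0 = prob(1, 0, b0, b1, b2, b12) - prob(0, 0, b0, b1, b2, b12)
    return effect_at_x2_1 - effect_at_x2_0


def cross_partial(x1, x2, b0, b1, b2, b12):
    """Interaction for two continuous predictors: d^2 p / (dx1 dx2)."""
    p = prob(x1, x2, b0, b1, b2, b12)
    slope = p * (1.0 - p)                # first derivative of the logistic function
    curvature = slope * (1.0 - 2.0 * p)  # second derivative of the logistic function
    return b12 * slope + (b1 + b12 * x2) * (b2 + b12 * x1) * curvature


# Hypothetical coefficients with NO product term (b12 = 0).
coefs = dict(b0=-1.0, b1=0.8, b2=0.5, b12=0.0)
print("double difference:", double_difference(**coefs))
print("cross-partial at (0.5, 0.5):", cross_partial(0.5, 0.5, **coefs))
```

Even with the product-term coefficient set to zero, both quantities are generally nonzero and depend on where they are evaluated, which is why the coefficient alone does not describe the interaction on the probability scale.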

Current Protocols: Open and Reproducible Research on OSF

10.31219/osf.io/ztvu9 ◽

2019 ◽

Author(s):

Ian Sullivan ◽

Alexander Carl DeHaven ◽

David Thomas Mellor

Keyword(s):

Data Management ◽

Open Source ◽

Scientific Community ◽

Management Plan ◽

Version Control ◽

Reproducible Research ◽

Research Practices ◽

Share Data

By implementing more transparent research practices, authors have the opportunity to stand out and showcase work that is more reproducible, easier to build upon, and more credible. The scientist gains by making work easier to share and maintain within their own lab, and the scientific community gains by making underlying data or research materials more available for confirmation or making new discoveries. The following protocol gives the author step by step instructions for using the free and open source OSF to create a data management plan, preregister their study, use version control, share data and other research materials, or post a preprint for quick and easy dissemination.

THE TECHNOLOGY OF DETECTION INFORMATION ABOUT VULNERABILITY IN THIRD-PARTY OPEN SOURCE SOFTWARE

ИНФОРМАЦИЯ И БЕЗОПАСНОСТЬ ◽

10.36622/vstu.2020.23.3.003 ◽

2020 ◽

pp. 347-364

Author(s):

Алексей Леонидович Сердечный ◽

Игорь Васильевич Герасимов ◽

Олег Юрьевич Макаров ◽

Юрий Геннадьевич Пастернак ◽

Николай Михайлович Тихомиров ◽

...

Keyword(s):

Risk Assessment ◽

Open Source ◽

Open Source Software ◽

Rapid Detection ◽

Source Code ◽

Third Party ◽

Version Control ◽

Software Developers ◽

Open Source Code

The article presents the results of developing a technology for detecting information about vulnerabilities in third-party software components, which allows timely detection of security problems associated with the use of borrowed open source components. The technology is distinguished by its procedures for rapid detection, ranking, and confirmation of the reliability of the primary sources of reports about such problems. It is based on collecting and semantically analyzing information about bugs and about the means (algorithms) of exploiting software vulnerabilities contained in messages published on the information resources of open source software developers. The technology includes a procedure for confirming information about the most dangerous vulnerabilities, followed by a risk assessment for the confirmed vulnerabilities. The article also presents the results of implementing the proposed technology as a tool for collecting and interactively analyzing bug reports in open source software hosted on the GitHub and GitLab collaborative development platforms. The technology makes it possible to increase the security of software that incorporates publicly available open source components.

Impact of Clone Refactoring on External Quality Attributes of Open Source Softwares

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit183833 ◽

2018 ◽

pp. 86-94

Author(s):

Himanshi Vashisht ◽

Sanjay Bharadwaj ◽

Sushma Sharma

Keyword(s):

Open Source ◽

Internal Structure ◽

Software Quality ◽

Source Code ◽

Quality Attributes ◽

Software Component ◽

External Quality ◽

Code Refactoring ◽

Observable Behaviour

Code refactoring is the "process of restructuring an existing source code". It changes source code in such a way that its internal structure improves while its external behaviour stays the same. It is a way to clean up code that minimizes the chances of introducing bugs. Refactoring is a change made to the internal structure of a software component to make it easier to understand and cheaper to modify, without changing the observable behaviour of that software component. Bad smells indicate that something is wrong in the code and that it needs to be refactored. Different tools are available to identify and remove these bad smells. Software has two types of quality attributes: internal and external. In this paper we study the effect of clone refactoring on software quality attributes.
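As a small illustration of what clone refactoring means in practice, the Python sketch below shows two duplicated (cloned) functions and a refactored version in which the shared logic is extracted into one helper. The function names and data are hypothetical and are not taken from the paper; the checks at the end confirm that the observable behaviour is unchanged.

```python
# Before refactoring: two functions that are clones of each other.
def average_order_value(orders):
    total = 0.0
    for amount in orders:
        total += amount
    return total / len(orders) if orders else 0.0


def average_rating(ratings):
    total = 0.0
    for score in ratings:
        total += score
    return total / len(ratings) if ratings else 0.0


# After refactoring: the duplicated logic lives in a single helper.
def mean_or_zero(values):
    """Return the arithmetic mean of values, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def average_order_value_refactored(orders):
    return mean_or_zero(orders)


def average_rating_refactored(ratings):
    return mean_or_zero(ratings)


# The external behaviour is the same before and after the change.
for sample in ([10.0, 20.0, 30.0], [4, 5, 3], []):
    assert average_order_value(sample) == average_order_value_refactored(sample)
    assert average_rating(sample) == average_rating_refactored(sample)
print("behaviour preserved")
```

The internal structure is simpler after the change, since a future fix to the averaging logic needs to be made in only one place.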
