scientific software Latest Research Papers

Improving bioinformatics software quality through incorporation of software engineering practices

Background: Bioinformatics software is developed for collecting, analyzing, integrating, and interpreting life science datasets that are often enormous. Bioinformatics engineers often lack the software engineering skills necessary for developing robust, maintainable, reusable software. This study presents a review and discussion of the findings and efforts made to improve the quality of bioinformatics software.

Methodology: A systematic review was conducted of related literature that identifies core software engineering concepts for improving bioinformatics software development: requirements gathering, documentation, testing, and integration. The findings are presented with the aim of illuminating trends within the research that could lead to viable solutions to the struggles faced by bioinformatics engineers when developing scientific software.

Results: The findings suggest that bioinformatics engineers could significantly benefit from the incorporation of software engineering principles into their development efforts. This leads to the suggestion of both cultural changes within bioinformatics research communities and the adoption of software engineering disciplines into the formal education of bioinformatics engineers. Open management of scientific bioinformatics development projects can result in improved software quality through collaboration amongst both bioinformatics engineers and software engineers.

Conclusions: While strides have been made both in the identification and solution of issues of particular import to bioinformatics software development, there is still room for improvement in terms of shifts in both the formal education of bioinformatics engineers and the culture and approaches of managing scientific bioinformatics research and development efforts.

Ontology-based data integration for the internet of things in a scientific software ecosystem

International Journal of Computer Applications in Technology ◽ 10.1504/ijcat.2022.10044215 ◽ 2022 ◽ Vol 1 (1) ◽ pp. 1

Author(s): Jade Ferreira ◽ Leonardo De Aguiar ◽ Victor Stroele ◽ Fernanda Campos ◽ Regina Braga ◽ ...

Keyword(s): Internet Of Things ◽ Data Integration ◽ The Internet ◽ Scientific Software ◽ Software Ecosystem ◽ The Internet Of Things

When Scientific Software Meets Software Engineering

Computer ◽ 10.1109/mc.2021.3102299 ◽ 2021 ◽ Vol 54 (12) ◽ pp. 60-71

Author(s): Dorian Leroy ◽ June Sallou ◽ Johann Bourcier ◽ Benoit Combemale

Keyword(s): Software Engineering ◽ Scientific Software

ASpecD: A Modular Framework for the Analysis of Spectroscopic Data Focussing on Reproducibility and Good Scientific Practice

10.26434/chemrxiv-2021-6jt1l-v2 ◽ 2021

Author(s): Jara Popp ◽ Till Biskup

Keyword(s): Data Analysis ◽ Best Practices ◽ Spectroscopic Data ◽ Scientific Practice ◽ Scientific Software ◽ Good Scientific Practice ◽ Viable Approach ◽ Python Programming ◽ User Friendly ◽ Processing Steps

Reproducibility is at the heart of science. However, most published results usually lack the information necessary to be independently reproduced. Even more, most authors will not be able to reproduce the results from a few years ago due to lacking a gap-less record of every processing and analysis step including all parameters involved. There is only one way to overcome this problem: developing robust tools for data analysis that, while maintaining a maximum of flexibility in their application, allow the user to perform advanced processing steps in a scientifically sound way. At the same time, the only viable approach for reproducible and traceable analysis is to relieve the user of the responsibility for logging all processing steps and their parameters. This can only be achieved by using a system that takes care of these crucial though often neglected tasks. Here, we present a solution to this problem: a framework for the analysis of spectroscopic data (ASpecD) written in the Python programming language that can be used without any actual programming needed. This framework is made available open-source and free of charge and focusses on usability, small footprint and modularity while ensuring reproducibility and good scientific practice. Furthermore, we present a set of best practices and design rules for scientific software development and data analysis. Together, this empowers scientists to focus on their research minimising the need to implement complex software tools while ensuring full reproducibility. We anticipate this to have a major impact on reproducibility and good scientific practice, as we raise the awareness of their importance, summarise proven best practices and present a working user-friendly software solution.
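
The idea of relieving the user of logging can be illustrated with a minimal Python sketch. This is not ASpecD's API: the names `Dataset`, `HistoryRecord`, `process`, `subtract_baseline`, and `scale` are all hypothetical, chosen only to show how a container might record each processing step and its parameters automatically.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class HistoryRecord:
    """One entry in the processing log (hypothetical structure)."""
    step: str
    parameters: dict[str, Any]
    timestamp: str


@dataclass
class Dataset:
    """Numeric data plus a log of every step applied to it (hypothetical)."""
    data: list[float]
    history: list[HistoryRecord] = field(default_factory=list)

    def process(self, func: Callable[..., list[float]], **parameters: Any) -> Dataset:
        # Apply the step, then record its name and parameters automatically,
        # so the user never has to write the log by hand.
        self.data = func(self.data, **parameters)
        self.history.append(
            HistoryRecord(
                step=func.__name__,
                parameters=dict(parameters),
                timestamp=datetime.now(timezone.utc).isoformat(),
            )
        )
        return self


def subtract_baseline(values: list[float], offset: float) -> list[float]:
    """Illustrative processing step: subtract a constant offset."""
    return [v - offset for v in values]


def scale(values: list[float], factor: float) -> list[float]:
    """Illustrative processing step: multiply by a constant factor."""
    return [v * factor for v in values]


dataset = Dataset(data=[1.0, 2.0, 3.0])
dataset.process(subtract_baseline, offset=1.0).process(scale, factor=2.0)

for record in dataset.history:
    print(record.step, record.parameters)
```

Because every change to the data goes through `process`, the history stays complete as long as steps are not applied by any other route; a real framework would likely also need to record software versions and the origin of the raw data.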

Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks

Nature Communications ◽ 10.1038/s41467-021-27222-7 ◽ 2021 ◽ Vol 12 (1)

Author(s): Julia Koehler Leman ◽ Sergey Lyskov ◽ Steven M. Lewis ◽ Jared Adolf-Bryfogle ◽ Rebecca F. Alford ◽ ...

Keyword(s): Scientific Community ◽ High Performance ◽ Scientific Software ◽ Modeling Tools ◽ Design Concepts ◽ Software Applications ◽ Engineering Practices ◽ Macromolecular Modeling ◽ High Performance Computing Cluster ◽ Reproducible Manner

Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.

ASpecD: A Modular Framework for the Analysis of Spectroscopic Data Focussing on Reproducibility and Good Scientific Practice

10.26434/chemrxiv-2021-6jt1l ◽ 2021

Author(s): Jara Popp ◽ Till Biskup

Keyword(s): Data Analysis ◽ Best Practices ◽ Spectroscopic Data ◽ Scientific Practice ◽ Scientific Software ◽ Good Scientific Practice ◽ Viable Approach ◽ Python Programming ◽ User Friendly ◽ Processing Steps

Reproducibility is at the heart of science. However, most published results usually lack the information necessary to be independently reproduced. Even more, most authors will not be able to reproduce the results from a few years ago due to lacking a gap-less record of every processing and analysis step including all parameters involved. There is only one way to overcome this problem: developing robust tools for data analysis that, while maintaining a maximum of flexibility in their application, allow the user to perform advanced processing steps in a scientifically sound way. At the same time, the only viable approach for reproducible and traceable analysis is to relieve the user of the responsibility for logging all processing steps and their parameters. This can only be achieved by using a system that takes care of these crucial though often neglected tasks. Here, we present a solution to this problem: a framework for the analysis of spectroscopic data (ASpecD) written in the Python programming language that can be used without any actual programming needed. This framework is made available open-source and free of charge and focusses on usability, small footprint and modularity while ensuring reproducibility and good scientific practice. Furthermore, we present a set of best practices and design rules for scientific software development and data analysis. Together, this empowers scientists to focus on their research minimising the need to implement complex software tools while ensuring full reproducibility. We anticipate this to have a major impact on reproducibility and good scientific practice, as we raise the awareness of their importance, summarise proven best practices and present a working user-friendly software solution.

Ten simple rules on writing clean and reliable open-source scientific software

PLoS Computational Biology ◽ 10.1371/journal.pcbi.1009481 ◽ 2021 ◽ Vol 17 (11) ◽ pp. e1009481

Author(s): Haley Hunter-Zinck ◽ Alexandre Fioravante de Siqueira ◽ Váleri N. Vásquez ◽ Richard Barnes ◽ Ciera C. Martinez

Keyword(s): Software Development ◽ Open Source ◽ Best Practice ◽ Scientific Research ◽ Development Project ◽ Scientific Software ◽ Software Development Project ◽ Code Unit ◽ Software Code ◽ Unit Tests

Functional, usable, and maintainable open-source software is increasingly essential to scientific research, but there is a large variation in formal training for software development and maintainability. Here, we propose 10 “rules” centered on 2 best practice components: clean code and testing. These 2 areas are relatively straightforward and provide substantial utility relative to the learning investment. Adopting clean code practices helps to standardize and organize software code in order to enhance readability and reduce cognitive load for both the initial developer and subsequent contributors; this allows developers to concentrate on core functionality and reduce errors. Clean coding styles make software code more amenable to testing, including unit tests that work best with modular and consistent software code. Unit tests interrogate specific and isolated coding behavior to reduce coding errors and ensure intended functionality, especially as code increases in complexity; unit tests also implicitly provide example usages of code. Other forms of testing are geared to discover erroneous behavior arising from unexpected inputs or emerging from the interaction of complex codebases. Although conforming to coding styles and designing tests can add time to the software development project in the short term, these foundational tools can help to improve the correctness, quality, usability, and maintainability of open-source scientific software code. They also advance the principal point of scientific research: producing accurate results in a reproducible way. In addition to suggesting several tips for getting started with clean code and testing practices, we recommend numerous tools for the popular open-source scientific software languages Python, R, and Julia.
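
As a rough illustration of how a unit test interrogates one isolated behavior and doubles as a usage example, consider the following minimal Python sketch. The function `normalize` and the test class `TestNormalize` are hypothetical and are not taken from the paper; the standard-library `unittest` module is used here only because it needs no installation, and the tools the authors actually recommend are not listed in this abstract.

```python
import unittest


def normalize(values):
    """Scale a list of numbers so that they sum to 1.

    Raises ValueError if the values sum to zero (including an empty list),
    because the result would be undefined.
    """
    total = sum(values)
    if total == 0:
        raise ValueError("values must have a non-zero sum")
    return [v / total for v in values]


class TestNormalize(unittest.TestCase):
    def test_known_input(self):
        # Intended behavior on a small, hand-checked example.
        self.assertEqual(normalize([1, 1, 2]), [0.25, 0.25, 0.5])

    def test_result_sums_to_one(self):
        # A property that should hold for any valid input.
        self.assertAlmostEqual(sum(normalize([3, 5, 7, 11])), 1.0)

    def test_empty_input_is_rejected(self):
        # Unexpected input should fail loudly rather than return nonsense.
        with self.assertRaises(ValueError):
            normalize([])


if __name__ == "__main__":
    unittest.main()
```

Each test checks one behavior of one small function, which is only practical because `normalize` is modular and free of side effects; this is the link between clean code and testability that the abstract describes.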

Sustainability and Digital Teaching Competence in Higher Education

Sustainability ◽ 10.3390/su132212354 ◽ 2021 ◽ Vol 13 (22) ◽ pp. 12354

Author(s): Pilar Colás-Bravo ◽ Jesús Conde-Jiménez ◽ Salvador Reyes-de-Cózar

Keyword(s): Higher Education ◽ Systematic Review ◽ Sources Of Information ◽ Digital Competence ◽ Scientific Software ◽ Teaching Competencies ◽ University Environment ◽ Teaching Competence ◽ Qualitative Systematic Review ◽ The University

This article examines the research that explores the relationship between sustainability and digital teaching competence in the university environment, through a qualitative systematic review, which covers 2011 to 2021. It is intended to identify how sustainability is applied in higher education through teaching experiences linked to the use of ICT, where the digital teaching competence is specified and put into practice. In other words, it is about responding to the following questions: What digital skills are being applied to develop educational sustainability in higher education? In which aspects of educational and pedagogical sustainability are they projected? As a work methodology, the PRISMA protocol is applied as the technique of systematic review, using the Scopus and WOS databases as sources of information. Subsequently, a qualitative analysis of the selected articles is carried out using the ATLAS.ti scientific software, using the DigCompEdu model as the basis for the analysis of the information. The results shed light on the panorama of research on digital competence and sustainability and the evolution of scientific production over ten years, as well as the methodology applied in these studies. The DigCompEdu model is found to be useful for registering the modalities of teaching competencies put into practice, manifesting a primacy of pedagogical digital competences over those of professional development and student empowerment. Sustainability development areas are also identified, linked to teaching digital competence, such as inclusion, educational quality or lifelong learning.

A Framework for Creating Knowledge Graphs of Scientific Software Metadata

Quantitative Science Studies ◽ 10.1162/qss_a_00167 ◽ 2021 ◽ pp. 1-37

Author(s): Aidan Kelley ◽ Daniel Garijo

Keyword(s): Artificial Intelligence ◽ Computational Methods ◽ Web Sites ◽ Knowledge Graph ◽ Scientific Software ◽ Scientific Publications ◽ Knowledge Graphs ◽ Set Up ◽ Number Of Publications

An increasing number of researchers rely on computational methods to generate or manipulate the results described in their scientific publications. Software created to this end (scientific software) is key to understanding, reproducing, and reusing existing work in many disciplines, ranging from Geosciences to Astronomy or Artificial Intelligence. However, scientific software is usually challenging to find, set up, and compare to similar software due to its disconnected documentation (dispersed in manuals, readme files, web sites, and code comments) and the lack of structured metadata to describe it. As a result, researchers have to manually inspect existing tools in order to understand their differences and incorporate them into their work. This approach scales poorly with the number of publications and tools made available every year. In this paper we address these issues by introducing a framework for automatically extracting scientific software metadata from its documentation (in particular, their readme files); a methodology for structuring the extracted metadata in a Knowledge Graph (KG) of scientific software; and an exploitation framework for browsing and comparing the contents of the generated KG. We demonstrate our approach by creating a KG with metadata from over ten thousand scientific software entries from public code repositories.

Hands-on with OWL: the Oak–Ridge Wang–Landau Monte Carlo software suite

Journal of Physics Conference Series ◽ 10.1088/1742-6596/2122/1/012001 ◽ 2021 ◽ Vol 2122 (1) ◽ pp. 012001

Author(s): Ying Wai Li ◽ Krishna Chaitanya Pitike ◽ Markus Eisenbach ◽ Valentino R. Cooper

Keyword(s): Monte Carlo ◽ Large Scale ◽ Condensed Matter ◽ Simulation Studies ◽ Scientific Software ◽ Materials Properties ◽ Oak Ridge ◽ Hands On ◽ Recent Developments ◽ Computer Simulation Studies

The Oak–Ridge Wang–Landau (OWL) package is an open-source scientific software specialized for large-scale, Monte Carlo simulations for the study of materials properties at finite temperature. In this paper, we discuss the main features and capabilities of OWL, followed by detailed descriptions of building and running the code. The readers will be guided through the usage and functionality of the code with a few hands-on examples. This paper is based on a tutorial on OWL given at the 32nd Center for Simulational Physics Workshop on Recent Developments in Computer Simulation Studies in Condensed Matter Physics.

scientific software
Recently Published Documents

Improving bioinformatics software quality through incorporation of software engineering practices

Ontology-based data integration for the internet of things in a scientific software ecosystem

When Scientific Software Meets Software Engineering

ASpecD: A Modular Framework for the Analysis of Spectroscopic Data Focussing on Reproducibility and Good Scientific Practice

Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks

ASpecD: A Modular Framework for the Analysis of Spectroscopic Data Focussing on Reproducibility and Good Scientific Practice

Ten simple rules on writing clean and reliable open-source scientific software

Sustainability and Digital Teaching Competence in Higher Education

A Framework for Creating Knowledge Graphs of Scientific Software Metadata

Hands-on with OWL: the Oak–Ridge Wang–Landau Monte Carlo software suite
