Using ReproZip for Reproducibility and Library Services

Mapping Intimacies ◽

10.31229/osf.io/5tm8d ◽

2017 ◽

Author(s):

Vicky Steeves ◽

Rémi Rampin ◽

Fernando Chirigati

Keyword(s):

Machine Learning ◽

Open Source ◽

Operating Systems ◽

Digital Libraries ◽

Digital Humanities ◽

Use Cases ◽

Library Services ◽

Open Source Tool ◽

Library Use ◽

New Machine

This is a pre-print of a manuscript pending publication. Achieving research reproducibility is challenging in many ways: there are social and cultural obstacles as well as a constantly changing technical landscape that makes replicating and reproducing research difficult. Users face challenges in reproducing research across different operating systems, in using different versions of software across long projects and among collaborations, and in using publicly available work. The dependencies required to reproduce the computational environments in which research happens can be exceptionally hard to track – in many cases, these dependencies are hidden or nested too deeply to discover, and thus impossible to install on a new machine, which means adoption remains low. In this paper, we present ReproZip, an open source tool to help overcome the technical difficulties involved in preserving and replicating research, applications, databases, software, and more. We examine the current use cases of ReproZip, ranging from digital humanities to machine learning. We also explore potential library use cases for ReproZip, particularly in digital libraries and archives, liaison librarianship, and other library services. We believe that libraries and archives can leverage ReproZip to deliver more robust reproducibility services, repository services, as well as enhanced discoverability and preservation of research materials, applications, software, and computational environments.

Using ReproZip for Reproducibility and Library Services

IASSIST Quarterly ◽

10.29173/iq18 ◽

2017 ◽

Vol 42 (1) ◽

pp. 14

Author(s):

Vicky Steeves ◽

Rémi Rampin ◽

Fernando Chirigati

Keyword(s):

Machine Learning ◽

Open Source ◽

Operating Systems ◽

Digital Libraries ◽

Digital Humanities ◽

Use Cases ◽

Library Services ◽

Open Source Tool ◽

Library Use ◽

New Machine

Achieving research reproducibility is challenging in many ways: there are social and cultural obstacles as well as a constantly changing technical landscape that makes replicating and reproducing research difficult. Users face challenges in reproducing research across different operating systems, in using different versions of software across long projects and among collaborations, and in using publicly available work. The dependencies required to reproduce the computational environments in which research happens can be exceptionally hard to track – in many cases, these dependencies are hidden or nested too deeply to discover, and thus impossible to install on a new machine, which means adoption remains low. In this paper, we present ReproZip, an open source tool to help overcome the technical difficulties involved in preserving and replicating research, applications, databases, software, and more. We will examine the current use cases of ReproZip, ranging from digital humanities to machine learning. We also explore potential library use cases for ReproZip, particularly in digital libraries and archives, liaison librarianship, and other library services. We believe that libraries and archives can leverage ReproZip to deliver more robust reproducibility services, repository services, as well as enhanced discoverability and preservation of research materials, applications, software, and computational environments.

ThManager: An Open Source Tool for Creating and Visualizing SKOS

Information Technology and Libraries ◽

10.6017/ital.v26i3.3274 ◽

2007 ◽

Vol 26 (3) ◽

pp. 39 ◽

Cited By ~ 10

Author(s):

Javier Lacasta ◽

Javier Nogueras-Iso ◽

Francisco Javier López-Pellicer ◽

Pedro Rafail Muro-Medrano ◽

Francisco Javier Zarazaga-Soria

Keyword(s):

Information Retrieval ◽

Open Source ◽

Data Sharing ◽

Digital Libraries ◽

Knowledge Organization ◽

Open Source Tool ◽

Knowledge Models ◽

Knowledge Organization Systems ◽

Interchange Format

Knowledge organization systems denote formally represented knowledge that is used within the context of digital libraries to improve data sharing and information retrieval. To increase their use, and to reuse them when possible, it is vital to manage them adequately and to provide them in a standard interchange format. The Simple Knowledge Organization System (SKOS) seems to be the most promising representation for the type of knowledge models used in digital libraries, but there is a lack of tools able to manage it properly. This work presents a tool that fills this gap, facilitating the use of these models in different environments and using SKOS as an interchange format.

Machine Learning Boosted Docking (HASTEN): An Open-Source Tool To Accelerate Structure-based Virtual Screening Campaigns

10.26434/chemrxiv.14345849 ◽

2021 ◽

Author(s):

Tuomo Kalliokoski

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Open Source ◽

Learning Models ◽

Open Source Tool ◽

The Mean ◽

Machine Learning Models

The software macHine leArning booSTed dockiNg (HASTEN) was developed to accelerate structure-based virtual screening using machine learning models. It has been validated using datasets both from literature (12 datasets, each containing three million molecules docked with FRED) and in-house sources (one dataset of four million compounds docked with Glide). HASTEN showed reasonable performance on the literature data, with a mean recall of 0.78 for the top one percent of scoring molecules after docking 10% of the dataset, whereas an excellent recall of 0.95 was achieved for the in-house data. The program can be used with any docking and machine learning methodology, and is freely available from https://github.com/TuomoKalliokoski/HASTEN.

An open-source tool for analysis and automatic identification of dendritic spines using machine learning

PLoS ONE ◽

10.1371/journal.pone.0199589 ◽

2018 ◽

Vol 13 (7) ◽

pp. e0199589 ◽

Cited By ~ 5

Author(s):

Michael S. Smirnov ◽

Tavita R. Garrett ◽

Ryohei Yasuda

Keyword(s):

Machine Learning ◽

Open Source ◽

Dendritic Spines ◽

Automatic Identification ◽

Open Source Tool

Machine Learning Boosted Docking (HASTEN): An Open-Source Tool To Accelerate Structure-based Virtual Screening Campaigns

10.26434/chemrxiv.14345849.v2 ◽

2021 ◽

Author(s):

Tuomo Kalliokoski

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Open Source ◽

Learning Models ◽

Open Source Tool ◽

Accelerate Structure ◽

The Mean ◽

Machine Learning Models

The software macHine leArning booSTed dockiNg (HASTEN) was developed to accelerate structure-based virtual screening using machine learning models. It has been validated using datasets both from literature (12 datasets, each containing three million molecules docked with FRED) and in-house sources (one dataset of four million compounds docked with Glide). HASTEN showed reasonable performance on the literature data, with a mean recall of 0.78 for the top one percent of scoring molecules after docking 10% of the dataset, whereas an excellent recall of 0.95 was achieved for the in-house data. The program can be used with any docking and machine learning methodology, and is freely available from https://github.com/TuomoKalliokoski/HASTEN.

CNN-PepPred: An open-source tool to create convolutional NN models for the discovery of patterns in peptide sets. Application to peptide-MHC class II binding prediction

Bioinformatics ◽

10.1093/bioinformatics/btab687 ◽

2021 ◽

Author(s):

Valentin Junet ◽

Xavier Daura

Keyword(s):

Neural Networks ◽

Open Source ◽

MHC Class II ◽

Convolutional Neural Networks ◽

Operating Systems ◽

Class II ◽

Supplementary Information ◽

Supplementary Data ◽

Binding Prediction ◽

Open Source Tool

Summary: The ability to unveil binding patterns in peptide sets has important applications in several biomedical areas, including the development of vaccines. We present an open-source tool, CNN-PepPred, that uses convolutional neural networks to discover such patterns, along with its application to peptide-HLA class II binding prediction. The tool can be used locally on different operating systems, with CPUs or GPUs, to train, evaluate, apply and visualize models. Availability and Implementation: CNN-PepPred is freely available as a Python tool with a detailed User’s Guide at: https://github.com/ComputBiol-IBB/CNN-PepPred. Supplementary information: Supplementary data are available at Bioinformatics online.

Machine Learning Boosted Docking (HASTEN): An Open-Source Tool To Accelerate Structure-based Virtual Screening Campaigns

10.26434/chemrxiv.14345849.v1 ◽

2021 ◽

Author(s):

Tuomo Kalliokoski

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Open Source ◽

Learning Models ◽

Open Source Tool ◽

The Mean ◽

Machine Learning Models

Machine Learning Boosted Docking (HASTEN): An Open‐source Tool To Accelerate Structure‐based Virtual Screening Campaigns

Molecular Informatics ◽

10.1002/minf.202100089 ◽

2021 ◽

Author(s):

Tuomo Kalliokoski

Keyword(s):

Machine Learning ◽

Virtual Screening ◽

Open Source ◽

Open Source Tool ◽

Accelerate Structure

RAVIS: Resource Forecast and Ramp Visualization for Situational Awareness - An Introduction to the Open-Source Tool and Use Cases

10.2172/1782444 ◽

2021 ◽

Author(s):

Paul Edwards ◽

Haiku Sky ◽

Venkat Krishnan

Keyword(s):

Open Source ◽

Situational Awareness ◽

Use Cases ◽

Open Source Tool

When Perfect is the Enemy of Good - Quality and Sustainability in Digitization Processes

Archiving Conference ◽

10.2352/issn.2168-3204.2020.1.0.43 ◽

2020 ◽

Vol 2020 (1) ◽

pp. 43-48

Author(s):

Millard Wesley Long Schisler

Keyword(s):

Machine Learning ◽

Everyday Life ◽

Open Source ◽

Open Source Software ◽

Digital Humanities ◽

Added Value ◽

Research Institutions ◽

Develop Software

Machine Learning and IIIF are popular topics today when it comes to digitisation projects and digital humanities. But are these really practical topics or just buzzwords? Are these rather exclusive technologies of some elite cultural and research institutions? Or can everyday digitisation projects with less exquisite materials really benefit from such technologies? The example of the community around the open source software Goobi shows what the reality of numerous digitisation projects really looks like. What is no longer just theory and can be used in everyday life without having to develop software yourself? And what added value can actually be expected here?
