BioImageIT: Open-source framework for integration of image data-management with analysis

Open science and FAIR principles have become major topics in the field of bioimaging. This is due both to new data acquisition technologies that generate large datasets and to new analysis approaches that automate data mining with high accuracy. Nevertheless, data are rarely shared or rigorously annotated, because doing so requires many tedious manual management tasks as well as software packaging. We present BioImageIT, an open-source framework for integrating data management according to FAIR principles with data processing.

NOESIS: A Framework for Complex Network Data Analysis

Complexity ◽

10.1155/2019/1439415 ◽

2019 ◽

Vol 2019 ◽

pp. 1-14

Author(s):

Víctor Martínez ◽

Fernando Berzal ◽

Juan-Carlos Cubero

Keyword(s):

Data Mining ◽

Open Source ◽

Complex Network ◽

Open Source Software ◽

Network Data ◽

Analysis Techniques ◽

Software Analysis ◽

Open Source Framework ◽

Network Properties ◽

Real World Problems

Network data mining has attracted a lot of attention since a large number of real-world problems have to deal with complex network data. In this paper, we present NOESIS, an open-source framework for network-based data mining. NOESIS features a large number of techniques and methods for the analysis of structural network properties, network visualization, community detection, link scoring, and link prediction. The proposed framework has been designed following solid design principles and exploits parallel computing using structured parallel programming. NOESIS also provides a stand-alone graphical user interface allowing the use of advanced software analysis techniques to users without prior programming experience. This framework is available under a BSD open-source software license.

eLabFTW as an Open Science tool to improve the quality and translation of preclinical research

F1000Research ◽

10.12688/f1000research.52157.1 ◽

2021 ◽

Vol 10 ◽

pp. 292

Author(s):

Michael Hewera ◽

Daniel Hänggi ◽

Björn Gerlach ◽

Ulf Dietrich Kahlert

Keyword(s):

Data Management ◽

Open Source ◽

Academic Research ◽

Open Data ◽

Open Science ◽

Research Data ◽

Preclinical Research ◽

Research Data Management ◽

New Methods ◽

Set Up

Reports of non-replicable research demand new methods of research data management. Electronic laboratory notebooks (ELNs) are suggested as tools to improve the documentation of research data and make them universally accessible. In a self-guided approach, we introduced the open-source ELN eLabFTW into our lab group and, after using it for a while, consider it a useful tool for overcoming hurdles in ELN introduction, because it provides a combination of properties that makes it suitable for small preclinical labs like ours. We set up our instance of eLabFTW without any further programming. Our efforts to embrace an open data approach by introducing an ELN fit well with other institutionally organized ELN initiatives in academic research.

Plano de gestão de dados fair: uma proposta para a Fiocruz | Fair data management plan: a proposal for Fiocruz

Liinc em Revista ◽

10.18617/liinc.v15i2.5030 ◽

2019 ◽

Vol 15 (2) ◽

Author(s):

Viviane Santos de Oliveira Veiga ◽

Patricia Henning ◽

Simone Dib ◽

Erick Penedo ◽

Jefferson Da Costa Lima ◽

...

Keyword(s):

Life Cycle ◽

Data Management ◽

Open Science ◽

Management Plan ◽

Research Data ◽

Research Institutions ◽

Research Data Management ◽

Management Plans ◽

Fair Principles

ABSTRACT This article discusses the role of data management plans as a tool that facilitates data management throughout the research life cycle. Opening research data is a priority on scientific agendas, as it increases the visibility and transparency of investigations as well as the capacity to reproduce and reuse the data in new research. In this context, the FAIR principles (Findable, Accessible, Interoperable and Reusable) are fundamental, as they establish basic guidelines for the management, curation and preservation of research data with a view to sharing and reuse. This work presents a proposal for a Data Management Plan template, aligned with the FAIR principles, for the Fundação Oswaldo Cruz. The methodology is bibliographical research and documentary analysis of several European data management plans. We conclude that adopting a management plan in the scientific practices of universities and research institutions is fundamental. However, to take full advantage of this activity, all actors involved in the process must participate and, in addition, the plan must be machine-actionable.
Keywords: Data Management Plan; Research Data; FAIR Principles; Machine-Actionable DMP; Open Science.

pISA-tree - a data management framework for life science research projects using a standardised directory tree

10.1101/2021.11.18.468977 ◽

2021 ◽

Author(s):

Marko Petek ◽

Maja Zagorscak ◽

Andrej Blejec ◽

Ziva Ramsak ◽

Anna Coll ◽

...

Keyword(s):

Data Management ◽

Life Science ◽

Source Code ◽

Science Research ◽

Open Science ◽

Reproducible Research ◽

Public Repository ◽

Management Framework ◽

Fair Principles ◽

R Packages

We have developed pISA-tree, a straightforward and flexible data management solution for organising life science project-associated research data and metadata. It enables on-the-fly creation of an enriched directory tree structure (project/Investigation/Study/Assay) via a series of sequential batch files, in a standardised manner based on the ISA metadata framework. The system supports reproducible research and is in accordance with the Open Science initiative and FAIR principles. Compared with similar frameworks, it does not require any systems administration or maintenance, as it can be run on a personal computer or network drive. It is complemented by two R packages, pisar and seekr: the former facilitates integration of pISA-tree datasets into bioinformatic pipelines, and the latter enables synchronisation with the FAIRDOMHub public repository using the SEEK API. Source code and detailed documentation of pISA-tree and its supporting R packages are available from https://github.com/NIB-SI/pISA-tree.
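
To make the nested project/Investigation/Study/Assay layout concrete, here is a minimal Python sketch. It is not pISA-tree's own implementation, which per the abstract uses batch files. The function name `create_isa_tree`, the directory names, and the `*_metadata.txt` stub files are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# The four nesting levels named in the abstract, outermost first.
LEVELS = ("project", "investigation", "study", "assay")


def create_isa_tree(root, names):
    """Create nested project/Investigation/Study/Assay directories.

    `names` maps each level in LEVELS to a directory name. A small
    metadata stub (hypothetical format) is written at each level.
    Returns the path of the innermost (assay) directory.
    """
    current = Path(root)
    for level in LEVELS:
        current = current / names[level]
        current.mkdir(parents=True, exist_ok=True)
        meta = current / f"{level}_metadata.txt"
        if not meta.exists():  # never overwrite existing annotations
            meta.write_text(
                f"level:\t{level}\nname:\t{names[level]}\n",
                encoding="utf-8",
            )
    return current
```

Calling `create_isa_tree("data", {"project": "demo", "investigation": "inv1", "study": "st1", "assay": "rnaseq"})` would create `data/demo/inv1/st1/rnaseq`, with one metadata stub per level. Because existing directories and stubs are left untouched, repeated calls are harmless.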

The NOESIS Open Source Framework for Network Data Mining

Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management ◽

10.5220/0005610103160321 ◽

2015 ◽

Cited By ~ 1

Author(s):

Víctor Martínez ◽

Fernando Berzal ◽

Juan-Carlos Cubero

Keyword(s):

Data Mining ◽

Open Source ◽

Network Data ◽

Open Source Framework

From FAIR research data toward FAIR and open research software

it - Information Technology ◽

10.1515/itit-2019-0040 ◽

2020 ◽

Vol 62 (1) ◽

pp. 39-47 ◽

Cited By ~ 2

Author(s):

Wilhelm Hasselbring ◽

Leslie Carr ◽

Simon Hettrick ◽

Heather Packer ◽

Thanassis Tiropanis

Keyword(s):

Open Source ◽

Open Source Software ◽

Scientific Practice ◽

Open Science ◽

Research Data ◽

Scientific Publications ◽

Research Software ◽

Current State ◽

Software Licenses ◽

Fair Principles

The Open Science agenda holds that science advances faster when we can build on existing results. Therefore, research data must be FAIR (Findable, Accessible, Interoperable, and Reusable) in order to advance the findability, reproducibility and reuse of research results. Besides the research data, all the processing steps on these data – as basis of scientific publications – have to be available, too. For good scientific practice, the resulting research software should be both open and adhere to the FAIR principles to allow full repeatability, reproducibility, and reuse. As compared to research data, research software should be both archived for reproducibility and actively maintained for reusability. The FAIR data principles do not require openness, but research software should be open source software. Established open source software licenses provide sufficient licensing options, such that it should be the rare exception to keep research software closed. We review and analyze the current state in this area in order to give recommendations for making research software FAIR and open.

Measuring FAIR Principles to Inform Fitness for Use

International Journal of Digital Curation ◽

10.2218/ijdc.v13i1.630 ◽

2018 ◽

Vol 13 (1) ◽

pp. 35-46

Author(s):

Carolyn Hank ◽

Bradley Wade Bishop

Keyword(s):

Best Practices ◽

Data Quality ◽

Data Management ◽

Open Science ◽

Conceptual Level ◽

Data Types ◽

Good Data ◽

Secondary Purpose ◽

Fair Principles ◽

Data Quality Metrics

For open science to flourish, data and any related digital outputs should be discoverable and re-usable by a variety of potential consumers. The recent FAIR Data Principles produced by the Future of Research Communication and e-Scholarship (FORCE11) collective provide a compilation of considerations for making data findable, accessible, interoperable, and re-usable. The principles serve as guideposts to ‘good’ data management and stewardship for data and/or metadata. On a conceptual level, the principles codify best practices that managers and stewards would find agreement with, exist in other data quality metrics, and already implement. This paper reports on a secondary purpose of the principles: to inform assessment of data’s FAIR-ness or, put another way, data’s fitness for use. Assessment of FAIR-ness likely requires more stratification across data types and among various consumer communities, as how data are found, accessed, interoperated, and re-used differs depending on types and purposes. This paper’s purpose is to present a method for qualitatively measuring the FAIR Data Principles through operationalizing findability, accessibility, interoperability, and re-usability from a re-user’s perspective. The findings may inform assessments that could also be used to develop situationally-relevant fitness for use frameworks.

Sustainable FAIR Data management is challenging for RIs and it is challenging to solid Earth scientists

10.5194/egusphere-egu2020-18570 ◽

2020 ◽

Author(s):

Massimo Cocco ◽

Daniele Bailo ◽

Keith G. Jeffery ◽

Rossana Paciello ◽

Valerio Vinciarelli ◽

...

Keyword(s):

Data Management ◽

Solid Earth ◽

Service Management ◽

Open Science ◽

Adoption Process ◽

Scientific Communities ◽

Research Infrastructures ◽

Data Products ◽

Data Stewardship ◽

Fair Principles

Interoperability has long been an objective for research infrastructures dealing with research data to foster open access and open science. More recently, FAIR principles (Findability, Accessibility, Interoperability and Reusability) have been proposed. The FAIR principles are now reference criteria for promoting and evaluating openness of scientific data. FAIRness is considered a necessary target for research infrastructures in different scientific domains at European and global level. Solid Earth RIs have long been committed to engaging the scientific communities involved in data collection, standardization and quality management, as well as to providing metadata and services for qualification, storage and accessibility. They are working to adopt FAIR principles, thus addressing the onerous task of turning these principles into practices. To make FAIR principles a reality in terms of service provision for data stewardship, some RI implementers in EPOS have proposed a FAIR-adoption process leveraging a four-stage roadmap that reorganizes FAIR principles to better fit the mindset of scientists and RI implementers. The roadmap considers FAIR principles as requirements in the software development life cycle, and reorganizes them into data, metadata, access services and use services. Both the implementation and the assessment of the “FAIRness” level by means of questionnaires and metrics are made simpler and closer to scientists' day-to-day work. FAIR data and service management is demanding: it requires resources and skills and, more importantly, sustainable IT resources. For this reason, FAIR data management is challenging for many Research Infrastructures and data providers turning FAIR principles into reality through viable and sustainable practices. FAIR data management also includes implementing services to access data as well as to visualize, process, analyse and model them for generating new scientific products and discoveries. FAIR data management is challenging to Earth scientists because it depends on their perception of finding, accessing and using data and scientific products: in other words, the perception of data sharing. The sustainability of FAIR data and service management is not limited to financial sustainability and funding; rather, it also includes legal, governance and technical issues that concern the scientific communities. In this contribution, we present and discuss some of the main challenges that need to be urgently tackled in order to run and operate FAIR data services in the long term, as also envisaged by the European Open Science Cloud initiative: a) sustainability of the IT solutions and resources to support practices for FAIR data management (i.e., PID usage and preservation, including costs for operating the associated IT services); b) re-usability, which on the one hand requires clear and tested methods to manage heterogeneous metadata and provenance, while on the other hand can be considered a frontier research field; c) FAIR services provision, which presents many open questions related to the application of FAIR principles to services for data stewardship, and to services that create data products taking FAIR raw data as input, for which it is not clear how the FAIRness compliance of data products can still be guaranteed.

Cloud Computing Environment Data Mining Storage Management Design

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.765-767.850 ◽

2013 ◽

Vol 765-767 ◽

pp. 850-853

Author(s):

Feng Pan ◽

Ping Lv

Keyword(s):

Data Mining ◽

Cloud Computing ◽

Open Source ◽

Data Storage ◽

Information Access ◽

Implementation Process ◽

Storage Management ◽

Financial Industry ◽

Open Source Framework ◽

Specific Implementation

In view of the current state of data and information management in the financial, securities, insurance and other industries, cloud computing technology is needed for data storage management in order to improve access to data and information. Applying cloud computing technology can also improve the financial industry's data mining capabilities. Adopting an open-source framework is a shortcut in the implementation process. However, the choice and use of an open-source framework must be matched to actual needs, which is done here through an analysis of the characteristics of Hadoop. With HDFS as the foundation of the Hadoop study, the features of HDFS are applied to a practical project, establishing an HDFS deployment that fully supports the relational data model and upgrades data mining capacity.

A Hitchhiker’s Guide to Working with Large, Open-Source Neuroimaging Datasets

10.20944/preprints202007.0153.v1 ◽

2020 ◽

Author(s):

Corey Horien ◽

Stephanie Noble ◽

Abigail Greene ◽

Kangjoo Lee ◽

Daniel Barron ◽

...

Keyword(s):

Life Cycle ◽

Open Source ◽

State Of The Art ◽

Scientific Discovery ◽

Open Science ◽

Large Datasets ◽

End User ◽

Data Life Cycle ◽

The World ◽

Novice Users

Large datasets that enable researchers to perform investigations with unprecedented rigor are growing increasingly common in neuroimaging. Due to the simultaneous increasing popularity of open science, these state-of-the-art datasets are more accessible than ever to researchers around the world. While analysis of these samples has pushed the field forward, they pose a new set of challenges that might cause difficulties for novice users. Here, we offer practical tips for working with large datasets from the end-user’s perspective. We cover all aspects of the data life cycle: from what to consider when downloading and storing the data, to tips on how to become acquainted with a dataset one did not collect, to what to share when communicating results. This manuscript serves as a practical guide one can use when working with large neuroimaging datasets, thus dissolving barriers to scientific discovery.
