Applying FAIR Principles to Plant Phenotypic Data Management in GnpIS

2019, Vol 2019, pp. 1-15. Author(s): C. Pommier, C. Michotey, G. Cornut, P. Roumet, E. Duchêne, ...

GnpIS is a data repository for plant phenomics that stores complete field and greenhouse experimental data, including environmental measures. It provides long-term access to datasets following the FAIR principles (Findable, Accessible, Interoperable, and Reusable) through a flexible and original approach: a generic, ontology-driven data model and an innovative software architecture that decouples data integration, storage, and querying. It takes advantage of international standards including the Crop Ontology, MIAPPE, and the Breeding API. GnpIS handles data for a wide range of species and experiment types, from multiannual experimental networks of perennial plants to annual plant trials, with either raw data (direct measures) or computed traits. It also ensures integration and interoperability among phenotyping datasets and with genotyping data. This is achieved through careful curation and annotation of the key resources, conducted in close collaboration with the communities providing the data. Our repository follows Open Science data publication principles by ensuring the citability of each dataset. Finally, GnpIS's compliance with international standards enables interoperability with other data repositories, allowing links between phenotype and other data types. GnpIS can therefore contribute to emerging international federations of information systems.
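Because GnpIS exposes its phenotyping data through the Breeding API (BrAPI), such datasets can be queried programmatically. The following is a minimal sketch of such a query in Python: the base URL is a placeholder, and it assumes a standard BrAPI v2 deployment whose /studies call accepts a commonCropName filter.

```python
import requests

# Hypothetical BrAPI base URL; real deployments expose paths like /brapi/v2.
BASE_URL = "https://example.org/brapi/v2"

def list_studies(crop_common_name):
    """Fetch phenotyping studies for a crop via the BrAPI /studies call."""
    resp = requests.get(
        f"{BASE_URL}/studies",
        params={"commonCropName": crop_common_name, "pageSize": 10},
        timeout=30,
    )
    resp.raise_for_status()
    # BrAPI responses wrap payloads in a 'result' object holding a 'data' list.
    return resp.json()["result"]["data"]

for study in list_studies("wheat"):
    print(study.get("studyDbId"), study.get("studyName"))
```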

2019, Vol 46 (8), pp. 622-638. Author(s): Joachim Schöpfel, Dominic Farace, Hélène Prost, Antonella Zane

Data papers have been defined as scholarly journal publications whose primary purpose is to describe research data. Our survey provides insights into the environment of data papers, i.e., disciplines, publishers, and business models, and into their structure, length, formats, metadata, and licensing. Data papers are a product of the emerging ecosystem of data-driven open science, and they contribute to the FAIR principles for research data management. However, the boundaries with other categories of academic publishing are partly blurred. Data papers can be generated automatically and are potentially machine-readable. Data papers are essentially information, i.e., descriptions of data, but they also contribute in part to the generation of knowledge in their own right. As part of the new ecosystem of open and data-driven science, data papers and data journals are an interesting and relevant object for assessing and understanding the transition of the established system of academic publishing.


2020, Vol 2 (1-2), pp. 192-198. Author(s): Mark Hahnel, Dan Valen

Data repository infrastructures for academics have appeared in waves since the dawn of Web technology. These waves are driven by changes in societal needs, archiving needs, and the development of cloud computing resources. As such, the data repository landscape comes in many flavours of sustainability model, target audience, and feature set. One thing that links all data repositories is the desire to make the content they host reusable, building on the core principle of cataloguing content for economy and research efficiency. The FAIR principles are a common goal for all repository infrastructures to aim for: no matter the discipline or infrastructure, reusable content, for both humans and machines, is a shared objective. This is the first time that repositories can work toward a common goal that ultimately lends itself to interoperability, and the idea that research can move further and faster as we un-silo these fantastic resources is an achievable one. This paper investigates the steps that existing repositories need to take in order to remain useful and relevant in a FAIR research world.


Author(s): Ingrid Dillo, Lisa De Leeuw

Open data and data management policies that call for the long-term storage and accessibility of data are becoming more and more commonplace in the research community, and with them the need for trustworthy data repositories to store and disseminate data is growing. CoreTrustSeal, a community-based, non-profit organisation, offers data repositories a core-level certification based on the DSA-WDS Core Trustworthy Data Repositories Requirements catalogue and procedures. This universal catalogue of requirements reflects the core characteristics of trustworthy data repositories. Core certification involves an uncomplicated process whereby data repositories supply evidence that they are sustainable and trustworthy. A repository first conducts an internal self-assessment, which is then reviewed by community peers. Once the self-assessment is found adequate, the CoreTrustSeal Board certifies the repository with a CoreTrustSeal, valid for a period of three years. Being a certified repository has several external and internal benefits. For instance, it improves the quality and transparency of internal processes, increases awareness of and compliance with established standards, builds stakeholder confidence, enhances the reputation of the repository, and demonstrates that the repository is following good practices. It also offers a benchmark for comparison and helps to determine the strengths and weaknesses of a repository. In the future we foresee larger uptake across domains, not least because within the European Open Science Cloud the FAIR principles, and therefore also the certification of trustworthy digital repositories holding data, are becoming increasingly important. In addition, the CoreTrustSeal requirements will most probably become a European technical standard that can be used in procurement (currently under review by the European Commission).


2021. Author(s): Julian Hocker, Taryn Bipat, David W. McDonald, Mark Zachry

Qualitative science methods have largely been omitted from discussions of open science, and platforms focused on qualitative science that support open science data and method sharing are rare. Sharing and exchanging coding schemas has great potential for supporting traceability in qualitative research as well as for facilitating the reuse of coding schemas. In this study, we describe and evaluate QualiCO, an ontology designed to describe a wide range of qualitative coding schemas. Twenty qualitative researchers used QualiCO to complete two coding tasks. In our findings, we present task performance and interview data that focus participants' attention on the ontology. Participants used QualiCO to complete the coding tasks, decreasing time on task while improving accuracy, indicating that QualiCO enabled the reuse of qualitative coding schemas. Our discussion elaborates on some issues participants had and highlights how conceptual framing and prior practice shape their interpretation of how QualiCO can be used.
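The abstract does not reproduce the ontology itself, but a coding schema described by such an ontology might look like the following JSON-LD-style sketch. All property names under the qc: prefix are hypothetical placeholders, not the published QualiCO vocabulary.

```python
# A hypothetical, JSON-LD-style description of a qualitative coding schema.
# The "qc:" terms are illustrative stand-ins, not QualiCO's actual vocabulary.
coding_schema = {
    "@context": {"qc": "https://example.org/qualico#"},
    "@type": "qc:CodingSchema",
    "qc:title": "Collaboration patterns in wiki talk pages",
    "qc:methodology": "grounded theory",
    "qc:codes": [
        {
            "@type": "qc:Code",
            "qc:name": "coordination",
            "qc:definition": "Explicit division of labour between editors.",
            "qc:example": "I'll take the intro if you handle references.",
        },
        {
            "@type": "qc:Code",
            "qc:name": "conflict",
            "qc:definition": "Disagreement about content or process.",
        },
    ],
}
```

Publishing schemas in a structured form like this is what makes them findable and reusable by other researchers, rather than buried in a methods section.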


Metabolomics, 2019, Vol 15 (10). Author(s): Kevin M. Mendez, Leighton Pritchard, Stacey N. Reinke, David I. Broadhurst

Abstract Background A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike. Aim of Review To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science. Key Scientific Concepts of Review This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.
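To give a flavour of what such a notebook contains, here is a minimal, notebook-style cell of the kind these tutorials build up: load a metabolite peak table, scale it, and project it with PCA. The file and column names are hypothetical.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical peak table: rows are samples, columns are metabolite features.
peaks = pd.read_csv("peak_table.csv", index_col="sample_id")

# Autoscale the features, then project onto the first two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(peaks))
print(scores[:5])
```

Committed to a public GitHub repository alongside a requirements.txt, a notebook like this can be launched interactively on Binder at a URL of the form https://mybinder.org/v2/gh/<user>/<repo>/<branch>, which is what makes the analysis reusable without any local setup.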


2018, Vol 12 (2). Author(s): Amy M. Pienta, Dharma Akmon, Justin Noble, Lynette Hoelter, Susan Jekielek

Social scientists are producing an ever-expanding volume of data, leading to questions about the appraisal and selection of content given the finite resources available to process data for reuse. We analyze users' search activity in an established social science data repository to better understand demand for data and to more effectively guide collection development. By applying a data-driven approach, we aim to ensure that curation resources are applied to make the most valuable data findable, understandable, accessible, and usable. We analyze data from a domain repository for the social sciences comprising over 500,000 searches per year in 2014 and 2015 to better understand trends in user search behavior. Using a newly created search-to-study ratio technique, we identified gaps in the repository's holdings and leveraged this analysis to inform our collection and curation practices and policies. The evaluative technique we propose will serve as a baseline for future studies of trends in user demand over time at the repository under study, with broader implications for other data repositories.
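The abstract names the search-to-study ratio but not its exact formula. One plausible reading, sketched below with made-up data, compares how often a topic is searched for against how many studies the repository actually holds on that topic.

```python
import pandas as pd

# Hypothetical inputs: one row per logged search and one row per held study,
# each tagged with a topic.
searches = pd.DataFrame({"topic": ["health", "health", "health", "crime", "aging"]})
holdings = pd.DataFrame({"topic": ["health", "crime", "crime"]})

summary = pd.concat(
    [searches["topic"].value_counts().rename("searches"),
     holdings["topic"].value_counts().rename("studies")],
    axis=1,
)
summary["search_to_study_ratio"] = summary["searches"] / summary["studies"]
# A high ratio, or searches on a topic with no holdings at all (NaN studies),
# flags a candidate collection gap.
print(summary)
```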


2021. Author(s): Muhammed Khawatmi, Yoann Steux, Sadam Zourob, Heba Z. Sailem

Intuitive visualisation of quantitative microscopy data is crucial for interpreting and discovering new patterns in complex bioimage data. Existing visualisation approaches, such as bar charts, scatter plots, and heat maps, do not accommodate the complexity of the visual information present in microscopy data. Here we develop ShapoGraphy, a first-of-its-kind method accompanied by a user-friendly web-based application for creating interactive quantitative pictorial representations of phenotypic data and facilitating the understanding and analysis of image datasets (www.shapography.com). ShapoGraphy enables the user to create a structure of interest as a set of shapes, where each shape can encode different variables mapped to its dimensions, colours, symbols, and stroke features. We illustrate the utility of ShapoGraphy on various image data, including high-dimensional multiplexed data. Our results show that ShapoGraphy allows a better understanding of cellular phenotypes and of the relationships between variables. In conclusion, ShapoGraphy supports scientific discovery and communication by providing a wide range of users with a rich vocabulary for creating engaging and intuitive representations of diverse data types.
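ShapoGraphy itself is a web application rather than a library, but the core idea of mapping each measured variable to a distinct visual channel of a shape can be illustrated with an analogous matplotlib sketch. The per-cell measurements below are made up.

```python
import matplotlib.pyplot as plt

# Hypothetical per-cell measurements from an imaging experiment.
cells = [
    {"area": 120, "roundness": 0.9, "intensity": 0.2},
    {"area": 340, "roundness": 0.4, "intensity": 0.8},
    {"area": 210, "roundness": 0.7, "intensity": 0.5},
]

fig, ax = plt.subplots()
for i, c in enumerate(cells):
    ax.scatter(
        i, 0,
        s=c["area"],                                  # size channel <- cell area
        c=[[c["intensity"], 0.0, 1 - c["intensity"]]],  # colour channel <- marker intensity
        alpha=c["roundness"],                         # opacity channel <- roundness
    )
ax.set_xlabel("cell")
plt.show()
```

Each variable gets its own visual channel (size, colour, opacity), so a single glyph summarises a whole phenotype, which is the encoding principle the tool generalises to arbitrary shapes.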


2017, Vol 12 (2), pp. 177-195. Author(s): Alastair Dunning, Madeleine De Smaele, Jasmin Böhmer

This practice paper describes an ongoing research project to test the effectiveness and relevance of the FAIR Data Principles, while simultaneously analysing how easy it is for data archives to adhere to them. The research took place from November 2016 to January 2017 and will be underpinned by feedback from the repositories. The FAIR Data Principles feature 15 facets corresponding to the four letters of FAIR: Findable, Accessible, Interoperable, Reusable. These principles have already gained traction within the research world. The European Commission has recently expanded its demand for research to produce open data, and the relevant guidelines (1) are explicitly written in the context of the FAIR Data Principles. Given that an increasing number of researchers will have exposure to the guidelines, understanding their viability and suggesting where there may be room for modification and adjustment is of vital importance. This practice paper is connected to a dataset (Dunning et al., 2017) containing the original overview of the sample group statistics and graphs in an Excel spreadsheet. Over the course of two months, the web interfaces, help pages, and metadata records of over 40 data repositories were examined in order to score each data repository against the FAIR principles and facets. A traffic-light rating system enables colour-coding according to compliance and vagueness. The statistical analysis provides overall results as well as results categorised by principle and by facet. The analysis includes a statistical and descriptive evaluation, followed by elaborations on elements of the FAIR Data Principles, on subject-specific and repository-specific differences, and on what repositories can do to improve their information architecture. (1) H2020 Guidelines on FAIR Data Management: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
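The scoring logic behind such a traffic-light assessment can be sketched in a few lines. The repositories, facet labels, and scores below are hypothetical, with green, amber, and red encoded as 2, 1, and 0.

```python
# Hypothetical traffic-light scores for repositories against FAIR facets
# (facet labels abbreviated F1..R1; green=2, amber=1, red=0).
scores = {
    "Repo A": {"F1": 2, "A1": 2, "I1": 1, "R1": 0},
    "Repo B": {"F1": 1, "A1": 2, "I1": 0, "R1": 1},
}

for repo, facets in scores.items():
    total = sum(facets.values())
    weakest = min(facets, key=facets.get)  # facet with the lowest score
    print(f"{repo}: {total}/{2 * len(facets)} - weakest facet: {weakest}")
```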


2017, Vol 35 (4), pp. 626-649. Author(s): Wei Jeng, Daqing He, Yu Chi

Purpose: Owing to the recent surge of interest in the age of the data deluge, the importance of researching data infrastructures is increasing. The Open Archival Information System (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories. Considering that OAIS is a reference model that requires customization for actual practice, this paper aims to examine how the current practices in a data repository map to the OAIS environment and functional components.
Design/methodology/approach: The authors conducted two focus-group sessions and one individual interview with eight employees at the world's largest social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR). By examining their current actions (activities regarding their work responsibilities) and IT practices, they studied the barriers and challenges of archiving and curating qualitative data at ICPSR.
Findings: The authors observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives, and that a data repository's workflow resembles that of digital archives or even digital libraries. On the other hand, they find that the cost of preventing disclosure risk and a lack of agreement on standards for text data files are the most apparent obstacles for data curation professionals handling qualitative data; the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing.
Originality/value: The authors evaluated the gap between a research data repository's current practices and the adoption of the OAIS model. They also identified answers to questions such as how the current technological infrastructure in a leading data repository such as ICPSR supports daily operations, what the ideal technologies in such repositories would be, and what challenges accompany these ideal technologies. Most importantly, they helped to prioritize challenges and barriers from the data curator's perspective and to draw out implications for data sharing and reuse in the social sciences.


Data, 2019, Vol 4 (4), p. 147. Author(s): Gregory Giuliani, Gilberto Camara, Brian Killough, Stuart Minchin

Earth Observation Data Cubes (EODC) have emerged as a promising solution for efficiently and effectively handling the Big Earth Observation (EO) Data generated by satellites and made freely and openly available from different data repositories. The aim of this Special Issue, "Earth Observation Data Cube", in Data is to present the latest advances in EODC development and implementation, including innovative approaches for the exploitation of satellite EO data using multi-dimensional (e.g., spatial, temporal, spectral) approaches. This Special Issue contains 14 articles covering a wide range of topics, such as Synthetic Aperture Radar (SAR), Analysis Ready Data (ARD), interoperability, thematic applications (e.g., land cover and snow cover mapping), capacity development, semantics, and processing techniques, as well as national implementations and best practices. These papers make significant contributions to the advancement of a more open and reproducible Earth observation science, reducing the gap between users' expectations for decision-ready products and current Big Data analytical capabilities, and ultimately unlocking the information power of EO data by transforming them into actionable knowledge.
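The multi-dimensional structure at the heart of a data cube is commonly modelled as a labelled array, for example with xarray, the data model underlying implementations such as the Open Data Cube. The sketch below builds a tiny synthetic cube and derives NDVI along its spectral dimension, then averages over time; all values are made up.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic reflectance cube: 3 time steps, 2 spectral bands, 4x4 pixels.
cube = xr.DataArray(
    np.random.rand(3, 2, 4, 4),
    dims=("time", "band", "y", "x"),
    coords={
        "time": pd.date_range("2019-01-01", periods=3, freq="MS"),
        "band": ["red", "nir"],
    },
)

# Spectral dimension: NDVI = (NIR - red) / (NIR + red), computed per pixel.
nir, red = cube.sel(band="nir"), cube.sel(band="red")
ndvi = (nir - red) / (nir + red)

# Temporal dimension: average NDVI across the time steps.
print(ndvi.mean(dim="time"))
```

The same selection-and-reduce pattern scales from this toy example to continental-scale ARD archives, which is precisely what makes the data cube abstraction attractive.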

