Two real use cases of FAIR maturity indicators in the life sciences

2019 ◽  
Author(s):  
Serena Bonaretti ◽  
Egon Willighagen

Data sharing and reuse are crucial to enhance scientific progress and maximize return of investments in science. Although attitudes are increasingly favorable, data reuse remains difficult for lack of infrastructures, standards, and policies. The FAIR (findable, accessible, interoperable, reusable) principles aim to provide recommendations to increase data reuse. Because of the broad interpretation of the FAIR principles, maturity indicators are necessary to determine FAIRness of a dataset. In this work, we propose a reproducible computational workflow to assess data FAIRness in the life sciences. Our implementation follows principles and guidelines recommended by the maturity indicator authoring group and integrates concepts from the literature. In addition, we propose a FAIR balloon plot to summarize and compare dataset FAIRness. We evaluated our method on two real use cases where researchers looked for datasets to answer their scientific questions. We retrieved information from repositories (ArrayExpress and Gene Expression Omnibus), a registry of repositories (re3data.org), and a searchable resource (Google Dataset Search) via application program interface (API) wherever possible. With our analysis, we found that the two datasets met the majority of the criteria defined by the maturity indicators, and we showed areas where improvements can easily be reached. We suggest that use of standard schema for metadata and presence of specific attributes in registries of repositories could increase FAIRness of datasets.

Nanomaterials ◽  
2020 ◽  
Vol 10 (10) ◽  
pp. 2068 ◽  
Author(s):  
Ammar Ammar ◽  
Serena Bonaretti ◽  
Laurent Winckers ◽  
Joris Quik ◽  
Martine Bakker ◽  
...  

Data sharing and reuse are crucial to enhance scientific progress and maximize return of investments in science. Although attitudes are increasingly favorable, data reuse remains difficult due to lack of infrastructures, standards, and policies. The FAIR (findable, accessible, interoperable, reusable) principles aim to provide recommendations to increase data reuse. Because of the broad interpretation of the FAIR principles, maturity indicators are necessary to determine the FAIRness of a dataset. In this work, we propose a reproducible computational workflow to assess data FAIRness in the life sciences. Our implementation follows principles and guidelines recommended by the maturity indicator authoring group and integrates concepts from the literature. In addition, we propose a FAIR balloon plot to summarize and compare dataset FAIRness. We evaluated the feasibility of our method on three real use cases where researchers looked for six datasets to answer their scientific questions. We retrieved information from repositories (ArrayExpress, Gene Expression Omnibus, eNanoMapper, caNanoLab, NanoCommons and ChEMBL), a registry of repositories, and a searchable resource (Google Dataset Search) via application program interfaces (API) wherever possible. With our analysis, we found that the six datasets met the majority of the criteria defined by the maturity indicators, and we showed areas where improvements can easily be reached. We suggest that use of standard schema for metadata and the presence of specific attributes in registries of repositories could increase FAIRness of datasets.


2019 ◽  
Author(s):  
Andra Waagmeester ◽  
Gregory Stupp ◽  
Sebastian Burgstaller-Muehlbacher ◽  
Benjamin M. Good ◽  
Malachi Griffith ◽  
...  

Wikidata is a community-maintained knowledge base that epitomizes the FAIR principles of Findability, Accessibility, Interoperability, and Reusability. Here, we describe the breadth and depth of biomedical knowledge contained within Wikidata, assembled from primary knowledge repositories on genomics, proteomics, genetic variants, pathways, chemical compounds, and diseases. We built a collection of open-source tools that simplify the addition and synchronization of Wikidata with source databases. We furthermore demonstrate several use cases of how the continuously updated, crowd-contributed knowledge in Wikidata can be mined. These use cases cover a diverse cross section of biomedical analyses, from crowdsourced curation of biomedical ontologies, to phenotype-based diagnosis of disease, to drug repurposing.


2020 ◽  
Vol 15 (1) ◽  
pp. 8
Author(s):  
Gerard Weatherby ◽  
Michael Robert Gryk

This paper reports on the ongoing activities and curation practices of the National Center for Biomolecular NMR Data Processing and Analysis. Over the past several years, the Center has been developing and extending computational workflow management software for use by a community of biomolecular NMR spectroscopists. Previous work refactored the workflow system to utilize the PREMIS framework for reporting retrospective provenance as well as for sharing workflows between scientists and to support data reuse. In this paper, we report on our recent efforts to embed analytics within the workflow execution and within provenance tracking. Important metrics for each of the intermediate datasets are included within the corresponding PREMIS intellectual object, which allows for both inspection of the operation of individual actors and visualization of the changes throughout a full processing workflow. These metrics can be viewed within the workflow management system or through standalone metadata widgets. Our approach is a hybrid one, supporting automated workflow execution as well as manual intervention and metadata management. In this combination, the workflow system and metadata widgets encourage the domain experts to be avid curators of the data which they create, fostering both computational reproducibility and scientific data reuse.


2019 ◽  
Vol 39 (06) ◽  
pp. 290-299
Author(s):  
Naushad Ali PM ◽  
Sidra Saeed

This study investigates the perceptions of research scholars towards research data management and sharing. A survey was conducted among research scholars from the Faculty of Life Sciences and the Faculty of Social Sciences, Aligarh Muslim University (AMU). In total, 352 participants filled out the questionnaire. The study shows that research scholars of the Faculty of Social Sciences are more willing to share their research data than research scholars of the Faculty of Life Sciences. Contributing to scientific progress and increasing research citations and visibility were the key factors that motivated researchers to share data. However, confidentiality and data misuse were the main concerns among those who were unwilling to share. Finally, some recommendations to improve data management and sharing practices are presented.


F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 692
Author(s):  
Zsófia Viktória Vida ◽  
István Péter Járay ◽  
Balázs Lengyel

Background: Scientific progress during doctoral studies is a combination of individual effort and teamwork. A recently growing body of interdisciplinary literature has investigated the determinants of early career success in academia, in which learning from supervisors and co-authors plays a great role. Yet, it is less understood how the collaboration patterns of the research team in which the doctoral student participates influence the future career of students. Here we take a social network analysis approach to investigate this and define the research team as the co-authorship network of the student. Methods: We use the Hungarian Scientific Bibliography Database, which includes all publications of PhD students who defended theses from the year 1993. The data also include thesis information, and the publications of co-authors of students. Using these data, we quantify cohesion in the ego-network of PhD students, the impact measured by citations received, and productivity measured by number of publications. We run multivariate linear regressions to measure the relation of network cohesion and publication outputs during doctoral years with future impact. Results: We find that those students in life sciences, but not in other fields, who have a cohesive co-author network during studies and two years after defence receive significantly more citations in eight years. We find that the number of papers published during PhD years and closely after the defence correlates negatively while the impact of these papers correlates positively with future success of students in all fields. Conclusions: These results highlight that research teams are effective learning environments for PhD students where collaborations create a tightly knit knowledge network.


2019 ◽  
Vol 39 (06) ◽  
pp. 329-337
Author(s):  
Juan-José Boté ◽  
Miquel Termens

Research centres, universities and public organisations create datasets that can be reused in research. Reusing data makes it possible to reproduce studies, generate new research questions and new knowledge, but it also gives rise to technical and ethical challenges. Among these issues are repository interoperability to comply with the FAIR principles and issues related to data privacy or anonymity. At the same time, funding institutions require that data management plans be submitted for grants, and research tends to be increasingly interdisciplinary. Interdisciplinarity may entail barriers for researchers to reuse data, such as a lack of skills to manipulate data, given that each discipline generates different types of data in different technical formats, often non-standardized. Additionally, the use of standards to validate data reuse and better metadata to find appropriate datasets seem necessary. This paper offers a review of the literature that addresses data reuse in terms of technical and ethical issues.


2021 ◽  
Vol 12 ◽  
Author(s):  
J. Russell Huie ◽  
Austin Chou ◽  
Abel Torres-Espin ◽  
Jessica L. Nielson ◽  
Esther L. Yuh ◽  
...  

The guiding principle for data stewardship dictates that data be FAIR: findable, accessible, interoperable, and reusable. Data reuse allows researchers to probe data that may have been originally collected for other scientific purposes in order to gain novel insights. The current study reuses the Transforming Research and Clinical Knowledge for Traumatic Brain Injury (TRACK-TBI) Pilot dataset to build upon prior findings and ask new scientific questions. Specifically, we have previously used a multivariate analytics approach to multianalyte serum protein data from the TRACK-TBI Pilot dataset to show that an inflammatory ensemble of biomarkers can predict functional outcome at 3 and 6 months post-TBI. We and others have shown that there are quantitative and qualitative changes in inflammation that come with age, but little is known about how this interaction affects recovery from TBI. Here we replicate the prior proteomics findings with improved missing value analyses and non-linear principal component analysis and then expand upon this work to determine whether age moderates the effect of inflammation on recovery. We show that increased age correlates with worse functional recovery on the Glasgow Outcome Scale-Extended (GOS-E) as well as increased inflammatory signature. We then explore the interaction between age and inflammation on recovery, which suggests that inflammation has a more detrimental effect on recovery for older TBI patients.


2018 ◽  
Author(s):  
Luke Holman ◽  
Claire Morandin

Evidence suggests that women in academia are hindered by conscious and unconscious biases, and often feel excluded from formal and informal opportunities for research collaboration. In addition to ensuring fairness and helping to redress gender imbalance in the academic workforce, increasing women’s access to collaboration could help scientific progress by drawing on more of the available human capital. Here, we test whether researchers tend to collaborate with same-gendered colleagues, using more stringent methods and a larger dataset than in past work. Our results reaffirm that researchers co-publish with colleagues of the same gender more often than expected by chance, and show that this ‘gender homophily’ is slightly stronger today than it was 10 years ago. Contrary to our expectations, we found no evidence that homophily is driven mostly by senior academics, and no evidence that homophily is stronger in fields where women are in the minority. Interestingly, journals with a high impact factor for their discipline tended to have comparatively low homophily, as predicted if mixed-gender teams produce better research. We discuss some potential causes of gender homophily in science.


2021 ◽  
pp. 1-14
Author(s):  
Ebtisam Alharbi ◽  
Rigina Skeva ◽  
Nick Juty ◽  
Caroline Jay ◽  
Carole Goble

The findable, accessible, interoperable, reusable (FAIR) principles for scientific data management and stewardship aim to facilitate data reuse at scale by both humans and machines. Research and development (R&D) in the pharmaceutical industry is becoming increasingly data driven, but managing its data assets according to FAIR principles remains costly and challenging. To date, little scientific evidence exists about how FAIR is currently implemented in practice, what its associated costs and benefits are, and how decisions are made about the retrospective FAIRification of datasets in pharmaceutical R&D. This paper reports the results of semi-structured interviews with 14 pharmaceutical professionals who participate in various stages of drug R&D in 7 pharmaceutical businesses. Inductive thematic analysis identified three primary themes of the benefits and costs of FAIRification, and the elements that influence the decision-making process for FAIRifying legacy datasets. Participants collectively acknowledged the potential contribution of FAIRification to data reusability in diverse research domains and the subsequent potential for cost-savings. Implementation costs, however, were still considered a barrier by participants, with the need for considerable expenditure in terms of resources, and cultural change. How decisions were made about FAIRification was influenced by legal and ethical considerations, management commitment, and data prioritisation. The findings have significant implications for those in the pharmaceutical R&D industry who are engaged in driving FAIR implementation, and for external parties who seek to better understand existing practices and challenges.


2010 ◽  
Vol 9 ◽  
pp. CIN.S5371
Author(s):  
Thomas S. Deisboeck ◽  
Jonathan Sagotsky

The world wide web has furthered the emergence of a multitude of online expert communities. Continued progress on many of the remaining complex scientific questions requires a wide ranging expertise spectrum with access to a variety of distinct data types. Moving beyond peer-to-peer to community-to-community interaction is therefore one of the biggest challenges for global interdisciplinary Life Sciences research, including that of cancer. Cross-domain data query, access, and retrieval will be important innovation areas to enable and facilitate this interaction in the coming years.

