Scaling by Optimising: Modularisation of Data Curation Services in Growing Organisations

2021 · Vol 16 (1) · pp. 20
Author(s): Hagen Peukert

After a century of theorising about and applying management practices, we are entering a new stage in management science: digital management. The management of digital data is being absorbed into traditional management functions while, at the same time, it continues to produce viable solutions and conceptualisations in its own established fields, e.g. research data management. Yet bilateral synergies and mutual enrichment between traditional and data management practices can be observed in all fields. The paper at hand addresses a case in point, in which new and old management practices amalgamate to meet a steadily, and at times sharply, increasing demand for data curation services in academic institutions. The idea of modularisation, as known from software engineering, is applied to data curation workflows so that economies of scale and scope can be exploited. While scaling refers to both management science and data science, optimising is understood in the traditional managerial sense, that is, with respect to the cost function. By means of a situation analysis describing how data curation services spread from one department to the entire institution, together with an analysis of the factors of influence, a method of modularisation is outlined that converges to an optimal state of curation workflows.
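The abstract stays at the conceptual level; the sketch below illustrates how modularised curation workflows can yield economies of scale under a simple cost model. All module names, costs, and the composition scheme are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch: a curation workflow as a composition of reusable
# modules. A module built once (setup_cost) can serve many departments,
# so the average cost per curated dataset falls as volume grows
# (economies of scale), and modules can be recombined for new data
# types (economies of scope).

@dataclass
class CurationModule:
    name: str
    setup_cost: float                  # one-time cost to build the module
    unit_cost: float                   # marginal cost per dataset
    run: Callable[[dict], dict]        # the curation step itself

def normalise_metadata(record: dict) -> dict:
    record["title"] = record.get("title", "").strip()
    return record

def assign_identifier(record: dict) -> dict:
    record.setdefault("id", f"dataset-{abs(hash(record['title'])) % 10_000}")
    return record

MODULES = [
    CurationModule("normalise_metadata", setup_cost=40.0, unit_cost=0.5,
                   run=normalise_metadata),
    CurationModule("assign_identifier", setup_cost=25.0, unit_cost=0.2,
                   run=assign_identifier),
]

def curate(record: dict, modules: list[CurationModule]) -> dict:
    for module in modules:
        record = module.run(record)
    return record

def average_cost(modules: list[CurationModule], n_datasets: int) -> float:
    """Setup costs are paid once; unit costs scale with volume."""
    total = sum(m.setup_cost + m.unit_cost * n_datasets for m in modules)
    return total / n_datasets

print(curate({"title": "  Field survey 2020  "}, MODULES))
print(average_cost(MODULES, 10))     # 7.2: setup costs dominate at low volume
print(average_cost(MODULES, 1000))   # 0.765: converges towards the unit costs
```

As volume grows, the average cost per dataset converges to the sum of the unit costs, which is the sense in which a modularised workflow approaches an optimal state.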

2016 · Vol 11 (1) · pp. 156
Author(s): Wei Jeng, Liz Lyon

We report on a case study examining the social science community's capabilities and institutional support for data management. Fourteen researchers were invited to take part in an in-depth qualitative survey between June 2014 and October 2015. We adapt the Community Capability Model Framework (CCMF) profile tool and use it to ask these scholars to self-assess their current data practices and whether their academic environment provides sufficient supporting infrastructure for data-related activities. The exemplar disciplines in this report are anthropology, political science, and library and information science. Our findings deepen our understanding of the social science disciplines and identify capabilities that are well developed as well as those that are poorly developed. Participants reported that their institutions have made relatively slow progress on economic support and data science training courses, but acknowledged that they are well informed and trained in protecting participants' privacy. This confirms observations in the previous literature that social scientists are attentive to ethical considerations but lack technical training and support. The results also demonstrate intra- and inter-disciplinary commonalities and differences in researcher perceptions of data-intensive capability, and highlight potential opportunities for the development and delivery of new and impactful research data management support services for social science researchers and faculty.


2013 · Vol 8 (2) · pp. 5-26
Author(s): Katherine G. Akers, Jennifer Doty

Academic librarians are increasingly engaging in data curation by providing infrastructure (e.g., institutional repositories) and offering services (e.g., data management plan consultations) to support the management of research data on their campuses. Efforts to develop these resources may benefit from a greater understanding of disciplinary differences in research data management needs. After conducting a survey of data management practices and perspectives at our research university, we categorized faculty members into four research domains—arts and humanities, social sciences, medical sciences, and basic sciences—and analyzed variations in their patterns of survey responses. We found statistically significant differences among the four research domains for nearly every survey item, revealing important disciplinary distinctions in data management actions, attitudes, and interest in support services. Serious consideration of both the similarities and dissimilarities among disciplines will help guide academic librarians and other data curation professionals in developing a range of data management services that can be tailored to the unique needs of different scholarly researchers.
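For readers unfamiliar with this kind of per-item analysis, the following is a minimal, hypothetical illustration of testing one categorical survey item for differences across the four research domains. The responses below are invented, and the paper's actual items, data, and statistical procedures are in the original article.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Invented responses for a single yes/no survey item, three respondents
# per domain. Real analyses would have far larger samples; with counts
# this small, a chi-square test would not actually be appropriate.
responses = pd.DataFrame({
    "domain": (["arts_humanities"] * 3 + ["social_sciences"] * 3
               + ["medical_sciences"] * 3 + ["basic_sciences"] * 3),
    "shares_data": ["yes", "no", "no",
                    "yes", "yes", "no",
                    "yes", "yes", "yes",
                    "yes", "no", "yes"],
})

# Cross-tabulate domain against response and test for independence.
table = pd.crosstab(responses["domain"], responses["shares_data"])
chi2, p, dof, _ = chi2_contingency(table)
print(table)
print(f"chi2={chi2:.2f}, p={p:.3f}, dof={dof}")
```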


2019
Author(s): Sara L Wilson, Micah Altman, Rafael Jaramillo

Data stewardship in experimental materials science is increasingly complex and important. Progress in data science and in the inverse design of materials gives reason for optimism that advances can be made if appropriate data resources are made available. Data stewardship also plays a critical role in maintaining broad support for research in the face of well-publicized replication failures (in other fields) and frequently changing attitudes, norms, and sponsor requirements for open science. Present-day data management practices and attitudes in materials science are not well understood. In this article, we collect information on the practices of a selection of materials scientists at two leading universities, using a semi-structured interview instrument. Analysis of these interviews reveals that although data management is universally seen as important, data management practices vary widely. Based on this analysis, we conjecture that broad adoption of basic file-level data sharing at the time of manuscript submission would benefit the field without imposing substantial burdens on researchers. More comprehensive solutions for lifecycle open research in materials science will have to overcome substantial differences in attitudes and practices.


2020 · Vol 2 (1-2) · pp. 238-245
Author(s): Luana Sales, Patrícia Henning, Viviane Veiga, Maira Murrieta Costa, Luís Fernando Sayão, ...

The FAIR principles, an acronym for Findable, Accessible, Interoperable and Reusable, are recognised worldwide as key elements of good practice in all data management processes. To understand how the Brazilian scientific community is adhering to these principles, this article reports on Brazil's adherence to the GO FAIR initiative through the creation of the GO FAIR Brazil Office, and on the manner in which its implementation networks are created. To contextualise this understanding, we provide a brief presentation of open data policies in Brazilian research and government, and finally we describe the model that has been adopted for the GO FAIR Brazil implementation networks. The Brazilian Institute of Information in Science and Technology is responsible for the GO FAIR Brazil Office, which operates in all fields of knowledge and supports thematic implementation networks. Today, GO FAIR Brazil-Health is the first implementation network in operation; it covers all health domains and serves as a model for other fields, such as agriculture, nuclear energy, and digital humanities, which are in the process of negotiating adherence. This report demonstrates the strong interest and effort of the Brazilian scientific communities in implementing the FAIR principles in their research data management practices.


2017
Author(s): Philippa C. Griffin, Jyoti Khadake, Kate S. LeMay, Suzanna E. Lewis, Sandra Orchard, ...

Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a ‘life cycle’ view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain.

Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on ‘omics’ datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.


2021 · Vol 5
Author(s): Medha Devare, Céline Aubert, Omar Eduardo Benites Alfaro, Ivan Omar Perez Masias, Marie-Angélique Laporte

Agricultural research has traditionally been driven by linear, hypothesis-testing approaches. With the advent of powerful data science capabilities, predictive, empirical approaches are possible that operate over large data pools to discern patterns. Such data pools need to contain well-described, machine-interpretable, and openly available data (represented by high-scoring Findable, Accessible, Interoperable, and Reusable, or FAIR, resources). CGIAR's Platform for Big Data in Agriculture has developed several solutions to help researchers generate open and FAIR outputs, determine their FAIRness in quantitative terms, and create high-value data products drawing on these outputs. By accelerating the speed and efficiency of research, these approaches facilitate innovation, allowing the agricultural sector to respond agilely to farmer challenges. In this paper, we describe the Agronomy Field Information Management System, or AgroFIMS, a web-based, open-source tool that helps generate data that is “born FAIRer” by addressing data interoperability to enable aggregation and easier derivation of value from data. Although the license choice that determines accessibility is at the discretion of the user, AgroFIMS provides consistent and rich metadata, helping users more easily comply with institutional, funder, and publisher FAIR mandates. The tool enables the creation of fieldbooks through a user-friendly interface that allows the entry of metadata tied to the Dublin Core standard schema, and of trial details via picklists or autocomplete based on semantic standards such as the Agronomy Ontology (AgrO). Choices are organized by field operations or measurements of relevance to an agronomist, with specific terms drawn from ontologies. Once users have stepped through the required fields and the modules describing their trial management practices and measurement parameters, they can download the fieldbook to use as a standalone, Excel-driven file, or employ the free Android-based KDSmart, Fieldbook, or ODK applications for digital data collection. Collected data can be imported back into AgroFIMS for statistical analysis and reporting. Development plans for 2021 include new features such as the ability to clone fieldbooks and the creation of agronomic questionnaires. AgroFIMS will also allow FAIR data to be archived after collection and analysis, from its database to repository platforms, for wider sharing.
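As a rough illustration of what a “born FAIRer” fieldbook record might look like, the sketch below pairs Dublin Core metadata with ontology-annotated measurements. This is not the AgroFIMS API or file format; the structure, field values, and the AgrO identifier are all placeholders.

```python
import json

# Illustrative structure only: trial metadata tied to Dublin Core terms
# plus measurements annotated with Agronomy Ontology (AgrO) identifiers.
# Every value below, including the AGRO term, is a placeholder.
fieldbook = {
    "metadata": {                        # Dublin Core elements
        "dc:title": "Maize nitrogen response trial",
        "dc:creator": "Example Agronomist",
        "dc:date": "2021-06-01",
        "dc:license": "CC-BY-4.0",       # license choice stays with the user
    },
    "trial": {
        "design": "randomized complete block",
        "operations": ["planting", "fertilizer application", "harvest"],
    },
    "measurements": [
        {"trait": "grain yield",
         "ontology_id": "AGRO:XXXXXXX",  # placeholder AgrO term
         "unit": "t/ha"},
    ],
}

# Export to a standalone file, analogous to downloading a fieldbook
# for offline data collection.
with open("fieldbook.json", "w") as fh:
    json.dump(fieldbook, fh, indent=2)
```

Because every measurement carries a shared ontology identifier rather than a free-text label, records from different trials can be aggregated without manual reconciliation, which is the interoperability gain described above.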


F1000Research · 2018 · Vol 6 · pp. 1618
Author(s): Philippa C. Griffin, Jyoti Khadake, Kate S. LeMay, Suzanna E. Lewis, Sandra Orchard, ...

Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a ‘life cycle’ view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain. Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on ‘omics’ datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.
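As a minimal illustration of the life cycle framing, the sketch below records which tool handled a dataset at which stage. The stage names follow one common formulation and the tool names are only examples; the paper itself surveys the stages, tools, and resources in detail.

```python
from datetime import datetime, timezone
from enum import Enum

# Common data life cycle stages (one of several formulations; this
# listing is indicative, not the paper's exact set).
class Stage(Enum):
    PLAN = "plan"
    COLLECT = "collect"
    PROCESS = "process"
    ANALYSE = "analyse"
    PRESERVE = "preserve"
    SHARE = "share"
    REUSE = "reuse"

def log_stage(dataset_id: str, stage: Stage, tool: str) -> dict:
    """Record which tool handled a dataset at which life cycle stage,
    a minimal form of the provenance that good practice calls for."""
    return {
        "dataset": dataset_id,
        "stage": stage.value,
        "tool": tool,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical omics example: dataset name and tools chosen for
# illustration only.
provenance = [
    log_stage("rnaseq-batch-7", Stage.COLLECT, "Illumina NovaSeq"),
    log_stage("rnaseq-batch-7", Stage.PROCESS, "nf-core/rnaseq"),
    log_stage("rnaseq-batch-7", Stage.PRESERVE, "institutional repository"),
]
for entry in provenance:
    print(entry)
```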


2021
Author(s): Tierney Latham, Catherine Beck, Bruce Wegter, Ahra Wu

Advances in technology have rapidly expanded the capabilities and ubiquity of scientific instrumentation. Coupled with the demand for increased transparency and reproducibility in science, these advances have necessitated new systems of data management and archival practices. Laboratories are working to update their methods of data curation in line with these evolving best practices, moving data from often disorderly private domains to publicly available, collaborative platforms. At the Hamilton Isotope Laboratory (HIL) of Hamilton College, the isotope ratio mass spectrometer (IRMS) is utilized across STEM disciplines for a combination of student, faculty, and course-related research by both internal and external users. With over 200 sets of analytical runs processed in the past five years, documenting instrument usage and archiving the data produced are crucial to maintaining a state-of-the-art facility. However, prior to this project, the HIL faced significant barriers to proper data curation, storage, and accessibility: a) data files were produced with variable format and nomenclature; b) data files were difficult to interpret without explanation from the lab technician; c) key metadata tying results to their respective researchers and projects were missing; d) access to data was limited because it was stored on an individual computer; and e) data curation was an intellectual responsibility and burden for the lab technician. Additionally, as the HIL is housed within an undergraduate institution, the high rate of turnover in lab groups created further barriers to the preservation of long-term institutional knowledge, as students worked with the HIL for a year or less. These factors necessitated the establishment of new data management practices to ensure the accessibility and longevity of scientific data and metadata. In this project, 283 Excel files of previously recorded data generated by the HIL IRMS were modified and cleaned to prepare the data for submission to EarthChem, a public repository for geochemical data. Existing Excel files were manually manipulated, several original R scripts were written and employed, and procedures were established to backtrace projects and collect key metadata. Most critically, a new internal system of data collection was established with a standardized nomenclature and framework. For future usage of the IRMS, data will be exported directly into a template compatible with EarthChem, thereby removing barriers for principal investigators (PIs) and research groups to archive their data in the public domain upon completion of their projects and publications.
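The project itself used R scripts and an EarthChem-defined template, neither of which is reproduced here. Purely as an illustrative sketch of the standardisation step, the following Python assumes a hypothetical filename convention, raw column names, and template columns.

```python
import re
from pathlib import Path

import pandas as pd

# Sketch of the kind of standardisation described above, not the lab's
# actual scripts. The filename pattern, the raw column name
# "Identifier 1", and TEMPLATE_COLUMNS are all assumptions; the real
# EarthChem template defines its own fields.
FILENAME_PATTERN = re.compile(
    r"(?P<pi>[A-Za-z]+)_(?P<project>[A-Za-z0-9-]+)_(?P<date>\d{4}-\d{2}-\d{2})\.xlsx"
)

TEMPLATE_COLUMNS = ["sample_id", "d13C", "d18O", "pi", "project", "run_date"]

def standardise_run(path: Path) -> pd.DataFrame:
    """Parse metadata out of a standardised filename and reshape one
    raw IRMS export into the shared template layout."""
    meta = FILENAME_PATTERN.match(path.name)
    if meta is None:
        raise ValueError(f"File {path.name} does not follow the naming scheme")
    df = pd.read_excel(path)                   # raw instrument export
    df = df.rename(columns={"Identifier 1": "sample_id"})
    for key in ("pi", "project"):
        df[key] = meta.group(key)
    df["run_date"] = meta.group("date")
    return df.reindex(columns=TEMPLATE_COLUMNS)

# Example: combine all archived runs into one repository-ready table.
# frames = [standardise_run(p) for p in Path("hil_archive").glob("*.xlsx")]
# pd.concat(frames).to_csv("earthchem_submission.csv", index=False)
```

Encoding the PI, project, and run date in the filename itself is one way to recover the key metadata the abstract describes as previously missing, since it travels with the file rather than living only in the technician's memory.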

