COPO: a metadata platform for brokering FAIR data in the life sciences

2019 ◽  
Author(s):  
Anthony Etuk ◽  
Felix Shaw ◽  
Alejandra Gonzalez-Beltran ◽  
David Johnson ◽  
Marie-Angélique Laporte ◽  
...  

Abstract: Scientific innovation is increasingly reliant on data and computational resources. Much of today’s life science research involves generating, processing, and reusing heterogeneous datasets that are growing exponentially in size. Demand for technical experts (data scientists and bioinformaticians) to process these data is at an all-time high, but these experts are not typically trained in good data management practices. That said, we have come a long way in the last decade, with funders, publishers, and researchers themselves making the case for open, interoperable data as a key component of an open science philosophy. In response, recognition of the FAIR Principles (that data should be Findable, Accessible, Interoperable and Reusable) has become commonplace. However, both technical and cultural challenges for the implementation of these principles still exist when storing, managing, analysing and disseminating both legacy and new data. COPO is a computational system that attempts to address some of these challenges by enabling scientists to describe their research objects (raw or processed data, publications, samples, images, etc.) using community-sanctioned metadata sets and vocabularies, and then use public or institutional repositories to share them with the wider scientific community. COPO encourages data generators to adhere to appropriate metadata standards when publishing research objects, using semantic terms to add meaning to them and specify relationships between them. This allows data consumers, be they people or machines, to find, aggregate, and analyse data which would otherwise be private or invisible, building upon existing standards to push the state of the art in scientific data dissemination whilst minimising the burden of data publication and sharing.
Availability: COPO is entirely open source and freely available on GitHub at https://github.com/collaborative-open-plant-omics. A public instance of the platform for use by the community, as well as more information, can be found at copo-project.org.
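The abstract describes research objects annotated with community-sanctioned metadata and semantic terms that add meaning and express relationships. The following is a minimal, hypothetical sketch of what such an annotated object might look like as a JSON-LD-style record; the field names and structure are illustrative assumptions, not COPO's actual schema or API.

# Hypothetical example of a research object described with community
# vocabularies, in the spirit of the approach described above.
# Field names and structure are illustrative assumptions, not COPO's schema.
import json

research_object = {
    "@context": {
        "title": "http://purl.org/dc/terms/title",
        "creator": "http://purl.org/dc/terms/creator",
        "organism": "http://purl.obolibrary.org/obo/OBI_0100026",
        "partOf": "http://purl.org/dc/terms/isPartOf",
    },
    "@type": "Sample",
    "title": "Leaf tissue sample, drought stress experiment",
    "creator": "A. Researcher",
    # A semantic term adds machine-readable meaning to the free-text value.
    "organism": {
        "@id": "http://purl.obolibrary.org/obo/NCBITaxon_4565",
        "label": "Triticum aestivum",
    },
    # Relationship to another research object (e.g. the raw sequencing run).
    "partOf": {"@id": "https://example.org/objects/run-001"},
}

print(json.dumps(research_object, indent=2))

A data consumer (human or machine) can resolve the term IRIs to aggregate such records with others that use the same vocabularies, which is what makes otherwise invisible data findable and comparable.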

F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 495 ◽  
Author(s):  
Felix Shaw ◽  
Anthony Etuk ◽  
Alice Minotto ◽  
Alejandra Gonzalez-Beltran ◽  
David Johnson ◽  
...  

Scientific innovation is increasingly reliant on data and computational resources. Much of today’s life science research involves generating, processing, and reusing heterogeneous datasets that are growing exponentially in size. Demand for technical experts (data scientists and bioinformaticians) to process these data is at an all-time high, but these experts are not typically trained in good data management practices. That said, we have come a long way in the last decade, with funders, publishers, and researchers themselves making the case for open, interoperable data as a key component of an open science philosophy. In response, recognition of the FAIR Principles (that data should be Findable, Accessible, Interoperable and Reusable) has become commonplace. However, both technical and cultural challenges for the implementation of these principles still exist when storing, managing, analysing and disseminating both legacy and new data. COPO is a computational system that attempts to address some of these challenges by enabling scientists to describe their research objects (raw or processed data, publications, samples, images, etc.) using community-sanctioned metadata sets and vocabularies, and then use public or institutional repositories to share them with the wider scientific community. COPO encourages data generators to adhere to appropriate metadata standards when publishing research objects, using semantic terms to add meaning to them and specify relationships between them. This allows data consumers, be they people or machines, to find, aggregate, and analyse data which would otherwise be private or invisible, building upon existing standards to push the state of the art in scientific data dissemination whilst minimising the burden of data publication and sharing.


2020 ◽  
Author(s):  
Rahul Ramachandran ◽  
Kaylin Bugbee ◽  
Kevin Murphy

Open science is a concept that represents a fundamental change in scientific culture. This change is characterized by openness, where research objects and results are shared as soon as possible, and by connectivity to a wider audience. What Open Science actually means, however, is understood differently by different stakeholders.

Thoughts on Open Science fall into four distinct viewpoints. The first viewpoint strives to make science accessible to a larger community by allowing non-scientists to participate in the research process through citizen science projects and by communicating research results more effectively to the broader public. The second viewpoint considers providing equitable knowledge access to everyone, covering not only access to journal publications but also to other objects in the research process such as data and code. The third viewpoint focuses on making both the research process and the communication of results more efficient. This viewpoint has a social and a technical component: the social component is driven by the need to tackle complex problems that require collaboration and a team approach to science, while the technical component focuses on creating tools, services and especially scientific platforms to make the scientific process more efficient. Lastly, the fourth viewpoint strives to develop new metrics to measure scientific contributions that go beyond the current metrics derived solely from scientific publications, and to consider contributions from other research objects such as data, code or knowledge sharing through blogs and other social media communication mechanisms.

Technological change is a factor in all four of these viewpoints on Open Science. New capabilities in compute, storage, methodologies, publication and sharing enable technologists to better serve as primary drivers for Open Science by providing more efficient technological solutions. Sharing knowledge, information and other research objects such as data and code has become easier with new modalities of sharing available to researchers. In addition, technology is enabling the democratization of science at two levels. First, researchers are no longer constrained by a lack of infrastructure resources needed to tackle difficult problems. Second, citizen science projects now involve the public at different steps of the scientific process, from collecting the data to analysis.

This presentation examines the four described viewpoints on Open Science from the perspective of a large organization involved in scientific data stewardship and management, and lists possible technological strategies that organizations may adopt to align more fully with all aspects of the Open Science movement.


2019 ◽  
Vol 15 (2) ◽  
Author(s):  
Renata Curty

Abstract: The availability of scientific assets through data repositories has greatly increased as a result of government and institutional data sharing policies and mandates for publicly funded research, allowing data to be reused for purposes not always anticipated by the researchers who produced or collected them. Although the argument favouring data sharing is strongly grounded in the potential for data reuse and its contributions to scientific advancement, the topic remains secondary in discussions about data science and open science. This narrative review takes a closer look at data reuse in order to better conceptualize the term, and proposes an initial classification of five distinct data reuse approaches (repurposing, aggregation, integration, meta-analysis and reanalysis) based on hypothetical cases and examples published in the scientific literature. It also explores the determinants of what makes data reusable, relating reusability to the quality of the documentation that accompanies the data. It discusses the challenges associated with data documentation and points out initiatives and recommendations to overcome them. The arguments presented are expected to contribute not only to the conceptual advancement around data reuse and reusability, but also to motivate actions related to data documentation that increase the reuse potential of these scientific assets.
Keywords: Data Reuse; Scientific Reproducibility; Reusability; Open Science; Research Data.


2018 ◽  
Vol 22 (4) ◽  
pp. 53-63 ◽  
Author(s):  
Dmitriy A. Kachan ◽  
Alexandra V. Bogatko ◽  
Ivan N. Bogatko ◽  
Sergei V. Enin ◽  
Uladzimir G. Kulazhanka ◽  
...  

Purpose of the study. The purpose of this paper is to analyze the current state and prospects of introducing the principles of open access to scientific publications and scientific data into the sphere of science and education in Belarus. Its relevance stems from the need to develop measures to accelerate the digital transformation of science and education in the Republic of Belarus.
Materials and methods. The information base of the research comprised publications by scientists and specialists on the issues under study, normative documents, final documents of conferences on this topic, data from the Open Data in Belarus portal, national and international aggregators of institutional repositories, and open scientific data repositories.
Results. The research analyzed the state and prospects of introducing the principles of open access to scientific publications and scientific data into the sphere of science and education in Belarus. It is shown that the digital transformation of science and education is at an early stage. The dissemination of the principles of open science and the introduction of new instruments of scientific communication in Belarusian academic and university science are uneven, and a strategy in this direction needs to be developed. The principle of open access to publications is being introduced into practice most actively through the development of a network of university repositories. In Belarus, the open data infrastructure is at the very beginning of its formation, so additional research is needed to identify problems associated with opening up scientific data. One step in the transition to open science is the unification of all the repositories on a single platform of a national aggregator. A review of national and international aggregators of institutional repositories is presented. Questions concerning the creation of a national aggregator system for open access information resources in the Republic of Belarus, in the context of the formation of the republican information and educational environment, are considered: the purpose of the system, the platform used, technical solutions for organizing the integration of the information systems of higher education institutions, and the rules of interaction of system users (a hedged harvesting sketch follows below).
Conclusions. The creation of a national aggregator system will not only provide a single point of access to the institutional repositories of project participants, which will significantly improve the convenience and completeness of search, but will also address one of the most important tasks of the project: popularizing the idea of open access to scientific publications. The implementation of the proposed measures to create conditions for opening up scientific data in Belarus will contribute to the introduction of the principle of open access to scientific data. The considered approaches will make it possible to accelerate the digital transformation of the scientific and educational sphere of Belarus.
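Institutional repositories are commonly aggregated by harvesting their metadata over OAI-PMH. As a rough illustration of how a national aggregator might pull records from member repositories, here is a minimal sketch; the endpoint URL is a placeholder, and the abstract does not specify the actual protocol or software stack used by the Belarusian system.

# Minimal sketch of OAI-PMH harvesting, a common way to aggregate
# institutional repositories. The endpoint is a placeholder; this is not
# the aggregator described in the abstract.
import xml.etree.ElementTree as ET
import requests

OAI_ENDPOINT = "https://repo.example.by/oai"  # hypothetical repository endpoint
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def harvest_titles(endpoint: str) -> list[str]:
    """Fetch one page of Dublin Core records and return their titles."""
    resp = requests.get(
        endpoint,
        params={"verb": "ListRecords", "metadataPrefix": "oai_dc"},
        timeout=30,
    )
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    titles = []
    for record in root.iter(f"{OAI}record"):
        title = record.find(f".//{DC}title")
        if title is not None and title.text:
            titles.append(title.text)
    return titles

if __name__ == "__main__":
    for t in harvest_titles(OAI_ENDPOINT):
        print(t)

A production aggregator would additionally follow resumption tokens for paging, deduplicate records across repositories, and index the harvested metadata for search, but the request/parse cycle above is the core of the approach.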


2020 ◽  
Author(s):  
Mario Locati ◽  
Francesco Mariano Mele ◽  
Vincenzo Romano ◽  
Placido Montalto ◽  
Valentino Lauciani ◽  
...  

The Istituto Nazionale di Geofisica e Vulcanologia (INGV) has a long tradition of sharing scientific data, well before the Open Science paradigm was conceived. In the last thirty years, a great deal of geophysical data generated by research projects and monitoring activities were published on the Internet, though encoded in multiple formats and made accessible using various technologies.

To organise such a complex scenario, a working group (PoliDat) for implementing an institutional data policy operated from 2015 to 2018. PoliDat published three documents: in 2016, the data policy principles; in 2017, the rules for scientific publications; in 2018, the rules for scientific data management. These documents are available online in Italian and English (https://data.ingv.it/docs/).

According to a preliminary data survey performed between 2016 and 2017, nearly 300 different types of INGV-owned data were identified. In the survey, the compilers were asked to declare all the available scientific data, differentiated by the level of intellectual contribution: level 0 identifies raw data generated by fully automated procedures, level 1 identifies data products generated by semi-automated procedures, level 2 relates to data resulting from scientific investigations, and level 3 is associated with integrated data resulting from complex analysis.

A Data Management Office (DMO) was established in November 2018 to put the data policy into practice. The DMO's first goal was to design and establish a Data Registry aimed at satisfying the highly differentiated requirements of both internal and external users, at either scientific or managerial level. The Data Registry is defined as a metadata catalogue, i.e., a container of data descriptions, not the data themselves. In addition, the DMO supports other activities dealing with scientific data, such as checking contracts, providing advice to the legal office in case of litigation, interacting with the INGV Data Transparency Office, and, in more general terms, supporting the adoption of the Open Science principles.

An extensive set of metadata has been identified to accommodate multiple metadata standards. At first, a preliminary set of metadata describing each dataset is compiled by the authors using a web-based interface; the metadata are then validated by the DMO, and finally a DataCite DOI is minted for each dataset, if one is not already present. The Data Registry is publicly accessible via a dedicated web portal (https://data.ingv.it). A pilot phase aimed at testing the Data Registry was carried out in 2019 and involved a limited number of contributors. To this end, a top-priority data subset was identified according to the relevance of the data within the mission of INGV and the completeness of the information already available. The Directors of the Departments of Earthquakes, Volcanoes, and Environment supervised the selection of the data subset.

The pilot phase helped to test and adjust the decisions made and the procedures adopted during the planning phase, and allowed us to fine-tune the tools for data management. During the next year, the Data Registry will enter its production phase and will be open to contributions from all INGV employees.
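The registry workflow described above ends with minting a DataCite DOI for each validated dataset. As a hedged illustration, the sketch below assembles the kind of minimal metadata DataCite requires (creators, title, publisher, publication year, resource type) and checks it before minting; the field names follow the DataCite schema's mandatory properties in spirit, but the record contents, the contribution-level field, and the surrounding workflow are assumptions, not INGV's actual implementation.

# Illustrative sketch of a minimal DataCite-style metadata record for a
# dataset entering a registry. Values are hypothetical; this is not the
# INGV Data Registry's actual data model.
dataset_metadata = {
    "creators": [{"name": "Rossi, Maria", "affiliation": "INGV"}],
    "titles": [{"title": "Seismic waveform archive, station XYZ, 2019"}],
    "publisher": "Istituto Nazionale di Geofisica e Vulcanologia (INGV)",
    "publicationYear": 2020,
    "types": {"resourceTypeGeneral": "Dataset"},
    # Level of intellectual contribution, per the survey categories above
    # (0 = raw, 1 = data products, 2 = scientific investigation, 3 = integrated).
    "intellectualContributionLevel": 1,
}

def is_ready_for_doi(md: dict) -> bool:
    """Check that the mandatory DataCite-style fields are present before minting."""
    required = ("creators", "titles", "publisher", "publicationYear", "types")
    return all(md.get(field) for field in required)

assert is_ready_for_doi(dataset_metadata)

In a real deployment this validation step would sit between the authors' web-based submission and the DMO's approval, with the DOI request sent to DataCite only after the record passes review.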


2021 ◽  
Vol 40 (2) ◽  
pp. 137-141 ◽  
Author(s):  
Jordan Mansell ◽  
Allison Harell ◽  
Elisabeth Gidengil ◽  
Patrick A. Stewart

Abstract: We introduce the Politics and the Life Sciences special issue on Psychophysiology, Cognition, and Political Differences. This issue is the second special issue funded by the Association for Politics and the Life Sciences that adheres to the Open Science Framework for registered reports (RRs). Under this model, pre-analysis plans (PAPs) are peer-reviewed and given in-principle acceptance (IPA) before data are collected and/or analyzed, and articles are published contingent upon the preregistered study being carried out as proposed. Bound by the common theme of the importance of incorporating psychophysiological perspectives into the study of politics, broadly defined, the articles in this special issue feature a unique set of research questions and methodologies. In the following, we summarize the findings, discuss the innovations produced by this research, and highlight the importance of open science for the future of political science research.


2018 ◽  
Vol 2 ◽  
pp. e24749 ◽  
Author(s):  
Quentin Groom ◽  
Tim Adriaens ◽  
Damiano Oldoni ◽  
Lien Reyserhove ◽  
Diederik Strubbe ◽  
...  

Reducing the damage caused by invasive species requires a community approach informed by rapidly mobilized data. Even if local stakeholders work together, invasive species do not respect borders, and national, continental and global policies are required. Yet, in general, data on invasive species are slow to be mobilized, often of insufficient quality for their intended application, and distributed among many stakeholders and their organizations, including scientists, land managers, and citizen scientists. The Belgian situation is typical: we struggle with the fragmentation of data sources and restrictions on data mobility. Nevertheless, there is a common view that the issue of invasive alien species needs to be addressed. In 2017 we launched the Tracking Invasive Alien Species (TrIAS) project, which envisages a future where alien species data are rapidly mobilized, the spread of exotic species is regularly monitored, and potential impacts and risks are rapidly evaluated in support of policy decisions (Vanderhoeven et al. 2017). TrIAS is building a seamless, data-driven workflow, from raw data to policy support documentation. TrIAS brings together 21 different stakeholder organizations covering all organisms in terrestrial, freshwater and marine environments, including organizations involved in citizen science, research and wildlife management. TrIAS is an Open Science project, and all the software, data and documentation are shared openly (Groom et al. 2018). This means that the workflow can be reused as a whole or in part, either after the project or in different countries. We hope to prove that rapid data workflows are not only an indispensable tool for the control of invasive species, but also a means of integrating and motivating the citizens and organizations involved.
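TrIAS describes a data-driven workflow from raw occurrence data to policy support documentation. As a rough, hypothetical sketch of one early step in such a workflow (not the actual TrIAS pipeline, which is published openly in its own repositories), the snippet below filters a Darwin Core style occurrence table to records of taxa on an alien-species checklist; the file names are placeholders.

# Hypothetical sketch of one step in an invasive-species data workflow:
# filtering occurrence records to taxa on an alien species checklist.
# Column names follow Darwin Core terms; file names are placeholders,
# and this is not the actual TrIAS pipeline.
import pandas as pd

occurrences = pd.read_csv("occurrences.csv")      # e.g. a GBIF download with Darwin Core columns
checklist = pd.read_csv("alien_checklist.csv")    # one row per alien taxon

alien_records = occurrences[
    occurrences["scientificName"].isin(checklist["scientificName"])
]

# A simple indicator a policy report might use: records per taxon per year.
summary = (
    alien_records
    .groupby(["scientificName", "year"])
    .size()
    .reset_index(name="recordCount")
)
print(summary.head())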


2017 ◽  
Author(s):  
Federica Rosetta

Within the Open Science discussions, the current call for “reproducibility” comes from the growing awareness that results as presented in research papers are not as easily reproducible as expected, or have even been contradicted in some reproduction efforts. In this context, transparency and openness are seen as key components to facilitate good scientific practices, as well as scientific discovery. As a result, many funding agencies now require the deposit of research data sets, institutions are improving training on the application of statistical methods, and journals have begun to mandate a high level of detail on the methods and materials used. How can researchers be supported and encouraged to provide that level of transparency? An important component is the underlying research data, which is currently often only partly available within the article. At Elsevier we have therefore been working on journal data guidelines which clearly explain to researchers when and how they are expected to make their research data available. Simultaneously, we have also developed the corresponding infrastructure to make it as easy as possible for researchers to share their data in a way that is appropriate in their field. To ensure researchers get credit for the work they do on managing and sharing data, all our journals support data citation in line with the FORCE11 data citation principles – a key step towards addressing the lack of credit and incentives that emerged from the Open Data analysis (Open Data – the Researcher Perspective, https://www.elsevier.com/about/open-science/research-data/open-data-report) recently carried out by Elsevier together with CWTS. Finally, the presentation will also touch upon a number of initiatives to ensure the reproducibility of software, protocols and methods. With STAR Methods, for instance, methods are submitted in a Structured, Transparent, Accessible Reporting format; this approach promotes rigor and robustness, and makes reporting easier for the author and replication easier for the reader.

