Sharing Scientific Data: Moving Toward “Open Data”

Author(s):  
Pali U. K. De Silva ◽  
Candace K. Vance
Author(s):  
Mariya Dimitrova ◽  
Raïssa Meyer ◽  
Pier Luigi Buttigieg ◽  
Teodor Georgiev ◽  
Georgi Zhelezov ◽  
...  

Data papers have emerged as a powerful instrument for open data publishing, obtaining credit, and establishing priority for datasets generated in scientific experiments. Academic publishing improves data and metadata quality through peer review and increases the impact of datasets by enhancing their visibility, accessibility, and re-usability. We aimed to establish a new type of article structure and template for omics studies: the omics data paper. To improve data interoperability and further incentivise researchers to publish high-quality datasets, we created a workflow for streamlined import of omics metadata directly into a data paper manuscript. An omics data paper template was designed by defining key article sections which encourage the description of omics datasets and methodologies. The workflow was based on REpresentational State Transfer (REST) services and XPath to extract information from the European Nucleotide Archive, ArrayExpress and BioSamples databases, which follow community-agreed standards. The workflow for automatic import of standard-compliant metadata into an omics data paper manuscript facilitates the authoring process. It demonstrates the importance and potential of creating machine-readable and standard-compliant metadata. The omics data paper structure and the workflow to import omics metadata improve the data publishing landscape by providing a novel mechanism for creating high-quality, enhanced metadata records and for peer reviewing and publishing them. It constitutes a powerful addition for the distribution, visibility, reproducibility and re-usability of scientific data. We hope that streamlined metadata re-use for scholarly publishing encourages authors to improve the quality of their metadata to achieve a truly FAIR data world.
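
The XPath-based extraction step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the XML record, the accession `EXAMPLE0001`, the element paths in `SECTION_XPATHS` and the function `extract_metadata` are all assumptions made for illustration, and the record is embedded as a string where a real workflow would presumably retrieve it from a REST endpoint of the archive.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified study record. Real archive XML follows a richer
# schema; in a real workflow this text would come from a REST response.
SAMPLE_XML = """<STUDY_SET>
  <STUDY accession="EXAMPLE0001">
    <DESCRIPTOR>
      <STUDY_TITLE>Example soil metagenome survey</STUDY_TITLE>
      <STUDY_ABSTRACT>Illustrative abstract text.</STUDY_ABSTRACT>
    </DESCRIPTOR>
    <STUDY_ATTRIBUTES>
      <STUDY_ATTRIBUTE><TAG>env_biome</TAG><VALUE>soil</VALUE></STUDY_ATTRIBUTE>
      <STUDY_ATTRIBUTE><TAG>geo_loc_name</TAG><VALUE>Bulgaria</VALUE></STUDY_ATTRIBUTE>
    </STUDY_ATTRIBUTES>
  </STUDY>
</STUDY_SET>"""

# Assumed mapping from manuscript sections to XPath expressions.
SECTION_XPATHS = {
    "title": "./STUDY/DESCRIPTOR/STUDY_TITLE",
    "abstract": "./STUDY/DESCRIPTOR/STUDY_ABSTRACT",
}


def extract_metadata(xml_text):
    """Pull manuscript-ready fields out of one metadata record."""
    root = ET.fromstring(xml_text)
    record = {"accession": root.find("./STUDY").get("accession")}
    # One XPath lookup per manuscript section; missing nodes become None.
    for section, xpath in SECTION_XPATHS.items():
        node = root.find(xpath)
        has_text = node is not None and node.text
        record[section] = node.text.strip() if has_text else None
    # Collect free-form TAG/VALUE pairs into a dictionary.
    record["attributes"] = {
        attr.findtext("TAG"): attr.findtext("VALUE")
        for attr in root.findall(".//STUDY_ATTRIBUTE")
    }
    return record


if __name__ == "__main__":
    for key, value in extract_metadata(SAMPLE_XML).items():
        print(f"{key}: {value}")
```

Keeping the section-to-XPath mapping in a single dictionary means that a change in the source schema would only require editing the path expressions, not the extraction logic.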


BioScience ◽  
2020 ◽  
Author(s):  
Jocelyn P Colella ◽  
Ryan B Stephens ◽  
Mariel L Campbell ◽  
Brooks A Kohli ◽  
Danielle J Parsons ◽  
...  

Abstract The open-science movement seeks to increase transparency, reproducibility, and access to scientific data. As primary data, preserved biological specimens represent records of global biodiversity critical to research, conservation, national security, and public health. However, a recent decrease in specimen preservation in public biorepositories is a major barrier to open biological science. As such, there is an urgent need for a cultural shift in the life sciences that normalizes specimen deposition in museum collections. Museums embody an open-science ethos and provide long-term research infrastructure through curation, data management and security, and community-wide access to samples and data, thereby ensuring scientific reproducibility and extension. We propose that a paradigm shift from specimen ownership to specimen stewardship can be achieved through increased open-data requirements among scientific journals and institutional requirements for specimen deposition by funding and permitting agencies, and through explicit integration of specimens into existing data management plan guidelines and annual reporting.


2013 ◽  
Vol 40 (4) ◽  
pp. 253-263 ◽  
Author(s):  
A. O. Erkimbaev ◽  
V. Yu. Zitserman ◽  
G. A. Kobzev ◽  
V. A. Serebrjakov ◽  
K. B. Teymurazov

2019 ◽  
Vol 15 (2) ◽  
Author(s):  
Eduardo Couto Dalcin ◽  
João Lanna ◽  
Natália Queiroz ◽  
Rafaela Campostrini Forzza

Abstract Since the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities was published in 2003, demand for an “open science”, whose primary concern is to make research activity more transparent, more collaborative and more efficient, has grown in the academic community. Alongside this, the perception has taken hold that access to and sharing of research data contribute significantly to the advance of science and maximize the investments made in research programs. Accordingly, this study presents a proposal composed of digital repositories and computational tools aimed at publishing and sharing information resources in research institutes. The proposed architecture, based on free and open-source tools, proved adequate for the management and publication of information resources in research institutions. However, this approach pointed to the need for a search tool that integrates the different tools, as well as for a controlled vocabulary capable of indexing resources in their different contexts.
Keywords: Open Data; Open Science; Scientific Data Publishing.


Toxins ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 692
Author(s):  
Panagiota Katikou

Currently, digital technologies influence information dissemination in all business sectors, with great emphasis placed on exploitation strategies. Public administrations often use information systems and establish open data repositories, primarily supporting their operation but also serving as data providers, facilitating decision-making. As such, risk analysis in the public health sector, including food safety authorities, often relies on digital technologies and open data sources. Global food safety challenges include marine biotoxins (MBs), contaminants whose mitigation largely depends on risk analysis. Ciguatera Fish Poisoning (CFP), in particular, is an MB-related seafood intoxication attributed to the consumption of fish species that are prone to accumulate ciguatoxins. Historically, CFP occurred endemically in tropical/subtropical areas, but it has gradually emerged in temperate regions, including European waters, necessitating official policy adoption to manage the potential risks. Researchers and policy-makers highlight scientific data inadequacy, under-reporting of outbreaks and information source fragmentation as major obstacles in developing CFP mitigation strategies. Although digital technologies and open data sources provide exploitable scientific information for MB risk analysis, their utilization in counteracting CFP-related hazards has not been addressed to date. This work thus attempts to answer the question, “What is the current extent of digital technologies’ and open data sources’ utilization within risk analysis tasks in the MBs field, particularly on CFP?”, by conducting a systematic literature review of the available scientific and grey literature. Results indicate that the use of digital technologies and open data sources in CFP is not negligible. However, certain gaps are identified regarding discrepancies in terminology, source fragmentation, and redundancy in, and downplaying of, social media utilization, which in turn constitute a future research agenda for this under-researched topic.


2021 ◽  
Author(s):  
Gustavo Caetano Borges ◽  
Julio César dos Reis ◽  
Claudia Bauzer Medeiros

Scientific research in all fields has advanced in complexity and in the amount of data generated. The heterogeneity of data repositories, of the meaning of data and of their metadata standards makes this problem even more significant. In spite of several proposals to find and retrieve research data from public repositories, there is still a need for more comprehensive retrieval solutions. In this article, we specify and develop a mechanism to search for scientific data that takes advantage of metadata records and semantic methods. We present the conception of our architecture and how we have implemented it in a use case in agriculture.


2018 ◽  
Author(s):  
Seth Carbon ◽  
Robin Champieux ◽  
Julie McMurry ◽  
Lilly Winfree ◽  
Letisha R. Wyatt ◽  
...  

Abstract Data is the foundation of science, and there is an increasing focus on how data can be reused and enhanced to drive scientific discoveries. However, most seemingly “open data” do not provide legal permissions for reuse and redistribution. Not being able to integrate and redistribute our collective data resources blocks innovation, and stymies the creation of life-improving diagnostic and drug selection tools. To help the biomedical research and research support communities (e.g. libraries, funders, repositories, etc.) understand and navigate the data licensing landscape, the (Re)usable Data Project (RDP) (http://reusabledata.org) assesses the licensing characteristics of data resources and how licensing behaviors impact reuse. We have created a ruleset to determine the reusability of data resources and have applied it to 56 scientific data resources (i.e. databases) to date. The results show significant reuse and interoperability barriers. Inspired by game-changing projects like Creative Commons, the Wikipedia Foundation, and the Free Software movement, we hope to engage the scientific community in the discussion regarding the legal use and reuse of scientific data, including the balance of openness and how to create sustainable data resources in an increasingly competitive environment.


2018 ◽  
Vol 60 (3) ◽  
pp. 192-198
Author(s):  
Dorota Grygoruk

Abstract The development of information technology makes it possible to collect and analyse more and more data resources. The results of research, regardless of the discipline, constitute one of the main sources of data. Currently, research results are increasingly being published in the Open Access model. The Open Access concept has been accepted and recommended worldwide by many institutions financing and implementing research. Initially, the idea of openness concerned only the results of research and scientific publications; at present, more attention is paid to the problem of sharing scientific data, including raw data. Moving towards open data is a complex process, as the specific nature of data requires the development of an appropriate legal, technical and organizational model, followed by the implementation of data management policies at both the institutional and national levels. The aim of this publication was to present the development of the open data concept in the context of the open access idea, together with the problems related to defining data in the process of data sharing and data management.


2021 ◽  
Author(s):  
Marc Schaming ◽  
Mathieu Turlure ◽  
Jean Schmittbuhl ◽  
Beata Orlecka-Sikora ◽  
Stanislaw Lasocki

The Data Centre for Deep Geothermal Energy (CDGP – Centre de Données de Géothermie Profonde, https://cdgp.u-strasbg.fr) was launched in 2016 by the LabEx G-Eau-Thermie Profonde (now ITI Géosciences pour la transition énergétique GeoT, https://iti-geot.unistra.fr/) to preserve, archive and distribute data acquired on geothermal sites in Alsace. It currently archives and gives access to data from Soultz-sous-Forêts (1988-2010), Rittershoffen (2012-2014) and Vendenheim (2016-2021).

Access to legacy data such as those from Soultz-sous-Forêts (SSF, 1993, 2000) or from Rittershoffen allows data to be reprocessed and new ideas to be validated. Cauchie et al. (2020) reinvestigated earthquakes during the SSF 1993 stimulation and discussed implications for detecting the transition between events related to pre-existing faults and the onset of fresh fractures. Vallier et al. (2019) used a simplified 2D thermo-hydro-mechanical model of the SSF reservoir to infer that the sediments–granite interface has a weak influence on the hydrothermal circulation, and that the brine viscosity has a large impact on it. Koepke et al. (2020) applied a pseudo-probabilistic fracture network method to the seismicity induced during the SSF 2000 stimulation to confirm the existence of a large prominent fault. Drif et al. (2020) used data from the Vendenheim area to determine the seismic moment, the source size, the average stress drop and the focal mechanism associated with the M3 event of November 2019.

Some of the CDGP data are also available on the EPOS Thematic Core Service Anthropogenic Hazards platform (https://tcs.ah-epos.eu/, Orlecka-Sikora et al., 2020), together with other geothermal episodes and with applications to process and analyse the data. This platform is a functional e-research infrastructure that allows free experimentation in a virtual laboratory, promoting interdisciplinary collaboration between stakeholders (the scientific community, industrial partners and society).

Cauchie, L., Lengliné, O. & Schmittbuhl, J., 2020 - Seismic asperity size evolution during fluid injection: case study of the 1993 Soultz-sous-Forêts injection. Geophysical Journal International 221, 968–980.
Drif, K., Lengline, O., Lambotte, S., Kinscher, J. & Schmittbuhl, J., 2020 - Source parameters of the Ml3.0 Strasbourg Earthquake (12th November 2019). Communication at EGW2020, http://labex-geothermie.unistra.fr/wp-content/uploads/2020/12/abstracts-egw2020-en.pdf#page=68.
Koepke, R., Gaucher, E. & Kohl, T., 2020 - Pseudo-probabilistic identification of fracture network in seismic clouds driven by source parameters. Geophys J Int 223, 2066–2084.
Orlecka-Sikora B., Lasocki S., Kocot J., Szepieniec T., Grasso J-R., Garcia-Aristizabal A., Schaming M., Urban P., Jones G., Stimpson, I., Dineva S., Sałek P., Leptokaropoulos K., Lizurek G., Olszewska D., Schmittbuhl J., Kwiatek G., Blanke A., Saccorotti G., Chodzińska K., Rudziński Ł., Dobrzycka I., Mutke G., Barański A., Pierzyna A., Kozlovskaya E., Nevalainen J., Kinscher J., Sileny J., Sterzel M., Cielesta, S., Fischer T., 2020 - An open data infrastructure for the study of anthropogenic hazards linked to georesource exploitation. Scientific Data 7, 89. doi:10.1038/s41597-020-0429-3.
Vallier, B., Magnenet, V., Schmittbuhl, J. & Fond, C., 2019 - Large scale hydro-thermal circulation in the deep geothermal reservoir of Soultz-sous-Forêts (France). Geothermics 78, 154–169.

