The ATMODAT Standard enhances FAIRness of Atmospheric Model data

2021 ◽  
Author(s):  
Angelika Heil ◽  
Anette Ganske ◽  
Andrea Lammert ◽  
Daniel Heydebreck ◽  
Hannes Thiemann

<p>Atmospheric Model data form the basis for understanding and predicting weather, climate and air quality phenomena. Access to these data is of interest not only to a wide scientific community but also to public services, companies, politicians and citizens. One way to make the data available is to publish them via a data repository. To ensure that datasets in a repository are indeed <strong>F</strong>indable, <strong>A</strong>ccessible, <strong>I</strong>nteroperable, and <strong>R</strong>eusable (i.e. FAIR<sup>1</sup>), it is essential that the data are stored together with detailed metadata and that the file structure and metadata follow an established standard. Furthermore, datasets are easier to find and reuse if the corresponding metadata is machine-readable and uses a standardised vocabulary. While data standardisation is well established in large, internationally coordinated model intercomparison projects (e.g. for climate models in CMIP<sup>2</sup>), joint standards are still lacking in many atmospheric modelling sub-disciplines, such as urban climate or cloud-resolving modelling. </p><p>The AtMoDat project (<strong>At</strong>mospheric <strong>Mo</strong>del <strong>Dat</strong>a)<sup>3</sup>, led by a team of atmospheric scientists and infrastructure providers, aims to improve the overall FAIRness of atmospheric model data and thus promote their re-use. Within the project, the ATMODAT standard<sup>4</sup> has been developed, which includes precise recommendations to achieve enhanced FAIRness of atmospheric model data in repositories. A prerequisite of this standard is that the data are published with a DataCite DOI<sup>5</sup>. The ATMODAT standard specifies requirements for rich metadata with controlled vocabularies, structured landing pages, file formats (netCDF) and the structure within files. Human- and machine-readable landing pages holding discipline-specific metadata are a core element of this standard. 
</p><p>The ATMODAT standard is easy to implement and provides checklists for data curators and data producers. In addition, to facilitate checking compliance with the ATMODAT standard, the <em>atmodat data checker</em><sup>6</sup> has been developed. A dataset that complies with this standard will follow the FAIR principles and its metadata will be of high quality. If this compliance has been verified by the respective repository, the dataset can be labelled with the <strong>Ea</strong>rth <strong>Sy</strong>stem <strong>Da</strong>ta <strong>B</strong>randing (EASYDAB)<sup>7</sup>. This branding makes it easy for users to verify that the data are properly curated and the metadata has been quality assured.</p><p><sup>1</sup> Juckes et al., 2020: https://doi.org/10.5194/gmd-13-201-2020<br><sup>2</sup> Eyring et al., 2016: https://doi.org/10.5194/gmd-9-1937-2016<br><sup>3</sup> www.atmodat.de<br><sup>4</sup> https://doi.org/10.35095/WDCC/atmodat_standard_en_v3_0<br><sup>5</sup> https://datacite.org<br><sup>6</sup> https://github.com/AtMoDat/atmodat_data_checker<br><sup>7</sup> https://easydab.de</p>

2021 ◽  
Author(s):  
Anette Ganske ◽  
Amandine Kaiser ◽  
Angelina Kraft ◽  
Daniel Heydebreck ◽  
Andrea Lammert ◽  
...  

<p>As in many scientific disciplines, there are a variety of activities in Earth system sciences that address the important aspects of good research data management. What has not been sufficiently investigated and dealt with so far is the easy discoverability and re-use of quality-checked data. This aspect is taken up by the EASYDAB label.</p><p>EASYDAB<sup>1</sup> is a branding, currently under development, for FAIR and open data from the Earth System Sciences. The branding can be adopted by institutions running a data repository which stores data from the Earth System Sciences. EASYDAB is always connected to a research data publication with DataCite DOIs. Data published under EASYDAB are characterized by a high maturity, extensive metadata information and compliance with a comprehensive discipline-specific standard. For these datasets, the EASYDAB logo is added to the landing page of the data repository. Thereby, repositories can indicate their efforts to publish data with high maturity.</p><p>The first standard made for EASYDAB is the ATMODAT standard<sup>2</sup>, which has been developed within the AtMoDat<sup>3</sup> project (Atmospheric Model Data). It incorporates concrete recommendations and requirements related to the maturity, publication and enhanced FAIRness of atmospheric model data. The requirements are for rich metadata with controlled vocabularies, structured landing pages, file formats (netCDF) and the structure within files. Human- and machine-readable landing pages are a core element of the ATMODAT standard and should hold and present discipline-specific metadata on simulation and variable level. </p><p>The ATMODAT standard includes checklists for the data producer and the data curator so that both sides can easily achieve compliance with the standard. To facilitate automatic checking of the netCDF file headers, a checker program will also be provided and published with a DOI. 
Moreover, a checker for compliance with the DOI metadata requirements will be developed and made openly available. </p><p>The integration of standards from other disciplines in the Earth System Sciences, such as oceanography, into EASYDAB is helpful and desirable to improve the re-use of reviewed, high-quality data. </p><p><sup>1</sup>www.easydab.de</p><p><sup>2</sup>https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=atmodat_standard_en_v3_0</p><p><sup>3</sup>www.atmodat.de</p>


2021 ◽  
Author(s):  
Francesca Frexia ◽  
Cecilia Mascia ◽  
Luca Lianas ◽  
Giovanni Delussu ◽  
Alessandro Sulis ◽  
...  

Abstract. The FAIR Principles are a set of recommendations that aim to underpin knowledge discovery and integration by making research outcomes Findable, Accessible, Interoperable and Reusable. These guidelines encourage the accurate recording and exchange of structured data, coupled with contextual information about their creation, expressed in domain-specific standards and machine-readable formats. This paper analyses the potential support to FAIRness of the openEHR e-health standard, by theoretically assessing the compliance with each of the 15 FAIR principles of a hypothetical Clinical Data Repository (CDR) developed according to the openEHR specifications. Our study highlights how the openEHR approach, thanks to its computable semantics-oriented design, is inherently FAIR-enabling and is a promising implementation strategy for creating FAIR-compliant CDRs.


2021 ◽  
Author(s):  
Andrea Lammert ◽  
Anette Ganske ◽  
Amandine Kaiser ◽  
Angelina Kraft

<p>Due to the increasing amount of data produced in science, concepts for data reusability are of immense importance. One aspect is the publication of data in a way that ensures that they are findable, accessible, interoperable and reusable (FAIR<sup>1</sup> principles). However, putting these principles into practice often causes significant difficulties for researchers. Therefore, some repositories accept datasets described only with the minimum metadata required for DOI allocation. Unfortunately, this minimum does not contain enough information to conform to the FAIR principles, so many research datasets cannot be reused despite having a DOI. In contrast, other repositories aid researchers by providing advice and strictly checking the data and their metadata. To simplify the process of defining the required amount of metadata and of checking the data and metadata, the AtMoDat<sup>2</sup> (Atmospheric Model Data) project developed a detailed standard for the FAIR publication of atmospheric model data.</p><p>For this purpose we have developed a concept for the “ideal” description of atmospheric model data. A prerequisite for this is that the data are published with a DataCite DOI. The ATMODAT standard<sup>3</sup> was developed to implement this concept. The standard defines the data format as netCDF, mandatory metadata (for DOI, landing page and data header), and naming conventions used in climate research, namely the Climate and Forecast conventions (CF-conventions<sup>4</sup>). However, many variable names used in urban climate research, for example, are not part of the CF-conventions. For these variables, standard names have to be defined together with the community, and their inclusion in the CF-conventions has to be requested. Furthermore, we developed and published Python routines which allow data producers as well as repositories to check model output data against the standard. 
</p><p>The ATMODAT standard will first be applied by the project partners at the two participating universities (the Universities of Hamburg and Leipzig). Here, climate model data are processed with a post-processor in preparation for publication. Subsequently, the files including the specified metadata for the DataCite metadata schema will be published by the World Data Center for Climate<sup>5</sup> (WDCC). Data fulfilling the ATMODAT standard will be marked on the landing page with a special EASYDAB<sup>6</sup> (Earth System Data Branding) logo. EASYDAB is a branding, currently under development, for FAIR and open data from the Earth System Sciences. The logo indicates to future data users that the dataset is a verified dataset that can be easily reused. The standardization of the data and the further steps are easily transferable to data from other disciplines.</p><p>1 Wilkinson, M., Dumontier, M., Aalbersberg, I. et al.: The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18 </p><p>2 https://www.atmodat.de/</p><p>3 https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=atmodat_standard_en_v3_0</p><p>4 https://cfconventions.org/</p><p>5 https://cera-www.dkrz.de/WDCC/ui/cerasearch/</p><p>6 https://www.easydab.de/</p>
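
The kind of header check described in this abstract can be illustrated with a minimal sketch. It assumes the global attributes of a file have already been read into a plain dictionary. The attribute list and the function name `check_global_attributes` are hypothetical and chosen only for illustration; they are not taken from the ATMODAT standard or from the project's published Python routines.

```python
# Hypothetical list of required global attributes, for illustration only.
# The actual requirements are defined in the ATMODAT standard document.
REQUIRED_ATTRIBUTES = ("Conventions", "title", "institution", "source")


def check_global_attributes(attrs, required=REQUIRED_ATTRIBUTES):
    """Return a list of human-readable problems found in `attrs`.

    `attrs` maps global attribute names to their values, e.g. as read
    from a netCDF file header by whatever library is in use.
    """
    problems = []
    for name in required:
        value = attrs.get(name)
        if value is None:
            problems.append(f"missing attribute: {name}")
        elif not str(value).strip():
            problems.append(f"empty attribute: {name}")
    return problems


# Made-up header with one empty and one missing attribute.
example_header = {
    "Conventions": "CF-1.8",
    "title": "Example urban climate simulation",
    "institution": "",
}
report = check_global_attributes(example_header)
for line in report:
    print(line)
```

Returning a list of problems rather than raising on the first one lets a data producer fix all issues in a single pass, which matches the checklist-style workflow the abstract describes.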


2014 ◽  
Vol 27 (10) ◽  
pp. 3848-3868 ◽  
Author(s):  
John T. Allen ◽  
David J. Karoly ◽  
Kevin J. Walsh

Abstract The influence of a warming climate on the occurrence of severe thunderstorm environments in Australia was explored using two global climate models: Commonwealth Scientific and Industrial Research Organisation Mark, version 3.6 (CSIRO Mk3.6), and the Cubic-Conformal Atmospheric Model (CCAM). These models have previously been evaluated and found to be capable of reproducing a useful climatology for the twentieth-century period (1980–2000). Analyzing the changes between the historical period and high warming climate scenarios for the period 2079–99 has allowed estimation of the potential convective future for the continent. Based on these simulations, significant increases to the frequency of severe thunderstorm environments will likely occur for northern and eastern Australia in a warmed climate. This change is a response to increasing convective available potential energy from higher continental moisture, particularly in proximity to warm sea surface temperatures. Despite decreases to the frequency of environments with high vertical wind shear, it appears unlikely that this will offset increases to thermodynamic energy. The change is most pronounced during the peak of the convective season, increasing its length and the frequency of severe thunderstorm environments therein, particularly over the eastern parts of the continent. The implications of this potential increase are significant, with the overall frequency of potential severe thunderstorm days per year likely to rise over the major population centers of the east coast by 14% for Brisbane, 22% for Melbourne, and 30% for Sydney. The limitations of this approach are then discussed in the context of ways to increase the confidence of predictions of future severe convection.


2021 ◽  
Author(s):  
Christian Zeman ◽  
Christoph Schär

<p>Since their first operational application in the 1950s, atmospheric numerical models have become essential tools in weather and climate prediction. As such, they are constantly subject to change, thanks to advances in computer systems, numerical methods, and the ever-increasing knowledge about the atmosphere of Earth. Many of the changes in today's models relate to seemingly innocuous modifications, associated with minor code rearrangements, changes in hardware infrastructure, or software upgrades. Such changes are meant to preserve the model formulation, yet their verification is challenged by the chaotic nature of our atmosphere: any small change, even rounding errors, can have a big impact on individual simulations. Overall this represents a serious challenge to a consistent model development and maintenance framework.</p><p>Here we propose a new methodology for quantifying and verifying the impacts of minor changes to an atmospheric model, or to its underlying hardware/software system, by using ensemble simulations in combination with a statistical hypothesis test. The methodology can assess effects of model changes on almost any output variable over time, and can also be used with different hypothesis tests.</p><p>We present first applications of the methodology with the regional weather and climate model COSMO. The changes considered include a major system upgrade of the supercomputer used, the change from double to single precision floating-point representation, changes in the update frequency of the lateral boundary conditions, and tiny changes to selected model parameters. While providing very robust results, the methodology also shows a large sensitivity to more significant model changes, making it a good candidate for an automated tool to guarantee model consistency in the development cycle.</p>
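
The idea of comparing ensembles with a hypothesis test can be sketched as follows. This is a heavily simplified illustration, not the authors' method: it uses a permutation test on the difference of ensemble means for a single scalar output, whereas the abstract describes applying a test to model output variables over time. The function name, the synthetic temperature-like values and the ensemble size of 20 are all assumptions made for the example.

```python
import random
from statistics import mean


def permutation_p_value(ref, new, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of ensemble means."""
    rng = random.Random(seed)
    observed = abs(mean(ref) - mean(new))
    pooled = list(ref) + list(new)
    n_ref = len(ref)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign members to the two ensembles.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_ref]) - mean(pooled[n_ref:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Synthetic ensembles standing in for one output variable at one time.
rng = random.Random(42)
reference = [rng.gauss(280.0, 1.0) for _ in range(20)]
unchanged = [rng.gauss(280.0, 1.0) for _ in range(20)]  # same distribution
shifted = [rng.gauss(283.0, 1.0) for _ in range(20)]    # systematic change

p_same = permutation_p_value(reference, unchanged)
p_shift = permutation_p_value(reference, shifted)
print(f"unchanged model: p = {p_same:.3f}")
print(f"shifted model:   p = {p_shift:.3f}")
```

Because individual chaotic runs differ even without any model change, the comparison is made between distributions of ensemble members rather than between single simulations; a small p-value for the changed model then suggests a difference beyond internal variability.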


2014 ◽  
Vol 1 (2) ◽  
pp. 1283-1312
Author(s):  
M. Abbas ◽  
A. Ilin ◽  
A. Solonen ◽  
J. Hakkarainen ◽  
E. Oja ◽  
...  

Abstract. In this work, we consider the Bayesian optimization (BO) approach for tuning parameters of complex chaotic systems. Such problems arise, for instance, in tuning the sub-grid scale parameterizations in weather and climate models. For such problems, the tuning procedure is generally based on a performance metric which measures how well the tuned model fits the data. This tuning is often a computationally expensive task. We show that BO, as a tool for finding the extrema of computationally expensive objective functions, is suitable for such tuning tasks. In the experiments, we consider tuning parameters of two systems: a simplified atmospheric model and a low-dimensional chaotic system. We show that BO is able to tune parameters of both systems with a low number of objective function evaluations and without the need for any gradient information.


2018 ◽  
Vol 10 (3) ◽  
pp. 1605-1612 ◽  
Author(s):  
Christophe Genthon ◽  
Alexis Berne ◽  
Jacopo Grazioli ◽  
Claudio Durán Alarcón ◽  
Christophe Praz ◽  
...  

Abstract. Compared to the other continents and lands, Antarctica suffers from a severe shortage of in situ observations of precipitation. APRES3 (Antarctic Precipitation, Remote Sensing from Surface and Space) is a program dedicated to improving the observation of Antarctic precipitation, both from the surface and from space, to assess climatologies and to evaluate and improve meteorological and climate models. A field measurement campaign was deployed at Dumont d'Urville station on the coast of Adélie Land in Antarctica, with an intensive observation period from November 2015 to February 2016 using X-band and K-band radars, a snow gauge, snowflake cameras and a disdrometer, followed by continuous radar monitoring through 2016 and beyond. Among other results, the observations show that a significant fraction of precipitation sublimates in a dry surface katabatic layer before it reaches and accumulates at the surface, a result derived from profiling radar measurements. While the bulk of the data analyses and scientific results are published in specialized journals, this paper provides a compact description of the dataset now archived in the PANGAEA data repository (https://www.pangaea.de, https://doi.org/10.1594/PANGAEA.883562) and made open to the scientific community to further its exploitation for Antarctic meteorology and climate research purposes.


2019 ◽  
Vol 46 (8) ◽  
pp. 622-638
Author(s):  
Joachim Schöpfel ◽  
Dominic Farace ◽  
Hélène Prost ◽  
Antonella Zane

Data papers have been defined as scholarly journal publications whose primary purpose is to describe research data. Our survey provides more insights about the environment of data papers, i.e., disciplines, publishers and business models, and about their structure, length, formats, metadata, and licensing. Data papers are a product of the emerging ecosystem of data-driven open science. They contribute to the FAIR principles for research data management. However, the boundaries with other categories of academic publishing are partly blurred. Data papers can be generated automatically and are potentially machine-readable. Data papers are essentially information, i.e., description of data, but also partly contribute to the generation of knowledge and data in their own right. As part of the new ecosystem of open and data-driven science, data papers and data journals are an interesting and relevant object for the assessment and understanding of the transition of the former system of academic publishing.


2020 ◽  
Vol 29 (6) ◽  
pp. 483-491
Author(s):  
Anette Ganske ◽  
Daniel Heydebreck ◽  
Heinke Höck ◽  
Angelina Kraft ◽  
Johannes Quaas ◽  
...  