A Standard for the FAIR publication of Atmospheric Model Data developed by the AtMoDat Project

Author(s):  
Andrea Lammert ◽  
Anette Ganske ◽  
Amandine Kaiser ◽  
Angelina Kraft

<p>Due to the increasing amount of data produced in science, concepts for data reusability are of immense importance. One aspect is the publication of data in a way that ensures that it is findable, accessible, interoperable and reusable (FAIR<sup>1</sup> principles). However, putting these principles into practice often causes significant difficulties for researchers. Some repositories therefore accept datasets described only with the minimum metadata required for DOI allocation. Unfortunately, this minimum does not contain enough information to conform to the FAIR principles, so many research data cannot be reused despite having a DOI. In contrast, other repositories aid researchers by providing advice and strictly controlling the data and their metadata. To simplify the process of defining the required metadata and of checking the data and metadata, the AtMoDat<sup>2</sup> (Atmospheric Model Data) project developed a detailed standard for the FAIR publication of atmospheric model data.</p><p>For this purpose we have developed a concept for the “ideal” description of atmospheric model data. A prerequisite for this is the data publication with a DataCite DOI. The ATMODAT standard<sup>3</sup> was developed to implement this concept. The standard defines the data format as NetCDF, mandatory metadata (for DOI, landing page and data header), and naming conventions used in climate research, namely the Climate and Forecast conventions (CF-conventions<sup>4</sup>). However, many variable names used in urban climate research, for example, are not part of the CF-conventions. For these, standard names have to be defined together with the community and their inclusion in the CF-conventions has to be requested. Furthermore, we developed and published Python routines which allow data producers as well as repositories to check model output data against the standard. 
</p><p>The ATMODAT standard will first be applied by the project partners at the two participating universities (Universities of Hamburg and Leipzig). Here, climate model data are processed with a post-processor in preparation for publication. Subsequently, the files, including the metadata specified for the DataCite metadata schema, will be published by the World Data Center for Climate<sup>5</sup> (WDCC). Data fulfilling the ATMODAT standard will be marked on the landing page with a special EASYDAB<sup>6</sup> (Earth System Data Branding) logo. EASYDAB is a branding, currently under development, for FAIR and open data from the Earth System Sciences. The logo indicates to future data users that the dataset is verified and can be easily reused. The standardization of the data and the further steps are easily transferable to data from other disciplines.</p><p>1 Wilkinson, M., Dumontier, M., Aalbersberg, I. et al.: The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18 </p><p>2 https://www.atmodat.de/</p><p>3 https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=atmodat_standard_en_v3_0</p><p>4 https://cfconventions.org/</p><p>5 https://cera-www.dkrz.de/WDCC/ui/cerasearch/</p><p>6 https://www.easydab.de/</p>
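
The kind of header check described in this abstract can be illustrated with a minimal sketch. It is not the project's published Python routines: the function name is hypothetical, the file header is stood in for by a plain dict, and the attribute names are common CF-style global attributes picked for illustration only. The actual list of mandatory metadata is defined in the ATMODAT standard itself.

```python
# Hypothetical sketch: the attribute names below are common CF-style global
# attributes chosen for illustration, NOT the ATMODAT list of mandatory metadata.
REQUIRED_GLOBAL_ATTRIBUTES = ("Conventions", "title", "institution", "source")


def find_missing_attributes(header, required=REQUIRED_GLOBAL_ATTRIBUTES):
    """Return the required attributes that are absent or empty in a file header.

    `header` is a plain dict standing in for the global attributes of a
    NetCDF file; reading a real file is left out to keep the sketch self-contained.
    """
    missing = []
    for name in required:
        value = header.get(name)
        # Treat None and whitespace-only strings as "not provided".
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


example_header = {
    "Conventions": "CF-1.8",
    "title": "Example urban climate simulation",
    "institution": "",  # present but empty, so it counts as missing
}

print(find_missing_attributes(example_header))
```

A data producer could run such a check before submission and a repository could run it again during curation, so both sides apply the same list of requirements.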

2021 ◽  
Author(s):  
Anette Ganske ◽  
Amandine Kaiser ◽  
Angelina Kraft ◽  
Daniel Heydebreck ◽  
Andrea Lammert ◽  
...  

<p>As in many scientific disciplines, there are a variety of activities in Earth system sciences that address the important aspects of good research data management. What has not been sufficiently investigated and dealt with so far is the easy discoverability and re-use of quality-checked data. This aspect is taken up by the EASYDAB label.</p><p>EASYDAB<sup>1</sup> is a branding, currently under development, for FAIR and open data from the Earth System Sciences. The branding can be adopted by institutions running a data repository which stores data from the Earth System Sciences. EASYDAB is always connected to a research data publication with DataCite DOIs. Data published under EASYDAB are characterized by a high maturity, extensive metadata information and compliance with a comprehensive discipline-specific standard. For these datasets, the EASYDAB logo is added to the landing page of the data repository. Thereby, repositories can indicate their efforts to publish data with high maturity.</p><p>The first standard made for EASYDAB is the ATMODAT standard<sup>2</sup>, which has been developed within the AtMoDat<sup>3</sup> project (Atmospheric Model Data). It incorporates concrete recommendations and requirements related to the maturity, publication and enhanced FAIRness of atmospheric model data. The requirements cover rich metadata with controlled vocabularies, structured landing pages, file formats (netCDF) and the structure within files. Human- and machine-readable landing pages are a core element of the ATMODAT standard and should hold and present discipline-specific metadata at the simulation and variable level. </p><p>The ATMODAT standard includes checklists for the data producer and the data curator so that compliance with the standard can easily be achieved by both sides. To facilitate automatic checking of the netCDF file headers, a checker program will also be provided and published with a DOI. 
Moreover, a checker for compliance with the requirements for the DOI metadata will be developed and made openly available. </p><p>The integration of standards from other disciplines in the Earth System Sciences, such as oceanography, into EASYDAB is helpful and desirable to improve the re-use of reviewed, high-quality data. </p><p> <sup>1</sup>www.easydab.de</p><p><sup>2</sup>https://cera-www.dkrz.de/WDCC/ui/cerasearch/entry?acronym=atmodat_standard_en_v3_0</p><p><sup>3</sup>www.atmodat.de</p>


2016 ◽  
Author(s):  
S. Tilmes ◽  
J.-F. Lamarque ◽  
L. K. Emmons ◽  
D. E. Kinnison ◽  
D. Marsh ◽  
...  

Abstract. The Community Earth System Model, CESM1 CAM4-chem, has been used to perform the Chemistry Climate Model Initiative (CCMI) reference and sensitivity simulations. In this model, the Community Atmospheric Model Version 4 (CAM4) is fully coupled to tropospheric and stratospheric chemistry. Details and specifics of each configuration, including new developments and improvements, are described. CESM1 CAM4-chem is a low-top model that reaches up to approximately 40 km and uses a horizontal resolution of 1.9° latitude and 2.5° longitude. For the specified dynamics experiments, the model is nudged to Modern-Era Retrospective Analysis For Research And Applications (MERRA) reanalysis. We summarize the performance of the three reference simulations suggested by CCMI, with a focus on the observed period. Comparisons with selected datasets are employed to demonstrate the general performance of the model. We highlight new datasets that are suited for multi-model evaluation studies. The most important improvements of the model are the treatment of stratospheric aerosols and the corresponding adjustments for radiation and optics, the updated chemistry scheme including improved polar chemistry and stratospheric dynamics, and improved dry deposition rates. These updates lead to a very good representation of tropospheric ozone, within 20 % of values from available observations for most regions. In particular, the trend and magnitude of surface ozone have been much improved compared to earlier versions of the model. Furthermore, stratospheric column ozone of the Southern Hemisphere in winter and spring is reasonably well represented. All experiments still underestimate CO, most significantly in Northern Hemisphere spring, and show a significant underestimation of hydrocarbons based on surface observations.


2022 ◽  
Author(s):  
Anette Ganske ◽  
Angelika Heil ◽  
Andrea Lammert ◽  
Hannes Thiemann

2019 ◽  
Author(s):  
Miguel D. Mahecha ◽  
Fabian Gans ◽  
Gunnar Brandt ◽  
Rune Christiansen ◽  
Sarah E. Cornell ◽  
...  

Abstract. Understanding Earth system dynamics in the light of ongoing human intervention and dependency remains a major scientific challenge. The unprecedented availability of data streams describing different facets of the Earth now offers fundamentally new avenues to address this quest. However, several practical hurdles, especially the lack of data interoperability, limit the joint potential of these data streams. Today many initiatives within and beyond the Earth system sciences are exploring new approaches to overcome these hurdles and meet the growing inter-disciplinary need for data-intensive research; using data cubes is one promising avenue. Here, we introduce the concept of Earth system data cubes and how to operate on them in a formal way. The idea is that treating multiple data dimensions, such as spatial, temporal, variable, frequency and other grids alike, allows effective application of user-defined functions to co-interpret Earth observations and/or model-data. An implementation of this concept combines analysis-ready data cubes with a suitable analytic interface. In three case studies we demonstrate how the concept and its implementation facilitate the execution of complex workflows for research across multiple variables, spatial and temporal scales: (1) summary statistics for ecosystem and climate dynamics; (2) intrinsic dimensionality analysis on multiple time-scales; and (3) data-model integration. We discuss the emerging perspectives for investigating global interacting and coupled phenomena in observed or simulated data. Latest developments in machine learning, causal inference, and model data integration can be seamlessly implemented in the proposed framework, supporting rapid progress in data-intensive research across disciplinary boundaries.


2020 ◽  
Vol 11 (1) ◽  
pp. 201-234 ◽  
Author(s):  
Miguel D. Mahecha ◽  
Fabian Gans ◽  
Gunnar Brandt ◽  
Rune Christiansen ◽  
Sarah E. Cornell ◽  
...  

Abstract. Understanding Earth system dynamics in light of ongoing human intervention and dependency remains a major scientific challenge. The unprecedented availability of data streams describing different facets of the Earth now offers fundamentally new avenues to address this quest. However, several practical hurdles, especially the lack of data interoperability, limit the joint potential of these data streams. Today, many initiatives within and beyond the Earth system sciences are exploring new approaches to overcome these hurdles and meet the growing interdisciplinary need for data-intensive research; using data cubes is one promising avenue. Here, we introduce the concept of Earth system data cubes and how to operate on them in a formal way. The idea is that treating multiple data dimensions, such as spatial, temporal, variable, frequency, and other grids alike, allows effective application of user-defined functions to co-interpret Earth observations and/or model–data integration. An implementation of this concept combines analysis-ready data cubes with a suitable analytic interface. In three case studies, we demonstrate how the concept and its implementation facilitate the execution of complex workflows for research across multiple variables, and spatial and temporal scales: (1) summary statistics for ecosystem and climate dynamics; (2) intrinsic dimensionality analysis on multiple timescales; and (3) model–data integration. We discuss the emerging perspectives for investigating global interacting and coupled phenomena in observed or simulated data. In particular, we see many emerging perspectives of this approach for interpreting large-scale model ensembles. The latest developments in machine learning, causal inference, and model–data integration can be seamlessly implemented in the proposed framework, supporting rapid progress in data-intensive research across disciplinary boundaries.
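
The idea of treating all cube dimensions alike when applying a user-defined function can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the cube is a plain dict keyed by coordinate tuples, and the dimension names, the values and the helper `reduce_over` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy cube: dimension names and values are made up for illustration.
DIMS = ("variable", "time", "lat", "lon")

cube = {
    ("temperature", 0, 10.0, 20.0): 280.0,
    ("temperature", 1, 10.0, 20.0): 282.0,
    ("temperature", 0, 12.0, 20.0): 270.0,
    ("temperature", 1, 12.0, 20.0): 274.0,
    ("precipitation", 0, 10.0, 20.0): 1.0,
    ("precipitation", 1, 10.0, 20.0): 3.0,
}


def reduce_over(cube, dims, dim, func):
    """Apply `func` to all values that differ only in dimension `dim`.

    Returns a new cube (dict) whose keys omit `dim`; every other dimension
    is treated alike, whether it is spatial, temporal or the variable axis.
    """
    axis = dims.index(dim)
    groups = defaultdict(list)
    for key, value in cube.items():
        reduced_key = key[:axis] + key[axis + 1:]
        groups[reduced_key].append(value)
    return {key: func(values) for key, values in groups.items()}


# Temporal mean per variable and grid cell.
time_mean = reduce_over(cube, DIMS, "time", mean)
print(time_mean[("temperature", 10.0, 20.0)])
```

Because the function only needs the name of the dimension to collapse, the same call works for a spatial mean (`"lat"`, `"lon"`) or for a statistic across variables, which is the point of handling all dimensions uniformly.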


2020 ◽  
Author(s):  
Daniel Neumann ◽  
Anette Ganske ◽  
Vivien Voss ◽  
Angelina Kraft ◽  
Heinke Höck ◽  
...  

<p>The generation of high-quality research data is expensive. The FAIR principles were established to foster the reuse of such data for the benefit of the scientific community and beyond. Publishing research data with metadata and DataCite DOIs in public repositories makes them findable and accessible (FA of FAIR). However, DOIs and basic metadata do not guarantee that the data are actually reusable without discipline-specific knowledge. Reuse is hampered if data are saved in proprietary or undocumented file formats, if detailed discipline-specific metadata are missing, or if quality information on the data and metadata is not provided. In this contribution, we present ongoing work in the AtMoDat project, a consortium of atmospheric scientists and infrastructure providers that aims to improve the reusability of atmospheric model data.<br>  <br>Consistent standards are necessary to simplify the reuse of research data. Although standardization of file structure and metadata is well established for some subdomains of the earth system modeling community (e.g. CMIP), several other subdomains lack such standardization. Hence, scientists from the Universities of Hamburg and Leipzig and infrastructure operators cooperate in the AtMoDat project in order to advance standardization for model output files in specific subdomains of the atmospheric modeling community. Starting from the demanding CMIP6 standard, the aim is to establish an easy-to-use standard that is at least compliant with the Climate and Forecast (CF) conventions. In parallel, an existing netCDF file convention checker is being extended to check for the new standards. This enhanced checker is designed to support the creation of compliant files and thus lower the hurdle for data producers to comply with the new standard. The transfer of this approach to further sub-disciplines of the earth system modeling community will be supported by a best-practice guide and other documentation. 
A showcase of a standard for the urban atmospheric modeling community will be presented in this session. The standard is based on the CF conventions and adapts several global attributes and controlled vocabularies from the well-established CMIP6 standard.<br>  <br>Additionally, the AtMoDat project aims to introduce a generic quality indicator into the DataCite metadata schema to foster further reuse of data. This quality indicator should require a discipline-specific implementation of a quality standard linked to the indicator. We will present the concept of the generic quality indicator in general and in the context of urban atmospheric modeling data. </p>


2021 ◽  
Author(s):  
Angelika Heil ◽  
Anette Ganske ◽  
Andrea Lammert ◽  
Daniel Heydebreck ◽  
Hannes Thiemann

<p>Atmospheric model data form the basis for understanding and predicting weather, climate and air quality phenomena. Access to these data is of interest not only to a wide scientific community but also to public services, companies, politicians and citizens. One way to make the data available is to publish them via a data repository. To ensure that datasets in a repository are indeed <strong>F</strong>indable, <strong>A</strong>ccessible, <strong>I</strong>nteroperable, and <strong>R</strong>eusable (i.e. FAIR<sup>1</sup>), it is essential that the data are stored together with detailed metadata and that the file structure and metadata follow an established standard. Furthermore, datasets are easier to find and reuse if the corresponding metadata are machine-readable and use a standardised vocabulary. While data standardization is well established in large, internationally coordinated model intercomparison projects (e.g. for climate models in CMIP<sup>2</sup>), joint standards are still lacking in many atmospheric modelling sub-disciplines, such as urban climate or cloud-resolving modelling. </p><p>The AtMoDat project (<strong>At</strong>mospheric <strong>Mo</strong>del <strong>Dat</strong>a)<sup>3</sup>, led by a team of atmospheric scientists and infrastructure providers, aims to improve the overall FAIRness of atmospheric model data and thus promote their re-use. Within the project, the ATMODAT standard<sup>4</sup> has been developed, which includes precise recommendations to achieve enhanced FAIRness of atmospheric model data in repositories. A prerequisite of this standard is that the data are published with a DataCite DOI<sup>5</sup>. The ATMODAT standard specifies requirements for rich metadata with controlled vocabularies, structured landing pages, file formats (netCDF) and the structure within files. Human- and machine-readable landing pages holding discipline-specific metadata are a core element of this standard. 
</p><p>The ATMODAT standard is easy to implement and provides checklists for data curators and data producers. In addition, to facilitate the compliance check with the ATMODAT standard, the <em>atmodat data checker</em><sup>6</sup> has been developed. A dataset that complies with this standard will follow the FAIR principles and its metadata will be of high quality. If this compliance has been verified by the respective repository, the dataset can be labelled with the <strong>Ea</strong>rth <strong>Sy</strong>stem <strong>Da</strong>ta <strong>B</strong>randing (EASYDAB)<sup>7</sup>. This branding makes it easy for users to verify that the data are properly curated and the metadata have been quality assured.</p><p><sup>1</sup>  Juckes et al., 2020: https://doi.org/10.5194/gmd-13-201-2020 <br><sup>2</sup>  Eyring et al., 2016: https://doi.org/10.5194/gmd-9-1937-2016<br><sup>3</sup>  www.atmodat.de<br><sup>4</sup>  https://doi.org/10.35095/WDCC/atmodat_standard_en_v3_0<br><sup>5</sup>  https://datacite.org<br><sup>6</sup>  https://github.com/AtMoDat/atmodat_data_checker <br><sup>7</sup>  https://easydab.de</p>


2020 ◽  
Author(s):  
Lee de Mora ◽  
Alistair Sellar ◽  
Andrew Yool ◽  
Julien Palmieri ◽  
Robin S. Smith ◽  
...  

<p>With the ever-growing interest of the general public in understanding climate science, it is becoming increasingly important that we present this information in ways accessible to non-experts. In this pilot study, we use time series data from the first United Kingdom Earth System model (UKESM1) to create six procedurally generated musical pieces and use them to explain the process of modelling the earth system and to engage with the wider community. </p><p>Scientific data is almost always represented graphically, either in figures or in videos. By adding audio to the visualisation of model data, the combination of music and imagery provides additional contextual clues to aid in the interpretation. Furthermore, the audiolisation of model data can be employed to generate interesting and captivating music, which can not only reach a wider audience, but also hold the attention of the listeners for extended periods of time.</p><p>Each of the six pieces presented in this work was themed around either a scientific principle or a practical aspect of earth system modelling. These pieces demonstrate the concepts of a spin up, a pre-industrial control run, multiple historical experiments, and the use of several future climate scenarios to a wider audience. They also show the ocean acidification over the historical period, the changes in circulation, the natural variability of the pre-industrial simulations, and the expected rise in sea surface temperature over the 20th century. </p><p>Each of these pieces was arranged using a different musical progression, style and tempo. All six pieces were performed by the digital piano synthesizer, TiMidity++, and were published on the lead author's YouTube channel. The videos all show the progression of the data in time with the music, and a brief description of the methodology is posted alongside each video. 
</p><p>To disseminate these works, links to each piece were published on the lead author's personal and professional social media accounts. The reach of these works was also analysed using YouTube's channel monitoring toolkit for content creators, YouTube Studio.</p>
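
One simple way to turn a model time series into notes is to scale the values linearly onto a range of MIDI pitch numbers and snap each result to a musical scale. The sketch below illustrates only that general idea: the abstract does not describe the mapping used for the six pieces, and the pitch range, the C major scale, the function names and the data values are all assumptions made for illustration.

```python
# Hypothetical sketch: the pitch range, the scale and the data values are
# illustrative assumptions, not the settings used for the published pieces.
C_MAJOR_PITCH_CLASSES = (0, 2, 4, 5, 7, 9, 11)  # semitones above C


def snap_to_scale(pitch, pitch_classes=C_MAJOR_PITCH_CLASSES):
    """Move a MIDI pitch down to the nearest note that lies in the scale."""
    while pitch % 12 not in pitch_classes:
        pitch -= 1
    return pitch


def series_to_pitches(values, low=48, high=84):
    """Linearly map a time series onto MIDI pitches between `low` and `high`."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    pitches = []
    for value in values:
        # A constant series has no range, so place it in the middle.
        fraction = (value - vmin) / span if span else 0.5
        pitch = low + round(fraction * (high - low))
        pitches.append(snap_to_scale(pitch))
    return pitches


# Made-up sea surface temperature anomalies (degrees Celsius).
sst_anomaly = [0.0, 0.1, 0.3, 0.2, 0.6, 1.0]
print(series_to_pitches(sst_anomaly))
```

Snapping to a scale keeps the result tonal, at the cost of small differences in the data sometimes mapping to the same note.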


2010 ◽  
Vol 12 (1) ◽  
pp. 1-8 ◽  
Author(s):  
Yunqiang ZHU ◽  
Jiulin SUN ◽  
Shunbao LIAO ◽  
Yapeng YANG ◽  
Huazhong ZHU ◽  
...  

2010 ◽  
Vol 11 (1) ◽  
pp. 1-9 ◽  
Author(s):  
Yunqiang ZHU ◽  
Min FENG ◽  
Jia SONG ◽  
Runda LIU
