Bridging the gap between Big Earth data users and future (cloud-based) data systems - Towards a better understanding of user requirements of cloud-based data systems

Author(s):  
Julia Wagemann ◽  
Stephan Siemen ◽  
Jörg Bendix ◽  
Bernhard Seeger

<p>The European Commission’s Earth Observation programme Copernicus produces an unprecedented amount of openly available multi-dimensional environmental data. However, data ‘accessibility’ remains one of the biggest obstacles for users of open Big Earth Data and hinders full data exploitation. Data services have to evolve from pure download services towards easier, more on-demand data access. Different concepts are currently being explored to make Big Earth Data more accessible to users, e.g. virtual research infrastructures, data cube technologies, standardised web services and cloud processing services, such as the Google Earth Engine or the Copernicus Climate Data Store Toolbox. Each offering provides different types of data, tools and functionalities, and data services are often developed to satisfy only specific user requirements and needs.</p><p>For this reason, we conducted a user requirements survey between November 2018 and June 2019 among users of Big Earth Data (including users of Earth Observation data, meteorological and environmental forecasts and other geospatial data) to better understand their requirements. To reach an active data user community for this survey, we partnered with ECMWF, which has 40 years of experience in providing data services for weather forecast data and environmental data sets of the Copernicus Programme.</p><p>We were interested in which datasets users currently use, which datasets they would like to use in the future and why they have not yet explored certain datasets. We also asked about the tools and software they use to process the data and about the challenges they face in accessing and handling Big Earth Data.
Another part of the survey focused on future (cloud-based) data services; here, we were interested in users’ motivation to migrate their data processing tasks to cloud-based data services and in which aspects of these services they consider important.</p><p>While preliminary results of the study were released last year, this year the final study results are presented. A specific focus is put on users’ expectations of future (cloud-based) data services, aligned with recommendations for data users and data providers alike to ensure the full exploitation of Big Earth Data in the future.</p>

2019 ◽  
Vol 3 ◽  
pp. 965
Author(s):  
Safran Yusri ◽  
Vincentius P. Siregar ◽  
Suharsono Suharsono

Long-term Earth observation data stored in Google Earth Engine (GEE) can be ingested and processed to derive biologically relevant environmental variables that can be used as predictors of a species’ niche. The aim of this research was to create a script using GEE to generate biologically meaningful environmental variables from various Earth observation data and models in Indonesia. Elevation and bathymetry raster data from GEBCO were land-masked and benthic terrain modelling was performed to obtain aspect, depth, curvature, and slope. The HYCOM and MODIS Aqua datasets were filtered spatially (Indonesia and the surrounding region) and temporally (2002–2017), and reduced to biologically meaningful variables: the maximum, minimum, and mean. Water velocity vector data (northward and eastward components) were also converted into scalar units. To fill data gaps, kriging was performed using Bayesian slope. Results show that the water depth in Indonesia ranges from 0–6827 m, with slope ranging from 0–34.33°, aspect from 0–359.99°, and curvature from 0–0.94. For variables representing water energy, mean sea surface elevation ranges from 0–0.85 m and mean scalar water velocity from 0–4 m/s. Mean surface salinity ranges from 20.09–35.32‰. Variables representing water quality include the mean particulate organic carbon concentration, which ranges from 25.31–953.47‰, and the mean chlorophyll-a concentration, from 0.05–13.63‰. These data can be used as input for species distribution models or spatially explicit decision support systems, such as Marxan, for spatial planning and zonation in Marine and Coastal Zone Management Plans.
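Two of the reductions described above, converting northward/eastward velocity components to scalar speed and summarising a time series to its minimum, mean and maximum, can be sketched locally in plain Python. This is an illustrative toy, not the authors’ actual GEE script; the arrays and values are hypothetical.

```python
import math

# Hypothetical toy time series of velocity components at two pixels
# (values in m/s); each inner list is one time step: [pixel0, pixel1].
u_east = [[0.3, -0.4], [0.0, 1.2]]   # eastward component
v_north = [[0.4, 0.3], [0.0, -0.5]]  # northward component

# Convert the vector components to scalar speed: |v| = sqrt(u^2 + v^2).
speed = [[math.hypot(u, v) for u, v in zip(us, vs)]
         for us, vs in zip(u_east, v_north)]

def reduce_time(stack, fn):
    """Reduce each pixel's time series with fn, mirroring the
    min/mean/max reducers applied to the filtered collections."""
    return [fn(pixel_series) for pixel_series in zip(*stack)]

speed_min = reduce_time(speed, min)
speed_max = reduce_time(speed, max)
speed_mean = reduce_time(speed, lambda s: sum(s) / len(s))
```

In GEE itself the same pattern is expressed with per-pixel reducers over a filtered `ImageCollection`; the local version just makes the arithmetic explicit.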


2020 ◽  
Vol 12 (8) ◽  
pp. 1253 ◽  
Author(s):  
Vitor Gomes ◽  
Gilberto Queiroz ◽  
Karine Ferreira

In recent years, Earth observation (EO) satellites have generated vast amounts of geospatial data that are freely available to society and researchers. This scenario brings challenges for traditional spatial data infrastructures (SDI) to properly store, process, disseminate and analyze these big data sets. To meet these demands, novel technologies based on cloud computing and distributed systems have been proposed and developed, such as array database systems, MapReduce systems and web services to access and process big Earth observation data. These technologies are now being integrated into cutting-edge platforms in order to support a new generation of SDI for big Earth observation data. This paper presents an overview of seven platforms for big Earth observation data management and analysis: Google Earth Engine (GEE), Sentinel Hub, Open Data Cube (ODC), System for Earth Observation Data Access, Processing and Analysis for Land Monitoring (SEPAL), openEO, JEODPP, and pipsCloud. We also provide a comparison of these platforms according to criteria that represent capabilities of interest to the EO community.


1999 ◽  
Vol 33 (3) ◽  
pp. 55-66 ◽  
Author(s):  
L. Charles Sun

An interactive data access and retrieval system, developed at the U.S. National Oceanographic Data Center (NODC) and available at <ext-link ext-link-type="uri" href="http://www.node.noaa.gov">http://www.node.noaa.gov</ext-link>, is presented in this paper. The purposes of this paper are: (1) to illustrate the procedures for quality control and loading of oceanographic data into the NODC ocean databases and (2) to describe the development of a system to manage, visualize, and disseminate the NODC data holdings over the Internet. The objective of the system is to provide ease of access to the data that will be required by data assimilation models. With advances in the scientific understanding of ocean dynamics, data assimilation models require the synthesis of data from a variety of sources. Modern intelligent data systems usually involve integrating distributed heterogeneous data and information sources. As the repository for oceanographic data, NOAA’s National Oceanographic Data Center (NODC) is in a unique position to develop such a data system. In support of data assimilation needs, NODC has developed a system to facilitate browsing of the oceanographic environmental data and information that is available on-line at NODC. Users may select oceanographic data based on geographic areas, time periods and measured parameters. Once the selection is complete, users may produce a station location plot, produce plots of the parameters or retrieve the data.
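The selection workflow described above (geographic area, time period, measured parameters) can be sketched as a simple filter over station records. The records, coordinates and parameter names below are hypothetical stand-ins, not NODC’s actual database schema.

```python
# Hypothetical station records: (lat, lon, year, parameter) tuples,
# standing in for entries in an ocean database.
stations = [
    (10.5, -60.2, 1995, "temperature"),
    (25.1, -80.0, 1997, "salinity"),
    (26.3, -79.5, 1998, "temperature"),
]

def select(stations, bbox, years, parameter):
    """Select stations inside a lat/lon bounding box and year range
    that measured a given parameter, mirroring the portal's
    area/time/parameter selection steps."""
    lat_min, lat_max, lon_min, lon_max = bbox
    y0, y1 = years
    return [s for s in stations
            if lat_min <= s[0] <= lat_max
            and lon_min <= s[1] <= lon_max
            and y0 <= s[2] <= y1
            and s[3] == parameter]

# Temperature stations in a box near Florida, 1996-1999:
hits = select(stations, bbox=(24.0, 28.0, -81.0, -79.0),
              years=(1996, 1999), parameter="temperature")
```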


2021 ◽  
Vol 3 ◽  
Author(s):  
Ufuoma Ovienmhada ◽  
Fohla Mouftaou ◽  
Danielle Wood

Earth Observation (EO) data can enhance understanding of human-environmental systems for the creation of climate data services, or Decision Support Systems (DSS), to improve the monitoring, prediction and mitigation of climate harm. However, EO data are not always incorporated into decision-makers’ workflows, for a multitude of reasons including awareness, accessibility and collaboration models. The purpose of this study is to demonstrate a collaborative model that addresses historical power imbalances between communities. This paper highlights a case study of a climate harm mitigation DSS collaboration between the Space Enabled Research Group at the MIT Media Lab and Green Keeper Africa (GKA), an enterprise located in Benin. GKA addresses the management of an invasive plant species that threatens ecosystem health and economic activities on Lake Nokoué. It does this through a social entrepreneurship business model that aims to advance both economic empowerment and environmental health. In demonstrating a Space Enabled-GKA collaboration model that advances GKA's business aims, this study first considers several popular service and technology design methods and offers critiques of each in terms of their ability to address inclusivity in complex systems. These critiques lead to the selection of the Systems Architecture Framework (SAF) as the technology design method for the case study. In the remainder of the paper, the SAF is applied to the case study to demonstrate how the framework co-produces knowledge that would inform a DSS built with Earth Observation data. The paper offers several practical considerations and values related to epistemology, data collection, prioritization and methodology for performing inclusive design of climate data services.


Author(s):  
S. Jutz ◽  
M.P. Milagro-Pérez

<span>The European Union-led Copernicus programme, born with the aim of developing space-based global environmental monitoring services to ensure a European autonomous capacity for Earth Observation, comprises a Space Component, Core Services, and In-situ measurements. The Space Component, coordinated by ESA, has seven Sentinel satellites in orbit, with further missions planned, and is complemented by contributing missions, in-situ sensors and numerical models, and delivers many terabytes of accurate climate and environmental data, free and open, every day to hundreds of thousands of users. This makes Copernicus the biggest provider of Earth Observation data in the world.</span>


2021 ◽  
Author(s):  
Edzer Pebesma ◽  
Patrick Griffiths ◽  
Christian Briese ◽  
Alexander Jacob ◽  
Anze Skerlevaj ◽  
...  

<p>The openEO API allows the analysis of large amounts of Earth Observation data using a high-level abstraction of data and processes. Rather than focusing on the management of virtual machines and millions of imagery files, it allows users to create jobs that take a spatio-temporal section of an image collection (such as Sentinel L2A) and treat it as a data cube. Processes iterate or aggregate over pixels, spatial areas, spectral bands, or time series, while working at arbitrary spatial resolution. This pattern, pioneered by Google Earth Engine™ (GEE), lets the user focus on the science rather than on data management.</p><p>The openEO H2020 project (2017–2020) developed the API as well as an ecosystem of software around it, including clients (JavaScript, Python, R, QGIS, browser-based), back-ends that translate API calls into existing image analysis or GIS software or services (for Sentinel Hub, WCPS, Open Data Cube, GRASS GIS, GeoTrellis/GeoPySpark, and GEE), as well as a hub for querying and searching openEO providers for their capabilities and datasets. The project demonstrated this software in a number of use cases in which identical processing instructions were sent to different implementations, allowing comparison of the returned results.</p><p>A follow-up, ESA-funded project, “openEO Platform”, realizes the API and progresses the software ecosystem into operational services and applications that are accessible to everyone, that involve federated deployment (using the clouds managed by EODC, Terrascope, CreoDIAS and EuroDataCube), that will provide payment models (“pay per compute job”) conceived and implemented following the user community’s needs, and that will use the EOSC (European Open Science Cloud) marketplace for dissemination and authentication. A wide range of large-scale case studies will demonstrate the ability of the openEO Platform to scale to large data volumes. The case studies to be addressed include on-demand ARD generation for SAR and multi-spectral data, agricultural demonstrators such as crop type and condition monitoring, forestry services such as near-real-time forest damage assessment and canopy cover mapping, environmental hazard monitoring of floods and air pollution, as well as security applications in terms of vessel detection in the Mediterranean Sea.</p><p>While the landscape of cloud-based EO platforms and services has matured and diversified over the past decade, we believe there are strong advantages for scientists and government agencies in adopting the openEO approach. Beyond the absence of vendor/platform lock-in or EULAs, we highlight the abilities to (i) run arbitrary user code (e.g. written in R or Python) close to the data, (ii) carry out scientific computations on an entirely open-source software stack, (iii) integrate different platforms (e.g., different cloud providers offering different datasets), and (iv) help create and extend this software ecosystem. openEO uses the OpenAPI standard, aligns with modern OGC API standards, and uses STAC (SpatioTemporal Asset Catalog) to describe image collections and image tiles.</p>
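The data-cube abstraction described above, treating an image collection as an array with time, band and spatial dimensions and reducing over one of them, can be illustrated with a toy cube in plain Python. This is not the openEO client API itself, only a sketch of the pattern it exposes; the cube values are made up.

```python
# Toy data cube with dimensions (time, band, y, x), standing in for a
# spatio-temporal section of an image collection such as Sentinel L2A.
cube = [  # two time steps
    [[[0.1, 0.2], [0.3, 0.4]],   # band 0 ("red", hypothetical)
     [[0.5, 0.6], [0.7, 0.8]]],  # band 1 ("nir", hypothetical)
    [[[0.3, 0.2], [0.1, 0.0]],
     [[0.9, 0.8], [0.7, 0.6]]],
]

def reduce_time(cube, reducer):
    """Aggregate over the time dimension, in the spirit of an openEO
    reduce-over-dimension process: the result drops the time axis."""
    n_t, n_b = len(cube), len(cube[0])
    n_y, n_x = len(cube[0][0]), len(cube[0][0][0])
    return [[[reducer([cube[t][b][i][j] for t in range(n_t)])
              for j in range(n_x)]
             for i in range(n_y)]
            for b in range(n_b)]

# A temporal mean composite: one image per band, time axis removed.
mean_composite = reduce_time(cube, lambda v: sum(v) / len(v))
```

The point of the abstraction is that the user writes only the reducer and the dimension; the back-end decides how to tile, parallelise and stream the actual imagery.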


Author(s):  
Wenli Yang

Global long-term Earth Observation (EO) provides valuable information about the land, ocean, and atmosphere of the Earth. EO data are often archived in specialized data systems managed by the data collector. For the data to be fully utilized, one of the most important aspects is to adopt technologies that enable users to easily find and obtain the data they need, in a form that can be readily used with little or no manipulation. Many efforts have been made in this direction, but few, if any, data providers can deliver on-demand, operational data to users in customized form. Geospatial Web Services have been considered a promising solution to this problem. This chapter discusses the potential for operational and scalable delivery of on-demand, personalized EO data using the interoperable Web Coverage Service (WCS) developed by the Open Geospatial Consortium (OGC).
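The on-demand, customized delivery that WCS enables rests on the client requesting only the spatio-temporal slice it needs. A minimal sketch of building a WCS 2.0 GetCoverage request as a key-value-pair URL is shown below; the endpoint and coverage identifier are hypothetical, not a real service.

```python
from urllib.parse import urlencode

# Hypothetical WCS 2.0 endpoint and coverage id - illustrative only.
endpoint = "https://example.org/wcs"
params = {
    "service": "WCS",
    "version": "2.0.1",
    "request": "GetCoverage",
    "coverageId": "MODIS_NDVI",
    "format": "image/tiff",
}
# Spatio-temporal subsetting: instead of downloading the full archive,
# the client asks the server for one lat/lon/time slice.
subsets = ['subset=Lat(-10,10)',
           'subset=Long(95,141)',
           'subset=time("2017-01-01","2017-12-31")']
url = endpoint + "?" + urlencode(params) + "&" + "&".join(subsets)
```

An HTTP GET on such a URL would return only the requested subset, in the requested encoding, which is the "little or no manipulation" property the chapter argues for.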


Author(s):  
Michael J. Williamson ◽  
Emma J. Tebbs ◽  
Henry J. Thompson ◽  
Terrence P. Dawson ◽  
Catherine E. I. Head ◽  
...  

Coral reefs are critical ecosystems globally for marine fauna and biodiversity, and for the services they provide to humanity. However, they are significantly threatened by anthropogenic stressors, such as climate change. By combining nine environmental variables with ecological and health-based thresholds obtained from the available literature, we develop, using fuzzy logic (discontinuous functions), a Coral Reef Stress Exposure Index (CRSEI) for remotely monitoring coral reef exposure to environmental stressors. Our approach capitalises on the abundance of satellite Earth Observation (EO) data readily available in the Google Earth Engine (GEE) cloud-based geospatial processing platform. CRSEI values from 3157 distinct reefs were generated and mapped across 12 important coral reef ecosystem regions. Quantitative analyses indicated that the index detected significant temporal differences in stress and was, therefore, able to capture historic change at a global scale. We also applied the CRSEI to three case-study reef ecosystems previously well monitored for stress and disturbance using other methods. Principal component analysis indicated that depth, current, sea surface temperature (SST) and SST anomaly accounted for the greatest contribution to the variance in stress in these three regions. The CRSEI corroborated temporal and spatial differences in stress exposure from known disturbances within these reference regions, in addition to identifying the potential drivers of inter- and intra-region differences in stress, namely depth, degree heating weeks and SST anomaly. We discuss how the index can be further improved in future with site-specific thresholds for each stress variable and the incorporation of additional variables not currently available in GEE. This index provides an open-access tool, built around a free and powerful processing platform, that has broad potential to assist in the regular monitoring of our increasingly imperilled coral reef ecosystems, in particular those that are remote or inaccessible.
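The general shape of a threshold-based fuzzy stress index can be sketched as follows. The variable names, thresholds and the equal-weight combination are hypothetical illustrations of the approach, not the published CRSEI parameterisation (which uses nine variables and literature-derived thresholds).

```python
def stress_membership(value, low, high):
    """Fuzzy membership for one stressor: 0 at or below an ecological
    threshold `low`, 1 at or above a health-based threshold `high`,
    linear in between (a simple ramp; the published index uses its own
    membership functions)."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

# Hypothetical observations for one reef pixel:
# variable -> (observed value, low threshold, high threshold)
observations = {
    "sst_anomaly_degC": (1.5, 0.0, 2.0),
    "degree_heating_weeks": (2.0, 0.0, 8.0),
    "chlorophyll_mg_m3": (0.2, 0.45, 3.0),
}

memberships = [stress_membership(v, lo, hi)
               for (v, lo, hi) in observations.values()]
# Combine per-variable exposures into one index value (plain mean here;
# a real index may weight variables differently).
stress_index = sum(memberships) / len(memberships)
```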


2021 ◽  
Author(s):  
Marcus Strobl ◽  
Elnaz Azmi ◽  
Sibylle K. Hassler ◽  
Mirko Mälicke ◽  
Jörg Meyer ◽  
...  

<p>The virtual research environment V-FOR-WaTer aims at simplifying data access for the environmental sciences, fostering data publications and facilitating data analyses. By giving scientists from universities, research facilities and state offices easy access to data, appropriate pre-processing and analysis tools and workflows, we want to accelerate scientific work and facilitate the reproducibility of analyses.</p><p>The prototype of the virtual research environment consists of a database with a detailed metadata scheme that is adapted to water and terrestrial environmental data. The datasets currently in the web portal originate from university projects and state offices. We are also finalising the connection of V-FOR-WaTer to GFZ Data Services, an established repository for geoscientific data. This will ease the publication of data from the portal and in turn give access to datasets stored in this repository. Key to being compatible with GFZ Data Services and other systems is the compliance of the metadata scheme with international standards (INSPIRE, ISO 19115).</p><p>The web portal is designed to facilitate typical workflows in the environmental sciences. Map operations and filter options ensure easy selection of the data, while the workspace area provides tools for data pre-processing, scaling, and common hydrological applications. The toolbox also contains more specific tools, e.g. for geostatistics and soon for evapotranspiration. It is easily extendable and will ultimately also include user-developed tools, reflecting current research topics and methodologies in the hydrology community. Tools are accessed through Web Processing Services (WPS) and can be joined, saved and shared as workflows, enabling more complex analyses and ensuring reproducibility of the results.</p>
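The idea of joining tools into saved, shareable workflows can be sketched as an ordered chain of processing steps applied to a dataset. The tool functions and data below are hypothetical toys, not V-FOR-WaTer's actual WPS tools.

```python
def gap_fill(series):
    """Toy pre-processing step: replace missing values (None) with
    the mean of the known values."""
    known = [v for v in series if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in series]

def scale(series, factor):
    """Toy scaling step, e.g. a unit conversion."""
    return [v * factor for v in series]

def run_workflow(data, steps):
    """Apply a saved sequence of tool steps in order; re-running the
    same list on the same data reproduces the same result."""
    for step in steps:
        data = step(data)
    return data

# A workflow joining two tools, as one might save and share it:
workflow = [gap_fill, lambda s: scale(s, 0.001)]
result = run_workflow([2.0, None, 4.0], workflow)
```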

