A review and framework for the evaluation of pixel-level uncertainty estimates in satellite aerosol remote sensing

2020 ◽  
Vol 13 (2) ◽  
pp. 373-404 ◽  
Author(s):  
Andrew M. Sayer ◽  
Yves Govaerts ◽  
Pekka Kolmonen ◽  
Antti Lipponen ◽  
Marta Luffarelli ◽  
...  

Abstract. Recent years have seen the increasing inclusion of per-retrieval prognostic (predictive) uncertainty estimates within satellite aerosol optical depth (AOD) data sets, providing users with quantitative tools to assist in the optimal use of these data. Prognostic estimates contrast with diagnostic (i.e. relative to some external truth) ones, which are typically obtained using sensitivity and/or validation analyses. Up to now, however, the quality of these uncertainty estimates has not been routinely assessed. This study presents a review of existing prognostic and diagnostic approaches for quantifying uncertainty in satellite AOD retrievals, and it presents a general framework to evaluate them based on the expected statistical properties of ensembles of estimated uncertainties and actual retrieval errors. It is hoped that this framework will be adopted as a complement to existing AOD validation exercises; it is not restricted to AOD and can in principle be applied to other quantities for which a reference validation data set is available. This framework is then applied to assess the uncertainties provided by several satellite data sets (seven over land, five over water), which draw on methods from the empirical to sensitivity analyses to formal error propagation, at 12 Aerosol Robotic Network (AERONET) sites. The AERONET sites are divided into those for which it is expected that the techniques will perform well and those for which some complexity about the site may provide a more severe test. Overall, all techniques show some skill in that larger estimated uncertainties are generally associated with larger observed errors, although they are sometimes poorly calibrated (i.e. too small or too large in magnitude). No technique uniformly performs best. For powerful formal uncertainty propagation approaches such as optimal estimation, the results illustrate some of the difficulties in appropriate population of the covariance matrices required by the technique. 
When the data sets are confronted by a situation strongly counter to the retrieval forward model (e.g. potentially mixed land–water surfaces or aerosol optical properties outside the family of assumptions), some algorithms fail to provide a retrieval, while others do but with a quantitatively unreliable uncertainty estimate. The discussion suggests paths forward for the refinement of these techniques.
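The calibration idea at the heart of such a framework can be illustrated with a small sketch (not the authors' code; the uncertainty magnitudes below are invented for illustration): if the reported 1-sigma uncertainties are well calibrated and errors are roughly Gaussian, about 68.3 % of the actual retrieval errors should fall within their own estimated 1-sigma bounds.

```python
import random

def coverage_fraction(errors, sigmas, k=1.0):
    """Fraction of retrieval errors falling within k times the estimated 1-sigma uncertainty."""
    inside = sum(1 for e, s in zip(errors, sigmas) if abs(e) <= k * s)
    return inside / len(errors)

random.seed(0)
# Hypothetical ensemble: each retrieval reports its own uncertainty sigma_i,
# and here the actual error is drawn consistently with it (perfect calibration).
sigmas = [random.uniform(0.02, 0.15) for _ in range(10000)]
errors = [random.gauss(0.0, s) for s in sigmas]

frac = coverage_fraction(errors, sigmas)
# For well-calibrated Gaussian uncertainties, frac should be close to 0.683;
# values far below indicate overconfident estimates, far above overcautious ones.
print(round(frac, 3))
```

A poorly calibrated data set would show up here as a coverage fraction well away from 0.683 even though larger uncertainties still track larger errors.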


2014 ◽  
Vol 7 (3) ◽  
pp. 781-797 ◽  
Author(s):  
P. Paatero ◽  
S. Eberly ◽  
S. G. Brown ◽  
G. A. Norris

Abstract. The EPA PMF (Environmental Protection Agency positive matrix factorization) version 5.0 and the underlying multilinear engine-executable ME-2 contain three methods for estimating uncertainty in factor analytic models: classical bootstrap (BS), displacement of factor elements (DISP), and bootstrap enhanced by displacement of factor elements (BS-DISP). The goal of these methods is to capture the uncertainty of PMF analyses due to random errors and rotational ambiguity. It is shown that the three methods complement each other: depending on characteristics of the data set, one method may provide better results than the other two. Results are presented using synthetic data sets, including interpretation of diagnostics, and recommendations are given for parameters to report when documenting uncertainty estimates from EPA PMF or ME-2 applications.
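The classical bootstrap (BS) idea can be sketched in miniature (a stand-in illustration, not EPA PMF or ME-2; an ordinary least-squares slope plays the role of a fitted factor element): resample observations with replacement, refit the model, and take the spread of the refitted values as the uncertainty estimate.

```python
import random
import statistics

def fit_slope(xs, ys):
    # ordinary least-squares slope (stand-in for a fitted factor element)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

random.seed(1)
xs = [i / 10 for i in range(100)]
ys = [2.0 * x + random.gauss(0.0, 0.5) for x in xs]  # true slope 2.0 plus noise

# classical bootstrap: resample rows with replacement, refit, keep each estimate
slopes = []
for _ in range(500):
    idx = [random.randrange(len(xs)) for _ in range(len(xs))]
    slopes.append(fit_slope([xs[i] for i in idx], [ys[i] for i in idx]))

mean_slope = statistics.fmean(slopes)
sd_slope = statistics.stdev(slopes)  # the spread of refits is the BS uncertainty
print(round(mean_slope, 2), round(sd_slope, 3))
```

DISP and BS-DISP go further by also perturbing the fitted factor elements themselves, which this sketch does not attempt.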


2011 ◽  
pp. 24-32 ◽  
Author(s):  
Nicoleta Rogovschi ◽  
Mustapha Lebbah ◽  
Younès Bennani

Most traditional clustering algorithms are limited to handling data sets that contain either continuous or categorical variables. However, data sets with mixed types of variables are common in the data mining field. In this paper we introduce a weighted self-organizing map for the clustering, analysis and visualization of mixed data (continuous/binary). The weights and prototypes are learned simultaneously, ensuring an optimized clustering of the data. The higher a variable's weight, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is combined with a weighting process over the different variables, computing weights that influence the quality of the clustering. We illustrate the power of this method with data sets taken from a public data set repository: a handwritten digit data set, the Zoo data set and three other mixed data sets. The results show a good quality of the topological ordering and homogeneous clustering.
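The role of the variable weights can be sketched as a weighted distance in the prototype-assignment step (a minimal illustration, not the authors' algorithm; the prototypes, point and weights are invented): down-weighting a variable can change which prototype a point is assigned to.

```python
def weighted_distance(x, prototype, weights):
    # variables with larger weights contribute more to the distance
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, prototype))

def assign(x, prototypes, weights):
    # nearest-prototype assignment under the weighted distance
    return min(range(len(prototypes)),
               key=lambda k: weighted_distance(x, prototypes[k], weights))

prototypes = [[0.0, 0.0], [1.0, 1.0]]
x = [0.9, 0.2]

print(assign(x, prototypes, [1.0, 1.0]))  # → 1: both variables count equally
print(assign(x, prototypes, [0.1, 1.0]))  # → 0: down-weighting variable 1 flips the assignment
```

In the paper's setting the weights are learned jointly with the prototypes rather than fixed by hand as they are here.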


2017 ◽  
Vol 6 (3) ◽  
pp. 71 ◽  
Author(s):  
Claudio Parente ◽  
Massimiliano Pepe

The purpose of this paper is to investigate the impact of weights in pan-sharpening methods applied to satellite images. Different data sets of weights have been considered and compared in the IHS and Brovey methods. The first data set contains the same weight for each band, while the second takes into account the weights obtained from the spectral radiance response; these two data sets are the most common in pan-sharpening applications. The third data set results from a new method, which computes the first-order moment of inertia of each band taking into account the spectral response. To test the impact of the weights of the different data sets, WorldView-3 satellite images have been considered. In particular, two different scenes (the first in an urban landscape, the second in a rural landscape) have been investigated. The quality of the pan-sharpened images has been analysed with three different quality indices: root mean square error (RMSE), relative average spectral error (RASE) and Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS).
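The way the band weights enter a method such as Brovey can be sketched per pixel (a simplified illustration with assumed radiance values; real implementations operate on whole co-registered, resampled images): each band is rescaled by the ratio of the panchromatic value to the weighted intensity of the multispectral bands.

```python
def brovey_sharpen(ms_pixel, pan, weights):
    # weighted intensity of the multispectral pixel
    intensity = sum(w * b for w, b in zip(weights, ms_pixel))
    # each band is rescaled by the ratio of the panchromatic value to that intensity
    return [b * pan / intensity for b in ms_pixel]

ms = [0.2, 0.4, 0.6]  # assumed radiances for three bands
fused = brovey_sharpen(ms, 0.5, [1 / 3, 1 / 3, 1 / 3])  # equal weights (first data set)
print([round(v, 3) for v in fused])  # → approximately [0.25, 0.5, 0.75]
```

Swapping in spectral-response-based weights changes only the `weights` argument, which is exactly the comparison the paper carries out.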


2005 ◽  
Vol 5 (7) ◽  
pp. 1835-1841 ◽  
Author(s):  
S. Noël ◽  
M. Buchwitz ◽  
H. Bovensmann ◽  
J. P. Burrows

Abstract. A first validation of water vapour total column amounts derived from measurements of the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) in the visible spectral region has been performed. For this purpose, SCIAMACHY water vapour data have been determined for the year 2003 using an extended version of the Differential Optical Absorption Spectroscopy (DOAS) method, called Air Mass Corrected DOAS (AMC-DOAS). The SCIAMACHY results are compared with corresponding water vapour measurements by the Special Sensor Microwave Imager (SSM/I) and with model data from the European Centre for Medium-Range Weather Forecasts (ECMWF). In confirmation of previous results, SCIAMACHY-derived water vapour columns are typically slightly lower than both SSM/I and ECMWF data, especially over ocean areas. However, these deviations are much smaller than the observed scatter of the data, which is caused by the different temporal and spatial sampling and resolution of the data sets. For example, the overall difference with ECMWF data is only -0.05 g/cm2, whereas the typical scatter is on the order of 0.5 g/cm2. Both values show almost no variation over the year. In addition, first monthly means of SCIAMACHY water vapour data have been computed. The quality of these monthly means is currently limited by the availability of calibrated SCIAMACHY spectra. Nevertheless, first comparisons with ECMWF data show that SCIAMACHY (and similar instruments) can provide a new independent global water vapour data set.
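The bias/scatter comparison described above amounts to simple statistics on collocated column pairs (a sketch with invented numbers, not the actual SCIAMACHY/ECMWF match-ups): the mean difference gives the bias, and the spread of the differences gives the scatter.

```python
import random
import statistics

def bias_and_scatter(retrieved, reference):
    # bias: mean of (retrieved - reference); scatter: spread of the differences
    diffs = [a - b for a, b in zip(retrieved, reference)]
    return statistics.fmean(diffs), statistics.stdev(diffs)

random.seed(2)
# invented collocations: retrieval biased low by 0.05 g/cm2 with 0.5 g/cm2 scatter
reference = [random.uniform(0.5, 6.0) for _ in range(5000)]
retrieved = [r - 0.05 + random.gauss(0.0, 0.5) for r in reference]

bias, scatter = bias_and_scatter(retrieved, reference)
print(round(bias, 2), round(scatter, 2))
```

A bias an order of magnitude smaller than the scatter, as in the paper, means sampling and resolution differences dominate the individual match-ups.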


2019 ◽  
Vol 7 (3) ◽  
pp. SE113-SE122 ◽  
Author(s):  
Yunzhi Shi ◽  
Xinming Wu ◽  
Sergey Fomel

Salt boundary interpretation is important for the understanding of salt tectonics and for velocity model building for seismic migration. Conventional methods consist of computing salt attributes and extracting salt boundaries. We have formulated the problem as 3D image segmentation and evaluated an efficient approach based on deep convolutional neural networks (CNNs) with an encoder-decoder architecture. To train the model, we design a data generator that extracts randomly positioned subvolumes from a large-scale 3D training data set, followed by data augmentation, and then feed a large number of subvolumes into the network, using salt/nonsalt binary labels generated by thresholding the velocity model as ground truth. We test the model on validation data sets and compare the blind-test predictions with the ground truth. Our results indicate that our method is capable of automatically capturing subtle salt features from the 3D seismic image with little or no need for manual input. We further test the model on a field example to demonstrate the generalization of this deep CNN method across different data sets.


Author(s):  
MUSTAPHA LEBBAH ◽  
YOUNÈS BENNANI ◽  
NICOLETA ROGOVSCHI

This paper introduces a probabilistic self-organizing map for topographic clustering, analysis and visualization of multivariate binary data, or of categorical data using binary coding. We propose a probabilistic formalism dedicated to binary data in which cells are represented by a Bernoulli distribution. Each cell is characterized by a prototype with the same binary coding as used in the data space and by the probability of differing from this prototype. The learning algorithm we propose, a Bernoulli self-organizing map, is an application of the standard EM algorithm. We illustrate the power of this method with six data sets taken from a public data set repository. The results show a good quality of the topological ordering and homogeneous clustering.
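The cell model can be sketched as follows (a minimal illustration consistent with the description, not the authors' implementation; the prototypes and eps values are invented): each cell holds a binary prototype and a probability eps of a bit differing from it, so the log-likelihood of a binary vector depends only on its Hamming distance to the prototype.

```python
import math

def cell_log_lik(x, prototype, eps):
    # Bernoulli cell: each component differs from the prototype with probability eps
    mismatches = sum(1 for a, b in zip(x, prototype) if a != b)
    return mismatches * math.log(eps) + (len(x) - mismatches) * math.log(1 - eps)

cells = [([1, 0, 0], 0.1),
         ([0, 1, 1], 0.1)]  # (prototype, eps) per cell
x = [1, 0, 1]
bmu = max(range(len(cells)), key=lambda k: cell_log_lik(x, cells[k][0], cells[k][1]))
print(bmu)  # → 0: x is at Hamming distance 1 from the first prototype, 2 from the second
```

In the full algorithm, EM re-estimates both the prototypes and the eps values while a neighbourhood function preserves the topographic ordering.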


2016 ◽  
Vol 25 (3) ◽  
pp. 431-440 ◽  
Author(s):  
Archana Purwar ◽  
Sandeep Kumar Singh

Abstract. Ensuring data quality is an important task in data mining. The validity of mining algorithms is reduced if the data are not of good quality. Data quality can be assessed in terms of missing values (MVs) as well as noise present in the data set. Various imputation techniques have been studied in MV research, but little attention has been given to noise in earlier work. Moreover, to the best of our knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel groups objects according to their density in spatial databases. The high-density regions are known as clusters, and the low-density regions refer to the noise objects in the data set. Extensive experiments have been performed on the Iris data set from the life-science domain and on Jain's (2D) data set from the shape data sets. The performance of the proposed method is evaluated using the root mean square error (RMSE) and compared with existing K-means imputation (KMI). Results show that our method is more noise resistant than KMI on the data sets used in this study.
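The imputation-and-evaluation loop described above can be sketched generically (a simplified stand-in: the cluster labels are given rather than produced by DBSCAN, and the values are invented): fill each missing entry with the mean of its cluster, treat noise points separately, and score against the known truth with RMSE.

```python
import math

def cluster_mean_impute(values, labels, noise_label=-1):
    # fill each missing entry (None) with the mean of observed values in its cluster;
    # records flagged as noise fall back to the global observed mean
    observed = [v for v in values if v is not None]
    global_mean = sum(observed) / len(observed)
    sums, counts = {}, {}
    for v, l in zip(values, labels):
        if v is not None and l != noise_label:
            sums[l] = sums.get(l, 0.0) + v
            counts[l] = counts.get(l, 0) + 1
    return [v if v is not None
            else (sums[l] / counts[l] if l in sums else global_mean)
            for v, l in zip(values, labels)]

def rmse(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

truth  = [1.0, 1.2, 0.9, 5.0, 5.2]
values = [1.0, None, 0.9, 5.0, None]  # two missing entries
labels = [0, 0, 0, 1, 1]              # pretend DBSCAN found two dense clusters
filled = cluster_mean_impute(values, labels)
print(round(rmse(filled, truth), 3))
```

The noise-label fallback is the key difference from KMI, where every record, noisy or not, is forced into some cluster before imputation.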



2016 ◽  
Author(s):  
Brecht Martens ◽  
Diego G. Miralles ◽  
Hans Lievens ◽  
Robin van der Schalie ◽  
Richard A. M. de Jeu ◽  
...  

Abstract. The Global Land Evaporation Amsterdam Model (GLEAM) is a set of algorithms dedicated to the estimation of terrestrial evaporation and root-zone soil moisture from satellite data. Ever since its development in 2011, the model has been regularly revised, aiming at the optimal incorporation of new satellite-observed geophysical variables and an improved representation of physical processes. In this study, the next version of this model (v3) is presented. Key changes relative to the previous version include: (1) a revised formulation of the evaporative stress, (2) an optimized drainage algorithm, and (3) a new soil moisture data assimilation system. GLEAM v3 is used to produce three new data sets of terrestrial evaporation and root-zone soil moisture, including a 35-year data set spanning the period 1980–2014 (v3.0a, based on satellite-observed soil moisture, vegetation optical depth and snow water equivalents, reanalysis air temperature and radiation, and a multi-source precipitation product), and two fully satellite-based data sets. The latter two share most of their forcing, except for the vegetation optical depth and soil moisture products, which are based on observations from different passive and active C- and L-band microwave sensors (European Space Agency Climate Change Initiative data sets) for the first data set (v3.0b, spanning the period 2003–2015) and observations from the Soil Moisture and Ocean Salinity satellite in the second data set (v3.0c, spanning the period 2011–2015). These three data sets are described in detail, compared against analogous data sets generated using the previous version of GLEAM (v2), and validated against measurements from 64 eddy-covariance towers and 2338 soil moisture sensors across a broad range of ecosystems.
Results indicate that the quality of the v3 soil moisture is consistently better than that of v2: average correlations against in situ surface soil moisture measurements increase from 0.61 to 0.64 in the case of the v3.0a data set, and the representation of soil moisture in the second layer improves as well, with correlations increasing from 0.47 to 0.53. Similar improvements are observed for the two fully satellite-based data sets. Despite regional differences, the quality of the evaporation fluxes remains overall similar to that obtained using the previous version of GLEAM, with average correlations against eddy-covariance measurements between 0.78 and 0.80 for the three data sets. These global data sets of terrestrial evaporation and root-zone soil moisture are now openly available at http://GLEAM.eu and may be used for large-scale hydrological applications, climate studies and research on land–atmosphere feedbacks.
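The validation statistic quoted above is an ordinary Pearson correlation between the modelled and in situ time series; as a sketch (with invented short series, not GLEAM or sensor data):

```python
import math
import statistics

def pearson(a, b):
    # Pearson correlation coefficient between two equal-length series
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

insitu   = [0.10, 0.22, 0.18, 0.35, 0.30, 0.12]  # invented soil moisture, m3/m3
modelled = [0.12, 0.20, 0.21, 0.30, 0.33, 0.10]
r = pearson(insitu, modelled)
print(round(r, 2))
```

In the paper this statistic is computed per station and then averaged over the 2338 sensors or 64 towers to give the quoted figures.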

