An objective prior error quantification for regional atmospheric inverse applications

2015
Vol 12 (12)
pp. 9393-9441
Author(s):
P. Kountouris
C. Gerbig
K.-U. Totsche
A. J. Dolman
A. G. C. A. Meesters
...

Abstract. Assigning proper prior uncertainties for inverse modeling of CO2 is of high importance, both to regularize the otherwise ill-constrained inverse problem and to quantitatively characterize the magnitude and structure of the error between the prior and the "true" flux. We use surface fluxes derived from three biosphere models (VPRM, ORCHIDEE, and 5PM) and compare them against daily averaged fluxes from 53 eddy covariance sites across Europe for the year 2007, and against repeated aircraft flux measurements encompassing spatial transects. In addition, we create synthetic observations by substituting the observed fluxes with modeled ones to explore the potential to infer prior uncertainties from model-model residuals. To ensure the realism of the synthetic data analysis, random measurement noise was added to the modeled tower fluxes used as the reference. The temporal autocorrelation time of the tower model-data residuals was found to be around 35 days for both VPRM and ORCHIDEE, but significantly different for the 5PM model, at 76 days. This difference is caused by a few sites with large model-data biases. The spatial correlation of the model-data residuals was found to be very short for all models, up to a few tens of km. Long spatial correlation lengths of up to several hundred km were determined when synthetic data were used. Results from repeated aircraft transects in south-western France are consistent with those obtained from the tower sites in terms of spatial autocorrelation (35 km on average), while the temporal autocorrelation is markedly lower (13 days). Our findings suggest that the different prior models have a common temporal error structure. Separating the analysis of the statistics for the model-data residuals by season did not result in any significant differences in the spatial correlation lengths.
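
The temporal autocorrelation times reported here are e-folding scales, obtained by fitting an exponential decay to the empirical autocorrelation of the model-data residuals. A minimal Python sketch of that procedure (not the authors' code; the AR(1) test series and all settings are illustrative assumptions):

```python
# Sketch: estimating an e-folding temporal correlation time from
# daily model-data flux residuals. Illustrative only.
import numpy as np
from scipy.optimize import curve_fit

def efolding_time(residuals, max_lag=120):
    """Fit exp(-lag/tau) to the empirical autocorrelation of residuals."""
    r = np.asarray(residuals, dtype=float)
    r = r - np.nanmean(r)
    lags = np.arange(1, max_lag + 1)
    acf = np.array([np.nansum(r[:-k] * r[k:]) / np.nansum(r * r) for k in lags])
    popt, _ = curve_fit(lambda k, tau: np.exp(-k / tau), lags, acf, p0=[30.0])
    return popt[0]  # correlation time in days

# Demo with synthetic AR(1) residuals whose true e-folding time is ~35 days:
rng = np.random.default_rng(0)
x = np.zeros(365)
phi = np.exp(-1.0 / 35.0)   # AR(1) coefficient giving ACF = exp(-lag/35)
for t in range(1, 365):
    x[t] = phi * x[t - 1] + rng.normal()
print(f"estimated e-folding time: {efolding_time(x):.1f} days")
```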

2015
Vol 12 (24)
pp. 7403-7421
Author(s):
P. Kountouris
C. Gerbig
K.-U. Totsche
A. J. Dolman
A. G. C. A. Meesters
...

Abstract. Assigning proper prior uncertainties for inverse modelling of CO2 is of high importance, both to regularise the otherwise ill-constrained inverse problem and to quantitatively characterise the magnitude and structure of the error between the prior and the "true" flux. We use surface fluxes derived from three biosphere models (VPRM, ORCHIDEE, and 5PM) and compare them against daily averaged fluxes from 53 eddy covariance sites across Europe for the year 2007 and against repeated aircraft flux measurements encompassing spatial transects. In addition, we create synthetic observations using modelled fluxes instead of the observed ones to explore the potential to infer prior uncertainties from model-model residuals. To ensure the realism of the synthetic data analysis, random measurement noise was added to the modelled tower fluxes which were used as the reference. The temporal autocorrelation time of the tower model-data residuals was found to be around 30 days for both VPRM and ORCHIDEE but significantly different for the 5PM model, at 70 days. This difference is caused by a few sites with large biases between the data and the 5PM model. The spatial correlation of the model-data residuals was found to be very short for all models, up to a few tens of kilometres, albeit with uncertainties of up to 100 % of this estimate. Propagating this error structure to the annual continental scale yields an uncertainty of 0.06 Gt C, which strongly underestimates the uncertainties typically used in atmospheric inversion systems and reveals another potential source of error. Long spatial e-folding correlation lengths of up to several hundred kilometres were determined when synthetic data were used. Results from repeated aircraft transects in south-western France are consistent with those obtained from the tower sites in terms of spatial autocorrelation (35 km on average), while the temporal autocorrelation is markedly lower (13 days). Our findings suggest that the different prior models have a common temporal error structure. Separating the analysis of the statistics for the model-data residuals by season did not result in any significant differences in the spatial e-folding correlation lengths.
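
The small aggregated uncertainty follows directly from how errors with short correlation lengths sum over a grid: off-diagonal covariance terms decay quickly, so the total approaches the uncorrelated limit. A minimal sketch of that propagation (grid size, spacing, sigma, and L are illustrative assumptions, not the paper's values):

```python
# Sketch: propagating gridded prior flux errors with an exponential
# spatial covariance to the uncertainty of the aggregated total.
import numpy as np

n = 20          # illustrative 20 x 20 grid
dx = 25.0       # cell spacing (km)
L = 50.0        # e-folding correlation length (km)
sigma = 1.0     # per-cell flux error (arbitrary units)

# Cell-centre coordinates and pairwise distances
xy = np.array([(i * dx, j * dx) for i in range(n) for j in range(n)])
d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)

# Exponential covariance: C_ij = sigma^2 * exp(-d_ij / L)
C = sigma**2 * np.exp(-d / L)

# Variance of the summed flux is 1^T C 1; compare with the uncorrelated case
var_total = C.sum()
print(f"correlated total sigma:   {np.sqrt(var_total):.1f}")
print(f"uncorrelated total sigma: {np.sqrt(n * n) * sigma:.1f}")
```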


2020
Vol 27 (8)
pp. 1244-1251
Author(s):
Romain Bey
Romain Goussault
François Grolleau
Mehdi Benchoufi
Raphaël Porcher

Abstract. Objective: We introduce fold-stratified cross-validation, a validation methodology that is compatible with privacy-preserving federated learning and that prevents data leakage caused by duplicates of electronic health records (EHRs). Materials and Methods: Fold-stratified cross-validation complements cross-validation with an initial stratification of EHRs into folds containing patients with similar characteristics, thus ensuring that duplicates of a record are jointly present either in training or in validation folds. Monte Carlo simulations are performed to investigate the properties of fold-stratified cross-validation in a model data analysis using both synthetic data and MIMIC-III (Medical Information Mart for Intensive Care-III) medical records. Results: In situations in which duplicated EHRs could induce overoptimistic estimates of accuracy, applying fold-stratified cross-validation prevented this bias, while not requiring full deduplication. However, a pessimistic bias might appear if the covariate used for the stratification was strongly associated with the outcome. Discussion: Although fold-stratified cross-validation presents low computational overhead, to be efficient it requires the preliminary identification of a covariate that is both shared by duplicated records and weakly associated with the outcome. When available, the hash of a personal identifier or a patient's date of birth provides such a covariate. On the contrary, pseudonymization interferes with fold-stratified cross-validation, as it may break the equality of the stratifying covariate among duplicates. Conclusion: Fold-stratified cross-validation is an easy-to-implement methodology that prevents data leakage when a model is trained on distributed EHRs that contain duplicates, while preserving privacy.
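
The core mechanism is simple to sketch: map each record to a fold by hashing a covariate that duplicates share (e.g., a date of birth), so duplicates can never straddle the train/validation split. A minimal Python illustration (data and field names invented; conceptually similar to scikit-learn's GroupKFold, though that comparison is ours):

```python
# Sketch: fold-stratified cross-validation keyed on a covariate shared
# by duplicate records, such as a patient's date of birth.
import hashlib

def stratify_fold(covariate: str, n_folds: int = 5) -> int:
    """Deterministically map a record to a fold via a stable hash,
    so all duplicates sharing the covariate land in the same fold."""
    digest = hashlib.sha256(covariate.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_folds

records = [
    {"dob": "1980-03-14", "x": [1.2, 0.4]},
    {"dob": "1980-03-14", "x": [1.2, 0.4]},  # duplicate EHR, same patient
    {"dob": "1975-11-02", "x": [0.7, 2.1]},
]
for rec in records:
    rec["fold"] = stratify_fold(rec["dob"])

# Duplicates end up in the same fold, so neither can leak into validation
# while the other sits in training.
assert records[0]["fold"] == records[1]["fold"]
```

Note that the fold assignment needs no coordination between data holders, which is what makes the scheme compatible with federated learning.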


Geophysics
1988
Vol 53 (4)
pp. 492-500
Author(s):
Xianhuai Zhu
George A. McMechan

Prestack computations for cross-hole data are relatively expensive, as they are for prestack surface survey data. It is therefore of interest to develop methodologies for modeling and processing stacked cross-hole data. In this context, stacking is over sources, not midpoints. Modeling with a line source produces data that are equivalent (by Huygens' principle) to those obtained by stacking over a line of point sources. Reverse-time finite-difference migration may be applied to the resulting stacked section by generalizing the excitation-time imaging condition for a point source to a line source. Illustrations include successful applications to both synthetic data and scale-model data.
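
The equivalence rests on the linearity of the wave equation: summing point-source gathers gives the same section as modeling a (discretized) line source once. A rough frequency-domain sketch, with the 2D Green's function G = (i/4) H0^(1)(kr) standing in for the finite-difference engine (geometry and frequency are invented):

```python
# Sketch of the superposition behind stacking cross-hole data over
# sources: the stack of point-source gathers equals the section from
# a single discretised line source, so only one migration is needed.
import numpy as np
from scipy.special import hankel1

k = 2 * np.pi * 100.0 / 2000.0          # wavenumber: 100 Hz, 2000 m/s
src_z = np.linspace(0.0, 200.0, 21)     # point sources down one borehole
rec_z = np.linspace(0.0, 200.0, 41)     # receivers in the other borehole
offset = 100.0                          # borehole separation (m)

def point_gather(zs):
    """Complex receiver response for one point source at depth zs."""
    r = np.hypot(offset, rec_z - zs)
    return 0.25j * hankel1(0, k * r)    # 2D Green's function (i/4) H0(kr)

# Stacking over sources (not midpoints): one stacked trace per receiver.
stacked_section = np.sum([point_gather(zs) for zs in src_z], axis=0)
print(stacked_section.shape)            # (41,) -- a single section to migrate
```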


2014
Vol 51 (8)
pp. 844-857
Author(s):
S. Firouzianbandpey
D.V. Griffiths
L.B. Ibsen
L.V. Andersen

The main aim of this study is to assess the anisotropic spatial correlation lengths of a sand layer deposit based on cone penetration testing with pore pressure measurement (CPTu) data. The spatial correlation length can be an important factor in the reliability analysis of geotechnical systems, yet it is rarely estimated during routine site investigations. Results from two different sites in the north of Denmark are reported in this paper, indicating quite strong anisotropy due to the depositional process, with significantly shorter spatial correlation lengths in the vertical direction than in the horizontal. It is also observed that the normalized cone resistance is a better estimator of spatial trends than the normalized friction ratio.
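
A common way to estimate the vertical correlation length from a CPT sounding is to detrend the cone resistance with depth and fit an exponential decay to the sample autocorrelation of the residuals. A minimal sketch with synthetic data (the trend, noise model, and true correlation length are illustrative assumptions, not the paper's values):

```python
# Sketch: vertical spatial correlation length of a detrended CPT
# cone-resistance profile, using an exponential autocorrelation model.
import numpy as np
from scipy.optimize import curve_fit

dz = 0.02                                  # CPT sampling interval (m)
depth = np.arange(0.0, 10.0, dz)

# Synthetic qc: linear depth trend + AR(1) fluctuation (true theta = 0.3 m)
rng = np.random.default_rng(1)
theta_true = 0.3
rho = np.exp(-dz / theta_true)
noise = np.zeros(depth.size)
for i in range(1, depth.size):
    noise[i] = rho * noise[i - 1] + np.sqrt(1 - rho**2) * rng.normal()
qc = 2.0 + 0.8 * depth + 0.5 * noise

# Detrend, then fit exp(-lag/theta) to the sample autocorrelation
resid = qc - np.polyval(np.polyfit(depth, qc, 1), depth)
lags = np.arange(1, 100)
acf = np.array([np.sum(resid[:-k] * resid[k:]) / np.sum(resid**2) for k in lags])
(theta,), _ = curve_fit(lambda h, t: np.exp(-h / t), lags * dz, acf, p0=[0.5])
print(f"estimated vertical correlation length: {theta:.2f} m")
```

The horizontal correlation length is estimated the same way, but from soundings spaced laterally across the site, which is why it is so much harder to resolve than the vertical one.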


Elem Sci Anth
2019
Vol 7
Author(s):
Lewis Kunik
Derek V. Mallia
Kevin R. Gurney
Daniel L. Mendoza
Tomohiro Oda
...

Top-down, data-driven models possess ample power to improve the accuracy of bottom-up carbon dioxide (CO2) emission inventories, and more work is needed to explore the merger of top-down and bottom-up estimates to better inform the metrics used to monitor global CO2 fluxes. Here we present a Bayesian inverse modeling framework over Salt Lake City, Utah, which utilizes available CO2 emission inventories to establish a synthetic data simulation aimed at exploring model uncertainties. Prescribing a high-resolution, urban-scale data product (Hestia) as the "true" emissions in the model, we combine prior emissions with an atmospheric transport model to derive modeled afternoon CO2 enhancements at six monitoring sites within the Salt Lake Valley during the month of September 2015. A global high-resolution gridded emissions data product (ODIAC) is used as the prior, and objective uncertainty structures are defined for both the a priori estimates and the transport model-data relationship, taking into account non-negligible spatial and temporal covariances. Optimized (posterior) emissions over the Salt Lake Valley agree closely with the assumed "true" emissions during afternoon times, while results including unconstrained times (e.g., night-time) lack such agreement. Both spatial and temporal correlations of prior errors were found to be necessary for obtaining a robust posterior estimate. Model sensitivity analyses are performed, which examine the correlation length and time scales, the model-data mismatch error, and variability in the measurement site network. Through these analyses, one measurement site is identified as being particularly prone to introducing bias into posterior emissions due to influences from a nearby point source. Increasing the model-data mismatch error at this site is shown to reduce bias in the posterior without significantly compromising agreement with the monthly averaged true emissions.
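
A common way to encode joint spatial and temporal prior-error correlations of the kind described here is a separable covariance, Q = kron(D, E), with exponential decay in both time and space, fed into the standard Bayesian update. The sketch below is ours, under that separability assumption; all dimensions, length scales, and the stand-in transport Jacobian H are invented:

```python
# Minimal sketch of a Bayesian flux update with a separable prior
# error covariance Q = kron(D_temporal, E_spatial). Illustrative only.
import numpy as np

nt, ns = 10, 25                        # time steps, grid cells
rng = np.random.default_rng(2)

# Exponential correlations in time (days) and space (km)
t = np.arange(nt)[:, None]
D = np.exp(-np.abs(t - t.T) / 5.0)                 # temporal, tau = 5 d
xy = rng.uniform(0, 50, size=(ns, 2))
dist = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
E = np.exp(-dist / 10.0)                           # spatial, L = 10 km
Q = np.kron(D, E)                                  # state ordered time-major

H = rng.normal(size=(40, nt * ns))     # stand-in transport Jacobian (footprints)
R = 0.5**2 * np.eye(40)                # model-data mismatch covariance
s_prior = np.ones(nt * ns)             # prior emissions (ODIAC-like stand-in)
z = H @ (s_prior + rng.normal(scale=0.3, size=nt * ns))  # synthetic obs

# Posterior mean: s = s_prior + Q H^T (H Q H^T + R)^(-1) (z - H s_prior)
gain = Q @ H.T @ np.linalg.inv(H @ Q @ H.T + R)
s_post = s_prior + gain @ (z - H @ s_prior)
print(s_post.shape)
```

Setting the off-diagonal blocks of D or E to zero reproduces the uncorrelated case against which the abstract's sensitivity findings are framed.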


Geosciences
2019
Vol 9 (1)
pp. 45
Author(s):
Marwan Charara
Christophe Barnes

Full-waveform inversion of borehole seismic data is an ill-posed problem, and constraining the problem is crucial. Constraints can be imposed on the data and model spaces through covariance matrices, which are usually set to diagonal matrices. For the data space, signal polarization information can be used to evaluate the data uncertainties; the inversion then forces the synthetic data to fit the polarization of the observed data. A synthetic inversion of 2D-2C data estimating a 1D elastic model shows a clear improvement, especially at the level of the receivers. For the model space, horizontal and vertical spatial correlations following a Laplace distribution can be used to fill the model-space covariance matrix. This approach reduces the degrees of freedom of the inverse problem, which can be quantitatively evaluated. Strong horizontal spatial correlation distances favor a tabular geological model whenever this does not contradict the data. Relaxing the spatial correlation distances from large to small during the iterative inversion process allows the recovery of geological objects of a size comparable to the correlation distances, which regularizes the inverse problem. Synthetic constrained and unconstrained inversions of 2D-2C crosswell data show a clear improvement of the inversion results when the constraints are used.
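
The model-space covariance described here can be sketched as a separable Laplace (exponential) kernel over the grid, with the horizontal and vertical correlation distances shrunk as the iterations proceed. A minimal illustration under those assumptions (grid, sigma, and the relaxation schedule are ours, not the paper's):

```python
# Sketch: a model-space covariance with Laplace (exponential)
# horizontal/vertical correlations, relaxed across iteration blocks.
import numpy as np

nx, nz = 30, 20
dx = dz = 10.0                                   # grid spacing (m)
X, Z = np.meshgrid(np.arange(nx) * dx, np.arange(nz) * dz, indexing="ij")
xf, zf = X.ravel(), Z.ravel()

def model_covariance(sigma, lx, lz):
    """C_ij = sigma^2 exp(-|dx_ij|/lx - |dz_ij|/lz) (Laplace kernel)."""
    ddx = np.abs(xf[:, None] - xf[None, :])
    ddz = np.abs(zf[:, None] - zf[None, :])
    return sigma**2 * np.exp(-ddx / lx - ddz / lz)

# Relax correlation distances from large (favouring a tabular model)
# to small (admitting finer objects) as the inversion proceeds:
for it, (lx, lz) in enumerate([(300.0, 100.0), (150.0, 50.0), (60.0, 20.0)]):
    Cm = model_covariance(sigma=0.1, lx=lx, lz=lz)
    # ... Gauss-Newton updates at this stage would use Cm here ...
    print(f"stage {it}: lx={lx} m, lz={lz} m, Cm shape={Cm.shape}")
```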


Geophysics
2019
Vol 84 (5)
pp. C217-C227
Author(s):
Baoqing Tian
Jiangjie Zhang

High-resolution imaging has recently become more popular in exploration geophysics. Conventionally, geophysicists image the subsurface under the isotropic approximation. When anisotropy effects are taken into account, one can expect an imaging profile with higher accuracy than the isotropic approach allows. Orthorhombic anisotropy is considered a realistic approximation for many practical cases and has been used in the industry for several years. Although attractive, the broad application of orthorhombic anisotropy still poses many problems. We have developed a novel approach to prestack time migration for the orthorhombic case, in which the traveltime and amplitude of a wave propagating in orthorhombic media are calculated directly from a new anisotropic velocity and the corresponding anisotropy parameters. We validate our method with synthetic data and further demonstrate it on a model data set and real data. The results show that our method works well for prestack time migration in orthorhombic media.
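
The abstract does not give the traveltime formula, but the simplest standard ingredient that distinguishes orthorhombic time processing from isotropic or VTI processing is the azimuth-dependent NMO velocity (the NMO ellipse). The sketch below illustrates that textbook relation, not the authors' specific method; symmetry axes are assumed aligned with x/y and all parameter values are invented:

```python
# Sketch: azimuth-dependent moveout in orthorhombic media via the
# NMO ellipse, the effect that isotropic/VTI time migration ignores.
import numpy as np

def v_nmo(azimuth, v1=2200.0, v2=2500.0):
    """NMO ellipse: 1/v^2(a) = cos^2(a)/v1^2 + sin^2(a)/v2^2."""
    return 1.0 / np.sqrt(np.cos(azimuth)**2 / v1**2 +
                         np.sin(azimuth)**2 / v2**2)

def traveltime(t0, offset, azimuth):
    """Hyperbolic moveout with azimuth-dependent NMO velocity (s)."""
    return np.sqrt(t0**2 + (offset / v_nmo(azimuth))**2)

# Same offset, two azimuths: the moveout differs between the two
# symmetry planes, so a single migration velocity cannot flatten both.
print(traveltime(1.0, 1500.0, np.radians(0.0)))
print(traveltime(1.0, 1500.0, np.radians(90.0)))
```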


2020
Vol 15 (2)
Author(s):
Yadollah Mehrabi
Amir Kavousi
Ahmad-Reza Baghestani
Mojtaba Soltani-Kermanshahi

In numerous practical applications, data from neighbouring small areas exhibit spatial correlation. More recently, an extension of the Fay-Herriot model through the Simultaneously Autoregressive (SAR) process has been considered. The Conditional Autoregressive (CAR) structure is also a popular choice. The reasons for using these structures are their theoretical properties, computational advantages, and relative ease of interpretation. However, the assumption of the non-singularity of the matrix (I_m - ρW) is a problem. We introduce here a novel structure for the covariance matrix when approaching spatiality in small area estimation (SAE), comparing it with the commonly used SAR process. As an example, we use spatially correlated synthetic data on grape production for 274 municipalities in the region of Tuscany as base data, simulating data for each area and comparing the results. The SAR process had the smallest root average mean square error (RAMSE) under all conditions, and the RAMSE generally decreased with increasing sample size. The RAMSE values otherwise showed no specific behaviour, but changes in the spatial correlation coefficient led to a stronger decrease of the RAMSE values under our new structure than under the SAR model. The new approach presented here is thus more flexible than the SAR process without severely increasing the RAMSE values.
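
For reference, the SAR covariance against which the new structure is compared is Sigma = sigma^2 [(I_m - ρW)^T (I_m - ρW)]^(-1), which exists only where (I_m - ρW) is non-singular, the condition flagged above. A minimal sketch with an invented five-area contiguity matrix:

```python
# Sketch: the SAR covariance used in spatial Fay-Herriot models,
# with an explicit check of the non-singularity of (I - rho*W).
# W is an illustrative row-standardised contiguity matrix.
import numpy as np

W = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]], dtype=float)
W /= W.sum(axis=1, keepdims=True)          # row standardisation

def sar_covariance(rho, sigma2=1.0):
    """Sigma = sigma2 * [(I - rho W)^T (I - rho W)]^(-1)."""
    A = np.eye(W.shape[0]) - rho * W
    if np.linalg.matrix_rank(A) < A.shape[0]:   # the non-singularity condition
        raise ValueError("(I - rho W) is singular for this rho")
    return sigma2 * np.linalg.inv(A.T @ A)

Sigma = sar_covariance(rho=0.6)
print(np.diag(Sigma))
```

With a row-standardised W, (I - ρW) is non-singular for |ρ| < 1, which is why the spatial correlation coefficient is typically restricted to that range.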

