Climate field completion via Markov random fields – Application to the HadCRUT4.6 temperature dataset
Abstract
Surface temperature is a vital metric of Earth’s climate state, but is incompletely observed in both space and time: over half of monthly values are missing from the widely used HadCRUT4.6 global surface temperature dataset. Here we apply GraphEM, a recently developed imputation method, to construct a spatially complete estimate of HadCRUT4.6 temperatures. GraphEM leverages Gaussian Markov random fields (also known as Gaussian graphical models) to better estimate covariance relationships within a climate field, detecting anisotropic features such as land/ocean contrasts, orography, ocean currents and wave-propagation pathways. This detection leads to improved estimates of missing values compared to methods (such as kriging) that assume isotropic covariance relationships, as we show with real and synthetic data.
This interpolated analysis of HadCRUT4.6 data is available as a 100-member ensemble, propagating information about sampling variability available from the original HadCRUT4.6 dataset. A comparison of NINO3.4 and global mean monthly temperature series with published datasets reveals similarities and differences due in part to the spatial interpolation method. Notably, the GraphEM-completed HadCRUT4.6 global temperature displays a stronger early twenty-first century warming trend than its uninterpolated counterpart, consistent with recent analyses using other datasets. Known events such as the 1877/1878 El Niño are recovered with greater fidelity than with kriging, which leads to different assessments of changes in ENSO variability through time. Gaussian Markov random fields provide a more geophysically motivated way to impute missing values in climate fields, and the associated graph provides a powerful tool to analyze the structure of teleconnection patterns. We close with a discussion of wider applications of Markov random fields in climate science.
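To illustrate why a Gaussian Markov random field helps with imputation, the following minimal sketch fills missing values with their conditional mean given the observed values, under a known mean and a sparse precision (inverse covariance) matrix. It is not the GraphEM implementation: GraphEM estimates the graph and the covariance iteratively, whereas here the precision matrix `Q`, the mean `mu`, the toy five-point chain graph, and the function name `gmrf_conditional_mean` are all assumptions made for illustration.

```python
import numpy as np

def gmrf_conditional_mean(x, mu, Q):
    """Fill NaNs in x with the conditional mean under N(mu, Q^{-1}).

    For a Gaussian with precision matrix Q, the missing block m given the
    observed block o has mean  mu_m - Q_mm^{-1} Q_mo (x_o - mu_o).
    """
    x = np.asarray(x, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    filled = x.copy()
    if miss.any():
        Q_mm = Q[np.ix_(miss, miss)]   # precision among missing points
        Q_mo = Q[np.ix_(miss, obs)]    # links from missing to observed points
        rhs = -Q_mo @ (x[obs] - mu[obs])
        filled[miss] = mu[miss] + np.linalg.solve(Q_mm, rhs)
    return filled

# Hypothetical toy graph: 5 grid points on a chain, each linked only to its
# immediate neighbours, which gives a tridiagonal precision matrix.
n = 5
Q = 2.0 * np.eye(n) - 0.9 * (np.eye(n, k=1) + np.eye(n, k=-1))
mu = np.zeros(n)

x = np.array([1.0, np.nan, 0.5, np.nan, -0.2])
x_filled = gmrf_conditional_mean(x, mu, Q)
print(x_filled)
```

Zeros in `Q` encode conditional independence, so each missing point is informed only by its graph neighbours. In a climate field the graph need not follow geographic distance, which is how anisotropic structure can enter the estimate.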