A Framework for Evaluating Climate Model Performance Metrics
Abstract

Given the large volume of climate model output generated by the simulations of phase 5 of the Coupled Model Intercomparison Project (CMIP5), a standard set of performance metrics would facilitate model intercomparison and the tracking of performance improvements. However, no framework exists for evaluating the performance metrics themselves. The framework proposed here systematically integrates observations into metric assessment so that metrics can be evaluated quantitatively. An optimal metric is defined as one that measures a behavior strongly linked to model quality in representing the mean-state, present-day climate; the goal of the framework is to evaluate, objectively and quantitatively, how well a performance metric represents overall model quality. The framework and its design principles are demonstrated using a novel set of performance metrics that assess the simulated variance and probability distributions of top-of-atmosphere (TOA) and surface radiative fluxes in 34 CMIP5 models against Clouds and the Earth’s Radiant Energy System (CERES) observations and the GISS Surface Temperature Analysis (GISTEMP). Of the 44 metrics tested, the optimal metrics are found to be those that evaluate global-mean TOA radiative flux variance.
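As a minimal sketch of the kind of metric the abstract describes, the snippet below computes a hypothetical variance-based performance score: the relative error between the temporal variance of a model's global-mean TOA flux series and that of an observational series. The function name, the synthetic stand-in data, and the specific error formula are illustrative assumptions, not the paper's actual metric definitions.

```python
import numpy as np

def variance_metric(model_flux, obs_flux):
    """Relative error in temporal variance of a global-mean flux series.

    Hypothetical illustration: both inputs are 1-D time series of
    global-mean TOA radiative flux (W m^-2). The score is the absolute
    relative difference between model and observed variance, so a
    perfect match scores 0.
    """
    v_model = np.var(model_flux, ddof=1)
    v_obs = np.var(obs_flux, ddof=1)
    return abs(v_model - v_obs) / v_obs

# Synthetic monthly-mean series (not real CERES or CMIP5 data).
rng = np.random.default_rng(0)
obs = 240.0 + rng.normal(0.0, 0.5, size=120)    # stand-in for observations
model = 240.0 + rng.normal(0.0, 0.6, size=120)  # stand-in for one model
score = variance_metric(model, obs)
```

In a framework like the one described, such a score would be computed for each of the models and each candidate metric, and the metrics would then be ranked by how strongly their scores relate to overall model quality.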