Analysis of Ensemble Mean Forecasts: The Blessings of High Dimensionality

2019 ◽  
Vol 147 (5) ◽  
pp. 1699-1712 ◽  
Author(s):  
Bo Christiansen

Abstract In weather and climate sciences ensemble forecasts have become an acknowledged community standard. It is often found that the ensemble mean not only has a low error relative to the typical error of the ensemble members but also that it outperforms all the individual ensemble members. We analyze ensemble simulations based on a simple statistical model that allows for bias and that has different variances for observations and the model ensemble. Using generic simplifying geometric properties of high-dimensional spaces we obtain analytical results for the error of the ensemble mean. These results include a closed form for the rank of the ensemble mean among the ensemble members and depend on two quantities: the ensemble variance and the bias both normalized with the variance of observations. The analytical results are used to analyze the GEFS reforecast where the variances and bias depend on lead time. For intermediate lead times between 20 and 100 h the two terms are both around 0.5 and the ensemble mean is only slightly better than individual ensemble members. For lead times larger than 240 h the variance term is close to 1 and the bias term is near 0.5. For these lead times the ensemble mean outperforms almost all individual ensemble members and its relative error comes close to −30%. These results are in excellent agreement with the theory. The simplifying properties of high-dimensional spaces can be applied not only to the ensemble mean but also to, for example, the ensemble spread.
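The kind of statistical model this abstract describes can be illustrated with a small simulation. The following is a minimal sketch, not the paper's analysis: it assumes Gaussian observations and ensemble members with a constant bias in every dimension, and the names `simulate_rank`, `bias`, `sigma_model`, and `sigma_obs` are hypothetical. Its parametrization is not claimed to match the paper's normalized variance and bias terms, so its numbers should not be compared with the GEFS values quoted above.

```python
import math
import random


def simulate_rank(n_dim, n_members, bias, sigma_model, sigma_obs, seed=0):
    """Toy ensemble experiment (assumed model, not the paper's).

    Returns (number of members with smaller error than the ensemble mean,
    RMSE of the ensemble mean, sorted list of member RMSEs).
    """
    rng = random.Random(seed)
    # Observations: zero-mean Gaussian noise in n_dim dimensions.
    obs = [rng.gauss(0.0, sigma_obs) for _ in range(n_dim)]
    # Members: Gaussian noise around a constant bias, independent of obs.
    members = [
        [bias + rng.gauss(0.0, sigma_model) for _ in range(n_dim)]
        for _ in range(n_members)
    ]
    ens_mean = [sum(m[i] for m in members) / n_members for i in range(n_dim)]

    def rmse(x):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, obs)) / n_dim)

    mean_err = rmse(ens_mean)
    member_errs = sorted(rmse(m) for m in members)
    # Rank of the ensemble mean: how many members beat it.
    better = sum(1 for e in member_errs if e < mean_err)
    return better, mean_err, member_errs


better, mean_err, member_errs = simulate_rank(
    n_dim=1000, n_members=10, bias=0.0, sigma_model=1.0, sigma_obs=1.0
)
```

Varying `bias` and `sigma_model` relative to `sigma_obs` in this sketch shows how the position of the ensemble mean among the members depends on those two ratios, which is the dependence the abstract attributes to the analytical results.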

2017 ◽  
Vol 8 (2) ◽  
pp. 429-438 ◽  
Author(s):  
Francine J. Schevenhoven ◽  
Frank M. Selten

Abstract. Weather and climate models have improved steadily over time as witnessed by objective skill scores, although significant model errors remain. Given these imperfect models, predictions might be improved by combining them dynamically into a so-called supermodel. In this paper a new training scheme to construct such a supermodel is explored using a technique called cross pollination in time (CPT). In the CPT approach the models exchange states during the prediction. The number of possible predictions grows quickly with time, and a strategy to retain only a small number of predictions, called pruning, needs to be developed. The method is explored using low-order dynamical systems and applied to a global atmospheric model. The results indicate that the CPT training is efficient and leads to a supermodel with improved forecast quality as compared to the individual models. Due to its computational efficiency, the technique is suited for application to state-of-the-art high-dimensional weather and climate models.


2020 ◽  
Vol 148 (3) ◽  
pp. 1177-1203 ◽  
Author(s):  
Nicholas A. Gasperoni ◽  
Xuguang Wang ◽  
Yongming Wang

Abstract A gridpoint statistical interpolation (GSI)-based hybrid ensemble–variational (EnVar) scheme was extended for convective scales—including radar reflectivity assimilation—and implemented in real-time spring forecasting experiments. This study compares methods to address model error during the forecast under the context of multiscale initial condition error sampling provided by the EnVar system. A total of 10 retrospective cases were used to explore the optimal design of convection-allowing ensemble forecasts. In addition to single-model single-physics (SMSP) configurations, ensemble forecast experiments compared multimodel (MM) and multiphysics (MP) approaches. Stochastic physics was also applied to MP for further comparison. Neighborhood-based verification of precipitation and composite reflectivity showed each of these model error techniques to be superior to SMSP configurations. Comparisons of MM and MP approaches had mixed findings. The MM approach had better overall skill in heavy-precipitation forecasts; however, MP ensembles had better skill for light (2.54 mm) precipitation and reduced ensemble mean error of other diagnostic fields, particularly near the surface. The MM experiment had the largest spread in precipitation, and for most hours in other fields; however, rank histograms and spaghetti contours showed significant clustering of the ensemble distribution. MP plus stochastic physics was able to significantly increase spread with time to be competitive with MM by the end of the forecast. The results generally suggest that an MM approach is best for early forecast lead times up to 6–12 h, while a combination of MP and stochastic physics approaches is preferred for forecasts beyond 6–12 h.


2021 ◽  
Vol 2 (4) ◽  
pp. 1209-1224
Author(s):  
Cameron Bertossa ◽  
Peter Hitchcock ◽  
Arthur DeGaetano ◽  
Riwal Plougonven

Abstract. Bimodality and other types of non-Gaussianity arise in ensemble forecasts of the atmosphere as a result of nonlinear spread across ensemble members. In this paper, bimodality in 50-member ECMWF ENS-extended ensemble forecasts is identified and characterized. Forecasts of 2 m temperature are found to exhibit widespread bimodality well over a derived false-positive rate. In some regions bimodality occurs in excess of 30 % of forecasts, with the largest rates occurring during lead times of 2 to 3 weeks. Bimodality occurs more frequently in the winter hemisphere with indications of baroclinicity being a factor in its development. Additionally, bimodality is more common over the ocean, especially the polar oceans, which may indicate development caused by boundary conditions (such as sea ice). Near the equatorial region, bimodality remains common during either season and follows similar patterns to the Intertropical Convergence Zone (ITCZ), suggesting convection as a possible source for its development. Over some continental regions the modes of the forecasts are separated by up to 15 °C. The probability density for the modes can be up to 4 times greater than at the minimum between the modes, which lies near the ensemble mean. The widespread presence of such bimodality has potentially important implications for decision makers acting on these forecasts. Bimodality also has implications for assessing forecast skill and for statistical postprocessing: several commonly used skill-scoring methods and ensemble dressing methods are found to perform poorly in the presence of bimodality, suggesting the need for improvements in how non-Gaussian ensemble forecasts are evaluated.


2007 ◽  
Vol 24 (11) ◽  
pp. 1895-1909 ◽  
Author(s):  
Robert A. Iacovazzi ◽  
Changyong Cao

Abstract Systematic biases between brightness temperature (Tb) measurements made from concurrently operational Advanced Microwave Sounding Unit-A (AMSU-A) instruments can introduce errors into weather and climate applications. For this reason, in this study the ability of the simultaneous nadir overpass (SNO) method to estimate relative Tb biases between operational Earth Observing System (EOS) Aqua and Polar-orbiting Operational Environmental Satellites (POES) NOAA-15, NOAA-16, and NOAA-18 AMSU-A instruments is evaluated. From an analysis of SNO events occurring from 21 May 2005 to 31 July 2006, AMSU-A SNO-ensemble mean Tb biases could not be statistically determined for window channels, while significant bias detection to within about 0.02 K is accomplished in some low-noise sounding channels. These results are shown to be a consequence of the decrease of the earth-scene Tb variability with increasing atmospheric zenith opacity, which is a function of microwave frequency. Examination of SNO-ensemble mean Tb biases for two independent AMSU-A instrument components (AMSU-A1–1 and AMSU-A1–2) exposed a significant cold (warm) bias on the order of 0.4 K (0.2 K) in the AMSU-A1–1 unit on board the NOAA-18 (Aqua) satellite. This analysis also revealed on average a significant cold bias on the order of 0.1 K in the NOAA-16 AMSU-A1–2 component. Furthermore, the individual SNO mean Tb biases were often found to be a function of the SNO earth-scene average Tb, which is a manifestation of instrument calibration errors. On the other hand, it was found that determining the root cause of such errors is inhibited by the lack of postlaunch quality control of the AMSU-A calibration-related hardware. Based on the results of this study, a need to reduce impacts of surface emissivity and temperature inhomogeneities on the SNO method in microwave radiometer window channels becomes evident. In addition, the unparalleled ability of the SNO method to isolate and quantify intersatellite, instrument-related Tb biases is demonstrated in most sounding channels, which is necessary to improve weather and climate applications.


2018 ◽  
Vol 31 (4) ◽  
pp. 1587-1596 ◽  
Author(s):  
Bo Christiansen

When comparing climate models to observations, it is often observed that the mean over many models has smaller errors than most or all of the individual models. This paper will show that a general consequence of the nonintuitive geometric properties of high-dimensional spaces is that the ensemble mean often outperforms the individual ensemble members. This also explains why the ensemble mean often has an error that is 30% smaller than the median error of the individual ensemble members. The only assumption that needs to be made is that the observations and the models are independently drawn from the same distribution. An important and relevant property of high-dimensional spaces is that independent random vectors are almost always orthogonal. Furthermore, while the lengths of random vectors are large and almost equal, the ensemble mean is special, as it is located near the otherwise vacant center. The theory is first explained by an analysis of Gaussian- and uniformly distributed vectors in high-dimensional spaces. A subset of 17 models from the CMIP5 multimodel ensemble is then used to demonstrate the validity and robustness of the theory in realistic settings.
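The two geometric claims in this abstract, near-orthogonality of independent random vectors and an ensemble-mean error roughly 30% below the median member error, can be checked numerically. The following is a minimal sketch under the abstract's stated assumption that observations and models are drawn independently from the same distribution. The standard Gaussian, the dimension of 2000, and the function name `experiment` are illustrative choices, not taken from the paper; only the ensemble size of 17 echoes the CMIP5 subset mentioned above.

```python
import math
import random
import statistics


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in u))


def experiment(n_dim=2000, n_models=17, seed=1):
    """Return (cosine between two independent vectors,
    relative error of the ensemble mean versus the median member error)."""
    rng = random.Random(seed)

    def draw():
        return [rng.gauss(0.0, 1.0) for _ in range(n_dim)]

    obs = draw()
    models = [draw() for _ in range(n_models)]

    # Independent high-dimensional vectors are close to orthogonal,
    # so this cosine should be near zero.
    a, b = models[0], models[1]
    cosine = sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

    ens_mean = [sum(m[i] for m in models) / n_models for i in range(n_dim)]
    member_errs = [norm([x - o for x, o in zip(m, obs)]) for m in models]
    mean_err = norm([x - o for x, o in zip(ens_mean, obs)])

    relative_error = mean_err / statistics.median(member_errs) - 1.0
    return cosine, relative_error


cosine, relative_error = experiment()
```

Under these Gaussian assumptions, the squared error of a single member is close to 2N and that of the mean of K members is close to N(1 + 1/K), so the error ratio tends to sqrt((1 + 1/K)/2), which approaches 1/sqrt(2) ≈ 0.71 for large K. That limit is the roughly 30% reduction the abstract refers to.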


2014 ◽  
Vol 71 (9) ◽  
pp. 3554-3567 ◽  
Author(s):  
Jie Feng ◽  
Ruiqiang Ding ◽  
Deqiang Liu ◽  
Jianping Li

Abstract Nonlinear local Lyapunov vectors (NLLVs) are developed to indicate orthogonal directions in phase space with different perturbation growth rates. In particular, the first few NLLVs are considered to be an appropriate orthogonal basis for the fast-growing subspace. In this paper, the NLLV method is used to generate initial perturbations and implement ensemble forecasts in simple nonlinear models (the Lorenz63 and Lorenz96 models) to explore the validity of the NLLV method. The performance of the NLLV method is compared comprehensively and systematically with other methods such as the bred vector (BV) and the random perturbation (Monte Carlo) methods. In experiments using the Lorenz63 model, the leading NLLV (LNLLV) captured a more precise direction, and with a faster growth rate, than any individual bred vector. It may be the larger projection on fastest-growing analysis errors that causes the improved performance of the new method. Regarding the Lorenz96 model, two practical measures—namely the spread–skill relationship and the Brier score—were used to assess the reliability and resolution of these ensemble schemes. Overall, the ensemble spread of NLLVs is more consistent with the errors of the ensemble mean, which indicates the better performance of NLLVs in simulating the evolution of analysis errors. In addition, the NLLVs perform significantly better than the BVs in terms of reliability and the random perturbations in resolution.
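For orientation, the simplest of the schemes compared here, the random perturbation (Monte Carlo) ensemble in the Lorenz63 model, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the standard Lorenz63 parameters (10, 28, 8/3), a fourth-order Runge-Kutta integrator, and hypothetical values for the initial state, perturbation amplitude, and ensemble size. It does not implement the NLLV or bred vector methods.

```python
import math
import random


def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz63 system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance one state by dt with classical fourth-order Runge-Kutta."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz63(state)
    k2 = lorenz63(shifted(state, k1, dt / 2))
    k3 = lorenz63(shifted(state, k2, dt / 2))
    k4 = lorenz63(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def ensemble_spread(members):
    """RMS distance of the members from the ensemble mean."""
    n = len(members)
    mean = tuple(sum(m[i] for m in members) / n for i in range(3))
    total = sum(sum((m[i] - mean[i]) ** 2 for i in range(3)) for m in members)
    return math.sqrt(total / n)


def run(n_members=10, amplitude=1e-4, n_steps=1000, dt=0.01, seed=0):
    """Return (initial spread, final spread) of a random-perturbation ensemble."""
    rng = random.Random(seed)
    analysis = (1.0, 1.0, 1.0)  # hypothetical initial state
    members = [
        tuple(a + rng.gauss(0.0, amplitude) for a in analysis)
        for _ in range(n_members)
    ]
    initial = ensemble_spread(members)
    for _ in range(n_steps):
        members = [rk4_step(m, dt) for m in members]
    return initial, ensemble_spread(members)


initial_spread, final_spread = run()
```

Comparing the spread returned here with the error of the ensemble mean against a reference trajectory would give a spread-skill diagnostic of the kind the abstract mentions. The reference run is left out to keep the sketch short.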


2017 ◽  
Vol 18 (11) ◽  
pp. 2873-2891 ◽  
Author(s):  
Yu Zhang ◽  
Limin Wu ◽  
Michael Scheuerer ◽  
John Schaake ◽  
Cezar Kongoli

Abstract This article compares the skill of medium-range probabilistic quantitative precipitation forecasts (PQPFs) generated via two postprocessing mechanisms: 1) the mixed-type meta-Gaussian distribution (MMGD) model and 2) the censored shifted Gamma distribution (CSGD) model. MMGD derives the PQPF by conditioning on the mean of raw ensemble forecasts. CSGD, on the other hand, is a regression-based mechanism that estimates PQPF from a prescribed distribution by adjusting the climatological distribution according to the mean, spread, and probability of precipitation (POP) of raw ensemble forecasts. Each mechanism is applied to the reforecast of the Global Ensemble Forecast System (GEFS) to yield a postprocessed PQPF over lead times between 24 and 72 h. The outcome of an evaluation experiment over the mid-Atlantic region of the United States indicates that the CSGD approach broadly outperforms the MMGD in terms of both the ensemble mean and the reliability of distribution, although the performance gap tends to be narrow, and at times mixed, at higher precipitation thresholds (>5 mm). Analysis of a rare storm event demonstrates the superior reliability and sharpness of the CSGD PQPF and underscores the issue of overforecasting by the MMGD PQPF. This work suggests that the CSGD’s incorporation of ensemble spread and POP does help enhance its skill, particularly for light forecast amounts, but CSGD’s model structure and its use of optimization in parameter estimation likely play a more determining role in its outperformance.


2017 ◽  
Author(s):  
Francine Schevenhoven ◽  
Frank Selten

Abstract. Weather and climate models have improved steadily over time as witnessed by objective skill scores, although significant model errors remain. Given these imperfect models, predictions might be improved by combining them dynamically into a so-called supermodel. In this paper a new training scheme to construct such a supermodel is explored using a technique called Cross Pollination in Time (CPT). In the CPT approach the models exchange states during the prediction. The number of possible predictions grows quickly with time and a strategy to retain only a small number of predictions, called pruning, needs to be developed. The method is explored using low-order dynamical systems and applied to a global atmospheric model. The results indicate that the CPT training is efficient and leads to a supermodel with improved forecast quality as compared to the individual models. Due to its computational efficiency, the technique is suited for application to state-of-the-art high-dimensional weather and climate models.


2015 ◽  
Vol 30 (5) ◽  
pp. 1397-1403 ◽  
Author(s):  
Charles R. Sampson ◽  
John A. Knaff

Abstract The National Hurricane Center (NHC) has been forecasting gale force wind radii for many years, and more recently (starting in 2004) began routine postanalysis or “best tracking” of the maximum radial extent of gale [34 knots (kt; 1 kt = 0.514 m s−1)] force winds in compass quadrants surrounding the tropical cyclone (wind radii). At approximately the same time, a statistical wind radii forecast, based solely on climatology and persistence, was implemented so that wind radii forecasts could be evaluated for skill. If the best-track gale radii are used as ground truth (even accounting for random errors in the analyses), the skill of the NHC forecasts appears to be improving at 2- and 3-day lead times, suggesting that the guidance has also improved. In this paper several NWP models are evaluated for their skill, an equally weighted average or “consensus” of the model forecasts is constructed, and finally the consensus skill is evaluated. The results are similar to what is found with tropical cyclone track and intensity in that the consensus skill is comparable to or better than that of the individual models. Furthermore, the consensus skill is high enough to be of potential use as forecast guidance or as a proxy for official gale force wind radii forecasts at the longer lead times.
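The equally weighted consensus described in this abstract is a plain average of the individual model forecasts. The sketch below shows that arithmetic together with one common way of expressing skill relative to a baseline forecast. All numbers are made up for illustration, and the skill definition (one minus the ratio of forecast error to baseline error) is an assumption, not necessarily the formulation used in the paper.

```python
def consensus(forecasts):
    """Equally weighted average of model forecasts, one value per model."""
    if not forecasts:
        raise ValueError("at least one model forecast is required")
    return sum(forecasts) / len(forecasts)


def skill(forecast_error, baseline_error):
    """Fractional improvement over a baseline; positive means more skillful.

    Assumed definition: 1 - forecast_error / baseline_error.
    """
    return 1.0 - forecast_error / baseline_error


# Hypothetical gale force (34 kt) wind radii in n mi for one quadrant.
model_radii = [100.0, 120.0, 140.0]  # three made-up NWP model forecasts
best_track = 125.0                   # made-up verifying best-track radius
baseline = 150.0                     # made-up climatology-persistence forecast

consensus_radius = consensus(model_radii)
consensus_skill = skill(
    abs(consensus_radius - best_track), abs(baseline - best_track)
)
```

With these invented values the consensus radius is 120 n mi, an error of 5 n mi against a baseline error of 25 n mi. In practice the errors would be averaged over many forecasts before the skill is computed.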


Animals ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1661
Author(s):  
Elżbieta Bednarek ◽  
Anna Sławinska

Boarhounds are hunting dogs bred for hunting wild boar, including terriers, dachshunds, and hounds. Hunt trials evaluate the individual hunting potential and trainability of the boarhounds in ten different competitions. The aim of this study was to determine the factors influencing the hunt trials for boarhounds in a large cohort of hunting dogs. The analysis was conducted based on the results of hunt trials for boarhounds conducted in 2005–2015. The database contained 1867 individuals belonging to 39 breeds. Effects of sex, age, breed group, and breed were estimated by non-parametric analysis of variance. Sex influenced (p < 0.01) the total score, and in almost all competitions dogs performed better than bitches. Age affected (p < 0.01 or p < 0.05) all competitions, indicating that the dogs perform better with age. The results analyzed by the breed group showed that the dachshunds performed better in courage (p < 0.01) and searching (p < 0.05). Breed influenced (p < 0.01) almost all scores except obedience and tracking on the lead. The best performing breed was Alpine Dachsbracke. In conclusion, all analyzed factors influenced the results of the hunt trials. The factors with the largest impact were breed and age, which reflect both the hunting potential and the level of training of the boarhounds.

