Characterization of ERA5 daily precipitation using the extended generalized Pareto distribution

Author(s):  
Pauline Rivoire ◽  
Olivia Martius ◽  
Philippe Naveau

<p>Both mean and extreme precipitation are highly relevant, so a probability distribution that models the entire range of precipitation provides important information. Very low and extremely high precipitation amounts have traditionally been modeled separately. Gamma distributions are often used to model low and moderate precipitation amounts, and extreme value theory is used to model the upper tail of the distribution. However, difficulties arise when linking the upper and lower tails. One solution is to define a threshold that separates the distribution into extreme and non-extreme values, but assigning such a threshold for many locations is not trivial. </p><p>Here we apply the Extended Generalized Pareto Distribution (EGPD) used by Tencaliec et al. (2019). This method overcomes the problem of finding a threshold between upper and lower tails thanks to a transition function (G) that describes the transition between the empirical distribution of precipitation and a Pareto distribution. The transition cumulative distribution function G has to be constrained by the upper-tail and lower-tail behavior. G can be estimated using Bernstein polynomials.</p><p>EGPD is used here to characterize ERA5 precipitation. ERA5 is a new ECMWF climate reanalysis dataset that provides a numerical description of the recent climate by combining a numerical weather model with observations. The dataset is global with a spatial resolution of 0.25° and currently covers the period from 1979 to present.</p><p>ERA5 daily precipitation is compared to EOBS, a gridded dataset spatially interpolated from observations over Europe, and to CMORPH, a satellite-based global precipitation product. Simultaneous occurrence of extreme events is assessed with a hit rate. An intensity comparison is conducted with return-level confidence intervals and a Kullback-Leibler divergence test, both derived from the EGPD.</p><p>Overall, extreme event occurrences in ERA5 and EOBS over Europe appear to agree. Whether the 95% confidence intervals on return levels overlap depends strongly on the season and the probability of occurrence.</p>
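The construction described above, a transition function G applied to a Pareto CDF, can be sketched in a few lines. This is a minimal illustration and not the method of the abstract: instead of a Bernstein-polynomial estimate of G, it assumes the simple parametric choice G(v) = v^kappa, and the parameter names `kappa`, `sigma` and `xi` are chosen here for illustration only.

```python
import math


def gpd_cdf(x, sigma, xi):
    """CDF of a generalized Pareto distribution with scale sigma and shape xi."""
    if x <= 0.0:
        return 0.0
    z = x / sigma
    if abs(xi) < 1e-12:
        # Exponential limit when the shape parameter is zero.
        return 1.0 - math.exp(-z)
    base = max(1.0 + xi * z, 0.0)  # reaches 0 at the upper endpoint when xi < 0
    return 1.0 - base ** (-1.0 / xi)


def egpd_cdf(x, kappa, sigma, xi):
    """EGPD CDF F(x) = G(H(x)), assuming the parametric choice G(v) = v**kappa."""
    return gpd_cdf(x, sigma, xi) ** kappa


def egpd_quantile(p, kappa, sigma, xi):
    """Inverse of egpd_cdf for 0 < p < 1."""
    u = p ** (1.0 / kappa)  # invert G first, then invert the GPD
    if abs(xi) < 1e-12:
        return -sigma * math.log(1.0 - u)
    return sigma / xi * ((1.0 - u) ** (-xi) - 1.0)
```

With `kappa = 1` the sketch reduces to a plain GPD. Other values of `kappa` change the behavior near zero while leaving the tail index `xi` unchanged, so no threshold between bulk and tail has to be chosen.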

2021 ◽  
Author(s):  
Pauline Rivoire ◽  
Olivia Martius ◽  
Philippe Naveau

<p>Both mean and extreme precipitation are highly relevant, so a probability distribution that models the entire range of precipitation provides important information. Gamma distributions are often used to model low and moderate precipitation amounts, and extreme value theory is used to model the upper tail of the distribution. We apply the Extended Generalized Pareto Distribution (EGPD). Thanks to a transition function, this method overcomes the problem of finding a threshold between upper and lower tails. The transition cumulative distribution function of the EGPD is constrained on the upper and lower tails so that both small and large extremes follow GPD behavior.</p><p>EGPD is used here to characterize ERA5 precipitation. ERA5 is a new ECMWF climate reanalysis dataset that provides a numerical description of the recent climate by combining a numerical weather model with observations. The dataset is global with a spatial resolution of 0.25° and currently covers the period from 1979 to present. ERA5 precipitation is computed from model forecasts and therefore needs validation against observational datasets. ERA5 daily precipitation is compared to EOBS precipitation, a gridded dataset spatially interpolated from observations over Europe, and to CMORPH precipitation, a global satellite-based dataset. Simultaneous occurrence of extreme events is assessed with a hit rate. An intensity comparison is conducted with quantile confidence intervals and a Kullback-Leibler divergence test, both derived from the EGPD.</p><p>Overall, both good agreement and strong mismatches between ERA5 and the observational datasets are found, depending on the precipitation feature of interest. This work highlights both. For example, extreme event occurrences in ERA5 and the observational datasets appear to agree. The overlap between 95% confidence intervals on quantiles depends on the season and the probability of occurrence. Over Europe, the best agreement is generally reached in regions with high station density in EOBS. The global intensity comparison between ERA5 and CMORPH shows good agreement for moderate quantiles, except for some mountainous regions, but strong disagreement in the tropics for large quantiles.</p>
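The abstract does not spell out how the hit rate is defined. A minimal sketch under one plausible assumption: a day is "extreme" in a dataset when it exceeds that dataset's own empirical quantile, and the hit rate is the share of extreme days in the reference series that are also extreme in the other series. The function names and the default level `q = 0.95` are illustrative choices.

```python
import math


def empirical_quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def hit_rate(reference, other, q=0.95):
    """Share of reference extremes that coincide with extremes in `other`.

    Both series must be aligned in time (same length, same days).
    """
    if len(reference) != len(other):
        raise ValueError("series must have the same length")
    ref_thr = empirical_quantile(reference, q)
    oth_thr = empirical_quantile(other, q)
    ref_days = [i for i, v in enumerate(reference) if v > ref_thr]
    if not ref_days:
        return float("nan")
    hits = sum(1 for i in ref_days if other[i] > oth_thr)
    return hits / len(ref_days)
```

Using a separate threshold for each dataset means the comparison concerns the timing of extremes and is not affected by a systematic intensity bias between the two products.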


Author(s):  
Jiajia Gao ◽  
Jun Du ◽  
Xiaoqing Huang

Abstract The daily precipitation data of the years 1955–2017 from May to September were retrieved; then a Generalized Pareto Distribution (GPD) and maximum likelihood methods were adopted to understand trends and calculate the reappearance period of heavy precipitation in the Tibetan Plateau (TP). The daily precipitation values at 22 stations in the TP were found to conform to the model, and theoretical and measured frequencies were consistent. According to the spatial distribution of the maximum precipitation value, the extreme values of Shigatse and Lhasa showed large fluctuations, and the probability of record-breaking precipitation events was low. In the western part of Nagqu, the probability of extreme precipitation was relatively low, and that of record-breaking precipitation was relatively high. The peak values of extreme precipitation in the flood season in the TP generally exhibited a decreasing trend from southeast to northwest, and the extreme value of the flood season that reappeared in the southeast region was approximately twice that of the northwest region. The maximum rainfall in most areas will exceed 20 mm in the next 5–10 years, and the maximum rainfall in Shigatse will reach 52.7 mm. After 15 years of recurrence in various regions, the peak rainfall in the flood season has become low. Most of the regions in the model have different responses to ENSO and Indian Ocean monsoon indices with external forcing factors.
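Once a GPD has been fitted to threshold exceedances, a return level for a given reappearance period follows from the standard peaks-over-threshold formula. The sketch below assumes the parameters have already been estimated (for example by maximum likelihood); all argument names are illustrative, and no values from the study are reproduced.

```python
import math


def gpd_return_level(threshold, sigma, xi, exceedances_per_year, return_period_years):
    """Level exceeded on average once every `return_period_years` years.

    threshold            -- threshold u above which the GPD was fitted
    sigma, xi            -- fitted GPD scale and shape
    exceedances_per_year -- average number of threshold exceedances per year
    """
    m = exceedances_per_year * return_period_years
    if m <= 1.0:
        raise ValueError("return period too short for this threshold")
    if abs(xi) < 1e-12:
        # Exponential tail (shape parameter equal to zero).
        return threshold + sigma * math.log(m)
    return threshold + sigma / xi * (m ** xi - 1.0)
```

For a season-restricted analysis such as May to September, `exceedances_per_year` would count only exceedances within that season.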


Mathematics ◽  
2018 ◽  
Vol 6 (12) ◽  
pp. 319 ◽  
Author(s):  
Xuehua Hu ◽  
Wenhao Gui

In this paper, we first consider the maximum likelihood estimators of the two unknown parameters and of the reliability and hazard functions of the generalized Pareto distribution under a progressively Type II censored sample. Next, we discuss asymptotic confidence intervals for the two unknown parameters and the reliability and hazard functions using the delta method. Then, based on the bootstrap algorithm, we obtain another two pairs of approximate confidence intervals. Furthermore, by applying Markov chain Monte Carlo techniques, we derive the Bayesian estimates of the two unknown parameters and the reliability and hazard functions under various balanced loss functions, together with the corresponding confidence intervals. A simulation study was conducted to compare the performance of the proposed estimators. A real dataset analysis was carried out to illustrate the proposed methods.


2012 ◽  
Vol 1 (33) ◽  
pp. 42
Author(s):  
Pietro Bernardara ◽  
Franck Mazas ◽  
Jérôme Weiss ◽  
Marc Andreewsky ◽  
Xavier Kergadallan ◽  
...  

In the general framework of over-threshold modelling (OTM) for estimating extreme values of met-ocean variables, such as waves, surges or water levels, threshold selection logically requires two steps: first, the physical declustering of the time series of the variable in order to obtain samples of independent and identically distributed data; second, the application of extreme value theory, which predicts the convergence of the upper part of the sample toward the Generalized Pareto Distribution. These two steps were often merged and confused in the past. A clear framework for distinguishing them is presented here. A review of the methods available in the literature to carry out these two steps is given, together with two simple and practical illustrative examples.
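The first step, declustering, can be illustrated with a simple runs rule. This sketch is one common convention and not necessarily the procedure reviewed in the paper: exceedances of a physical threshold are grouped into one event until at least `min_gap` consecutive values fall at or below the threshold, and only the peak of each event is kept. The function and parameter names are hypothetical.

```python
def decluster_peaks(series, threshold, min_gap):
    """Return one peak per event using a runs-declustering rule."""
    peaks = []
    current_max = None  # peak of the event currently being tracked
    gap = 0             # consecutive non-exceedances since the last exceedance
    for value in series:
        if value > threshold:
            if current_max is None or value > current_max:
                current_max = value
            gap = 0
        elif current_max is not None:
            gap += 1
            if gap >= min_gap:
                # The event has ended: store its peak and reset.
                peaks.append(current_max)
                current_max = None
                gap = 0
    if current_max is not None:
        peaks.append(current_max)
    return peaks
```

The resulting peaks form the approximately independent sample to which a second, statistical threshold and the GPD fit would then be applied.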


2021 ◽  
Author(s):  
Abubakar Haruna ◽  
Juliette Blanchet ◽  
Anne-Catherine Favre

Abstract. In this article, we compare the performances of three regionalization approaches in improving the at-site estimates of daily precipitation. The first method is built on the idea of conventional RFA (Regional Frequency Analysis) but is based on a fast algorithm that defines distinct homogeneous regions relying on their upper tail similarity. It uses only the precipitation data at hand without the need for any additional covariate. The second is based on the region-of-influence (ROI) approach in which neighborhoods, containing similar sites, are defined for each station. The third is a spatial method that adopts Generalized Additive Model (GAM) forms for the model parameters. In line with our goal of modeling the whole range of positive precipitation, the chosen marginal distribution model is the Extended Generalized Pareto Distribution (EGPD) on which we apply the three methods. We consider a dense network composed of 1176 daily stations located within Switzerland and in neighboring countries. We compute different criteria to assess the models' performances both in the bulk of the distribution as well as in the upper tail. The results show that all the regional methods offered improved robustness over the local EGPD model. While the GAM method is more robust and reliable in the upper tail, the ROI method is better in the bulk of the distribution.


2020 ◽  
Vol 9 (4) ◽  
pp. 505-514
Author(s):  
Lina Tanasya ◽  
Di Asih I Maruddani ◽  
Tarno Tarno

Stocks are a type of financial asset investment that attracts many investors. When investing, investors must estimate the expected return on a stock and consider the risks involved. Several methods can be used to measure the level of risk, one of which is Value at Risk (VaR), but this method often fails to be a coherent risk measure because it does not satisfy subadditivity. Therefore, the Expected Shortfall (ES) method is used to address this weakness. Stock return data are time series that exhibit heteroscedasticity and heavy tails, so a GARCH time series model is used to handle the heteroscedasticity, while Extreme Value Theory (EVT) is used to analyze the heavy tails. Because the data in this study also show a leverage effect, the asymmetric Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model is combined with EVT based on the Generalized Pareto Distribution (GPD) to calculate the ES of the stock return of PT. Bank Central Asia Tbk for the period May 1, 2012 to January 31, 2020. The best model chosen was ARIMA(1,0,1) GJR-GARCH(1,2). At the 95% confidence level, the next-day risk obtained by combining GJR-GARCH and GPD is 0.7147%, exceeding the VaR value of 0.6925%.
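The GPD step of such a calculation can be sketched with the standard peaks-over-threshold estimators of VaR and ES. This is an illustration under stated assumptions, not the study's code: the inputs are taken to be losses (or standardized GARCH residuals) with a GPD already fitted above `threshold`, and all argument names are hypothetical.

```python
import math


def gpd_var_es(threshold, sigma, xi, n_total, n_exceed, level):
    """Value at Risk and Expected Shortfall from a fitted GPD tail.

    threshold -- threshold u of the peaks-over-threshold fit
    sigma, xi -- fitted GPD scale and shape (xi < 1 so that ES is finite)
    n_total   -- total number of observations
    n_exceed  -- number of observations above the threshold
    level     -- confidence level, e.g. 0.95; must exceed 1 - n_exceed / n_total
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    if xi >= 1.0:
        raise ValueError("expected shortfall is infinite for xi >= 1")
    tail_ratio = (n_total / n_exceed) * (1.0 - level)
    if abs(xi) < 1e-12:
        var = threshold - sigma * math.log(tail_ratio)
    else:
        var = threshold + sigma / xi * (tail_ratio ** (-xi) - 1.0)
    es = (var + sigma - xi * threshold) / (1.0 - xi)
    return var, es
```

For any level at which VaR lies above the threshold, this ES estimate is larger than VaR. In a GARCH-EVT setting the two quantities computed on standardized residuals would still have to be rescaled by the forecast conditional mean and volatility.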


Author(s):  
Audrene Edwards ◽  
Kumer Das

The study of extremes has attracted the attention of scientists, engineers, actuaries, policy makers, and statisticians for many years. Extreme value theory (EVT) deals with the extreme deviations from the median of probability distributions and is used to study rare but extreme events. EVT’s main results characterize the distribution of the sample maximum or the distribution of values above a given threshold. In this study, EVT has been used to construct a model of the extreme and rare earthquakes that happened in the United States from 1700 to 2011. The primary goal of fitting such a model is to estimate the amount of losses due to those extreme events and the probabilities of such events. Several diagnostic methods (for example, the QQ plot and the Mean Excess Plot) have been used to justify that the data set follows a generalized Pareto distribution (GPD). Three estimation techniques have been employed to estimate the parameters. The consistency and reliability of the estimated parameters have been observed for different threshold values. The purpose of this study is manifold: first, we investigate whether the data set follows a GPD, using graphical interpretation and hypothesis testing. Second, we estimate the GPD parameters using three different estimation techniques. Third, we compare the consistency and reliability of the estimated parameters for different threshold values. Last, we investigate the bias of the estimated parameters using a simulation study. The result is particularly useful because it can be used in many applications (for example, disaster management, engineering design, insurance industry, hydrology, ocean engineering, and traffic management) with a minimal set of assumptions about the true underlying distribution of a data set.
KEYWORDS: Extreme Value Theory; QQ Plot; Mean Excess Plot; Mean Residual Plot; Peak Over Threshold; Generalized Pareto Distribution; Maximum Likelihood Method; Method of Moments; Probability-Weighted Moments; Shapiro-Wilk Test; Anderson-Darling Test
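The Mean Excess Plot mentioned above rests on a simple statistic: the average amount by which observations exceed a candidate threshold. The sketch below computes that statistic for a list of thresholds; the function names are illustrative and no data from the study are used.

```python
def mean_excess(values, threshold):
    """Average of (x - threshold) over all observations x above the threshold."""
    excesses = [x - threshold for x in values if x > threshold]
    if not excesses:
        return float("nan")
    return sum(excesses) / len(excesses)


def mean_excess_curve(values, thresholds):
    """Pairs (u, e(u)) that can be plotted as a mean excess plot."""
    return [(u, mean_excess(values, u)) for u in thresholds]
```

For data with a GPD tail of shape below 1, the mean excess is linear in the threshold, so a roughly straight section of the curve suggests a range of thresholds above which the GPD is a reasonable model.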

