Characterization of ERA5 daily precipitation using the extended generalized Pareto distribution

Mapping Intimacies ◽

10.5194/egusphere-egu2020-7198 ◽

2020 ◽

Author(s):

Pauline Rivoire ◽

Olivia Martius ◽

Philippe Naveau

Keyword(s):

Confidence Intervals ◽

Pareto Distribution ◽

Daily Precipitation ◽

Generalized Pareto Distribution ◽

Value Theory ◽

Cumulative Distribution ◽

Simultaneous Occurrence ◽

Lower Tail ◽

Generalized Pareto ◽

Return Levels

Both mean and extreme precipitation are highly relevant, so a probability distribution that models the entire range of precipitation provides important information. Very low and extremely high precipitation amounts have traditionally been modeled separately: gamma distributions are often used for low and moderate amounts, while extreme value theory is used for the upper tail of the distribution. Difficulties arise, however, when linking the upper and lower tails. One solution is to define a threshold that separates the distribution into extreme and non-extreme values, but choosing such a threshold for many locations is not trivial. Here we apply the Extended Generalized Pareto Distribution (EGPD) used by Tencaliec et al. (2019). This method avoids the need for a threshold between the upper and lower tails by means of a transition function G that describes the transition between the empirical distribution of precipitation and a Pareto distribution. G is a cumulative distribution function that has to be constrained by the upper-tail and lower-tail behavior, and it can be estimated using Bernstein polynomials. The EGPD is used here to characterize ERA-5 precipitation. ERA-5 is a new ECMWF climate reanalysis dataset that provides a numerical description of the recent climate by combining a numerical weather model with observations. The dataset is global, with a spatial resolution of 0.25°, and currently covers the period from 1979 to present. ERA-5 daily precipitation is compared to EOBS, a gridded dataset spatially interpolated from observations over Europe, and to CMORPH, a satellite-based global precipitation product. Simultaneous occurrence of extreme events is assessed with a hit rate. An intensity comparison is conducted with confidence intervals on return levels and a Kullback-Leibler divergence test, both derived from the EGPD. Overall, extreme event occurrences in ERA5 and EOBS over Europe appear to agree. Whether the 95% confidence intervals on return levels overlap depends strongly on the season and the probability of occurrence.
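
The composition of a transition function G with a GPD can be sketched in a few lines. This is only an illustration under a stated assumption: it uses the simple parametric choice G(u) = u^kappa instead of the Bernstein-polynomial estimate of G described in the abstract, and the function names and parameter values are hypothetical.

```python
import math

def gpd_cdf(x, sigma, xi):
    """CDF of the generalized Pareto distribution (scale sigma, shape xi)."""
    if x <= 0:
        return 0.0
    if abs(xi) < 1e-12:
        # Exponential limit of the GPD when the shape tends to zero.
        return 1.0 - math.exp(-x / sigma)
    z = 1.0 + xi * x / sigma
    if z <= 0:
        # Beyond the finite upper endpoint that exists when xi < 0.
        return 1.0
    return 1.0 - z ** (-1.0 / xi)

def egpd_cdf(x, kappa, sigma, xi):
    """EGPD CDF F(x) = G(H(x)), with the ASSUMED transition G(u) = u**kappa."""
    return gpd_cdf(x, sigma, xi) ** kappa

def egpd_quantile(p, kappa, sigma, xi):
    """Quantile of the EGPD: undo G first, then invert the GPD CDF."""
    u = p ** (1.0 / kappa)
    if abs(xi) < 1e-12:
        return -sigma * math.log(1.0 - u)
    return sigma / xi * ((1.0 - u) ** (-xi) - 1.0)

# Illustrative parameters only (not fitted to any dataset).
level = egpd_quantile(0.99, kappa=2.0, sigma=5.0, xi=0.1)
```

With kappa = 1 the model reduces to a plain GPD; kappa only reshapes the lower tail, so no threshold has to be chosen.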

A comparison of moderate and extreme ERA-5 daily precipitation with two observational data sets

10.5194/egusphere-egu21-666 ◽

2021 ◽

Author(s):

Pauline Rivoire ◽

Olivia Martius ◽

Philippe Naveau

Keyword(s):

Confidence Intervals ◽

Daily Precipitation ◽

Extreme Event ◽

Value Theory ◽

Cumulative Distribution ◽

Simultaneous Occurrence ◽

Data Set ◽

Mountainous Regions ◽

Recent Climate ◽

Leibler Divergence

Both mean and extreme precipitation are highly relevant, so a probability distribution that models the entire range of precipitation provides important information. Gamma distributions are often used to model low and moderate precipitation amounts, and extreme value theory is used to model the upper tail of the distribution. We apply the Extended Generalized Pareto Distribution (EGPD). Thanks to a transition function, this method avoids the need to find a threshold between the upper and lower tails. The transition cumulative distribution function of the EGPD is constrained on the upper and lower tails to enable GPD behavior for both small and large extremes. The EGPD is used here to characterize ERA-5 precipitation. ERA-5 is a new ECMWF climate reanalysis dataset that provides a numerical description of the recent climate by combining a numerical weather model with observations. The dataset is global, with a spatial resolution of 0.25°, and currently covers the period from 1979 to present. ERA-5 precipitation is computed from model forecasts and therefore needs validation against observational datasets. ERA-5 daily precipitation is compared to EOBS precipitation, a gridded dataset spatially interpolated from observations over Europe, and to CMORPH precipitation, a global satellite-based dataset. Simultaneous occurrence of extreme events is assessed with a hit rate. An intensity comparison is conducted with confidence intervals on quantiles and a Kullback-Leibler divergence test, both derived from the EGPD. Overall, both good agreement and strong mismatches between ERA-5 and the observational datasets can be found, depending on the precipitation feature of interest. This work highlights both. For example, extreme event occurrences in ERA5 and the observational datasets appear to agree. The overlap between 95% confidence intervals on quantiles depends on the season and the probability of occurrence. Over Europe, the best agreement is generally reached in regions with high station density in EOBS. The global intensity comparison between ERA5 and CMORPH shows good agreement for moderate quantiles, except in some mountainous regions, but strong disagreement in the tropics for large quantiles.
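
The abstract does not define its hit rate, so the following is only one plausible reading, stated as an assumption: a day counts as extreme in a dataset when it reaches that dataset's own empirical q-quantile, and the hit rate is the fraction of extreme days in the reference that are also extreme in the other dataset. All names are hypothetical.

```python
def empirical_quantile(values, q):
    """Simple order-statistic estimate of the q-quantile (no interpolation)."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]

def hit_rate(reference, candidate, q=0.9):
    """Fraction of extreme days in `reference` that are also extreme in `candidate`.

    ASSUMPTION: both series are aligned day by day, and "extreme" means
    reaching the series' own q-quantile. Returns NaN if no day qualifies.
    """
    if len(reference) != len(candidate):
        raise ValueError("series must have the same length")
    ref_thr = empirical_quantile(reference, q)
    cand_thr = empirical_quantile(candidate, q)
    ref_days = [i for i, r in enumerate(reference) if r >= ref_thr]
    if not ref_days:
        return float("nan")
    hits = sum(1 for i in ref_days if candidate[i] >= cand_thr)
    return hits / len(ref_days)

# Toy aligned daily series (made-up values, for illustration only).
reference = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
shifted = [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
rate = hit_rate(reference, shifted, q=0.8)
```

Using a per-dataset quantile rather than a fixed amount in millimetres makes the comparison about timing, independent of any intensity bias between the datasets. For real precipitation series with many dry days, the quantile would usually be computed on wet days only.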

Spatial distribution of extreme precipitation in the Tibetan Plateau and effects of external forcing factors based on Generalized Pareto distribution

Water Science & Technology Water Supply ◽

10.2166/ws.2020.365 ◽

2020 ◽

Author(s):

Jiajia Gao ◽

Jun Du ◽

Xiaoqing Huang

Keyword(s):

Spatial Distribution ◽

Extreme Precipitation ◽

Pareto Distribution ◽

Daily Precipitation ◽

Generalized Pareto Distribution ◽

The Tibetan Plateau ◽

External Forcing ◽

Maximum Rainfall ◽

Flood Season ◽

Generalized Pareto

Daily precipitation data for May to September of the years 1955–2017 were retrieved; a Generalized Pareto Distribution (GPD) fitted by maximum likelihood was then used to understand trends and to calculate the return period of heavy precipitation in the Tibetan Plateau (TP). The daily precipitation values at 22 stations in the TP were found to conform to the model, and theoretical and measured frequencies were consistent. According to the spatial distribution of the maximum precipitation value, the extreme values of Shigatse and Lhasa showed large fluctuations, and the probability of record-breaking precipitation events was low. In the western part of Nagqu, the probability of extreme precipitation was relatively low, and that of record-breaking precipitation was relatively high. The peak values of extreme precipitation in the flood season in the TP generally exhibited a decreasing trend from southeast to northwest, and the extreme value of the flood season that reappeared in the southeast region was approximately twice that of the northwest region. The maximum rainfall in most areas will exceed 20 mm in the next 5–10 years, and the maximum rainfall in Shigatse will reach 52.7 mm. After 15 years of recurrence in various regions, the peak rainfall in the flood season has become low. Most regions in the model respond differently to the external forcing factors represented by the ENSO and Indian Ocean monsoon indices.
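
Return levels of the kind reported above are usually obtained from a fitted GPD with the standard peaks-over-threshold formula. The sketch below shows that formula only; it is not the authors' code, the function name is hypothetical, and the parameter values are illustrative rather than taken from the paper (153 is the number of days from May to September).

```python
import math

def gpd_return_level(period_years, threshold, sigma, xi, exceed_prob, obs_per_year):
    """T-year return level from a GPD fitted to excesses over `threshold`.

    exceed_prob  : estimated probability that an observation exceeds the threshold
    obs_per_year : number of observations per year in the analysed season
    """
    # Expected number of threshold exceedances within the return period.
    m = period_years * obs_per_year * exceed_prob
    if m <= 1.0:
        raise ValueError("return period too short for this threshold")
    if abs(xi) < 1e-12:
        # Exponential-tail limit when the shape parameter is zero.
        return threshold + sigma * math.log(m)
    return threshold + sigma / xi * (m ** xi - 1.0)

# Illustrative values only: 10 mm threshold, 5% exceedance rate, 153 days per season.
level_10yr = gpd_return_level(10, threshold=10.0, sigma=5.0, xi=0.1,
                              exceed_prob=0.05, obs_per_year=153)
level_50yr = gpd_return_level(50, threshold=10.0, sigma=5.0, xi=0.1,
                              exceed_prob=0.05, obs_per_year=153)
```

A positive shape parameter makes return levels grow without bound as the period lengthens, while a negative one implies a finite upper limit.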

Bayesian and Non-Bayesian Inference for the Generalized Pareto Distribution Based on Progressive Type II Censored Sample

Mathematics ◽

10.3390/math6120319 ◽

2018 ◽

Vol 6 (12) ◽

pp. 319 ◽

Cited By ~ 1

Author(s):

Xuehua Hu ◽

Wenhao Gui

Keyword(s):

Confidence Intervals ◽

Pareto Distribution ◽

Generalized Pareto Distribution ◽

Delta Method ◽

Type II ◽

Unknown Parameters ◽

Hazard Functions ◽

Generalized Pareto ◽

Censored Sample ◽

Type II Censored Sample

In this paper, we first consider the maximum likelihood estimators for the two unknown parameters and the reliability and hazard functions of the generalized Pareto distribution under a progressively Type II censored sample. Next, we discuss asymptotic confidence intervals for the two unknown parameters and the reliability and hazard functions, obtained with the delta method. Then, based on the bootstrap algorithm, we obtain another two pairs of approximate confidence intervals. Furthermore, by applying Markov Chain Monte Carlo techniques, we derive the Bayesian estimates of the two unknown parameters and the reliability and hazard functions under various balanced loss functions, together with the corresponding confidence intervals. A simulation study was conducted to compare the performance of the proposed estimators. A real dataset analysis was carried out to illustrate the proposed methods.

Exponentiated generalized Pareto distribution: Properties and applications towards extreme value theory

Communications in Statistics - Theory and Methods ◽

10.1080/03610926.2018.1441418 ◽

2018 ◽

Vol 48 (8) ◽

pp. 2014-2038 ◽

Cited By ~ 3

Author(s):

Seyoon Lee ◽

Joseph H. T. Kim

Keyword(s):

Extreme Value Theory ◽

Pareto Distribution ◽

Generalized Pareto Distribution ◽

Value Theory ◽

Extreme Value ◽

Generalized Pareto

ON THE TWO STEP THRESHOLD SELECTION FOR OVER-THRESHOLD MODELLING

Coastal Engineering Proceedings ◽

10.9753/icce.v33.management.42 ◽

2012 ◽

Vol 1 (33) ◽

pp. 42

Author(s):

Pietro Bernardara ◽

Franck Mazas ◽

Jérôme Weiss ◽

Marc Andreewsky ◽

Xavier Kergadallan ◽

...

Keyword(s):

Pareto Distribution ◽

Extreme Values ◽

Generalized Pareto Distribution ◽

Value Theory ◽

Water Levels ◽

Distributed Data ◽

Threshold Selection ◽

The Past ◽

Generalized Pareto ◽

Selection For

In the general framework of over-threshold modelling (OTM) for estimating extreme values of met-ocean variables, such as waves, surges or water levels, threshold selection logically requires two steps: first, the physical declustering of the time series of the variable in order to obtain samples of independent and identically distributed data; second, the application of extreme value theory, which predicts the convergence of the upper part of the sample toward the Generalized Pareto Distribution. These two steps were often merged and confused in the past. A clear framework for distinguishing them is presented here. A review of the methods available in the literature to carry out these two steps is given, together with two simple and practical examples.
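
The first of the two steps, physical declustering, can be illustrated with a runs-style rule. This is a generic sketch and not necessarily the method reviewed in the paper: it assumes that exceedances of a physical threshold separated by more than `min_gap` time steps belong to different events and keeps one peak per event. The names and values are hypothetical.

```python
def decluster_peaks(series, threshold, min_gap):
    """Return (index, value) of the maximum of each cluster of exceedances.

    ASSUMPTION: two exceedances belong to the same event when they are at
    most `min_gap` time steps apart (a runs-declustering rule).
    """
    peaks = []
    current = None      # (index, value) of the running maximum of the open cluster
    last_exceed = None  # index of the most recent exceedance
    for i, x in enumerate(series):
        if x <= threshold:
            continue
        if current is not None and i - last_exceed > min_gap:
            # The gap is long enough: close the previous cluster.
            peaks.append(current)
            current = None
        if current is None or x > current[1]:
            current = (i, x)
        last_exceed = i
    if current is not None:
        peaks.append(current)
    return peaks

# Toy water-level series (made-up values).
levels = [0, 5, 6, 0, 0, 0, 7, 0, 8, 0]
storm_peaks = decluster_peaks(levels, threshold=4, min_gap=2)
```

The resulting peaks are the approximately independent sample to which the second, statistical threshold and the GPD fit would then be applied.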

Performance-based comparison of regionalization methods to improve the at-site estimates of daily precipitation

10.5194/hess-2021-546 ◽

2021 ◽

Author(s):

Abubakar Haruna ◽

Juliette Blanchet ◽

Anne-Catherine Favre

Keyword(s):

Pareto Distribution ◽

Daily Precipitation ◽

Additive Model ◽

Generalized Pareto Distribution ◽

Regional Frequency Analysis ◽

Distribution Model ◽

Model Parameters ◽

Generalized Pareto ◽

Spatial Method ◽

Homogeneous Regions

In this article, we compare the performance of three regionalization approaches in improving the at-site estimates of daily precipitation. The first method is built on the idea of conventional RFA (Regional Frequency Analysis) but is based on a fast algorithm that defines distinct homogeneous regions relying on their upper tail similarity. It uses only the precipitation data at hand without the need for any additional covariate. The second is based on the region-of-influence (ROI) approach, in which neighborhoods containing similar sites are defined for each station. The third is a spatial method that adopts Generalized Additive Model (GAM) forms for the model parameters. In line with our goal of modeling the whole range of positive precipitation, the chosen marginal distribution model is the Extended Generalized Pareto Distribution (EGPD), to which we apply the three methods. We consider a dense network composed of 1176 daily stations located within Switzerland and in neighboring countries. We compute different criteria to assess the models' performance both in the bulk of the distribution and in the upper tail. The results show that all the regional methods offered improved robustness over the local EGPD model. While the GAM method is more robust and reliable in the upper tail, the ROI method is better in the bulk of the distribution.

Extreme Value Theory—Application of the Peaks Over Threshold Method and the Generalized Pareto Distribution to Athletics Decathlon and Heptathlon

10.1007/978-981-16-5063-5_75 ◽

2021 ◽

pp. 907-924

Author(s):

Domingos Silva ◽

Frederico Caeiro

Keyword(s):

Extreme Value Theory ◽

Pareto Distribution ◽

Generalized Pareto Distribution ◽

Value Theory ◽

Extreme Value ◽

Peaks Over Threshold ◽

Threshold Method ◽

Generalized Pareto ◽

Theory Application

Confidence Intervals of the Generalized Pareto Distribution Parameters Based on Upper Record Values

Acta Mathematicae Applicatae Sinica English Series ◽

10.1007/s10255-019-0860-4 ◽

2019 ◽

Vol 35 (4) ◽

pp. 909-918 ◽

Cited By ~ 1

Author(s):

Xu Zhao ◽

Wei-hu Cheng ◽

Yang Zhang ◽

Shao-jie Wei ◽

Zhen-hai Yang

Keyword(s):

Confidence Intervals ◽

Pareto Distribution ◽

Generalized Pareto Distribution ◽

Record Values ◽

Generalized Pareto ◽

Distribution Parameters ◽

Upper Record Values ◽

Upper Record

EXPECTED SHORTFALL WITH THE GLOSTEN-JAGANNATHAN-RUNKLE GARCH AND GENERALIZED PARETO DISTRIBUTION APPROACH

Jurnal Gaussian ◽

10.14710/j.gauss.v9i4.29447 ◽

2020 ◽

Vol 9 (4) ◽

pp. 505-514

Author(s):

Lina Tanasya ◽

Di Asih I Maruddani ◽

Tarno Tarno

Keyword(s):

Time Series ◽

Stock Return ◽

Pareto Distribution ◽

Generalized Pareto Distribution ◽

GARCH Model ◽

Value Theory ◽

Expected Shortfall ◽

Series Data ◽

Generalized Pareto ◽

Heavy Tailed

Stocks are a type of financial investment that attracts many investors. When investing, investors must calculate the expected return on stocks and take note of the risks that may occur. Several methods can be used to measure the level of risk, one of which is Value at Risk (VaR), but this method often fails to be a coherent risk measure because it does not satisfy subadditivity. Therefore, the Expected Shortfall (ES) method is used to address this weakness. Stock return data are time series that exhibit heteroscedasticity and heavy tails; the time series model used to handle heteroscedasticity is GARCH, while the theory used to analyze heavy tails is Extreme Value Theory (EVT). In this study there is also a leverage effect, so the asymmetric Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) model is combined with EVT using the Generalized Pareto Distribution (GPD) to calculate the ES of the stock return of PT. Bank Central Asia Tbk for the period May 1, 2012 to January 31, 2020. The best model chosen was ARIMA(1,0,1) GJR-GARCH(1,2). At the 95% confidence level, the next-day risk obtained by investors using the combined GJR-GARCH and GPD calculation is 0.7147%, which exceeds the VaR value of 0.6925%.
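
The relation between VaR and ES under a GPD tail can be made concrete with the standard peaks-over-threshold estimators. The sketch below covers only that tail step, not the ARIMA or GJR-GARCH filtering used in the study; the function name and all input values are hypothetical and unrelated to the paper's results.

```python
import math

def gpd_var_es(q, threshold, sigma, xi, n_total, n_exceed):
    """VaR and ES at confidence level q from a GPD fitted to threshold excesses.

    n_total  : number of observations (e.g. standardized losses)
    n_exceed : number of observations above `threshold`
    Requires xi < 1, otherwise the expected shortfall is not finite.
    """
    if xi >= 1.0:
        raise ValueError("expected shortfall is infinite for xi >= 1")
    ratio = (n_total / n_exceed) * (1.0 - q)
    if abs(xi) < 1e-12:
        # Exponential-tail limit when the shape parameter is zero.
        var = threshold - sigma * math.log(ratio)
        return var, var + sigma
    var = threshold + sigma / xi * (ratio ** (-xi) - 1.0)
    es = var / (1.0 - xi) + (sigma - xi * threshold) / (1.0 - xi)
    return var, es

# Illustrative inputs only: 2000 losses, 200 of them above a 1% threshold.
var_95, es_95 = gpd_var_es(0.95, threshold=0.01, sigma=0.005, xi=0.2,
                           n_total=2000, n_exceed=200)
```

Because ES averages the losses beyond VaR, it is never smaller than VaR at the same level, which is consistent with the ordering of the two figures quoted in the abstract.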

Using Statistical Approaches to Model Natural Disasters

American Journal of Undergraduate Research ◽

10.33697/ajur.2016.019 ◽

2016 ◽

Vol 13 (2) ◽

Cited By ~ 3

Author(s):

Audrene Edwards ◽

Kumer Das

Keyword(s):

Extreme Events ◽

Extreme Value Theory ◽

Pareto Distribution ◽

Generalized Pareto Distribution ◽

Value Theory ◽

Extreme Value ◽

Threshold Values ◽

Data Set ◽

Generalized Pareto ◽

Estimated Parameters

The study of extremes has attracted the attention of scientists, engineers, actuaries, policy makers, and statisticians for many years. Extreme value theory (EVT) deals with the extreme deviations from the median of probability distributions and is used to study rare but extreme events. EVT’s main results characterize the distribution of the sample maximum or the distribution of values above a given threshold. In this study, EVT has been used to construct a model of the extreme and rare earthquakes that happened in the United States from 1700 to 2011. The primary goal of fitting such a model is to estimate the losses due to those extreme events and the probabilities of such events. Several diagnostic methods (for example, the QQ plot and the Mean Excess Plot) have been used to justify that the data set follows a generalized Pareto distribution (GPD). Three estimation techniques have been employed to estimate the parameters. The consistency and reliability of the estimated parameters have been observed for different threshold values. The purpose of this study is manifold: first, we investigate whether the data set follows a GPD, using graphical interpretation and hypothesis testing. Second, we estimate the GPD parameters using three different estimation techniques. Third, we compare the consistency and reliability of the estimated parameters for different threshold values. Last, we investigate the bias of the estimated parameters using a simulation study. The result is particularly useful because it can be used in many applications (for example, disaster management, engineering design, the insurance industry, hydrology, ocean engineering, and traffic management) with a minimal set of assumptions about the true underlying distribution of a data set.
KEYWORDS: Extreme Value Theory; QQ Plot; Mean Excess Plot; Mean Residual Plot; Peak Over Threshold; Generalized Pareto Distribution; Maximum Likelihood Method; Method of Moments; Probability-Weighted Moments; Shapiro-Wilk Test; Anderson-Darling Test
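
The Mean Excess Plot mentioned above rests on a simple statistic: the average amount by which observations exceed a threshold u. For GPD data with shape xi < 1 this average is linear in u, which is what the plot checks. The sketch below is a minimal illustration with hypothetical names and made-up data, not the study's code.

```python
def mean_excess(data, threshold):
    """Average excess over `threshold` among the observations above it."""
    excesses = [x - threshold for x in data if x > threshold]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)

def mean_excess_curve(data, thresholds):
    """Points (u, e(u)) of the mean excess plot for the given thresholds."""
    return [(u, mean_excess(data, u)) for u in thresholds]

# Toy sample (made-up values, for illustration only).
sample = [1, 2, 3, 4, 5]
curve = mean_excess_curve(sample, thresholds=[0, 2, 4])
```

An approximately straight curve above some threshold suggests that a GPD is reasonable from that threshold upward; an upward slope points to a heavy tail.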
