Extreme Value Analysis for Time-variable Mixed Workload

Author(s):  
Szilárd Bozóki ◽  
András Pataricza

Proper timeliness is vital for many real-world computing systems. Understanding the phenomena behind extreme workloads is essential because, left unhandled, extreme workloads can cause violations of timeliness requirements, service degradation, and even downtime. Extremity can have multiple roots: (1) service requests can naturally produce extreme workloads; (2) in multiservice systems with mixed workloads, bursts can occur randomly on a probabilistic basis; (3) workload spikes typically occur in deadline-bound tasks.

Extreme Value Analysis (EVA) is a statistical method for modeling extremely deviant values, i.e., the largest observations. The mathematical foundation of EVA, the Extreme Value Theorem, requires the dataset to be independent and identically distributed. However, this does not generally hold in practice, because real-life processes are usually a mixture of sources with identifiable patterns. For example, seasonality and periodic fluctuations are regularly occurring patterns. Deadlines can be purely periodic, e.g., monthly tax submissions, or time variable, e.g., university homework submissions with variable semester schedules.

We propose preprocessing the data using time series decomposition to separate out the stochastic process causing the extreme values. Moreover, we focus on the case where the extreme values share a single root cause: a deadline. We exploit known deadlines, using dynamic time warping to search for recurring workload peak patterns that vary in time and amplitude.
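The abstract does not spell out the matching pipeline; as an illustration, the dynamic time warping it relies on for comparing peak patterns reduces to a classic dynamic-programming recurrence. A minimal sketch (not the authors' implementation):

```python
import numpy as np

def dtw_distance(a, b):
    # Dynamic time warping: cost of the best monotone alignment of two
    # series, allowing peaks to stretch or shift in time.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

For example, a peak `[0, 1, 0]` aligns at zero cost with a time-stretched copy `[0, 0, 1, 1, 0, 0]`, which is exactly the invariance to timing variation the paper exploits.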

2020 ◽  
Vol 37 (5) ◽  
pp. 873-888 ◽  
Author(s):  
Jesús Portilla-Yandún ◽  
Edwin Jácome

Abstract. An important requirement in extreme value analysis (EVA) is for the working variable to be identically distributed. However, this is typically not the case in wind waves, because energy components with different origins belong to separate data populations with different statistical properties. Although this information is available in the wave spectrum, the working variable in EVA is typically the total significant wave height Hs, a parameter that carries no information about the spectral energy distribution and therefore does not fulfill this requirement. To gain insight into this aspect, we develop here a covariate EVA application based on spectral partitioning. We observe that in general the total Hs is inappropriate for EVA, leading to potential over- or underestimation of the projected extremes. This is illustrated with three representative cases under significantly different wave climate conditions. It is shown that the covariate analysis provides a meaningful understanding of the individual behavior of the wave components with regard to the consequences for projecting extreme values.
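The partitioning argument can be illustrated with a toy sketch: fit a GEV separately to maxima from two synthetic wave populations and compare against a single fit on the pooled sample. All names and parameter values below are hypothetical, not the paper's data:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
# Hypothetical annual-maximum Hs samples from two spectral partitions.
swell = genextreme.rvs(c=0.1, loc=3.0, scale=0.5, size=200, random_state=rng)
wind_sea = genextreme.rvs(c=-0.1, loc=5.0, scale=1.0, size=200, random_state=rng)
pooled = np.concatenate([swell, wind_sea])

def return_level(sample, T=100):
    # Fit a GEV and invert it for the T-year return level.
    c, loc, scale = genextreme.fit(sample)
    return genextreme.ppf(1.0 - 1.0 / T, c, loc=loc, scale=scale)

levels = {"swell": return_level(swell),
          "wind sea": return_level(wind_sea),
          "pooled": return_level(pooled)}
```

Comparing `levels["pooled"]` with the per-partition values shows how a single fit on the mixed sample can misstate the projected extremes relative to the dominant component.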


2019 ◽  
Vol 34 (2) ◽  
pp. 200-220
Author(s):  
Jingjing Zou ◽  
Richard A. Davis ◽  
Gennady Samorodnitsky

Abstract. In this paper, we are concerned with the analysis of heavy-tailed data when a portion of the extreme values is unavailable. This research was motivated by an analysis of the degree distributions in a large social network. The degree distributions of such networks tend to have power law behavior in the tails. We focus on the Hill estimator, which plays a starring role in heavy-tailed modeling. The Hill estimator for these data exhibited a smooth and increasing “sample path” as a function of the number of upper order statistics used in constructing the estimator. This behavior became more apparent as we artificially removed more of the upper order statistics. Building on this observation we introduce a new version of the Hill estimator. It is a function of the number of the upper order statistics used in the estimation, but also depends on the number of unavailable extreme values. We establish functional convergence of the normalized Hill estimator to a Gaussian process. An estimation procedure is developed based on the limit theory to estimate the number of missing extremes and extreme value parameters including the tail index and the bias of Hill's estimator. We illustrate how this approach works in both simulations and real data examples.
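The classical Hill estimator that the paper builds on is compact; a standard implementation (without the paper's missing-extremes adjustment) is:

```python
import numpy as np

def hill_estimator(data, k):
    # Hill estimator of the extreme value index gamma = 1/alpha,
    # computed from the k largest order statistics.
    x = np.sort(np.asarray(data))[::-1]   # order statistics, descending
    logs = np.log(x[:k + 1])
    # Mean log-excess of the top k values over the (k+1)-th value.
    return np.mean(logs[:k] - logs[k])
```

Plotting this quantity against k gives the “sample path” (Hill plot) discussed in the abstract; for an exact Pareto tail with index alpha it concentrates around 1/alpha.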


2020 ◽  
Author(s):  
Nikos Koutsias ◽  
Frank A. Coutelieris

A statistical analysis of the wildfire events that took place in Greece during the period 1985–2007 has been performed to assess the extremes. The total burned area of each fire was considered here as a key variable expressing the significance of a given event. The data have been analyzed through extreme value theory, which has in general proved a powerful tool for accurately assessing the return period of extreme events. Both frequentist and Bayesian approaches have been used for comparison and evaluation purposes. Specifically, the Generalized Extreme Value (GEV) distribution along with Peaks over Threshold (POT) have been compared with Bayesian extreme value modelling. Furthermore, the correlation of the burned area with potential extreme values of other key parameters (e.g., wind, temperature, humidity) has also been investigated.


2008 ◽  
Vol 385-387 ◽  
pp. 561-564 ◽  
Author(s):  
Costas P. Providakis

This paper presents the use of statistically rigorous algorithms combined with the electromechanical (E/M) impedance approach for health monitoring of engineering structures. In particular, a statistical pattern recognition procedure is developed, based on frequency domain data of electromechanical impedance, to establish a decision boundary for damage identification. In order to diagnose damage with statistical confidence, health monitoring is cast in the context of an outlier detection framework. Inappropriate modeling of the tail distribution of outliers can produce misleading indications of damage. The present paper addresses the problem of establishing decision boundaries based on extreme value statistics, so that the extreme values of outliers in the tail distribution can be properly modeled. The validity of the proposed method is demonstrated using finite element method (FEM) simulated data, and the extreme value analysis results are contrasted with the standard approach, which assumes that the damage-sensitive features are normally distributed.


2016 ◽  
Author(s):  
Lorenzo Mentaschi ◽  
Michalis Vousdoukas ◽  
Evangelos Voukouvalas ◽  
Ludovica Sartini ◽  
Luc Feyen ◽  
...  

Abstract. Statistical approaches to studying extreme events require, by definition, long time series of data. The climate is subject to natural and anthropogenic variations at different temporal scales, which leave their footprint on the frequency and intensity of climatic and hydrological extremes; therefore the assumption of stationarity is violated, and alternatives to conventional stationary Extreme Value Analysis (EVA) need to be adopted. In this study we introduce the Transformed-Stationary (TS) methodology for non-stationary EVA. This approach consists of (i) transforming a non-stationary time series into a stationary one to which stationary EVA theory can be applied, and (ii) reverse-transforming the result into a non-stationary extreme value distribution. As a transformation we propose and discuss a simple time-varying normalization of the signal, and show that it allows a comprehensive formulation of non-stationary GEV/GPD models with constant shape parameter. A validation of the methodology is carried out on time series of significant wave height, residual water level, and river discharge, which show varying degrees of long-term and seasonal variability. The results from the proposed approach are comparable with those from (a) a stationary EVA on quasi-stationary slices of a non-stationary series and (b) previously applied non-stationary EVA approaches. However, the proposed technique has advantages in both cases: in contrast to (a), it uses the whole time horizon of the series for the estimation of the extremes, allowing a more accurate estimation of large return levels; and with respect to (b), it decouples the detection of non-stationary patterns from the fitting of the extreme value distribution. As a result the steps of the analysis are simplified and intermediate diagnostics are possible.
In particular, the transformation can be carried out by means of simple statistical techniques such as low-pass filters based on running mean and standard deviation, and the fitting procedure is a stationary one with few degrees of freedom that is easy to implement and control. An open-source MATLAB toolbox covering this methodology has been developed and is available at https://bitbucket.org/menta78/tseva.
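The core transformation described above, a time-varying normalization by running mean and standard deviation, can be sketched as follows (a minimal illustration, not the TS toolbox implementation):

```python
import numpy as np

def transform_stationary(x, window):
    # Remove a running mean and divide by a running standard deviation,
    # so that stationary EVA theory can be applied to the result.
    kernel = np.ones(window) / window
    mu = np.convolve(x, kernel, mode="same")
    sigma = np.sqrt(np.convolve((x - mu) ** 2, kernel, mode="same"))
    return (x - mu) / sigma, mu, sigma

def inverse_transform(y, mu, sigma):
    # Reverse-transform, e.g. to map stationary return levels back
    # into a non-stationary extreme value distribution.
    return y * sigma + mu
```

A trending series run through `transform_stationary` comes out approximately zero-mean with unit local variance; applying the inverse to fitted return levels recovers them on the original, non-stationary scale.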


2022 ◽  
Vol 8 ◽  
Author(s):  
Eun-Young Lee ◽  
Kyung-Ae Park

Extreme value analysis (EVA) has been extensively used to understand and predict long-term return extreme values. This study provides the first approach to EVA using satellite-observed sea surface temperature (SST) data over the past decades. Representative EVA methods were compared to select an appropriate method to derive SST extremes of the East/Japan Sea (EJS). As a result, the peaks-over-threshold (POT) method showed better performance than the other methods. The Optimum Interpolation Sea Surface Temperature (OISST) database was used to calculate the 100-year-return SST values in the EJS. The calculated SST extremes were 1.60–3.44°C higher than the average value of the upper 5th-percentile satellite-observed SSTs over the past decades (1982–2018). The monthly distribution of the SST extremes was similar to the known seasonal variation of SSTs in the EJS, but enhanced extreme SSTs exceeding 2°C appeared in early summer and late autumn. The calculated 100-year-return SSTs were compared with the simulation results of the Coupled Model Intercomparison Project 5 (CMIP5) climate model. As a result, the extreme SSTs were slightly smaller than the maximum SSTs of the model data with a negative bias of –0.36°C. This study suggests that the POT method can improve our understanding of future oceanic warming based on statistical approaches using SSTs observed by satellites over the past decades.
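A minimal sketch of the kind of POT computation described, fitting a generalized Pareto distribution (GPD) to threshold excesses and inverting for a 100-year return level, might look like this (illustrative only; the threshold choice and the declustering a real SST analysis needs are omitted):

```python
import numpy as np
from scipy.stats import genpareto

def pot_return_level(data, threshold, obs_per_year, T=100):
    # Peaks-over-threshold: fit a GPD to the excesses over the threshold,
    # then invert for the T-year return level (assumes shape c != 0;
    # c -> 0 requires the exponential limit instead).
    excesses = data[data > threshold] - threshold
    c, _, scale = genpareto.fit(excesses, floc=0)
    zeta = len(excesses) / len(data)   # empirical exceedance rate
    m = T * obs_per_year               # expected observations in T years
    return threshold + (scale / c) * ((m * zeta) ** c - 1.0)
```

With daily data (`obs_per_year=365`) this returns the level expected to be exceeded once per 100 years, the same quantity the 100-year-return SSTs above represent.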


2011 ◽  
Vol 41 (9) ◽  
pp. 1836-1851 ◽  
Author(s):  
Yueyang Jiang ◽  
Qianlai Zhuang

Large fires are a major disturbance in Canadian forests and exert significant effects on both the climate system and ecosystems. During the last century, extremely large fires accounted for the majority of Canadian burned area. By making an instantaneous change over a vast area of ecosystems, extreme fires often have significant social, economic, and ecological consequences. Since extreme values of fire size always lie in the upper tail of a cumulative probability distribution, the mean and variance alone are not sufficient to fully characterize those extreme events. To characterize large fire behavior in the upper tail, the authors in this study applied three extreme value distribution functions, (i) the generalized extreme value (GEV) distribution, (ii) the generalized Pareto distribution (GPD), and (iii) the GEV distribution with a Poisson point process (PP) representation, to fit the Canadian historical fire data of the period 1959–2010. The analysis was conducted with the whole data set and with different portions of the data set according to ignition source (lightning-caused or human-caused) and ecozone classification. It is found that (i) all three extreme statistical models perform well in characterizing extreme fire events, but the GPD and PP models need extra care when fitting the nonstationary fire data, (ii) anthropogenic and natural extreme fires have significantly different extreme statistics, and (iii) fires in different ecozones exhibit very different characteristics from a statistical point of view. Further, the estimated fire return levels are comparable with observations in terms of the magnitude and frequency of an extreme event. These extreme value statistics provide valuable information for future quantification of large fire risks and for forest management in the region.


2014 ◽  
Vol 58 (3) ◽  
pp. 193-207 ◽  
Author(s):  
C Photiadou ◽  
MR Jones ◽  
D Keellings ◽  
CF Dewes

Extremes ◽  
2021 ◽  
Author(s):  
Laura Fee Schneider ◽  
Andrea Krajina ◽  
Tatyana Krivobokova

Abstract. Threshold selection plays a key role in various aspects of statistical inference of rare events. In this work, two new threshold selection methods are introduced. The first approach measures the fit of the exponential approximation above a threshold and achieves good performance in small samples. The second method smoothly estimates the asymptotic mean squared error of the Hill estimator and performs consistently well over a wide range of processes. Both methods are analyzed theoretically, compared to existing procedures in an extensive simulation study and applied to a dataset of financial losses, where the underlying extreme value index is assumed to vary over time.
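The abstract describes the first method only at a high level. In its spirit, a simple heuristic would score candidate thresholds by how exponential the log-excesses look, since for Pareto-like tails log(X/u) above a good threshold u is approximately exponential. This is a hypothetical sketch, not the authors' estimator:

```python
import numpy as np
from scipy.stats import expon, kstest

def select_threshold(data, candidates):
    # Score each candidate threshold u by the Kolmogorov-Smirnov distance
    # between the log-excesses log(X/u) and a fitted exponential; return
    # the candidate with the best (smallest) fit statistic.
    best_u, best_stat = None, np.inf
    for u in candidates:
        exceed = data[data > u]
        if len(exceed) < 30:          # too few excesses to judge the fit
            continue
        log_exc = np.log(exceed / u)
        stat = kstest(log_exc, expon(scale=log_exc.mean()).cdf).statistic
        if stat < best_stat:
            best_u, best_stat = u, stat
    return best_u
```

On data with a non-Pareto body and a Pareto tail, low candidate thresholds are penalized by the contaminating body, while thresholds inside the tail score well.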

