Extreme Value Analysis for Time-variable Mixed Workload

Author(s):  
Szilárd Bozóki ◽  
András Pataricza

Proper timeliness is vital for many real-world computing systems. Understanding the phenomena behind extreme workloads is essential because, left unhandled, extreme workloads can cause violations of timeliness requirements, service degradation, and even downtime. Extremity can have multiple roots: (1) service requests can naturally produce extreme workloads; (2) in multiservice systems with mixed workloads, bursts can occur randomly on a probabilistic basis; (3) workload spikes typically occur in deadline-bound tasks.

Extreme Value Analysis (EVA) is a statistical method for modeling extremely deviant values, i.e., the largest observations. The mathematical foundation of EVA, the Extreme Value Theorem, requires the dataset to be independent and identically distributed. However, this does not generally hold in practice, because real-life processes are usually a mixture of sources with identifiable patterns. For example, seasonality and periodic fluctuations are regularly occurring patterns. Deadlines can be purely periodic, e.g., monthly tax submissions, or time variable, e.g., university homework submissions with variable semester schedules.

We propose preprocessing the data using time series decomposition to separate out the stochastic process causing the extreme values. Moreover, we focus on the case where the extreme values share a single root cause: a deadline. We exploit known deadlines, using dynamic time warping to search for recurring workload peak patterns that vary in time and amplitude.
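The abstract does not spell out the matching pipeline; as an illustration, the dynamic time warping it relies on for comparing peak patterns reduces to a classic dynamic-programming recurrence. A minimal sketch (not the authors' implementation):

```python
import numpy as np

def dtw_distance(a, b):
    # Dynamic time warping: cost of the best monotone alignment of two
    # series, allowing peaks to stretch or shift in time.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: match, insertion, deletion.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

For example, a peak `[0, 1, 0]` aligns at zero cost with a time-stretched copy `[0, 0, 1, 1, 0, 0]`, which is exactly the invariance to timing variation the paper exploits.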

2020 ◽  
Vol 37 (5) ◽  
pp. 873-888 ◽  
Author(s):  
Jesús Portilla-Yandún ◽  
Edwin Jácome

Abstract. An important requirement in extreme value analysis (EVA) is for the working variable to be identically distributed. However, this is typically not the case in wind waves, because energy components with different origins belong to separate data populations with different statistical properties. Although this information is available in the wave spectrum, the working variable in EVA is typically the total significant wave height Hs, a parameter that carries no information about the spectral energy distribution and therefore does not fulfill this requirement. To gain insight into this aspect, we develop here a covariate EVA application based on spectral partitioning. We observe that in general the total Hs is inappropriate for EVA, leading to potential over- or underestimation of the projected extremes. This is illustrated with three representative cases under significantly different wave climate conditions. It is shown that the covariate analysis provides a meaningful understanding of the individual behavior of the wave components with regard to the consequences for projecting extreme values.
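The partitioning argument can be illustrated with a toy sketch: fit a GEV separately to maxima from two synthetic wave populations and compare against a single fit on the pooled sample. All names and parameter values below are hypothetical, not the paper's data:

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
# Hypothetical annual-maximum Hs samples from two spectral partitions.
swell = genextreme.rvs(c=0.1, loc=3.0, scale=0.5, size=200, random_state=rng)
wind_sea = genextreme.rvs(c=-0.1, loc=5.0, scale=1.0, size=200, random_state=rng)
pooled = np.concatenate([swell, wind_sea])

def return_level(sample, T=100):
    # Fit a GEV and invert it for the T-year return level.
    c, loc, scale = genextreme.fit(sample)
    return genextreme.ppf(1.0 - 1.0 / T, c, loc=loc, scale=scale)

levels = {"swell": return_level(swell),
          "wind sea": return_level(wind_sea),
          "pooled": return_level(pooled)}
```

Comparing `levels["pooled"]` with the per-partition values shows how a single fit on the mixed sample can misstate the projected extremes relative to the dominant component.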


2019 ◽  
Vol 34 (2) ◽  
pp. 200-220
Author(s):  
Jingjing Zou ◽  
Richard A. Davis ◽  
Gennady Samorodnitsky

Abstract. In this paper, we are concerned with the analysis of heavy-tailed data when a portion of the extreme values is unavailable. This research was motivated by an analysis of the degree distributions in a large social network. The degree distributions of such networks tend to have power law behavior in the tails. We focus on the Hill estimator, which plays a starring role in heavy-tailed modeling. The Hill estimator for these data exhibited a smooth and increasing “sample path” as a function of the number of upper order statistics used in constructing the estimator. This behavior became more apparent as we artificially removed more of the upper order statistics. Building on this observation we introduce a new version of the Hill estimator. It is a function of the number of the upper order statistics used in the estimation, but also depends on the number of unavailable extreme values. We establish functional convergence of the normalized Hill estimator to a Gaussian process. An estimation procedure is developed based on the limit theory to estimate the number of missing extremes and extreme value parameters including the tail index and the bias of Hill's estimator. We illustrate how this approach works in both simulations and real data examples.
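The classical Hill estimator that the paper builds on is compact; a standard implementation (without the paper's missing-extremes adjustment) is:

```python
import numpy as np

def hill_estimator(data, k):
    # Hill estimator of the extreme value index gamma = 1/alpha,
    # computed from the k largest order statistics.
    x = np.sort(np.asarray(data))[::-1]   # order statistics, descending
    logs = np.log(x[:k + 1])
    # Mean log-excess of the top k values over the (k+1)-th value.
    return np.mean(logs[:k] - logs[k])
```

Plotting this quantity against k gives the “sample path” (Hill plot) discussed in the abstract; for an exact Pareto tail with index alpha it concentrates around 1/alpha.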


2020 ◽  
Author(s):  
Nikos Koutsias ◽  
Frank A. Coutelieris

A statistical analysis of the wildfire events that took place in Greece during the period 1985–2007 has been performed to assess the extremes. The total burned area of each fire was considered here as a key variable expressing the significance of a given event. The data have been analyzed through extreme value theory, which has in general proved a powerful tool for accurately assessing the return period of extreme events. Both frequentist and Bayesian approaches have been used for comparison and evaluation purposes. Specifically, the Generalized Extreme Value (GEV) distribution along with Peaks over Threshold (POT) have been compared with Bayesian extreme value modelling. Furthermore, the correlation of the burned area with potential extreme values of other key parameters (e.g., wind, temperature, humidity) has also been investigated.


2008 ◽  
Vol 385-387 ◽  
pp. 561-564 ◽  
Author(s):  
Costas P. Providakis

This paper presents the use of statistically rigorous algorithms combined with the electromechanical (E/M) impedance approach for health monitoring of engineering structures. In particular, a statistical pattern recognition procedure is developed, based on frequency domain data of electromechanical impedance, to establish a decision boundary for damage identification. In order to diagnose damage with statistical confidence, health monitoring is cast in the context of an outlier detection framework. Inappropriate modeling of the tail distribution of outliers can produce misleading indications of damage. The present paper addresses the problem of establishing decision boundaries based on extreme value statistics, so that the extreme values of outliers in the tail distribution can be properly modeled. The validity of the proposed method is demonstrated using finite element method (FEM) simulated data, and the extreme value analysis results are contrasted with the standard approach, which assumes that the damage-sensitive features are normally distributed.


2016 ◽  
Author(s):  
Lorenzo Mentaschi ◽  
Michalis Vousdoukas ◽  
Evangelos Voukouvalas ◽  
Ludovica Sartini ◽  
Luc Feyen ◽  
...  

Abstract. Statistical approaches to studying extreme events require, by definition, long time series of data. The climate is subject to natural and anthropogenic variations at different temporal scales, which leave their footprint on the frequency and intensity of climatic and hydrological extremes; therefore the assumption of stationarity is violated, and alternatives to conventional stationary Extreme Value Analysis (EVA) need to be adopted. In this study we introduce the Transformed-Stationary (TS) methodology for non-stationary EVA. This approach consists of (i) transforming a non-stationary time series into a stationary one to which stationary EVA theory can be applied, and (ii) reverse-transforming the result into a non-stationary extreme value distribution. As a transformation we propose and discuss a simple time-varying normalization of the signal, and show that it allows a comprehensive formulation of non-stationary GEV/GPD models with constant shape parameter. A validation of the methodology is carried out on time series of significant wave height, residual water level, and river discharge, which show varying degrees of long-term and seasonal variability. The results from the proposed approach are comparable with those from (a) a stationary EVA on quasi-stationary slices of a non-stationary series and (b) previously applied non-stationary EVA approaches. However, the proposed technique has advantages in both cases: in contrast to (a), it uses the whole time horizon of the series for the estimation of the extremes, allowing a more accurate estimation of large return levels; and with respect to (b), it decouples the detection of non-stationary patterns from the fitting of the extreme value distribution. As a result the steps of the analysis are simplified and intermediate diagnostics are possible.
In particular, the transformation can be carried out by means of simple statistical techniques such as low-pass filters based on running mean and standard deviation, and the fitting procedure is a stationary one with few degrees of freedom that is easy to implement and control. An open-source MATLAB toolbox covering this methodology has been developed and is available at https://bitbucket.org/menta78/tseva.
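The core transformation described above, a time-varying normalization by running mean and standard deviation, can be sketched as follows (a minimal illustration, not the TS toolbox implementation):

```python
import numpy as np

def transform_stationary(x, window):
    # Remove a running mean and divide by a running standard deviation,
    # so that stationary EVA theory can be applied to the result.
    kernel = np.ones(window) / window
    mu = np.convolve(x, kernel, mode="same")
    sigma = np.sqrt(np.convolve((x - mu) ** 2, kernel, mode="same"))
    return (x - mu) / sigma, mu, sigma

def inverse_transform(y, mu, sigma):
    # Reverse-transform, e.g. to map stationary return levels back
    # into a non-stationary extreme value distribution.
    return y * sigma + mu
```

A trending series run through `transform_stationary` comes out approximately zero-mean with unit local variance; applying the inverse to fitted return levels recovers them on the original, non-stationary scale.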


2022 ◽  
Vol 8 ◽  
Author(s):  
Eun-Young Lee ◽  
Kyung-Ae Park

Extreme value analysis (EVA) has been extensively used to understand and predict long-term return extreme values. This study provides the first approach to EVA using satellite-observed sea surface temperature (SST) data over the past decades. Representative EVA methods were compared to select an appropriate method to derive SST extremes of the East/Japan Sea (EJS). As a result, the peaks-over-threshold (POT) method showed better performance than the other methods. The Optimum Interpolation Sea Surface Temperature (OISST) database was used to calculate the 100-year-return SST values in the EJS. The calculated SST extremes were 1.60–3.44°C higher than the average value of the upper 5th-percentile satellite-observed SSTs over the past decades (1982–2018). The monthly distribution of the SST extremes was similar to the known seasonal variation of SSTs in the EJS, but enhanced extreme SSTs exceeding 2°C appeared in early summer and late autumn. The calculated 100-year-return SSTs were compared with the simulation results of the Coupled Model Intercomparison Project 5 (CMIP5) climate model. As a result, the extreme SSTs were slightly smaller than the maximum SSTs of the model data with a negative bias of –0.36°C. This study suggests that the POT method can improve our understanding of future oceanic warming based on statistical approaches using SSTs observed by satellites over the past decades.
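A minimal sketch of the kind of POT computation described, fitting a generalized Pareto distribution (GPD) to threshold excesses and inverting for a 100-year return level, might look like this (illustrative only; the threshold choice and the declustering a real SST analysis needs are omitted):

```python
import numpy as np
from scipy.stats import genpareto

def pot_return_level(data, threshold, obs_per_year, T=100):
    # Peaks-over-threshold: fit a GPD to the excesses over the threshold,
    # then invert for the T-year return level (assumes shape c != 0;
    # c -> 0 requires the exponential limit instead).
    excesses = data[data > threshold] - threshold
    c, _, scale = genpareto.fit(excesses, floc=0)
    zeta = len(excesses) / len(data)   # empirical exceedance rate
    m = T * obs_per_year               # expected observations in T years
    return threshold + (scale / c) * ((m * zeta) ** c - 1.0)
```

With daily data (`obs_per_year=365`) this returns the level expected to be exceeded once per 100 years, the same quantity the 100-year-return SSTs above represent.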


2011 ◽  
Vol 41 (9) ◽  
pp. 1836-1851 ◽  
Author(s):  
Yueyang Jiang ◽  
Qianlai Zhuang

Large fires are a major disturbance in Canadian forests and exert significant effects on both the climate system and ecosystems. During the last century, extremely large fires accounted for the majority of Canadian burned area. By making an instantaneous change over a vast area of ecosystems, extreme fires often have significant social, economic, and ecological consequences. Since extreme values of fire size always lie in the upper tail of a cumulative probability distribution, the mean and variance alone are not sufficient to fully characterize those extreme events. To characterize large fire behavior in the upper tail, the authors in this study applied three extreme value distribution functions, (i) the generalized extreme value (GEV) distribution, (ii) the generalized Pareto distribution (GPD), and (iii) the GEV distribution with a Poisson point process (PP) representation, to fit the Canadian historical fire data of the period 1959–2010. The analysis was conducted with the whole data set and with different portions of the data set according to ignition source (lightning-caused or human-caused) and ecozone classification. It is found that (i) all three extreme statistical models perform well in characterizing extreme fire events, but the GPD and PP models need extra care when fitting the nonstationary fire data, (ii) anthropogenic and natural extreme fires have significantly different extreme statistics, and (iii) fires in different ecozones exhibit very different characteristics from a statistical point of view. Further, the estimated fire return levels are comparable with observations in terms of the magnitude and frequency of an extreme event. These extreme value statistics provide valuable information for future quantification of large fire risks and for forest management in the region.


2014 ◽  
Vol 58 (3) ◽  
pp. 193-207 ◽  
Author(s):  
C Photiadou ◽  
MR Jones ◽  
D Keellings ◽  
CF Dewes

Extremes ◽  
2021 ◽  
Author(s):  
Laura Fee Schneider ◽  
Andrea Krajina ◽  
Tatyana Krivobokova

Abstract. Threshold selection plays a key role in various aspects of statistical inference of rare events. In this work, two new threshold selection methods are introduced. The first approach measures the fit of the exponential approximation above a threshold and achieves good performance in small samples. The second method smoothly estimates the asymptotic mean squared error of the Hill estimator and performs consistently well over a wide range of processes. Both methods are analyzed theoretically, compared to existing procedures in an extensive simulation study and applied to a dataset of financial losses, where the underlying extreme value index is assumed to vary over time.
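The abstract describes the first method only at a high level. In its spirit, a simple heuristic would score candidate thresholds by how exponential the log-excesses look, since for Pareto-like tails log(X/u) above a good threshold u is approximately exponential. This is a hypothetical sketch, not the authors' estimator:

```python
import numpy as np
from scipy.stats import expon, kstest

def select_threshold(data, candidates):
    # Score each candidate threshold u by the Kolmogorov-Smirnov distance
    # between the log-excesses log(X/u) and a fitted exponential; return
    # the candidate with the best (smallest) fit statistic.
    best_u, best_stat = None, np.inf
    for u in candidates:
        exceed = data[data > u]
        if len(exceed) < 30:          # too few excesses to judge the fit
            continue
        log_exc = np.log(exceed / u)
        stat = kstest(log_exc, expon(scale=log_exc.mean()).cdf).statistic
        if stat < best_stat:
            best_u, best_stat = u, stat
    return best_u
```

On data with a non-Pareto body and a Pareto tail, low candidate thresholds are penalized by the contaminating body, while thresholds inside the tail score well.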

