A New Method to Detect Outliers in High-frequency Time Series

2018 ◽  
Vol 8 (1) ◽  
pp. 16
Author(s):  
Ilaria Lucrezia Amerise ◽  
Agostino Tarsitano

The objective of this research is to develop a fast, simple method for detecting and replacing extreme spikes in high-frequency time series data. The method primarily consists of a nonparametric procedure that pursues a balance between fidelity to observed data and smoothness. Furthermore, through examination of the absolute difference between original and smoothed values, the technique is also able to detect and, where necessary, replace outliers with less extreme data. Unlike other filtering procedures found in the literature, our method does not require a model to be specified for the data. Additionally, the filter makes only a single pass through the time series. Experiments show that the new method can be validly used as a data preparation tool to ensure that time series modeling is supported by clean data, particularly in a complex context such as one with high-frequency data.
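
The idea of comparing original and smoothed values can be illustrated with a minimal sketch. The abstract does not specify the smoother or the threshold, so everything here is an assumption: a rolling median stands in for the authors' nonparametric procedure, and the function name `replace_spikes`, the `window` size, and the cutoff `k` are hypothetical choices made for illustration only.

```python
import numpy as np

def replace_spikes(x, window=5, k=5.0):
    """Flag points far from a smoothed series and replace them.

    Assumption: a rolling median is used as a stand-in smoother; the
    abstract does not say which smoother the authors use.
    `window` is assumed to be odd.
    """
    x = np.asarray(x, dtype=float)
    half = window // 2
    # Repeat the edge values so every point has a full window.
    padded = np.pad(x, half, mode="edge")
    smooth = np.array([np.median(padded[i:i + window]) for i in range(x.size)])
    # Absolute difference between original and smoothed values.
    resid = np.abs(x - smooth)
    # Robust scale of the residuals (hypothetical choice of threshold rule).
    scale = 1.4826 * np.median(resid)
    if scale == 0.0:
        scale = np.finfo(float).eps
    is_spike = resid > k * scale
    # Replace flagged points with the less extreme smoothed value.
    cleaned = np.where(is_spike, smooth, x)
    return cleaned, is_spike
```

Calling `replace_spikes(series)` returns the cleaned series and a boolean mask of the flagged points. Points that are not flagged are returned unchanged.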

2004 ◽  
Vol 91 (3-4) ◽  
pp. 332-344 ◽  
Author(s):  
Jin Chen ◽  
Per. Jönsson ◽  
Masayuki Tamura ◽  
Zhihui Gu ◽  
Bunkei Matsushita ◽  
...  

2014 ◽  
Vol 76 (6) ◽  
pp. 1333-1351 ◽  
Author(s):  
Kansuporn Sriyudthsak ◽  
Michio Iwata ◽  
Masami Yokota Hirai ◽  
Fumihide Shiraishi

2015 ◽  
Vol 5 (1) ◽  
pp. 36 ◽  
Author(s):  
Isaac O. Ajao ◽  
Femi J. Ayoola ◽  
Joseph O. Iyaniwura

Annual Gross Domestic Product (GDP) for Nigeria was studied using observed annual time-series data for the period 1981-2012. Five econometric disaggregation techniques, namely Denton, Denton-Cholette, Chow-Lin-maxlog, Fernandez, and Litterman-maxlog, were used for quarterisation. Quarterly exports and imports served as the indicator variables for disaggregating the annual data into quarterly data. The time series properties of the estimated quarterly series were examined using several measures of predictive accuracy: Theil's Inequality Coefficient, Root Mean Squared Error (RMSE), Mean Absolute Difference (MAD), and correlation coefficients. The results showed that exports and imports are not good indicators for predicting Nigeria's GDP over the period covered. The Denton method proved to be the worst according to MAD and Theil's Inequality Coefficient. However, RMSE% and Pearson's correlation coefficient gave robust values for Litterman-maxlog, making it the best method for temporal disaggregation of Nigeria's GDP.
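
The accuracy measures named above can be sketched as follows. The abstract does not give formulas, so the definitions here are assumptions: in particular, Theil's Inequality Coefficient is taken in its bounded form (RMSE divided by the sum of the root mean squares of the two series), and the function name `accuracy_measures` is hypothetical.

```python
import numpy as np

def accuracy_measures(actual, estimate):
    """Compare an estimated series with a reference series.

    Assumption: Theil's coefficient uses the bounded (0 to 1) form;
    the abstract does not state which variant was used.
    """
    a = np.asarray(actual, dtype=float)
    e = np.asarray(estimate, dtype=float)
    err = e - a
    rmse = np.sqrt(np.mean(err ** 2))          # Root Mean Squared Error
    mad = np.mean(np.abs(err))                 # Mean Absolute Difference
    theil_u = rmse / (np.sqrt(np.mean(a ** 2)) + np.sqrt(np.mean(e ** 2)))
    pearson_r = np.corrcoef(a, e)[0, 1]        # Pearson correlation
    return {"rmse": rmse, "mad": mad, "theil_u": theil_u, "pearson_r": pearson_r}
```

Lower RMSE, MAD, and Theil values indicate a closer fit, while a Pearson coefficient near 1 indicates that the two series move together.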


1971 ◽  
Vol 42 ◽  
pp. 41-45
Author(s):  
J. E. Hesser ◽  
B. M. Lasker

Time-series data for 14 stars in the list of Eggen and Greenstein have been used to compute their power spectra, which confirm previously found quiescency in the 4 to 700 sec period range. Additionally, characteristics of the continuous power spectra are considered.


2017 ◽  
Vol 4 (1) ◽  
pp. 160874 ◽  
Author(s):  
Matteo Smerlak ◽  
Bapu Vaitla

Resilience, the ability to recover from adverse events, is of fundamental importance to food security. This is especially true in poor countries, where basic needs are frequently threatened by economic, environmental and health shocks. An empirically sound formalization of the concept of food security resilience, however, is lacking. Here, we introduce a general non-equilibrium framework for quantifying resilience based on the statistical notion of persistence. Our approach can be applied to any food security variable for which high-frequency time-series data are available. We illustrate our method with per capita kilocalorie availability for 161 countries between 1961 and 2011. We find that resilient countries are not necessarily those that are characterized by high levels or less volatile fluctuations of kilocalorie intake. Accordingly, food security policies and programmes will need to be tailored not only to welfare levels at any one time, but also to long-run welfare dynamics.


2010 ◽  
Vol 14 (S1) ◽  
pp. 88-110 ◽  
Author(s):  
Phillip Wild ◽  
John Foster ◽  
Melvin J. Hinich

In this article, we show how tests of nonlinear serial dependence can be applied to high-frequency time series data that exhibit high volatility, strong mean reversion, and leptokurtosis. Portmanteau correlation, bicorrelation, and tricorrelation tests are used to detect nonlinear serial dependence in the data. Trimming is used to control for the presence of outliers in the data. The data that are employed are 161,786 half-hourly spot electricity price observations recorded over nearly a decade in the wholesale electricity market in New South Wales, Australia. Strong evidence of nonlinear serial dependence is found and its implications for time series modeling are discussed.
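
As a rough illustration of trimming followed by a portmanteau test, the sketch below clips a series to chosen quantiles and then computes the classical Box-Pierce statistic. This is not the authors' procedure: trimming is assumed here to mean clipping to quantiles, and the Box-Pierce statistic is a generic stand-in that only checks linear autocorrelation, unlike the bicorrelation and tricorrelation tests named in the abstract. The function names and default parameters are hypothetical.

```python
import numpy as np

def trim(x, lower=0.01, upper=0.99):
    """Clip values outside the given quantiles (assumed meaning of trimming)."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

def box_pierce_q(x, max_lag=10):
    """Box-Pierce portmanteau statistic: N times the sum of squared
    sample autocorrelations at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    denom = np.sum(z ** 2)
    r = np.array([np.sum(z[:-k] * z[k:]) / denom
                  for k in range(1, max_lag + 1)])
    return x.size * np.sum(r ** 2)
```

Under the null hypothesis of white noise, the statistic is approximately chi-squared distributed with `max_lag` degrees of freedom, so large values point to serial dependence.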


PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262463
Author(s):  
Keisuke Yoshihara ◽  
Kei Takahashi

We propose a simple anomaly detection method that is applicable to unlabeled time series data and is sufficiently tractable, even for non-technical entities, by using the density ratio estimation based on the state space model. Our detection rule is based on the ratio of log-likelihoods estimated by the dynamic linear model, i.e. the ratio of log-likelihood in our model to that in an over-dispersed model that we will call the NULL model. Using the Yahoo S5 data set and the Numenta Anomaly Benchmark data set, publicly available and commonly used benchmark data sets, we find that our method achieves better or comparable performance compared to the existing methods. The result implies that it is essential in time series anomaly detection to incorporate the specific information on time series data into the model. In addition, we apply the proposed method to unlabeled Web time series data, specifically, daily page view and average session duration data on an electronic commerce site that deals in insurance goods to show the applicability of our method to unlabeled real-world data. We find that the increase in page view caused by e-mail newsletter deliveries is less likely to contribute to completing an insurance contract. The result also suggests the importance of the simultaneous monitoring of more than one time series.

