Improved algorithm for cleaning high-frequency data: An analysis of foreign currency
High-frequency data are notorious for their noise and asynchrony, which may bias or contaminate the empirical analysis of prices and returns. In this study, we develop a novel data filtering approach that simultaneously addresses volatility clustering and irregular spacing, both of which are inherent characteristics of high-frequency data. Using high-frequency currency data collected at five-minute intervals, we find substantial microstructure noise coupled with random volatility clusters, and observe a strongly non-Gaussian distribution of returns. To prepare such non-Gaussian high-frequency data for time series modelling, we propose two efficient and robust standardisation methods that account for volatility clusters, clean the data and achieve near-normal distributions. We show that the filtering process efficiently cleans high-frequency data for use in empirical settings while retaining the underlying distributional properties.
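
The abstract does not spell out the two standardisation methods, so the sketch below is only an illustration of the general idea and should not be read as the authors' procedure. It assumes a common alternative: each return is divided by an exponentially weighted moving average (EWMA) volatility estimate built from earlier returns. The function names, the decay parameter `lam = 0.94`, and the simulated data are all hypothetical choices made for this example.

```python
import math
import random


def simulate_clustered_returns(n=5000, seed=0):
    """Hypothetical toy data: Gaussian returns whose volatility drifts in slow cycles."""
    rng = random.Random(seed)
    returns = []
    for t in range(n):
        # Volatility rises and falls smoothly, producing calm and turbulent clusters.
        sigma = 0.001 * math.exp(math.sin(2.0 * math.pi * t / 500.0))
        returns.append(rng.gauss(0.0, sigma))
    return returns


def ewma_standardise(returns, lam=0.94):
    """Scale each return by an EWMA volatility estimate (an assumed, illustrative method)."""
    # Assumption: seed the variance with the full-sample mean square (uses look-ahead).
    var = sum(r * r for r in returns) / len(returns)
    if var == 0.0:
        raise ValueError("returns are all zero; volatility cannot be estimated")
    standardised = []
    for r in returns:
        # Scale first, update afterwards, so the estimate excludes the current return.
        standardised.append(r / math.sqrt(var))
        var = lam * var + (1.0 - lam) * r * r
    return standardised


def excess_kurtosis(xs):
    """Sample excess kurtosis; close to 0 for Gaussian data."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


raw = simulate_clustered_returns()
standardised = ewma_standardise(raw)
kurt_raw = excess_kurtosis(raw)
kurt_std = excess_kurtosis(standardised)
print(f"excess kurtosis, raw: {kurt_raw:.2f}; standardised: {kurt_std:.2f}")
```

On this toy series the raw returns are fat-tailed purely because of the changing volatility, and dividing by the local volatility estimate brings the excess kurtosis much closer to zero. Real five-minute currency data would also need the handling of irregular spacing and microstructure noise that the study addresses, which this sketch leaves out.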