Selection of an optimal algorithm for outlier detection in GNSS time series

Author(s):  
Nhung Le Thi ◽  
Benjamin Männel ◽  
Mihaela Jarema ◽  
Gopi Krishna Seemala ◽  
Kosuke Heki ◽  
...  

In data mining, outliers can lead to misleading interpretations of statistical results. This is a particular concern in deformation monitoring, where the analysis relies on fluctuations and disturbances simulated by numerical models. Outlier filtering therefore cannot be ignored in data standardization. However, no single filtering algorithm is likely to be efficient for every data pattern. Using MATLAB® (Release 2020a), we investigate five outlier filtering algorithms: moving average, moving median, quartiles, Grubbs, and generalized extreme Studentized deviate (GESD), in order to select the optimal algorithm for GNSS time series data. The study uses two types of data: data for ionosphere disturbance analysis in the Ring of Fire region and data for crustal deformation monitoring in Germany. One type shows seasonal time series patterns and the other presents trend models. We apply simple random sampling, which upholds the principles of unbiased survey techniques. The optimal algorithm is selected based on the sensitivity of outlier detection and the capability of the central tendency measures. Algorithm robustness is also tested by altering random outliers while maintaining the standard distribution of each dataset. Our results show that the moving median algorithm is the most sensitive for outlier detection because the median is a robust statistic that is not affected by anomalies; it is followed in turn by quartiles, GESD, and Grubbs. The moving average algorithm is the least efficient filter, with an outlier detection percentage below 20% of that of the moving median (corresponding to a 95% probability). In deformation analysis, disturbances in numerical models are often the basis for motion assessment, yet moving median filtering smooths these anomalies away. Hence, the quartiles algorithm can be considered in this case. Overall, the moving median is best suited to filtering outliers in seasonal and trend time series data; for deformation analysis in particular, the optimal solution is to apply the quartiles algorithm or to extend the threshold factor and the sliding window of the moving median.

Keywords: Outlier filtering, Time series, Deformation analysis, Moving median, Quartiles, MATLAB.
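
To make the two best-ranked filters in the abstract above concrete, here is a minimal Python sketch of a moving median filter and a quartiles filter. It is not the authors' MATLAB code. The function names, the window length of 11 samples, the threshold factor of 3 scaled median absolute deviations (MAD), the 1.5 interquartile-range factor, and the synthetic series are all assumptions chosen for illustration.

```python
import numpy as np

def moving_median_outliers(x, window=11, threshold=3.0):
    """Flag points more than `threshold` scaled MADs from the local median."""
    x = np.asarray(x, dtype=float)
    half = window // 2
    flags = np.zeros(x.size, dtype=bool)
    for i in range(x.size):
        seg = x[max(0, i - half): i + half + 1]      # window is truncated at the edges
        med = np.median(seg)
        # 1.4826 scales the MAD so it estimates sigma for normally distributed data
        mad = 1.4826 * np.median(np.abs(seg - med))
        flags[i] = np.abs(x[i] - med) > threshold * mad
    return flags

def quartile_outliers(x, k=1.5):
    """Flag points more than k interquartile ranges outside the quartiles."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

# Synthetic daily series (hypothetical): weak seasonal signal plus noise
rng = np.random.default_rng(0)
t = np.arange(365)
series = 0.5 * np.sin(2 * np.pi * t / 365) + rng.normal(0.0, 0.3, t.size)
spike_idx = np.array([50, 200, 300])
series[spike_idx] += 8.0                             # inject three artificial outliers

mm_flags = moving_median_outliers(series)
qt_flags = quartile_outliers(series)
print("moving median flagged:", int(mm_flags.sum()))
print("quartiles flagged:", int(qt_flags.sum()))
```

Widening `window` or raising `threshold` makes the moving median filter more tolerant of local disturbances, which corresponds to the adjustment the abstract suggests for deformation analysis.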

2021 ◽  
Vol 11 (8) ◽  
pp. 3561
Author(s):  
Diego Duarte ◽  
Chris Walshaw ◽  
Nadarajah Ramesh

Across the world, healthcare systems are under stress, and this has been hugely exacerbated by the COVID pandemic. Key Performance Indicators (KPIs), usually in the form of time-series data, are used to help manage that stress. Making reliable predictions of these indicators, particularly for emergency departments (ED), can facilitate acute unit planning, enhance quality of care and optimise resources. This motivates models that can forecast relevant KPIs. This paper addresses that need by comparing the Autoregressive Integrated Moving Average (ARIMA) method, a purely statistical model, with Prophet, a decomposable forecasting model based on trend, seasonality and holiday variables, and with the General Regression Neural Network (GRNN), a machine learning model. The dataset analysed comprises four hourly-valued indicators from a UK hospital: Patients in Department; Number of Attendances; Unallocated Patients with a DTA (Decision to Admit); Medically Fit for Discharge. Typically, the data exhibit regular patterns and seasonal trends and can be impacted by external factors such as the weather or major incidents. The COVID pandemic is an extreme instance of the latter, and the behaviour of the sample data changed dramatically. The capacity to adapt quickly to these changes is crucial, and on this factor GRNN shows better results in both accuracy and reliability.
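
As a rough illustration of the GRNN idea mentioned above, the sketch below predicts the next value of an hourly series as a Gaussian-kernel-weighted average of past targets. This is not the paper's implementation. The helper names, the 24-hour lag window, the smoothing parameter `sigma`, and the synthetic data are assumptions.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds n_lags consecutive values, y is the next value."""
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_lags
    X = np.column_stack([series[i: i + n_rows] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

def grnn_predict(train_x, train_y, query_x, sigma=5.0):
    """GRNN output: training targets averaged with Gaussian weights on input distance."""
    # Squared Euclidean distance between every query row and every training row
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return (w @ train_y) / w.sum(axis=1)

# Hypothetical hourly indicator with a daily cycle (two weeks of data)
rng = np.random.default_rng(1)
hours = np.arange(24 * 14)
signal = 50.0 + 10.0 * np.sin(2 * np.pi * hours / 24)
observed = signal + rng.normal(0.0, 0.5, hours.size)

X, y = make_lagged(observed, n_lags=24)
query = observed[-24:][None, :]          # the most recent 24 hours as a single query row
next_hour = grnn_predict(X, y, query)[0]
print("one-step-ahead forecast:", round(float(next_hour), 2))
```

A GRNN has no iterative training phase: new observations simply become additional stored patterns. That property may be one reason such a model can adapt quickly to the behaviour changes the abstract describes.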


MAUSAM ◽  
2021 ◽  
Vol 68 (2) ◽  
pp. 349-356
Author(s):  
J. HAZARIKA ◽  
B. PATHAK ◽  
A. N. PATOWARY

Understanding the rainfall pattern is a difficult part of solving several regional environmental issues in water resources management, with implications for agriculture, climate change, and natural calamities such as floods and droughts. Statistical computing, modeling and data forecasting are key instruments for studying these patterns. Time series analysis and forecasting have become a major tool in hydrology and environmental applications. Among the most effective approaches for analyzing time series data is the ARIMA (Autoregressive Integrated Moving Average) model introduced by Box and Jenkins. In this study, the Box-Jenkins methodology is used to build an ARIMA model for monthly rainfall data from Dibrugarh for the period 1980-2014, a total of 420 points. We found that the ARIMA (0, 0, 0)(0, 1, 1)₁₂ model is suitable for the given data set. This model can therefore be used to forecast the pattern of monthly rainfall in the coming years, which can help decision makers set priorities for agriculture, flood management, water demand management, and related areas.
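
For readers unfamiliar with the notation, the selected model can be written out as follows. This is a sketch that assumes no constant term and the Box-Jenkins sign convention for the moving average part. The symbols B (backshift operator), Θ₁ (seasonal moving average coefficient) and ε (white-noise error) do not appear in the abstract.

```latex
% ARIMA(0,0,0)(0,1,1)_{12}: B is the backshift operator, B^{12} y_t = y_{t-12}
(1 - B^{12})\, y_t = (1 - \Theta_1 B^{12})\, \varepsilon_t
\quad\Longleftrightarrow\quad
y_t = y_{t-12} + \varepsilon_t - \Theta_1\, \varepsilon_{t-12}
```

Under these assumptions, the one-step forecast for a given month is the value observed in the same month of the previous year, corrected by a fraction Θ₁ of that month's previous forecast error.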


2018 ◽  
Vol 29 (11) ◽  
pp. 1850109 ◽  
Author(s):  
Emrah Oral ◽  
Gazanfer Unal

This primary study models multifractal wavelet-scale time series data using multiple wavelet coherence (MWC), the continuous wavelet transform (CWT) and multifractal detrended fluctuation analysis (MFDFA), and forecasts with a vector autoregressive fractionally integrated moving average (VARFIMA) model. The data, acquired from Yahoo! Finance, consist of 1671 daily values for eastern (NIKKEI, TAIEX, KOSPI) and western (SP500, FTSE, DAX) stock markets. Once the co-movement dependencies in time-frequency space are determined with MWC, the coherent data are extracted from the raw data at a certain scale using CWT. The multifractal behavior of the extracted series is verified by MFDFA, and its local Hurst exponents are calculated by obtaining the root mean square of the residuals at each scale. The resulting fluctuation function time series is re-scaled, used to estimate the process with the VARFIMA model, and forecast accordingly. The results show that the direction of price change is determined without difficulty and that forecasting efficiency increases substantially when highly correlated multifractal wavelet-scale time series data are used.

