Identification of Spikes in Time Series

2019 ◽  
Vol 8 (1) ◽  
Author(s):  
Dana E. Goin ◽  
Jennifer Ahern

Abstract Researchers interested in the effects of exposure spikes on an outcome need tools to identify unexpectedly high values in a time series. However, the best method to identify spikes in time series is not known. This paper aims to fill this gap by testing the performance of several spike detection methods in a simulation setting. We created simulations parameterized by monthly violence rates in nine California cities that represented different series features, and randomly inserted spikes into the series. We then compared the ability to detect spikes of the following methods: ARIMA modeling, Kalman filtering and smoothing, wavelet modeling with soft thresholding, and an iterative outlier detection method. We varied the magnitude of spikes from 10 to 50 % of the mean rate over the study period and varied the number of spikes inserted from 1 to 10. We assessed performance of each method using sensitivity and specificity. The Kalman filtering and smoothing procedure had the best overall performance. We applied each method to the monthly violence rates in nine California cities and identified spikes in the rate over the 2005–2012 period.
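The evaluation loop described in the abstract (insert spikes of a known size, run a detector, score it by sensitivity and specificity) can be sketched roughly as follows. This is an illustrative sketch only: the function names are hypothetical, and `threshold_detector` is a deliberately simple stand-in, not one of the ARIMA, Kalman, wavelet, or iterative methods the paper compares.

```python
import random

def insert_spikes(series, n_spikes, magnitude, seed=0):
    """Return a spiked copy of series and the set of spike positions.

    magnitude is a fraction of the series mean (e.g. 0.3 for 30 %).
    """
    rng = random.Random(seed)
    mean = sum(series) / len(series)
    positions = set(rng.sample(range(len(series)), n_spikes))
    spiked = [x + magnitude * mean if i in positions else x
              for i, x in enumerate(series)]
    return spiked, positions

def threshold_detector(series, k=2.0):
    """Stand-in detector (assumption): flag points above mean + k * std."""
    n = len(series)
    mean = sum(series) / n
    sd = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    return {i for i, x in enumerate(series) if x > mean + k * sd}

def sensitivity_specificity(true_positions, flagged_positions, n):
    """Score flagged positions against the inserted spikes over n points."""
    tp = len(true_positions & flagged_positions)
    fn = len(true_positions - flagged_positions)
    fp = len(flagged_positions - true_positions)
    tn = n - tp - fn - fp
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity

# Example: a flat series with three spikes at 50 % of the mean.
spiked, truth = insert_spikes([10.0] * 100, n_spikes=3, magnitude=0.5)
flagged = threshold_detector(spiked)
print(sensitivity_specificity(truth, flagged, len(spiked)))
```

Any of the methods named in the abstract could be swapped in for the stand-in detector, as long as it returns the set of flagged time indices.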

2020 ◽  
Vol 4 (Supplement_2) ◽  
pp. 1174-1174
Author(s):  
Paraskevi Massara ◽  
Robert Bandsma ◽  
Celine Bourdon ◽  
Jonathon Maguire ◽  
Elena Comelli ◽  
...  

Abstract Objectives: Eliminating anthropometry measurement error and employing outlier and biologically implausible value (BIV) detection methods adapted to longitudinal measurements is important for the study of growth. This work aimed to review and assess the accuracy of the available BIV and outlier detection methods and to propose a growth trajectory outlier detection method. Methods: We included 2354 infants from the Applied Research Group for Kids (TARGet Kids!) cohort based in Toronto (ON, Canada), which recruits healthy children from birth to 5 years of age. We considered infants with at least 8 length and weight measurements available between the 1st and the 24th month of age. Weight-for-length z-scores (wflz) were calculated using the WHO growth standards. Outlier measurements were randomly introduced in 5% of the wflz measurements using a normal distribution (μ = 0, σ = 1). We employed 4 outlier detection methods: an empirical detection method for BIV using the cut-offs derived from the WHO Child Growth Standards, a clustering method, a method based on cluster prototypes for individual outlier measurements and a method based on cluster prototypes for entire growth trajectories. Each method was applied individually and evaluated using the sensitivity and specificity indexes based on the manually introduced outliers. We also calculated the Kappa statistic to evaluate the agreement of each method against the manual outliers. Results: After excluding premature (<37 weeks) and low birth weight (<1500 g) neonates and children with missing length and weight measurements, we analyzed 393 children with a total of 3144 measurements. Sensitivity and specificity for the four methods ranged between 4.4%–55.0% and 83.7%–99.7%, respectively, with kappa being non-significant (P > 0.05) only for the empirical method. The clustering detection method reported a higher finding rate, while the empirical method found most of the BIV but few of the remaining outliers.
Conclusions: BIV account for a small portion of the possible outliers in growth datasets. We show that additional statistical or model-based methods are required for a more comprehensive outlier detection process, which has implications for growth analysis and nutritional assessment. Funding Sources: Joannah and Brian Lawson Center for Child Nutrition, Connaught Fund, Onassis Foundation.
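The agreement measure mentioned in the abstract above, the Kappa statistic between a method's flags and the manually introduced outliers, can be illustrated with Cohen's kappa for binary labels. This is a minimal sketch assuming 0/1 labels per measurement; the function name is hypothetical, and the abstract does not say which software or kappa variant the authors used.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of binary labels (0/1)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(labels_a)
    # Observed agreement: fraction of measurements labelled identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each labelling's marginal outlier rate.
    p_a = sum(labels_a) / n
    p_b = sum(labels_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return float("nan")  # both labellings constant: kappa is undefined
    return (observed - expected) / (1 - expected)

# Example: injected outliers vs. a method's flags on four measurements.
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because outliers are rare (5% of measurements here), kappa is more informative than raw agreement, which would be high even for a method that flags nothing.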


2017 ◽  
Author(s):  
Abdelhadi El Yazidi ◽  
Michel Ramonet ◽  
Philippe Ciais ◽  
Gregoire Broquet ◽  
Isabelle Pison ◽  
...  

Abstract. This study deals with the problem of identifying atmospheric data that are influenced by local emissions which cause spikes in time series of greenhouse gases and long-lived tracer measurements. We considered three spike detection methods known as coefficient of variation (COV), robust extraction of baseline signal (REBS), and standard deviation of the background (SD), to detect and filter positive spikes in continuous greenhouse gas time series from four monitoring stations representative of the ICOS (Integrated Carbon Observation System) European Infrastructure network. The results of the different methods are compared to each other and against a manual detection performed by station managers. Four stations were selected as test cases to apply the spike detection methods: a continental rural tower of 100 m height in Eastern France (OPE); a high mountain observatory in the south-west of France (PDM); a regional marine background site in Crete (FKL); and a marine clean-air background site in the southern hemisphere on Amsterdam Island (AMS). This selection allows us to address spike detection problems in time series with different variability. Two years of continuous measurements of CO2, CH4 and CO were analyzed. All the methods were found to be able to detect short-term spikes (lasting from a few seconds to a few minutes) in the time series. Analysis of the results of each method leads us to exclude the use of the COV method because of its requirement to arbitrarily specify an a priori percentage of rejected data in the time series, which may over- or under-estimate the actual number of spikes. The two other methods freely determine the number of spikes for a given set of parameters, and the values of these parameters were calibrated to provide the best match with spikes known to reflect local emissions episodes well documented by the station managers.
More than 96 % of the spikes manually identified by station managers were successfully detected by both the SD and the REBS methods after the best adjustment of parameter values. At PDM, measurements made by two analyzers 200 m from each other allow us to confirm that the CH4 spikes identified in one of the time series but not in the other correspond to a local source from a sewage treatment facility in one of the observatory buildings. From this experiment, we found that the REBS method underestimates the number of positive anomalies in the CH4 data caused by local sewage emissions. In conclusion, we recommend the use of the SD method, which also appears to be the easiest one to implement as automatic data processing, for the operational filtering of spikes in greenhouse gas time series at global and regional monitoring stations of networks like ICOS.


2020 ◽  
Vol 2020 (10) ◽  
pp. 133-1-133-7
Author(s):  
Jiho Yoon ◽  
Chulhee Lee

In this paper, we propose a new edge detection method for color images, based on the Bhattacharyya distance with adjustable block space. First, the Wiener filter was used to remove noise as pre-processing. To calculate the Bhattacharyya distance, a pair of blocks was extracted for each pixel. To detect subtle edges, we adjusted the block space. The mean vector and covariance matrix were computed from each block. Using the mean vectors and covariance matrices, we computed the Bhattacharyya distance, which was used to detect edges. By adjusting the block space, we were able to detect weak edges that other edge detection methods failed to detect. Experimental results are promising compared with some existing edge detection methods.
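The abstract does not state which form of the Bhattacharyya distance is used. Since only a mean vector and a covariance matrix are computed per block, a plausible reading is the standard closed form for two multivariate Gaussians, given here as an assumption. The symbols $\mu_1, \mu_2$ (block mean color vectors) and $\Sigma_1, \Sigma_2$ (block covariance matrices) are introduced for illustration.

```latex
D_B = \frac{1}{8}\,(\mu_1 - \mu_2)^{\top}\,\Sigma^{-1}\,(\mu_1 - \mu_2)
      + \frac{1}{2}\,\ln\frac{\det \Sigma}{\sqrt{\det \Sigma_1 \,\det \Sigma_2}},
\qquad
\Sigma = \frac{\Sigma_1 + \Sigma_2}{2}
```

The first term grows when the two blocks differ in mean color, the second when they differ in color spread, so a large $D_B$ between the paired blocks around a pixel suggests an edge.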


2020 ◽  
Vol 13 (02) ◽  
pp. 1-8
Author(s):  
Agrienvi

ABSTRACT Chili is one of the leading vegetable commodities and has strategic value at national and regional levels. An unexpected increase in chili prices often results in a surge of inflation and economic turmoil. Study and modeling of chili production are needed as planning and evaluation material for policy makers. One of the most frequently used methods in modeling and forecasting time series data is Autoregressive Integrated Moving Average (ARIMA). The results of ARIMA modeling on chili production data showed that the data were non-stationary in the mean and therefore had to be differenced, while the data on small chili production required both transformation and differencing because they were non-stationary in both the variance and the mean. The best ARIMA model, based on the smallest AIC and MSE criteria, for the production of chili and small chili in Central Kalimantan Province is ARIMA (3,1,0). Keywords: modeling of chilli, forecasting of chilli, Autoregressive Integrated Moving Average, ARIMA, Box-Jenkins.


Symmetry ◽  
2021 ◽  
Vol 13 (11) ◽  
pp. 2147
Author(s):  
Liwei Deng ◽  
Xiaofei Wang ◽  
Jiazhong Xu

The early diagnosis of retinopathy is crucial to the prevention and treatment of diabetic retinopathy. The low proportion of positive cases in the asymmetric microaneurysm detection problem causes preprocessing to treat microaneurysms as noise to be eliminated. To obtain a binary image containing microaneurysms, the object was segmented by a symmetry algorithm, which is a combination of the connected components and SSA methods. Next, a candidate microaneurysm set was extracted by multifeature clustering of binary images. Finally, the candidate microaneurysms were mapped to the Radon frequency domain to achieve microaneurysm detection. In order to verify the feasibility of the algorithm, a comparative experiment was conducted on the combination of the connected components and SSA methods. In addition, PSNR, FSIM, SSIM, fitness value, average CPU time and other indicators were used as evaluation standards. The results showed that the overall performance of the binary image obtained by the algorithm was the best. Last but not least, the accuracy of the detection method for microaneurysms in this paper reached up to 93.24%, which was better than that of several classic microaneurysm detection methods in the same period.


2021 ◽  
Vol 12 (3) ◽  
Author(s):  
Luciana Escobar ◽  
Rebecca Salles ◽  
Janio Lima ◽  
Cristiane Gea ◽  
Lais Baroni ◽  
...  

The detection of events in time series is an important task in several areas of knowledge where operations monitoring is essential. Experts often have to deal with choosing the most appropriate event detection method for a time series, which can be a complex task. There is a demand for benchmarking different methods in order to guide this choice. For this, standard classification accuracy metrics are usually adopted. However, they are insufficient for a qualitative analysis of the tendency of a method to precede or delay event detections. Such analysis is interesting for applications in which tolerance for "close" detections is important rather than focusing only on accurate ones. In this context, this paper proposes a more comprehensive event detection benchmark process, including an analysis of temporal bias of detection methods. For that, metrics based on the time distance between event detections and identified events (detection delay) are adopted. Computational experiments were conducted using real-world and synthetic datasets from Yahoo Labs and resources from the Harbinger framework for event detection. Adopting the proposed detection delay-based metrics helped obtain a complete overview of the performance and general behavior of detection methods.
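The idea of a detection-delay metric, the time distance between each identified event and the closest detection, can be sketched as below. This is an illustrative reading of the abstract, not the Harbinger framework's implementation. The function names and the nearest-detection matching rule are assumptions.

```python
def detection_delays(event_times, detection_times):
    """Signed distance from each true event to its nearest detection.

    Negative values mean the detection preceded the event, positive values
    mean it was delayed. Returns None per event when nothing was detected.
    """
    delays = []
    for event in event_times:
        if not detection_times:
            delays.append(None)
            continue
        # Nearest detection by absolute time distance (first one wins ties).
        nearest = min(detection_times, key=lambda d: abs(d - event))
        delays.append(nearest - event)
    return delays

def mean_absolute_delay(delays):
    """Average absolute delay over events that have a matched detection."""
    observed = [abs(d) for d in delays if d is not None]
    return sum(observed) / len(observed) if observed else float("nan")

# Example: two labelled events, three detections.
delays = detection_delays([10, 50], [8, 53, 90])
print(delays, mean_absolute_delay(delays))
```

Keeping the sign of each delay is what exposes the temporal bias the abstract refers to: a method with mostly negative delays tends to anticipate events, one with mostly positive delays tends to lag.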


Viruses ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1692
Author(s):  
Kathrine Kronberg Jakobsen ◽  
Amanda-Louise Fenger Carlander ◽  
Simone Kloch Bendtsen ◽  
Martin Garset-Zamani ◽  
Charlotte Duch Lynggaard ◽  
...  

The aim of the study was to evaluate the diagnostic accuracy of Human Papillomavirus (HPV) detection techniques in oropharyngeal cancer. PubMed, EMBASE, the Cochrane Library and clinicaltrials.org were systematically searched for studies reporting methods of HPV detection. Primary outcomes were sensitivity and specificity of HPV detection. In total, 27 studies were included (n = 5488, 41.6% HPV+). Thirteen studies evaluated HPV detection in tumour tissue, nine studies examined HPV detection in blood samples and five studies evaluated HPV detection in oral samples. Accuracy of HPV detection in tumour tissue was high for all detection methods, with pooled sensitivity ranging from 81.1% (95% CI 71.9–87.8) to 93.1% (95% CI 87.4–96.4) and specificity ranging from 81.1% (95% CI 71.9–87.8) to 94.9% (95% CI 79.1–98.9) depending on detection methods. HPV detection in blood samples showed an overall sensitivity of 81.4% (95% CI 62.9–91.9) and a specificity of 94.8% (95% CI 91.4–96.9). In oral samples, pooled sensitivity and specificity were lower (77.0% (95% CI 68.8–83.6) and 74.0% (95% CI 58.0–85.4), respectively). In conclusion, we found an overall high accuracy for HPV detection in tumour tissue regardless of the HPV detection method used. HPV detection in blood samples may provide a promising new way of HPV detection.


2021 ◽  
Vol 72 ◽  
pp. 849-899
Author(s):  
Cynthia Freeman ◽  
Jonathan Merriman ◽  
Ian Beaver ◽  
Abdullah Mueen

The existence of an anomaly detection method that is optimal for all domains is a myth. Thus, there exists a plethora of anomaly detection methods which increases every year for a wide variety of domains. But a strength can also be a weakness; given this massive library of methods, how can one select the best method for their application? Current literature is focused on creating new anomaly detection methods or large frameworks for experimenting with multiple methods at the same time. However, and especially as the literature continues to expand, an extensive evaluation of every anomaly detection method is simply not feasible. To reduce this evaluation burden, we present guidelines to intelligently choose the optimal anomaly detection methods based on the characteristics the time series displays such as seasonality, trend, level change concept drift, and missing time steps. We provide a comprehensive experimental validation and survey of twelve anomaly detection methods over different time series characteristics to form guidelines based on several metrics: the AUC (Area Under the Curve), windowed F-score, and Numenta Anomaly Benchmark (NAB) scoring model. Applying our methodologies can save time and effort by surfacing the most promising anomaly detection methods instead of experimenting extensively with a rapidly expanding library of anomaly detection methods, especially in an online setting.


2018 ◽  
Vol 11 (3) ◽  
pp. 1599-1614 ◽  
Author(s):  
Abdelhadi El Yazidi ◽  
Michel Ramonet ◽  
Philippe Ciais ◽  
Gregoire Broquet ◽  
Isabelle Pison ◽  
...  

Abstract. This study deals with the problem of identifying atmospheric data influenced by local emissions that can result in spikes in time series of greenhouse gases and long-lived tracer measurements. We considered three spike detection methods known as coefficient of variation (COV), robust extraction of baseline signal (REBS) and standard deviation of the background (SD) to detect and filter positive spikes in continuous greenhouse gas time series from four monitoring stations representative of the European ICOS (Integrated Carbon Observation System) Research Infrastructure network. The results of the different methods are compared to each other and against a manual detection performed by station managers. Four stations were selected as test cases to apply the spike detection methods: a continental rural tower of 100 m height in eastern France (OPE), a high-mountain observatory in the south-west of France (PDM), a regional marine background site in Crete (FKL) and a marine clean-air background site in the Southern Hemisphere on Amsterdam Island (AMS). This selection allows us to address spike detection problems in time series with different variability. Two years of continuous measurements of CO2, CH4 and CO were analysed. All methods were found to be able to detect short-term spikes (lasting from a few seconds to a few minutes) in the time series. Analysis of the results of each method leads us to exclude the COV method due to the requirement to arbitrarily specify an a priori percentage of rejected data in the time series, which may over- or underestimate the actual number of spikes. The two other methods freely determine the number of spikes for a given set of parameters, and the values of these parameters were calibrated to provide the best match with spikes known to reflect local emissions episodes that are well documented by the station managers. 
More than 96 % of the spikes manually identified by station managers were successfully detected by both the SD and the REBS methods after the best adjustment of parameter values. At PDM, measurements made by two analyzers located 200 m from each other allow us to confirm that the CH4 spikes identified in one of the time series but not in the other correspond to a local source from a sewage treatment facility in one of the observatory buildings. From this experiment, we also found that the REBS method underestimates the number of positive anomalies in the CH4 data caused by local sewage emissions. In conclusion, we recommend the use of the SD method, which also appears to be the easiest one to implement in automatic data processing, for the operational filtering of spikes in greenhouse gas time series at global and regional monitoring stations of networks such as the ICOS atmosphere network.
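The general idea behind a "standard deviation of the background" filter can be sketched as follows. This is a simplified illustration, not the algorithm from the paper, whose exact definition is not given in the abstract. The function name, the use of the median as baseline, the choice of values at or below the median as "background", and the `alpha` parameter are all assumptions.

```python
import statistics

def flag_positive_spikes(values, alpha=3.0):
    """Flag indices whose value exceeds baseline + alpha * background std.

    Assumptions (not from the paper): the baseline is the window median and
    the background is every value at or below it, so that positive spikes
    do not inflate the spread estimate.
    """
    baseline = statistics.median(values)
    background = [v for v in values if v <= baseline]
    sigma = statistics.pstdev(background)
    threshold = baseline + alpha * sigma
    return [i for i, v in enumerate(values) if v > threshold]

# Example: a short CO2-like window (ppm) with one contaminated point.
window = [400.0, 400.2, 399.8, 400.1, 399.9, 400.0, 405.0, 400.1, 399.9, 400.0]
print(flag_positive_spikes(window))
```

Unlike a fixed-percentage rule such as the COV method described above, a threshold of this kind lets the number of flagged points vary with the data, with `alpha` left as the parameter to calibrate against manually identified spikes.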



