Bagplots, Boxplots and Outlier Detection for Functional Data

Author(s):  
Rob Hyndman ◽  
Han Lin Shang
2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Laura Millán-Roures ◽  
Irene Epifanio ◽  
Vicente Martínez

A functional data analysis (FDA) based methodology for detecting anomalous flows in urban water networks is introduced. Primary hydraulic variables are recorded in real time by telecontrol systems, so they are functional data (FD). In the first stage, the data are validated (false data are detected) and reconstructed, since they may contain missing and noisy values as well as false ones. FDA tools such as tolerance bands for FD and smoothing for dense and sparse FD are used. In the second stage, functional outlier detection tools are used in two phases. In Phase I, the data are cleared of anomalies to ensure that they are representative of the in-control system. The objective of Phase II is system monitoring. A new functional outlier detection method based on archetypal analysis is also proposed. The methodology is applied to and illustrated with real data. A simulation study is also carried out to assess the performance of the outlier detection techniques, including our proposal. The results are very promising.
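As a rough illustration of the validation idea described above, the following minimal sketch builds a pointwise tolerance band from the median and the scaled median absolute deviation (MAD) at each time index, then flags curves that spend too long outside it. It is not the authors' method: the function names, the factor `k`, the `max_fraction_outside` threshold, and the flow profiles are all assumptions made for illustration.

```python
import math
from statistics import median


def pointwise_band(curves, k=3.0):
    """Return (lower, upper): median +/- k * scaled MAD at each time index.

    Assumes all curves are sampled on the same grid. If the MAD is zero at
    some index, the band collapses to the median there.
    """
    lower, upper = [], []
    for t in range(len(curves[0])):
        values = [c[t] for c in curves]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        spread = 1.4826 * mad  # makes the MAD comparable to a standard deviation
        lower.append(med - k * spread)
        upper.append(med + k * spread)
    return lower, upper


def flag_curves(curves, lower, upper, max_fraction_outside=0.1):
    """Return indices of curves with too many points outside the band."""
    flagged = []
    for i, c in enumerate(curves):
        outside = sum(1 for t, v in enumerate(c) if v < lower[t] or v > upper[t])
        if outside / len(c) > max_fraction_outside:
            flagged.append(i)
    return flagged


# Hypothetical data: five similar daily flow profiles plus one with a burst
# between hours 10 and 15.
base = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(24)]
curves = [[b + off for b in base] for off in (-0.4, -0.2, 0.0, 0.2, 0.4)]
curves.append([b + (8.0 if 10 <= t < 16 else 0.0) for t, b in enumerate(base)])

lower, upper = pointwise_band(curves)
flagged = flag_curves(curves, lower, upper)
print(flagged)  # [5]
```

The median and MAD are used here instead of the mean and standard deviation so that the anomalous curve does not widen the band it is tested against.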


Stats ◽  
2021 ◽  
Vol 4 (4) ◽  
pp. 971-1011
Author(s):  
Moritz Herrmann ◽  
Fabian Scheipl

We consider functional outlier detection from a geometric perspective, specifically for functional datasets drawn from a functional manifold, which is defined by the data’s modes of variation in shape, translation, and phase. Based on this manifold, we developed a conceptualization of functional outlier detection that is more widely applicable and realistic than previously proposed taxonomies. Our theoretical and experimental analyses demonstrated several important advantages of this perspective: it considerably improves theoretical understanding and allows describing and analyzing complex functional outlier scenarios consistently and in full generality, by differentiating between structurally anomalous outlier data that are off-manifold and distributionally outlying data that are on-manifold, but at its margins. This improves the practical feasibility of functional outlier detection: we show that simple manifold-learning methods can be used to reliably infer and visualize the geometric structure of functional datasets. We also show that standard outlier-detection methods requiring tabular data inputs can be applied to functional data very successfully by simply using their vector-valued representations learned from manifold learning methods as the input features. Our experiments on synthetic and real datasets demonstrated that this approach leads to outlier detection performance at least on par with existing functional-data-specific methods in a large variety of settings, without the highly specialized, complex methodology and narrow domain of application these methods often entail.
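The distinction between on-manifold and off-manifold curves can be made concrete with a small sketch. It deliberately skips the manifold-learning embedding the abstract describes and instead scores each discretized curve by its mean L2 distance to its k nearest neighbours, a simple distance-based stand-in. The function name, the choice `k=3`, and the synthetic curves are assumptions for illustration only.

```python
import math


def knn_outlier_scores(curves, k=3):
    """Score each curve by its mean Euclidean distance to its k nearest curves.

    Curves are assumed to be sampled on a common grid, so each one can be
    treated as a plain vector.
    """
    scores = []
    for i, ci in enumerate(curves):
        dists = sorted(
            math.dist(ci, cj) for j, cj in enumerate(curves) if j != i
        )
        scores.append(sum(dists[:k]) / k)
    return scores


# Hypothetical data: ten phase-shifted sine curves (a one-parameter "phase"
# manifold) plus one curve with a different shape that lies off that manifold.
n = 50
grid = [2 * math.pi * i / n for i in range(n)]
curves = [[math.sin(t + 0.1 * j) for t in grid] for j in range(10)]
curves.append([0.5 * math.sin(3 * t) for t in grid])

scores = knn_outlier_scores(curves, k=3)
most_outlying = scores.index(max(scores))
print(most_outlying)  # 10
```

In the approach summarized above, the raw vectors would be replaced by low-dimensional coordinates learned from the data before a tabular outlier detector is applied. This sketch only shows that a structurally different curve sits far from all of its neighbours.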


2021 ◽  
pp. 251-266
Author(s):  
Christopher Rieser ◽  
Peter Filzmoser

With accurate data, governments can make the most informed decisions to keep people safer through pandemics such as the COVID-19 coronavirus. In such events, data reliability is crucial and therefore outlier detection is an important and even unavoidable issue. Outliers are often considered the most interesting observations, because the fact that they differ from the data majority may lead to relevant findings in the subject area. Outlier detection has also been addressed in the context of multivariate functional data, that is, smooth functions of several characteristics, often derived from measurements at different time points (Hubert et al. in Stat Methods Appl 24(2):177–202, 2015b). Here the underlying data are regarded as compositions, with the compositional parts forming the multivariate information, and thus only relative information in terms of log-ratios between these parts is considered relevant for the analysis. The multivariate functional data thus have to be derived as smooth functions by utilising this relative information. Subsequently, already established multivariate functional outlier detection procedures can be used, but for interpretation purposes, the functional data need to be presented in an appropriate space. The methodology is illustrated with publicly available data around the COVID-19 pandemic to find countries displaying outlying trends.
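To illustrate what "only relative information in terms of log-ratios" means in practice, here is a minimal sketch of the centred log-ratio (clr) transform, one common log-ratio representation of compositional data. The abstract does not say which log-ratio coordinates the authors use, so clr is an assumption here, as are the function name and the example values.

```python
import math


def clr(parts):
    """Centred log-ratio transform of a composition with strictly positive parts.

    Each part is replaced by its log minus the mean of all logs, so the result
    depends only on ratios between parts and sums to zero.
    """
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical composition expressed once as proportions and once as counts.
as_proportions = clr([0.2, 0.3, 0.5])
as_counts = clr([20, 30, 50])

# Rescaling all parts by the same factor leaves the clr coordinates unchanged.
same = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(as_proportions, as_counts)
)
print(same)  # True
```

Applied at each time point, a transform of this kind yields multivariate curves that carry only the relative information, which can then be smoothed and passed to a multivariate functional outlier detection procedure.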


2020 ◽  
Vol 198 ◽  
pp. 105960
Author(s):  
Clément Lejeune ◽  
Josiane Mothe ◽  
Adil Soubki ◽  
Olivier Teste

2020 ◽  
Vol 10 (3) ◽  
pp. 881
Author(s):  
Myeong-Hun Jeong ◽  
Seung-Bae Jeon ◽  
Tae-Young Lee ◽  
Min Kyo Youm ◽  
Dong-Ha Lee

This study provides an automatic shipping-route construction method using functional data analysis (FDA), which analyzes information about curves, such as multiple data points over time. The proposed approach includes two steps: outlier detection and shipping-route construction. This study uses automatic-identification system (AIS) data for the experiments. The effectiveness of the proposed method is demonstrated through case studies, wherein our approach is compared with the Mahalanobis distance method for trajectory-outlier detection, and the performance of vessel trajectory reconstruction is compared with that of a density-based approach. The proposed method improves understanding of vessel-movement dynamics, thereby improving maritime monitoring and security.
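The Mahalanobis distance baseline mentioned above can be sketched for two summary features per trajectory. This is not the study's implementation: the two features, their values, and the function name are hypothetical, and the abstract does not state which features the comparison uses.

```python
import math


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the sample mean.

    Uses the sample covariance (n - 1 denominator) and the closed-form inverse
    of a 2x2 matrix. Assumes the covariance matrix is non-singular.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(d2))
    return distances


# Hypothetical per-trajectory features: (mean speed, mean course deviation).
features = [
    (10.0, 1.0), (11.0, 2.0), (12.0, 1.5), (10.5, 1.2),
    (11.5, 1.8), (11.0, 1.4), (20.0, 9.0),
]
distances = mahalanobis_2d(features)
most_distant = distances.index(max(distances))
print(most_distant)  # 6
```

Because the outlying point is included in the mean and covariance estimates, it pulls both toward itself and its distance is damped. This masking effect is one reason alternatives to the plain Mahalanobis distance are considered for trajectory data.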

