Visual analytics of anomaly detection in large data streams

Test data generated by ~60 accredited member laboratories of the American Association of Veterinary Laboratory Diagnosticians (AAVLD) are of exceptional quality. These data are captured by 1 of 13 laboratory information management systems (LIMSs) developed specifically for veterinary diagnostic laboratories (VDLs). Beginning in ~2000, the National Animal Health Laboratory Network (NAHLN) developed an electronic messaging system for LIMSs to automatically send standardized data streams for 14 select agents to a national repository. This messaging enables the U.S. Department of Agriculture to track and respond to high-consequence animal disease outbreaks such as highly pathogenic avian influenza. Because of the lack of standardized data collection in the LIMSs used at VDLs, there is, to date, no means of summarizing VDL large data streams for multi-state and national animal health studies or for providing near-real-time tracking for hundreds of other important animal diseases in the United States that are detected routinely by VDLs. Further, VDLs are the only state and federal resources that can provide early detection and identification of endemic and emerging zoonotic diseases. Zoonotic diseases are estimated to be responsible for 2.5 billion cases of human illness and 2.7 million deaths worldwide every year. The economic and health impact of the SARS-CoV-2 pandemic is self-evident. We review here the history and progress of data management in VDLs and discuss ways of seizing unexplored opportunities to advance data leveraging to better serve animal health, public health, and One Health.

Analysis of the Visual Analytics Techniques for Monitoring Heterogeneous Data Streams

2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) ◽

10.1109/eiconrus49466.2020.9039020 ◽

2020 ◽

Author(s):

Mikhail P. Bestuzhev ◽

Evgenia S. Novikova ◽

Yana A. Bekeneva

Keyword(s):

Data Streams ◽

Visual Analytics ◽

Heterogeneous Data

A KNOWLEDGE-BASED METHOD FOR GENERATING SUMMARIES OF SPATIAL MOVEMENT IN GEOGRAPHIC AREAS

International Journal of Artificial Intelligence Tools ◽

10.1142/s021821301000025x ◽

2010 ◽

Vol 19 (04) ◽

pp. 393-415 ◽

Cited By ~ 5

Author(s):

MARTIN MOLINA ◽

AMANDA STENT

Keyword(s):

Mobile Phones ◽

Data Streams ◽

Large Data ◽

Gps Data ◽

Knowledge Representations ◽

Knowledge Based ◽

Spatial Movement ◽

Present Evaluation ◽

Main Components

In this article we describe a method for automatically generating text summaries of data corresponding to traces of spatial movement in geographical areas. The method can help humans to understand large data streams, such as the amounts of GPS data recorded by a variety of sensors in mobile phones, cars, etc. We describe the knowledge representations we designed for our method and the main components of our method for generating the summaries: a discourse planner, an abstraction module and a text generator. We also present evaluation results that show the ability of our method to generate certain types of geospatial and temporal descriptions.

Optimized Stage Processing for Anomaly Detection on Numerical Data Streams

Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems - DEBS '17 ◽

10.1145/3093742.3095101 ◽

2017 ◽

Cited By ~ 1

Author(s):

Ciprian Amariei ◽

Paul Diac ◽

Emanuel Onica

Keyword(s):

Anomaly Detection ◽

Data Streams ◽

Numerical Data

Multivariate anomaly detection for Earth observations: a comparison of algorithms and feature extraction techniques

Earth System Dynamics ◽

10.5194/esd-8-677-2017 ◽

2017 ◽

Vol 8 (3) ◽

pp. 677-696 ◽

Cited By ~ 12

Author(s):

Milan Flach ◽

Fabian Gans ◽

Alexander Brenning ◽

Joachim Denzler ◽

Markus Reichstein ◽

...

Keyword(s):

Feature Extraction ◽

Anomaly Detection ◽

Data Streams ◽

Multivariate Data ◽

Earth System ◽

Earth System Science ◽

Sample Mean ◽

System Science ◽

Detection Algorithms ◽

Earth Observations

Abstract. Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of extreme climatic events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations like sudden changes in basic characteristics of time series such as the sample mean, the variance, changes in the cycle amplitude, and trends. This artificial experiment is needed as there is no gold standard for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g., subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify three detection algorithms (k-nearest neighbors mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme-event detection methods. Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.
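
The abstract above names a feature extraction step (subtracting seasonal cycles) and a detection algorithm (k-nearest neighbors mean distance) but gives no implementation. The following is a minimal sketch of how the two might be combined, not the authors' code. The synthetic two-variable stream, the period of 12, the choice k=5, and the function names `subtract_seasonal_cycle` and `knn_mean_distance` are all assumptions made for illustration.

```python
import math
import random

def subtract_seasonal_cycle(series, period):
    """Subtract from each value the mean of all values sharing its phase
    (index modulo period). Hypothetical helper for illustration."""
    sums = [0.0] * period
    counts = [0] * period
    for i, x in enumerate(series):
        sums[i % period] += x
        counts[i % period] += 1
    means = [s / c for s, c in zip(sums, counts)]
    return [x - means[i % period] for i, x in enumerate(series)]

def knn_mean_distance(points, k):
    """Score each point by the mean Euclidean distance to its k nearest
    neighbours (brute force, O(n^2)); larger scores are more anomalous."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

# Made-up data: two variables sharing a 12-step seasonal cycle plus noise.
random.seed(0)
period, n = 12, 240
season = [math.sin(2 * math.pi * t / period) for t in range(n)]
var_a = [3.0 * s + random.gauss(0, 0.1) for s in season]
var_b = [2.0 * s + random.gauss(0, 0.1) for s in season]
var_a[100] += 1.5  # injected anomaly at time step 100
var_b[100] -= 1.5

# Feature extraction: remove the seasonal cycle from each variable.
resid_a = subtract_seasonal_cycle(var_a, period)
resid_b = subtract_seasonal_cycle(var_b, period)

# Detection: score the deseasonalized multivariate points.
scores = knn_mean_distance(list(zip(resid_a, resid_b)), k=5)
most_anomalous = max(range(n), key=scores.__getitem__)
print(most_anomalous)
```

In this sketch the highest score falls on the injected time step. A real workflow would need a threshold or ranking rule for the scores and a faster neighbour search for long streams.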

Multivariate Anomaly Detection for Earth Observations: A Comparison of Algorithms and Feature Extraction Techniques

10.5194/esd-2016-51 ◽

2016 ◽

Cited By ~ 1

Author(s):

Milan Flach ◽

Fabian Gans ◽

Alexander Brenning ◽

Joachim Denzler ◽

Markus Reichstein ◽

...

Keyword(s):

Feature Extraction ◽

Anomaly Detection ◽

Data Streams ◽

Multivariate Data ◽

Detection Methods ◽

Earth System ◽

Earth System Science ◽

System Science ◽

Detection Algorithms ◽

Earth Observations

Abstract. Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of, e.g., vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of climatic extreme events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations. This artificial experiment is needed as there is no 'gold standard' for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g. subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify 3 detection algorithms (k-nearest neighbours mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme event detection methods. Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.

Solution Pattern for Anomaly Detection in Financial Data Streams

Communications in Computer and Information Science - New Trends in Databases and Information Systems ◽

10.1007/978-3-030-30278-8_10 ◽

2019 ◽

pp. 77-84

Author(s):

Maciej Zakrzewicz ◽

Marek Wojciechowski ◽

Paweł Gławiński

Keyword(s):

Anomaly Detection ◽

Data Streams ◽

Financial Data

Survey on Feature Transformation Techniques for Data Streams

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/668 ◽

2020 ◽

Author(s):

Maroua Bahri ◽

Albert Bifet ◽

Silviu Maniu ◽

Heitor Murilo Gomes

Keyword(s):

Machine Learning ◽

Data Streams ◽

Large Data ◽

High Dimensional ◽

Feature Transformation ◽

Transformation Techniques ◽

Computational Costs ◽

The Past ◽

Fundamental Challenge ◽

Mining Algorithms

Mining high-dimensional data streams poses a fundamental challenge to machine learning as the presence of high numbers of attributes can remarkably degrade any mining task's performance. In the past several years, dimension reduction (DR) approaches have been successfully applied for different purposes (e.g., visualization). Due to their high computational costs and numerous passes over large data, these approaches pose a hindrance when processing infinite data streams that are potentially high-dimensional. The latter increases the resource usage of algorithms that could suffer from the curse of dimensionality. To cope with these issues, some techniques for incremental DR have been proposed. In this paper, we provide a survey on reduction approaches designed to handle data streams and highlight the key benefits of using these approaches for stream mining algorithms.
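
The abstract does not name specific techniques, so the sketch below picks one for illustration: Gaussian random projection, a data-independent reduction that needs only a single pass because the projection matrix is fixed before any instance arrives. It is an assumed example, not a method attributed to the survey. The dimensions, the seed values, and the names `make_projection`, `project`, and `stream` are hypothetical.

```python
import math
import random

def make_projection(d_in, d_out, seed=0):
    """Build a fixed d_out x d_in Gaussian random matrix scaled by
    1/sqrt(d_out). It is created once and never updated from the data."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(d_out)
    return [[rng.gauss(0, 1) * scale for _ in range(d_in)] for _ in range(d_out)]

def project(matrix, x):
    """Map one instance to the reduced space (one dot product per row)."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]

def stream(n, d, seed=1):
    """Stand-in for an unbounded source: yields one made-up instance at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        yield [rng.random() for _ in range(d)]

P = make_projection(d_in=1000, d_out=20)

# Each instance is reduced as it arrives; no second pass over the stream.
# The results are collected in a list here only so they can be inspected.
reduced = [project(P, x) for x in stream(n=5, d=1000)]
print(len(reduced), len(reduced[0]))
```

Because the matrix does not depend on the data, memory use stays constant regardless of stream length. Data-dependent reductions would instead need an incremental update rule.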

From anomaly detection to rumour detection using data streams of social platforms

Proceedings of the VLDB Endowment ◽

10.14778/3329772.3329778 ◽

2019 ◽

Vol 12 (9) ◽

pp. 1016-1029 ◽

Cited By ~ 5

Author(s):

Nguyen Thanh Tam ◽

Matthias Weidlich ◽

Bolong Zheng ◽

Hongzhi Yin ◽

Nguyen Quoc Viet Hung ◽

...

Keyword(s):

Anomaly Detection ◽

Data Streams ◽

Using Data
