Recurring concept memory management in data streams: exploiting data stream concept evolution to improve performance and transparency

Author(s):  
Ben Halstead ◽  
Yun Sing Koh ◽  
Patricia Riddle ◽  
Russel Pears ◽  
Mykola Pechenizkiy ◽  
...  
Author(s):  
Mohammad G. Dezfuli ◽  
Mostafa S. Haghjoo

The inherent imprecision of data in many applications motivates us to support uncertainty as a first-class concept. Data streams and probabilistic data have recently received considerable attention, but largely in isolation. However, many applications, including sensor data management systems and object monitoring systems, need both in tandem. Our main contribution is the design of a probabilistic data stream management system, called Sarcheshmeh, for continuous querying over probabilistic data streams. Sarcheshmeh supports uncertainty from input data to final query results. In this paper, after reviewing the requirements and applications of probabilistic data streams, we present our new data model for probabilistic data streams and formally define our main logical operators. We then present our query language and physical operators. In addition, we introduce the architecture of Sarcheshmeh and describe major challenges, such as memory management and our floating precision mechanism, toward designing a more robust system. Finally, we report an evaluation of our system and the effect of floating precision on the tradeoff between accuracy and efficiency.
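
A minimal sketch of the kind of mechanism the abstract calls "floating precision", under assumed details: each tuple carries a discrete distribution over alternative values, and a precision threshold prunes unlikely alternatives to trade accuracy for efficiency. The tuple format and pruning rule are illustrative assumptions, not Sarcheshmeh's actual design.

```python
from dataclasses import dataclass

@dataclass
class ProbabilisticTuple:
    """A stream tuple whose value is a discrete distribution over alternatives."""
    alternatives: dict  # value -> probability, summing to 1

def prune_precision(t: ProbabilisticTuple, epsilon: float) -> ProbabilisticTuple:
    """Drop alternatives with probability below epsilon and renormalize.

    A coarser epsilon shrinks per-tuple state (cheaper to process) at the
    cost of accuracy, which is the tradeoff the abstract describes.
    """
    kept = {v: p for v, p in t.alternatives.items() if p >= epsilon}
    if not kept:  # if everything was pruned, keep the most likely alternative
        v, p = max(t.alternatives.items(), key=lambda vp: vp[1])
        kept = {v: p}
    total = sum(kept.values())
    return ProbabilisticTuple({v: p / total for v, p in kept.items()})

# Example: an uncertain sensor reading with three alternative values
reading = ProbabilisticTuple({20.1: 0.60, 20.5: 0.35, 31.0: 0.05})
print(prune_precision(reading, epsilon=0.10).alternatives)
```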


2020 ◽  
Vol 2 (1) ◽  
pp. 26-37
Author(s):  
Dr. Pasumponpandian

The rapid progress of the Internet of Things (IoT), together with the simultaneous development of network technologies and processing capabilities, has paved the way for decentralized systems that rely on cloud services. Although these decentralized systems are founded on the cloud, complexities still prevail in transferring all the information sensed by IoT devices to the cloud. This is because certain applications gather huge streams of information yet expect a timely response, with minimal delay, low computing energy, and enhanced reliability. This kind of decentralization has led to the development of a middle layer between the cloud and the IoT, termed the edge layer, which brings cloud services down to the user's edge. This paper analyzes data stream processing in the edge layer, taking into account the complexities involved in computing IoT data streams there, and puts forth real-time analytics in the edge layer that examine IoT data streams to offer data-driven insight for a parking system in smart cities.
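
As a concrete illustration of the edge layer's role, the sketch below reduces a raw stream of parking-sensor events to compact per-window occupancy summaries, so only aggregates travel to the cloud. The (spot_id, occupied) event format and the window size are hypothetical, not taken from the paper.

```python
import random

def edge_summarize(events, window=100):
    """Aggregate raw parking events at the edge: keep only the latest status
    per spot and emit one compact summary per window, so the cloud receives
    aggregates rather than every sensor reading."""
    latest = {}
    for i, (spot_id, occupied) in enumerate(events, 1):
        latest[spot_id] = occupied
        if i % window == 0:
            free = sum(1 for occ in latest.values() if not occ)
            yield {"spots_seen": len(latest), "free": free}

# Example: 300 simulated sensor events become just 3 upstream summaries
events = ((f"spot-{random.randrange(50)}", random.random() < 0.7) for _ in range(300))
for summary in edge_summarize(events, window=100):
    print(summary)
```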


Author(s):  
Prasanna Lakshmi Kompalli

Data coming from different sources is referred to as a data stream. Data stream mining is an online learning technique in which each data point must be processed as it arrives and discarded once processing is complete. Advances in technology have made it possible to monitor these data streams in real time, which has created many new challenges for researchers. The main features of this type of data are that it is fast flowing, large in volume, continuous and growing in nature, and that its characteristics may change over time, a phenomenon termed concept drift. This chapter addresses the problems in mining data streams with concept drift. Because the relevant literature is scattered, isolating it can be a grueling task for researchers and practitioners; this chapter tries to provide a solution by amalgamating the techniques used for data stream mining under concept drift.
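
Drift detection is a recurring ingredient of the techniques such a chapter surveys. The sketch below is a simplified, DDM-style heuristic, not any specific method from the chapter: it flags drift when an online learner's recent error rate rises well above its long-run error rate.

```python
from collections import deque

class WindowDriftDetector:
    """Flag concept drift when the recent error rate rises well above the
    long-run error rate (a simplified, DDM-style heuristic)."""

    def __init__(self, window=100, threshold=0.15):
        self.recent = deque(maxlen=window)  # last `window` 0/1 error flags
        self.errors = 0                     # errors since the beginning
        self.n = 0                          # examples since the beginning
        self.threshold = threshold

    def update(self, correct: bool) -> bool:
        """Record one prediction outcome; return True if drift is suspected."""
        err = 0 if correct else 1
        self.recent.append(err)
        self.errors += err
        self.n += 1
        long_run = self.errors / self.n
        recent = sum(self.recent) / len(self.recent)
        return recent - long_run > self.threshold

# Typical use: feed it whether the online model predicted each label correctly;
# on True, retrain or replace the model.
```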


Author(s):  
Rodrigo Salvador Monteiro ◽  
Geraldo Zimbrão ◽  
Holger Schwarz ◽  
Bernhard Mitschang ◽  
Jano Moreira de Souza

Calendar-based pattern mining aims at identifying patterns on specific calendar partitions. Potential calendar partitions are, for example: every Monday, every first working day of each month, every holiday. Providing flexible mining capabilities for calendar-based partitions is especially challenging in a data stream scenario: the calendar partitions of interest are not known a priori, and at each point in time only a subset of the detailed data is available. The authors show how a data warehouse approach can be applied to this problem. The data warehouse that keeps track of frequent itemsets holding on different partitions of the original stream has low storage requirements; nevertheless, it allows complete and precise sets of patterns to be derived. Furthermore, the authors demonstrate the effectiveness of their approach by a series of experiments.
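
A rough sketch of the general idea under assumed details: each transaction timestamp is mapped to the calendar partitions it belongs to, and a small warehouse keeps per-partition itemset counts from which frequent itemsets on any partition can later be derived. The partition rules and the two-item cap are illustrative, not the authors' scheme.

```python
from collections import Counter, defaultdict
from datetime import datetime
from itertools import combinations

def calendar_partitions(ts: datetime):
    """Map a transaction timestamp to the calendar partitions it belongs to."""
    parts = [f"weekday:{ts.strftime('%A')}", f"month:{ts.month:02d}"]
    if ts.weekday() < 5 and ts.day <= 3:  # crude proxy for 'first working day'
        parts.append("first-working-days")
    return parts

warehouse = defaultdict(Counter)  # partition -> itemset -> frequency

def record_transaction(ts: datetime, items: set, max_size: int = 2):
    """Update itemset counts on every partition the transaction falls into."""
    for part in calendar_partitions(ts):
        for k in range(1, max_size + 1):
            for itemset in combinations(sorted(items), k):
                warehouse[part][itemset] += 1

record_transaction(datetime(2024, 3, 4), {"bread", "milk"})  # a Monday
print(warehouse["weekday:Monday"].most_common(3))
```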


2013 ◽  
Vol 284-287 ◽  
pp. 3507-3511 ◽  
Author(s):  
Edgar Chia Han Lin

Due to the great progress of computer technology and the mature development of networks, more and more data are generated and distributed through networks in the form of data streams. Over the last couple of years, a number of researchers have turned their attention to data stream management, which differs from conventional database management. The resulting type of data management system, called a data stream management system (DSMS), has become one of the most popular research areas in the data engineering field, and many research projects have made great progress in this area. Since current DSMSs do not support queries on sequence data, this project studies issues related to two types of data. First, we focus on content filtering over single-attribute streams, such as sensor data. Second, we focus on multi-attribute streams, such as video films. We discuss related issues such as how to build an efficient index for all queries over different streams and the corresponding query processing mechanisms.
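
For the single-attribute case, content filtering amounts to matching each arriving value against many standing range queries at once. Below is a minimal sketch of such a query index, with a hypothetical API rather than the project's actual design: query intervals are kept sorted by lower bound, so an arriving value is only checked against queries whose lower bound it reaches.

```python
import bisect

class RangeQueryIndex:
    """Index standing range queries over a single-attribute stream so each
    arriving value is matched against them without a full linear scan."""

    def __init__(self):
        self.lows, self.highs, self.ids = [], [], []  # kept sorted by low

    def register(self, qid, low, high):
        i = bisect.bisect(self.lows, low)
        self.lows.insert(i, low)
        self.highs.insert(i, high)
        self.ids.insert(i, qid)

    def match(self, value):
        # only queries with low <= value can match; they form a prefix
        end = bisect.bisect_right(self.lows, value)
        return [self.ids[i] for i in range(end) if value <= self.highs[i]]

idx = RangeQueryIndex()
idx.register("overheat", 30.0, 100.0)
idx.register("comfort", 18.0, 24.0)
print(idx.match(21.5))  # ['comfort']
```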


2012 ◽  
Vol 433-440 ◽  
pp. 4457-4462 ◽  
Author(s):  
Jun Shan Tan ◽  
Zhu Fang Kuang ◽  
Guo Gui Yang

The design of a synopsis structure is an important issue in frequent pattern mining over data streams. This paper proposes a data stream synopsis structure, FPD-Graph, based on a directed graph. The FPD-Graph contains list head nodes (FPDG-Head) and list nodes (FPDG-Node), and its operations consist of insertion and deletion. The paper also proposes DGFPM, a frequent pattern mining algorithm based on a sliding window over the data stream. Synthetic customer shopping data produced by the IBM data generator is adopted as the experimental data. The DGFPM algorithm not only achieves high precision in mining frequent patterns but also has low processing time.
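
The sketch below mirrors the two FPD-Graph operations at a high level over a sliding window: itemset counts are incremented on insertion and decremented when a transaction expires. It uses a plain counter instead of the paper's directed-graph structure and caps itemset size for brevity.

```python
from collections import Counter, deque
from itertools import combinations

class SlidingWindowFPM:
    """Sliding-window frequent-pattern counting with explicit insert and
    delete operations (a counter-based stand-in for the FPD-Graph synopsis)."""

    def __init__(self, window_size, max_size=2):
        self.window = deque()
        self.counts = Counter()
        self.window_size = window_size
        self.max_size = max_size  # largest itemset size tracked

    def _itemsets(self, txn):
        for k in range(1, self.max_size + 1):
            yield from combinations(sorted(txn), k)

    def insert(self, txn):
        self.window.append(txn)
        for s in self._itemsets(txn):
            self.counts[s] += 1
        if len(self.window) > self.window_size:
            self.delete(self.window.popleft())

    def delete(self, txn):
        for s in self._itemsets(txn):
            self.counts[s] -= 1
            if self.counts[s] == 0:
                del self.counts[s]

    def frequent(self, min_support):
        return {s: c for s, c in self.counts.items() if c >= min_support}

fpm = SlidingWindowFPM(window_size=3)
for txn in [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}]:
    fpm.insert(txn)
print(fpm.frequent(min_support=2))  # singletons a, b, c each appear twice
```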


Author(s):  
Ronald Stevens ◽  
Trysha Galloway ◽  
Ann Willemson-Dunlap

The information within the neurodynamic data streams of teams engaged in naturalistic decision making was separated into information unique to each team member, information shared by two or more team members, and team-specific information related to interactions with the task and other team members. Most of the team information consisted of the information contained in an individual's neurodynamic data stream. The information in an individual's data stream that was shared with another team member was highly variable, ranging from 1% to 60% of the total information in the other person's data stream. From the shared, individual, and team information it becomes possible to assign quantitative values both to the neurodynamics of each team member during the task and to the interactions among the members of the team.
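
The pairwise "shared information" between two team members' data streams can be read as the mutual information of their discretized signals. The sketch below estimates it from empirical joint counts; it covers only this pairwise term, not the full unique/shared/team-specific decomposition the study works with.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Estimate the information (in bits) that one discretized data stream
    shares with another, from their empirical joint distribution."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        pxy = c / n
        mi += pxy * math.log2(pxy / ((px[x] / n) * (py[y] / n)))
    return mi

# Two symbolized neurodynamic streams; higher values mean more shared information
xs = [0, 0, 1, 1, 0, 1, 0, 1]
ys = [0, 0, 1, 1, 0, 1, 1, 0]
print(f"{mutual_information(xs, ys):.3f} bits shared")
```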


2020 ◽  
Vol 8 (4) ◽  
pp. 63-73
Author(s):  
Sikha Bagui ◽  
Katie Jin

This survey performs a thorough enumeration and analysis of existing methods for data stream processing, framed around the challenges facing streaming data: preprocessing of streaming data, detecting and dealing with concept drift, data reduction in the face of data streams, and approximate queries and blocking operations over streaming data.
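
As one concrete instance of the data-reduction challenge the survey covers, reservoir sampling keeps a fixed-size uniform random sample of an unbounded stream. A minimal sketch of the classic Algorithm R, not tied to any particular method in the survey:

```python
import random

def reservoir_sample(stream, k, seed=None):
    """Maintain a uniform random sample of k items from a stream of unknown
    and possibly unbounded length (classic Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)    # fill the reservoir first
        else:
            j = rng.randint(0, i)  # keep item with probability k / (i + 1)
            if j < k:
                sample[j] = item
    return sample

print(reservoir_sample(range(1_000_000), k=5, seed=42))
```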


2016 ◽  
Vol 2 (1) ◽  
pp. 39-52
Author(s):  
G. Holmes ◽  
B. Pfahringer ◽  
R. Kirkby

We present an architecture for data streams based on structures typically found in web cache hierarchies. The main idea is to build a meta-level analyser from a number of levels constructed over time from a data stream. We present the general architecture for such a system and an application to classification. This architecture is an instance of the general wrapper idea, allowing us to reuse standard batch learning algorithms in an inherently incremental learning environment. By artificially generating data sources, we demonstrate that a hierarchy containing a mixture of models is able to adapt over time to the source of the data. In these experiments the hierarchies use an elementary performance-based replacement policy and unweighted voting for making classification decisions.
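
A minimal sketch of the wrapper idea under assumed interfaces: batch models are trained on successive chunks of the stream, re-scored on each fresh chunk, evicted worst-first when the hierarchy is full, and combined by unweighted voting. The train_fn hook and the scikit-learn-style predict call are assumptions, not the authors' implementation.

```python
from collections import Counter

class ModelHierarchy:
    """A cache-like hierarchy of batch-trained models with a performance-based
    replacement policy and unweighted voting for classification."""

    def __init__(self, capacity, train_fn):
        self.capacity = capacity
        self.train_fn = train_fn  # batch learner: (X, y) -> fitted model
        self.models = []          # entries of [model, latest accuracy]

    def add_chunk(self, X, y):
        # re-score existing models on the fresh chunk (performance-based policy)
        for entry in self.models:
            model = entry[0]
            entry[1] = sum(model.predict([x])[0] == t for x, t in zip(X, y)) / len(y)
        self.models.append([self.train_fn(X, y), 1.0])  # new model, untested score
        if len(self.models) > self.capacity:
            self.models.remove(min(self.models, key=lambda e: e[1]))  # evict worst

    def predict(self, x):
        # unweighted vote across every model currently in the hierarchy
        votes = Counter(entry[0].predict([x])[0] for entry in self.models)
        return votes.most_common(1)[0][0]
```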

