Toward a Grid-Based Zero-Latency Data Warehousing Implementation for Continuous Data Streams Processing

2008 ◽  
pp. 755-786
Author(s):  
Tho Manh Nguyen ◽  
Peter Brezany ◽  
A. Min Tjoa ◽  
Edgar Weippl

Continuous data streams are information sources in which data arrives in high volume in unpredictable rapid bursts. Processing data streams is a challenging task due to (1) the problem of random access to fast and large data streams using present storage technologies and (2) the fact that exact answers from data streams are often too expensive to compute. A framework for building a Grid-based Zero-Latency Data Stream Warehouse (GZLDSWH), which overcomes the resource-limitation issues in data stream processing without resorting to approximation, is specified. The GZLDSWH is built upon a set of Open Grid Service Infrastructure (OGSI)-based services and Globus Toolkit 3 (GT3), with the capability of capturing and storing continuous data streams, performing analytical processing, and reacting autonomously in near real time to certain kinds of events based on a well-established knowledge base. The requirements of a GZLDSWH, its Grid-based conceptual architecture, and the operations of its services are described in this paper. Furthermore, several challenges and issues in building a GZLDSWH, such as the dynamic collaboration model between the Grid services, the analytical model, and the design and evaluation of the knowledge-base rules, are discussed and investigated.
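The knowledge-base-driven reaction described above (rules that fire autonomously on certain events) can be sketched roughly as follows. This is a minimal illustration, not the GZLDSWH service API; the `Rule` and `KnowledgeBase` names are assumptions for the sake of the example:

```python
# Hypothetical sketch of a rule base reacting to stream events in near real time.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]   # predicate over an incoming stream tuple
    action: Callable[[dict], str]       # reaction fired when the condition holds

class KnowledgeBase:
    def __init__(self):
        self.rules: list[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        self.rules.append(rule)

    def react(self, event: dict) -> list[str]:
        # Fire every rule whose condition matches the incoming event.
        return [r.action(event) for r in self.rules if r.condition(event)]

kb = KnowledgeBase()
kb.add_rule(Rule("high_temp",
                 lambda e: e.get("temp", 0) > 90,
                 lambda e: f"alert: temperature {e['temp']}"))

print(kb.react({"temp": 95}))   # the high_temp rule fires
print(kb.react({"temp": 20}))   # no rule fires
```

In a real deployment, the conditions and actions would be evaluated inside the Grid services as tuples are captured, rather than in a single process as here.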

2005 ◽  
Vol 1 (4) ◽  
pp. 22-55 ◽  
Author(s):  
Tho Manh Nguyen ◽  
Peter Brezany ◽  
A. Min Tjoa ◽  
Edgar Weippl

2009 ◽  
Vol 7 ◽  
pp. 133-137 ◽  
Author(s):  
A. Guntoro ◽  
M. Glesner

Abstract. Although DSP performance continues to increase, the sequential nature of a DSP's execution prevents it from performing high-speed processing on a continuous data stream. In this paper we discuss the hardware implementation of the amplitude and phase detector and the validation block on an FPGA. In contrast to the software implementation, which can only process a data stream of up to 1.5 MHz, the hardware approach is 225 times faster and introduces far less latency.
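For reference, amplitude and phase detection over I/Q sample pairs is typically computed as the magnitude and angle of the complex sample. The sketch below is a software baseline in that spirit, assuming an I/Q representation; the paper's actual detector may differ:

```python
import math

def amplitude_phase(i_samples, q_samples):
    """Per-sample amplitude and phase of an I/Q stream (software reference)."""
    amps = [math.hypot(i, q) for i, q in zip(i_samples, q_samples)]      # sqrt(I^2 + Q^2)
    phases = [math.atan2(q, i) for i, q in zip(i_samples, q_samples)]    # angle in radians
    return amps, phases

amps, phases = amplitude_phase([3.0, 0.0], [4.0, 1.0])
print(amps)    # [5.0, 1.0]
print(phases)  # [0.927..., 1.570...]
```

An FPGA implementation would replace the `sqrt`/`atan2` calls with pipelined CORDIC units, which is what makes per-cycle throughput (and the reported 225x speedup over the 1.5 MHz software limit) achievable.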


Author(s):  
MOHAMED MEDHAT GABER ◽  
PHILIP S. YU

Data stream mining has attracted considerable attention over the past few years owing to the significance of its applications. Streaming data often evolves over time, and capturing such changes can be used to detect an event or a phenomenon. Weather conditions, economic changes, and astronomical and scientific phenomena are among a wide range of such applications. Because of the high volume and speed of data streams, it is computationally hard to capture these changes from raw data in real time. In this paper, we propose a novel algorithm, termed STREAM-DETECT, that captures changes in the distribution and/or domain of a data stream using deviation in clustering results. STREAM-DETECT is followed by an offline classification process, CHANGE-CLASS, which associates the history of change characteristics with the observed event or phenomenon. Experimental results show the efficiency of the proposed framework in both detecting the changes and classification accuracy.
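The core idea, detecting distribution change via deviation in clustering results between windows, can be illustrated with a deliberately simplified one-dimensional sketch (a single "centroid" per window standing in for a full clustering; the thresholds and window size are illustrative, not those of STREAM-DETECT):

```python
def detect_changes(stream, window=5, threshold=1.0):
    """Flag a change when a window's centroid (here, the mean) deviates
    from the previous window's centroid by more than a threshold."""
    changes = []
    prev_centroid = None
    for start in range(0, len(stream) - window + 1, window):
        win = stream[start:start + window]
        centroid = sum(win) / len(win)
        if prev_centroid is not None and abs(centroid - prev_centroid) > threshold:
            changes.append(start)  # record where the distribution shifted
        prev_centroid = centroid
    return changes

data = [1.0] * 5 + [1.1] * 5 + [5.0] * 5   # small drift, then a real shift
print(detect_changes(data))  # [10] -- only the third window is flagged
```

A real deployment would compare full cluster sets (multiple centroids, weights) rather than a single mean, and feed the recorded change characteristics to an offline classifier as the paper's CHANGE-CLASS step does.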


Author(s):  
Aderonke B. Sakpere ◽  
Anne V. D. M. Kayem

Streaming data emerges from different electronic sources and needs to be processed in real time with minimal delay. Data streams can yield hidden and useful knowledge patterns when mined and analyzed. In spite of these benefits, the issue of privacy needs to be addressed before streaming data is released for mining and analysis purposes. Several techniques have emerged to address data privacy concerns. K-anonymity has received considerable attention over other privacy-preserving techniques because of its simplicity and efficiency in protecting data. Yet k-anonymity cannot be applied directly to continuous data (data streams) because of their transient nature. In this chapter, the authors discuss the challenges faced by k-anonymity algorithms in enforcing privacy on data streams and review existing privacy techniques for handling data streams.
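One common way stream anonymizers adapt k-anonymity is to buffer tuples until at least k share a generalized quasi-identifier, then release them together. The sketch below assumes a single quasi-identifier (`age`) generalized to a range; it is an illustration of the buffering idea, not any specific algorithm from the chapter:

```python
def release_k_anonymous(buffer, k=3):
    """Release buffered tuples only when at least k are available,
    generalizing the quasi-identifier (age) into a shared range."""
    if len(buffer) < k:
        return None  # keep buffering; releasing now would break k-anonymity
    lo = min(t["age"] for t in buffer)
    hi = max(t["age"] for t in buffer)
    return [{"age_range": f"{lo}-{hi}", "diagnosis": t["diagnosis"]}
            for t in buffer]

stream_buffer = [{"age": 23, "diagnosis": "flu"},
                 {"age": 31, "diagnosis": "cold"},
                 {"age": 27, "diagnosis": "flu"}]
print(release_k_anonymous(stream_buffer))  # all three share age_range "23-31"
```

The tension the chapter highlights is visible even here: buffering until k tuples arrive introduces delay, which conflicts with the real-time, transient nature of the stream.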


2018 ◽  
Vol 7 (3.12) ◽  
pp. 411
Author(s):  
P Chandrakanth ◽  
Anbarasi M.S

The problem of data privacy in streams has hitherto been viewed myopically by researchers. Research and experimentation are well established for static data, where privacy is predominantly achieved through perturbation approaches based on random data values; such approaches do not yield adequate results on large or high-dimensional data sets. By exploiting the autocorrelation of multivariate streams and their underlying structure, we identify the areas in which adding noise maximally preserves privacy, and does so in an irreversible manner. Drift checking and ensemble classifier building are the basic requirements for privacy-preserving data streams, which we demonstrate experimentally with the support of sensitivity analysis. In this paper we present the results of experimentation at all stages.
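The central idea, using autocorrelation structure to decide how to add noise, can be sketched in one dimension. The scaling rule below (gentler noise where the series is strongly autocorrelated, so the perturbation tracks the stream's structure) is an illustrative assumption, not the paper's exact scheme:

```python
import random

def lag1_autocorr(xs):
    """Lag-1 autocorrelation of a numeric sequence."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    if var == 0:
        return 0.0
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    return cov / var

def perturb(xs, base_scale=0.1, seed=42):
    """Add Gaussian noise whose scale is shaped by the series'
    autocorrelation (illustrative rule, not the paper's)."""
    rho = abs(lag1_autocorr(xs))
    scale = base_scale * (1.0 - rho)   # strong structure -> gentler noise
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, scale) for x in xs]

series = [1.0, 2.0, 3.0, 4.0, 5.0]
noisy = perturb(series)
print(noisy)  # close to the original, perturbed per the autocorrelation rule
```

A multivariate version would work with the full autocorrelation (or cross-correlation) structure of the stream, and the added noise would be calibrated so the perturbation cannot be filtered back out, i.e. it is irreversible.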


2015 ◽  
Vol 2015 ◽  
pp. 1-17 ◽  
Author(s):  
Abril Valeria Uriarte-Arcia ◽  
Itzamá López-Yáñez ◽  
Cornelio Yáñez-Márquez ◽  
João Gama ◽  
Oscar Camacho-Nieto

Ever-increasing data generation confronts us with the problem of handling massive amounts of information online. One of the biggest challenges is how to extract valuable information from these massive continuous data streams in a single scan. In a data stream context, data arrive continuously at high speed; therefore the algorithms developed for this context must be efficient in memory and time management and capable of detecting changes over time in the underlying distribution that generated the data. This work describes a novel method for pattern classification over a continuous data stream based on an associative model. The proposed method is based on the Gamma classifier, which is inspired by the Alpha-Beta associative memories; both are supervised pattern recognition models. The proposed method is capable of handling the space and time constraints inherent in data stream scenarios. The Data Streaming Gamma classifier (DS-Gamma classifier) implements a sliding-window approach to provide concept drift detection and a forgetting mechanism. To test the classifier, several experiments were performed over different scenarios with real and synthetic data streams. The experimental results show that the method exhibits competitive performance when compared to other state-of-the-art algorithms.
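The sliding-window-with-forgetting idea can be illustrated with a deliberately simple stand-in: a nearest-mean classifier over a bounded window. This is not the associative Gamma operator itself, only a sketch of the window/forgetting mechanism the DS-Gamma classifier builds on:

```python
from collections import deque

class SlidingWindowClassifier:
    """Nearest-mean classifier over a bounded window; old samples are
    forgotten automatically (stand-in for DS-Gamma's window mechanism)."""

    def __init__(self, window_size=100):
        self.window = deque(maxlen=window_size)  # forgetting: oldest drop out

    def learn(self, x: float, label: str) -> None:
        self.window.append((x, label))

    def predict(self, x: float) -> str:
        # Classify by the closest per-label mean within the current window.
        sums, counts = {}, {}
        for xi, label in self.window:
            sums[label] = sums.get(label, 0.0) + xi
            counts[label] = counts.get(label, 0) + 1
        means = {label: sums[label] / counts[label] for label in sums}
        return min(means, key=lambda label: abs(means[label] - x))

clf = SlidingWindowClassifier(window_size=4)
for xi, y in [(1.0, "low"), (2.0, "low"), (9.0, "high"), (10.0, "high")]:
    clf.learn(xi, y)
print(clf.predict(1.5))   # "low"
clf.learn(11.0, "high")   # window full: the oldest "low" sample is forgotten
print(clf.predict(1.5))   # still "low" while one "low" sample remains
```

Because the window is bounded, memory use is constant regardless of stream length, and a drift in the input distribution shows up as a shift in the per-label means within the window.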


2021 ◽  
pp. 391-410
Author(s):  
Shinichi Yamagiwa

Abstract. In this chapter, we introduce aspects of applying data-compression techniques. First, we study the background of recent communication data paths. The focus of this chapter is a fast lossless data-compression mechanism that handles data streams completely. A data stream comprises continuous, unterminated data generated in volume by sources such as video and sensors. We introduce LCA-SLT and LCA-DLT, which accept such data streams, as well as several implementations of these stream-based compression techniques, and we show techniques for optimal implementation in hardware.
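To make the notion of stream-based lossless compression concrete, the toy encoder below processes symbols one at a time without ever needing the whole input; it is a simple run-length scheme, not the LCA-SLT/LCA-DLT table-based mechanism described in the chapter:

```python
def rle_compress(stream):
    """Incremental run-length encoder over an iterable of symbols.
    Toy stand-in for stream compression: each symbol is consumed once,
    and output can be emitted as runs close (no full-input buffering)."""
    out = []
    prev, count = None, 0
    for sym in stream:
        if sym == prev:
            count += 1
        else:
            if prev is not None:
                out.append((prev, count))
            prev, count = sym, 1
    if prev is not None:
        out.append((prev, count))
    return out

def rle_decompress(pairs):
    return [s for s, n in pairs for _ in range(n)]

data = list("aaabccccd")
packed = rle_compress(data)
print(packed)                           # [('a', 3), ('b', 1), ('c', 4), ('d', 1)]
assert rle_decompress(packed) == data   # lossless round trip
```

The properties that matter for the chapter's setting are visible here: the encoder is lossless and single-pass with bounded state, which is what makes a hardware pipeline over an unterminated stream feasible; LCA-SLT/LCA-DLT achieve the same properties with dynamically managed lookup tables instead of run counting.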

