EFFICIENTLY MINING RECENT FREQUENT PATTERNS OVER ONLINE TRANSACTIONAL DATA STREAMS

Emerging applications such as network traffic analysis, web clickstream mining, power consumption measurement, sensor network data analysis, and dynamic tracing of stock fluctuations call for the study of a new kind of data: stream data. Many data stream management systems, prototype systems, and software components have been developed to manage streams or extract knowledge from stream data. Mining frequent patterns is a foundational task in data mining and knowledge discovery. This paper proposes an algorithm for mining recent frequent patterns over an online data stream. The method uses an RFP-tree to store the recent frequent patterns of a stream compactly. The stream is scanned only once: the content of each transaction is incrementally updated into the pattern tree upon its arrival. Moreover, a conservative computation strategy and a time-decaying model are used to ensure the correctness of the mining results. Finally, extensive simulation results show that the approach reduces the average processing time per stream data element and outperforms other analogous algorithms.
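To make the idea of a time-decaying model over a single-scan stream concrete, here is a minimal Python sketch. It is not the RFP-tree from the abstract and does not implement the conservative computation strategy; it keeps a flat dictionary of decayed pattern counts instead. The class name `DecayedPatternCounter`, the `decay` factor, and the `max_len` limit are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


class DecayedPatternCounter:
    """Hypothetical single-pass counter with exponential time decay."""

    def __init__(self, decay=0.9, max_len=2):
        self.decay = decay              # weight multiplier applied per arriving transaction
        self.max_len = max_len          # longest pattern (itemset) tracked
        self.counts = defaultdict(float)
        self.last_seen = {}             # pattern -> index of the transaction that last updated it
        self.n_seen = 0
        self.total_weight = 0.0         # decayed number of transactions seen so far

    def add(self, transaction):
        """Fold one transaction into the counts; each transaction is read once."""
        self.n_seen += 1
        self.total_weight = self.total_weight * self.decay + 1.0
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for pattern in combinations(items, k):
                # Decay lazily: apply all the decay steps missed since the last update.
                age = self.n_seen - self.last_seen.get(pattern, self.n_seen)
                self.counts[pattern] = self.counts[pattern] * self.decay ** age + 1.0
                self.last_seen[pattern] = self.n_seen

    def support(self, pattern):
        """Decayed support of a pattern, as a fraction of the decayed stream weight."""
        pattern = tuple(sorted(pattern))
        if pattern not in self.counts:
            return 0.0
        age = self.n_seen - self.last_seen[pattern]
        return self.counts[pattern] * self.decay ** age / self.total_weight

    def frequent(self, min_support):
        """All tracked patterns whose decayed support reaches min_support."""
        result = {}
        for pattern in list(self.counts):
            s = self.support(pattern)
            if s >= min_support:
                result[pattern] = s
        return result


counter = DecayedPatternCounter(decay=0.5)
for tx in (["a", "b"], ["a", "b"], ["c"], ["c"]):
    counter.add(tx)
# Older patterns fade: "c" now dominates although "a" occurred just as often.
recent = counter.frequent(min_support=0.5)
```

A tree structure such as the one the abstract describes would share prefixes between patterns rather than storing each itemset separately; the dictionary here trades that compactness for brevity.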

Classification of Imbalanced Data Stream: Techniques and Challenges

Transactions on Machine Learning and Artificial Intelligence ◽ 10.14738/tmlai.92.9964 ◽ 2021 ◽ Vol 9 (2) ◽ pp. 36-52

Author(s): Mashaal A. Alfhaid ◽ Manal Abdullah

Keyword(s): Data Mining ◽ Data Stream ◽ Concept Drift ◽ Class Imbalance ◽ Imbalanced Data ◽ Predictive Performance ◽ Knowledge Extraction ◽ Streaming Data ◽ Stream Data ◽ Stream Data Mining

As the volume of generated data increases every day, data mining and knowledge extraction have grown in importance. In traditional data mining, knowledge can be extracted offline. Stream data mining is different: data arrive continuously and can be processed in only a single scan, and concept drift may appear. Because the pre-processing stage is critical in knowledge extraction, imbalanced stream data have gained significant attention among researchers in the last few years. Many real-world applications suffer from class imbalance, including medical, business, and fraud detection applications. Supervised learning involves classes, whether binary or multi-class. These classes are often imbalanced, divided into a majority (negative) class and a minority (positive) class, which can bias the model toward the majority class and skew predictive performance. Handling imbalanced streaming data is therefore necessary for more accurate and reliable learning models. In this paper, we present an overview of data stream mining and its tools. We also summarize the problem of class imbalance and the different approaches to it. In addition, we present the popular evaluation metrics and the challenges arising from imbalanced streaming data.
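The bias toward the majority class can be illustrated with a small Python sketch. The abstract does not name its evaluation metrics, so the choice of recall, specificity, and G-mean here is an assumption, as is the helper name `binary_metrics`. The example shows a classifier that always predicts the majority class scoring high accuracy while never detecting the minority class.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, specificity, and G-mean for a binary problem (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    recall = tp / (tp + fn) if (tp + fn) else 0.0        # minority (positive) class hit rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # majority (negative) class hit rate
    return {
        "accuracy": (tp + tn) / len(pairs),
        "recall": recall,
        "specificity": specificity,
        # G-mean is zero whenever either class is completely missed.
        "g_mean": math.sqrt(recall * specificity),
    }


# 95 negatives, 5 positives; the classifier always predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
scores = binary_metrics(y_true, y_pred)
# Accuracy is 0.95, yet recall and G-mean are both 0.0.
```

In a streaming setting these quantities would typically be updated incrementally as labeled examples arrive rather than computed over a stored batch.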

Real-time spatio-temporal data mining with the “streamonas” data stream management system

Data Mining X ◽ 10.2495/data090121 ◽ 2009

Author(s): P. A. Michael ◽ D. Stott Parker

Keyword(s): Data Mining ◽ Real Time ◽ Management System ◽ Data Stream ◽ Temporal Data Mining ◽ Temporal Data ◽ Data Stream Management ◽ Stream Management ◽ Data Stream Management System ◽ Spatio Temporal

Continuous Post-Mining of Association Rules in a Data Stream Management System

Post-Mining of Association Rules ◽ 10.4018/978-1-60566-404-0.ch007 ◽ 2009 ◽ pp. 116-132

Author(s): Hetal Thakkar ◽ Barzan Mozafari ◽ Carlo Zaniolo

Keyword(s): Data Stream ◽ Data Cleaning ◽ Frequent Patterns ◽ Data Stream Management ◽ Efficient System ◽ Service Guarantees ◽ Stream Management ◽ Continuous Mining ◽ The Many

The real-time (or just-on-time) requirement associated with online association rule mining implies the need to expedite the analysis and validation of the many candidate rules, which are typically created from the discovered frequent patterns. Moreover, the mining process, from data cleaning to post-mining, can no longer be structured as a sequence of steps performed by the analyst, but must be streamlined into a workflow supported by an efficient system providing quality of service guarantees that are expected from modern Data Stream Management Systems (DSMSs). This chapter describes the architecture and techniques used to achieve this advanced functionality in the Stream Mill Miner (SMM) prototype, an SQL-based DSMS designed to support continuous mining queries.

Comparative Study of Different Classification Algorithms for Stream Data Mining Using MOA

International Journal of Computer Sciences and Engineering ◽ 10.26438/ijcse/v6i11.614616 ◽ 2018 ◽ Vol 6 (11) ◽ pp. 614-616

Author(s): Ashish P. Joshi ◽ Biraj V. Patel

Keyword(s): Data Mining ◽ Comparative Study ◽ Classification Algorithms ◽ Stream Data ◽ Stream Data Mining

An algebric window model for data stream management

Proceedings of the Ninth ACM International Workshop on Data Engineering for Wireless and Mobile Access - MobiDE '10 ◽ 10.1145/1850822.1850826 ◽ 2010 ◽ Cited By ~ 4

Author(s): Loïc Petit ◽ Cyril Labbé ◽ Claudia Lucia Roncancio

Keyword(s): Data Stream ◽ Data Stream Management ◽ Stream Management

SSC: cloud-based data stream management in distributed environments

International Journal of High Performance Computing and Networking ◽ 10.1504/ijhpcn.2016.076250 ◽ 2016 ◽ Vol 9 (3) ◽ pp. 171

Author(s): Kok Leong Ong ◽ Andrzej Goscinski ◽ Yuzhang Han ◽ Peter Brezany ◽ Zahir Tari ◽ ...

Keyword(s): Data Stream ◽ Distributed Environments ◽ Data Stream Management ◽ Stream Management

A flexible network monitoring tool based on a data stream management system

2008 IEEE Symposium on Computers and Communications ◽ 10.1109/iscc.2008.4625653 ◽ 2008 ◽ Cited By ~ 5

Author(s): Natascha Petry Ligocki ◽ Carmem S. Hara ◽ Christian Lyra

Keyword(s): Management System ◽ Data Stream ◽ Network Monitoring ◽ Monitoring Tool ◽ Data Stream Management ◽ Stream Management ◽ Data Stream Management System

A Clustering Algorithm in Stream Data Using Strong Coreset

Journal of Interconnection Networks ◽ 10.1142/s0219265921430118 ◽ 2021

Author(s): Manmohan Singh ◽ Rajendra Pamula ◽ Alok Kumar

Keyword(s): Data Mining ◽ Clustering Algorithm ◽ Local Optimum ◽ Reduction Algorithm ◽ Stream Data ◽ Stream Data Mining ◽ Clustering Approach ◽ Approximation Guarantee ◽ Competitive Algorithms ◽ Learning Data

Clustering has various applications in machine learning, data mining, data compression, and pattern recognition. Existing techniques such as Lloyd's algorithm (sometimes called k-means) suffer from convergence to a local optimum with no approximation guarantee. To overcome these shortcomings, this paper offers an efficient k-means clustering approach for stream data mining. The coreset is a popular and fundamental concept for k-means clustering in stream data. In each step, reduction determines a coreset of the inputs and represents the error, where P represents the number of input points according to the nested property of coresets. Hence, a small reduction in the error of the final coreset makes it n times more accurate. This motivated the authors to propose a new coreset-reduction algorithm. The proposed algorithm was executed on the Covertype, Spambase, Census 1990, Bigcross, and Tower datasets. It outperforms competitive algorithms such as StreamKM++, BICO (BIRCH meets Coresets for k-means clustering), and BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies).
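The local-optimum problem attributed to Lloyd's algorithm can be seen in a small Python sketch. This is only a plain one-dimensional Lloyd iteration, not the coreset-reduction algorithm of the paper; the function name `lloyd_1d`, the toy points, and both sets of starting centers are assumptions chosen for illustration.

```python
def lloyd_1d(points, centers, max_iter=100):
    """Plain Lloyd iteration on 1-D points; returns (final centers, sum of squared errors)."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda i: (x - centers[i]) ** 2)
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster (keep it if empty).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # no center moved, so the iteration has converged
            break
        centers = new_centers
    cost = sum(min((x - c) ** 2 for c in centers) for x in points)
    return centers, cost


points = [0, 2, 10, 12, 20, 22]  # three well-separated pairs

# Poor start: two centers inside the first pair, one shared by the other two pairs.
bad_centers, bad_cost = lloyd_1d(points, [0, 2, 16])     # converges at once, cost 104

# Good start: one center per pair.
good_centers, good_cost = lloyd_1d(points, [1, 11, 21])  # converges at once, cost 6
```

Both runs stop at a fixed point of the iteration, but the first is far from the best clustering, which is the kind of behavior that motivates initialization and summarization schemes with approximation guarantees.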
