Stream Data Mining | ScienceGate

Clustering has applications across machine learning, data mining, data compression, and pattern recognition. Existing techniques such as Lloyd's algorithm (often referred to simply as k-means) converge to a local optimum and offer no approximation guarantee. To overcome these shortcomings, this paper presents an efficient k-means clustering approach for stream data mining. The coreset is a popular and fundamental concept for k-means clustering on stream data. In each step, a reduction determines a coreset of the inputs and an associated error, where P denotes the number of input points, according to the nested property of coresets. Hence, even a small reduction in error makes the final coreset n times more accurate. This motivated the authors to propose a new coreset-reduction algorithm. The proposed algorithm was evaluated on the Covertype, Spambase, Census 1990, Bigcross, and Tower datasets. It outperforms competing algorithms such as StreamKM++, BICO (BIRCH meets Coresets for k-means clustering), and BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies).
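The nested structure mentioned in this abstract can be illustrated with the generic merge-and-reduce scheme for streaming summaries. The sketch below is not the paper's algorithm: `MergeReduce` and `reduce_points` are hypothetical names, and the reduction step is a plain weight-proportional sample standing in for a real coreset construction with guarantees. It only shows why per-step error matters: a point in a stream of n points passes through roughly log2(n / m) reductions before it reaches the final summary.

```python
import random


def reduce_points(weighted, m, rng):
    """Placeholder reduction (assumption): sample m points proportionally
    to weight and spread the total weight evenly over the sample.
    A real coreset construction would replace this function."""
    points = [p for p, _ in weighted]
    weights = [w for _, w in weighted]
    total = sum(weights)
    chosen = rng.choices(points, weights=weights, k=m)
    return [(p, total / m) for p in chosen]


class MergeReduce:
    """Keeps at most one summary of m weighted points per level.
    The summary at level i stands for m * 2**i original points."""

    def __init__(self, m, seed=0):
        self.m = m
        self.rng = random.Random(seed)
        self.buffer = []   # raw points not yet summarized
        self.levels = []   # levels[i] is None or a list of (point, weight)

    def insert(self, point):
        self.buffer.append((point, 1.0))
        if len(self.buffer) < self.m:
            return
        summary, self.buffer = self.buffer, []
        level = 0
        # Carry upwards like binary addition: two summaries on the same
        # level are merged and reduced into one summary on the next level.
        while level < len(self.levels) and self.levels[level] is not None:
            merged = self.levels[level] + summary
            summary = reduce_points(merged, self.m, self.rng)
            self.levels[level] = None
            level += 1
        if level == len(self.levels):
            self.levels.append(None)
        self.levels[level] = summary

    def summary(self):
        """Union of the raw buffer and all stored summaries."""
        out = list(self.buffer)
        for stored in self.levels:
            if stored is not None:
                out.extend(stored)
        return out
```

A weighted k-means solver would then be run on `summary()` instead of the full stream. Memory stays at about m points per occupied level, which is the usual reason for choosing this layout.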


Analysis of Classification and Clustering based Novel Class Detection Techniques for Stream Data Mining

International Journal of Engineering Research and ◽

10.17577/ijertv4is100160 ◽

2015 ◽

Vol 4 (10)

Author(s):

Kamini Tandel ◽

Jignasa N. Patel

Keyword(s):

Data Mining ◽

Stream Data ◽

Detection Techniques ◽

Stream Data Mining ◽

Classification And Clustering


Evolving clustering algorithm based on mixture of typicalities for stream data mining

Future Generation Computer Systems ◽

10.1016/j.future.2020.01.017 ◽

2020 ◽

Vol 106 ◽

pp. 672-684 ◽

Cited By ~ 3

Author(s):

José Maia ◽

Carlos Alberto Severiano ◽

Frederico Gadelha Guimarães ◽

Cristiano Leite de Castro ◽

André Paim Lemos ◽

...

Keyword(s):

Data Mining ◽

Clustering Algorithm ◽

Stream Data ◽

Stream Data Mining


SERA: Selectively recursive approach towards nonstationary imbalanced stream data mining

2009 International Joint Conference on Neural Networks ◽

10.1109/ijcnn.2009.5178874 ◽

2009 ◽

Cited By ~ 44

Author(s):

Sheng Chen ◽

Haibo He

Keyword(s):

Data Mining ◽

Stream Data ◽

Stream Data Mining


Comparative Analysis of Drift Detection Based Adaptive Ensemble Model with Different Drift Detection Techniques

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/06492 ◽

2021 ◽

Vol 23 (06) ◽

pp. 49-55

Author(s):

Sanjeev Kumar ◽

Ravendra Singh

Keyword(s):

Data Mining ◽

Comparative Analysis ◽

False Positive ◽

Opinion Mining ◽

Concept Drift ◽

Ensemble Classifier ◽

Stream Data ◽

Detection Techniques ◽

Stream Data Mining ◽

Detection Algorithms

Stream data mining is an active research area. Detecting and handling concept drift are its biggest challenges. Several drift detection algorithms can detect various kinds of drift accurately but suffer from false-positive detections. A false-positive detection degrades classifier performance because it triggers unnecessary retraining between analyses. Classifier ensembles have proven effective for drift detection, drift handling, and classification. However, an ensemble classifier cannot locate the exact position at which drift occurs, so it has to update itself at a fixed interval, which places an unnecessary computational burden on the system. Combining a drift detection algorithm with an ensemble classifier can improve performance and address both problems: false-positive drift detection and unnecessary ensemble updates. This paper proposes a model that builds a weighted adaptive ensemble classifier and updates it only when the drift detection method in use signals a drift. The proposed model is evaluated on text-based stream data for sentiment analysis and opinion mining, with multiple drift detection algorithms and with multiple classification algorithms as base classifiers for the ensemble. A comparative analysis shows the efficiency of the proposed models.
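The control flow described in this abstract, where the ensemble is updated only on a drift signal, can be sketched as follows. This is an illustration under stated assumptions, not the paper's model: the detector follows the well-known DDM rule (drift when p + s exceeds p_min + 3 * s_min), the base learner is a deliberately trivial majority-class predictor, and the names `DDM`, `vote`, `refresh`, and `run` are hypothetical.

```python
import math
from collections import Counter, deque


class DDM:
    """Error-rate drift detector in the style of DDM (assumed here)."""

    def __init__(self, min_samples=30):
        self.min_samples = min_samples
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """error is 1 for a misclassification, else 0. Returns True on drift."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n
        s = math.sqrt(p * (1 - p) / self.n)
        if self.n < self.min_samples:
            return False
        if p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        return p + s > self.p_min + 3 * self.s_min


def vote(ensemble):
    """Weighted majority vote over [label, weight] members."""
    scores = Counter()
    for label, weight in ensemble:
        scores[label] += weight
    return scores.most_common(1)[0][0]


def refresh(ensemble, recent):
    """Add a member trained on the recent window, then reweight every
    member by its accuracy on that window."""
    label = Counter(recent).most_common(1)[0][0]
    ensemble.append([label, 0.0])
    for member in ensemble:
        member[1] = sum(y == member[0] for y in recent) / len(recent)


def run(stream, window=50):
    detector = DDM()
    recent = deque(maxlen=window)
    ensemble = []
    drifts = []
    need_update = True  # also covers training the very first member
    for i, y in enumerate(stream):
        if ensemble and not need_update:
            if detector.update(int(vote(ensemble) != y)):
                drifts.append(i)
                recent.clear()      # keep only post-drift samples
                need_update = True
        recent.append(y)
        if need_update and len(recent) == window:
            refresh(ensemble, recent)
            detector.reset()
            need_update = False
    return ensemble, drifts
```

Between drift signals the ensemble is never retrained, which is the point of the design: the cost of an update is paid only when the detector fires. Clearing the window on a signal is a choice made in this sketch so that the new member is trained on post-drift samples only.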


On the Hermite Series-Based Generalized Regression Neural Networks for Stream Data Mining

Neural Information Processing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-36718-3_37 ◽

2019 ◽

pp. 437-448

Author(s):

Danuta Rutkowska ◽

Leszek Rutkowski

Keyword(s):

Data Mining ◽

Neural Networks ◽

Stream Data ◽

Stream Data Mining ◽

Generalized Regression Neural Networks ◽

Generalized Regression ◽

Hermite Series


Efficiently Mining Recent Frequent Patterns over Online Transactional Data Streams

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194009004325 ◽

2009 ◽

Vol 19 (05) ◽

pp. 707-725 ◽

Cited By ~ 1

Author(s):

Hui Chen

Keyword(s):

Data Mining ◽

Data Stream ◽

Frequent Patterns ◽

Stream Data ◽

Data Stream Management ◽

Online Data ◽

Stream Data Mining ◽

Network Traffic Analysis ◽

Stream Management ◽

Performance Results

Emerging applications such as network traffic analysis, web click-stream mining, power consumption measurement, sensor network data analysis, and dynamic tracing of stock fluctuations call for the study of a new kind of data: stream data. Many data stream management systems, prototype systems, and software components have been developed to manage streams or extract knowledge from stream data. Mining frequent patterns is a foundational task in data mining and knowledge discovery. This paper proposes an algorithm for mining recent frequent patterns over an online data stream. The method uses an RFP-tree to compactly store the recent frequent patterns of a stream. The content of each transaction is incrementally merged into the pattern tree upon arrival, so the stream is scanned only once. Moreover, a conservative computation strategy and a time-decaying model are used to ensure the correctness of the mining results. Finally, extensive simulation results show that the approach reduces the average processing time per stream data element and outperforms comparable algorithms.
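The time-decaying model mentioned in this abstract can be illustrated with a one-pass counter in which every itemset count is multiplied by a decay factor at each new transaction. The sketch below does not implement the RFP-tree or the paper's conservative computation strategy: the class name `DecayedPatternCounter`, the flat dictionary storage, and the parameters `decay` and `max_len` are assumptions made for illustration, and nothing is ever pruned.

```python
from itertools import combinations


class DecayedPatternCounter:
    """One-pass, time-decayed support counts for small itemsets."""

    def __init__(self, decay=0.98, max_len=2):
        if not 0.0 < decay < 1.0:
            raise ValueError("decay must lie strictly between 0 and 1")
        self.decay = decay
        self.max_len = max_len
        self.t = 0          # number of transactions seen so far
        self.counts = {}    # itemset -> (decayed count, time of last update)

    def add(self, transaction):
        """Process one transaction; the stream is never revisited."""
        self.t += 1
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for itemset in combinations(items, k):
                count, last = self.counts.get(itemset, (0.0, self.t))
                # Decay lazily: apply the factor once per step missed
                # since this itemset was last touched.
                aged = count * self.decay ** (self.t - last)
                self.counts[itemset] = (aged + 1.0, self.t)

    def frequent(self, min_support):
        """Itemsets whose decayed count reaches min_support times the
        decayed length of the stream."""
        total = (1 - self.decay ** self.t) / (1 - self.decay)
        result = {}
        for itemset, (count, last) in self.counts.items():
            current = count * self.decay ** (self.t - last)
            if current >= min_support * total:
                result[itemset] = current
        return result
```

With a decay factor below 1, patterns that stop appearing lose support geometrically, so `frequent` favors recent transactions. A practical version would also discard entries whose decayed count has fallen below a threshold, since this sketch lets the dictionary grow without bound.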
