Attribute Pattern Weights (APW): A Scale to Detect Concept Drift in Data Stream Mining Models

2019 ◽

pp. 19-39

Author(s):

Prasanna Lakshmi Kompalli

Keyword(s):

Real Time ◽

Data Streams ◽

Data Stream ◽

Concept Drift ◽

Data Stream Mining ◽

Time Data ◽

Stream Mining ◽

New Challenges ◽

Mining Data Streams ◽

Different Sources

Data coming from different sources is referred to as data streams. Data stream mining is an online learning technique in which each data point must be processed as it arrives and discarded once processing is complete. Advances in technology have made it possible to monitor these data streams in real time. Data streams have created many new challenges for researchers. The main features of this type of data are that it is fast flowing, large in volume, continuous and growing in nature, and that its characteristics may change over time, which is termed concept drift. This chapter addresses the problems in mining data streams with concept drift. Because of these problems, isolating the correct literature can be a grueling task for researchers and practitioners. This chapter tries to provide a solution by bringing together all the techniques used for data stream mining with concept drift.
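
The process-and-discard pattern and the idea of drift described in this abstract can be illustrated with a minimal sketch. It is not the chapter's APW scale: the class name `WindowDriftMonitor`, the window size, and the mean-difference threshold are all assumptions chosen for illustration. Each value is seen once, only two bounded windows are kept in memory, and drift is signalled when the mean of the recent window departs from the mean of a reference window.

```python
from collections import deque


class WindowDriftMonitor:
    """Hypothetical drift check: compare a recent window with a reference window."""

    def __init__(self, window=50, threshold=0.5):
        self.window = window
        self.threshold = threshold
        self.reference = deque(maxlen=window)  # the first `window` values seen
        self.recent = deque(maxlen=window)     # the latest `window` values

    def update(self, value):
        """Process one value, then let it go; return True if drift is signalled."""
        if len(self.reference) < self.window:
            self.reference.append(value)
            return False
        self.recent.append(value)
        if len(self.recent) < self.window:
            return False
        ref_mean = sum(self.reference) / len(self.reference)
        rec_mean = sum(self.recent) / len(self.recent)
        if abs(rec_mean - ref_mean) > self.threshold:
            # Adopt the recent window as the new reference and start over.
            self.reference = deque(self.recent, maxlen=self.window)
            self.recent.clear()
            return True
        return False


if __name__ == "__main__":
    # Synthetic stream (an assumption): the mean jumps from 0 to 1 halfway.
    stream = [0.0] * 100 + [1.0] * 100
    monitor = WindowDriftMonitor(window=20, threshold=0.5)
    alarms = [i for i, v in enumerate(stream) if monitor.update(v)]
    print("drift signalled at positions:", alarms)
```

Memory use stays constant no matter how long the stream runs, which is the property the abstract emphasizes. A practical detector would normally monitor a model's error rate rather than raw values.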

Data Stream Mining Using Ensemble Classifier

Collaborative Filtering Using Data Mining and Analysis - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-0489-4.ch013 ◽

2017 ◽

pp. 236-249

Author(s):

Snehlata Sewakdas Dongre ◽

Latesh G. Malik

Keyword(s):

Collaborative Filtering ◽

Data Stream ◽

Concept Drift ◽

Ensemble Classifier ◽

Ensemble Classification ◽

Data Stream Mining ◽

Main Concern ◽

Stream Mining ◽

Stream Classification ◽

Data Stream Classification

A data stream is a giant amount of data generated uncontrollably at a rapid rate by many applications, such as call detail records, log records, and sensor applications. Data stream mining has attracted the attention of many researchers. A growing problem in data streams is the handling of concept drift. A good algorithm should adapt to changes and handle concept drift properly. An ensemble classification method uses a group of classifiers that work in a collaborative manner: instead of a single classifier, a group of classifiers is used to enhance classification accuracy. This chapter covers the main aspects of data stream classification. Its mission is to discuss various techniques that use collaborative filtering for data stream mining, and its main concern is to familiarize the reader with the data stream domain and data stream mining. Collaborative filtering plays an important role in how the different classifiers work together within the ensemble to achieve a goal.

COLLABORATIVE DATA STREAM MINING IN UBIQUITOUS ENVIRONMENTS USING DYNAMIC CLASSIFIER SELECTION

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622013500375 ◽

2013 ◽

Vol 12 (06) ◽

pp. 1287-1308 ◽

Cited By ~ 4

Author(s):

JOÃO BÁRTOLO GOMES ◽

MOHAMED MEDHAT GABER ◽

PEDRO A. C. SOUSA ◽

ERNESTINA MENASALVAS

Keyword(s):

Classification Accuracy ◽

Data Stream ◽

Concept Drift ◽

Feature Space ◽

Data Stream Mining ◽

Stream Mining ◽

Stream Classification ◽

Classifier Selection ◽

Local Accuracy ◽

Real World Datasets

In ubiquitous data stream mining, different devices often aim to learn concepts that are similar to some extent. In many applications, such as spam filtering or news recommendation, the concept underlying the data stream (e.g., interesting mail/news) is likely to change over time. Therefore, the resultant model must be continuously adapted to such changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that explores the similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method where the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate Coll-Stream classification accuracy in situations with concept drift, noise, partition granularity and concept similarity in relation to the local underlying concept. The experimental results show that the resultant Coll-Stream model achieves stability and accuracy in a variety of situations using both synthetic and real-world datasets.
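
The idea of weighting classifiers by their local accuracy in each partition of the feature space can be sketched as follows. This is a simplified illustration, not the Coll-Stream algorithm itself: the class `LocalAccuracyEnsemble`, the `partition` function, and the Laplace-smoothed weights are assumptions introduced here.

```python
from collections import defaultdict


class LocalAccuracyEnsemble:
    """Hypothetical ensemble: each vote is weighted by accuracy in the local region."""

    def __init__(self, classifiers, partition):
        self.classifiers = classifiers  # list of callables: x -> label
        self.partition = partition      # callable: x -> hashable region key
        self.hits = defaultdict(lambda: [0] * len(classifiers))
        self.seen = defaultdict(int)

    def predict(self, x):
        region = self.partition(x)
        seen = self.seen[region]
        votes = defaultdict(float)
        for i, clf in enumerate(self.classifiers):
            # Laplace smoothing: every classifier gets weight 0.5 in unseen regions.
            weight = (self.hits[region][i] + 1) / (seen + 2)
            votes[clf(x)] += weight
        return max(votes, key=votes.get)

    def update(self, x, y):
        """Record which classifiers were right on a labeled record in its region."""
        region = self.partition(x)
        self.seen[region] += 1
        for i, clf in enumerate(self.classifiers):
            if clf(x) == y:
                self.hits[region][i] += 1


if __name__ == "__main__":
    # Two toy "community" models, each correct on only one half of the space.
    ensemble = LocalAccuracyEnsemble(
        classifiers=[lambda x: "a", lambda x: "b"],
        partition=lambda x: x >= 0,
    )
    for x, y in [(-2, "a"), (-1, "a"), (1, "b"), (2, "b")]:
        ensemble.update(x, y)
    print(ensemble.predict(-3), ensemble.predict(3))
```

Because the accuracy counts are kept per region, a classifier that is poor overall can still dominate the vote in the part of the feature space where it performs well.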

Online Incremental Learning for High Bandwidth Network Traffic Classification

Applied Computational Intelligence and Soft Computing ◽

10.1155/2016/1465810 ◽

2016 ◽

Vol 2016 ◽

pp. 1-13 ◽

Cited By ~ 5

Author(s):

H. R. Loo ◽

S. B. Joseph ◽

M. N. Marsono

Keyword(s):

Network Traffic ◽

Data Stream ◽

Concept Drift ◽

Distance Measures ◽

Manhattan Distance ◽

Data Stream Mining ◽

Traffic Classification ◽

Stream Mining ◽

Network Traffic Classification ◽

High Bandwidth

Data stream mining techniques are able to classify evolving data streams such as network traffic in the presence of concept drift. In order to classify high bandwidth network traffic in real time, data stream mining classifiers need to be implemented on a reconfigurable high-throughput platform, such as a Field Programmable Gate Array (FPGA). This paper proposes an algorithm for online network traffic classification based on the concept of incremental k-means clustering to continuously learn from both labeled and unlabeled flow instances. Two distance measures for incremental k-means (Euclidean and Manhattan distance) are analyzed to measure their impact on network traffic classification in the presence of concept drift. The experimental results on real datasets show that the proposed algorithm exhibits consistency, with up to 94% average accuracy for both distance measures, even in the presence of concept drift. The proposed incremental k-means classification using Manhattan distance can classify network traffic 3 times faster than with Euclidean distance, at 671 thousand flow instances per second.
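
A minimal sketch of incremental k-means with a pluggable distance measure is shown below. It is an illustration under stated assumptions, not the paper's algorithm or its FPGA implementation: the class `IncrementalKMeans`, the running-mean centroid update, the per-centroid label counts, and the toy flow features are all hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    # No squares or square root, so each comparison is cheaper to compute.
    return sum(abs(p - q) for p, q in zip(a, b))


class IncrementalKMeans:
    """Hypothetical online k-means that learns from labeled and unlabeled instances."""

    def __init__(self, initial_centroids, distance=euclidean):
        self.centroids = [list(c) for c in initial_centroids]
        self.counts = [1] * len(self.centroids)  # each seed counts as one instance
        self.labels = [Counter() for _ in self.centroids]
        self.distance = distance

    def nearest(self, x):
        return min(range(len(self.centroids)),
                   key=lambda i: self.distance(x, self.centroids[i]))

    def learn(self, x, label=None):
        """Move the nearest centroid toward x; count the label if one is given."""
        i = self.nearest(x)
        self.counts[i] += 1
        rate = 1.0 / self.counts[i]
        self.centroids[i] = [c + rate * (v - c)
                             for c, v in zip(self.centroids[i], x)]
        if label is not None:
            self.labels[i][label] += 1
        return i

    def classify(self, x):
        """Return the majority label of the nearest centroid, or None if unlabeled."""
        votes = self.labels[self.nearest(x)]
        return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    for dist in (euclidean, manhattan):
        model = IncrementalKMeans([(0.0, 0.0), (10.0, 10.0)], distance=dist)
        model.learn((1.0, 1.0), "web")
        model.learn((9.0, 9.0), "p2p")
        model.learn((0.0, 2.0))  # unlabeled flow still refines the centroid
        print(dist.__name__, model.classify((2.0, 1.0)), model.classify((8.0, 9.0)))
```

The running-mean update lets each centroid follow the data as the stream evolves without storing past instances, and swapping `distance` is the only change needed to compare the two measures.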

Real-time feature selection technique with concept drift detection using adaptive micro-clusters for data stream mining

Knowledge-Based Systems ◽

10.1016/j.knosys.2018.08.007 ◽

2018 ◽

Vol 161 ◽

pp. 205-239 ◽

Cited By ~ 2

Author(s):

Mahmood Shakir Hammoodi ◽

Frederic Stahl ◽

Atta Badii

Keyword(s):

Feature Selection ◽

Real Time ◽

Data Stream ◽

Concept Drift ◽

Data Stream Mining ◽

Stream Mining ◽

Feature Selection Technique ◽

Selection Technique ◽

Concept Drift Detection

Review paper on adapting data stream mining concept drift using ensemble classifier approach

IOSR Journal of Computer Engineering ◽

10.9790/0661-1654120123 ◽

2014 ◽

Vol 16 (5) ◽

pp. 120-123

Author(s):

Nilima Motghare ◽

Arvind Mewada

Keyword(s):

Data Stream ◽

Review Paper ◽

Concept Drift ◽

Ensemble Classifier ◽

Data Stream Mining ◽

Stream Mining

Combining active learning with concept drift detection for data stream mining

2018 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata.2018.8622549 ◽

2018 ◽

Cited By ~ 5

Author(s):

Bartosz Krawczyk ◽

Bernhard Pfahringer ◽

Michal Wozniak

Keyword(s):

Active Learning ◽

Data Stream ◽

Concept Drift ◽

Data Stream Mining ◽

Stream Mining ◽

Concept Drift Detection

Concept Drift Detection in Data Stream Mining : A literature review

Journal of King Saud University - Computer and Information Sciences ◽

10.1016/j.jksuci.2021.11.006 ◽

2021 ◽

Author(s):

Supriya Agrahari ◽

Anil Kumar Singh

Keyword(s):

Literature Review ◽

Data Stream ◽

Concept Drift ◽

Data Stream Mining ◽

Stream Mining ◽

Concept Drift Detection

A Classification and Novel Class Detection Algorithm for Concept Drift Data Stream Based on the Cohesiveness and Separation Index of Mahalanobis Distance

Journal of Electrical and Computer Engineering ◽

10.1155/2020/4027423 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Xiangjun Li ◽

Yong Zhou ◽

Ziyan Jin ◽

Peng Yu ◽

Shun Zhou

Keyword(s):

Data Mining ◽

Mahalanobis Distance ◽

Data Stream ◽

Concept Drift ◽

Detection Algorithm ◽

Data Stream Mining ◽

Stream Mining ◽

Separation Index ◽

Concept Evolution ◽

The Impact

Data stream mining has become a research hotspot in data mining and has attracted the attention of many scholars. However, the traditional data stream mining technology still has some problems to be solved in dealing with concept drift and concept evolution. In order to alleviate the influence of concept drift and concept evolution on novel class detection and classification, this paper proposes a classification and novel class detection algorithm based on the cohesiveness and separation index of Mahalanobis distance. Experimental results show that the algorithm can effectively mitigate the impact of concept drift on classification and novel class detection.
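
The Mahalanobis distance on which this abstract builds can be sketched for the two-dimensional case. The paper's cohesiveness and separation index is not reproduced here; the helper names `mean_and_cov_2d` and `mahalanobis_2d`, and the restriction to two features, are assumptions made to keep the example free of external libraries.

```python
import math


def mean_and_cov_2d(points):
    """Return the mean and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """Distance from x to mean, scaled by the spread and correlation in cov."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, using the closed-form 2x2 inverse.
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)


if __name__ == "__main__":
    # Toy class cluster (an assumption) and two query points.
    cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
    mean, cov = mean_and_cov_2d(cluster)
    print(mahalanobis_2d((1.0, 1.0), mean, cov))
    print(mahalanobis_2d((3.0, 1.0), mean, cov))
```

Unlike Euclidean distance, this measure accounts for how widely and in which direction a class is spread, so a point that lies far from every known class in this sense is a natural candidate for a novel class.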

Benchmarking concept drift adoption strategies for high speed data stream mining

2015 International Conference on Emerging Research in Electronics, Computer Science and Technology (ICERECT) ◽

10.1109/erect.2015.7499042 ◽

2015 ◽

Cited By ~ 3

Author(s):

Mohammed Ahmed Ali Abdualrhman ◽

M.C Padma

Keyword(s):

Data Stream ◽

High Speed ◽

Concept Drift ◽

Data Stream Mining ◽

Stream Mining ◽

High Speed Data
