CD2A: Concept Drift Detection Approach Toward Imbalanced Data Stream

Author(s):  
Mohammed Ahmed Ali Abdualrhman ◽  
M. C. Padma
2021 ◽  
pp. 1-14
Author(s):  
Hanqing Hu ◽  
Mehmed Kantardzic

Real-world data stream classification often deals with multiple types of concept drift, categorized by change characteristics such as speed, distribution, and severity. When labels are unavailable, traditional concept drift detection algorithms, used in stream classification frameworks, are often focused on only one type of concept drift. To overcome the limitations of traditional detection algorithms, this study proposed a Heuristic Ensemble Framework for Drift Detection (HEFDD). HEFDD aims to detect all types of concept drift by employing an ensemble of selected concept drift detection algorithms, each capable of detecting at least one type of concept drift. Experimental results show HEFDD provides significant improvement based on the z-score test when comparing detection accuracy with state-of-the-art individual algorithms. At the same time, HEFDD is able to reduce false alarms generated by individual concept drift detection algorithms.


Author(s):  
Sumit Misra ◽  
Dipan Biswas ◽  
Sanjoy Kumar Saha ◽  
Chandan Mazumdar

2019 ◽  
Vol 1 (11) ◽  
Author(s):  
Scott Wares ◽  
John Isaacs ◽  
Eyad Elyan

Abstract Mining and analysing streaming data is crucial for many applications, and this area of research has gained extensive attention over the past decade. However, there are several inherent problems that continue to challenge the hardware and the state-of-the art algorithmic solutions. Examples of such problems include the unbound size, varying speed and unknown data characteristics of arriving instances from a data stream. The aim of this research is to portray key challenges faced by algorithmic solutions for stream mining, particularly focusing on the prevalent issue of concept drift. A comprehensive discussion of concept drift and its inherent data challenges in the context of stream mining is presented, as is a critical, in-depth review of relevant literature. Current issues with the evaluative procedure for concept drift detectors is also explored, highlighting problems such as a lack of established base datasets and the impact of temporal dependence on concept drift detection. By exposing gaps in the current literature, this study suggests recommendations for future research which should aid in the progression of stream mining and concept drift detection algorithms.


Author(s):  
Namitha K. ◽  
Santhosh Kumar G.

This article presents a stream mining framework to cluster the data stream and monitor its evolution. Even though concept drift is expected to be present in data streams, explicit drift detection is rarely done in stream clustering algorithms. The proposed framework is capable of explicit concept drift detection and cluster evolution analysis. Concept drift is caused by the changes in data distribution over time. Relationship between concept drift and the occurrence of physical events has been studied by applying the framework on the weather data stream. Experiments led to the conclusion that the concept drift accompanied by a change in the number of clusters indicates a significant weather event. This kind of online monitoring and its results can be utilized in weather forecasting systems in various ways. Weather data streams produced by automatic weather stations (AWS) are used to conduct this study.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Ahmed Abbasi ◽  
Abdul Rehman Javed ◽  
Chinmay Chakraborty ◽  
Jamel Nebhen ◽  
Wisha Zehra ◽  
...  

2021 ◽  
Vol 9 (2) ◽  
pp. 36-52
Author(s):  
Mashaal A. Alfhaid ◽  
Manal Abdullah

As the number of generated data increases every day, this has brought the importance of data mining and knowledge extraction. In traditional data mining, offline status can be used for knowledge extraction. Nevertheless, dealing with stream data mining is different due to continuously arriving data that can be processed at a single scan besides the appearance of concept drift. As the pre-processing stage is critical in knowledge extraction, imbalanced stream data gain significant popularity in the last few years among researchers. Many real-world applications suffer from class imbalance including medical, business, fraud detection and etc. Learning from the supervised model includes classes whether it is binary- or multi-classes. These classes are often imbalance where it is divided into the majority (negative) class and minority (positive) class, which can cause a bias toward the majority class that leads to skew in predictive performance models. Handles imbalance streaming data is mandatory for more accurate and reliable learning models. In this paper, we will present an overview of data stream mining and its tools. Besides, summarize the problem of class imbalance and its different approaches. In addition, researchers will present the popular evaluation metrics and challenges prone from imbalanced streaming data.


Sign in / Sign up

Export Citation Format

Share Document