A Comparative Analysis on Ensemble Classifiers for Concept Drifting Data Streams

Author(s):  
Nalini Nagendran ◽  
H. Parveen Sultana ◽  
Amitrajit Sarkar
2016 ◽  
Vol 20 (6) ◽  
pp. 1329-1350 ◽  
Author(s):  
Mahdie Dehghan ◽  
Hamid Beigy ◽  
Poorya ZareMoodi

Author(s):  
Bartosz Krawczyk ◽  
Alberto Cano

Learning from data streams is among the most vital contemporary fields in machine learning and data mining. Streams pose new challenges to learning systems because of their volume and velocity, as well as their ever-changing nature caused by concept drift. The vast majority of works on data streams assume a fully supervised learning scenario with unrestricted access to class labels. This assumption does not hold in real-world applications, where obtaining ground truth is costly and time-consuming. We therefore need to select carefully which instances should be labeled, as we usually work under a strict label budget. In this paper, we propose a novel active learning approach based on ensemble algorithms that is capable of using multiple base classifiers during the label query process. It is a plug-in solution, capable of working with most existing streaming ensemble classifiers. We realize this process as a Multi-Armed Bandit problem, obtaining an efficient and adaptive ensemble active learning procedure by selecting the most competent classifier from the pool for each query. To better adapt to concept drift, we guide our instance selection by measuring the generalization capabilities of our classifiers. This adaptive solution leads not only to better instance selection under sparse access to class labels, but also to improved adaptation to various types of concept drift and to increased diversity of the underlying ensemble classifier.
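The abstract above does not specify the bandit policy or the query rule, so the following is only a minimal sketch of the general idea, assuming an epsilon-greedy policy with one arm per base classifier and a simple confidence threshold for label queries. The names `EpsilonGreedyBandit` and `should_query`, the reward definition, and all parameters are hypothetical and are not taken from the paper.

```python
import random


class EpsilonGreedyBandit:
    """Epsilon-greedy bandit with one arm per base classifier (assumed policy)."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms      # how often each arm was rewarded
        self.values = [0.0] * n_arms    # running mean reward per arm
        self.rng = random.Random(seed)

    def select(self):
        # Explore a random arm with probability epsilon, else exploit the best one.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Incremental mean update; the reward could be, for example,
        # 1.0 when the chosen classifier predicted the queried label correctly.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def should_query(confidence, threshold, labels_used, instances_seen, budget):
    """Request a label only if the chosen classifier is uncertain and budget remains.

    `budget` is the assumed maximum fraction of instances that may be labeled.
    """
    within_budget = labels_used < budget * max(instances_seen, 1)
    return within_budget and confidence < threshold
```

In a stream loop, one would call `select()` to pick the classifier whose confidence feeds `should_query`, and call `update()` whenever a label is actually obtained.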


2021 ◽  
Author(s):  
Priya S ◽  
Annie Uthra

As data mining applications grow in popularity, large volumes of data streams are generated over time. The main problem with data streams is that they exhibit a high degree of class imbalance and that the data distribution changes over time. In this paper, the Timely Drift Detection and Minority Resampling Technique (TDDMRT), based on K-nearest neighbors and Jaccard similarity, is proposed to handle class imbalance by finding the current ratio of class labels. The Enhanced Early Drift Detection Method (EEDDM) is proposed for detecting concept drift, and the Minority Resampling Method (KNN-JS) determines whether the current data stream should be regarded as imbalanced and resamples the minority instances in the drifting data stream. The K-nearest neighbors technique is used to resample the minority classes, and the Jaccard similarity measure is applied to the resampled data to generate synthetic data similar to the original data, which is then handled by ensemble classifiers. The proposed ensemble-based classification model outperforms existing oversampling and undersampling techniques, with an accuracy of 98.52%.
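The two quantities the abstract relies on, the current ratio of class labels and the Jaccard similarity, can be illustrated with a small sketch. This is not the TDDMRT or KNN-JS procedure itself: the function names are hypothetical, the imbalance ratio is assumed to be the rarest class count divided by the most frequent class count within a window, and Jaccard similarity is computed on plain sets.

```python
from collections import Counter


def imbalance_ratio(window_labels):
    """Ratio of the rarest to the most frequent class in a window of labels.

    Values near 0 indicate strong imbalance; 1.0 means the observed
    classes are perfectly balanced (assumed definition).
    """
    counts = Counter(window_labels)
    if not counts:
        raise ValueError("window_labels must not be empty")
    return min(counts.values()) / max(counts.values())


def jaccard_similarity(a, b):
    """Jaccard similarity |A intersect B| / |A union B| between two collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(set_a & set_b) / len(union)
```

For instance, a synthetic minority instance could be compared against an original one with `jaccard_similarity`, and kept only when the similarity exceeds a chosen threshold; that acceptance rule is an assumption for illustration.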


Author(s):  
S.K.Komagal Yallini ◽  
Dr. B. Mukunthan

Multi-Label Learning (MLL) addresses the challenge of characterizing each sample by a feature representation that is associated with a group of labels at once. That is, a sample has multiple views, each of which is represented by a Class Label (CL). In past decades, a significant amount of research has been devoted to this promising machine learning concept. Such research on MLL has focused on a predetermined group of CLs. In most applications, however, the configuration is dynamic and novel views may appear in a Data Stream (DS). In this scenario, an MLL technique should be able to identify and categorize features with newly emerging labels in order to maintain good predictive performance. For this purpose, several MLL techniques have been introduced in earlier decades. This article presents a survey of this field with an emphasis on conventional MLL techniques. First, various MLL techniques proposed by many researchers are studied. Then, a comparative analysis of the merits and demerits of those techniques is carried out to conclude the survey and recommend future enhancements to MLL techniques.

