Fast Adapting Ensemble: A New Algorithm for Mining Data Streams with Concept Drift

2015 ◽  
Vol 2015 ◽  
pp. 1-14 ◽  
Author(s):  
Agustín Ortíz Díaz ◽  
José del Campo-Ávila ◽  
Gonzalo Ramos-Jiménez ◽  
Isvani Frías Blanco ◽  
Yailé Caballero Mota ◽  
...  

The treatment of large data streams in the presence of concept drifts is one of the main challenges in the field of data mining, particularly when the algorithms have to deal with concepts that disappear and then reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts, and has been specifically designed to deal with recurring concepts. FAE processes the learning examples in blocks of the same size, but it does not have to wait for the batch to be complete in order to adapt its base classification mechanism. FAE incorporates a drift detector to improve the handling of abrupt concept drifts and stores a set of inactive classifiers that represent old concepts, which are activated very quickly when these concepts reappear. We compare our new algorithm with various well-known learning algorithms, using common benchmark datasets. The experiments show promising results for the proposed algorithm, in terms of both accuracy and runtime, when handling different types of concept drift.
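The idea of keeping inactive classifiers for old concepts and reactivating them when a concept reappears can be illustrated with a toy sketch. This is not the FAE algorithm itself: the names `MajorityClassifier`, `RecurringConceptPool`, and `threshold` are hypothetical, the base learner is deliberately trivial, and, unlike FAE, the sketch only reacts once a full block has arrived.

```python
from collections import Counter


class MajorityClassifier:
    """Toy base learner (assumption): always predicts the most frequent training label."""

    def __init__(self, labels):
        self.prediction = Counter(labels).most_common(1)[0][0]

    def predict(self, x):
        return self.prediction


def accuracy(model, block):
    """Fraction of (x, y) pairs in the block that the model labels correctly."""
    return sum(model.predict(x) == y for x, y in block) / len(block)


class RecurringConceptPool:
    """Hypothetical pool with one active model and a store of inactive ones."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # minimum block accuracy to trust a model
        self.active = None
        self.inactive = []

    def process_block(self, block):
        """Handle one block of (x, y) examples and report what was done."""
        if self.active is not None and accuracy(self.active, block) >= self.threshold:
            return "kept"
        # The active model no longer fits: look for a stored model that does.
        for i, model in enumerate(self.inactive):
            if accuracy(model, block) >= self.threshold:
                reactivated = self.inactive.pop(i)
                if self.active is not None:
                    self.inactive.append(self.active)
                self.active = reactivated
                return "reactivated"
        # No stored model fits: shelve the old one and train a new model.
        if self.active is not None:
            self.inactive.append(self.active)
        self.active = MajorityClassifier([y for _, y in block])
        return "trained"


# Two artificial concepts: every example is labelled "a", or every example "b".
concept_a = [(0, "a")] * 10
concept_b = [(0, "b")] * 10

pool = RecurringConceptPool()
history = [pool.process_block(b) for b in (concept_a, concept_b, concept_a, concept_a)]
print(history)
```

When concept A returns in the third block, the stored model is reused instead of a new one being trained, which is the behaviour the abstract describes for recurring concepts.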

2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Tinofirei Museba ◽  
Fulufhelo Nelwamondo ◽  
Khmaies Ouahada

Beyond applying machine learning predictive models to static tasks, a significant corpus of research applies them to streaming environments that incur concept drift. With the prevalence of real-world streaming applications whose underlying data distribution changes, the need for applications capable of adapting to evolving, time-varying environments can hardly be overstated. Dynamic environments are nonstationary: the target variables to be predicted by the learning algorithm often evolve with time, a phenomenon known as concept drift. Most work on handling concept drift focuses on updating the prediction model so that it can recover from drift, while little effort has been dedicated to formulating a learning system that can learn different types of drifting concepts at any time with minimum overhead. This work proposes a novel, evolving data stream classifier called Adaptive Diversified Ensemble Selection Classifier (ADES) that significantly optimizes adaptation to different types of concept drift at any time and improves convergence to new concepts by exploiting different amounts of ensemble diversity. The ADES algorithm generates diverse base classifiers, thereby optimizing the margin distribution to exploit ensemble diversity, and formulates an ensemble classifier that generalizes well to unseen instances and recovers quickly from different types of concept drift. Empirical experiments on both artificial and real-world data streams demonstrate that ADES can adapt to different types of drift at any given time. The prediction performance of ADES is compared with that of three other ensemble classifiers designed to handle concept drift, using both artificial and real-world data streams, and the comparative evaluation demonstrates its ability to handle different types of concept drift. The experimental results, including statistical test results, indicate performance comparable to that of other algorithms designed to handle concept drift and support the significance and effectiveness of the approach.


Author(s):  
Prasanna Lakshmi Kompalli

Data coming from different sources are referred to as data streams. Data stream mining is an online learning technique in which each data point must be processed as it arrives and discarded once processing is complete. Advances in technology have made it possible to monitor these data streams in real time, which has created many new challenges for researchers. Such data are fast flowing, large in volume, continuous, and ever growing, and their characteristics may change over time, a phenomenon termed concept drift. This chapter addresses the problems of mining data streams with concept drift. Given the breadth of these problems, isolating the correct literature is a grueling task for researchers and practitioners. This chapter tries to provide a solution by bringing together the techniques used for data stream mining with concept drift.


Author(s):  
Bijaya Kumar Nanda ◽  
Satchidananda Dehuri

In data mining, extracting classification rules from large data is an important task that is gaining considerable attention. This article presents a novel ant miner for classification rule mining. The ant miner is inspired by research on the behaviour of real ant colonies, simulated annealing, and some data mining concepts and principles. This paper presents a Pittsburgh style approach for single objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from the UCI repository. The experimental outcomes confirm that ant miner-HPB (Hybrid Pittsburgh Style Classification) is significantly better than ant-miner-PB (Pittsburgh Style Classification).


2011 ◽  
Vol 36 (3) ◽  
pp. 163-178 ◽  
Author(s):  
Periasamy Vivekanandan ◽  
Raju Nedunchezhian

2018 ◽  
Vol 7 (3.6) ◽  
pp. 148
Author(s):  
M Sankara Prasanna Kumar ◽  
A P. Siva Kumar ◽  
K Prasanna

Concept drift is defined as a change over time in the data distributed across multiple data streams. Concept drift becomes visible only when the type of collected data changes after some stable period. The emergence of concept drift in data streams leads to increased misclassification and degraded performance. To obtain accurate results, such concept drifts must be identified. This paper reviews the issues related to identifying changes that occur in multivariate, high-dimensional data streams. The insight of the manuscript lies in probing the inherent difficulties that existing contemporary change-detection methods encounter as the data dimensionality scales.
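As a minimal illustration of how a change after a stable period can be flagged, the sketch below compares the misclassification rate in a recent window with the rate in an earlier reference window. It is a generic window comparison, not one of the change-detection methods reviewed in the paper; `WindowDriftDetector`, `window_size`, and `threshold` are hypothetical names and values chosen for the example.

```python
from collections import deque


class WindowDriftDetector:
    """Hypothetical detector: flags drift when the recent error rate
    exceeds the reference error rate by more than `threshold`."""

    def __init__(self, window_size=10, threshold=0.3):
        self.window_size = window_size
        self.threshold = threshold
        self.reference = []                      # outcomes from the stable period
        self.recent = deque(maxlen=window_size)  # sliding window of latest outcomes

    def update(self, error):
        """Feed one outcome (1 = misclassified, 0 = correct); return True on drift."""
        if len(self.reference) < self.window_size:
            self.reference.append(error)
            return False
        self.recent.append(error)
        if len(self.recent) < self.window_size:
            return False
        reference_rate = sum(self.reference) / self.window_size
        recent_rate = sum(self.recent) / self.window_size
        if recent_rate - reference_rate > self.threshold:
            # Start over so the next stable period becomes the new reference.
            self.reference = []
            self.recent.clear()
            return True
        return False


# Artificial stream: 20 correct predictions, then 10 consecutive errors.
detector = WindowDriftDetector(window_size=10, threshold=0.3)
stream = [0] * 20 + [1] * 10
flags = [detector.update(e) for e in stream]
print(flags.index(True))
```

A larger window makes the detector less sensitive to noise but slower to react, which is the usual trade-off for window-based detection.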


Data mining is a procedure for extracting information from large data. Data mining approaches include classification, association rules, clustering, etc. Data mining is applied in four stages: data sources, data extrapolation/gathering, modeling, and deploying modules. Classification is a method in data mining for predicting the group membership of data instances. It is a useful method with vast applications for classifying the different types of data used in almost every field. Classification assigns a class label to a set of unclassified cases. In this survey, we discuss Bayesian classification, rule-based classification, decision trees, and neural networks.


Author(s):  
Y. Fakir ◽  
M. Azalmad ◽  
R. Elaychi

Data mining is a process of exploring large data to find patterns that support decision-making. One of the techniques in decision-making is classification. Data classification is a form of data analysis used to extract models describing important data classes. There are many classification algorithms. Each classifier encompasses some algorithms in order to classify objects into predefined classes. Decision Tree is one such important technique, which builds a tree structure by incrementally breaking down the dataset into smaller subsets. Decision trees can be implemented using popular algorithms such as ID3, C4.5, and CART. The present study considers the ID3 and C4.5 algorithms to build a decision tree using the “entropy” and “information gain” measures, which are the basic components behind the construction of a classifier model.
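The two measures named above can be computed in a few lines. The sketch below uses the standard definitions: Shannon entropy of the class labels, and information gain as the entropy before a split minus the size-weighted entropy after it. The toy dataset and the names `entropy`, `information_gain`, `rows`, and `labels` are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the data on `attribute`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Toy data (assumption): "outlook" determines the class, "windy" is irrelevant.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(rows, labels, "outlook")
gain_windy = information_gain(rows, labels, "windy")
print(gain_outlook, gain_windy)
```

ID3 picks the attribute with the highest information gain at each node, so it would split on "outlook" first here. C4.5 normalizes the gain by the split information (gain ratio), which this sketch does not cover.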


2020 ◽  
Vol 11 (2) ◽  
pp. 47-64
Author(s):  
Bijaya Kumar Nanda ◽  
Satchidananda Dehuri

Discovering classification rules from large data is an important task of data mining and is gaining considerable attention. This article presents a novel ant miner for classification rule mining. Our ant miner is inspired by research on the behavior of real ant colonies, simulated annealing, and some data mining concepts and principles. Here we present a Michigan style approach for single objective classification rule mining. The algorithm is tested on a few benchmark datasets drawn from the UCI repository. Our experimental outcomes confirm that ant miner-HMC (Hybrid Michigan Style Classification) is significantly better than ant-miner-MC (Michigan Style Classification).

