Dynamically adaptive and diverse dual ensemble learning approach for handling concept drift in data streams

Author(s):  
Kanu Goel ◽  
Shalini Batra
Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Sanmin Liu ◽  
Shan Xue ◽  
Fanzhen Liu ◽  
Jieren Cheng ◽  
Xiulai Li ◽  
...  

Data stream classification becomes a promising prediction work with relevance to many practical environments. However, under the environment of concept drift and noise, the research of data stream classification faces lots of challenges. Hence, a new incremental ensemble model is presented for classifying nonstationary data streams with noise. Our approach integrates three strategies: incremental learning to monitor and adapt to concept drift; ensemble learning to improve model stability; and a microclustering procedure that distinguishes drift from noise and predicts the labels of incoming instances via majority vote. Experiments with two synthetic datasets designed to test for both gradual and abrupt drift show that our method provides more accurate classification in nonstationary data streams with noise than the two popular baselines.


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Ahmed Abbasi ◽  
Abdul Rehman Javed ◽  
Chinmay Chakraborty ◽  
Jamel Nebhen ◽  
Wisha Zehra ◽  
...  

Author(s):  
Yange Sun ◽  
Han Shao ◽  
Bencai Zhang

Ensemble classification is an actively researched paradigm that has received much attention due to increasing real-world applications. The crucial issue of ensemble learning is to construct a pool of base classifiers with accuracy and diversity. In this paper, unlike conventional data-streams oriented ensemble methods, we propose a novel Measure via both Accuracy and Diversity (MAD) instead of one of them to supervise ensemble learning. Based on MAD, a novel online ensemble method called Accuracy and Diversity weighted Ensemble (ADE) effectively handles concept drift in data streams. ADE mainly uses the following three steps to construct a concept-drift oriented ensemble: for the current data window, 1) a new base classifier is constructed based on the current concept when drift detect, 2) MAD is used to measure the performance of ensemble members, and 3) a newly built classifier replaces the worst base classifier. If the newly constructed classifier is the worst one, the replacement has not occurred. Comparing with the state-of-art algorithms, ADE exceeds the current best-related algorithm by 2.38% in average classification accuracy. Experimental results show that the proposed method can effectively adapt to different types of drifts.


Smart Cities ◽  
2021 ◽  
Vol 4 (1) ◽  
pp. 349-371
Author(s):  
Hassan Mehmood ◽  
Panos Kostakos ◽  
Marta Cortes ◽  
Theodoros Anagnostopoulos ◽  
Susanna Pirttikangas ◽  
...  

Real-world data streams pose a unique challenge to the implementation of machine learning (ML) models and data analysis. A notable problem that has been introduced by the growth of Internet of Things (IoT) deployments across the smart city ecosystem is that the statistical properties of data streams can change over time, resulting in poor prediction performance and ineffective decisions. While concept drift detection methods aim to patch this problem, emerging communication and sensing technologies are generating a massive amount of data, requiring distributed environments to perform computation tasks across smart city administrative domains. In this article, we implement and test a number of state-of-the-art active concept drift detection algorithms for time series analysis within a distributed environment. We use real-world data streams and provide critical analysis of results retrieved. The challenges of implementing concept drift adaptation algorithms, along with their applications in smart cities, are also discussed.


Synlett ◽  
2020 ◽  
Author(s):  
Akira Yada ◽  
Kazuhiko Sato ◽  
Tarojiro Matsumura ◽  
Yasunobu Ando ◽  
Kenji Nagata ◽  
...  

AbstractThe prediction of the initial reaction rate in the tungsten-catalyzed epoxidation of alkenes by using a machine learning approach is demonstrated. The ensemble learning framework used in this study consists of random sampling with replacement from the training dataset, the construction of several predictive models (weak learners), and the combination of their outputs. This approach enables us to obtain a reasonable prediction model that avoids the problem of overfitting, even when analyzing a small dataset.


Sign in / Sign up

Export Citation Format

Share Document