An ensemble based incremental learning framework for concept drift and class imbalance

Accurate real-time terrain classification is of great importance to an autonomous robot working in the field, because it helps the robot avoid non-geometric hazards, adjust its control scheme, or improve localization accuracy. In this paper, we investigate vibration-based terrain classification (VTC) in a dynamic environment and propose a novel learning framework, named DyVTC, which handles unlabeled data collected online under concept drift. In the DyVTC framework, two new measures are proposed: exterior disagreement (ex-disagreement), based on feature diversity, and interior disagreement (in-disagreement), based on intrinsic temporal correlation. This disagreement mechanism is used to design a pseudo-labeling algorithm that shows compelling advantages in extracting and labeling key samples; consequently, classification accuracy can be recovered by incremental learning in a changing environment. Since two sets of features are extracted from the frequency and time domains to generate disagreements, we also name the proposed method feature-temporal disagreement adaptation (FTDA). A real-world experiment shows that the proposed DyVTC reaches an accuracy of 89.5% in a dynamic environment, whereas traditional time- and frequency-domain terrain classification methods reach only 48.8% and 71.5%, respectively.


Learning from Unbalanced Stream Data in Non-Stationary Environments Using Logistic Regression Model

Handbook of Research on Natural Computing for Optimization Problems - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-5225-0058-2.ch023 ◽

2016 ◽

pp. 561-582

Author(s):

Pallavi Digambarrao Kulkarni ◽

Roshani Ade

Keyword(s):

Learning Strategies ◽

Real World ◽

Incremental Learning ◽

Concept Drift ◽

Data Distribution ◽

Class Imbalance ◽

Learning Approaches ◽

Stream Data ◽

Future Data ◽

Distribution Generation

Several deep learning approaches can be applied to analyze real-world problems and derive solutions in a principled way. Supervised data mining methods, which predict instance values using results previously obtained from already collected data, are popular in the machine learning area. Stream data is a continuous form of data that can be handled with an incremental learning approach. Learning from stream data faces several real-world challenges, such as concept drift and class imbalance. Concept drift occurs in a non-stationary environment, where the function generating the data distribution is dynamic and there is no fixed formula for predicting the future data distribution. Neural network techniques can improve the performance of algorithmic systems that work in such problem domains. This chapter briefly describes how the MLP technique is integrated into a system so that the system becomes a complete framework for handling unbalanced data with concept drift in incremental learning strategies.


Learning from Unbalanced Stream Data in Non-Stationary Environments Using Logistic Regression Model

Deep Learning and Neural Networks ◽

10.4018/978-1-7998-0414-7.ch023 ◽

2020 ◽

pp. 386-407

Author(s):

Pallavi Digambarrao Kulkarni ◽

Roshani Ade

Keyword(s):

Learning Strategies ◽

Real World ◽

Incremental Learning ◽

Concept Drift ◽

Data Distribution ◽

Class Imbalance ◽

Learning Approaches ◽

Stream Data ◽

Future Data ◽

Distribution Generation

Several deep learning approaches can be applied to analyze real-world problems and derive solutions in a principled way. Supervised data mining methods, which predict instance values using results previously obtained from already collected data, are popular in the machine learning area. Stream data is a continuous form of data that can be handled with an incremental learning approach. Learning from stream data faces several real-world challenges, such as concept drift and class imbalance. Concept drift occurs in a non-stationary environment, where the function generating the data distribution is dynamic and there is no fixed formula for predicting the future data distribution. Neural network techniques can improve the performance of algorithmic systems that work in such problem domains. This chapter briefly describes how the MLP technique is integrated into a system so that the system becomes a complete framework for handling unbalanced data with concept drift in incremental learning strategies.


Dynamic Weighted Majority for Incremental Learning of Imbalanced Data Streams with Concept Drift

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/333 ◽

2017 ◽

Cited By ~ 7

Author(s):

Yang Lu ◽

Yiu-ming Cheung ◽

Yuan Yan Tang

Keyword(s):

Data Streams ◽

Incremental Learning ◽

High Efficiency ◽

Concept Drift ◽

Computational Cost ◽

Class Imbalance ◽

Real Data ◽

Current Data ◽

Class Imbalance Problem ◽

Weighted Majority

Concept drift in a data stream jeopardizes the accuracy and stability of the online learning process. If the data stream is also imbalanced, detecting and handling the concept drift is even more challenging. In the literature, these two problems have been intensively addressed separately, but have yet to be well studied when they occur together. In this paper, we propose a chunk-based incremental learning method called Dynamic Weighted Majority for Imbalance Learning (DWMIL) to deal with data streams that exhibit both concept drift and class imbalance. DWMIL uses an ensemble framework that dynamically weights the base classifiers according to their performance on the current data chunk. Compared with existing methods, its merits are four-fold: (1) it remains stable on non-drifted streams and adapts quickly to a new concept; (2) it is fully incremental, i.e. no previous data needs to be stored; (3) it keeps a limited number of classifiers to ensure high efficiency; and (4) it is simple and needs only one thresholding parameter. Experiments on both synthetic and real data sets with concept drift show that DWMIL performs better than state-of-the-art competitors, at a lower computational cost.
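
The chunk-based weighting idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' DWMIL implementation: the toy base learner (`CentroidClassifier`), the use of the G-mean as the performance measure, and the multiplicative weight update are all assumptions made for illustration. Only the overall shape follows the abstract: re-weight classifiers on the current chunk, drop those below a single threshold, add one new classifier per chunk, and store no past data.

```python
import math


class CentroidClassifier:
    """Hypothetical toy base learner: predicts the class with the nearest mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda c: math.dist(x, self.centroids[c]))


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall for binary labels 0/1 (assumed metric)."""
    recalls = []
    for label in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == label]
        if idx:
            recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return math.prod(recalls) ** (1 / len(recalls)) if recalls else 0.0


class ChunkEnsemble:
    """Weighted ensemble updated one labeled chunk at a time."""

    def __init__(self, threshold=0.1):
        self.threshold = threshold  # the single pruning parameter
        self.members = []           # list of [classifier, weight]

    def predict_one(self, x):
        votes = {}
        for clf, weight in self.members:
            label = clf.predict_one(x)
            votes[label] = votes.get(label, 0.0) + weight
        return max(votes, key=votes.get)

    def update(self, X, y):
        # 1) Scale every weight by the classifier's score on the current chunk.
        for member in self.members:
            preds = [member[0].predict_one(x) for x in X]
            member[1] *= g_mean(y, preds)
        # 2) Drop classifiers whose weight fell below the threshold.
        self.members = [m for m in self.members if m[1] >= self.threshold]
        # 3) Train a new classifier on the current chunk only (no stored history).
        self.members.append([CentroidClassifier().fit(X, y), 1.0])
```

Calling `update` once per incoming chunk and `predict_one` between chunks is the intended usage. When the class regions swap between two chunks, the older classifier scores zero on the new chunk and is pruned, which is the adaptation behavior the abstract describes.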


Incremental learning framework for real‐world fraud detection environment

Computational Intelligence ◽

10.1111/coin.12434 ◽

2021 ◽

Vol 37 (1) ◽

pp. 635-656

Author(s):

Farzana Anowar ◽

Samira Sadaoui

Keyword(s):

Real World ◽

Incremental Learning ◽

Fraud Detection ◽

Learning Framework


Risky Driver Recognition with Class Imbalance Data and Automated Machine Learning Framework

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18147534 ◽

2021 ◽

Vol 18 (14) ◽

pp. 7534

Author(s):

Ke Wang ◽

Qingwen Xue ◽

Jian John Lu

Keyword(s):

Machine Learning ◽

High Risk ◽

Loss Function ◽

Class Imbalance ◽

Support Vector ◽

Trajectory Data ◽

Recognition Model ◽

Learning Framework ◽

Sampling Cost ◽

Automated Machine Learning

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Due to the class-imbalanced nature of driving data, high-risk samples, as the minority class, are usually handled poorly by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling, cost-sensitive loss function, and probability calibration to handle the class-imbalance problem in recognizing risky drivers. The hyperparameters that control the sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual searching but much more computationally efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of the predicted probability.
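
To make the notion of a cost-sensitive cross-entropy loss concrete, the following is a minimal sketch. It assumes inverse-frequency class weights, whereas the abstract states that the class weight is tuned by Bayesian optimization; the function names are hypothetical and the exact loss used in the paper is not reproduced here.

```python
import math


def balanced_class_weights(labels):
    """Inverse-frequency weights n / (k * n_c); an assumed heuristic, not a tuned value."""
    n = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def weighted_cross_entropy(labels, probs, weights, eps=1e-12):
    """Mean binary cross-entropy where each sample is scaled by its class weight.

    labels: 0/1 ground truth; probs: predicted probability of class 1.
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        log_term = math.log(p) if y == 1 else math.log(1.0 - p)
        total += -weights[y] * log_term
    return total / len(labels)
```

With such weights, a model that assigns a low probability to every sample is penalized much more heavily for the few minority-class samples it misses than under the unweighted loss, which is the effect cost-sensitive learning aims for on data where only a small fraction of drivers are risky.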


Search State Compatibility Based Incremental Learning Framework and Output Deviation Based X-filling for Diagnostic Test Generation

Journal of Electronic Testing ◽

10.1007/s10836-010-5142-2 ◽

2010 ◽

Vol 26 (2) ◽

pp. 165-176 ◽

Cited By ~ 8

Author(s):

Maheshwar Chandrasekar ◽

Nikhil P. Rahagude ◽

Michael S. Hsiao

Keyword(s):

Diagnostic Test ◽

Test Generation ◽

Incremental Learning ◽

Learning Framework


Deep learning framework for handling concept drift and class imbalanced complex decision-making on streaming data

Complex & Intelligent Systems ◽

10.1007/s40747-021-00456-0 ◽

2021 ◽

Author(s):

S. Priya ◽

R. Annie Uthra

Keyword(s):

Decision Making ◽

Deep Learning ◽

Concept Drift ◽

Class Imbalance ◽

Streaming Data ◽

Superior Performance ◽

Data Streaming ◽

Minority Class ◽

Concept Drift Detection

In recent times, data science has become popular for supporting and improving the decision-making process. Because data streaming has a wide range of applications, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models have proved useful for classification under concept drift in data streaming applications. This paper presents an effective class imbalance with concept drift detection (CIDD) approach using Adadelta optimizer-based deep neural networks (ADODNN), named the CIDD-ADODNN model, for the classification of highly imbalanced streaming data. The presented model involves four processes: preprocessing, class imbalance handling, concept drift detection, and classification. The proposed model uses the adaptive synthetic (ADASYN) technique for handling class-imbalanced data, which applies a weighted distribution to minority class examples according to their level of learning difficulty. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect concept drift. The ADODNN model is then used for classification. To increase the classification performance of the DNN model, ADO-based hyperparameter tuning determines its optimal parameters. The performance of the presented model is evaluated using three streaming datasets: an intrusion detection (NSL KDDCup) dataset, a Spam dataset, and a Chess dataset. A detailed comparative analysis was carried out, and the simulation results verified the superior performance of the presented model, which obtained maximum accuracies of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
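
The drift detection step can be illustrated with a naive sliding-window sketch in the spirit of ADWIN. It is an assumption-laden simplification: it checks every split of the window directly against a Hoeffding-style bound instead of using ADWIN's bucketed data structure, the class name `NaiveAdaptiveWindow` and its parameters are hypothetical, and nothing here reflects the configuration used in the paper.

```python
import math
from collections import deque


class NaiveAdaptiveWindow:
    """Simplified ADWIN-style detector over values in [0, 1] (e.g. 0/1 errors)."""

    def __init__(self, delta=0.002, max_size=200):
        self.delta = delta                    # confidence parameter (assumed default)
        self.window = deque(maxlen=max_size)  # bounded memory, oldest dropped first

    def update(self, value):
        """Add one value; return True if the window was cut because of drift."""
        self.window.append(value)
        drift = False
        shrinking = True
        while shrinking and len(self.window) >= 2:
            shrinking = False
            values = list(self.window)
            n = len(values)
            total = sum(values)
            head_sum = 0.0
            for split in range(1, n):
                head_sum += values[split - 1]
                n0, n1 = split, n - split
                mean_old = head_sum / n0
                mean_new = (total - head_sum) / n1
                # Harmonic-mean sample size and Hoeffding-style cut threshold.
                m = 1.0 / (1.0 / n0 + 1.0 / n1)
                eps_cut = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
                if abs(mean_old - mean_new) > eps_cut:
                    self.window.popleft()  # forget the oldest observation
                    drift = True
                    shrinking = True
                    break
        return drift
```

In a pipeline such as the one described above, the detector would typically be fed the classifier's per-sample error (0 for correct, 1 for wrong), and a `True` return would trigger retraining or adaptation. That wiring is a design assumption rather than a detail taken from the abstract.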
