Fast Reaction to Sudden Concept Drift in the Absence of Class Labels

2020 ◽  
Vol 10 (2) ◽  
pp. 606 ◽  
Author(s):  
Osama A. Mahdi ◽  
Eric Pardede ◽  
Nawfal Ali ◽  
Jinli Cao

A data stream is a continuously arriving, potentially unbounded sequence of examples, such as web page visits, sensor readings and call records. One serious challenge in data streams is concept drift, which occurs when the relation between the input data and the target variable changes over time. Most existing works optimistically assume that all incoming data are labelled and that the class labels are available immediately. This assumption is not always valid, and a lack of class labels aggravates the problem of concept drift detection. With this motivation, we propose a drift detector that reacts to sudden drifts in the absence of class labels. Instead of monitoring error estimates, the proposed detector monitors the diversity of a pair of classifiers, since the true label of an example is not needed to determine whether the components disagree. An experimental evaluation on several datasets compares the proposed detector against several existing detectors. The experimental results show that the proposed detector can detect drifts with less delay, runtime and memory usage.
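The idea of watching classifier disagreement instead of error can be illustrated with a small sketch. This is not the authors' detector: the class name `DisagreementDriftMonitor`, the sliding-window comparison, and the `window` and `threshold` values are all assumptions chosen for illustration. The only point it shows is that the monitored signal needs no true labels.

```python
from collections import deque
from typing import Callable, Deque, Hashable, Optional, Sequence

Classifier = Callable[[Sequence[float]], Hashable]


class DisagreementDriftMonitor:
    """Flags drift when two classifiers start disagreeing more (or less) often.

    Hypothetical sketch: window size and threshold are illustrative
    assumptions, not values taken from the paper.
    """

    def __init__(self, clf_a: Classifier, clf_b: Classifier,
                 window: int = 50, threshold: float = 0.3) -> None:
        self.clf_a = clf_a
        self.clf_b = clf_b
        self.window = window
        self.threshold = threshold
        self.recent: Deque[int] = deque(maxlen=window)  # 1 = disagreement
        self.baseline: Optional[float] = None  # rate of the first full window

    def update(self, x: Sequence[float]) -> bool:
        """Consume one unlabelled example; return True if drift is flagged."""
        self.recent.append(int(self.clf_a(x) != self.clf_b(x)))
        if len(self.recent) < self.window:
            return False
        rate = sum(self.recent) / self.window
        if self.baseline is None:
            self.baseline = rate  # first full window defines "normal"
            return False
        if abs(rate - self.baseline) > self.threshold:
            # Forget the old concept and learn a new baseline from scratch.
            self.baseline = None
            self.recent.clear()
            return True
        return False
```

In use, `update` is called once per incoming example with two already-trained classifiers. Resetting the baseline after a flag is one possible design choice; a real detector would typically also retrain the classifiers at that point.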

Author(s):  
Nils Finke ◽  
Tanya Braun ◽  
Marcel Gehrke ◽  
Ralf Möller

Dynamic probabilistic relational models, which are factorized w.r.t. a full joint distribution, are used to cater for uncertainty and for relational and temporal aspects in real-world data. While these models assume the underlying temporal process to be stationary, real-world data often exhibits non-stationary behavior where the full joint distribution changes over time. We propose an approach to account for non-stationary processes w.r.t. changing probability distributions over time, an effect known as concept drift. We use factorization and compact encoding of relations to efficiently detect drifts towards new probability distributions based on evidence.


Author(s):  
A. Maas ◽  
F. Rottensteiner ◽  
C. Heipke

Supervised classification of remotely sensed images is a classical method for change detection. The task requires training data in the form of image data with known class labels, whose manual generation is time-consuming. If the labels are acquired from an outdated map, the classifier must cope with errors in the training data. These errors, referred to as label noise, typically occur in clusters in object space, because they are caused by land cover changes over time. In this paper we adapt a label noise tolerant training technique for classification so that it takes into account the fact that changes affect larger clusters of pixels. We also integrate the existing map into an iterative classification procedure to act as a prior in regions which are likely to contain changes. Our experiments are based on three test areas, using real images with simulated existing databases. Our results show that this method helps to distinguish between real changes over time and false detections caused by misclassification and thus improves the accuracy of the classification results.


2019 ◽  
Vol 06 (02) ◽  
pp. 223-256
Author(s):  
Amal Abid ◽  
Salma Jamoussi ◽  
Abdelmajid Ben Hamadou

The spread of real-time applications has led to a huge amount of data shared between users. This vast volume of data rapidly evolving over time is referred to as a data stream. Clustering and processing such data poses many challenges to the data mining community. Indeed, traditional data mining techniques become infeasible for mining such a continuous flow of data where characteristics, features, and concepts are rapidly changing over time. This paper presents a novel method for data stream clustering. In this context, major challenges of data stream processing are addressed, namely, infinite length, concept drift, novelty detection, and feature evolution. To handle these issues, the proposed method uses the Artificial Immune System (AIS) meta-heuristic. The latter has been widely used for data mining tasks and has the adaptability required by data stream clustering algorithms. Our method, called AIS-Clus, is able to detect novel concepts using the performance of the learning process of the AIS meta-heuristic. Furthermore, AIS-Clus has the ability to adapt its model to handle concept drift and feature evolution for textual data streams. Experiments have been performed on textual datasets, where efficient and promising results are obtained.


Author(s):  
Gladys Castillo ◽  
João Gama ◽  
Ana M. Breda

This chapter presents an adaptive predictive model for a student modeling prediction task in the context of an adaptive educational hypermedia system (AEHS). The task, which consists of determining what kind of learning resources are more appropriate to a particular learning style, presents two critical issues. The first is related to the uncertainty of the information about the student’s learning style acquired by psychometric instruments. The second is related to the changes over time of the student’s preferences (concept drift). To approach this task, we propose a probabilistic adaptive predictive model that includes a method to handle concept drift based on statistical quality control. We claim that our approach is able to adapt quickly to changes in the student’s preferences and that it can be successfully used in similar user modeling prediction tasks where uncertainty and concept drift are present.
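Drift handling based on statistical quality control can be sketched as a control chart on the model's running error rate. The version below is an assumption-laden illustration, not the chapter's method: the class name `ErrorRateControlChart`, the 2-sigma warning and 3-sigma drift limits, and the `min_samples` warm-up are all illustrative choices.

```python
import math


class ErrorRateControlChart:
    """Control chart on a running error rate (illustrative sketch).

    The limits (2 and 3 standard deviations above the best level seen so far)
    and the warm-up length are assumptions, not the chapter's settings.
    """

    def __init__(self, min_samples: int = 30) -> None:
        self.min_samples = min_samples
        self.reset()

    def reset(self) -> None:
        self.n = 0
        self.errors = 0
        self.best = math.inf  # lowest value of p + s observed so far
        self.p_min = 0.0
        self.s_min = 0.0

    def update(self, is_error: bool) -> str:
        """Record one prediction outcome; return the chart state."""
        self.n += 1
        self.errors += int(is_error)
        p = self.errors / self.n                 # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)    # its binomial standard deviation
        if self.n < self.min_samples:
            return "in_control"
        if p + s < self.best:
            self.best, self.p_min, self.s_min = p + s, p, s
        if p + s > self.p_min + 3.0 * self.s_min:
            self.reset()  # out of control: start monitoring a new concept
            return "drift"
        if p + s > self.p_min + 2.0 * self.s_min:
            return "warning"
        return "in_control"
```

A caller would feed `update` with whether each prediction turned out wrong and rebuild or adapt the predictive model when `"drift"` is returned; `"warning"` can be used to start buffering recent examples for that rebuild.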


Author(s):  
Meenakshi Anurag Thalor ◽  
Shrishailapa Patil

Incremental learning on non-stationary distributions has been shown to be a very challenging problem in machine learning and data mining, because the joint probability distribution between the data and classes changes over time. Many real-time problems suffer from concept drift as they change over time. One example is an advertisement recommendation system, in which customers’ behavior may change depending on the season of the year, on inflation and on new products made available. An extra challenge arises when the classes to be learned are not represented equally in the training data, i.e. the classes are imbalanced, as most machine learning algorithms work well only when the training data is balanced. The objective of this paper is to develop an ensemble-based classification algorithm for non-stationary data streams (ENSDS) with a focus on two-class problems. In addition, we present an exhaustive comparison of the proposed algorithm with state-of-the-art classification approaches using different evaluation measures such as recall, f-measure and g-mean.
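For readers unfamiliar with the evaluation measures named above, the following sketch computes them from a two-class confusion matrix using their standard definitions. The function name `imbalance_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict


def imbalance_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Recall, F-measure and G-mean for the positive (minority) class."""
    recall = tp / (tp + fn) if tp + fn else 0.0        # true positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0   # true negative rate
    if precision + recall:
        f_measure = 2.0 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    g_mean = math.sqrt(recall * specificity)
    return {"recall": recall, "f_measure": f_measure, "g_mean": g_mean}


# Illustrative counts only: 10 minority and 90 majority examples.
scores = imbalance_metrics(tp=8, fp=10, fn=2, tn=80)
```

G-mean rewards a classifier only when it does well on both classes, which is why it is commonly reported alongside recall and F-measure for imbalanced data.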


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1080
Author(s):  
Namuk Park ◽  
Songkuk Kim

Efficient and accurate estimation of the probability distribution of a data stream is an important problem in many sensor systems. It is especially challenging when the data stream is non-stationary, i.e., its probability distribution changes over time. Statistical models for non-stationary data streams demand agile adaptation to concept drift while tolerating temporal fluctuations. To this end, a statistical model needs to forget old data samples and to detect concept drift swiftly. In this paper, we propose FlexSketch, an online probability density estimation algorithm for data streams. Our algorithm uses an ensemble of histograms, each of which represents a different length of data history. FlexSketch updates each histogram for every new data sample and generates a probability distribution by combining the ensemble of histograms, while periodically monitoring the discrepancy between recent data and the existing models. When it detects concept drift, a new histogram is added to the ensemble and the oldest histogram is removed. This allows us to estimate the probability density function with high update speed and high accuracy using only limited memory. Experimental results demonstrate that our algorithm shows improved speed and accuracy compared to existing methods for both stationary and non-stationary data streams.
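The update loop described above can be sketched in a much simplified form. This is not FlexSketch itself: the class name `HistogramEnsemble`, the fixed equal-width bins, the L1 distance used as the discrepancy measure, and the `threshold`, `check_every` and `max_models` values are all assumptions made for illustration. In this sketch the oldest histogram is dropped only once the ensemble exceeds `max_models`.

```python
from collections import deque
from typing import Deque, List


class HistogramEnsemble:
    """Simplified ensemble-of-histograms estimator with periodic drift checks."""

    def __init__(self, lo: float, hi: float, bins: int = 10,
                 max_models: int = 2, check_every: int = 100,
                 threshold: float = 0.5) -> None:
        self.lo, self.hi, self.bins = lo, hi, bins
        self.max_models = max_models
        self.check_every = check_every
        self.threshold = threshold
        self.models: Deque[List[int]] = deque([[0] * bins])  # oldest first
        self.recent: List[int] = [0] * bins  # counts since the last check
        self.seen = 0
        self.drifts = 0

    def _bin(self, x: float) -> int:
        pos = int((x - self.lo) / (self.hi - self.lo) * self.bins)
        return min(max(pos, 0), self.bins - 1)  # clamp out-of-range samples

    @staticmethod
    def _normalise(counts: List[int]) -> List[float]:
        total = sum(counts)
        return [c / total for c in counts]

    def pmf(self) -> List[float]:
        """Probability per bin, averaged over all non-empty histograms."""
        dists = [self._normalise(m) for m in self.models if sum(m) > 0]
        if not dists:
            return [1.0 / self.bins] * self.bins
        return [sum(d[i] for d in dists) / len(dists) for i in range(self.bins)]

    def update(self, x: float) -> bool:
        """Add one sample to every histogram; return True if drift is flagged."""
        b = self._bin(x)
        for model in self.models:
            model[b] += 1
        self.recent[b] += 1
        self.seen += 1
        if self.seen % self.check_every:
            return False
        # Periodic check: L1 distance between recent data and the ensemble.
        recent_pmf = self._normalise(self.recent)
        distance = sum(abs(p - q) for p, q in zip(self.pmf(), recent_pmf))
        drift = distance > self.threshold
        if drift:
            self.drifts += 1
            self.models.append(list(self.recent))  # new histogram from recent data
            if len(self.models) > self.max_models:
                self.models.popleft()              # forget the oldest history
        self.recent = [0] * self.bins
        return drift
```

Because each histogram was created at a different time, the ensemble mixes long and short histories, and dropping the oldest one is what lets the estimate forget a concept that is no longer valid. `pmf` returns per-bin probabilities; dividing by the bin width would give a density.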


2008 ◽  
pp. 562-578 ◽  
Author(s):  
Gladys Castillo ◽  
João Gama ◽  
Ana M. Breda

This chapter presents an adaptive predictive model for a student modeling prediction task in the context of an adaptive educational hypermedia system (AEHS). The task, which consists of determining what kind of learning resources are more appropriate to a particular learning style, presents two critical issues. The first is related to the uncertainty of the information about the student’s learning style acquired by psychometric instruments. The second is related to the changes over time of the student’s preferences (concept drift). To approach this task, we propose a probabilistic adaptive predictive model that includes a method to handle concept drift based on statistical quality control. We claim that our approach is able to adapt quickly to changes in the student’s preferences and that it can be successfully used in similar user modeling prediction tasks where uncertainty and concept drift are present.


2011 ◽  
pp. 1307-1324
Author(s):  
Gladys Castillo ◽  
João Gama ◽  
Ana M. Breda

This chapter presents an adaptive predictive model for a student modeling prediction task in the context of an adaptive educational hypermedia system (AEHS). The task, which consists of determining what kind of learning resources are more appropriate to a particular learning style, presents two critical issues. The first is related to the uncertainty of the information about the student’s learning style acquired by psychometric instruments. The second is related to the changes over time of the student’s preferences (concept drift). To approach this task, we propose a probabilistic adaptive predictive model that includes a method to handle concept drift based on statistical quality control. We claim that our approach is able to adapt quickly to changes in the student’s preferences and that it can be successfully used in similar user modeling prediction tasks where uncertainty and concept drift are present.




VASA ◽  
2015 ◽  
Vol 44 (5) ◽  
pp. 355-362 ◽  
Author(s):  
Marie Urban ◽  
Alban Fouasson-Chailloux ◽  
Isabelle Signolet ◽  
Christophe Colas Ribas ◽  
Mathieu Feuilloy ◽  
...  

Background: We aimed to estimate the agreement between the Medicap® (photo-optical) and Radiometer® (electro-chemical) sensors during exercise transcutaneous oxygen pressure (tcpO2) tests. Our hypothesis was that although absolute starting values (tcpO2rest: mean over 2 minutes) might differ, tcpO2 changes over time and the minimal value of the decrease from rest of oxygen pressure (DROPmin) at exercise would be concordant between the two systems. Patients and methods: Forty-seven patients with arterial claudication (65 ± 7 years) performed a treadmill test with 5 probes each of the electro-chemical and photo-optical devices simultaneously, one of each system on the chest, on each buttock and on each calf. Results: Seventeen Medicap® probes disconnected during the tests. tcpO2rest and DROPmin values were higher with Medicap® than with Radiometer®, by 13.7 ± 17.1 mm Hg and 3.4 ± 11.7 mm Hg, respectively. Despite the differences in absolute starting values, changes over time were similar between the two systems. The concordance between the two systems was approximately 70% for classification of test results from DROPmin. Conclusions: Photo-optical sensors are promising alternatives to electro-chemical sensors for exercise oximetry, provided that miniaturisation and weight reduction of the new sensors are possible.
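The two kinds of agreement figures reported above, a mean difference with its standard deviation and a percentage of concordant classifications, can be computed from paired readings roughly as follows. This is a minimal sketch under stated assumptions: the function name `paired_agreement`, the sample values and the cutoff are hypothetical and do not come from the study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_agreement(a: Sequence[float], b: Sequence[float],
                     cutoff: float) -> Tuple[float, float, float]:
    """Return (mean difference, SD of differences, % concordant classifications).

    A reading is classified as abnormal when it falls below `cutoff`;
    the cutoff is a caller-supplied assumption, not a value from the study.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)        # systematic offset between the two systems
    spread = stdev(diffs)     # sample standard deviation of the offsets
    same = sum((x < cutoff) == (y < cutoff) for x, y in zip(a, b))
    return bias, spread, 100.0 * same / len(diffs)


# Made-up paired values in mm Hg, for illustration only.
system_a = [-10.0, -20.0, -5.0, -30.0]
system_b = [-12.0, -14.0, -8.0, -33.0]
bias, spread, concordance = paired_agreement(system_a, system_b, cutoff=-15.0)
```

Reporting the offset and the classification concordance separately matters because two systems can differ in absolute values yet still classify most tests the same way.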

