On an Improved SPRINT Data Stream Online Classification Algorithm

To address the characteristics of data streams, this paper presents an incremental decision tree algorithm that builds on the SPRINT algorithm and uses a binary attribute tree. The attribute set of the improved algorithm adopts maximum-entropy attribute classification and a dynamic storage method based on the Bayesian method. This replaces the static organization of the candidate attribute set used by the traditional SPRINT algorithm. As a result, the algorithm is better suited to concept drift, reduces the time complexity of inserting new samples and selecting the best split node, saves storage space, and increases classification efficiency.
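
For context on "best division node selection": the classic SPRINT algorithm evaluates candidate splits with the Gini index. The sketch below shows that baseline criterion for one numeric attribute only. It is not the improved algorithm from this abstract, and the function names `gini` and `best_numeric_split` are illustrative assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_numeric_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct
    sorted values. Returns (None, inf) if no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
        if score < best_score:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_threshold, best_score
```

This version recomputes the class counts for every candidate, which is fine for illustration; an incremental variant would update the counts while scanning the sorted values.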

Random Tree Data Stream Classifier With Sliding Window Estimator And Concept Drift

Bioscience Biotechnology Research Communications ◽

10.21786/bbrc/12.1/25 ◽

2019 ◽

Vol 12 (1) ◽

pp. 219-228

Author(s):

Ebtesam Almalki ◽

Manal Abdullah

Keyword(s):

Data Stream ◽

Concept Drift ◽

Sliding Window ◽

Random Tree ◽

Tree Data

Analyzing and repairing concept drift adaptation in data stream classification

Machine Learning ◽

10.1007/s10994-021-05993-w ◽

2021 ◽

Author(s):

Ben Halstead ◽

Yun Sing Koh ◽

Patricia Riddle ◽

Russel Pears ◽

Mykola Pechenizkiy ◽

...

Keyword(s):

Data Stream ◽

Concept Drift ◽

Stream Classification ◽

Data Stream Classification

Bhattacharyya Distance based Concept Drift Detection Method For evolving data stream

Expert Systems with Applications ◽

10.1016/j.eswa.2021.115303 ◽

2021 ◽

pp. 115303

Author(s):

Ishwar Baidari ◽

Nagaraj Honnikoll

Keyword(s):

Data Stream ◽

Detection Method ◽

Concept Drift ◽

Bhattacharyya Distance ◽

Concept Drift Detection ◽

Evolving Data

Learning from Ontology Streams with Semantic Concept Drift

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/133 ◽

2017 ◽

Cited By ~ 7

Author(s):

Jiaoyan Chen ◽

Freddy Lecue ◽

Jeff Z. Pan ◽

Huajun Chen

Keyword(s):

Semantic Web ◽

Data Stream ◽

Concept Drift ◽

Data Distribution ◽

Accurate Prediction ◽

Knowledge Structures ◽

Semantic Concept ◽

Web Data ◽

Semantic Inference

Data stream learning has been widely studied for extracting knowledge structures from continuous and rapid data records. In the Semantic Web, data is interpreted in ontologies and its ordered sequence is represented as an ontology stream. Our work exploits the semantics of such streams to tackle the problem of concept drift, i.e., unexpected changes in data distribution that cause most models to become less accurate as time passes. To this end we revisited (i) semantic inference in the context of supervised stream learning, and (ii) models with semantic embeddings. The experiments show accurate prediction with data from Dublin and Beijing.

An Improved Differential Evolution Algorithm for Data Stream Clustering

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v9i4.pp2659-2667 ◽

2019 ◽

Vol 9 (4) ◽

pp. 2659

Author(s):

Bhaskar Adepu ◽

Jayadev Gyani ◽

G. Narsimha

Keyword(s):

Differential Evolution ◽

Data Stream ◽

Concept Drift ◽

Differential Evolution Algorithm ◽

Optimization Approach ◽

Stream Clustering ◽

Data Stream Clustering ◽

Evolution Algorithm ◽

Improved Differential Evolution Algorithm ◽

Measure Estimate

Researchers have implemented several algorithms for clustering data streams. Most of these algorithms require the user to fix the number of clusters (K) based on the input data, and K then stays fixed throughout the clustering process. Choosing K is therefore a recurring difficulty in stream clustering. In this paper, we propose an efficient approach for data stream clustering that adopts an Improved Differential Evolution (IDE) algorithm. The IDE algorithm is a fast, powerful, and efficient global optimization approach for automatic clustering. Our approach also applies an entropy-based method to detect concept drift in the data stream and thereby update the clustering procedure online. We compare the proposed method with a Genetic Algorithm and find it to be the more efficient optimization algorithm. The proposed technique achieves an accuracy of 92.29%, a precision of 86.96%, a recall of 90.30%, and an F-measure of 88.60%.
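
As a quick consistency check on the reported figures, the F-measure can be recomputed from the stated precision and recall. The sketch assumes the standard balanced F-measure (harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, expressed as fractions.
reported = f_measure(0.8696, 0.9030)
print(f"{reported:.4f}")
```

Under that assumption the result rounds to 0.8860, which matches the reported 88.60%.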

Heuristic ensemble for unsupervised detection of multiple types of concept drift in data stream classification

Intelligent Decision Technologies ◽

10.3233/idt-210115 ◽

2021 ◽

pp. 1-14

Author(s):

Hanqing Hu ◽

Mehmed Kantardzic

Keyword(s):

Data Stream ◽

Concept Drift ◽

False Alarms ◽

Detection Accuracy ◽

Real World Data ◽

Traditional Concept ◽

Stream Classification ◽

Data Stream Classification ◽

Detection Algorithms ◽

Concept Drift Detection

Real-world data stream classification often deals with multiple types of concept drift, categorized by change characteristics such as speed, distribution, and severity. When labels are unavailable, traditional concept drift detection algorithms, used in stream classification frameworks, are often focused on only one type of concept drift. To overcome the limitations of traditional detection algorithms, this study proposed a Heuristic Ensemble Framework for Drift Detection (HEFDD). HEFDD aims to detect all types of concept drift by employing an ensemble of selected concept drift detection algorithms, each capable of detecting at least one type of concept drift. Experimental results show HEFDD provides significant improvement based on the z-score test when comparing detection accuracy with state-of-the-art individual algorithms. At the same time, HEFDD is able to reduce false alarms generated by individual concept drift detection algorithms.
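
The idea of combining several label-free drift detectors can be sketched as a simple voting scheme over two data windows. This is a generic illustration, not the HEFDD heuristic, which the abstract does not specify. The two detectors, their thresholds, and the `min_votes` rule are all assumptions made for the example.

```python
from statistics import mean, pstdev


def mean_shift(reference, current, threshold=1.0):
    """Flag drift when the window mean moves by more than `threshold`
    reference standard deviations (hypothetical detector)."""
    spread = pstdev(reference) or 1e-9  # avoid division by zero
    return abs(mean(current) - mean(reference)) / spread > threshold


def spread_shift(reference, current, ratio=2.0):
    """Flag drift when the standard deviation changes by more than a
    factor of `ratio` in either direction (hypothetical detector)."""
    ref = pstdev(reference) or 1e-9
    cur = pstdev(current) or 1e-9
    return max(ref, cur) / min(ref, cur) > ratio


def ensemble_drift(reference, current, detectors, min_votes=1):
    """Report drift when at least `min_votes` detectors fire."""
    votes = sum(1 for detect in detectors if detect(reference, current))
    return votes >= min_votes
```

With `min_votes=1` the ensemble reacts to any type of change that at least one detector covers; raising `min_votes` trades sensitivity for fewer false alarms.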

Knowledge Discovery From Evolving Data Streams

Advances in Business Information Systems and Analytics - Machine Learning Techniques for Improved Business Analytics ◽

10.4018/978-1-5225-3534-8.ch002 ◽

2019 ◽

pp. 19-39

Author(s):

Prasanna Lakshmi Kompalli

Keyword(s):

Real Time ◽

Data Streams ◽

Data Stream ◽

Concept Drift ◽

Data Stream Mining ◽

Time Data ◽

Stream Mining ◽

New Challenges ◽

Mining Data Streams ◽

Different Sources

Data coming from different sources is referred to as data streams. Data stream mining is an online learning technique in which each data point must be processed as it arrives and discarded once processing is complete. Advances in technology have made it possible to monitor these data streams in real time, which has created many new challenges for researchers. The main features of this type of data are that it is fast flowing, large in volume, continuous, and growing, and that its characteristics may change over time, which is termed concept drift. This chapter addresses the problems in mining data streams with concept drift. Because of these problems, isolating the relevant literature is a grueling task for researchers and practitioners. This chapter tries to provide a solution by bringing together all the techniques used for data stream mining with concept drift.
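
The "process each point as it arrives, then discard it" constraint can be illustrated with a one-pass summary that never stores the stream. The sketch uses Welford's online algorithm for the running mean and variance; the class name `RunningStats` is an assumption for the example and does not come from the chapter.

```python
class RunningStats:
    """One-pass mean and population variance (Welford's algorithm).

    Each value is folded into three numbers and can then be discarded,
    so memory use stays constant however long the stream runs.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / self.n if self.n else 0.0


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:  # stands in for an unbounded stream
    stats.update(value)
```

Summaries of this kind are a common building block for stream learners, since a change in the running statistics is one possible signal of concept drift.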

Data Stream Mining Using Ensemble Classifier

Collaborative Filtering Using Data Mining and Analysis - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-0489-4.ch013 ◽

2017 ◽

pp. 236-249

Author(s):

Snehlata Sewakdas Dongre ◽

Latesh G. Malik

Keyword(s):

Collaborative Filtering ◽

Data Stream ◽

Concept Drift ◽

Ensemble Classifier ◽

Ensemble Classification ◽

Data Stream Mining ◽

Main Concern ◽

Stream Mining ◽

Stream Classification ◽

Data Stream Classification

A data stream is a giant amount of data generated uncontrollably at a rapid rate by many applications, such as call detail records, log records, and sensor applications. Data stream mining has attracted the attention of many researchers. A rising problem in data streams is the handling of concept drift. A good algorithm should adapt to changes and handle concept drift properly. An ensemble classification method is a group of classifiers that work collaboratively. Overall, this chapter covers all aspects of data stream classification. Its mission is to discuss various techniques that use collaborative filtering for data stream mining, and its main concern is to familiarize the reader with the data stream domain and data stream mining. Instead of a single classifier, a group of classifiers is used to enhance classification accuracy. Collaborative filtering plays an important role here in how the different classifiers work together within the ensemble to achieve a goal.
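
One common way for a group of classifiers to act together is majority voting. The sketch below assumes that combination rule, which the abstract does not confirm, and uses three hypothetical threshold rules as stand-ins for trained classifiers.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Return the label predicted by the most classifiers for input x."""
    votes = Counter(clf(x) for clf in classifiers)
    label, _ = votes.most_common(1)[0]
    return label


# Hypothetical members: each is a simple threshold rule on one number.
members = [
    lambda x: "spam" if x > 0.5 else "ham",
    lambda x: "spam" if x > 0.7 else "ham",
    lambda x: "spam" if x > 0.2 else "ham",
]
```

For the input 0.6 the members vote spam, ham, spam, so the ensemble returns "spam" even though one member disagrees.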

CLASSIFICATION OF CONCEPT DRIFT IN EVOLVING DATA STREAM

Emerging Extended Reality Technologies For Industry 4.0 ◽

10.1002/9781119654674.ch11 ◽

2020 ◽

pp. 189-205

Author(s):

Mashail Althabiti ◽

Manal Abdullah

Keyword(s):

Data Stream ◽

Concept Drift ◽

Evolving Data

An ensemble based on neural networks with random weights for online data stream regression

Soft Computing ◽

10.1007/s00500-019-04499-x ◽

2019 ◽

Vol 24 (13) ◽

pp. 9835-9855 ◽

Cited By ~ 3

Author(s):

Ricardo de Almeida ◽

Yee Mey Goh ◽

Radmehr Monfared ◽

Maria Teresinha Arns Steiner ◽

Andrew West

Keyword(s):

Data Stream ◽

Prediction Accuracy ◽

Concept Drift ◽

Learning Algorithms ◽

Data Distribution ◽

Machine Learning Algorithms ◽

Computational Time ◽

Online Data ◽

Data Prediction ◽

Random Weights

Most information sources in the current technological world are generating data sequentially and rapidly, in the form of data streams. The evolving nature of processes may often cause changes in data distribution, also known as concept drift, which is difficult to detect and causes loss of accuracy in supervised learning algorithms. As a consequence, online machine learning algorithms that are able to update actively according to possible changes in the data distribution are required. Although many strategies have been developed to tackle this problem, most of them are designed for classification problems. Therefore, in the domain of regression problems, there is a need for the development of accurate algorithms with dynamic updating mechanisms that can operate in a computational time compatible with today’s demanding market. In this article, the authors propose a new bagging ensemble approach based on neural network with random weights for online data stream regression. The proposed method improves the data prediction accuracy as well as minimises the required computational time compared to a recent algorithm for online data stream regression from literature. The experiments are carried out using four synthetic datasets to evaluate the algorithm’s response to concept drift, along with four benchmark datasets from different industries. The results indicate improvement in data prediction accuracy, effectiveness in handling concept drift, and much faster updating times compared to the existing available approach. Additionally, the use of design of experiments as an effective tool for hyperparameter tuning is demonstrated.
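
A bagging ensemble of random-weight networks for streaming regression can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the hidden weights are random and fixed, the output weights are updated with a plain least-mean-squares step, and each member sees every sample a Poisson(1) number of times, as in standard online bagging. All class names, the learning rate, and the layer sizes are hypothetical.

```python
import math
import random


class RandomWeightNet:
    """Single hidden layer with fixed random weights; only the linear
    output weights are learned (scalar input for simplicity)."""

    def __init__(self, n_hidden, rng, lr=0.05):
        self.w = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.out = [0.0] * (n_hidden + 1)  # last entry is the output bias
        self.lr = lr

    def _features(self, x):
        return [math.tanh(w * x + b) for w, b in zip(self.w, self.b)] + [1.0]

    def predict(self, x):
        return sum(o * h for o, h in zip(self.out, self._features(x)))

    def learn(self, x, y):
        """One least-mean-squares step on the output weights."""
        h = self._features(x)
        err = y - sum(o * hi for o, hi in zip(self.out, h))
        self.out = [o + self.lr * err * hi for o, hi in zip(self.out, h)]


def poisson1(rng):
    """Draw from Poisson(1) using Knuth's multiplication method."""
    limit, k, p = math.exp(-1.0), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


class OnlineBaggingRegressor:
    """Averages members that each see a sample Poisson(1) times."""

    def __init__(self, n_members=5, n_hidden=10, seed=0):
        self.rng = random.Random(seed)
        self.members = [RandomWeightNet(n_hidden, self.rng)
                        for _ in range(n_members)]

    def predict(self, x):
        return sum(m.predict(x) for m in self.members) / len(self.members)

    def learn(self, x, y):
        for member in self.members:
            for _ in range(poisson1(self.rng)):
                member.learn(x, y)
```

Because only the output weights change, each update costs time linear in the number of hidden units, which is one reason random-weight networks are attractive when updating times matter.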
