Data stream mining: methods and challenges for handling concept drift

2019 ◽  
Vol 1 (11) ◽  
Author(s):  
Scott Wares ◽  
John Isaacs ◽  
Eyad Elyan

Abstract: Mining and analysing streaming data is crucial for many applications, and this area of research has gained extensive attention over the past decade. However, there are several inherent problems that continue to challenge the hardware and the state-of-the-art algorithmic solutions. Examples of such problems include the unbounded size, varying speed and unknown data characteristics of arriving instances from a data stream. The aim of this research is to portray key challenges faced by algorithmic solutions for stream mining, particularly focusing on the prevalent issue of concept drift. A comprehensive discussion of concept drift and its inherent data challenges in the context of stream mining is presented, as is a critical, in-depth review of relevant literature. Current issues with the evaluative procedure for concept drift detectors are also explored, highlighting problems such as a lack of established base datasets and the impact of temporal dependence on concept drift detection. By exposing gaps in the current literature, this study suggests recommendations for future research which should aid in the progression of stream mining and concept drift detection algorithms.

2021 ◽  
pp. 1-14
Author(s):  
Hanqing Hu ◽  
Mehmed Kantardzic

Real-world data stream classification often deals with multiple types of concept drift, categorized by change characteristics such as speed, distribution, and severity. When labels are unavailable, the traditional concept drift detection algorithms used in stream classification frameworks often focus on only one type of concept drift. To overcome the limitations of traditional detection algorithms, this study proposes a Heuristic Ensemble Framework for Drift Detection (HEFDD). HEFDD aims to detect all types of concept drift by employing an ensemble of selected concept drift detection algorithms, each capable of detecting at least one type of concept drift. Experimental results show that HEFDD provides a significant improvement, based on the z-score test, when its detection accuracy is compared with that of state-of-the-art individual algorithms. At the same time, HEFDD is able to reduce the false alarms generated by individual concept drift detection algorithms.
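The ensemble idea in this abstract can be illustrated with a minimal Python sketch. It is not the HEFDD heuristic itself, which the abstract does not specify: the two member detectors (`abrupt_detector`, `gradual_detector`), the `VotingDriftEnsemble` class and the `min_votes` rule are hypothetical names and simplifications chosen only for illustration. Each member targets one kind of change, and the ensemble raises an alarm when enough members agree.

```python
from typing import Callable, Sequence

# A detector consumes one monitored value and reports whether it sees drift.
Detector = Callable[[float], bool]


def abrupt_detector(jump: float) -> Detector:
    """Hypothetical member: flags a sudden change between consecutive values."""
    previous = None

    def detect(value: float) -> bool:
        nonlocal previous
        fired = previous is not None and abs(value - previous) > jump
        previous = value
        return fired

    return detect


def gradual_detector(baseline: float, tolerance: float,
                     smoothing: float = 0.1) -> Detector:
    """Hypothetical member: flags a slow change once an exponential
    moving average moves further than `tolerance` from `baseline`."""
    average = baseline

    def detect(value: float) -> bool:
        nonlocal average
        average = (1 - smoothing) * average + smoothing * value
        return abs(average - baseline) > tolerance

    return detect


class VotingDriftEnsemble:
    """Signals drift when at least `min_votes` member detectors fire."""

    def __init__(self, detectors: Sequence[Detector], min_votes: int = 1):
        self.detectors = list(detectors)
        self.min_votes = min_votes

    def update(self, value: float) -> bool:
        # Every member sees every value so that stateful members stay in sync.
        votes = sum(1 for detect in self.detectors if detect(value))
        return votes >= self.min_votes
```

With `min_votes=1` the ensemble reacts to any drift type that some member can see; raising `min_votes` trades sensitivity for fewer false alarms.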


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Xiangjun Li ◽  
Yong Zhou ◽  
Ziyan Jin ◽  
Peng Yu ◽  
Shun Zhou

Data stream mining has become a research hotspot in data mining and has attracted the attention of many scholars. However, the traditional data stream mining technology still has some problems to be solved in dealing with concept drift and concept evolution. In order to alleviate the influence of concept drift and concept evolution on novel class detection and classification, this paper proposes a classification and novel class detection algorithm based on the cohesiveness and separation index of Mahalanobis distance. Experimental results show that the algorithm can effectively mitigate the impact of concept drift on classification and novel class detection.
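For readers unfamiliar with the distance measure named here, the following Python sketch computes the Mahalanobis distance from a point to a cluster of two-dimensional instances. It shows the distance only, not the paper's cohesiveness and separation index, whose definition is not given in this abstract. The function names and the restriction to two features are assumptions made to keep the example dependency-free.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_and_covariance(points: Sequence[Point]):
    """Return the mean and the sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to estimate covariance")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return (mean_x, mean_y), (sxx, sxy, syy)


def mahalanobis(point: Point, mean: Point, cov) -> float:
    """Mahalanobis distance of `point` from a cluster with the given mean
    and 2x2 covariance matrix [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form d^T * inverse(cov) * d, with the 2x2 inverse written out.
    squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(squared)
```

An instance whose distance to every known class cluster is large is a natural candidate for a novel class, whereas a plain Euclidean distance would ignore how spread out and correlated each cluster is.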


Author(s):  
S. Priya ◽  
R. Annie Uthra

Abstract: Data science has become popular as a means of supporting and improving decision-making. With the wide range of applications for data streaming, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models have been found useful for the classification of concept drift in data streaming applications. This paper presents an effective class imbalance with concept drift detection (CIDD) technique using Adadelta optimizer-based deep neural networks (ADODNN), named the CIDD-ADODNN model, for the classification of highly imbalanced streaming data. The presented model involves four processes, namely preprocessing, class imbalance handling, concept drift detection, and classification. The proposed model uses the adaptive synthetic (ADASYN) technique for handling class-imbalanced data, which applies a weighted distribution to minority class examples based on their level of difficulty in learning. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect the existence of concept drift. The ADODNN model is then used for classification. To increase the classification performance of the DNN model, an ADO-based hyperparameter tuning process determines the optimal parameters of the DNN model. The performance of the presented model is evaluated using three streaming datasets, namely the intrusion detection (NSL KDDCup) dataset, the Spam dataset, and the Chess dataset. A detailed comparative analysis of the results is carried out, and the simulation results verify the superior performance of the presented model, which obtains maximum accuracies of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
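The window-based drift detection step can be sketched in Python as follows. This is a deliberately simplified stand-in, not ADWIN itself: ADWIN adapts its window length and checks many split points, whereas this hypothetical `TwoWindowDriftDetector` keeps a fixed-size window and compares only its older and newer halves. The threshold is a Hoeffding-style bound of the form used for ADWIN's cut test, and it assumes the monitored values (for example, per-instance 0/1 prediction errors) lie in [0, 1].

```python
import math
from collections import deque


class TwoWindowDriftDetector:
    """Simplified window-based drift check (not full ADWIN).

    Flags drift when the means of the older and newer halves of a
    fixed-size window differ by more than a Hoeffding-style bound.
    """

    def __init__(self, window_size: int = 100, delta: float = 0.002):
        if window_size < 2 or window_size % 2:
            raise ValueError("window_size must be an even number >= 2")
        self.window = deque(maxlen=window_size)
        self.half = window_size // 2
        # With two halves of equal length n, their harmonic-mean size is
        # n / 2, so sqrt(ln(4 / delta) / (2 * (n / 2))) = sqrt(ln(4 / delta) / n).
        self.epsilon = math.sqrt(math.log(4.0 / delta) / self.half)

    def update(self, value: float) -> bool:
        """Add one value in [0, 1]; return True if drift is flagged."""
        self.window.append(value)
        if len(self.window) < self.window.maxlen:
            return False  # not enough data to compare two halves yet
        items = list(self.window)
        mean_old = sum(items[:self.half]) / self.half
        mean_new = sum(items[self.half:]) / self.half
        if abs(mean_new - mean_old) > self.epsilon:
            self.window.clear()  # forget the old concept and start afresh
            return True
        return False
```

In a pipeline like the one described, a detector of this kind would typically be fed the classifier's error stream, and a flagged drift would trigger retraining or model replacement. A smaller `delta` makes the test more conservative.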


2016 ◽  
Vol 64 (8) ◽  
pp. 643 ◽  
Author(s):  
Christopher N. Johnson

Since the 1960s, Australian scientists have speculated on the impact of human arrival on fire regimes in Australia, and on the relationship of landscape fire to extinction of the Pleistocene megafauna of Australia. These speculations have produced a series of contrasting hypotheses that can now be tested using evidence collected over the past two decades. In the present paper, I summarise those hypotheses and review that evidence. The main conclusions are that (1) the effects of people on fire regimes in the Pleistocene were modest at the continental scale, and difficult to distinguish from climatic controls on fire, (2) the arrival of people triggered extinction of Australia’s megafauna, but fire had little or no role in the extinction of those animals, which was probably due primarily to hunting, and (3) megafaunal extinction is likely to have caused a cascade of changes that included increased fire, but only in some environments. We do not yet understand what environmental factors controlled the strength and nature of cascading effects of megafaunal extinction. This is an important topic for future research.


10.28945/4058 ◽  
2018 ◽  
Vol 17 ◽  
pp. 113-126
Author(s):  
Gina Harden ◽  
Robert M. Crocker ◽  
Kelly Noe

Aim/Purpose: The dynamic nature of the information systems (IS) field presents educators with the perpetual challenge of keeping course offerings current and relevant. This paper describes the process at a College of Business (COB) to redesign the introductory IS course to better prepare students for advanced business classes and equip them with interdisciplinary knowledge and skills demanded in today’s workplace. Background: The course was previously in the Computer Science (CSC) Department, itself within the COB. However, an administrative restructuring resulted in the CSC department’s removal from the COB and left the core course in limbo. Methodology: This paper presents a case study using focus groups with students, faculty, and advisory council members to assess the value of the traditional introductory course. A survey was distributed to students after implementation of the newly developed course to assess the reception of the course. Contribution: This paper provides an outline of the decision-making process leading to the course redesign of the introductory IS course, including the context and the process of a new course development. Practical suggestions for implementing and teaching an introductory IS course in a business school are given. Findings: Focus group assessment revealed that stakeholders rated the existing introductory IS course of minimal value as students progressed through the COB program, and even less upon entering the workforce. The findings indicated a complete overhaul of the course was required. Recommendations for Practitioners: The subject of technology sometimes requires more than a simple update to the curriculum. When signs point to the need for a complete overhaul, this paper gives practical guidance supplemented with relevant literature for other academicians to follow. 
Recommendation for Researchers: Students are faced with increasing pressure to be proficient with the latest technology, both in the classroom, where educators are trying to prepare them for the modern workplace, and in the organization, which faces an even greater pressure to leverage the latest technology. The newly designed introductory IS course provides students, and eventually organizations, a better measure of this proficiency. Future Research: Future research on the efficacy of this new course design should include longitudinal data to determine the impact on graduates, and eventually the assessment of those graduates’ performance in the workplace.

