Data stream mining based real-time highspeed traffic classification

Data arriving continuously from different sources is referred to as a data stream. Data stream mining is an online learning technique in which each data point must be processed as it arrives and discarded once processing is complete. Advances in technology have made it possible to monitor these data streams in real time, which has created many new challenges for researchers. The main features of this type of data are that it is fast flowing, large in volume, continuous and growing in nature, and that its characteristics may change over time, which is termed concept drift. This chapter addresses the problems in mining data streams with concept drift. Isolating the relevant literature on this topic can be a grueling task for researchers and practitioners. This chapter tries to provide a solution by serving as an amalgamation of the techniques used for data stream mining with concept drift.
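As a rough illustration of the one-pass constraint described in this abstract, the sketch below processes each value as it arrives, keeps only a single running estimate, and lets that estimate follow a change in the data. It is a minimal sketch, not a method taken from the chapter: the function name `track_mean`, the parameter `alpha`, and the synthetic stream are all assumptions made for illustration.

```python
from typing import Iterable, Iterator


def track_mean(stream: Iterable[float], alpha: float = 0.2) -> Iterator[float]:
    """Yield an exponentially weighted mean after each arriving value.

    Each value is seen once and then discarded; only `estimate` is kept.
    A larger `alpha` (hypothetical parameter) forgets old data faster,
    so the estimate follows a drifting stream more quickly.
    """
    estimate = None
    for x in stream:
        if estimate is None:
            estimate = x  # first value initializes the estimate
        else:
            estimate = (1 - alpha) * estimate + alpha * x
        yield estimate


# Synthetic stream (assumption): the underlying mean jumps from 0 to 10.
stream = [0.0] * 50 + [10.0] * 50
estimates = list(track_mean(stream, alpha=0.2))
print(estimates[49], estimates[-1])
```

Because old values are down-weighted geometrically rather than stored, memory use stays constant however long the stream runs.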

Dynamic Online Traffic Classification Using Data Stream Mining

2008 International Conference on MultiMedia and Information Technology ◽

10.1109/mmit.2008.185 ◽

2008 ◽

Cited By ~ 6

Author(s):

Xu Tian ◽

Qiong Sun ◽

Xiaohong Huang ◽

Yan Ma

Keyword(s):

Data Stream ◽

Data Stream Mining ◽

Traffic Classification ◽

Stream Mining ◽

Using Data ◽

Online Traffic

Real-time Decision Rules for Diabetes Therapy Management by Data Stream Mining

IT Professional ◽

10.1109/mitp.2017.265104658 ◽

2017 ◽

pp. 1-1 ◽

Cited By ~ 3

Author(s):

Simon Fong ◽

Jinan Fiaidhi ◽

Sabah Mohammed ◽

Luiz Moutinho

Keyword(s):

Real Time ◽

Data Stream ◽

Decision Rules ◽

Data Stream Mining ◽

Stream Mining ◽

Diabetes Therapy ◽

Therapy Management

Concept Adapting Real-Time Data Stream Mining for Health Care Applications

Advances in Intelligent and Soft Computing - Advances in Computer Science, Engineering & Applications ◽

10.1007/978-3-642-30157-5_34 ◽

2012 ◽

pp. 341-351

Author(s):

Dipti D. Patil ◽

Jyoti G. Mudkanna ◽

Dnyaneshwar Rokade ◽

Vijay M. Wadhai

Keyword(s):

Health Care ◽

Real Time ◽

Data Stream ◽

Data Stream Mining ◽

Time Data ◽

Stream Mining ◽

Real Time Data ◽

Health Care Applications

Real-Time Analysis of Vital Signs Using Incremental Data Stream Mining Techniques with a Case Study of ARDS Under ICU Treatment

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2015.1504 ◽

2015 ◽

Vol 5 (5) ◽

pp. 1108-1115 ◽

Cited By ~ 1

Author(s):

Simon Fong ◽

Shirley W. I. Siu ◽

Suzy Zhou ◽

Jonathan H. Chan ◽

Sabah Mohammed ◽

...

Keyword(s):

Real Time ◽

Data Stream ◽

Vital Signs ◽

Data Stream Mining ◽

Time Analysis ◽

Stream Mining ◽

Real Time Analysis ◽

ICU Treatment

Online Incremental Learning for High Bandwidth Network Traffic Classification

Applied Computational Intelligence and Soft Computing ◽

10.1155/2016/1465810 ◽

2016 ◽

Vol 2016 ◽

pp. 1-13 ◽

Cited By ~ 5

Author(s):

H. R. Loo ◽

S. B. Joseph ◽

M. N. Marsono

Keyword(s):

Network Traffic ◽

Data Stream ◽

Concept Drift ◽

Distance Measures ◽

Manhattan Distance ◽

Data Stream Mining ◽

Traffic Classification ◽

Stream Mining ◽

Network Traffic Classification ◽

High Bandwidth

Data stream mining techniques are able to classify evolving data streams such as network traffic in the presence of concept drift. In order to classify high-bandwidth network traffic in real time, data stream mining classifiers need to be implemented on a reconfigurable high-throughput platform, such as a Field Programmable Gate Array (FPGA). This paper proposes an algorithm for online network traffic classification based on the concept of incremental k-means clustering to continuously learn from both labeled and unlabeled flow instances. Two distance measures for incremental k-means (Euclidean and Manhattan) are analyzed to measure their impact on network traffic classification in the presence of concept drift. The experimental results on real datasets show that the proposed algorithm performs consistently, with up to 94% average accuracy for both distance measures, even in the presence of concept drift. The proposed incremental k-means classification using Manhattan distance can classify network traffic 3 times faster than with Euclidean distance, at 671 thousand flow instances per second.
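The general idea of incremental k-means with a pluggable distance measure can be sketched as follows. This is a minimal sketch under stated assumptions and not the algorithm proposed in the paper: the class `IncrementalKMeans`, the one-centroid-per-label rule, the running-mean update, and the labels `"web"` and `"p2p"` are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a: Vector, b: Vector) -> float:
    # Needs no multiplication or square root, only differences and sums.
    return sum(abs(x - y) for x, y in zip(a, b))


class IncrementalKMeans:
    """Hypothetical one-pass clusterer: one centroid per seen label."""

    def __init__(self, distance: Callable[[Vector, Vector], float]):
        self.distance = distance
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []
        self.labels: List[Optional[str]] = []

    def learn_one(self, x: Vector, label: Optional[str] = None) -> Optional[str]:
        """Update the model with one flow instance, labeled or unlabeled."""
        if label is not None:
            # Labeled: only a centroid carrying the same label may absorb it.
            candidates = [i for i, l in enumerate(self.labels) if l == label]
        else:
            # Unlabeled: any existing centroid may absorb it.
            candidates = list(range(len(self.centroids)))
        if not candidates:
            self.centroids.append(list(x))
            self.counts.append(1)
            self.labels.append(label)
            return label
        i = min(candidates, key=lambda j: self.distance(x, self.centroids[j]))
        self.counts[i] += 1
        step = 1.0 / self.counts[i]  # running-mean update of the centroid
        self.centroids[i] = [c + step * (v - c) for c, v in zip(self.centroids[i], x)]
        return self.labels[i]

    def predict(self, x: Vector) -> Optional[str]:
        """Return the label of the nearest centroid, or None if untrained."""
        if not self.centroids:
            return None
        i = min(range(len(self.centroids)),
                key=lambda j: self.distance(x, self.centroids[j]))
        return self.labels[i]


model = IncrementalKMeans(manhattan)
model.learn_one([1.0, 1.0], "web")   # labeled flow (made-up features)
model.learn_one([9.0, 9.0], "p2p")   # labeled flow
model.learn_one([2.0, 1.0])          # unlabeled flow joins nearest centroid
print(model.predict([8.0, 8.5]))
```

Passing `euclidean` instead of `manhattan` to the constructor swaps the distance measure without changing anything else. The `1 / count` step weights all past instances equally; a constant step size would favor recent flows, which may suit drifting traffic better, though that trade-off is not evaluated here.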
