Delayed labelling evaluation for data streams

Data arriving continuously from different sources are referred to as data streams. Data stream mining is an online learning technique in which each data point must be processed as it arrives and discarded once processing is complete. Technological progress has made it possible to monitor these data streams in real time, which has created many new challenges for researchers. The main features of this type of data are that it flows fast, arrives in large volumes that are continuous and growing, and has characteristics that may change over time, a phenomenon termed concept drift. This chapter addresses the problems in mining data streams with concept drift. Because the literature on the topic is extensive, isolating the relevant work is a grueling task for researchers and practitioners. This chapter tries to provide a solution by bringing together the techniques used for data stream mining with concept drift.
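The process-then-discard loop and the effect of concept drift described above can be illustrated with a small sketch. This is not taken from the chapter: the function name `prequential_accuracy`, the majority-class learner, and the window size are all assumptions chosen for brevity. Each item is first used to test the current model, then used to update it, and is never stored.

```python
from collections import Counter, deque


def prequential_accuracy(stream, window=50):
    """Test-then-train over (x, y) pairs.

    Returns the accuracy over the last `window` predictions after each item.
    The learner is a majority-class predictor (an assumption for illustration).
    """
    counts = Counter()             # model state: class frequencies seen so far
    recent = deque(maxlen=window)  # correctness of the most recent predictions
    history = []
    for _x, y in stream:
        # 1) Predict with the current model (None before any data is seen).
        pred = counts.most_common(1)[0][0] if counts else None
        recent.append(pred == y)
        # 2) Learn from the item; the item itself is not kept.
        counts[y] += 1
        history.append(sum(recent) / len(recent))
    return history
```

On a stream whose label changes abruptly halfway through, the windowed accuracy drops sharply after the change, which is the kind of signal a drift-aware method would react to; this simple learner never forgets old data and therefore does not recover quickly.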

Dealing with Data Streams: Complex Event Processing vs. Data Stream Mining

Computational Science and Its Applications – ICCSA 2020 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-58811-3_1 ◽

2020 ◽

pp. 3-14

Author(s):

Moritz Lange ◽

Arne Koschel ◽

Irina Astrova

Keyword(s):

Data Streams ◽

Data Stream ◽

Complex Event Processing ◽

Data Stream Mining ◽

Event Processing ◽

Stream Mining

Effective Summarization of Multi-Dimensional Data Streams for Historical Stream Mining

19th International Conference on Scientific and Statistical Database Management (SSDBM 2007) ◽

10.1109/ssdbm.2007.32 ◽

2007 ◽

Cited By ~ 5

Author(s):

Samer Nassar ◽

Joerg Sander

Keyword(s):

Data Streams ◽

Stream Mining

Knowledge Discovery Using Data Stream Mining

Advances in Business Information Systems and Analytics - Social Network Analytics for Contemporary Business Organizations ◽

10.4018/978-1-5225-5097-6.ch012 ◽

2018 ◽

pp. 231-258

Author(s):

Prasanna Lakshmi Kompalli

Keyword(s):

Data Streams ◽

Data Stream ◽

Relevant Information ◽

Research Community ◽

Data Stream Mining ◽

Data Sets ◽

Stream Mining ◽

Real World Problem ◽

Using Data ◽

Over Time

In recent years, advances in technology have made it possible for most present-day organizations to store and record large streams of data. Such data sets, which grow continuously and rapidly over time, are referred to as data streams. Mining such data streams is both a unique opportunity and a challenging task. Data stream mining is the process of gaining knowledge from continuous and rapid records of data. With the increase in streaming information, data stream mining has attracted the attention of the research community in the recent past, and a voluminous literature has been published in this domain over the past few years. As a result, isolating the right study is a grueling task for researchers and practitioners. When addressing a real-world problem, it is difficult to find relevant information because it is hidden in the data streams. This chapter tries to provide a solution by bringing together the techniques used for data stream mining.

Research on Distributed Data Stream Mining of Financial Risk Based on Double Privacy Protection

10.21203/rs.3.rs-38957/v1 ◽

2020 ◽

Author(s):

Yuhao Zhao

Keyword(s):

Data Mining ◽

Data Streams ◽

Privacy Protection ◽

Data Stream ◽

Financial Risk ◽

Data Stream Mining ◽

Distributed Data ◽

Stream Mining ◽

Mining Technology ◽

Distributed Data Streams

With the advancement of network technology and large-scale computing, distributed data streams have been widely used in financial risk analysis. However, while data mining reveals financial models, it also increasingly poses a threat to privacy. Preventing privacy leakage while mining efficiently therefore poses new challenges to data mining technology. This article addresses privacy leakage in financial data mining and combines existing data mining technology with privacy protection. First, a data mining model for dual privacy protection is defined, which better matches the characteristics of distributed data streams while achieving privacy protection. Second, a privacy-oriented data stream mining algorithm is proposed, which uses random interference technology to protect the original sensitive data. Finally, simulation experiments show that the algorithm is feasible and effective: it adapts better to the distribution and dynamic characteristics of distributed data streams, achieves better privacy protection, and effectively reduces communication load.

Research on wireless distributed financial risk data stream mining based on dual privacy protection

10.21203/rs.3.rs-38957/v2 ◽

2020 ◽

Author(s):

Yuhao Zhao

Keyword(s):

Data Mining ◽

Data Streams ◽

Privacy Protection ◽

Data Stream ◽

Financial Risk ◽

Data Stream Mining ◽

Distributed Data ◽

Stream Mining ◽

Mining Technology ◽

Distributed Data Streams

With the advancement of network technology and large-scale computing, distributed data streams have been widely used in financial risk analysis. However, while data mining reveals financial models, it also increasingly poses a threat to privacy. Preventing privacy leakage while mining efficiently therefore poses new challenges to data mining technology. This article addresses privacy leakage in financial data mining and combines existing data mining technology with privacy protection. First, a data mining model for dual privacy protection is defined, which better matches the characteristics of distributed data streams while achieving privacy protection. Second, a privacy-oriented data stream mining algorithm is proposed, which uses random interference technology to protect the original sensitive data. Finally, simulation experiments show that the algorithm is feasible and effective: it adapts better to the distribution and dynamic characteristics of distributed data streams, achieves better privacy protection, and effectively reduces communication load.

Mining Data Streams

Advances in Business Information Systems and Analytics - Sentiment Analysis and Knowledge Discovery in Contemporary Business ◽

10.4018/978-1-5225-4999-4.ch014 ◽

2019 ◽

pp. 251-278

Author(s):

Prasanna Lakshmi Kompalli

Keyword(s):

Data Streams ◽

Data Stream ◽

Relevant Information ◽

Research Community ◽

Data Stream Mining ◽

Data Sets ◽

Stream Mining ◽

Real World Problem ◽

Mining Data Streams ◽

Over Time

In recent years, advances in technology have made it possible for most present-day organizations to store and record large streams of data. Such data sets, which grow continuously and rapidly over time, are referred to as data streams. Mining such data streams is both a unique opportunity and a challenging task. Data stream mining is the process of gaining knowledge from continuous and rapid records of data. With the increase in streaming information, data stream mining has attracted the attention of the research community in the recent past, and a voluminous literature has been published in this domain over the past few years. As a result, isolating the right literature is a grueling task for researchers and practitioners. When addressing a real-world problem, it is even more difficult to find relevant information because it is hidden in the data streams. This chapter tries to provide a solution by bringing together the techniques used for data stream mining.

An Insight on Social Media Stream Mining

SCITECH Nepal ◽

10.3126/scitech.v14i1.25532 ◽

2019 ◽

Vol 14 (1) ◽

pp. 36-43

Author(s):

Rojina Deuja ◽

Krishna Bikram Shah

Keyword(s):

Social Media ◽

Communication Networks ◽

Data Streams ◽

Data Stream ◽

Data Stream Mining ◽

Stream Mining ◽

On Line ◽

Recent Trends ◽

Media Stream ◽

Media Data

Data stream mining is one of the fields gaining the upper hand over traditional data mining methods. Effectively unbounded volumes of data, termed data streams, are generated by Internet traffic, communication networks, online bank or ATM transactions, and similar sources. The streams are dynamic and ever-shifting and need to be analysed online as they are obtained. Social media is one of the notable sources of such data streams. While social media streaming has received a lot of attention over the past decade, the ever-expanding streams of data present huge challenges for learning and maintaining control. Dealing with billions of users' data, measured in petabytes, is a demanding task in itself. It is indeed a challenge to mine such dynamic data from social networks in an uninterrupted and competent way. This paper aims to introduce social data streams and the mining techniques involved in processing them. We analyse the most recent trends in social media data stream mining as the basis for a detailed study of the matter. We also review innovative implementations of social media stream mining that are currently prevalent.

Empowering Density-based Micro-clusters In Dynamic Data Stream Clustering

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset207147 ◽

2020 ◽

pp. 259-259

Author(s):

Asha P. V. ◽

Anju M. Sukumar

Keyword(s):

Data Streams ◽

Data Clustering ◽

Data Stream ◽

Clustering Algorithms ◽

Streaming Data ◽

Data Stream Mining ◽

Stream Mining ◽

Fast Processing ◽

Geospatial Services ◽

Mining Data Streams

A data stream is a continuous sequence of data generated from various sources and continuously transferred from source to target. Streaming data needs to be processed without having access to all of the data. Sources that generate data streams include social networks, geospatial services, weather monitoring, and e-commerce purchases. Data stream mining is the process of acquiring knowledge structures from continuously arriving data. Clustering is an unsupervised machine learning technique that can be used to extract knowledge patterns from a data stream. Mining streaming data is challenging because the data arrives continuously and in huge amounts, so traditional algorithms are not suitable for mining data streams. Data stream mining requires fast algorithms that use a single scan and a limited amount of memory. Micro-clustering plays an important role here, and density-based micro-clustering has its own place within data stream mining. This paper presents a survey of different data clustering algorithms and examines and promotes the use of density-based micro-clusters.
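The single-scan, limited-memory idea behind micro-clusters can be sketched as follows. This is a minimal illustration under stated assumptions, not the surveyed paper's method: each micro-cluster keeps only a count, a linear sum, and a squared sum, and a point joins the nearest micro-cluster if it lies within a hypothetical `max_dist` threshold. Density weighting, decay, and pruning of old clusters, which density-based stream algorithms rely on, are deliberately left out. It assumes Python 3.8 or later for `math.dist`.

```python
import math


class MicroCluster:
    """Additive summary of a group of points: count, linear sum, squared sum."""

    def __init__(self, point):
        self.n = 1
        self.ls = list(point)             # per-dimension linear sum
        self.ss = [v * v for v in point]  # per-dimension squared sum

    def add(self, point):
        # Absorb a point by updating the sums; the point is not stored.
        self.n += 1
        for i, v in enumerate(point):
            self.ls[i] += v
            self.ss[i] += v * v

    def center(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Square root of the summed per-dimension variance.
        var = sum(ss / self.n - (ls / self.n) ** 2
                  for ls, ss in zip(self.ls, self.ss))
        return math.sqrt(max(var, 0.0))


def cluster_stream(stream, max_dist=1.0):
    """Single pass: assign each point to the nearest micro-cluster or open a new one."""
    clusters = []
    for p in stream:
        best = min(clusters, key=lambda c: math.dist(c.center(), p), default=None)
        if best is not None and math.dist(best.center(), p) <= max_dist:
            best.add(p)
        else:
            clusters.append(MicroCluster(p))
    return clusters
```

Because the summaries are additive, memory depends on the number of micro-clusters rather than on the number of points seen; a practical system would also cap or expire clusters to keep that number bounded.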

Actionable intelligence and online learning for semantic computing

Encyclopedia with Semantic Computing and Robotic Intelligence ◽

10.1142/s2425038416300111 ◽

2017 ◽

Vol 01 (01) ◽

pp. 1630011

Author(s):

Cem Tekin ◽

Mihaela van der Schaar

Keyword(s):

Online Learning ◽

Data Streams ◽

Concept Drift ◽

Relevant Information ◽

High Dimensional ◽

Time Varying ◽

Stream Mining ◽

Multiple Data ◽

New Challenges ◽

Semantic Computing

As the world becomes more connected and instrumented, high-dimensional, heterogeneous and time-varying data streams are collected and need to be analyzed on the fly to extract actionable intelligence and make timely decisions based on this knowledge. This requires that appropriate classifiers are invoked to process the incoming streams and find the relevant knowledge. Thus, a key challenge becomes choosing online, at run-time, which classifier should be deployed to make the best possible predictions on the incoming streams. In this paper, we survey a class of methods capable of performing online learning in stream-based semantic computing tasks: multi-armed bandits (MABs). Adopting MABs for stream mining poses numerous new challenges and requires many new innovations. Most importantly, the MABs will need to explicitly consider and track online the time-varying characteristics of the data streams and to learn quickly what the relevant information is within the vast, heterogeneous and possibly high-dimensional data streams. In this paper, we discuss contextual MAB methods, which use similarities in context (meta-data) information to make decisions, and discuss their advantages when applied to stream mining for semantic computing. These methods can be adapted to discover in real time the relevant contexts guiding the stream mining decisions, and to track the best classifier in the presence of concept drift. Moreover, we also discuss how stream mining of multiple data sources can be performed by deploying cooperative MAB solutions and ensemble learning. We conclude the paper by discussing the numerous other advantages of MABs that will benefit semantic computing applications.
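The idea of choosing a classifier online can be illustrated with a much simpler bandit than the contextual methods the paper surveys. The sketch below is a plain epsilon-greedy selector with a constant step size, so that recent rewards count more and the estimate can follow drift. The class name `EpsilonGreedySelector` and its parameters are assumptions made for illustration; context information is not used at all.

```python
import random


class EpsilonGreedySelector:
    """Pick one of `n_arms` classifiers per item; reward is 1 if it was correct."""

    def __init__(self, n_arms, epsilon=0.1, step=0.1, seed=0):
        self.values = [0.0] * n_arms  # running reward estimate per classifier
        self.epsilon = epsilon        # probability of exploring a random arm
        self.step = step              # constant step size: recent rewards weigh more
        self.rng = random.Random(seed)

    def select(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        # Exponential recency-weighted average of observed rewards.
        self.values[arm] += self.step * (reward - self.values[arm])
```

In use, each incoming item would trigger `select()`, the chosen classifier would predict, and `update()` would be called once the true label is known. The constant step size is the design choice that lets the selector move to a different classifier when the previously best one degrades.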
