Stream Analytics: Latest Research Papers

Data from emerging applications, such as cybersecurity and social networking, can be abstracted as graphs whose edges are updated sequentially in the form of a stream. The central challenge of interactive graph stream analytics is responding quickly to end-user queries over graph stream data of a terabyte and beyond. In this paper, a succinct and efficient double index data structure is designed to build the sketch of a graph stream to meet general queries. A single pass stream model, which includes general sketch building, distributed sketch based analysis algorithms and regression based approximation solution generation, is developed, and a typical graph algorithm, triangle counting, is implemented to evaluate the proposed method. Experimental results on power law and normal distribution graph streams show that our method can generate accurate results (mean relative error less than 4%) with high performance. All our methods and code have been implemented in an open source framework, Arkouda, and are available from our GitHub repository, Bader-Research. This work provides the large and rapidly growing Python community with a powerful way to handle terabyte and beyond graph stream data using their laptops.
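As an illustration of the problem the abstract evaluates, the following minimal Python sketch counts triangles exactly in a single pass over an edge stream and computes a mean relative error against that exact count. It is not the paper's double index sketch or its regression based approximation, and it does not use the Arkouda API; the function names `count_triangles_stream` and `mean_relative_error` are hypothetical, and the approach assumes the whole graph fits in memory.

```python
from collections import defaultdict


def count_triangles_stream(edges):
    """Exact single-pass triangle count over an undirected edge stream.

    Illustrative baseline only (an assumption, not the paper's method):
    it keeps the full adjacency in memory instead of a compact sketch.
    """
    adj = defaultdict(set)  # vertex -> set of neighbours seen so far
    triangles = 0
    for u, v in edges:
        # Skip self-loops and edges that were already seen.
        if u == v or v in adj[u]:
            continue
        # Every common neighbour of u and v closes one new triangle.
        triangles += len(adj[u] & adj[v])
        adj[u].add(v)
        adj[v].add(u)
    return triangles


def mean_relative_error(estimates, exact):
    """Average of |estimate - exact| / exact over a list of estimates."""
    return sum(abs(e - exact) / exact for e in estimates) / len(estimates)


if __name__ == "__main__":
    # The complete graph on four vertices, with one repeated edge.
    stream = [(1, 2), (2, 3), (1, 3), (1, 4), (2, 4), (3, 4), (2, 1)]
    print(count_triangles_stream(stream))
```

An approximate method would replace the adjacency sets with a bounded-size summary; the exact count here is the kind of ground truth against which a reported mean relative error could be measured.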

Ultra-Reliable and Low-Latency Computing in the Edge with Kubernetes

Journal of Grid Computing ◽

10.1007/s10723-021-09573-z ◽

2021 ◽

Vol 19 (3) ◽

Author(s):

László Toka

Keyword(s):

High Reliability ◽

Fog Computing ◽

Resource Provisioning ◽

Use Case ◽

End User ◽

Delay Constraints ◽

Edge And Fog Computing ◽

Novel Applications ◽

Stream Analytics ◽

Potential Use

Abstract: Novel applications will require extending traditional cloud computing infrastructure with compute resources deployed close to the end user. Edge and fog computing tightly integrated with carrier networks can fulfill this demand. The emphasis is on integration: the rigorous delay constraints, ensuring reliability on the distributed, remote compute nodes, and the sheer scale of the system altogether call for a powerful resource provisioning platform that offers the applications the best of the underlying infrastructure. We therefore propose Kubernetes-edge-scheduler that provides high reliability for applications in the edge, while provisioning less than 10% of resources for this purpose, and at the same time, it guarantees compliance with the latency requirements that end users expect. We present a novel topology clustering method that considers application latency requirements, and enables scheduling applications even on a worldwide scale of edge clusters. We demonstrate that in a potential use case, a distributed stream analytics application, our orchestration system can reduce the job completion time to 40% of the baseline provided by the default Kubernetes scheduler.

A survey on data stream analytics

10.1049/pbpc037f_ch6 ◽

2021 ◽

pp. 175-208

Author(s):

Sumit Misra ◽

Sanjoy Kumar Saha ◽

Chandan Mazumdar

Keyword(s):

Data Stream ◽

Stream Analytics

Real-time big data stream analytics and complex event detection

Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems ◽

10.1145/3465480.3468676 ◽

2021 ◽

Author(s):

Ralf Klinkenberg

Keyword(s):

Big Data ◽

Real Time ◽

Event Detection ◽

Data Stream ◽

Stream Analytics

Tutorial on graph stream analytics

Proceedings of the 15th ACM International Conference on Distributed and Event-based Systems ◽

10.1145/3465480.3468293 ◽

2021 ◽

Author(s):

András Benczúr ◽

Ferenc Béres ◽

Domokos Kelen ◽

Róbert Pálovics

Keyword(s):

Stream Analytics

MQTT Architecture for Stream Analytics of PMU Data

2021 32nd Irish Signals and Systems Conference (ISSC) ◽

10.1109/issc52156.2021.9467849 ◽

2021 ◽

Author(s):

Paul Brogan ◽

Andres Jarmillo ◽

Xueqin Amy Liu ◽

John Hastings ◽

David Laverty ◽

...

Keyword(s):

Stream Analytics

WIKI STREAMS: Wikipedia Article Recent Edit Retrieval System using Hierarchical Stream Clustering

10.21203/rs.3.rs-452931/v1 ◽

2021 ◽

Author(s):

Arun Manicka Raja M ◽

Swamynathan Sankaranarayanan

Keyword(s):

Data Analytics ◽

Retrieval System ◽

State Of The Art ◽

Data Generation ◽

Stream Data ◽

New Paradigm ◽

User Interest ◽

Stream Clustering ◽

Digital Knowledge ◽

Stream Analytics

Abstract: Stream analytics, a new paradigm in data analytics, has gained momentum due to the voluminous generation of stream data. With the huge increase in the edits performed on Wikipedia topics, it is tedious for digital knowledge discovery users to find their domain updates immediately. Users need to go through large amounts of information and spend more time to find the potential data. There is a need to retrieve Wikipedia edits based on the metadata of the article edits for later retrieval. Hence, a clustering technique may be employed to group the Wikipedia article edits domain-wise. In this paper, hierarchical stream clustering is applied to retrieve the edits based on user interest. Data from Wikipedia is collected over a period of a month and used as a dataset. Our method is compared with the state-of-the-art clustering system WikiAutoCat, and it is observed that the accuracy is improved by 10% and the clustering time is reduced by 20%.

Evaluative Review of Streaming Analytics: Tools and Technologies in Real-Time Data Processing

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1262 ◽

2021 ◽

pp. 423-432

Author(s):

Ms. Shailaja B. Jadhav ◽

Dr. D. V. Kodavade

Keyword(s):

Data Processing ◽

Data Stream ◽

Research Field ◽

Streaming Data ◽

Rail Transportation ◽

Time Data ◽

Data Record ◽

Real Time Data Processing ◽

Stream Analytics

Nowadays, big data processing systems are evolving to be more stream-oriented, where each data record is processed as it arrives by distributed, low-latency computational frameworks [18]. Data streams have been used extensively in several fields of computational analytics, such as data mining and business intelligence [17]. In every field, the data stream can be considered an ordered sequence of data items that arrive continuously over time. Due to this characteristic, streaming data analytics is a challenging area of research [5, 11]. This paper aims to present data stream processing as a growing research field, along with streaming analytics frameworks as a rich focus area. The paper also evaluates the efficacy of available stream analytics frameworks. One Industry 4.0 use case, predictive maintenance in rail transportation, is illustrated here as a case study design mapped to a streaming analytics framework.
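The record-at-a-time model described in this abstract can be sketched in a few lines of Python. The example below is an assumption for illustration only: it is not taken from the paper or from any of the frameworks it reviews, and the name `sliding_mean` and the sample readings are hypothetical. It consumes an ordered sequence one item at a time and emits a sliding-window mean after each arrival, using bounded memory.

```python
from collections import deque


def sliding_mean(stream, window=3):
    """Yield the mean of the last `window` items after each arrival.

    Each record is processed as it arrives, and only the most recent
    `window` items are retained, so memory use stays constant.
    """
    buf = deque(maxlen=window)  # holds at most `window` recent items
    total = 0.0
    for x in stream:
        if len(buf) == window:
            total -= buf[0]  # this item is evicted by the append below
        buf.append(x)
        total += x
        yield total / len(buf)


if __name__ == "__main__":
    # Hypothetical sensor readings arriving in order.
    readings = [1, 2, 3, 4]
    for mean in sliding_mean(readings, window=2):
        print(mean)
```

Because `sliding_mean` is a generator, it works equally well on an unbounded source; a downstream step could compare each emitted mean against a threshold as records arrive.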