Distributed Caching Based Memory Optimizing Technology for Stream Data of IoV

Author(s):  
Xiaoli Hu ◽  
Chao Li ◽  
Huibing Zhang ◽  
Hongbo Zhang ◽  
Ya Zhou
2021 ◽  
Vol 11 (12) ◽  
pp. 5523
Author(s):  
Qian Ye ◽  
Minyan Lu

The main purpose of our provenance research for DSP (distributed stream processing) systems is to analyze abnormal results. Provenance for these systems is nontrivial because of the ephemerality of stream data and the instant data processing mode of modern DSP systems. Challenges include, but are not limited to, avoiding excessive runtime overhead, reducing provenance-related data storage, and providing provenance in an easy-to-use fashion. Without prior knowledge of which kinds of data may eventually lead to abnormal results, we have to track all transformations in detail, which potentially places a heavy burden on the system. This paper proposes s2p (Stream Process Provenance), which mainly consists of online provenance and offline provenance, to provide fine- and coarse-grained provenance at different precisions. We base the design of s2p on the fact that, for a mature online DSP system, abnormal results are rare, and results that require detailed analysis are even rarer. We also consider state transitions in our provenance explanation. We implement s2p on Apache Flink as s2p-flink and conduct three experiments to evaluate its scalability, efficiency, and overhead in terms of end-to-end cost, throughput, and space overhead. Our evaluation shows that s2p-flink incurs a 13% to 32% cost overhead, an 11% to 24% decline in throughput, and little additional space cost in the online provenance phase. The experiments also demonstrate that s2p-flink scales well. A case study is presented to demonstrate the feasibility of the whole s2p solution.


2021 ◽  
Vol 213 ◽  
pp. 106653
Author(s):  
Heonho Kim ◽  
Unil Yun ◽  
Yoonji Baek ◽  
Hyunsoo Kim ◽  
Hyoju Nam ◽  
...  

Algorithms ◽  
2019 ◽  
Vol 12 (2) ◽  
pp. 37 ◽  
Author(s):  
Zhigang Hu ◽  
Hui Kang ◽  
Meiguang Zheng

A distributed data stream processing system handles real-time, variable, and bursty streaming data loads. Elastic resource allocation for such a system has become a fundamental and challenging problem, because a fixed strategy results in either wasted resources or reduced QoS (quality of service). Spark Streaming, an emerging system, processes real-time stream data analytics using a micro-batch approach. In this paper, first, we propose an improved SVR (support vector regression) based stream data load prediction scheme. Then, we design a Spark-based maximum sustainable throughput of time window (MSTW) performance model to find the optimized number of virtual machines. Finally, we present a resource scaling algorithm, TWRES (time window resource elasticity scaling algorithm), with the MSTW constraint and streaming data load prediction. The evaluation results show that TWRES can improve resource utilization and mitigate SLA (service level agreement) violations.


2009 ◽  
Vol 179 (20) ◽  
pp. 3489-3504 ◽  
Author(s):  
Sungbo Seo ◽  
Jaewoo Kang ◽  
Keun Ho Ryu
