data stream processing
Recently Published Documents


TOTAL DOCUMENTS

242
(FIVE YEARS 73)

H-INDEX

17
(FIVE YEARS 3)

2022 ◽  
pp. 29-47
Author(s):  
Patrick Schneider ◽  
Fatos Xhafa

2021 ◽  
Vol 5 (4) ◽  
pp. 456
Author(s):  
Shaimaa Safaa Ahmed Alwaisi ◽  
Maan Nawaf Abbood ◽  
Luma Fayeq Jalil ◽  
Shahreen Kasim ◽  
Mohd Farhan Mohd Fudzee ◽  
...  

The amount of data in our world keeps growing rapidly. In the era of big data, efficient processing and analysis of big data using machine learning algorithms is highly required, especially when the data arrive in the form of streams. There is no doubt that big data has become an important source of information and knowledge in the decision-making process. Nevertheless, dealing with this kind of data comes with great difficulties; thus, several techniques have been used to analyze data in the form of streams. Many techniques have been proposed and studied to handle big data and make decisions based on off-line batch analysis. Today, we need to make constructive decisions based on online streaming data analysis. In recent years, many researchers have proposed different kinds of frameworks for processing big data streams. In this work, we explore and present in detail some of the recent achievements in big data streaming in terms of contributions, benefits, and limitations, as well as some recent platforms suitable for big data streaming analytics. Moreover, we highlight several issues faced in big data stream processing. In conclusion, it is hoped that this study will assist researchers in choosing the best and most suitable framework for big data streaming projects.


2021 ◽  
Author(s):  
Morgan K. Geldenhuys ◽  
Jonathan Will ◽  
Benjamin J. J. Pfister ◽  
Martin Haug ◽  
Alexander Scharmann ◽  
...  

Author(s):  
Ameer B. A. Alaasam

Smart industry systems are based on integrating historical and current data from sensors with physical and digital systems to control product states. For example, a Digital Twin (DT) system predicts the future state of physical assets using live simulation and controls the current state through real-time feedback. These systems rely on the ability to process big data streams to provide real-time responses. For example, it is estimated that one autonomous vehicle (AV) could produce 30 terabytes of data per day. AVs will not be on the road until there is an effective way to manage their big data and solve the latency challenges. Cloud computing fails to meet the latency challenge, while Fog computing addresses it by moving parts of the computation from the Cloud to the edge of the network, near the asset, to reduce latency. This work studies the challenges in data stream processing for DT in a fog environment. The challenges include fog architecture, the necessity of loosely coupled design, the choice of virtual machines versus containers, stateful versus stateless operations, the stream processing tools, and live migration between fog nodes. The work also proposes a fog computing architecture and provides a vision of the prerequisites to meet the challenges.


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4160
Author(s):  
Isam Mashhour Al Jawarneh ◽  
Paolo Bellavista ◽  
Antonio Corradi ◽  
Luca Foschini ◽  
Rebecca Montanari

Large amounts of georeferenced data streams arrive daily at stream processing systems. This is attributable to the overabundance of affordable IoT devices. In addition, interested practitioners desire to exploit Internet of Things (IoT) data streams for strategic decision-making purposes. However, mobility data are highly skewed and their arrival rates fluctuate. This nature poses an extra challenge for data stream processing systems, which are required to achieve pre-specified latency and accuracy goals. In this paper, we propose ApproxSSPS, a system for approximate processing of georeferenced mobility data at scale with quality-of-service guarantees. We focus on stateful aggregations (e.g., means, counts) and top-N queries. ApproxSSPS features a controller that interactively learns the latency statistics and calculates proper sampling rates to meet latency and/or accuracy targets. An overarching trait of ApproxSSPS is its ability to strike a plausible balance between latency and accuracy targets. We evaluate ApproxSSPS on Apache Spark Structured Streaming with real mobility data. We also compare ApproxSSPS against a state-of-the-art online adaptive processing system. Our extensive experiments show that ApproxSSPS can fulfill latency and accuracy targets with varying sets of parameter configurations and load intensities (i.e., transient peaks in data loads versus slow-arriving streams). Moreover, our results show that ApproxSSPS outperforms the baseline counterpart by a significant margin. In short, ApproxSSPS is a novel spatial data stream processing system that can deliver accurate results in a timely manner by dynamically specifying the limits on data samples.
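
The idea of a controller that observes latency and adjusts the sampling rate can be illustrated with a minimal Python sketch. This is not the ApproxSSPS algorithm, which the abstract does not spell out; it is a simple proportional controller built on the assumption that batch latency scales roughly linearly with the fraction of records kept. The names `SamplingController`, `target_latency_s`, `min_rate`, `smoothing`, and `sample` are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SamplingController:
    """Hypothetical proportional controller; NOT the ApproxSSPS method."""
    target_latency_s: float
    rate: float = 1.0        # fraction of records kept, in [min_rate, 1.0]
    min_rate: float = 0.05   # assumed floor so estimates keep some accuracy
    smoothing: float = 0.5   # weight given to the newest latency observation
    avg_latency_s: Optional[float] = None

    def update(self, observed_latency_s: float) -> float:
        # Track latency with an exponentially weighted moving average.
        if self.avg_latency_s is None:
            self.avg_latency_s = observed_latency_s
        else:
            self.avg_latency_s = (
                self.smoothing * observed_latency_s
                + (1.0 - self.smoothing) * self.avg_latency_s
            )
        # Assumption: latency is roughly proportional to the sampled fraction,
        # so scale the rate by target / observed average.
        proposed = self.rate * self.target_latency_s / max(self.avg_latency_s, 1e-9)
        self.rate = max(self.min_rate, min(1.0, proposed))
        return self.rate


def sample(batch: List[dict], rate: float, rng: random.Random) -> List[dict]:
    # Bernoulli sampling: keep each record independently with probability `rate`.
    return [record for record in batch if rng.random() < rate]


if __name__ == "__main__":
    controller = SamplingController(target_latency_s=1.0)
    rng = random.Random(42)
    batch = [{"id": i} for i in range(1000)]
    for latency in (2.0, 1.5, 0.8):  # made-up latencies for illustration
        rate = controller.update(latency)
        kept = sample(batch, rate, rng)
        print(f"latency={latency:.1f}s rate={rate:.2f} kept={len(kept)}")
```

Aggregates computed on the kept records would then need to be rescaled (for example, dividing a count by the sampling rate) to estimate the full-stream value; a real system would also attach error bounds, which this sketch omits.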

