Stream Processing Tools for Analyzing Objects in Motion Sending High-Volume Location Data

2021 ◽  
pp. 257-268
Author(s):  
Krzysztof Wecel ◽  
Marcin Szmydt ◽  
Milena Stróżyna

Recently, we have observed a significant increase in the amount of easily accessible data on transport and mobility. These data are mostly massive streams of high velocity, magnitude, and heterogeneity, which represent flows of goods and shipments and the movements of fleets. It is therefore necessary to develop a scalable framework and apply tools capable of handling these streams. In this paper, we propose an approach for selecting software for stream processing solutions that may be used in the transportation domain. We provide an overview of potential stream processing technologies, followed by a method for choosing software for real-time analysis of data streams coming from objects in motion. We selected two solutions, Apache Spark Streaming and Apache Flink, and benchmarked them on a real-world task. We also identified the caveats and challenges of implementing the solution in practice.

Author(s):  
Ritesh Srivastava ◽  
M.P.S. Bhatia

Twitter behaves as a social sensor of the world. The tweets provided by the Twitter Firehose exhibit the properties of big data (i.e., volume, variety, and velocity). With millions of users on Twitter, its virtual communities now replicate real-world communities. Consequently, real-world events are very often discussed on Twitter. This work performs real-time analysis of tweets related to a targeted event (e.g., an election) to identify potential sub-events that occurred in the real world, were discussed on Twitter, and caused a significant change in the aggregated sentiment score of the targeted event over time. This type of analysis can enrich the real-time decision-making ability of the event bearer. The proposed approach uses a three-step process: (1) real-time sentiment analysis of tweets; (2) application of Bayesian change point detection to determine the sentiment change points; and (3) detection of the major sub-events that influenced the sentiment of the targeted event. The approach was evaluated on Twitter data from the 2015 Delhi election.
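
Step (2) of the process above can be illustrated with a deliberately simplified sketch. It is not the authors' detector: it assumes a single change point, Gaussian noise with a known standard deviation `sigma`, a uniform prior over the change location, and flat priors on the two segment means. The function names and the sample scores are hypothetical.

```python
import math


def sum_sq_dev(values):
    """Sum of squared deviations from the segment mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def change_point_posterior(scores, sigma=0.2):
    """Posterior over the index k at which the mean sentiment shifts.

    Assumptions (illustrative only): exactly one change point, known noise
    level sigma, uniform prior on k, flat priors on both segment means.
    Marginalising the two means gives, up to a constant shared by all k,
    log p(k | data) = -0.5 * log(k * (n - k)) - (SS_left + SS_right) / (2 * sigma^2).
    """
    n = len(scores)
    candidates = range(1, n)  # k = index of the first point after the change
    log_post = []
    for k in candidates:
        ss = sum_sq_dev(scores[:k]) + sum_sq_dev(scores[k:])
        log_post.append(-0.5 * math.log(k * (n - k)) - ss / (2 * sigma ** 2))
    peak = max(log_post)  # subtract the peak for numerical stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return {k: w / total for k, w in zip(candidates, weights)}


# Hypothetical aggregated sentiment scores per time window.
scores = [0.1, 0.0, -0.1, 0.0, 1.0, 1.1, 0.9, 1.0]
posterior = change_point_posterior(scores)
most_likely = max(posterior, key=posterior.get)
```

In a streaming setting the posterior would have to be updated incrementally as new windows arrive; this offline version only shows how a sentiment change point can be scored.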


2021 ◽  
Vol 8 (1) ◽  
pp. 17-xx
Author(s):  
Md. Mostafijur Rahman ◽  
Mani Manavalan ◽  
Taposh Kumar Neogy

The formation of 5G systems revolves around a variety of interlinked devices that provide advanced connectivity throughout the network. Artificial Intelligence plays a fundamental role in 5G networks. The popularity and integration of 5G have emerged through advanced cellular networks and many other technologies. In recent years, this innovative, fast network has built strong connections in business, personal work, and daily life. Artificial Intelligence and edge computing devices have optimized internet usage in everyday life. The low latency and high bandwidth of 5G networks benefit AI/ML algorithms and enable real-time analysis, reasoning, and optimization. The 5G era highlights fundamental features among the revolutionary techniques most commonly used by cellular device networks, such as radio resource management, mobility management, and service management. This work also integrates spectrum selection and spectrum access through an AI-based interface to meet the demands of 5G. The strategies introduced are a Fractional Knapsack Greedy-based strategy and a Language Hyperplane approach, which form the basis subsequently used by Artificial Intelligence strategies for spectrum selection and correct spectrum allocation in IoT-enabled sensor networks.
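
The abstract does not say how the Fractional Knapsack Greedy-based strategy is applied to spectrum, so the following is only a sketch of the generic greedy algorithm. Treating channels as items, with a utility as "value" and a bandwidth cost as "weight", is an assumption made here for illustration, as are all names and numbers.

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack.

    items: list of (name, value, weight) tuples with weight > 0.
    Returns (total_value, [(name, fraction_taken), ...]).
    Items are taken in decreasing value-per-weight order, and the last
    item taken may be split, which is what makes the greedy choice optimal.
    """
    remaining = capacity
    total_value = 0.0
    chosen = []
    by_density = sorted(items, key=lambda it: it[1] / it[2], reverse=True)
    for name, value, weight in by_density:
        if remaining <= 0:
            break
        take = min(weight, remaining)  # whole item, or the part that still fits
        chosen.append((name, take / weight))
        total_value += value * take / weight
        remaining -= take
    return total_value, chosen


# Hypothetical channels: (name, utility, bandwidth cost).
channels = [("ch-a", 60, 10), ("ch-b", 100, 20), ("ch-c", 120, 30)]
best_value, allocation = fractional_knapsack(channels, capacity=50)
```

With these illustrative inputs, the two densest channels are taken whole and two thirds of the third one fills the remaining capacity.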


2019 ◽  
Vol 19 (04) ◽  
pp. 574-602 ◽  
Author(s):  
A. RIESCO ◽  
J. RODRÍGUEZ-HORTALÁ

Stream processing has reached the mainstream in recent years, as a new generation of open-source distributed stream processing systems, designed for scaling horizontally on commodity hardware, has brought the capability for processing high-volume and high-velocity data streams to companies of all sizes. In this work, we propose a combination of temporal logic and property-based testing (PBT) for dealing with the challenges of testing programs that employ this programming model. We formalize our approach in a discrete-time temporal logic for finite words, with some additions to improve the expressiveness of properties, including timeouts for temporal operators and a binding operator for letters. In particular, we focus on testing Spark Streaming programs written with the Spark API for the functional language Scala, using the PBT library ScalaCheck. To that end, we add temporal logic operators to a set of new ScalaCheck generators and properties, as part of our testing library sscheck.
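
The idea of temporal operators with timeouts over finite words can be sketched as follows. This is not the sscheck API: the function names, the use of Python rather than Scala, and the choice to return `None` when a word is too short to decide the formula are all assumptions of this sketch. A "word" is a finite list of letters, and each letter stands for one micro-batch.

```python
from typing import Callable, Optional, Sequence

Formula = Callable[[Sequence], Optional[bool]]


def always(pred: Callable, timeout: int) -> Formula:
    """pred must hold for each of the first `timeout` letters."""
    def check(word: Sequence) -> Optional[bool]:
        if not all(pred(letter) for letter in word[:timeout]):
            return False  # a counterexample was observed
        # Every letter seen so far is fine; decided only if the word is long enough.
        return True if len(word) >= timeout else None
    return check


def eventually(pred: Callable, timeout: int) -> Formula:
    """pred must hold for at least one of the first `timeout` letters."""
    def check(word: Sequence) -> Optional[bool]:
        if any(pred(letter) for letter in word[:timeout]):
            return True  # a witness was observed
        # No witness yet; it is a failure only once the timeout has elapsed.
        return False if len(word) >= timeout else None
    return check


# Hypothetical stream of three micro-batches of integers.
word = [[1, 2], [3], [0, 5]]
no_negatives = always(lambda batch: all(x >= 0 for x in batch), timeout=3)
zero_within_two = eventually(lambda batch: 0 in batch, timeout=2)
```

Here `no_negatives(word)` holds, while `zero_within_two(word)` fails because the first zero only appears in the third batch. In a property-based test, the words would come from random generators instead of being written by hand.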


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1857
Author(s):  
Siwoon Son ◽  
Yang-Sae Moon

Distributed stream processing engines (DSPEs) deploy multiple tasks on distributed servers to process data streams in real time. Many DSPEs provide locality-aware stream partitioning (LSP) methods to reduce network communication costs. However, the even job scheduler provided by DSPEs deploys tasks far away from each other on the distributed servers, so the LSP cannot be used properly. In this paper, we propose a Locality/Fairness-aware job scheduler (L/F job scheduler) that also considers locality, solving the problems of the even job scheduler, which considers only fairness. First, for locality, the L/F job scheduler increases the cohesion of contiguous tasks that require message transmissions. At the same time, for fairness, it reduces the coupling of parallel tasks that do not require message transmissions. Next, we connect the contiguous tasks into a stream pipeline and evenly deploy stream pipelines to the distributed servers so that the L/F job scheduler achieves high cohesion and low coupling. Finally, we implement the proposed L/F job scheduler in Apache Storm, a representative DSPE, and evaluate it on both synthetic and real-world workloads. Experimental results show that the L/F job scheduler achieves throughput similar to that of the even job scheduler, while latency is significantly improved, by up to 139.2% for the LSP applications and by up to 140.7% even for the non-LSP applications. The L/F job scheduler also improves latency by 19.58% and 12.13%, respectively, in two real-world workloads. These results indicate that our L/F job scheduler provides superior processing performance for DSPE applications.
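
The pipeline idea described above can be sketched in a few lines. This is a toy model rather than the authors' scheduler or the Apache Storm scheduler interface: the stage names, the task naming scheme, and the round-robin placement are assumptions chosen to show contiguous tasks kept together (locality) and pipelines spread evenly across servers (fairness).

```python
def schedule_pipelines(stages, parallelism, servers):
    """Place one pipeline per parallel instance, round-robin over servers.

    stages: ordered stage names whose tasks exchange messages.
    parallelism: number of parallel instances of every stage.
    servers: server names.
    Returns {server: [task, ...]}.
    """
    placement = {server: [] for server in servers}
    for instance in range(parallelism):
        # Contiguous tasks of one instance form a pipeline and stay together.
        pipeline = [f"{stage}-{instance}" for stage in stages]
        target = servers[instance % len(servers)]  # even spread of pipelines
        placement[target].extend(pipeline)
    return placement


# Hypothetical topology: three stages, four parallel instances, two servers.
placement = schedule_pipelines(["spout", "parse", "count"], 4, ["s1", "s2"])
```

In this toy placement, every message between `parse-0` and `count-0` stays on one server, and each server hosts the same number of pipelines. A real scheduler would also need to account for worker slots and uneven parallelism per stage.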


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3166
Author(s):  
Adeyinka Akanbi ◽  
Muthoni Masinde

In recent years, the wide adoption of Internet of Things (IoT)-based technologies has increased the proliferation of monitoring systems, which has in turn exponentially increased the amount of heterogeneous data generated. Processing and analysing this massive amount of data is cumbersome, and practice is gradually moving from classical ‘batch’ processing based on the extract, transform, load (ETL) technique to real-time processing. For instance, in the environmental monitoring and management domain, time-series data and historical datasets are crucial for prediction models. However, the environmental monitoring domain still relies on legacy systems, which complicates real-time analysis of the essential data and integration with big data platforms, and perpetuates reliance on batch processing. Herein, as a solution, a distributed stream processing middleware framework for real-time analysis of heterogeneous environmental monitoring and management data is presented and tested on a cluster using open source technologies in a big data environment. The system ingests datasets from legacy systems and sensor data from heterogeneous automated weather systems, irrespective of data type, into Apache Kafka topics using Kafka Connect APIs for processing by the Kafka streaming processing engine. The stream processing engine executes the predictive numerical models and algorithms, represented in event processing (EP) languages, for real-time analysis of the data streams. To prove the feasibility of the proposed framework, we implemented the system using a case study scenario of drought prediction and forecasting based on the Effective Drought Index (EDI) model. Firstly, we transform the predictive model into a form that can be executed by the streaming engine for real-time computing. Secondly, the model is applied to the ingested data streams and datasets to predict drought through persistent querying of the infinite streams to detect anomalies. Finally, the performance of the distributed stream processing middleware infrastructure is evaluated to determine the real-time effectiveness of the framework.


2020 ◽  
Author(s):  
Fernando Benedito Veras Magalhães ◽  
Francisco José da Silva e Silva ◽  
Markus Endler

The current dissemination of IoT increases the deployment of stream processing solutions for monitoring and controlling elements of the real world. One of those solutions is Complex Event Processing (CEP). To handle the high volume, velocity, and volatility of data streams from IoT sensors, the CEP pipeline should be distributed, preferably with CEP operators both in the cloud/cluster and in edge devices. In this paper, we present a model for a distributed CEP platform and an implementation of this model called Global CEP Manager (GCM). GCM is a service of the ContextNet middleware that supports the deployment and dynamic rearrangement of CEP queries to CEP engines executing in the cloud and in M-Hubs, which are ContextNet’s mobile edge devices.


2021 ◽  
Vol 11 (24) ◽  
pp. 11584
Author(s):  
Ilaria Bartolini ◽  
Marco Patella

The real-time analysis of Big Data streams is a terrific resource for transforming data into value. For this, Big Data technologies for smart processing of massive data streams are available, but the facilities they offer are often too raw to be effectively exploited by analysts. RAM3S (Real-time Analysis of Massive MultiMedia Streams) is a framework that acts as a middleware software layer between multimedia stream analysis techniques and Big Data streaming platforms, so as to facilitate the implementation of the former on top of the latter. RAM3S has been proven helpful in simplifying the deployment of non-parallel techniques to streaming platforms, such as Apache Storm or Apache Flink. In this paper, we show how RAM3S has been updated to incorporate novel stream processing platforms, such as Apache Samza, and to be able to communicate with different message brokers, such as Apache Kafka. Abstracting from the message broker also provides us with the ability to pipeline several RAM3S instances that can, therefore, perform different processing tasks. This represents a richer model for stream analysis with respect to the one already available in the original RAM3S version. The generality of this new RAM3S version is demonstrated through experiments conducted on three different multimedia applications, proving that RAM3S is a formidable asset for enabling efficient and effective Data Mining and Machine Learning on multimedia data streams.


2012 ◽  
Vol 109 (22) ◽  
pp. 8477-8482 ◽  
Author(s):  
B. R. Cipriany ◽  
P. J. Murphy ◽  
J. A. Hagarman ◽  
A. Cerf ◽  
D. Latulippe ◽  
...  
