A Machine Hearing Framework for Real-Time Streaming Analytics Using Lambda Architecture

Author(s):  
Konstantinos Demertzis ◽  
Lazaros Iliadis ◽  
Vardis-Dimitris Anezakis
Data ◽  
2018 ◽  
Vol 3 (4) ◽  
pp. 58 ◽  
Author(s):  
Gautam Pal ◽  
Gangmin Li ◽  
Katie Atkinson

We study the big-data hybrid-data-processing lambda architecture, which consolidates low-latency real-time frameworks with high-throughput Hadoop batch frameworks over a massively distributed setup. In particular, the real-time and batch-processing engines act as autonomous multi-agent systems in collaboration. We propose a Multi-Agent Lambda Architecture (MALA) for e-commerce data analytics. We address the high-latency problem of Hadoop MapReduce jobs by simultaneously processing, at the speed layer, the requests that require a quick turnaround time. At the same time, the batch layer provides comprehensive coverage of the data in parallel by intelligently blending stream and historical data through the weighted voting method. The cold-start problem of streaming services is addressed through an initial offset from historical batch data. The challenges of high-velocity data ingestion are resolved with distributed message queues. A proposed multi-agent decision-maker component is placed in the MALA stack as the gateway of the data pipeline. We demonstrate the efficiency of our batch model by implementing an array of features for an e-commerce site. The novelty and key significance of the model is a scheme for multi-agent interaction between batch and real-time agents that produces deeper insights at low latency and at significantly lower cost. Hence, the proposed system is highly appealing for big-data applications and caters to high-velocity streaming ingestion and a massive data pool.
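The blending of stream and historical data by weighted voting can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the function name `blend_views`, the per-item score dictionaries, and the 0.7/0.3 weights are all hypothetical.

```python
from collections import defaultdict

def blend_views(batch_scores, speed_scores, batch_weight=0.7, speed_weight=0.3):
    """Rank items by a weighted vote over batch-layer and speed-layer scores.

    All names and the default weights are hypothetical, chosen for illustration.
    """
    combined = defaultdict(float)
    for item, score in batch_scores.items():
        combined[item] += batch_weight * score
    for item, score in speed_scores.items():
        combined[item] += speed_weight * score
    # Highest combined score first; ties broken by item name for determinism.
    return sorted(combined, key=lambda item: (-combined[item], item))

# Hypothetical per-product scores from each layer.
batch = {"laptop": 0.9, "phone": 0.6, "tablet": 0.2}
speed = {"phone": 1.0, "headset": 0.8}

ranking = blend_views(batch, speed)
# With no streaming data yet (cold start), the ranking falls back to the batch view.
cold_start_ranking = blend_views(batch, {})
```

In this sketch an item seen only in the stream still enters the ranking, while an empty speed view leaves the historical ordering untouched, which mirrors the cold-start behavior the abstract describes.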


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7242
Author(s):  
Fabio Henrique Pereira ◽  
Francisco Elânio Bezerra ◽  
Diego Oliva ◽  
Gilberto Francisco Martha de Souza ◽  
Ivan Eduardo Chabu ◽  
...  

The prediction of partial discharges in hydrogenerators depends on data collected by sensors and on prediction models based on artificial intelligence. However, forecasting models are trained on a set of historical data that is not automatically updated, owing to the high cost of collecting sensor data and to insufficient real-time data analysis. This article proposes a method to update the forecasting model, aiming to improve its accuracy. The method is based on a distributed data platform with the lambda architecture, which combines real-time and batch processing techniques. The results show that the proposed system enables real-time updates to the forecasting model, so that partial discharge forecasts become more accurate with each update.


Author(s):  
Gautam Pal ◽  
Katie Atkinson ◽  
Gangmin Li

This paper presents an approach to analyzing consumers’ e-commerce site usage and browsing motifs through pattern mining and surfing behavior. The user-generated clickstream is first stored in a client-side browser. We build an ingestion pipeline to capture the high-velocity data stream from the client-side browser through Apache Storm, Kafka, and Cassandra. Given the consumer’s usage pattern, we uncover the user’s browsing intent through n-grams and collocation methods. An innovative clustering technique is constructed through the Expectation-Maximization algorithm with a Gaussian Mixture Model. We discuss a framework for predicting a user’s clicks based on past click sequences through higher-order Markov chains. We developed our model on top of a big-data Lambda Architecture, which combines a high-throughput Hadoop batch setup with a low-latency real-time framework over a large distributed cluster. Based on this approach, we developed an experimental setup for an optimized Storm topology and improved Cassandra database latency to achieve real-time responses. The theoretical claims are corroborated with several evaluations in a Microsoft Azure HDInsight Apache Storm deployment and in the DataStax distribution of Cassandra. The paper demonstrates that the proposed techniques help with user experience optimization, building recently viewed product lists, market-driven analyses, and allocation of website resources.
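Click prediction from past click sequences with a higher-order Markov chain can be sketched in a few lines of Python. This is a minimal illustration, not the paper's model: the second-order setting, the function names `train_markov` and `predict_next`, and the sample sessions are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_markov(sessions, order=2):
    """Count transitions from each run of `order` clicks to the click that follows."""
    transitions = defaultdict(Counter)
    for clicks in sessions:
        for i in range(len(clicks) - order):
            state = tuple(clicks[i:i + order])
            transitions[state][clicks[i + order]] += 1
    return transitions

def predict_next(transitions, recent_clicks, order=2):
    """Return the most frequent follower of the last `order` clicks, or None if unseen."""
    state = tuple(recent_clicks[-order:])
    counts = transitions.get(state)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical click sessions, one list of page names per user visit.
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "search", "product", "reviews"],
    ["home", "search", "product", "cart"],
    ["home", "deals", "product", "reviews"],
]

model = train_markov(sessions, order=2)
prediction = predict_next(model, ["home", "search", "product"], order=2)
```

Conditioning on the last two clicks rather than one lets the sketch distinguish a visitor who reached a product page from search from one who arrived through a deals page; a production system would also need smoothing or a fallback to lower orders for unseen states.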

