High-Level Declarative Stream Processing

Author(s):  
Özgür Lütfü Özçep
2018 ◽  
Vol 87 ◽  
pp. 228-241 ◽  
Author(s):  
David del Rio Astorga ◽  
Manuel F. Dolz ◽  
Javier Fernández ◽  
J. Daniel García

2020 ◽  
Author(s):  
Tanmaya Mahapatra

Abstract: The growing number of Internet of Things (IoT) devices provides a massive pool of sensing data. However, turning data into actionable insights is not a trivial task, especially in the context of IoT, where application development itself is complex. The process entails working with heterogeneous devices via various communication protocols to coordinate and fetch datasets, followed by a series of data transformations. Graphical mashup tools, based on the principles of the flow-based programming paradigm and operating at a higher level of abstraction, are in widespread use to support rapid prototyping of IoT applications. Nevertheless, current state-of-the-art mashup tools suffer from several architectural limitations that prevent composing in-flow data analytics pipelines. In response, the paper (i) designs novel flow-based programming concepts based on the actor model to support data analytics pipelines in mashup tools, prototypes the ideas in a new mashup tool called aFlux, and provides a detailed comparison with the existing state of the art; and (ii) enables easy prototyping of streaming applications in mashup tools by abstracting the behavioural configurations of stream processing via graphical flows, and validates the ease and effectiveness of composing stream processing pipelines from an end-user perspective in a traffic simulation scenario.


2017 ◽  
Vol 27 (01) ◽  
pp. 1740005 ◽  
Author(s):  
Dalvan Griebler ◽  
Marco Danelutto ◽  
Massimo Torquati ◽  
Luiz Gustavo Fernandes

This paper introduces SPar, an internal C++ Domain-Specific Language (DSL) that supports the development of classic stream parallel applications. The DSL uses standard C++ attributes to introduce annotations that tag the notable components of stream parallel applications: stream sources and stream processing stages. A set of tools processes SPar code (C++ code annotated with the SPar attributes) to generate FastFlow C++ code that exploits the stream parallelism denoted by the SPar annotations while targeting shared-memory multi-core architectures. We outline the main SPar features along with the main implementation techniques and tools. We also show the results of experiments assessing the feasibility of the entire approach as well as SPar's performance and expressiveness.
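The decomposition into a stream source and processing stages mentioned in the abstract can be illustrated with a small sequential sketch. This is not SPar syntax and it involves no parallelism or FastFlow code; it is plain Python generators, and the names `source`, `stage` and `run_pipeline` are hypothetical, chosen only for illustration.

```python
def source(values):
    """Stream source: emits items one at a time."""
    for v in values:
        yield v


def stage(fn, stream):
    """Processing stage: applies fn to every item flowing through."""
    for item in stream:
        yield fn(item)


def run_pipeline(values):
    """Chain source -> compute stage -> formatting stage and collect the output."""
    squared = stage(lambda v: v * v, source(values))
    labelled = stage(lambda v: f"item={v}", squared)
    return list(labelled)
```

In a stream parallel setting, each stage could run concurrently on different items; the sketch only shows how the stages are wired together.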


Author(s):  
M. Martínez-Zarzuela ◽  
F. J. Díaz Pernas ◽  
D. González Ortega ◽  
J. F. Díez Higuera ◽  
M. Antón Rodríguez

An Artificial Neural Network (ANN) is a computational structure inspired by the study of biological neural processing. Although neurons are considered very simple computation units, inside the nervous system an enormous number of widely interconnected neurons can process huge amounts of data working in parallel. There are many different types of ANNs, from relatively simple to very complex, just as there are many theories on how biological neural processing works. However, execution of ANNs is always a heavy computational task. Important kinds of ANNs are those devoted to pattern recognition, such as Multi-Layer Perceptron (MLP), Self-Organizing Maps (SOM) or Adaptive Resonance Theory (ART) classifiers (Haykin, 2007). Traditional implementations of ANNs used by most scientists have been developed in high-level programming languages, so that they could be executed on common Personal Computers (PCs). The main drawback of these implementations is that, although neural networks are intrinsically parallel systems, simulations are executed on a Central Processing Unit (CPU), a processor designed for the execution of sequential programs on a Single Instruction Single Data (SISD) basis. As a result, these heavy programs can take hours or even days to process large input data. For applications that require real-time processing, it is possible to develop small ad-hoc neural networks on specific hardware like Field Programmable Gate Arrays (FPGAs). However, FPGA-based realization of ANNs is somewhat expensive and involves extra design overheads (Zhu & Sutton, 2003). Using dedicated hardware to do machine learning was typically expensive; results could not be shared with other researchers and hardware became obsolete within a few years. This situation has changed recently with the popularization of Graphics Processing Units (GPUs) as low-cost and high-level programmable hardware platforms.
GPUs are being increasingly used for speeding up computations in many research fields following a Stream Processing Model (Owens, Luebke, Govindaraju, Harris, Krüger, Lefohn & Purcell, 2007). This article presents a GPU-based parallel implementation of a Fuzzy ART ANN, which can be used for both training and testing. Fuzzy ART is an unsupervised neural classifier capable of incremental learning, widely used in a range of applications such as medical sciences, economics and finance, engineering and computer science. CPU-based implementations of Fuzzy ART lack efficiency and cannot be used for testing purposes in real-time applications. The GPU implementation of Fuzzy ART presented in this article speeds up computations by more than 30 times with respect to a CPU-based C/C++ implementation when executed on an NVIDIA 7800 GT GPU.
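The abstract does not spell out the Fuzzy ART procedure itself. As a rough orientation, the sketch below follows the commonly cited formulation of Fuzzy ART (complement coding, a choice function, a vigilance test and a learning rule) in plain, CPU-only Python. It is not the GPU implementation the article describes; the class name `FuzzyART`, the helper names and the default parameter values are assumptions made for illustration.

```python
def complement_code(x):
    """Append 1 - x_i to each feature so every coded input has the same L1 norm."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(u, v) for u, v in zip(a, b)]


class FuzzyART:
    """Minimal Fuzzy ART sketch; inputs are assumed to lie in [0, 1]."""

    def __init__(self, vigilance=0.75, alpha=0.001, beta=1.0):
        self.rho = vigilance    # vigilance: minimum match needed to accept a category
        self.alpha = alpha      # choice parameter (small positive constant)
        self.beta = beta        # learning rate; 1.0 means fast learning
        self.weights = []       # one weight vector per committed category

    def _choice(self, coded, w):
        # Choice function T_j = |I ^ w_j| / (alpha + |w_j|)
        return sum(fuzzy_and(coded, w)) / (self.alpha + sum(w))

    def train(self, x):
        """Present one input; return the index of the category that learns it."""
        coded = complement_code(x)
        # Visit committed categories from highest to lowest choice value.
        order = sorted(range(len(self.weights)),
                       key=lambda j: self._choice(coded, self.weights[j]),
                       reverse=True)
        for j in order:
            w = self.weights[j]
            overlap = fuzzy_and(coded, w)
            # Vigilance test: |I ^ w_j| / |I| >= rho
            if sum(overlap) / sum(coded) >= self.rho:
                # Learning rule: w_j <- beta * (I ^ w_j) + (1 - beta) * w_j
                self.weights[j] = [self.beta * o + (1.0 - self.beta) * old
                                   for o, old in zip(overlap, w)]
                return j
        # No category passed the vigilance test: commit a new one.
        self.weights.append(coded)
        return len(self.weights) - 1
```

Presenting two nearby points and one distant point with a vigilance of 0.8 places the first two in one category and the third in a new one. The per-category choice and match computations are independent of each other, which is the kind of work that maps naturally onto parallel hardware.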


2011 ◽  
pp. 1200-1207
Author(s):  
M. Martínez-Zarzuela ◽  
F. J. Díaz Pernas ◽  
D. González Ortega ◽  
J. F. Díez Higuera ◽  
M. Antón Rodríguez



2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Tanmaya Mahapatra




