distributed execution Latest Research Papers

The SPADE (spatio-temporal Spike PAttern Detection and Evaluation) method was developed to find reoccurring spatio-temporal patterns in neuronal spike activity (parallel spike trains). However, depending on the number of spike trains and the length of recording, this method can exhibit long runtimes. Based on a realistic benchmark data set, we identified that the combination of pattern mining (using the FP-Growth algorithm) and the result filtering account for 85–90% of the method's total runtime. Therefore, in this paper, we propose a customized FP-Growth implementation tailored to the requirements of SPADE, which significantly accelerates pattern mining and result filtering. Our version allows for parallel and distributed execution, and due to the improvements made, an execution on heterogeneous and low-power embedded devices is now also possible. The implementation has been evaluated using a traditional workstation based on an Intel Broadwell Xeon E5-1650 v4 as a baseline. Furthermore, the heterogeneous microserver platform RECS|Box has been used for evaluating the implementation on two HiSilicon Hi1616 (Kunpeng 916), an Intel Coffee Lake-ER Xeon E-2276ME, an Intel Broadwell Xeon D-D1577, and three NVIDIA Tegra devices (Jetson AGX Xavier, Jetson Xavier NX, and Jetson TX2). Depending on the platform, our implementation is between 27 and 200 times faster than the original implementation. At the same time, the energy consumption was reduced by up to two orders of magnitude.

Download Full-text

Comparative Analysis of Distributed File Systems

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37261 ◽

2021 ◽

Vol 9 (VIII) ◽

pp. 20-26

Author(s):

Basireddy Ithihas Reddy

Keyword(s):

Comparative Analysis ◽

File Systems ◽

Distributed File Systems ◽

Input Output ◽

Primary Input ◽

Multiple Systems ◽

Execution Engine ◽

Distributed Execution ◽

Testing Algorithms ◽

Distributed Files

It has been observed that there has been a great interest in computing experiments which has been useful on shared nothing computers and commodity machines. We need multiple systems running in parallel working closely together towards the same goal. Frequently it has been experienced and observed that the distributed execution engine named MapReduce handles the primary input-output workload for such clusters. There are numerous distributed file systems around viz. NTFS,ReFS,FAT,FAT32 in windows and linux, we studied them and implemented a few distributed file systems. It has been studied that distributed file systems (DFS) work very well on many small files but some do not generate expected output on large files. We implemented benchmark testing algorithms in each distributed files systems for small and large files, and the analysis is been put forward in this paper. Even we came across the various implementation issues of various DFS, they have also been mentioned in this paper.

Download Full-text

Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support

Concurrency and Computation Practice and Experience ◽

10.1002/cpe.6474 ◽

2021 ◽

Author(s):

Niklas Eiling ◽

Jonas Baude ◽

Stefan Lankes ◽

Antonello Monti

Keyword(s):

Distributed Execution

Download Full-text

A Projection-Stable Grammatical Model for the Distributed Execution of Administrative Processes with Emphasis on Actors’ Views

Journal of King Saud University - Computer and Information Sciences ◽

10.1016/j.jksuci.2021.07.019 ◽

2021 ◽

Author(s):

Milliam Maxime Zekeng Ndadji ◽

Maurice Tchoupé Tchendji ◽

Clémentin Tayou Djamegni ◽

Didier Parigot

Keyword(s):

Distributed Execution

Download Full-text

Batterfly: Battery-Free Daily Living Activity Recognition System through Distributed Execution over Energy Harvesting Analog PIR Sensors

10.1109/dcoss52077.2021.00020 ◽

2021 ◽

Author(s):

Sopicha Stirapongsasuti ◽

Shinya Misaki ◽

Tomokazu Matsui ◽

Hirohiko Suwa ◽

Keiichi Yasumoto

Keyword(s):

Energy Harvesting ◽

Activity Recognition ◽

Daily Living ◽

Recognition System ◽

Daily Living Activity ◽

Distributed Execution

Download Full-text

Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee

ACM Transactions on Computer Systems ◽

10.1145/3453681 ◽

2021 ◽

Vol 37 (1-4) ◽

pp. 1-37

Author(s):

Youwei Zhuo ◽

Jingji Chen ◽

Gengyu Rao ◽

Qinyi Luo ◽

Yanzhi Wang ◽

...

Keyword(s):

Programming Model ◽

Processing System ◽

Runtime Systems ◽

Graph Processing ◽

Memory Architecture ◽

Sequential Processing ◽

Distributed Execution ◽

Disjoint Sets ◽

System Graph ◽

Processing Architecture

To hide the complexity of the underlying system, graph processing frameworks ask programmers to specify graph computations in user-defined functions (UDFs) of graph-oriented programming model. Due to the nature of distributed execution, current frameworks cannot precisely enforce the semantics of UDFs, leading to unnecessary computation and communication. It exemplifies a gap between programming model and runtime execution. This article proposes novel graph processing frameworks for distributed system and Processing-in-memory (PIM) architecture that precisely enforces loop-carried dependency; i.e., when a condition is satisfied by a neighbor, all following neighbors can be skipped. Our approach instruments the UDFs to express the loop-carried dependency, then the distributed execution framework enforces the precise semantics by performing dependency propagation dynamically. Enforcing loop-carried dependency requires the sequential processing of the neighbors of each vertex distributed in different nodes. We propose to circulant scheduling in the framework to allow different nodes to process disjoint sets of edges/vertices in parallel while satisfying the sequential requirement. The technique achieves an excellent trade-off between precise semantics and parallelism—the benefits of eliminating unnecessary computation and communication offset the reduced parallelism. We implement a new distributed graph processing framework SympleGraph, and two variants of runtime systems— GraphS and GraphSR —for PIM-based graph processing architecture, which significantly outperform the state-of-the-art.

Download Full-text

The Design Space of Emergent Scheduling for Distributed Execution Frameworks

2021 International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS) ◽

10.1109/seams51251.2021.00032 ◽

2021 ◽

Author(s):

Paul Dean ◽

Barry Porter

Keyword(s):

Design Space ◽

Distributed Execution

Download Full-text

Toward a Lingua Franca for Deterministic Concurrent Systems

ACM Transactions on Embedded Computing Systems ◽

10.1145/3448128 ◽

2021 ◽

Vol 20 (4) ◽

pp. 1-27

Author(s):

Marten Lohstroh ◽

Christian Menard ◽

Soroush Bateni ◽

Edward A. Lee

Keyword(s):

Programming Languages ◽

Single Point ◽

Discrete Event ◽

Modular System ◽

Centralized Control ◽

Process Networks ◽

Coordination Model ◽

Rigorous Testing ◽

Distributed Execution ◽

Event Models

Many programming languages and programming frameworks focus on parallel and distributed computing. Several frameworks are based on actors, which provide a more disciplined model for concurrency than threads. The interactions between actors, however, if not constrained, admit nondeterminism. As a consequence, actor programs may exhibit unintended behaviors and are less amenable to rigorous testing. We show that nondeterminism can be handled in a number of ways, surveying dataflow dialects, process networks, synchronous-reactive models, and discrete-event models. These existing approaches, however, tend to require centralized control, pose challenges to modular system design, or introduce a single point of failure. We describe “reactors,” a new coordination model that combines ideas from several of these approaches to enable determinism while preserving much of the style of actors. Reactors promote modularity and allow for distributed execution. By using a logical model of time that can be associated with physical time, reactors also provide control over timing. Reactors also expose parallelism that can be exploited on multicore machines and in distributed configurations without compromising determinacy.

Download Full-text

Accelerated execution via eager-release of dependencies in task-based workflows

The International Journal of High Performance Computing Applications ◽

10.1177/1094342021997558 ◽

2021 ◽

pp. 109434202199755

Author(s):

Hatem Elshazly ◽

Francesc Lordan ◽

Jorge Ejarque ◽

Rosa M. Badia

Keyword(s):

Performance Improvement ◽

Programming Model ◽

Total Execution Time ◽

Output Data ◽

Data Dependencies ◽

Workflow Systems ◽

Current Task ◽

Distributed Execution ◽

And Performance ◽

Data Requirements

Task-based programming models offer a flexible way to express the unstructured parallelism patterns of nowadays complex applications. This expressive capability is required to achieve maximum possible performance for applications that are executed in distributed execution platforms. In current task-based workflows, tasks are launched for execution when their data dependencies are satisfied. However, even though the data dependencies of a certain task might have been already produced, the execution of this task will be delayed until its predecessor tasks completely finish their execution. As a consequence of this approach of releasing dependencies, the amount of parallelism inherent in applications is limited and performance improvement opportunities are wasted. To mitigate this limitation, we propose an eager approach for releasing data dependencies. Following this approach, the execution of tasks will not be delayed until their predecessor tasks completely finish their execution, instead, tasks will be launched for execution as soon as their data requirements are available. Hence, more parallelism is exposed and applications can achieve higher levels of performance by overlapping the execution of tasks. Towards achieving this goal, in this paper we propose applying two changes to task-based workflow systems. First, modifying the dependency relationships of tasks to be specified not only in terms of predecessor and successor tasks but also in terms of the data that caused these dependencies. Second, triggering the release of dependencies as soon as a predecessor task generates the output data instead of having to wait until the end of the predecessor execution to release all of its dependencies. We realize this proposal using PyCOMPSs: a task-based programming model for parallelizing Python applications. Our experiments show that using an eager approach for releasing dependencies achieves more than 50% performance improvement in the total execution time as compared to the default approach of releasing dependencies.

Download Full-text

Deep Reinforcement Learning Algorithms for Multiple Arc-Welding Robots

Frontiers in Control Engineering ◽

10.3389/fcteg.2021.632417 ◽

2021 ◽

Vol 2 ◽

Author(s):

Lei-Xin Xu ◽

Yang-Yang Chen

Keyword(s):

Reinforcement Learning ◽

Arc Welding ◽

Distributed Simulation ◽

Local Information ◽

Time Varying ◽

Robot Systems ◽

Policy Gradient ◽

Distributed Execution ◽

Multi Agent ◽

Simulation Results

The applications of the deep reinforcement learning method to achieve the arcs welding by multi-robot systems are presented, where the states and the actions of each robot are continuous and obstacles are considered in the welding environment. In order to adapt to the time-varying welding task and local information available to each robot in the welding environment, the so-called multi-agent deep deterministic policy gradient (MADDPG) algorithm is designed with a new set of rewards. Based on the idea of the distributed execution and centralized training, the proposed MADDPG algorithm is distributed. Simulation results demonstrate the effectiveness of the proposed method.

Download Full-text

distributed execution
Recently Published Documents

TOTAL DOCUMENTS

H-INDEX

Acceleration of the SPADE Method Using a Custom-Tailored FP-Growth Implementation

Comparative Analysis of Distributed File Systems

Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support

A Projection-Stable Grammatical Model for the Distributed Execution of Administrative Processes with Emphasis on Actors’ Views

Batterfly: Battery-Free Daily Living Activity Recognition System through Distributed Execution over Energy Harvesting Analog PIR Sensors

Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee

The Design Space of Emergent Scheduling for Distributed Execution Frameworks

Toward a Lingua Franca for Deterministic Concurrent Systems

Accelerated execution via eager-release of dependencies in task-based workflows

Deep Reinforcement Learning Algorithms for Multiple Arc-Welding Robots

Export Citation Format

distributed executionRecently Published Documents

TOTAL DOCUMENTS

H-INDEX

Acceleration of the SPADE Method Using a Custom-Tailored FP-Growth Implementation

Comparative Analysis of Distributed File Systems

Cricket: A virtualization layer for distributed execution of CUDA applications with checkpoint/restart support

A Projection-Stable Grammatical Model for the Distributed Execution of Administrative Processes with Emphasis on Actors’ Views

Batterfly: Battery-Free Daily Living Activity Recognition System through Distributed Execution over Energy Harvesting Analog PIR Sensors

Distributed Graph Processing System and Processing-in-memory Architecture with Precise Loop-carried Dependency Guarantee

The Design Space of Emergent Scheduling for Distributed Execution Frameworks

Toward a Lingua Franca for Deterministic Concurrent Systems

Accelerated execution via eager-release of dependencies in task-based workflows

Deep Reinforcement Learning Algorithms for Multiple Arc-Welding Robots

distributed execution
Recently Published Documents