Parallel Execution: Latest Research Papers

Unmanned Aerial Vehicles (UAVs) have rapidly become popular for monitoring, delivery, and actuation in many application domains such as environmental management, disaster mitigation, homeland security, energy, transportation, and manufacturing. However, UAV perception and navigation intelligence (PNI) designs are still in their infancy and demand fundamental performance and energy optimizations to be eligible for mass adoption. In this article, we present a generalizable three-stage optimization framework for PNI systems that (i) abstracts the high-level programs representing the perception, mining, processing, and decision making of UAVs into complex weighted networks tracking the interdependencies between universal low-level intermediate representations; (ii) exploits a differential geometry approach to schedule and map the discovered PNI tasks onto an underlying manycore architecture, using an Ollivier-Ricci curvature-based load-balancing strategy to mine the complexity of optimal parallelization of perception and decision modules and to detect the parallel communities of the PNI applications for maximum parallel execution while minimizing inter-core communication; and (iii) relies on an energy-aware mapping scheme to minimize energy dissipation when assigning the communities to tile-based networks-on-chip. We validate this approach on various drone PNI designs, including flight controller, path planning, and visual navigation. The experimental results confirm that the proposed framework achieves a 23% flight time reduction and up to 34% energy savings for the flight controller application. In addition, the optimization on a 16-core platform improves the on-time visit rate of the path planning algorithm by 14% while reducing the run time of ConvNet visual navigation by 81%.


LaRA 2: parallel and vectorized program for sequence–structure alignment of RNA sequences

BMC Bioinformatics ◽ 10.1186/s12859-021-04532-7 ◽ 2022 ◽ Vol 23 (1)

Author(s): Jörg Winkler ◽ Gianvito Urgese ◽ Elisa Ficarra ◽ Knut Reinert

Keyword(s): Structural Information ◽ Structural Alignment ◽ Lower Boundary ◽ Secondary Structures ◽ Parallel Execution ◽ Task Demands ◽ Structure Alignment ◽ RNA Sequences ◽ Genomic Databases ◽ Alignment Algorithms

Abstract

Background: The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson–Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. Structural alignment of RNAs is an essential task for discovering as yet unknown RNA families and inferring their possible functions. This task demands substantial computational resources, especially when aligning many long sequences, and therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add complexity to the problem and are often ignored in available software.

Results: We present the SeqAn-based software LaRA 2, which is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs, our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower boundary of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version.

Conclusions: With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in a relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for searching genomic databases.
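The multi-threading idea mentioned in the abstract can be illustrated with a minimal sketch in which every pairwise alignment is treated as an independent task. This is not LaRA 2's algorithm: the scoring function below is a plain sequence-only Needleman-Wunsch score with no structural information, and the names `nw_score` and `all_pairs_scores`, the scoring parameters, and the worker count are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch) using two rolling rows."""
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def all_pairs_scores(seqs, workers=4):
    """Score every sequence pair; each pair is an independent task."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = pool.map(lambda p: nw_score(seqs[p[0]], seqs[p[1]]), pairs)
        return dict(zip(pairs, scores))


if __name__ == "__main__":
    print(all_pairs_scores(["GAUC", "GAUC", "GAC"]))
```

In CPython, threads do not speed up CPU-bound pure-Python code, so the sketch only shows the task decomposition; a process pool or a compiled, vectorized kernel would be needed for an actual speed-up.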


DiPETrans: A framework for distributed parallel execution of transactions of blocks in blockchains

Concurrency and Computation Practice and Experience ◽ 10.1002/cpe.6804 ◽ 2022

Author(s): Shrey Baheti ◽ Parwat Singh Anjana ◽ Sathya Peri ◽ Yogesh Simmhan

Keyword(s): Parallel Execution


Context-Aware Compilation of DNN Training Pipelines across Edge and Cloud

Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies ◽ 10.1145/3494981 ◽ 2021 ◽ Vol 5 (4) ◽ pp. 1-27

Author(s): Dixi Yao ◽ Liyao Xiang ◽ Zifan Wang ◽ Jiayu Xu ◽ Chao Li ◽ ...

Keyword(s): Cloud Model ◽ Limited Resource ◽ Parallel Execution ◽ Integrated System ◽ Context Aware ◽ Training Models ◽ Classification Image ◽ Model Training ◽ IoT Devices ◽ Additional Memory

Empowered by machine learning, edge devices including smartphones, wearables, and IoT devices have become increasingly intelligent, which conflicts with their limited resources. On-device model personalization is particularly hard because training models on edge devices is highly resource-intensive. In this work, we propose a novel training pipeline across the edge and the cloud that takes advantage of the powerful cloud while keeping data local at the edge. Design highlights include the parallel execution enabled by our feature replay, the reduced communication cost achieved by our error-feedback feature compression, and the context-aware deployment decision engine. Working as an integrated system, the proposed pipeline training framework not only significantly speeds up training, but also incurs little accuracy loss or additional memory/energy overhead. We test our system in a variety of settings, including WiFi, 5G, and household IoT, and on different training tasks, such as image/text classification and image generation, to demonstrate its advantage over the state of the art. Experimental results show that our system not only adapts well to the varying contexts but also draws on them, delivering a practical and efficient solution to edge-cloud model training.
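The abstract does not describe how its error-feedback feature compression works, so the following is only a sketch of the generic error-feedback idea: whatever the compressor drops in one round is remembered and added back before the next round is compressed. The top-k sparsifier used here is a stand-in chosen for illustration, and the function name `compress_with_error_feedback` and its parameters are hypothetical.

```python
def compress_with_error_feedback(values, residual, k):
    """Keep the k largest-magnitude entries; carry the rest forward as error.

    values   -- feature vector to transmit this round
    residual -- compression error left over from the previous round
    k        -- number of entries actually sent (assumed compressor: top-k)
    """
    # Add back what earlier rounds failed to transmit.
    corrected = [v + r for v, r in zip(values, residual)]
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(corrected)), key=lambda i: abs(corrected[i]), reverse=True)
    keep = set(order[:k])
    sent = [c if i in keep else 0.0 for i, c in enumerate(corrected)]
    # Everything not sent becomes the residual for the next round.
    new_residual = [c - s for c, s in zip(corrected, sent)]
    return sent, new_residual


if __name__ == "__main__":
    residual = [0.0, 0.0, 0.0, 0.0]
    for features in ([0.5, -2.0, 0.1, 1.0], [0.6, 0.0, 0.0, 0.0]):
        sent, residual = compress_with_error_feedback(features, residual, k=2)
        print(sent, residual)
```

The point of the design is that dropped information is delayed, not lost: small entries that are skipped in one round accumulate in the residual until they are large enough to be sent.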


Gisola: A High-Performance Computing Application for Real-Time Moment Tensor Inversion

Seismological Research Letters ◽ 10.1785/0220210153 ◽ 2021

Author(s): Nikolaos Triantafyllis ◽ Ioannis E. Venetis ◽ Ioannis Fountoulakis ◽ Erion-Vasilis Pikoulis ◽ Efthimios Sokos ◽ ...

Keyword(s): High Performance Computing ◽ Real Time ◽ High Performance ◽ Signal To Noise Ratio ◽ Moment Tensor ◽ Time Moment ◽ Parallel Execution ◽ Waveform Data ◽ Notification System ◽ Performance Computing

Abstract: Automatic moment tensor (MT) determination is essential for real-time seismological applications. In this article, Gisola, a highly evolved software for MT determination, oriented toward high-performance computing, is presented. The program employs enhanced algorithms, executed in parallel, for waveform data selection via quality metrics such as signal-to-noise ratio, waveform clipping, data and metadata inconsistency, long-period disturbances, and station evaluation based on power spectral density measurements. The inversion code, derived from ISOLated Asperities (an extensively used manual MT retrieval utility), has been improved by exploiting the performance efficiency of multiprocessing on the CPU and GPU. Gisola offers a 4D spatiotemporal adjustable MT grid search and interconnection of multiple data resources with the International Federation of Digital Seismograph Networks Web Services (FDSNWS), the SeedLink protocol, and the SeisComP Data Structure standard. The new software publishes its results in various formats such as QuakeML and SC3ML, and includes a website suite for reviewing MT solutions, an e-mail notification system, and an integrated FDSNWS-event for distributing MT solutions. Moreover, it can apply user-defined scripts, such as dispatching the MT solution to SeisComP. The operator has full control of all calculation aspects through an extensive and adjustable configuration. MT quality performance, measured for 531 manual MT solutions in Greece between 2012 and 2021, proved to be highly efficient.
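As a rough illustration of screening waveforms by signal-to-noise ratio in parallel, the sketch below computes a power-based SNR in decibels for each station and keeps the stations above a threshold. It is not Gisola's code: the window layout (a noise window and a signal window per station), the 6 dB threshold, the station names, and the functions `snr_db` and `select_stations` are all assumptions made for this example.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def snr_db(noise, signal):
    """Signal-to-noise ratio in dB from the mean power of two sample windows."""
    p_noise = sum(x * x for x in noise) / len(noise)
    p_signal = sum(x * x for x in signal) / len(signal)
    # Assumes a non-silent noise window (p_noise > 0).
    return 10.0 * math.log10(p_signal / p_noise)


def select_stations(traces, min_snr_db=6.0, workers=4):
    """Return station names whose SNR meets the threshold, in input order.

    traces -- dict mapping station name to a (noise_window, signal_window) pair
    """
    names = list(traces)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each station is evaluated independently of the others.
        snrs = list(pool.map(lambda n: snr_db(*traces[n]), names))
    return [n for n, s in zip(names, snrs) if s >= min_snr_db]


if __name__ == "__main__":
    traces = {
        "STA_A": ([1, -1, 1, -1], [10, -10, 10, -10]),  # hypothetical station
        "STA_B": ([1, -1], [1, -1]),                    # hypothetical station
    }
    print(select_stations(traces))
```

Because each station's metric depends only on its own samples, the per-station work can be distributed across workers without any coordination beyond collecting the results.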


Towards Automatic Deductive Verification of C Programs with Sisal Loops Using the C-lightVer System

Modeling and Analysis of Information Systems ◽ 10.18255/1818-1015-2021-4-372-393 ◽ 2021 ◽ Vol 28 (4) ◽ pp. 372-393

Author(s): Dmitry A. Kondratyev

Keyword(s): Automatic Parallelization ◽ Parallel Execution ◽ Programming System ◽ Inference Rules ◽ Deductive Verification ◽ Recursive Functions ◽ Axiomatic Semantics ◽ Verification Conditions ◽ Symbolic Method ◽ Verification Condition

The C-lightVer system is being developed at IIS SB RAS for deductive verification of C programs. C-kernel is an intermediate verification language in this system. The Cloud parallel programming system (CPPS) is also being developed at IIS SB RAS. Cloud Sisal is the input language of CPPS. The main feature of CPPS is implicit parallel execution based on automatic parallelization of Cloud Sisal loops. Cloud-Sisal-kernel is an intermediate verification language in the CPPS system. Our goal is automatic parallelization of a superset of C that allows automatic verification to be implemented. Our solution is C-Sisal-kernel, a superset of C-kernel. The first result presented in this paper is an extension of C-kernel with Cloud-Sisal-kernel loops, which yields the C-Sisal-kernel language. The second result is an extension of the C-kernel axiomatic semantics with an inference rule for Cloud-Sisal-kernel loops. The paper also presents our approach to automating deductive verification in the case of finite iterations over data structures. Such loops are referred to as definite iterations. Our solution is a composition of the symbolic method of verification of definite iterations, verification condition metageneration, and the mixed axiomatic semantics method. The symbolic method of verification of definite iterations allows inference rules to be defined for these loops without invariants. This method is based on the symbolic replacement of definite iterations by recursive functions. The resulting verification conditions, which contain applications of recursive functions, correspond to the logical basis of the ACL2 prover. We use the ACL2 system, which is based on computable recursive functions. Verification condition metageneration simplifies the implementation of new inference rules in a verification system. The use of mixed axiomatic semantics results in simpler verification conditions in some cases.
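The replacement of a definite iteration by a recursive function can be sketched on a toy loop. The example below is only an illustration of the idea, written in Python and not in C-kernel or ACL2: `loop_sum` iterates over a finite list, and the hypothetical function `rep` gives the value of the accumulator after the first n iterations, so that the loop's effect can be stated without a loop invariant. The names `body`, `loop_sum`, and `rep` are assumptions made for this sketch.

```python
def body(acc, x):
    """One iteration of the loop body: fold element x into the accumulator."""
    return acc + x


def loop_sum(xs):
    """Definite iteration: the body runs exactly once per element of xs."""
    acc = 0
    for x in xs:
        acc = body(acc, x)
    return acc


def rep(xs, n, acc0):
    """Accumulator value after the first n iterations, starting from acc0."""
    if n == 0:
        return acc0
    # The state after n iterations is the body applied to the state after n - 1.
    return body(rep(xs, n - 1, acc0), xs[n - 1])


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(loop_sum(data), rep(data, len(data), 0))
```

With this replacement, a statement about the loop becomes a statement about `rep(xs, len(xs), 0)`, which a prover built on recursive functions can reason about by induction on n.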
